Course Scraper #2

Open
Waidhoferj opened this issue Sep 26, 2021 · 2 comments
Labels: enhancement (New feature or request), web scraping

Comments

@Waidhoferj
Member

Create a new scraper in the scrapers folder that gets course information from the Cal Poly course catalog. See the Course object in models.py for the schema, and review CollegeScraper as an example of a web scraper template.

Waidhoferj added the enhancement and web scraping labels on Sep 26, 2021
IdoPesok self-assigned this on Nov 10, 2021
@probably-neb

While looking into the sections issue today, I found this website, which has a table of all the classes, what they are cross-listed as, their units, the GE area they fulfill, and the terms they are typically offered in. Some of this, most notably the terms typically offered, is useful information that does not seem to be collected in #9 or #10, so I thought it would be helpful to share. As far as I can tell, the table can't be scraped with Beautiful Soup alone, because it is rendered by JavaScript that pulls from this CSV file. After #9 is merged, I would be happy to open a pull request that adds the extra information from the CSV, if that would be helpful.

@Waidhoferj
Member Author

Thanks for finding that, @probably-neb. If you need to scrape a site rendered with JS, you can use a headless browser like this lib. Another option would be to pull in the CSV and parse it directly.
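
For the CSV route, a minimal sketch of the parsing step might look like the following. It uses only the standard library. The column names (`Course`, `Cross-listed`, `Units`, `GE Area`, `Terms`), the `CourseRow` record, and the sample rows are all assumptions made for illustration; the real CSV header and the Course schema in models.py may differ and should be checked first.

```python
import csv
import io
from dataclasses import dataclass
from typing import List


@dataclass
class CourseRow:
    """Hypothetical record; the real schema is the Course object in models.py."""
    code: str
    cross_listed: List[str]
    units: str  # kept as text, since unit values may not be plain numbers
    ge_areas: List[str]
    terms_offered: List[str]


def split_cell(cell: str) -> List[str]:
    """Split a comma-separated cell into trimmed, non-empty parts."""
    return [part.strip() for part in (cell or "").split(",") if part.strip()]


def parse_course_csv(text: str) -> List[CourseRow]:
    """Parse CSV text into CourseRow records (column names are assumed)."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for rec in reader:
        rows.append(
            CourseRow(
                code=rec["Course"].strip(),
                cross_listed=split_cell(rec.get("Cross-listed", "")),
                units=(rec.get("Units") or "").strip(),
                ge_areas=split_cell(rec.get("GE Area", "")),
                terms_offered=split_cell(rec.get("Terms", "")),
            )
        )
    return rows


# Made-up sample data with fictional course codes, only to exercise the parser.
SAMPLE = '''Course,Cross-listed,Units,GE Area,Terms
ABC 101,,4,,"F, W, SP"
ABC 320,XYZ 320,4,B4,"F, SP"
'''

if __name__ == "__main__":
    for row in parse_course_csv(SAMPLE):
        print(row)
```

In a scraper, the text passed to `parse_course_csv` would come from downloading the CSV file (for example with `urllib.request.urlopen` or `requests.get`), which avoids rendering the JavaScript page at all.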
