Hello
Six months ago I decided to scrape pcandparts.com for their parts data. I used BeautifulSoup and urllib in Python to pull the web data, with the script scheduled to run every 6 hours whenever my laptop was running.
A couple of weeks ago pcandparts changed their site, so my old script would have to be updated. That made it a good time to go through the data, and the results are here! I made another script to identify the parts and used Pandas to put everything together.
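For anyone curious what a scrape like this can look like, here is a minimal sketch, not the actual script. The HTML structure, class names, and the helper names `fetch` and `parse_parts` are all made up for illustration, since the real page markup isn't shown here.

```python
from urllib.request import urlopen

from bs4 import BeautifulSoup

# Hypothetical markup: the real pcandparts.com page structure is an assumption.
SAMPLE_HTML = """
<table>
  <tr class="part"><td class="partno">SSD-001</td><td class="desc">Example 250GB SSD</td><td class="price">$89.99</td></tr>
  <tr class="part"><td class="partno">HDD-002</td><td class="desc">Example 2TB HDD</td><td class="price">$74.50</td></tr>
</table>
"""


def fetch(url):
    """Download a page and return its raw bytes (not called in this sketch)."""
    with urlopen(url) as resp:
        return resp.read()


def parse_parts(html):
    """Return one dict per part row found in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    parts = []
    for row in soup.find_all("tr", class_="part"):
        price_text = row.find("td", class_="price").get_text(strip=True)
        parts.append({
            "part_no": row.find("td", class_="partno").get_text(strip=True),
            "description": row.find("td", class_="desc").get_text(strip=True),
            "price": float(price_text.lstrip("$")),
        })
    return parts


# Parse the sample markup; a real run would pass fetch(url) instead.
rows = parse_parts(SAMPLE_HTML)
```

Running something like this on a schedule (cron or Task Scheduler, for example) and saving each batch of rows with a timestamp gives you a price history over time.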
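As a rough idea of the Pandas step, here is one way timestamped scrape rows could be turned into a price-over-time table. The column names and all the values are invented for the example and are not taken from the real data set.

```python
import pandas as pd

# Made-up rows standing in for scraped snapshots (two scrapes, two parts).
snapshots = pd.DataFrame({
    "scraped_at": pd.to_datetime([
        "2020-01-01 00:00", "2020-01-01 06:00",
        "2020-01-01 00:00", "2020-01-01 06:00",
    ]),
    "part_no": ["SSD-001", "SSD-001", "SSD-002", "SSD-002"],
    "price": [89.99, 84.99, 120.00, 120.00],
})

# One row per scrape time, one column per part.
# aggfunc="last" keeps the latest price if a part shows up twice in one scrape.
prices = snapshots.pivot_table(
    index="scraped_at", columns="part_no", values="price", aggfunc="last"
)
```

A table shaped like this can be passed straight to `prices.plot()` to get one line per part.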
Here is a sample result, a graph of some SSD prices:
Pc&Parts data.zip
Things I could still do:
- Go through the RAM section, which is currently being skipped because it uses a different format
- Identify parts by description, not Part#. Sometimes the same Part# is used twice, or reused for a different part at a later time (e.g. a 2TB HDD, then a 4TB HDD).
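The second idea could be sketched roughly like this: group records by a cleaned-up description instead of the part number. The `description_key` helper and the records are hypothetical, and a real version would probably need fuzzier matching than this.

```python
import re
from collections import defaultdict


def description_key(description):
    """Lowercase and collapse punctuation/whitespace so small formatting changes still match."""
    return re.sub(r"[^a-z0-9]+", " ", description.lower()).strip()


# Made-up records: part number P100 is reused for a different drive.
records = [
    {"part_no": "P100", "description": "Example 2TB HDD", "price": 79.99},
    {"part_no": "P100", "description": "Example  2TB  HDD", "price": 74.99},
    {"part_no": "P100", "description": "Example 4TB HDD", "price": 129.99},
]

# Group prices under the normalized description rather than the part number.
by_description = defaultdict(list)
for rec in records:
    by_description[description_key(rec["description"])].append(rec["price"])
```

Here the two 2TB rows end up together even though their spacing differs, and the 4TB drive gets its own entry despite sharing the part number.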
If you have any questions, or want to learn about web scraping, please let me know.
Enjoy the data!