In our previous blog post, we discussed the options programmers have for building web scrapers in several programming languages. That approach to web scraping works well for people with a solid background in programming. Unfortunately, far more of the people who need to extract data from the web do not have the necessary programming skills.
Smart Harvesting II
In the project “Smart Harvesting II”, software-based solutions for the collection and processing of bibliographic data from the web are developed. Up to now, this work has been done manually in many facilities and is therefore very labour-intensive and time-consuming.
A main focus of the Smart Harvesting II project was web scraping. In this post, we are going to explain what web scraping (also called web data extraction) really is, and how you would write software that performs this task.