Python Web Scraping Cookbook
上QQ阅读APP看书,第一时间看更新

Using Scrapy selectors

Scrapy is a Python web spider framework that is used to extract data from websites. It provides many powerful features for navigating entire websites, such as the ability to follow links. One feature it provides is the ability to find data within a document using the DOM, and using the now, quite familiar, XPath.

In this recipe we will load the list of current questions on StackOverflow, and then parse this using a scrapy selector. Using that selector, we will extract the text of each question.