Last updated: 2021-07-09 21:29:02
Cover page
Web Scraping with Python
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Chapter 1. Introduction to Web Scraping
When is web scraping useful?
Is web scraping legal?
Background research
Crawling your first website
Summary
Chapter 2. Scraping the Data
Analyzing a web page
Three approaches to scrape a web page
Chapter 3. Caching Downloads
Adding cache support to the link crawler
Disk cache
Database cache
Chapter 4. Concurrent Downloading
One million web pages
Sequential crawler
Threaded crawler
Performance
Chapter 5. Dynamic Content
An example dynamic web page
Reverse engineering a dynamic web page
Rendering a dynamic web page
Chapter 6. Interacting with Forms
The Login form
Extending the login script to update content
Automating forms with the Mechanize module
Chapter 7. Solving CAPTCHA
Registering an account
Optical Character Recognition
Solving complex CAPTCHAs
Chapter 8. Scrapy
Installation
Starting a project
Visual scraping with Portia
Automated scraping with Scrapely
Chapter 9. Overview
Google search engine
Facebook
Gap
BMW
Index