Python Web Scraping Cookbook
上QQ阅读APP看书,第一时间看更新

How to do it...

The script for this recipe is 01/04_events_with_selenium.py.

  1. The following is the code:
from selenium import webdriver

def get_upcoming_events(url):
driver = webdriver.Firefox()
driver.get(url)

events = driver.find_elements_by_xpath('//ul[contains(@class, "list-recent-events")]/li')

for event in events:
event_details = dict()
event_details['name'] = event.find_element_by_xpath('h3[@class="event-title"]/a').text
event_details['location'] = event.find_element_by_xpath('p/span[@class="event-location"]').text
event_details['time'] = event.find_element_by_xpath('p/time').text
print(event_details)

driver.close()

get_upcoming_events('https://www.python.org/events/python-events/')
  1. And run the script with Python.  You will see familiar output:
~ $ python 04_events_with_selenium.py
{'name': 'PyCascades 2018', 'location': 'Granville Island Stage, 1585 Johnston St, Vancouver, BC V6H 3R9, Canada', 'time': '22 Jan. – 24 Jan.'}
{'name': 'PyCon Cameroon 2018', 'location': 'Limbe, Cameroon', 'time': '24 Jan. – 29 Jan.'}
{'name': 'FOSDEM 2018', 'location': 'ULB Campus du Solbosch, Av. F. D. Roosevelt 50, 1050 Bruxelles, Belgium', 'time': '03 Feb. – 05 Feb.'}
{'name': 'PyCon Pune 2018', 'location': 'Pune, India', 'time': '08 Feb. – 12 Feb.'}
{'name': 'PyCon Colombia 2018', 'location': 'Medellin, Colombia', 'time': '09 Feb. – 12 Feb.'}
{'name': 'PyTennessee 2018', 'location': 'Nashville, TN, USA', 'time': '10 Feb. – 12 Feb.'}

During this process, Firefox will pop up and open the page. We have reused the previous recipe and adopted Selenium. 

The Window Popped up by Firefox