Practical Data Science Cookbook (Second Edition)

Acquiring automobile fuel efficiency data

Every data science project starts with data, and this chapter begins in the same manner. For this recipe, we will dive into a dataset that contains fuel efficiency performance metrics, measured in miles per gallon (MPG) over time, for most makes and models of automobiles available in the US since 1984. This data is courtesy of the US Department of Energy and the US Environmental Protection Agency. In addition to fuel efficiency data, the dataset contains several features and attributes of the automobiles listed. This makes it possible to summarize and group the data to determine which groups have historically had better fuel efficiency and how this has changed over the years.

The latest version of the dataset is available at http://www.fueleconomy.gov/feg/epadata/vehicles.csv.zip, and information about the variables in the dataset can be found at http://www.fueleconomy.gov/feg/ws/index.shtml#vehicle. The copy used in this chapter was last updated on December 4, 2013 and was downloaded on December 8, 2013.

We recommend that you use the copy of the dataset provided with the code for this book, so that your results match those described in this chapter.
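
As a rough illustration of the acquisition step, the following Python sketch reads the CSV file directly out of the zipped archive and downloads the archive only if no local copy exists. It is a minimal sketch under stated assumptions, not the method used in this chapter: the function names fetch_archive and read_vehicles are hypothetical, and the member name vehicles.csv and the UTF-8 encoding are assumptions about the archive's contents.

```python
import csv
import io
import zipfile
from pathlib import Path
from urllib.request import urlretrieve

# URL quoted in the text above.
DATA_URL = "http://www.fueleconomy.gov/feg/epadata/vehicles.csv.zip"


def fetch_archive(url=DATA_URL, dest="vehicles.csv.zip"):
    """Return the path to the archive, downloading it only if it is missing.

    Hypothetical helper: an existing local copy (for example, the one
    shipped with the book's code) is reused instead of being overwritten.
    """
    dest = Path(dest)
    if not dest.exists():
        urlretrieve(url, str(dest))
    return dest


def read_vehicles(zip_path, member="vehicles.csv"):
    """Read the CSV inside the archive into a list of dicts, one per row.

    Assumption: the archive contains a file named `vehicles.csv`
    encoded as UTF-8. All values are returned as strings.
    """
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text))
```

Calling read_vehicles(fetch_archive()) would then return one dictionary per vehicle record, keyed by the column names in the file's header row. Because fetch_archive reuses an existing file, placing the book's copy of the archive at the destination path keeps the results tied to that version rather than to whatever is currently published.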