
There's more...
In step 9, manual transmissions appear to be more efficient than automatic transmissions, and both show the same average increase since 2008. However, there is something odd here. There appear to be many very efficient cars (greater than 40 MPG) with automatic transmissions in later years, and almost no manual transmission cars with similar efficiencies in the same time frame. The pattern is reversed in earlier years. Is there a change in the proportion of manual cars available each year? Yes.
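One way to check a question like this is to compute the share of manual cars for each year. The following is a minimal sketch using plyr, not the recipe's own code: the toy data frame `cars_toy` and its columns `year` and `transmission` are hypothetical stand-ins for the recipe's data, which is not reproduced here.

```r
library(plyr)

# Hypothetical toy data standing in for the recipe's data set:
# one row per car model, with its model year and transmission type.
cars_toy <- data.frame(
  year         = c(1984, 1984, 1984, 1984, 2012, 2012, 2012, 2012),
  transmission = c("Manual", "Manual", "Manual", "Auto",
                   "Manual", "Auto", "Auto", "Auto")
)

# Split by year, compute the fraction of manual cars in each piece,
# and combine the per-year results into one data frame.
manual_share <- ddply(cars_toy, ~year, summarise,
                      propManual = mean(transmission == "Manual"))
print(manual_share)
```

On real data, a falling `propManual` over the years would confirm that manual cars make up a shrinking share of the available models.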
With this recipe, we threw you into the deep end of data analysis with R, using two very important R packages, plyr and ggplot2. Just as traditional software development has design patterns for common constructs, a few such patterns are emerging in the field of data science. One of the most notable is the split-apply-combine pattern highlighted by Dr. Hadley Wickham. In this strategy, you split the data into smaller, more manageable pieces according to some variable. You then apply an operation to each piece independently, and finally combine the results into a new data structure. We used this split-apply-combine strategy repeatedly in this recipe, which let us examine the data from many different perspectives.
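To make the three stages concrete, here is a minimal sketch that spells them out one at a time in base R. The data frame `mpg_toy` and its values are invented for illustration and are not taken from the recipe's data.

```r
# Hypothetical toy data: fuel economy (MPG) for a few cars in two years.
mpg_toy <- data.frame(
  year = c(2008, 2008, 2008, 2012, 2012, 2012),
  mpg  = c(20, 22, 24, 26, 28, 30)
)

# 1. Split: break the data frame into one piece per year.
pieces <- split(mpg_toy, mpg_toy$year)

# 2. Apply: summarise each piece independently of the others.
summaries <- lapply(pieces, function(piece) {
  data.frame(year = piece$year[1], avgMPG = mean(piece$mpg))
})

# 3. Combine: stack the per-piece results into a single data frame.
result <- do.call(rbind, summaries)
rownames(result) <- NULL
print(result)
```

With plyr, the same three stages collapse into a single call such as `ddply(mpg_toy, ~year, summarise, avgMPG = mean(mpg))`, which is why the pattern can be applied so quickly from many angles.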
Beyond plyr, this recipe relied heavily on the ggplot2 library, which deserves additional exposition. We will refrain from providing an extensive ggplot2 tutorial, as there are a number of excellent tutorials available online. What matters is that you understand the key idea behind how ggplot2 lets you construct complex statistical visualizations so tersely.
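As a small illustration of that terseness, the sketch below builds a plot by adding layers to a base specification with `+`. It is only one possible example under stated assumptions: the data frame `mpg_by_year` and its values are hypothetical, and the choice of a scatter layer plus a linear trend layer is ours rather than a reproduction of any particular plot in the recipe.

```r
library(ggplot2)

# Hypothetical yearly averages, invented purely for illustration.
mpg_by_year <- data.frame(
  year   = 2008:2013,
  avgMPG = c(21.0, 21.4, 22.1, 22.5, 23.2, 23.9)
)

# Start from the data and the aesthetic mapping (which columns drive x and y),
# then add one layer per visual element and finish with axis labels.
p <- ggplot(mpg_by_year, aes(x = year, y = avgMPG)) +
  geom_point() +                            # layer 1: one point per year
  geom_smooth(method = "lm", se = FALSE) +  # layer 2: fitted linear trend
  xlab("Year") +
  ylab("Average MPG")

print(p)
```

Each `+` contributes an independent piece (a layer or a label), so a plot can be extended or altered by adding or swapping a single line rather than rewriting the whole drawing routine.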