R Graphs Cookbook Second Edition

Creating basic scatter plots

This recipe describes how to make scatter plots using some very simple commands. We'll go from a single line of code, which makes a scatter plot from preloaded data, to a script of a few lines that produces a scatter plot customized with colors, titles, and axis limits that we specify.

Getting ready

All you need to do to get started is start R. You should have the R prompt on your screen as shown in the following screenshot:

How to do it...

Let's use one of R's built-in datasets, called cars, to look at the relationship between the speed of cars and the distances they take to stop (recorded in the 1920s).

To make your first scatter plot, type the following command in the R prompt:

plot(cars$dist~cars$speed)

This should bring up a window with the following graph, which plots the stopping distance of the cars against their speed:

Now, let's tweak the graph to make it look better. Type the following code in the R prompt:

plot(cars$dist~cars$speed, # y~x
     main="Relationship between car distance & speed", # Plot title
     xlab="Speed (miles per hour)", # X axis title
     ylab="Distance travelled (feet)", # Y axis title (dist is recorded in feet)
     xlim=c(0,30), # Set x axis limits from 0 to 30
     ylim=c(0,140), # Set y axis limits from 0 to 140
     xaxs="i", # Set x axis style as internal
     yaxs="i", # Set y axis style as internal
     col="red", # Set the color of plotting symbol to red
     pch=19) # Set the plotting symbol to filled dots

This should produce the following result:

How it works...

R comes preloaded with many datasets. In the example, we used one such dataset called cars, which has two columns of data named speed (in miles per hour) and dist (in feet). To see the data, simply type cars at the R prompt and press the Enter key:

> cars
   speed  dist
1      4     2
2      4    10
3      7     4
4      7    22
. . .
47    24    92
48    24    93
49    24   120
50    25    85
>

As the output from the R command line shows, the cars dataset has 2 columns and 50 rows of data.
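Printing the whole data frame is fine for 50 rows, but a short summary is often easier to read. The following is a minimal sketch that uses the base R functions dim(), str(), head(), and summary(); none of these calls appear in the recipe itself, and the variable name cars_shape is only for illustration:

```r
# Confirm the shape of the dataset: number of rows, then number of columns
cars_shape <- dim(cars)
print(cars_shape)

# Show the name and type of each column
str(cars)

# Look at the first four rows only
head(cars, 4)

# Minimum, quartiles, mean, and maximum of each column
summary(cars)
```

These functions work on any data frame, so the same checks apply to the other built-in datasets.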

The plot() command is the simplest way to make scatter plots (and other types of plots, as we'll see in a moment). In the first example, we simply pass the x and y arguments that we want to plot in the plot(y~x) form, that is, we want to plot distance versus speed. This produces a simple scatter plot. In the second example, we pass a few additional arguments that provide R with more information on how we want the graph to look.
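The y~x expression is a formula, and plot() also accepts a data argument that tells it where to find the columns named in the formula. The following sketch shows the same scatter plot written that way; the variable name stopping is only for illustration and is not part of the recipe:

```r
# A formula is an ordinary R object: the y variable goes on the left
# of the tilde and the x variable on the right
stopping <- dist ~ speed
print(class(stopping))
print(all.vars(stopping))

# Same scatter plot as before, without repeating the cars$ prefix;
# the axis titles now default to the bare column names
plot(dist ~ speed, data = cars)
```

This form can be easier to read when the dataset name is long or when several plots are made from the same data frame.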

The main argument sets the plot title; xlab and ylab set the x and y axis titles, respectively; xlim and ylim set the minimum and maximum values of the x and y axes, respectively; xaxs and yaxs set the style of the axes (the "i" style makes each axis span exactly the limits given, without the small padding that R adds by default); and col and pch set the scatter plot symbol color and type, respectively. All of these arguments and more are explained in detail in Chapter 3, Beyond the Basics – Adjusting Key Parameters.
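To see what the axis style does, we can compare the extent of the plotting region under the default "r" (regular) style and the "i" (internal) style. This is a small sketch rather than part of the recipe: the helper axis_extent() is a hypothetical name, and it relies on par("usr"), which reports the current plotting region as x1, x2, y1, y2:

```r
# Draw the cars scatter plot with the given axis style and return the
# extent of the plotting region in user coordinates (x1, x2, y1, y2)
axis_extent <- function(style) {
  plot(dist ~ speed, data = cars,
       xlim = c(0, 30), ylim = c(0, 140),
       xaxs = style, yaxs = style)
  par("usr")
}

regular  <- axis_extent("r")  # default style: limits padded by 4% at each end
internal <- axis_extent("i")  # internal style: limits used exactly as given

print(regular)
print(internal)
```

With the regular style, the x axis runs from -1.2 to 31.2 and the y axis from -5.6 to 145.6; with the internal style, the axes run exactly from 0 to 30 and from 0 to 140.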

There's more...

Instead of the plot(y~x) notation used in the preceding examples, you can also use plot(x,y); note that the order of the two variables is reversed. For more details on the arguments the plot() command can take, see the help documentation by typing ?plot or help(plot) at the R prompt. Most of the arguments used in this recipe are documented under ?plot.default and ?par.
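The difference between the two notations is easy to get wrong, so here is a short sketch of equivalent calls. The with() variant is an addition that the recipe does not mention; it simply evaluates the plot() call inside the cars data frame:

```r
# Formula notation: y on the left of the tilde, x on the right
plot(cars$dist ~ cars$speed)

# Default notation: x comes first, then y
plot(cars$speed, cars$dist)

# with() lets us drop the cars$ prefix in the default notation
with(cars, plot(speed, dist))
```

All three calls put speed on the x axis and dist on the y axis; only the default axis titles differ.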

If you want to plot another set of points on the same graph, say from another dataset or the same data points but with another symbol on top, you can use the points() function:

points(cars$dist~cars$speed, pch=3)
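As a slightly fuller sketch of the same idea, we can draw all the points in one color and then use points() to highlight a subset in another. The cutoff of 20 miles per hour and the variable name fast are arbitrary choices made for this illustration:

```r
# Draw every observation as a grey filled dot
plot(dist ~ speed, data = cars, pch = 19, col = "grey")

# Pick out the faster cars (the threshold is an arbitrary example value)
fast <- subset(cars, speed > 20)

# Add the subset on top of the existing plot as red filled dots
points(dist ~ speed, data = fast, pch = 19, col = "red")
```

Note that points() only adds to an existing plot, so it must be called after plot().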

A note on R's built-in datasets

In addition to the cars dataset used in the example, R has many more datasets, which come as part of the base installation in a package called datasets. To see the complete list of available datasets, call the data() function simply by running it at the R prompt:

data()
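If we only want the datasets from one package, or want to work with the list in code, data() accepts a package argument and returns an object whose results element is a character matrix. A minimal sketch, with the variable name listing chosen for illustration:

```r
# List only the datasets that ship in the datasets package
listing <- data(package = "datasets")$results

# Each row describes one dataset; show the name and a short title
print(head(listing[, c("Item", "Title")]))

# To read about a dataset's variables and units, open its help page
# at the R prompt, for example: help(cars)
```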

See also

Scatter plots are covered in a lot more detail in Chapter 4, Creating Scatter Plots.