Creating basic scatter plots
This recipe describes how to make scatter plots using some very simple commands. We'll go from a single line of code, which makes a scatter plot from preloaded data, to a script of a few lines that produces a scatter plot customized with colors, titles, and axis limits of our choosing.
Getting ready
All you need to do to get started is start R. You should have the R prompt on your screen as shown in the following screenshot:
How to do it...
Let's use one of R's built-in datasets called cars to look at the relationship between the speed of cars and the distances taken to stop the cars (recorded in the 1920s).
To make your first scatter plot, type the following command in the R prompt:
plot(cars$dist~cars$speed)
This should bring up a window with the following graph, which plots the stopping distances of the cars against their speeds:
Now, let's tweak the graph to make it look better. Type the following code in the R prompt:
plot(cars$dist~cars$speed, # y~x
     main="Relationship between car distance & speed", # Plot title
     xlab="Speed (miles per hour)", # X axis title
     ylab="Distance travelled (feet)", # Y axis title; dist is recorded in feet
     xlim=c(0,30), # Set x axis limits from 0 to 30
     ylim=c(0,140), # Set y axis limits from 0 to 140
     xaxs="i", # Set x axis style as internal
     yaxs="i", # Set y axis style as internal
     col="red", # Set the color of plotting symbol to red
     pch=19) # Set the plotting symbol to filled dots
This should produce the following result:
How it works...
R comes preloaded with many datasets. In the example, we used one such dataset called cars, which has two columns of data, with the names speed and dist. To see the data, simply type cars in the R prompt and press the Enter key:
> cars
   speed dist
1      4    2
2      4   10
3      7    4
4      7   22
. . .
47    24   92
48    24   93
49    24  120
50    25   85
>
As the output from the R command line shows, the cars dataset has 2 columns and 50 rows of data.
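The same facts can be checked without printing all 50 rows. The following is a minimal sketch using base R's inspection functions; the variable names dims and cols are only illustrative:

```r
# Inspect the built-in cars dataset without printing every row
dims <- dim(cars)      # number of rows, then number of columns
cols <- names(cars)    # column names

print(dims)
print(cols)
head(cars, 4)          # first four rows, matching the listing above
str(cars)              # compact summary of each column's type
```

head() and str() are handy for larger datasets, where typing the dataset name alone would flood the console.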
The plot() command is the simplest way to make scatter plots (and other types of plots, as we'll see in a moment). In the first example, we simply pass the x and y arguments that we want to plot in the plot(y~x) form, that is, we want to plot distance versus speed. This produces a simple scatter plot. In the second example, we pass a few additional arguments that provide R with more information on how we want the graph to look.
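When both variables come from the same data frame, the formula form can also take a data argument, which avoids repeating cars$ for each variable. This is a sketch of an alternative spelling, not the form used in this recipe:

```r
# Same scatter plot as plot(cars$dist ~ cars$speed),
# with the data frame named once through the data argument
plot(dist ~ speed, data = cars)
```

One visible difference is that the default axis labels become speed and dist instead of cars$speed and cars$dist.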
The main argument sets the plot title; xlab and ylab set the x and y axes titles, respectively; xlim and ylim set the minimum and maximum values of the x and y axes, respectively; xaxs and yaxs set the style of the axes; and col and pch set the scatter plot symbol color and type, respectively. All of these arguments and more are explained in detail in Chapter 3, Beyond the Basics – Adjusting Key Parameters.
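To see what the axis style actually changes, it can help to draw the same data with both styles side by side. With the default style "r" (regular), R pads the requested range by 4 percent at each end; with "i" (internal), the axes span exactly the limits given. The sketch below assumes the same limits as the recipe; the variable names usr_regular and usr_internal are only illustrative:

```r
# Compare the default "regular" axis style with the "internal" style
old_par <- par(mfrow = c(1, 2))   # two plots side by side

plot(cars$dist ~ cars$speed, xlim = c(0, 30), ylim = c(0, 140),
     xaxs = "r", yaxs = "r", main = 'Style "r" (default)')
usr_regular <- par("usr")         # plot-region extents: x1, x2, y1, y2

plot(cars$dist ~ cars$speed, xlim = c(0, 30), ylim = c(0, 140),
     xaxs = "i", yaxs = "i", main = 'Style "i"')
usr_internal <- par("usr")

par(old_par)                      # restore the previous layout

print(usr_regular)
print(usr_internal)
```

Printing par("usr") after each plot shows the extents of the plotting region, which makes the padding added by the regular style easy to spot.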
There's more...
Instead of the plot(y~x) notation used in the preceding examples, you can also use plot(x,y), with the x variable given first. For more details on all the arguments the plot() command can take, see the help documentation by typing in ?plot or help(plot) at the R prompt, after plotting the first dataset with plot().
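The two notations differ only in the order of the variables, which is easy to get backwards. A short sketch showing both calls for the cars data:

```r
# Formula notation: y ~ x
plot(cars$dist ~ cars$speed)

# Positional notation: x first, then y
plot(cars$speed, cars$dist)
```

Both calls put speed on the x axis and distance on the y axis.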
If you want to plot another set of points on the same graph, say from another dataset or the same data points but with another symbol on top, you can use the points() function:
points(cars$dist~cars$speed,pch=3)
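Note that points() adds to an existing plot, so a plot() call must come first. The following sketch overlays a subset of the same data in a different symbol and color; the subset fast (cars faster than 20 mph) is a hypothetical choice made only for illustration:

```r
# Base scatter plot of all observations
plot(cars$dist ~ cars$speed, pch = 19, col = "red")

# Hypothetical subset for illustration: cars travelling faster than 20 mph
fast <- cars[cars$speed > 20, ]

# Overlay the subset on the existing plot with a different symbol and color
points(fast$dist ~ fast$speed, pch = 3, col = "blue")
```

Because the overlay is drawn on the axes set up by plot(), any points that fall outside those axis limits will not be visible.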
A note on R's built-in datasets
In addition to the cars dataset used in the example, R has many more datasets, which come as part of the base installation in a package called datasets. To see the complete list of available datasets, run the data() function at the R prompt:
data()
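The listing can also be captured as an object and searched, which is convenient when looking for a particular dataset. A minimal sketch, where available and has_cars are illustrative names:

```r
# List only the datasets shipped in the datasets package
available <- data(package = "datasets")$results

# Show the name and title of the first few entries
head(available[, c("Item", "Title")])

# Check that cars is among them
has_cars <- "cars" %in% available[, "Item"]
print(has_cars)
```

Once you have found a dataset of interest, typing its name preceded by a question mark (for example, ?cars) opens its help page, which describes the variables and their units.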
See also
Scatter plots are covered in a lot more detail in Chapter 4, Creating Scatter Plots.