Minitab Cookbook
上QQ阅读APP看书,第一时间看更新

Creating a subset of data in a new worksheet

Here, we will want to obtain a smaller set of data from a large worksheet. This is useful when we do not want to analyze all the data, or maybe only a small portion is of interest to us. We will create a subset of this worksheet to look only at the most recent years, that is, from 2000 onwards.

Getting ready

The data shown in the following screenshot is from the Met office (http://www.metoffice.gov.uk/climate/uk/stationdata/) and shows weather details for the Oxford weather station on a monthly basis from 1853. We will subset the data to view the results from 2000 onwards. Select the Oxford station data and copy the data into Minitab.

Note

When copying the data into Minitab from the website, only select the information for the year and weather. Copying the column header information will create text columns. Minitab only allows one header for column names, and the dataset here has a header for names and a second header for units. It is advised that we only grab the information for now and rename the columns afterwards.

The Oxford weather (cleaned).mtw file is provided in the code bundle.

Getting ready

How to do it...

The following instructions will generate a new worksheet of weather data 2000 onwards:

  1. Go to the Data menu and click on Subset Worksheet.
  2. Rename the new worksheet as Temperature for 2000 onwards and click on the Condition button.
    How to do it...
  3. In the Condition section, double-click on the Year column to move this into the Condition section, and then add >= 2000, as shown in the following screenshot. Then click on OK twice.
    How to do it...

How it works…

Subset worksheet can create a new worksheet based on an entered condition, row numbers, or selected data that we have brushed on a chart.

This gives a bit more control than split worksheet, and we only generate one worksheet.

There's more…

With this command, data can be excluded or included very quickly.

Row numbers can be specified in the Row numbers field, either singly, with a space between each value (for example, 2 4 8 9 10), or a range of row numbers can be entered by using a colon (for example, 100:150).

The Condition section works in the same way as the calculator. Multiple conditions can be added with the use of And, Or, and Not. Text values should be referred to within double quotes (for example, "April") and date formatted data should be used as DATE("05/11/2000").

See also

  • The Splitting a worksheet by categorical column recipe