Measures of Central Tendency

 

Statistical analysis often begins with a description of the typical values of variables: their means, medians, and modes, collectively called the measures of central tendency. The arithmetic mean is the measure of central tendency commonly referred to as the 'average.' The median was discussed by Gauss in 1816 and the idea was elaborated by Fechner in 1878. Fechner called the median Centralwerth, the central value of an ordered series of scores, symbolized by the letter C. The mode, Dichtestewerth, was defined as the locus of a distribution where it is densest, symbolized by the letter D.

Learning Statistics by Doing Statistics

Task A

Professor Stanley administered a statistics test to seven students in his class. The students earned the following scores: 10, 6, 6, 6, 8, 9, 7. Our first task is to find the mean, the median, and the mode.

1. Data Entry

Start a new project by choosing (Projects, New Project). Next, choose (Data, Enter) to bring up the Data Entry window. Click on the default letter A and label the first variable as X. Press the Enter key. The cursor will be advanced to the first data cell.

Enter the following seven scores (10, 6, 6, 6, 8, 9, 7). Remember to press the Enter key following each data entry. Note that the cursor should be on row 8, marked with the pencil tip.

Click the Accept button and the data set will be transferred to the Vector Display. By default, the associated descriptors will be displayed toward the bottom. Note that there are 7 cases (n = 7). However, only six cases are visible. Click on Expand to show all 7 cases.

Drag the top blue bar to move the Vector Display window to a preferred position.

 

2. Name the Project

Choose (Projects, Project Name). Type Measures of Central Tendency in the textbox. Click Accept.

3. The Median and the Mode

A distribution means the arrangement of any set of scores in order of magnitude. First, arrange the scores in order from smallest to largest. Select (Modify, Sort). Click Sort Order and the Sort in Ascending Order option will appear. Select the variable X and click Append. The ordered series of scores will be added and the new variable is automatically labeled as SortAsc.

Note that this variable has an odd number of cases (n = 7), and the median represents the midpoint of a distribution. In the ordered series, the middle (fourth) score is 7, so the median (C) is 7. Half of the other scores are below it and half are above it. Also, notice that the score of 6 occurs three times. It is the most frequently occurring score in the distribution of the variable. Thus, the mode (D) is 6.
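As a cross-check outside the software, the sorting, median, and mode steps can be sketched in Python. This is only an illustration using the standard library; the variable names (scores, sort_asc, middle) are our own and are not part of the software described here.

```python
from statistics import median, mode

# Scores from Task A, in the order they were entered.
scores = [10, 6, 6, 6, 8, 9, 7]

# Arrange the scores from smallest to largest (the SortAsc step).
sort_asc = sorted(scores)

# With an odd number of cases, the median is the middle score.
n = len(sort_asc)
middle = sort_asc[n // 2]

print(sort_asc)        # [6, 6, 6, 7, 8, 9, 10]
print(middle)          # 7
print(median(scores))  # 7
print(mode(scores))    # 6
```

The index n // 2 picks the fourth score because Python counts positions from zero.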

4. The Mean and the Median

Select Descriptors: Click on any descriptors on the Vector Display to bring up optional descriptors. Select Sum, Median, and click Accept.

    

The mean (M) can be defined as the sum (∑) of all scores, divided by the number of cases (n). Thus, M = 52/7 = 7.429. Note that the mean is pulled higher than the median (7) because of the three high scores (8, 9, and 10).
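The arithmetic above can be verified with a short Python sketch. It assumes nothing beyond the seven scores from Task A; the names total, mean_score, and median_score are illustrative.

```python
# Scores from Task A.
scores = [10, 6, 6, 6, 8, 9, 7]

total = sum(scores)      # the sum of all scores
n = len(scores)          # the number of cases
mean_score = total / n   # M = sum / n

# Median for an odd number of cases: the middle of the sorted scores.
median_score = sorted(scores)[n // 2]

print(total, n)                   # 52 7
print(round(mean_score, 3))       # 7.429
print(mean_score > median_score)  # True
```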

5. Create a Frequency Table

The frequency of a given score is the number of times the score occurs. A frequency table can be constructed by listing scores in ascending order with their corresponding frequencies. Choose (Frequencies, Frequencies) from the top menu. Select the variable SortAsc and click Accept. Examine the frequency table. The lowest value is 6 and the highest value is 10. Note that there are three students who earned a score of 6, one student who earned a score of 7, one student who earned a score of 8, and so on. The mode is 6.
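For comparison, a frequency table for the same scores can be built in Python. This is a minimal sketch using collections.Counter from the standard library; it is not how the software computes its table, and the names freq, mode_score, and mode_count are our own.

```python
from collections import Counter

# Scores from Task A.
scores = [10, 6, 6, 6, 8, 9, 7]

# Count how many times each score occurs.
freq = Counter(scores)

# List the scores in ascending order with their frequencies.
for score in sorted(freq):
    print(score, freq[score])

# The mode is the score with the highest frequency.
mode_score, mode_count = freq.most_common(1)[0]
print(mode_score, mode_count)  # 6 3
```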

  

6. Rename a Variable Using a Shortcut

Click on any variable name on the Vector Display to bring up the Specify Column Names dialog box. Erase the variable name, argFreq. Rename it to Test Scores. Click Accept.

7. Visualize a Frequency Distribution

Create a line chart with the test scores plotted on the horizontal axis and the frequency plotted on the vertical axis. These two variables, Test Scores and Frequency, are required to plot the line graph. Select Graphs from the Modules bar. Click Line Graphs under the Graphs Indexed by Attributes category. Abscissa: Select the variable Test Scores. Ordinate: Select the variable Frequency. Click Accept. Change the default 3D graph to a 2D graph by clicking on the 3D/2D icon. The resulting line graph would look like this.

Note that the chart has a single peak. The distribution is not symmetric. The tail is toward larger values. The distribution is skewed to the right. It has a positive skew. In general, the mean will be higher than the median when a distribution has a positive skew.

8. Label the Axes and the Chart Using Shortcuts

To label the X-axis, right-click a data value on the X-axis to bring up a shortcut menu. Select Edit title and type Test Scores in the textbox. Click anywhere outside the textbox to exit.

To label the Y-axis, right-click a data value on the Y-axis to bring up a shortcut menu. Select Edit title and type Frequency in the textbox. To label the chart, right-click the Chart area (the gray area outside the plot) to bring up a shortcut menu and select Edit title. Type Distribution of Test Scores in the textbox. You may copy and paste the chart to a word processor. Close the Chart window.

9. Produce Descriptive Statistics and Frequency Tables

Choose (Analysis I, Descriptive Statistics). Select the variable X. Select n, Median, Mean, List Data, and click Enter. The data set and the associated descriptive statistics will be displayed in the output window. To obtain the frequency table, choose (Frequencies, List Frequencies). Select the variable X and click Accept. The frequency table will appear, along with other related tables (cumulative frequencies and proportions). Copy and paste the results to a word processor.

 

Cumulative Frequency: The cumulative frequency is the running total of frequencies. For example, the cumulative frequency for the score of 7 is 3 + 1 = 4. The cumulative frequency for the score of 9 is 3 + 1 + 1 + 1 = 6.

Relative Frequency: Frequency counts can be measured in terms of proportions or percentages. For example, there were three students who earned a score of 6. The total number of students is 7. Thus, 3 / 7 = .43 = 43%. About 43 percent of the students scored 6 on the test.

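Cumulative Proportions: The cumulative proportion is the cumulative frequency divided by the number of cases. For our example, the cumulative proportion for the score of 9 is 6 / 7 = .86, so approximately 86% of the students had a score of 9 or less.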
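The three related tables can be reproduced with a short Python sketch. It assumes only the standard library; the column layout and the names cum_counts, proportions, and cum_props are illustrative and do not mirror the software's output format.

```python
from collections import Counter
from itertools import accumulate

# Scores from Task A.
scores = [10, 6, 6, 6, 8, 9, 7]
n = len(scores)

freq = Counter(scores)
values = sorted(freq)               # distinct scores, ascending
counts = [freq[v] for v in values]  # frequency of each score

# Cumulative frequency: the running total of the frequencies.
cum_counts = list(accumulate(counts))

# Relative frequency and cumulative proportion: divide by n.
proportions = [c / n for c in counts]
cum_props = [c / n for c in cum_counts]

print("score  freq  cum_freq  prop  cum_prop")
for v, f, cf, p, cp in zip(values, counts, cum_counts, proportions, cum_props):
    print(f"{v:>5}  {f:>4}  {cf:>8}  {p:>4.2f}  {cp:>8.2f}")
```

In the printed table, the row for the score of 9 shows a cumulative frequency of 6 and a cumulative proportion of 0.86, matching the figures above.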

10. Compute the Median for an Even Number of Cases

The median is computed differently for odd and even numbers of cases. If the number of scores in the distribution is even, the median is the middle value interpolated between the two adjacent scores at the theoretical midpoint of the distribution. This interpolation is usually accomplished by averaging the two adjacent scores.

To create an even set of numbers, use the Truncate command to shorten the length of the variable X.

(1) First, delete the variables on the Vector Display except the variable X. The three variables to be cleared are SortAsc, argFreq, and Frequency. Choose (Data, Delete). Click the variable SortAsc and hold down the Shift key while pressing the Down Arrow key twice to highlight the variables we want to remove. Release the Shift key and click Clear and Compact or Clear Selected Variables as shown below. 

 

 


(2) Truncate the variable X: Adjust the length of the variable X to six cases. Choose (Reshape, Truncate). Select the variable X. Define Length: 6. Click Accept.

(3) Sort Data: A distribution means the arrangement of any set of scores in order of magnitude. To form a distribution of the scores, sort the scores in ascending order. Choose (Modify, Sort). Click Sorting Order to select Sort in Ascending Order. Select the variable X and click Append.

(4) Compute the median for the even set of numbers. Examine the ordered series of scores. There are six scores (n = 6). The two middle values are 6 and 8. To find the value halfway between them, add them together and divide by 2: C = (6 + 8) / 2 = 7.
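Steps (2) through (4) can be sketched in Python as follows. The sketch assumes that Truncate keeps the first six cases and drops the last one, which is consistent with the middle values of 6 and 8 reported above; the variable names are our own.

```python
from statistics import median

# Scores from Task A, in the order they were entered.
scores = [10, 6, 6, 6, 8, 9, 7]

# Assumption: truncating to length 6 keeps the first six cases.
truncated = scores[:6]

# Sort in ascending order to form the distribution.
ordered = sorted(truncated)

# With an even number of cases, average the two middle scores.
n = len(ordered)
lower, upper = ordered[n // 2 - 1], ordered[n // 2]
c = (lower + upper) / 2

print(ordered)            # [6, 6, 6, 8, 9, 10]
print(lower, upper)       # 6 8
print(c)                  # 7.0
print(median(truncated))  # 7.0
```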

Task B

Throw a coin 15 times. How many heads are you likely to see? To summarize and describe the result of your experiment, create a frequency table and a bar chart.

1. Animated Coin

Start a new project. Choose (Animations, Animated Coin) to start tossing a coin. Click the End button to terminate the experiment when the number of throws is 15.

The result will appear on the Vector Display. There are two possible results. Heads are coded as 1. Tails are coded as 0. Note that the result will not be the same as shown below due to random chance.
 

 

Binary numbers are defined as numbers taking on only 0 and 1 values. The variable Throw is a binary variable. Count the number of zeros. Count the number of ones. Choose (Frequencies, Frequencies) and select the variable Throw to obtain a frequency table.

Frequency counts can be measured in terms of proportions or percentages. Next, choose (Frequencies, Proportions) and select the variable Throw.
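The coin experiment can also be simulated in Python. This is a rough sketch that assumes a fair coin and uses the standard random module, which is not necessarily how the Animated Coin generates its throws. The seed value is arbitrary and only makes the run repeatable; as in the software, the counts will differ from run to run when the seed is changed.

```python
import random
from collections import Counter

# Arbitrary seed so that the run can be repeated (an assumption of this sketch).
rng = random.Random(2024)

# Throw the coin 15 times: heads are coded as 1, tails as 0.
n_throws = 15
throws = [rng.randint(0, 1) for _ in range(n_throws)]

# Frequency table: count the zeros and the ones.
freq = Counter(throws)
tails, heads = freq[0], freq[1]

# Proportions: each frequency divided by the number of throws.
props = {0: tails / n_throws, 1: heads / n_throws}

print(throws)
print("tails:", tails, "heads:", heads)
print("proportions:", {k: round(p, 2) for k, p in props.items()})
```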

Create a bar chart with the number of heads on the horizontal axis and the percentage frequency on the vertical axis. Choose Graphs. Select Bar Graphs under the Graphs Indexed by Attributes category. Abscissa: arg Prop. Ordinate: Prop. Right-click a data value on the Y-axis to bring up a shortcut menu. Choose Properties. Select the Scale tab, click the drop-down arrow next to Format, and select Percentage. Finally, label the chart and the axes as shown below.