LESSON TWO

Procedure Frequencies  

In this lesson, first we will learn how to enter data, edit the variable name, and assign labels and values. Next, we will create a frequency table, produce descriptive statistics, and plot a histogram.

Quantitative Data

How to Enter Data

Example 1 Dr. Smith gave a 20-item quiz to ten students.

 

Produce a frequency table for this small data set. 

 SPSS for Windows

 A. Open a new SPSS Data Editor window: File / New / Data.

 B. How to Enter data

In the Data View

     a. A heavy border appears around the first data cell in the first column.

  b. Type the first score `16`.

Note that this value will appear in the cell editor. Press the Enter key. Wait a moment. The data value will appear in the cell. Continue entering all the remaining data values.

 

 

C. By entering a value in the first column, you automatically create a new variable with the default name var00001.

D. Create your own variable names

 Click the Variable View tab as shown below

  

 The Variable View window will appear.

 

To edit the variable name, double click on var00001. Delete var00001 and type in `score` to replace the default one.

 Next, click the Data View tab to return to the data view window.   

E. Choose a statistical procedure.

    a. From the main menus choose: Analyze / Descriptive Statistics / Frequencies

[To know more about Procedure Frequencies, click on the Help button. Close the help topic window when you are done.]

     b. Select the variable ‘score’ to be analyzed. By default, SPSS will display the frequency table.

     c. Click the Statistics button and select the statistics you want SPSS to compute as shown below.

   


Percentile Values: Select Quartiles. Quartiles are points which divide a distribution of scores into quarters.

Dispersion: Select Standard deviation, Variance, Range, Minimum, and Maximum.

Central Tendency: Select Mean, Median, Mode, and Sum.

Distribution: Select Skewness and Kurtosis.

d. Click Continue to return to the Frequencies dialog box. Click the Charts button. Because the variable "score" is continuous, select Histograms with normal curve as shown below.




Click Continue and OK.

SPSS Output

Descriptive Statistics

 

1. Measures of Central Tendency and Variability

Measures of Central Tendency: The single most representative value in the distribution of scores.

(1) The median, the central value of a set of ordered scores, is 15. It divides the distribution of scores into equal halves: at least half of the students scored 15 or less, and at least half scored 15 or more.

(2) The most frequently occurring score is 16. It appears four times. The mode is 16.

(3) The mean score is 15. The mean can be defined as the sum of all scores, divided by the number of scores: 150/10 = 15.

Measures of Variability: Quantify the degree to which the scores differ from each other. The range, variance, and standard deviation are examples of measures of variability. Note that SPSS computes only the unbiased variance and standard deviation.
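To check these figures outside SPSS, the following is a minimal Python sketch using only the standard library. It assumes the ten scores implied by the frequency table in this lesson (13 once, 14 twice, 15 three times, 16 four times); the order of entry and the variable names are assumptions made for illustration.

```python
import statistics

# Scores implied by the lesson's frequency table; the entry order is assumed.
scores = [13, 14, 14, 15, 15, 15, 16, 16, 16, 16]

mean = statistics.mean(scores)          # sum of scores / number of scores
median = statistics.median(scores)      # middle of the ordered scores
mode = statistics.mode(scores)          # most frequently occurring score
variance = statistics.variance(scores)  # unbiased: divides by n - 1
std_dev = statistics.stdev(scores)      # square root of the unbiased variance
score_range = max(scores) - min(scores)

print("mean:", mean, "median:", median, "mode:", mode)
print("variance:", round(variance, 4), "sd:", round(std_dev, 4))
print("range:", score_range)
```

Both statistics.variance and statistics.stdev divide by n - 1, so they correspond to the unbiased estimates mentioned above.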

Measures of Relative Standing

A student made a score of 13 on the test. Did he or she do well?

Transform obtained scores into standard scores.

(1) Compare a score of 13 to the mean.

 X - M = 13 - 15 = -2.

Is a score of 13 below or above the mean?

(2) Compute the standard score

The unbiased standard deviation (s) equals 1.0541.

z = (X - M) / s = (13 - 15) / 1.0541 = -1.9

A score of 13 is 1.9 standard deviations below the mean. It is not a good score.
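The same standard score can be worked out in a few lines of Python. This is only a sketch: the scores are the ones implied by the frequency table in this lesson, and the helper name z_score is a hypothetical name chosen for illustration.

```python
import statistics

# Scores implied by the lesson's frequency table; the entry order is assumed.
scores = [13, 14, 14, 15, 15, 15, 16, 16, 16, 16]

mean = statistics.mean(scores)
s = statistics.stdev(scores)  # unbiased standard deviation (n - 1)


def z_score(x, mean, sd):
    """Return how many standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


z = z_score(13, mean, s)
print(round(z, 1))
```

A negative result means the score is below the mean; a positive result means it is above.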

2. Shape of the Distribution

Symmetry

A value of zero for the skewness indicates a symmetric distribution. Describe the distribution.

Skewness = -.712. The sign is negative. The distribution is negatively skewed. The tail of the distribution points toward the lower end of the scale. 

Hypothesis Testing

Is the coefficient of skewness significantly different from zero? We will discuss the topic in other lessons.

Peakedness

A value of zero for the kurtosis indicates a shape close to normal.

Kurtosis = -.45. The sign is negative. The distribution is platykurtic. (It is flatter than the normal distribution.) 

Hypothesis Testing

Is the coefficient of kurtosis significantly different from zero? We will discuss the topic in other lessons.
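The skewness and kurtosis coefficients discussed in this section can be reproduced by hand. The sketch below assumes that the bias-adjusted sample formulas are the ones behind the SPSS output (they give the same values, -.712 and -.45, for these data) and uses the scores implied by the frequency table in this lesson. Variable names are illustrative.

```python
import math

# Scores implied by the lesson's frequency table; the entry order is assumed.
scores = [13, 14, 14, 15, 15, 15, 16, 16, 16, 16]

n = len(scores)
mean = sum(scores) / n
deviations = [x - mean for x in scores]

# Unbiased standard deviation: sum of squared deviations divided by n - 1.
s = math.sqrt(sum(d ** 2 for d in deviations) / (n - 1))

sum_cubed = sum(d ** 3 for d in deviations)
sum_fourth = sum(d ** 4 for d in deviations)

# Bias-adjusted sample skewness (0 for a symmetric distribution).
skewness = n / ((n - 1) * (n - 2)) * sum_cubed / s ** 3

# Bias-adjusted sample excess kurtosis (0 for a normal distribution).
kurtosis = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum_fourth / s ** 4
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

print(round(skewness, 3), round(kurtosis, 2))
```

The negative sign of each coefficient carries the interpretation given above: a tail toward the low scores, and a shape flatter than the normal curve.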

Visualization

 "Score" is a quantitative variable. We will visualize the distribution by plotting a histogram later.

3. Quartiles

Quartiles are points which divide a distribution of scores into quarters. The first quartile is the 25th percentile. The second quartile is the 50th percentile or the median. The 75th percentile is the third quartile.

The 25th percentile score is 14.  That is, 25% of the students who took the same test scored at or below a score of 14.

The 50th percentile score is 15.  That is, 50% of the students who took the same test scored at or below a score of 15.

The 75th percentile score is 16.  That is, 75% of the students who took the same test scored at or below a score of 16.
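The quartiles can also be computed in Python (version 3.8 or later). This is a sketch under two assumptions: the scores are those implied by the frequency table in this lesson, and the "exclusive" interpolation method of statistics.quantiles is used because it gives the same three values as the SPSS output for these data.

```python
import statistics

# Scores implied by the lesson's frequency table; the entry order is assumed.
scores = [13, 14, 14, 15, 15, 15, 16, 16, 16, 16]

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = statistics.quantiles(scores, n=4, method="exclusive")

print("25th percentile:", q1)
print("50th percentile:", q2)
print("75th percentile:", q3)

# A different definition gives a different first quartile for the same data.
q1_inclusive = statistics.quantiles(scores, n=4, method="inclusive")[0]
print("25th percentile (inclusive method):", q1_inclusive)
```

The two methods disagree on the first quartile, which illustrates why percentile values can differ between programs and textbooks.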

Optional Reading: Percentiles and Textbook Definitions - Confused or What? by Paul Barrett

Frequency Table

There are only a few data values. The frequency table is shown below.

 Examine the Frequency column.

Count the number of times a score occurs. The frequency associated with the value of 13 is 1. The frequency associated with the value of 14 is 2. The frequency associated with the value of 15 is 3. The frequency associated with the value of 16 is 4. The mode is defined as the most frequently occurring score in the distribution of a variable. Thus, the mode is 16.

 Examine the Cumulative Percent column.

Sixty percent of the students (6 of 10) had a score of 15 or less.
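The Frequency, Percent, and Cumulative Percent columns can be rebuilt with a short Python sketch. It uses the frequencies described above; the layout of the printed table and the variable names are illustrative, not SPSS output.

```python
from collections import Counter

# Scores implied by the frequencies listed above; the entry order is assumed.
scores = [13, 14, 14, 15, 15, 15, 16, 16, 16, 16]

counts = Counter(scores)
n = len(scores)

rows = []
running_count = 0
for value in sorted(counts):
    freq = counts[value]
    running_count += freq
    percent = 100 * freq / n                      # share of all scores
    cumulative_percent = 100 * running_count / n  # share at or below this value
    rows.append((value, freq, percent, cumulative_percent))

print(f"{'Score':>5} {'Frequency':>9} {'Percent':>7} {'Cum. Pct':>8}")
for value, freq, percent, cumulative_percent in rows:
    print(f"{value:>5} {freq:>9} {percent:>7.1f} {cumulative_percent:>8.1f}")
```

The cumulative percent for a score is the percentage of students who scored at or below it, so the row for 15 holds the 60% figure.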

Histogram

"Score" is a quantitative variable, so its frequency distribution is visualized with a histogram. The distribution is slightly skewed to the left: the tail points toward the lower end of the scale (skewness = -.712).

 

Categorical (Qualitative) Data

How to Define Labels and Values

Example 2  Dr. Smith asked fifteen students in his class on what day of the week each of them was born. The results are shown below.

 

 A. Code Data

The variable `dayofwk` is a categorical variable. One way to simplify data entry is to assign numbers or symbols to represent responses.

Assign 1 for 'Sunday', 2 for 'Monday', 3 for 'Tuesday', 4 for 'Wednesday', 5 for 'Thursday', 6 for 'Friday', 7 for 'Saturday', and 9 for 'Missing'.

B. The most common method of representing frequency of categorical membership is a bar chart. Our task is to produce a bar chart.
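The coding scheme from step A can be written down as a simple lookup table. The Python sketch below is illustrative only: the function name code_response is hypothetical, and the sample answers are made up for the demonstration; they are not Dr. Smith's data.

```python
# Coding scheme from the lesson: 1 = Sunday ... 7 = Saturday, 9 = missing.
DAY_CODES = {
    "Sunday": 1, "Monday": 2, "Tuesday": 3, "Wednesday": 4,
    "Thursday": 5, "Friday": 6, "Saturday": 7,
}
MISSING_CODE = 9


def code_response(answer):
    """Return the numeric code for a day name, or 9 if no valid day was given."""
    return DAY_CODES.get(answer.strip().title(), MISSING_CODE)


# Hypothetical answers for illustration only (not the data from Example 2).
answers = ["Thursday", "monday", "", "Saturday"]
codes = [code_response(a) for a in answers]
print(codes)
```

Entering short numeric codes instead of full day names reduces typing and spelling errors, which is the point of coding the data before entry.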

 

SPSS for Windows

A. Open a new Data Editor window. From the menus choose: File / New / Data.

B. Define variable names, variable labels, value labels, and user-missing values.

a. Name: Define the variable name. Click the Variable View tab. Double click on the textbox. Type in the variable name “dayofwk” as shown below

   

             b. Label: Assign the variable DAYOFWK an extended descriptive label DAY OF THE WEEK.

             Double click on the textbox. Type in the long label DAY OF THE WEEK as shown below

   

             c. Values: Assign descriptive labels to values.

   Double click on the textbox and a gray square will appear. Click on the gray square as shown below

             The Value Labels dialog box will appear.

            

            (a) Click inside of the Value text box and type 1.

            (b) Press the Tab key or click inside of the Value Label text box and type Sunday.

            (c) Click on Add. The value label is added to the list as shown below.

            

(d) Continue entering the other values (2 to 7) and their descriptive labels (Monday to Saturday).

 Finally, click OK to close the Value Labels dialog box.

            d. Missing Values.

The easiest way to handle missing data is to leave the cells blank when entering data. To distinguish among different types of missing data, define user-missing values. For example, we can code a respondent’s forgetfulness as 9 and a respondent’s refusal as 99.

Double click on the textbox and a gray square will appear. Click on the gray square as shown below

 

 The Missing Values dialog box will appear.

 (a) Select Discrete missing values.

 (b) Type 9 in the first text box.

            

             
(c) Click OK.  

e. Measure

 Click on Measure. A down arrow will appear.

            

Click on the down arrow and choose Nominal. (In our example, the order of the days of the week is not important.)

    

    Finally, click on the Data View tab and return to the Data View window.  

C. Enter data values.

D. Procedure frequencies

             a. From the menus choose: Analyze / Descriptive Statistics / Frequencies

             b. Highlight the variable ‘dayofwk’ and click on the > pushbutton.

             c. Click on Charts. This opens a Frequencies: Charts dialog box.

             (a) Select Bar Chart(s). Note that the variable `dayofwk` is a categorical variable.

             (b) Click Continue or press the Enter key. Return to the Frequencies dialog box.

d. Click OK in the Frequencies dialog box. The frequency table and the chart will be displayed in the Viewer window.

            

 About 27% of students were born on Thursday.
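To see how value labels and a user-missing code affect a frequency table, here is a rough Python sketch. The coded responses in it are hypothetical and chosen only for illustration; they are NOT the fifteen responses from Example 2, so the percentages will not match the SPSS output. The column names Percent and Valid Percent follow the usual layout of a frequency table.

```python
from collections import Counter

# Value labels and user-missing code as defined in this lesson.
DAY_LABELS = {1: "Sunday", 2: "Monday", 3: "Tuesday", 4: "Wednesday",
              5: "Thursday", 6: "Friday", 7: "Saturday"}
MISSING_CODES = {9}

# Hypothetical coded responses for illustration only (not Dr. Smith's data).
codes = [5, 2, 9, 5, 7, 1, 5, 9]

# Cases with a user-missing code are left out of the valid counts.
valid = [c for c in codes if c not in MISSING_CODES]
counts = Counter(valid)

table = []
for code in sorted(counts):
    freq = counts[code]
    percent = 100 * freq / len(codes)        # share of all cases
    valid_percent = 100 * freq / len(valid)  # share of non-missing cases
    table.append((DAY_LABELS[code], freq, percent, valid_percent))

# For nominal data the mode is the category with the highest frequency.
mode_label = DAY_LABELS[counts.most_common(1)[0][0]]

for label, freq, percent, valid_percent in table:
    print(f"{label:<10} {freq:>3} {percent:>6.1f} {valid_percent:>6.1f}")
print("Mode:", mode_label)
```

In this made-up sample two of the eight cases are missing, so each Valid Percent is larger than the corresponding Percent.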

SPSS Output

 

 

Visualize the frequency distribution of the categorical variable, Day of the Week: Bar chart.

Examine the highest point in the chart.
The mode is _____.  

The mode is Thursday. Note that the mode is often used when the data are on a nominal scale. It is the simplest measure of location of a distribution.