eBabbie Resource Center

SPSS Learner's Guide

SPSS 11.0 for Windows

Getting Started
Opening a Data File
Saving Changes
Getting Around with SPSS Windows
Frequency Distributions
Cross-Tabulations
Recoding Variables
Multivariate Tables
Tests of Statistical Significance
Correlation and Regression
Creating Indexes
Graphics
Making Copies of Results for a Paper
Shutting Down

    This guide will provide you with a brief overview of the Statistical Package for the Social Sciences™ (SPSS). For many years, SPSS has been the most commonly used program for quantitative data analysis in the social sciences. It has gone through many versions for both the Windows and Macintosh platforms. This guide will use SPSS 11.0 along with data from the 2000 General Social Survey. If you’re using a different version of SPSS or a different data set, you’ll need to make some adjustments, but this guide nonetheless introduces you to the overall logic and application of SPSS. Whatever version you have, consult the user manual for whatever additional assistance you need.

    In addition, several books have recently been written to introduce social researchers to SPSS. One is Earl Babbie, Fred Halley, and Jeanne Zaino, Adventures in Social Research, Thousand Oaks, CA: Pine Forge Press, 2000.

Getting Started

    When you first open SPSS 11.0 for Windows on your computer, the dialog box shown in Figure 1 below will appear. You have several options to start with. You can select “Run tutorial” and click OK. This is a good way to get familiar with some of the basic features of SPSS. You also have the option to create your own data. You would choose this option if you had conducted your own survey and were ready to enter your respondents’ answers into an SPSS data spreadsheet. If you are using the GSS data, you will need to ask SPSS to open a data file already formatted and ready for analysis. GSS files often come as SPSS files (recognizable by their .sav extension) or as SPSS portable files (saved as .por files).

Figure 1. Opening SPSS

fig01

     The next step is to load some data into the program.


Opening a Data File

     When “open an existing data source” is selected, click OK. You will have to browse your computer and select the location in which your GSS file is saved. Notice the bottom of this window: you can opt not to use this dialog box in the future. If you do, then when you start SPSS the window in the background will appear, and you can simply click on “open” and select the GSS data file you want to analyze. The advantage of this dialog box is that after you first open the GSS file, SPSS will list your file in the place where “More files…” is highlighted. In other words, if you need to work on this file again, you can simply select it with your mouse and click OK. SPSS will open your file in a single step.

     Let me guide you through the process of opening the GSS file included in your textbook package. Simply close the window in Figure 1. Look at the top row in the figure below. It contains several menus: File, Edit, View, Data, Transform, Analyze, Graphs, Utilities, Window, and Help. We’ll use these menus throughout this guide.

     Notice that the first letter of each menu name is underlined (e.g., File). This means you can activate that menu by holding down the ALT key and typing the underlined letter. Thus if you press ALT+F, you activate the File menu. You can accomplish the same thing by simply clicking the menu name. In either fashion, you would end up with the screen shown in Figure 2.

Figure 2. Opening the File Menu

fig02

    As you can see, the File menu offers several possible actions, but right now we’re only interested in opening a file. Click Open to indicate that you want to load a data file into the program. As you may recognize from the notation to the right of the command, we could have accomplished the same thing by striking CTRL+O without even going into the File menu.

    Next you’ll see a dialog box asking you to select among several options. Open the “data” menu. Your steps in browsing your computer to open the data file are illustrated in Figures 3 and 4 below.

Figure 3. Picking a Data Set

fig03


    Select the data set you want by double-clicking it or by single-clicking it and then clicking Open. We’ve selected the 2000 General Social Survey (GSS) data set (located on the zip drive under the folder “2000”), containing hundreds of variables collected from 2,817 respondents. The research was conducted by the National Opinion Research Center at the University of Chicago to establish a representative sample of U.S. residents 18 and older. You, of course, may be working with a different data set, such as the one that came with this book.  In your case, open the CD drive to open your data file.

Figure 4. Changing the File Type

fig04

    It is important to note that SPSS is set to open .sav files by default. In my case, the GSS 2000 data file was created as a .por file. I had to change the file type to “SPSS Portable (*.por)” as illustrated in Figure 4 above. When my GSS2000.por file appeared, I made sure it was highlighted and clicked “open.” See Figure 5 below.

Figure 5. Choosing an SPSS Portable Data Set

fig05



Saving Changes

    When you modify your data set (creating recoded variables, for example), you’ll probably want to save those changes for later use. Realize that any such changes will stay in effect throughout this SPSS session, but when you exit the program (see the last section of this guide) you will lose any changes you have not saved. It’s wise to save changes as soon as you’re sure you want to keep them.

    Saving an altered file is hardly rocket science. First, select the Data View window or the Variable View window (not the Output window which contains all statistical jobs you asked SPSS to perform on your data). Then, under the File menu, select Save. Alternatively, you can simply press CTRL+S. From now on, when you open this file, it will contain your alterations.

    Realize that when you save the file in this fashion, the changed file replaces the original one. So if you madly deleted data or altered variables using their original names, you’ll have put the original file forever out of reach.

    If you wish to save the original file as well as your changes, choose Save As under the File menu (see Figure 6). This time, SPSS will ask you to supply a name for the data set about to be saved. Use some name other than that of the original data set. Also, pay attention to where on your disk it is saved so you can find it later on.

Figure 6. Saving an SPSS Portable Data Set

fig06

    You may also want to save your data file as an SPSS file if it is a portable file or other type of data spreadsheet. In Figure 7, I saved my GSS2000.por as a GSS2000.sav which will save me some time in processing this file in the future.  There won’t be a need to wait for conversion time.

Figure 7. Saving as a .sav file

fig07



Getting Around with SPSS Windows

    By default, the window that appears is the “Data Editor” window.  No matter what data file you open, the setting of this spreadsheet is always the same. Each of the columns represents a variable, such as the respondent’s gender, age, or attitude about abortion. Each row represents a particular respondent. Thus each cell of the matrix stores some item of information about a person.  In Figure 2, all the cells are empty.

    Once you’ve instructed SPSS to open your data set, the original matrix will be filled with data, the way it is in Figure 8 below. Notice that the row just above the matrix now contains the names of the variables comprising the data set: hrs1, wrkgovt, and so forth. SPSS uses abbreviated labels, each no more than eight characters long.

Figure 8. A Full Data Matrix

fig08
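
    If it helps to think of the data matrix in programming terms, here is a minimal Python sketch of the same layout. It is not part of SPSS, and it is only an illustration: the variable names hrs1 and wrkgovt come from the data set, but every value is made up except the first respondent's score of 2 on wrkgovt. The helper name cell is hypothetical.

```python
# A toy data matrix: each row is one respondent, each column one variable.
# Variable names hrs1 and wrkgovt come from the guide; the values are
# hypothetical, except that respondent 1 scores 2 on wrkgovt.
variable_names = ["hrs1", "wrkgovt"]
rows = [
    [40, 2],   # respondent 1
    [35, 1],   # respondent 2 (made up)
    [50, 2],   # respondent 3 (made up)
]

def cell(case_number, variable):
    """Return the value stored for one respondent on one variable."""
    column = variable_names.index(variable)
    return rows[case_number - 1][column]  # case numbers start at 1

print(cell(1, "wrkgovt"))  # prints 2
```

    Each lookup needs both a case number and a variable name, which is exactly what a single cell of the Data Editor represents.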

    Notice in Figure 8 the upper-right cell, which links case number one (respondent number 1) with the variable labeled wrkgovt. The first respondent has a value of 2 on the variable wrkgovt. But what on earth does that mean?

    This is where SPSS 11.0 is different from earlier versions. In addition to the “Data View” sub-window, the Data Editor window has a Variable View sub-window. You can simply click on “Variable View” at the bottom of the Data Editor window. In Figure 9 below, you can see how this window is organized. This time, the rows represent the variables and the columns are the various attributes associated with each variable. The columns are as follows: Name, Type, Width, Decimals, Label, Values, Missing, Columns, Align, and Measure. The Name is the abbreviated name of the variable (it is never longer than 8 characters). The Type of the variable is often “numeric” but could be “string” if you wanted to input words as data instead of numbers. Label is the description of the variable and indicates more clearly what the question on the questionnaire was about. In the Values column you can find the values associated with each possible answer for each variable. Notice also that you can increase the width of any of these columns in the Variable View of the Data Editor. This feature is particularly useful if you want to read a variable label in full.

Figure 9. A Full Variable Matrix

fig09

    You can find the meaning of a particular variable label in several ways. First, and easiest, by finding the variable name on the variable view. If you were on the data view, you can double-click the variable name in the column heading and the variable view will open automatically and highlight the row of the variable you just double-clicked on. Here’s another way to learn about variables in the data set. Go to the Utilities menu above the data matrix and select the first option, Variables.   See Figure 10 below for an illustration.  A list of all the variables and the way they were formatted will appear and you can simply select the variable of your choice from this window by clicking on it.  Here I selected the divorce variable.


Figure 10. Decoding the Variable Divorce

fig10

    Variables are listed in the order they were imported from the GSS site. However, if you want to see them listed alphabetically, open the Edit menu and select Options. Then, as in Figure 11 below, select Alphabetical from the Variable Lists section of the General tab. Then click OK twice. Note: The list in the left column may consist of the variable names instead of the abbreviated labels, but you can change this easily. In the Edit menu, select Options. In the General tab, find the section on Variable Lists in the right-hand column. Click Labels. You’ll have to reload the data set, but it will be worth the effort, because you’ll be able to track down the abbreviated name you’re looking for.


Figure 11. Sorting the Variable List

fig11

    Now reopen the variable information window from the Utilities menu. All variables are listed alphabetically. Notice the words next to Variable Label: “EVER BEEN DIVORCED OR SEPARATED.” While this is still abbreviated, you can figure out that it represents whether this person has been divorced or separated. You can also view the value labels instead of the numeric values they have been coded as. See Figure 12 below. Simply make sure you are in the Data View, select View from the menu, and click on Value Labels. A check mark will appear next to this menu selection and you will be able to read directly what your respondents’ answers were for each variable. For instance, we now can see that respondent number 1623 is female, 50 years old, and married to a man who is 51 years old. To turn it off, by the way, simply open View and click Value Labels again. Notice that the check mark indicates whether the feature is on or off.


Figure 12. Viewing Variable Value Labels

fig12

    In Figure 13 you can see the numeric values “1” and “2” for divorce replaced by “Yes” and “No” respectively.


Figure 13. Viewing Variable Value Labels on Data View

fig13
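
    The idea behind value labels can be sketched in a few lines of Python. This is only an illustration, not how SPSS stores labels internally: the coding 1 = Yes and 2 = No for divorce comes from Figure 13, while the list of responses and the helper name with_labels are hypothetical.

```python
# Value labels map numeric codes to readable answers.
# The 1 = Yes, 2 = No coding for divorce is taken from Figure 13;
# the list of responses below is hypothetical.
divorce_labels = {1: "Yes", 2: "No"}
responses = [2, 1, 2, 2, 1]

def with_labels(codes, labels):
    """Replace each numeric code with its label; keep unknown codes as-is."""
    return [labels.get(code, code) for code in codes]

labeled = with_labels(responses, divorce_labels)
print(labeled)  # ['No', 'Yes', 'No', 'No', 'Yes']
```

    Notice that the underlying codes are never changed; the labels are only a different way of displaying them, just as with the Value Labels option in the View menu.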

    You can obtain the full wording most easily from the GSS Web site. The codebook index (by variable name) is located at http://www.icpsr.umich.edu/GSS. Click on “Mnemonic.” That will take you to a list of variables beginning with a.  You can see any other variable by simply clicking on the first letter of the variable you are looking for.
 
    Let’s take a close look at the variable abany, which we will use in our analysis further on. Now you can see more clearly what this variable represents. Respondents were asked a battery of questions concerning their attitudes toward abortion—specifically, the conditions (e.g., rape, danger of birth defects) under which they felt a woman should be able to obtain one legally. In this case, they were asked if they would support a woman’s right to a legal abortion as a purely personal choice: “for any reason.”

    Besides presenting the actual wording of the question, this Web page also reports the answer categories and the results of several surveys that asked the question over the years. Notice that a 1 “punch” stands for saying “yes.” Now we know that the first person in the data set feels that a woman should be able to choose an abortion for any reason.

    Let’s go back to the variable abany. To find out what “0” means, let’s learn how to examine variable codes within SPSS. Double-click abany in the column heading. The Variable View window opens and the variable abany is automatically highlighted. Now click on the little square in the Values cell for abany, as shown in Figure 14 below.


Figure 14. Viewing Decoding Value Labels on Variable View

fig14

    As you can see in Figure 15, “0” stands for “NAP,” which means “not applicable.” In other words, this particular question was not asked of some respondents.


Figure 15. Code Labels for Abany

fig15

    Another method to learn about the value labels of a particular variable is to select Variables from the Utilities menu. A list of variables appears alphabetically, as you can see in Figure 16 below. First you’ll see that we have pretty much the same information we obtained before. Notice the column to the left, however. It’s the beginning of a list of all the variables in the data set. (You can use the scroll bar to see the rest of the list.) Find the name of a variable you’re interested in and click it. You’ll instantly get the variable and value labels.

    Again you can view all value labels for abany by simply selecting this variable. The advantage of this subcommand is that you also have the option to find the variable abany quickly in your SPSS Data Editor by clicking on Go To. The abany column will be selected and highlighted instantly in your Data View window.


Figure 16. Viewing Decoding Value Labels for Abany

fig16

    Person 3, for example, was not asked this question. By asking different sets of questions of different people in the sample, the researchers can collect data for hundreds of variables without driving any of the respondents to suicide or homicide.

Frequency Distributions

    Now that we’ve seen what the abbreviated variable labels and numerical code categories stand for, we’re ready to examine some public opinion. Think about the question we’ve looked at so far. How do you suppose people in the United States feel about a woman’s right to an abortion? That is to say, what percentage do you suppose said “yes” and what percentage said “no”? To start finding out, select Frequencies... from the Descriptive Statistics submenu of the Analyze menu (see Figure 17).


Figure 17. Getting Frequency Distributions

fig17

This command will get you a list of variables to choose from, as illustrated in Figure 18.


Figure 18. Choosing a Variable for Frequencies

fig18

    Now you can double-click a variable label or else single-click it and then click the right-pointing triangular arrow. Either of these actions will move labels from the left-hand to the right-hand column. Figure 19 shows the results of three variables being selected this way.


Figure 19. Selecting Frequency Variables

fig19

    Next we click the OK button. SPSS will zick and whirr as it determines the distributions of responses to each of the three variables in our example. It will then produce the frequency tables shown in Figure 20.


Figure 20. Frequency Distribution Tables

fig20

    Notice that SPSS has now opened a new window. The first was a data window and the new one is labeled Output. As you continue with SPSS, you’ll often work back and forth between these two windows; often the program alternates them automatically.

    The left-hand frame in Figure 20 presents an outline of the results. Click on any item in the outline to fill the right-hand frame with the data you’ve requested.  Here is where you can easily cut and paste from SPSS to a MS Word document.  Click on the outline Frequency Table in the left-hand-side.  All the frequency tables are now selected.  Select Copy objects from the Edit menu.  Now open your Word document and select Paste from the Edit menu.  Figures 21 and 22 illustrate this process.  In this case I was only interested in copying and pasting the frequency table for abany.


Figure 21. Selecting Frequency Distribution Tables

fig21


Figure 22. Copying Frequency Distribution Tables

fig22

    In Figure 20 the right-hand side presents two tables. The first one summarizes the three variables we chose originally. All we are told here, however, is the number of respondents with valid responses and those without. The second table gives the distribution of data for the question of whether a woman should have the right to an abortion for any reason. In addition to “yes” and “no,” the table reports three other possibilities:

        NAP: “Not applicable” (the question was not asked)
        DK: Respondents who said they “Don’t know”
        NA: Respondents who were asked but gave “No answer”

    In the second table’s Frequency column you can learn how many respondents fall under each of the categories. The Percent column puts the information into a more useful form by showing the percentage represented by each category. The most useful column is Valid Percent. This column tells us that of the 1,768 respondents who gave a valid response, 39.9 percent said “yes” and 60.1 percent said “no.” We might interpret these results by saying that opinions on this issue are almost evenly divided.
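
    To make the difference between the Percent and Valid Percent columns concrete, here is a minimal Python sketch of the same calculation. The responses are hypothetical, not GSS data. Codes 0 (NAP) and 1 (yes) are described in this guide; the codes 2 = no, 8 = DK, and 9 = NA are assumptions made for the illustration, as is the function name frequency_table.

```python
from collections import Counter

# Hypothetical abany responses. Codes 0 (NAP) and 1 (yes) are described in
# this guide; 2 = no, 8 = DK, and 9 = NA are assumed here for illustration.
responses = [1, 2, 2, 0, 1, 2, 8, 0, 2, 1, 9, 2]
labels = {0: "NAP", 1: "YES", 2: "NO", 8: "DK", 9: "NA"}
valid_codes = {1, 2}  # only these count toward Valid Percent

def frequency_table(codes, valid):
    """Return {code: (frequency, percent, valid percent or None)}."""
    counts = Counter(codes)
    total = len(codes)
    valid_total = sum(n for code, n in counts.items() if code in valid)
    table = {}
    for code in sorted(counts):
        n = counts[code]
        percent = 100 * n / total
        valid_percent = 100 * n / valid_total if code in valid else None
        table[code] = (n, percent, valid_percent)
    return table

table = frequency_table(responses, valid_codes)
for code, (n, pct, valid_pct) in table.items():
    shown = "" if valid_pct is None else f"{valid_pct:.1f}"
    print(f"{labels[code]:<4}{n:>4}{pct:>8.1f}{shown:>8}")
```

    Percent divides each count by all the respondents, while Valid Percent divides only by those who gave a “yes” or “no,” which is why the valid percentages are larger and add up to 100.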

    By scrolling down the window or using the outline in the left-hand frame, you can check the results for the other variables. For now let’s move along to more complex analyses.

Cross-Tabulations

    The frequency distributions we’ve just undertaken are called univariate analyses (analyzing one variable at a time). Now we’ll turn to bivariate analyses (two variables at a time).

    Let’s stay with the issue of “abortion for any reason.” We’ve seen that U.S. residents are about evenly divided on the issue. What do you suppose accounts for this difference? People often guess that women would be more likely than men to support abortion as a woman’s right. Let’s see how to determine the accuracy of that guess.

    Return to the Descriptive Statistics menu, but choose Crosstabs this time. This brings you a somewhat different dialog box, as indicated in Figure 23.


Figure 23. Crosstabs Dialog Box

fig23

    We are now going to set up a percentage table involving two variables: abany and sex. The table will have both columns and rows. While there are many ways to construct such a table, we’re going to assign the categories of sex (male and female) to the columns. Then we’ll look at the opinions on abany within each of those categories. In the logic and language of SPSS, that makes abany the “row variable” and sex the “column variable.” To assign categories, select variable labels from the list and drag them to the appropriate windows on the right. Figure 24 shows this step.


Figure 24. Selecting Variables for the Crosstab

fig24

    To find a variable label in the list, you can either scroll through the list or click any label in the list and then type the variable you want. It may take a little experimentation to discover how quickly you must type to have it work.

    Thus far, we’ve told SPSS to organize the table like this:

                                        Men    Women
                    Approve       
                    Disapprove       

    To complete our request, we have to tell SPSS how to percentage the data. In this case, we’ll ask for the percentage of men who approve of abortion and the percentage who disapprove, with the two percentages totaling 100 percent. Then we’ll ask for the corresponding percentages of women. In other words, ask SPSS to “percentage down” the columns. The Crosstabs dialog box provides a means for us to indicate that preference. Click the Cells button in the dialog box (see Figure 25).


Figure 25. Specifying the Percentaging Method

fig25

    When the dialog box opens, the Observed box will already be checked. Leave it that way. In the section on Percentages, click the Column box. That instructs SPSS to percentage down the columns. Click Continue to complete this dialog, and then click OK to launch the request for a crosstab. Once SPSS has completed the table, we’ll be returned to the Output window, as in Figure 26.


Figure 26. Crosstab of abany and sex

fig26
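
    If you would like to see what “percentaging down the columns” means as a calculation, the following Python sketch does it for a handful of made-up cases. None of these cases come from the GSS, and the function name column_percentages is hypothetical; the point is only that each cell is divided by its own column total.

```python
from collections import Counter

# Hypothetical (sex, abany) pairs; the real table comes from the GSS file.
cases = [
    ("Male", "Yes"), ("Male", "No"), ("Male", "No"), ("Male", "Yes"),
    ("Female", "Yes"), ("Female", "No"), ("Female", "No"),
    ("Female", "No"), ("Female", "Yes"), ("Female", "No"),
]

def column_percentages(pairs):
    """Percentage down the columns: each column's cells sum to 100."""
    cell_counts = Counter(pairs)                       # (column, row) -> n
    column_totals = Counter(col for col, _ in pairs)   # column -> n
    return {
        (col, row): 100 * n / column_totals[col]
        for (col, row), n in cell_counts.items()
    }

percents = column_percentages(cases)
for (col, row), pct in sorted(percents.items()):
    print(f"{col:<7}{row:<4}{pct:5.1f}%")
```

    Because the percentages are computed within each category of sex, we can compare men and women directly even when the two groups differ in size.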

    Let’s see what the table tells us. We wanted to find out if men and women differed in their attitudes about whether a woman should be able to choose an abortion just because she wanted one. The table suggests that there’s no appreciable difference. Virtually the same proportion of men (39.9%) and women (39.8%) say a woman should have the right to an abortion for any reason.

    Let’s try another variable that could affect people’s attitudes toward abortion: political orientation. In the GSS, polviews represents a standard item that asks respondents to characterize their political views as something between “Extremely Liberal” and “Extremely Conservative.” Figure 27 shows the impact of this variable on attitudes toward abortion.


Figure 27. Crosstab of abany and polviews

fig27

    Because there are so many categories for political views, you may have to use the scroll bar at the bottom of the window to move back and forth across the table. Notice that we’ve scrolled all the way to the right in Figure 27.

    The impact of political views on abortion attitudes is pretty clear. Overall, liberals support abortion more than do conservatives. The only exception to the pattern is that people who are “Extremely Liberal” are less supportive than those who are “Liberal.” This result appears a lot, perhaps because of the different ways people interpret the two political terms.

Recoding Variables

    It’s often useful to recode variables with many categories, reducing the number to something more manageable. In the present case, we might want to combine the categories in polviews to make three: “Liberal,” “Moderate,” and “Conservative.”

    We can combine categories by hand from the kind of table presented in Figure 27. For example, we can easily calculate that 447 of the respondents in the table considered themselves liberals (62 + 203 + 182). Of those, 247 supported a woman’s right to an abortion for any reason (42 + 110 + 95). Dividing these two numbers tells us that 55 percent of the liberals supported abortion. A similar calculation tells us that 152 of the 553 conservatives—27 percent—were supportive. The 42-percent support among moderates fits neatly between the liberals and conservatives.
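
    The hand calculation above is easy to check. Here is a short Python sketch that repeats it using only the counts quoted in the text; the helper name percent is just an illustration.

```python
# Counts read from the table in Figure 27, as quoted in the text above.
liberal_total = 62 + 203 + 182      # all three liberal categories
liberal_yes = 42 + 110 + 95         # liberals who said "yes"
conservative_total = 553
conservative_yes = 152

def percent(part, whole):
    """Share of `whole` represented by `part`, as a percentage."""
    return 100 * part / whole

print(liberal_total, liberal_yes)                               # 447 247
print(round(percent(liberal_yes, liberal_total)))               # 55
print(round(percent(conservative_yes, conservative_total)))     # 27
```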

    Combining categories like this makes it easier to use the variable in further analyses. However, we should have SPSS create a new, recoded variable so that we don’t have to undertake the job by hand each time. To do this, we must first return to the Data window. If you’re in the Output window, you can simply click the Data window icon in the task bar at the bottom of your screen or select the SPSS Data Editor tag from the Window menu (see Figure 28 below).


Figure 28. Switching to Data View

fig28

    Once you’ve returned to the Data window, click the Transform menu and move your pointer to the Recode option. When you do that, you’ll be presented with another choice, as Figure 29 shows.


Figure 29. Requesting a Recode

fig29

    SPSS offers two options for recoding: Either it will modify the data contained under the existing variable label (Same Variables) or it will create a new variable for the modified results (Different Variables). Choose Different Variables, because the first option will destroy the original data.

    Next you’ll see a large dialog box like the one in Figure 30.


Figure 30. The Recode Dialog Box

fig30

    Initially the right-hand frame will have nothing in it. To create the situation shown in Figure 31:

1. Select polviews in the variable list and move it to the center frame by double-clicking it or using the triangular arrow.
2. Type polviewr in the space under Output Variable Name and click Change.
3. Type in a descriptive label to identify what polviewr stands for.

Figure 31. The Completed Recode Dialog Box

fig31

    To continue the process, click Old and New Values. This will bring you the dialog box shown in Figure 32.


Figure 32. Specifying How to Recode Categories

fig32

    To tell SPSS how to create polviewr from polviews, we identify values of polviews and indicate what values they should get in polviewr. Let’s start by creating a “Liberals” category that includes everyone with a “1,” “2,” or “3” on polviews. We’ll give the new category the value “1.”

    In Figure 33, we’ve chosen the “Range” option and indicated that anyone with a value of “1” through “3” on polviews should be assigned a “1” on polviewr. Make sure you see where those instructions are entered in the dialog box.


Figure 33. Creating “Liberals” as a Single Category

fig33

    When you click the Add button, the transformation instruction is transferred to the field on the right-hand side of the dialog box, as you can see in Figure 34.


Figure 34. Renumbering the “Moderate” Category

fig34

    We’ll use a different option to create a new “Moderate” category. As you recall, they were scored “4” on polviews. We’ll give them a “2” in polviewr by entering the old and new values in the Old Value and New Value fields. When we click Add again, the new instruction is added to the field. Now take a moment to figure out how you would create a “Conservative” category, transforming scores of 5, 6, and 7 on polviews into a score of 3 on polviewr. Once you’ve done that, you should have the dialog box shown in Figure 35. All that remains now is to click Continue, which will return you to the earlier dialog box, and then click OK.


Figure 35. The Recoding Instructions Completed

fig35
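
    The recoding rules we just entered can be written out as a small Python function. This is only a sketch of the logic, not what SPSS runs behind the scenes. The responses are hypothetical, the function name recode_polviews is made up, and returning None for any code outside 1 through 7 is an assumption of this sketch.

```python
def recode_polviews(value):
    """Collapse the seven polviews codes into three polviewr codes.

    1-3 -> 1 (Liberal), 4 -> 2 (Moderate), 5-7 -> 3 (Conservative).
    Any other code is returned as None, standing in for a missing value.
    """
    if 1 <= value <= 3:
        return 1
    if value == 4:
        return 2
    if 5 <= value <= 7:
        return 3
    return None

polviews = [1, 4, 7, 2, 9, 5]           # hypothetical responses
polviewr = [recode_polviews(v) for v in polviews]  # new list; original kept
print(polviewr)  # [1, 2, 3, 1, None, 3]
```

    As with the Different Variables option, the recoded scores go into a new list and the original polviews scores are left untouched.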

    Let’s tidy up our new variable. First return to the Data View window. Next, scroll across the list of variables to the far-right end. SPSS places each new variable at the end of all the other variables. Since you just created a new variable called polviewr, SPSS created a new column located last on your spreadsheet. When you find polviewr, double-click the variable label at the head of the column. This will open up the Variable View window and polviewr will be automatically selected at the bottom of your variable list. See Figure 36 below.

Figure 36. Finding Polviewr in Variable View

fig36

    Click on the right-hand side of the cell located in the Decimals column and polviewr row. A little square located in the cell will appear as shown in Figure 37. You can then reduce to 0 the number of decimals for each value of polviewr. In other words, you can convert each 1.00 score to simply 1.


Figure 37. Changing Decimals Format for Polviewr in Variable View

fig37

    Now click on the button on the right-hand side of the cell in the Values column and the polviewr row. A Value Labels dialog box will appear as shown in Figure 38. Then give names to the new category values:
1. Type “1” in the Value field
2. Type “Liberal” in the Value Label field
3. Click the Add button
    Repeat the process to assign “Moderate” to the value of “2” and “Conservative” to “3,” being sure to click Add each time. When you’re done, the dialog box should look like Figure 38. Click Continue, then OK.


Figure 38. Assigning Value Labels to polviewr

fig38

    Then click on the button on the right-hand side of the cell in the Missing column and the polviewr row. The dialog box shown in Figure 39 will appear. Type 9 in the first space available under Discrete Missing Values. You have just indicated to SPSS that any value of 9 in the data set for polviewr should be considered a “missing answer” and removed from statistical computation.


Figure 39. Assigning Missing Values to polviewr

fig39
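
    Declaring a missing value simply tells the program to leave those cases out of its calculations. A minimal Python sketch of that idea follows; the scores are hypothetical and the helper name valid_only is made up for the illustration.

```python
# Hypothetical polviewr scores; 9 is declared a discrete missing value,
# as in Figure 39, so it is left out of the calculations.
MISSING = {9}
scores = [1, 2, 3, 9, 2, 1, 9, 3]

def valid_only(values, missing):
    """Drop any value that has been declared missing."""
    return [v for v in values if v not in missing]

valid = valid_only(scores, MISSING)
print(len(scores), len(valid))  # 8 6
```

    The two cases coded 9 are still in the data set; they are just excluded whenever percentages or other statistics are computed.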

    Finally you can change the measurement type for polviewr as shown in Figure 40. I selected Ordinal for polviewr since the values can be ranked from low to high level of conservatism.


Figure 40. Assigning Measurement Type to polviewr

fig40

Now when you select Analyze/Descriptive Statistics/Frequencies and scroll through the list of variables, you’ll find a new entry in the list: polviewr. Choose it to see the frequency distribution generated by our new categories (see Figure 41).


Figure 41. Frequency Distribution for polviewr

fig41

    Since we have gone to all this trouble to make our analysis simpler, let’s see if it worked. Let’s use polviewr to reexamine the relationship between political orientations and attitudes toward abortion. Use Analyze/Descriptive Statistics/Crosstabs to create a table with abany and polviewr. Figure 42 illustrates what you should get.


Figure 42. Crosstab of abany and polviewr

fig42

    Notice how much easier it is to read this table, compared with the one presented in Figure 27. We see that 55 percent of the liberals, 42 percent of the moderates, and 27 percent of the conservatives support a woman’s right to an abortion for any reason. (It’s good to round off the decimal points in percentages like these, since they’re based on samples, which only provide estimates of populations in the first place.)

Multivariate Tables

    Bivariate tables are typically only the beginning of quantitative data analysis. For example, you might want to see if the observed relationship between politics and abortion holds equally for men and women. SPSS makes it a simple matter to satisfy your curiosity about such matters.

    Return to Analyze/Descriptive Statistics/Crosstabs and specify a third variable as shown in Figure 43.


Figure 43. Trivariate Table Request

fig43

    Notice that we’ve simply transferred the third variable, sex, into the bottom field in the dialog box. Press OK to see the result, illustrated in Figure 44.


Figure 44. Table of abany by polviewr by sex

fig44

    In a sense, this new table splits the one shown in Figure 42 into two parts. The top half shows the relationship between polviewr and abany for men; the bottom half shows the same relationship for women. We can see immediately that the original relationship is replicated for each of the gender categories.
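
    Splitting a table by a third variable can be sketched in Python as building one column-percentaged table for each category of that variable. The cases below are hypothetical and far too few for real analysis, and the function name layered_crosstab is made up; the sketch only shows the logic of the split.

```python
from collections import Counter, defaultdict

# Hypothetical (sex, polviewr, abany) cases for illustration only.
cases = [
    ("Male", "Liberal", "Yes"), ("Male", "Liberal", "No"),
    ("Male", "Conservative", "No"), ("Male", "Conservative", "No"),
    ("Female", "Liberal", "Yes"), ("Female", "Liberal", "Yes"),
    ("Female", "Conservative", "Yes"), ("Female", "Conservative", "No"),
]

def layered_crosstab(records):
    """Build one column-percentaged table per category of the layer variable."""
    by_layer = defaultdict(list)
    for layer, col, row in records:
        by_layer[layer].append((col, row))
    tables = {}
    for layer, pairs in by_layer.items():
        cell_counts = Counter(pairs)
        column_totals = Counter(col for col, _ in pairs)
        tables[layer] = {
            cell: 100 * n / column_totals[cell[0]]
            for cell, n in cell_counts.items()
        }
    return tables

tables = layered_crosstab(cases)
print(tables["Male"][("Liberal", "Yes")])    # 50.0
print(tables["Female"][("Liberal", "Yes")])  # 100.0
```

    Each sub-table is percentaged on its own column totals, so the comparison between liberals and conservatives is made separately for men and for women.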

    In the far-right column, the summary statistics show you the relationship between gender and support for abortion. Overall (i.e., forgetting about political orientations), an equal 41 percent of men and of women support a woman’s right to an abortion for any reason, an interesting similarity. (Notice that I’ve rounded off the figures, 40.5% and 40.8%, presented in the table.) It seems that there is no sex effect on abany. The sex of the respondent does not matter for this GSS question.

    Comparing men and women in the other columns of the table tells us that sex has little impact no matter what a person’s political orientation is. Women are more supportive among liberals, while men are slightly more supportive among moderates and among conservatives. None of the differences are very large, however.

    SPSS allows you to go beyond trivariate tables, though they grow increasingly difficult to read and analyze. To experiment with this possibility, click the Next button near the bottom of the Crosstabs dialog box to add new Layers of variables to the table.
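    If you like to see the logic spelled out, here is a minimal Python sketch of what a layered crosstab does: it splits the cases by the layer variable and then percentages the row variable within each column. The six records are invented for illustration (they are not GSS data), the function name layered_column_percents is hypothetical, and this is only a sketch of the idea, not a description of how SPSS works internally.

```python
from collections import Counter, defaultdict

def layered_column_percents(cases, row, col, layer):
    """Return {layer: {col: {row: percent}}}, percentaged within each column."""
    counts = defaultdict(lambda: defaultdict(Counter))
    for case in cases:
        values = (case.get(row), case.get(col), case.get(layer))
        if None in values:
            # Skip cases with missing data on any of the three variables.
            continue
        r, c, l = values
        counts[l][c][r] += 1

    table = {}
    for l, cols in counts.items():
        table[l] = {}
        for c, rows in cols.items():
            total = sum(rows.values())
            table[l][c] = {r: 100.0 * n / total for r, n in rows.items()}
    return table

# Invented toy records (NOT GSS data), using the guide's variable names.
toy_cases = [
    {"sex": "male", "polviewr": "liberal", "abany": "yes"},
    {"sex": "male", "polviewr": "liberal", "abany": "no"},
    {"sex": "male", "polviewr": "conservative", "abany": "no"},
    {"sex": "female", "polviewr": "liberal", "abany": "yes"},
    {"sex": "female", "polviewr": "conservative", "abany": "yes"},
    {"sex": "female", "polviewr": "conservative", "abany": "no"},
]

result = layered_column_percents(toy_cases, "abany", "polviewr", "sex")
for layer_value, columns in result.items():
    print(layer_value, columns)
```

    Each layer (here, each sex) gets its own bivariate table, which is exactly why the trivariate table reads like the original table split in two.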

Tests of Statistical Significance

    In the previous example, I casually remarked that the percentage differences were not very large. This was a subjective assessment of the substantive significance of the differences.

    As you know, tests of statistical significance can determine the likelihood that relationships observed in a sample are merely an artifact of sampling error rather than a reflection of a real difference in the population from which the sample was drawn. Let’s take a look at how SPSS offers us the use of those tests.

    Return to the Crosstabs dialog box, via Analyze/Descriptive Statistics. At the bottom of the box, click a button marked Statistics. Figure 45 illustrates the results.


Figure 45. Choice of Statistics in a Crosstabs Dialog Box

fig45

    As you can see, SPSS offers several summary statistics, three of which you’ll recognize from this textbook: chi square, lambda, and gamma. Recall that chi square is appropriate to nominal variables such as abany and sex, so let’s use these two variables to see how SPSS works with chi square.

    Click the Chi-square box. Then click Continue and enter abany and sex in the appropriate places in the Crosstabs dialog box. In addition to the regular percentage table, SPSS now provides an additional table, shown in Figure 46.


Figure 46. Chi Square for abany and sex

fig46

    If you’ve had a statistics course, you’ll recognize many of the tests presented in this table. For our purposes, let’s focus on the first row of the results, the “Pearson Chi-Square.” The third column tells us the probability that sampling error alone could have generated a relationship at least as strong as the one we’ve observed, if men and women in the whole population were exactly the same in their attitudes toward abortion. Specifically, it tells us that the probability is .972, or about 97 chances in 100. This probability is extremely high. Thus, the chi square test is consistent with what we had concluded subjectively from our crosstabulations: this sample gives us no grounds for concluding that men and women differ in their support for a woman’s right to an abortion.

    The relationship between abany and polviewr was much stronger. Let’s see how chi square evaluates that relationship. Repeat the above procedure, changing sex to polviewr. Notice that you don’t have to select Chi-square again, any more than you have to select Columns. SPSS maintains those specifications until you shut down the program. When you start it up again, you’ll have to specify such preferences again. And, of course, you can turn them off any time you no longer desire them.


Figure 47. Chi square for abany and polviewr

fig47

    See Figure 47 above for the chi square evaluation of abany and polviewr. Notice that the significance in this case is reported as “.000”. SPSS presents only the first three decimal places of this calculation. Hence, the likelihood of the observed relationship being simply a product of sampling error isn’t exactly zero. It could happen, but the chances are very small. Specifically, the probability is less than .001, or less than one chance in a thousand, which is a commonly used standard for statistical significance. Thus, we conclude that the relationship we’ve observed in this carefully selected sample very likely represents something that exists in the larger population.
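    To make the chi square logic concrete, here is a small Python sketch that computes the Pearson chi square for a table of counts by comparing each observed count with the count we would expect if the two variables were unrelated. The counts are invented (they are not the GSS results in Figures 46 and 47), the function names are hypothetical, and the probability calculation shown applies only to a two-by-two table (one degree of freedom). The sketch also assumes no row or column total is zero.

```python
import math

def pearson_chi_square(table):
    """Return (chi square, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

def p_value_one_df(chi2):
    """Upper-tail probability for chi square with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# Invented counts (NOT GSS results): rows = yes/no, columns = two groups.
identical = [[40, 40], [60, 60]]   # both groups answer identically
different = [[30, 10], [10, 30]]   # the groups answer very differently

chi2_same, df_same = pearson_chi_square(identical)
chi2_diff, df_diff = pearson_chi_square(different)
print(chi2_same, df_same, p_value_one_df(chi2_same))
print(chi2_diff, df_diff, p_value_one_df(chi2_diff))
```

    When the two groups answer identically, chi square is zero and the probability is 1; as the groups diverge, chi square grows and the probability shrinks toward the “.000” that SPSS displays.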

Correlation and Regression

    Thus far we’ve been examining nominal and ordinal data, which constitute the bulk of social science data. SPSS can also help you work with interval and ratio data.

    For example, you may have heard that highly educated people tend to have fewer children than do those with less education. Let’s use SPSS to see if it’s so. In the GSS, these variables are educ and childs. Under Analyze, select Correlate and, when asked, Bivariate. (SPSS can undertake more complex correlational analyses, but we’ll keep it simple for this introduction.) In the Correlations dialog box, select educ and childs and click OK. That will produce the result shown in Figure 48.


Figure 48. Pearson’s Product Moment Correlation

fig48

    As you can see, there is a correlation of –.210 (or a “negative correlation of .210”) between the two variables. The negative correlation means that as the years of education increase, the number of children decreases.

    Of course, this analysis cannot determine the causal direction, so we could also say that as the number of children increases, the amount of education completed decreases. Both interpretations make sense and probably apply in some cases. Some young people have to cut their education short to accommodate the demands of parenthood, and those who keep going to school may have to delay parenthood and have fewer children once they get started.
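    For the curious, the correlation coefficient itself is simple to compute by hand. The Python sketch below follows the usual product-moment formula and drops any case that is missing on either variable. The eight pairs of values are invented for illustration (they are not GSS data), and pearson_r is a hypothetical helper name.

```python
import math

def pearson_r(xs, ys):
    """Pearson's product-moment correlation, skipping pairs with missing values."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)

# Invented values (NOT GSS data); None stands for a missing answer.
educ = [8, 10, 12, 12, 14, 16, None, 18]
childs = [4, 3, 3, 2, 2, 1, 2, 0]

r = pearson_r(educ, childs)
print(round(r, 3))
```

    A negative result means that higher values of one variable go with lower values of the other; the calculation itself says nothing about which variable is the cause.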

    Whatever the explanation for the relationship, SPSS informs us that the correlation is significant at the 0.01 level. In other words, sampling error could account for a correlation like this one less than once in a hundred times.

    Although we entered only two variables in this analysis, SPSS will accept as many at a time as you want and will create a correlation matrix in which every variable is correlated with every other variable. Experiment with this possibility.

    Regression analysis builds on the logic of correlation and creates equations that predict values of one variable based on values of others. Here’s how we could represent the relationship between childs and educ as a regression equation. Under Analyze, select Regression, choosing Linear from among the alternatives offered. This will present you with the dialog box presented in Figure 49.


Figure 49. Regression Dialog Box

fig49


    Let’s use the logic of accounting for the number of children people have; thus, childs is our dependent variable, and educ is the independent variable we’ll use to account for differences in numbers of children. Enter the two variables into the dialog box as shown in Figure 49 above. Click OK to get the result shown in Figure 50.


Figure 50. Linear Regression Predicting childs with educ

fig50

    SPSS will present you with three tables of calculations, but we are only interested here in the third one, Coefficients. In fact, we’re only interested in the first column of this table, Unstandardized Coefficients. The first of these, the Constant, represents the value of the dependent variable (number of children) when the value of the independent variable (years of education) is zero. Statisticians sometimes refer to this as the y-intercept or the point where the line crosses the y-axis, when the regression line is plotted on a graph.

    The B value associated with the independent variable (–.121) indicates how much the dependent variable changes with each added unit of the independent variable. In our example, it tells us the change in the number of children we should expect for each added year of education. Stated as an equation, the regression looks like this:
childs = 3.407 – (.121 x educ)
    Suppose a person has 10 years of education. We would predict she or he has
3.407 – 1.21 = 2.197 children
    For college graduates with 16 years of education we’d predict they have
3.407 – 1.936 = 1.471 children
    Clearly, these estimates represent statistical averages, because no one can have 2.197 or 1.471 children. Still, if you were to bet on the number of children people had and knew only their education, this equation would be your best guide for betting. If you could make a lot of bets on this basis, you’d be a winner overall.
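    The arithmetic above is easy to check with a few lines of Python. The constant and B value below are the ones reported in this guide; the function names are hypothetical. I’ve also added a small least-squares routine to suggest where such coefficients come from, using three invented points that lie exactly on a line. It is a sketch of the standard formulas, not a description of SPSS’s internal procedure.

```python
INTERCEPT = 3.407   # the Constant reported in the guide's Coefficients table
SLOPE = -0.121      # the B value for educ reported in the guide

def predict_childs(educ_years):
    """Predicted number of children for a given number of years of education."""
    return INTERCEPT + SLOPE * educ_years

def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

for years in (10, 16):
    print(years, round(predict_childs(years), 3))

# Invented points lying exactly on y = 3 - 0.5x (NOT GSS data).
print(fit_line([0, 2, 4], [3, 2, 1]))
```

    With real survey data the points never fall exactly on a line, so the fitted equation describes the average trend rather than any individual respondent.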

    To explore regression further, try adding another independent variable. SPSS will provide you with a new y-intercept and coefficients for each of the independent variables. Be sure to interpret positive and negative signs correctly.

Creating Indexes

    The textbook discussed the creation of composite measures, such as indexes and scales. This section looks briefly at how to use SPSS to create a simple index.

    Without reviewing the logic of index construction, let’s create an index of sexual permissiveness including the following GSS variables:
premarsx: sex before marriage
xmarsex: sex with person other than spouse
homosex: homosexual relations
    In each of these items, respondents were asked whether the action was
1. Always wrong
2. Almost always wrong
3. Sometimes wrong
4. Not wrong at all
    Given the format of these three items, we can create a composite index quite simply. Although the values 1–4 used to represent the answers to these questions are merely labels (just as we used “1” for male and “2” for female), we can in this case take advantage of their numerical quality. In each of these items, the higher the numerical code, the higher the level of sexual permissiveness. If we add the values respondents received on the three items, the possible totals range from 3 to 12, with 12 representing the highest and 3 the lowest degree of sexual permissiveness.

    We can now generate the index by using the Transform/Compute menu option, as illustrated in Figure 51. Enter the information by typing or by selecting the variable names from the list and clicking the plus sign in the keypad provided in the dialog box (see Figure 51).


Figure 51. Adding the values of premarsx, xmarsex, and homosex

fig51

    When you’re through, click OK in the dialog box. SPSS will create a new variable, sexperm, in your data set and will assign the appropriate values to each of the respondents. In the Data window, scroll to the far right and find the new variable in the last column. Scroll up and down to see the values assigned to respondents. Those with no values in the new column were missing data on at least one of the three items used to construct it.
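    The same index logic can be sketched in a few lines of Python: add the three codes, and leave the index blank whenever any of the three answers is missing. The three respondents are invented for illustration, and additive_index is a hypothetical helper name.

```python
def additive_index(case, items):
    """Sum the item codes; return None if any item is missing."""
    values = [case.get(item) for item in items]
    if any(v is None for v in values):
        return None
    return sum(values)

ITEMS = ("premarsx", "xmarsex", "homosex")

# Invented respondents (NOT GSS data); codes run from 1 to 4 as in the guide.
toy_cases = [
    {"premarsx": 4, "xmarsex": 4, "homosex": 4},      # most permissive
    {"premarsx": 1, "xmarsex": 1, "homosex": 1},      # least permissive
    {"premarsx": 3, "xmarsex": None, "homosex": 2},   # one answer missing
]

sexperm = [additive_index(case, ITEMS) for case in toy_cases]
print(sexperm)  # [12, 3, None]
```

    Treating a partly answered set of items as missing keeps respondents from looking less permissive than they are simply because they skipped a question.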

    For a more comprehensive view of the new variable, run the frequency distribution for sexperm (Figure 52).


Figure 52. The Frequency Distribution for sexperm

fig52

    Having created a composite measure such as this one, it’s always good to validate the scores if possible. That is, if the index scores truly distinguish levels of sexual permissiveness, then those scores should predict the answers people gave to other questions. For example, we might wonder if attitudes toward abortion are related to sexual permissiveness. We can find out by running a cross-tabulation of the index and, say, abany.

    The result of this validation effort is presented in Figure 53. Notice that this table uses a somewhat different format than those we’ve created earlier. Given the large number of categories comprising sexperm, it’s difficult to fit the table on the computer screen (and in this book). Thus, I have made sexperm the row variable, made abany the column variable, and requested that the table be percentaged by row rather than column. Thus we read this table “down,” whereas we’ve been reading earlier ones “across.”


Figure 53. Validating the sexperm Index

fig53

    The relationship between these two variables is extremely strong and consistent. Of those with a score of 3 on the index (representing the lowest level of sexual permissiveness), only 19 percent support a woman’s right to an abortion for any reason. This percentage increases steadily as index scores increase, reaching 67 percent among those with a score of 12 on the index.

    Creating an index from variables that do not permit such a simple addition of code values is a little more involved. To illustrate, let’s create an index of where respondents stand on the issue of guns. Two items in the GSS are relevant:
gunlaw: favor or oppose gun permits
1. Favor
2. Oppose
owngun: have gun in home
1. Yes
2. No
    It makes sense that those who have a gun and oppose requiring permits for owning guns are the most pro-gun, while those without a gun of their own and who favor gun permits would be the most anti-gun. Notice, however, that the pro-gun position is represented by a “2” on gunlaw and a “1” on owngun. Thus, we can’t simply add the values. Here’s how to generate a simple index from these two items.

    Let’s create a new variable, progun, for which higher scores indicate greater support for guns. To start this process, return to Transform/Compute. Type in the Target Variable and give everyone a starting score of “0,” as in Figure 54. Click OK to create the variable. Then return to Transform/Compute and change the “0” to “progun + 1” as indicated in Figure 55.


Figure 54. Initializing progun

fig54


Figure 55. Adding a Point to progun

fig55

    We’re not going to add a point to everyone’s index score, however. Click the If button near the bottom of the dialog box, so we can specify the conditions under which we want to add a point. Next, click the button beside the phrase “include if case satisfies condition.” Then create the condition shown in Figure 56. By doing this, we’re telling SPSS to give an additional point to anyone who said they oppose gun permits (i.e., a “2” on gunlaw).


Figure 56. Adding a Point for Opposing Gun Permits

fig56

    Click Continue to return to the earlier dialog box. Then click OK to instruct SPSS to take the action specified. When SPSS tells you that you’re about to change an existing variable, say “yes.”

    Select Transform/Compute again. Notice that the earlier instruction to add a point is still there. Leave it, but click If in order to modify the condition. Change it to specify those who said they owned a gun (“1” on owngun) by indicating “owngun = 1” as the condition. Click Continue, then OK, then “yes,” as before. Those who had a score of 0 for favoring gun permits will now get 1 point (for a total of “1”) if they own a gun; they still have 0 points if they don’t have a gun. Those who scored a point for opposing gun permits will get another point if they own a gun (a total of “2”) but will stay at 1 point if they don’t have a gun. The resulting index is made up of the scores “0,” “1,” and “2.”

    There’s only one problem with the index as it stands. Since everyone started with 0 points, those who didn’t answer one or both of these questions will end up with a score of zero, thus seeming to oppose guns. The final step in creating this index involves culling out those with missing data.

    First, let’s create a “missing data” code. We’ll use “99.” Return to Transform/Compute. In the first dialog box, type “progun = 99.” Click If to specify the condition: “MISSING(gunlaw)” as shown in Figure 57. You need to first select the “MISSING(variable)” function and select gunlaw and click on the arrow.  Click Continue and OK, then repeat the procedure for “MISSING(owngun).”


Figure 57. Missing Data as a Condition

fig57

    If you examine the response possibilities for owngun, however, you’ll find that 23 people refused to answer and were coded “3.” Return to Transform/Compute and assign an index value of 99 for anyone with “owngun = 3.”

    As a final step, we’re going to recode the 99. Select Transform/Recode, but this time choose Same Variables. Once you reach the dialog box, convert the 99 to a SYSMIS as illustrated in Figure 58. Enter 99 as the old value, click “System-missing,” and then click the Add button.

    To complete the action, click Continue and OK. The index is now complete. You can check it out by running Analyze/Descriptive Statistics/Frequencies. To reassure yourself further, run a cross-tabulation between the two items to verify that the correct number of people received each of the scores.
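    The whole progun procedure boils down to a few conditions, which the Python sketch below spells out. It checks for missing data first rather than last, but the end result is meant to match the steps above. The codes are the ones given in this guide (including “3” for refusals on owngun); the function name progun_index is hypothetical, and None stands in for SPSS’s system-missing value.

```python
def progun_index(gunlaw, owngun):
    """Score 0-2; higher means more pro-gun. Returns None for missing data.

    gunlaw: 1 = favor permits, 2 = oppose permits
    owngun: 1 = yes, 2 = no, 3 = refused
    """
    # Missing or refused answers make the index undefined (SYSMIS in SPSS).
    if gunlaw is None or owngun is None or owngun == 3:
        return None
    score = 0
    if gunlaw == 2:   # opposes gun permits: add a point
        score += 1
    if owngun == 1:   # has a gun in the home: add a point
        score += 1
    return score

# Every combination of valid answers, plus two cases with unusable data.
examples = [(1, 2), (1, 1), (2, 2), (2, 1), (None, 1), (2, 3)]
scores = [progun_index(g, o) for g, o in examples]
print(scores)  # [0, 1, 1, 2, None, None]
```

    Listing every combination this way is the same check I suggested above: it confirms that each pattern of answers receives the score you intended.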


Figure 58. Converting 99 to SYSMIS

fig58


Graphics

    With the improvement of computer graphics, SPSS now offers many options for presenting data in nontabular formats. Let’s explore a few of these, beginning with simple frequency distributions.

    Figure 59 presents the distribution of GSS data on religious affiliation (relig) as a pie chart. You can create this by (1) selecting Pie under Graphs, (2) choosing Summaries for Groups of Cases, then (3) specifying relig as the variable to portray.  Select “% of cases.”  Before clicking OK, click on the Titles button and type the title of your graph in the dialog box.  Here I typed “Pie Chart of Religious Affiliation.”


Figure 59. Pie Chart of Religious Affiliation

fig59

    Figure 60 shows the results of this operation. As you can see, the pie chart is small and shows every religious category. In this form it is not very useful, so it needs some simple formatting. We need to collapse the slices that are too small to read into one larger category. Double click on the graph in your output window and a graph dialog box will appear. Select Options from the Chart menu. The Pie Options dialog box appears, as shown in Figure 61. Select “Collapse (sum) slices less than 5%” (note that you can change this percentage if you want to include more categories under this new collapsed one). In this same dialog box, select Percents so that the percentage of each slice is indicated on the graph. Click OK and close the SPSS Chart Editor window.
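    The collapsing rule is simple enough to sketch in Python: any category holding less than the threshold percentage of cases is folded into a single “Other” slice, and the percentages are then recomputed. The counts below are invented for illustration (they are not the GSS distribution of relig), and collapse_small_slices is a hypothetical helper, not an SPSS function.

```python
from collections import Counter

def collapse_small_slices(counts, threshold_pct=5.0, other_label="Other"):
    """Fold categories below threshold_pct into one slice; return percentages."""
    total = sum(counts.values())
    collapsed = Counter()
    for label, n in counts.items():
        pct = 100.0 * n / total
        key = label if pct >= threshold_pct else other_label
        collapsed[key] += n
    return {label: round(100.0 * n / total, 1) for label, n in collapsed.items()}

# Invented counts (NOT GSS data) that happen to total 100 cases.
toy_counts = {
    "Protestant": 50,
    "Catholic": 25,
    "None": 15,
    "Jewish": 4,
    "Buddhism": 3,
    "Hinduism": 3,
}

slices = collapse_small_slices(toy_counts)
print(slices)
```

    Raising the threshold sweeps more categories into the collapsed slice; lowering it keeps more of them separate.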


Figure 60. Pie Chart of Religious Affiliation in Output Window

fig60


Figure 61. Formatting Pie Chart of Religious Affiliation

fig61


    Figure 62 shows the result of this formatting procedure. Only the “Catholic,” “None” and “Protestant” slices remain unchanged. All other categories are collapsed under the “Other” pie slice. There are many options to explore, and I invite you to experiment on your own in order to polish the statistical graphics in your papers. When you are ready to copy an SPSS graph into your paper, simply select the graph from the outline menu on the left side of the output window. A frame around the graph will indicate that you have selected it. Then choose “Copy Objects” from the Edit menu. Open your Word document and paste the graph wherever you want.


Figure 62. Copying Pie Chart of Religious Affiliation

fig62

    If you’re on a diet that rules out pies, see Figure 63 for a bar graph of relig.


Figure 63. Bar Graph of relig

fig63

    Ratio variables, such as the number of years of education, might be presented as line graphs. See Figure 64.


Figure 64. Line Graph of educ

fig64

    These are just a few of the graphic options available to you in SPSS. Experiment with them to find the form of presentation most appropriate to your purposes.

Making Copies of Results for a Paper

    Often, you will use SPSS to undertake quantitative analyses for a term paper, thesis, or other project. Although you can retype the results of SPSS into your paper, you can also take advantage of some energy-saving options. Depending on your word-processing system, you may have to experiment a bit.

    As shown earlier, it is very easy to copy and paste from SPSS output to Word documents. Simply make sure you have selected the objects you want to copy (which should then be framed), use the Edit/Copy Objects command in SPSS, and then Paste into the Word document of your choice.

    Though the easiest strategy is to copy and paste from SPSS to Word, you can also export your statistical results.  To try making a hard copy of a graph, create the pie chart for polviewr. Click the resulting graph. As explained above, you’ll see a box appear around it, indicating that it has been selected by the computer. Then in the File menu select Export. Figure 65 illustrates the resulting dialog box.


Figure 65. Export Dialog Box

fig65

    You have several options here. You can export your output document with charts, without charts, or exclusively the charts of your output window.  For our purposes, export the Chart Only. You can then either change the name of the export file you are going to create or accept (and remember) the name and location SPSS has proposed. Again, for present purposes, choose to export the Selected Objects and choose JPEG File (*.JPG) as the export format. Once you’ve done all this, click OK.

    Run your word-processing system and open the document that desperately needs this graph. Click where you want the graph and select Picture and then From File from the Insert menu (this procedure may be different if you are using a word processor other than MS Word). Browse your computer until you find your JPEG file. Remember to change the Files of Type option to All Files to view all documents and not exclusively Word documents.

    To try making a hard copy of a table, create a frequency distribution for gunlaw. The same procedure you used to export graphs also works for tables. However, you will lose some of the formatting of the tables you export. I suggest that you choose the HTML file (*.htm) option under Export Format. This format best preserves the layout of your SPSS tables. Then open your Word document, select File from the Insert menu, and browse until you find your output file. You should be rewarded with something like the table I’ve put in Figure 66.


Figure 66. Text Version of gunlaw Frequency Distribution

fig66

    In all this, you may also want to take advantage of SPSS’s multitude of table formats. To explore these, choose Edit/Options in SPSS and click on the Pivot Tables tab. Once there, click the various options under TableLook and SPSS will give you a sample layout in the field to the right of the list. If you find a format that interests you, leave it highlighted when you close the dialog box and then create a new table. It will be done in the format you’ve specified.

Shutting Down

    As much as you may come to love SPSS, you’ll have to quit the program eventually. Go to the File menu and select Exit. SPSS will respond with a question that asks whether you want to save the Output file you’ve created. If you give it a name and a disk location for saving it, you’ll be able to open it later on and retrieve any data created in your analyses. If you’ve just been practicing SPSS, you’ll probably want to say “No.”

    If you’ve changed the data set, by creating a recoded variable, for example, SPSS will ask if you want to save the changes. Unless you want to get rid of the changes, say “Yes.” However, you should only alter the data file if you have permission to do that. If you’re sharing a file with others in your class, for example, it may not be appropriate for you to save your changes. Discuss this with your instructor if in doubt.

    SPSS will now close. And so will this guide. Have fun.