Discussion Assignment Instructions
DUE: by 11am Tuesday May 17, 2022. NO LATE WORK!
Survey research has become an important component of the public agency data analyst’s toolbox. Explain how survey research has been used to support and analyze policy decisions by Criminal Justice administrators.
You will post a thread presenting your scholarly response on the assigned topic, writing 750–850 words. For each thread, students must support their assertions with at least four (4) scholarly citations in APA format. The original thread must incorporate ideas and several scholarly citations from all of the Learn material for the assigned Week.
Then, you will post replies of 250–300 words each to 3 or more classmates’ threads. Each reply must incorporate at least two (2) scholarly citations in APA format. The reply posts can integrate ideas and citations from the Learn material throughout the course.
Any sources cited must have been published within the last five years. Integrate Biblical principles in your personal thread and in all replies to peers.
Cronk, Brian C. (2018). How to use SPSS®: A step-by-step guide to analysis and interpretation (11th ed.). New York, NY: Routledge. ISBN: 978-0367355692.
Meier, Kenneth J., Brudney, Jeffrey L., & Bohte, John (2015). Applied statistics for public and nonprofit administration (9th ed.). Stamford, CT: Cengage Learning. ISBN: 9781285737232.
Meier, Brudney, & Bohte: CH.2
Using a statistical approach in public and nonprofit administration begins with measurement. Measurement is the assignment of numbers to some phenomenon that we are interested in analyzing. For example, the effectiveness of army officers is measured by having senior officers rate junior officers on various traits. Educational attainment may be measured by how well a student scores on standardized achievement tests. Good performance by a city bus driver might be measured by the driver’s accident record and by his or her record of running on time. The success of a nonprofit agency’s fund-raising drive might be measured by the amount of money raised. How well a nonprofit agency’s board of directors represents client interests might be measured by the percentage of former or current clients on the board.
Frequently, the phenomenon of interest cannot be measured so precisely but only in terms of categories. For example, public and nonprofit administrators are often interested in characteristics and attitudes of the general populace and of various constituency groups. We can measure such things as the racial and gender composition of the individuals in these groups; their state of residence or their religious preferences; their attitudes toward a particular agency or government in general; their views on space exploration, public spending, or the tax treatment of nonprofit organizations; and so on. Although such variables do not have quantitative measurement scales, it is still possible to measure them in terms of categories, for instance, white versus nonwhite; female versus male; favor tax decrease, favor no change, favor tax increase; and so on. Although these phenomena cannot be measured directly with numerical scales, they are important variables nonetheless. Public and nonprofit administrators need to know how to measure, describe, and analyze such variables statistically.
In many managerial situations, the manager does not consciously think about measurement. Rather, the manager obtains some data and subjects them to analysis. Of course, problems arise with this approach. For example, in Chapter 11 we discuss a program where the Prudeville police department cracks down on prostitution in the city, and arrests by the vice squad increase from 3.4 to 4.0 on average per day. Based on these numbers, the police chief claims a successful program. This example illustrates a common measurement problem. The city council of Prudeville was concerned about the high level of prostitution activity, not the low level of prostitution arrests. Conceivably the number of prostitution arrests could be positively related to the level of prostitution activity (i.e., more prostitution arrests indicate greater prostitution activity). In this situation the police chief’s data may reveal increased prostitution, not decreased prostitution. In fact, the only thing an analyst can say, given the police chief’s data, is that the number of prostitution arrests increased.
In this chapter, we discuss some of the important aspects of measurement, both in theory and in application. The chapter presents the theory of measurement and discusses operational definitions and indicators. Following this discussion, the chapter explores the concept of measurement validity and then turns to increasing reliability and the types of measures, such as subjective indicators, objective indicators, and unobtrusive indicators. Next, the chapter presents levels of measurement: nominal, ordinal, and interval. It follows with a discussion of the implications of selecting a particular level of measurement and concludes by considering performance measurement techniques and benchmarking.
Theory of Measurement
Measurement theory assumes that a concept that interests an analyst cannot be measured directly. Army officer effectiveness, educational achievement, bus driver performance, level of prostitution activity, social capital, program success, and civic engagement are all concepts that cannot be measured directly. Such concepts are measured indirectly through indicators specified by operational definitions. An operational definition is a statement that describes how a concept will be measured. An indicator is a variable, or set of observations, that results from applying the operational definition. Examples of operational definitions include the following:
• Educational attainment for Head Start participants is defined by the achievement scores on the Iowa Tests of Basic Skills.
• Officer effectiveness is defined by subjective evaluations by senior officers using form AJK147/285-Z.
• Program success for the Maxwell rehabilitation program is defined as a recidivism rate of less than 50%.
• A convict is considered a recidivist if, within 1 year of release from jail, the convict is arrested and found guilty.
• Clients’ satisfaction with the service of the Department of Human Resources is measured according to the response categories that clients check on a questionnaire item (high satisfaction, medium satisfaction, and low satisfaction).
• An active volunteer in the Environmental Justice Association is defined as a person who donates his or her time to the association at least 5 hours per week, on average.
• One measure of board director activity of the Nature Society is the number of hours devoted by board members to this organization each month.
• The efficiency of a fund-raising firm is defined as the money raised divided by the costs paid to the firm.
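Several of the operational definitions above are precise enough to be written as small functions, which is one way to make a definition explicit. The following is a minimal Python sketch; the function names and the way each record is encoded (for example, days from release to an arrest that ended in a guilty finding) are assumptions made for illustration and do not come from the text.

```python
def is_recidivist(days_to_arrest_and_conviction):
    """A convict is a recidivist if arrested and found guilty within 1 year of release.

    The argument is a hypothetical encoding: number of days from release to an
    arrest that led to a guilty finding, or None if there was no such arrest.
    One year is taken as 365 days (an assumption).
    """
    return (days_to_arrest_and_conviction is not None
            and days_to_arrest_and_conviction <= 365)


def recidivism_rate(release_records):
    """Share of released convicts who meet the recidivist definition."""
    flags = [is_recidivist(days) for days in release_records]
    return sum(flags) / len(flags)


def program_is_successful(release_records):
    """Program success is defined as a recidivism rate of less than 50%."""
    return recidivism_rate(release_records) < 0.50


def fundraising_efficiency(money_raised, costs_paid):
    """Efficiency of a fund-raising firm: money raised divided by costs paid."""
    return money_raised / costs_paid


def is_active_volunteer(weekly_hours):
    """Active volunteer: at least 5 hours per week donated, on average."""
    return sum(weekly_hours) / len(weekly_hours) >= 5


# Made-up records for illustration only.
records = [100, None, 400, 200]
print(recidivism_rate(records))
print(program_is_successful(records))
print(fundraising_efficiency(50_000, 10_000))
print(is_active_volunteer([4, 6, 5]))
```

Writing the definition down this way exposes choices the prose leaves open, such as whether "within 1 year" means 365 days and whether a rate of exactly 50% counts as success.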
Operational definitions are often not stated explicitly but are implied from the research report, the memo, or the briefing. A manager should always encourage research analysts to state explicitly their operational definitions. Then the manager can focus on these definitions and answer a variety of measurement questions, such as the ones we will discuss later. It is important for public and nonprofit managers to know how the complicated concepts they deal with are measured. Without this knowledge, they will be hard-pressed to understand quantitative analyses or explain them to others.
Reading the preceding operational definitions, you may have been troubled by the lack of complete congruence between the concept and the indicator. For example, assume the city transit system evaluates the job performance of its bus drivers by examining each one’s accident record and on-time rate. A driver may well have a good accident record and be on time in her bus runs and yet be a poor bus driver. Perhaps the on-time record was achieved by not stopping to pick up passengers when the driver was running late. Or perhaps the driver’s bus was continually in the shop because the driver did not see to maintaining the bus properly.
This example suggests that observed indicators may not offer a complete measure of the underlying concepts. Most students of measurement accept the following statement:
Indicator = concept + error
A good indicator of a concept contains very little error; a poor indicator is only remotely related to the underlying concept.
In many cases several indicators are used to measure a single concept. One reason for using multiple indicators is that a concept may have more than one dimension. For example, the effectiveness of a receptionist may be related to the receptionist’s efficiency and the receptionist’s courtesy to people. To measure effectiveness adequately in this instance, we would need at least one indicator of efficiency and one of courtesy. To measure nonprofit financial standing would require several indicators of, for example, funds in reserve, diversity in funding sources, and operating efficiency. The term triangulation is sometimes used to describe how multiple indicators enclose or “hone in” on a concept.
Multiple indicators are also needed when the indicators are only poor or incomplete representations of the underlying concept. For example, a measure of a nonprofit agency’s receptivity to volunteers might include the presence of a volunteer coordinator, procedures in place to welcome new volunteers, and an explicit orientation for new volunteers. The success of a neighborhood revitalization program would require several indicators. The increase in housing values might be one indicator. The decrease in crime, reduction in vandalism, willingness to walk outside at night, and general physical appearance might be other indicators. The start of a neighborhood association or of a day care cooperative could be additional indicators. Each indicator reflects part of the concept of neighborhood revitalization but also reflects numerous other factors, such as economic growth in the entire city, demand for housing, street lighting, and so on. The theory behind multiple indicators in this situation is that the errors in one indicator will cancel out the errors in another indicator. What remains will measure the concept far better than any single indicator could alone. For these reasons, a multiple-indicator strategy to measure important concepts comes highly recommended in public and nonprofit management.
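The claim that errors in one indicator tend to cancel errors in another can be illustrated with a small simulation. The sketch below treats each indicator as the concept plus an error term and compares one indicator with the average of five. It rests on strong assumptions that are not in the text: the errors are independent, centered on zero, and normally distributed, and all numeric values (a concept value of 50, an error spread of 10) are invented for illustration. Real indicators that share a common bias would not cancel this neatly.

```python
import random
import statistics


def simulate_indicator(concept_value, error_sd, rng):
    """One observed indicator: the underlying concept plus random error."""
    return concept_value + rng.gauss(0, error_sd)


def mean_absolute_error(n_indicators, trials=2000, concept_value=50.0,
                        error_sd=10.0, seed=0):
    """Average distance between a composite of n indicators and the concept."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    errors = []
    for _ in range(trials):
        readings = [simulate_indicator(concept_value, error_sd, rng)
                    for _ in range(n_indicators)]
        composite = statistics.fmean(readings)  # simple unweighted average
        errors.append(abs(composite - concept_value))
    return statistics.fmean(errors)


single = mean_absolute_error(1)    # one indicator alone
combined = mean_absolute_error(5)  # average of five indicators
print(f"single indicator error: {single:.2f}")
print(f"five-indicator composite error: {combined:.2f}")
```

Under these assumptions the composite sits closer to the concept than a single indicator does, which is the intuition behind the multiple-indicator strategy.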
A valid indicator accurately measures the concept it is intended to measure. In other words, if the indicator contains very little error, then the indicator is a valid measure of the concept. The measurement validity of an indicator often becomes a managerial problem. For example, many governments administer civil service examinations that are supposed to be valid indicators of on-the-job performance. If minorities or women do not do as well as white males on these examinations, the agency is open to discrimination lawsuits. The agency’s only defense in such a situation is to prove that the civil service examination is a valid indicator of on-the-job performance (not an easy task).
Validity can be either convergent or discriminant. The preceding paragraph discusses convergent validity: Do the indicator and the concept converge? Does the indicator measure the concept in question? Discriminant validity asks whether the indicator allows the concept to be distinguished from other similar, but different, concepts. For example, using achievement scores on standardized tests may lack discriminant validity if the tests have some cultural bias. A good indicator of educational achievement will distinguish that concept from the concept of white, middle-class acculturation. A culture-biased test will indicate only educational achievement that corresponds with the dominant culture. As a result, such an indicator may not be valid.
Social scientists have long grappled with the idea of measurement validity. They have suggested several ways that validity can be established. An indicator has face validity if the manager using the indicator accepts it as a valid indicator of the concept in question. For example, years spent by students in school is accepted as a valid indicator of formal education. An indicator has consensual validity if numerous persons in different situations accept the indicator as a valid indicator of the concept. The recidivism rate, for example, has consensual validity as a good measure of a prison’s ability to reform a criminal. Often, consensual validity is established through finding a published research study in which the indicator has been used, thus suggesting its acceptance by scholars. An indicator has correlational validity if it correlates strongly with other indicators that are accepted as valid. For example, community satisfaction with a nonprofit organization as assessed in a survey might be strongly related to the amount of monetary donations received by the agency or the number of donors. (In Chapter 17 we discuss correlation and how it is measured.) Finally, an indicator has predictive validity if it correctly predicts a specified outcome. For example, if scores on a civil service examination accurately predict on-the-job performance, the exam has predictive validity.
These four types of validity offer ways in which a public or nonprofit manager can accept, or argue, that an indicator is valid. They do not, however, guarantee that the indicator is a particularly effective measure of the concept in question. An indicator may have face validity, consensual validity, correlational validity, and predictive validity and still not be as effective as other measures. Consider the Law School Admission Test (LSAT). The LSAT has face validity (it seems to make sense) and consensual validity (numerous law schools use it to screen applicants). It also has correlational validity (it correlates with undergraduate grades) and predictive validity (it correlates with law school grades). And yet the LSAT is not as strong a predictor of law school performance as is the socioeconomic status of the student’s family.
With all of the tests for validity and all of the different ways that an indicator can be validated, developing valid indicators of concepts remains an art. It requires all of the skills that a public or nonprofit manager has at his or her disposal. To be sure, in some cases, such as finding indicators of lawn-mower efficiency, valid indicators can be easy to derive. By contrast, developing valid indicators of community police effectiveness or of the “health” of the nonprofit community in a city is a very difficult task. Scholars and practitioners continually debate the meaning and measurement of crucial concepts, such as social capital, civic engagement, military preparedness, and board of directors’ leadership.
One approach to finding or developing valid indicators is to review the published literature in a field. Or you may check studies or reports from other jurisdictions or consult with experts in the field. In general, if an indicator is used in the published literature, it has at a minimum both face and consensual validity, and it may meet other validity criteria as well. Before (or while) you create your own indicators of an important concept, it is a good idea to consult the relevant literature. A review of the literature carries additional benefits, such as making you aware of how other researchers have approached related problems or issues and what they have found. Such information can make your analytical task easier. You may also find that the “answer” to your research question, for which you had planned to develop indicators, already exists, thus saving you the time and effort of conducting your own study.
A reliable indicator consistently assigns the same number to some phenomenon that has not, in fact, changed. For example, if a person measures the effectiveness of the police force in a neighborhood twice over a short period of time (short enough so that change is very unlikely) and arrives at the same value, then the indicator is termed reliable. Or, if the rate of volunteering to a volunteer center remains constant from one day to the next, it is probably a reliable indicator. If two different people use an indicator and arrive at the same value, then, again, we say that the indicator is reliable. Another way of defining a reliable indicator is to state that an indicator is a reliable measure if the values obtained by using the indicator are not affected by who is doing the measuring, by where the measuring is taking place, or by any other factors than variation in the concept being measured.
The two major threats to measurement reliability are subjectivity and lack of precision. A subjective measure relies on the judgment of the measurer or of a respondent, for example, in a survey. A general measure that requires the analyst to assess the quality of a neighborhood or the performance of a nonprofit board of directors is a subjective measure. Subjective measures have some inherent unreliability because the final measures must incorporate judgment. Reliability can be improved by rigorous training of the individuals who will do the measuring. The goal of this training is to develop consistency. Another method of increasing reliability is to have several persons assign a value, and then select the consensus value as the measure of the phenomenon in question. Some studies report a measured inter-rater reliability based on the consistency of measurement performed by several raters. Often, judgments about the effectiveness of nonprofit boards of directors are based on the ratings provided by multiple knowledgeable actors (for example, the board chairperson, the chief executive officer of the nonprofit, and nonprofit stakeholders such as funders, donors, and other similar nonprofits in the community).
Reliability can also be improved by eliminating the subjectivity of the analyst. Rather than providing a general assessment of the quality of the neighborhood, the analyst might be asked to answer a series of specific questions. Was there trash in the streets? Did houses have peeling paint? Were dogs running loose? Did the street have potholes? How many potholes? Or consider the performance of the local volunteer center. How many volunteers does it attract? What work do the volunteers perform? What are the results of their efforts for the community? Does the volunteer center recruit any new volunteers or only those already active in the community?
Reliability problems often arise in survey research. For example, suppose that you were asked to respond to survey questions concerning the performance of one of your instructors (or a local political figure, or “bureaucrats,” or the volunteers assisting in your agency) on a day that had been especially frustrating for you. You might well evaluate these subjects more harshly than on a day when all had seemed right with the world. Although nothing about these subjects had changed, extraneous factors could introduce volatility into the ratings, an indication of unreliability. If your views of these subjects had actually changed though, and the survey instrument picked up the (true) changes, the measurement would be considered reliable. (For that reason, reliability is often assessed over a short time interval.) By contrast, a reliable measure, such as agency salaries or number of employees and volunteers, is not affected by such extraneous factors.
Unfortunately, although removing the subjective element from a measure will increase reliability, it may decrease validity. Certain concepts important to public and nonprofit managers (employee effectiveness, citizen satisfaction with services, the impact of a recreation program) are not amenable to a series of objective indicators alone. In such situations a combination of objective and subjective indicators may well be the preferred approach to measurement.
Lack of precision is the second major threat to reliability. To illustrate this problem, we use the example of Mrs. Barbara Kennedy, city manager of Barren, Montana (fictional cities are used throughout the book), who wants to identify the areas of Barren with high unemployment so that she can use the city’s federal job funds in those areas. Kennedy takes an employment survey and measures the unemployment rate in the city. Because her sample is fairly small, neighborhood unemployment rates have a potential error of ±5%. This lack of precision makes the unemployment measure fairly unreliable. For example, neighborhood A might have a real unemployment rate of 5%, but the survey measure indicates 10%. Neighborhood B’s unemployment rate is 13.5%, but the survey measure indicates 10%. Thus, the manager has a problem with measurement imprecision.
One way to improve the precision of these measures is to take larger samples. But in many cases, having a larger sample is insufficient. For example, suppose the city of Barren has a measure of housing quality that terms neighborhood housing stock as “good,” “above average,” “average,” or “dilapidated.” Assume that 50% of the city’s housing falls into the dilapidated category. If the housing evaluation were undertaken to designate target areas for rehabilitation, the measure lacks precision. No city can afford to rehabilitate 50% of its housing stock. Barren needs a more precise measure that can distinguish among houses in the dilapidated category. This need can be met by creating measures that are more sensitive to variations in dilapidated houses (the premise is that some dilapidated houses are more dilapidated than others; for example, “dilapidated” and “uninhabitable”). Improving precision in this instance is more difficult than simply increasing the sample size.
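For the sampling side of the precision problem, the link between sample size and survey error can be sketched numerically. The text does not give a formula; the code below uses the common normal-approximation margin of error for a proportion from a simple random sample, which is an assumption here, as are the 95% confidence multiplier of 1.96, the 10% unemployment estimate, and the five-percentage-point target. It does nothing for the second problem described above, where the categories themselves are too coarse.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a sample proportion p with n respondents.

    Uses z * sqrt(p * (1 - p) / n), the usual normal approximation
    (assumes a simple random sample).
    """
    return z * math.sqrt(p * (1 - p) / n)


def sample_size_for_margin(p, margin, z=1.96):
    """Smallest n whose approximate margin of error is at most `margin`."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


# Hypothetical neighborhood survey: 10% measured unemployment.
for n in (50, 100, 400, 1600):
    print(n, round(margin_of_error(0.10, n), 4))

# Respondents needed per neighborhood for a five-point margin (illustrative target).
print(sample_size_for_margin(0.10, 0.05))
```

Because the margin shrinks with the square root of the sample size, halving the error requires roughly four times as many respondents, which is why larger samples alone can become an expensive fix.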
Unlike validity, the reliability of a measure can be determined objectively. A common method for assessing measurement reliability is to measure the same phenomenon or set of indicators or variables twice over a reasonably short time interval and to correlate the two sets of measures. The correlation coefficient is a measure of the statistical relationship or association between two characteristics or variables (see Chapter 17). In this instance, the higher the correlation between the two measures over time, the higher the reliability. This procedure is known as test-retest reliability. Another approach to determining reliability is to prepare alternative forms that are designed to be equivalent to measure a given concept, and then to administer both of them at the same time. For example, near the beginning of a survey, a researcher may include a set of five questions to measure attitudes toward government spending or trust in nonprofit fund-raisers, and toward the end of the survey, he or she may present five more questions on the same topic, all parallel in content. The correlation between the responses obtained on the two sets of items is a measure of parallel forms reliability. Closely related is split-half reliability, in which the researcher divides a set of items intended to measure a given concept into two parts or halves; a common practice is to divide them into the even-numbered questions and the odd-numbered questions. The correlation between the responses obtained on the two halves is a measure of split-half reliability. Cronbach’s alpha, a common measure of reliability, is based on this method.
In all three types of reliability measurement—test-retest, parallel forms, and split-half—the higher the intercorrelations or statistical relationships among the items, the higher the reliability of the indicators.
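The reliability checks described above all reduce to correlating two sets of scores, so they can be sketched in a few lines. The example below computes a test-retest correlation, an odd/even split-half correlation, and Cronbach's alpha. The data are invented, the helper names are hypothetical, and the alpha formula used is the standard one based on item and total-score variances, which the text names but does not spell out.

```python
import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def split_half_r(responses):
    """Correlate each respondent's odd-numbered item total with the even-numbered total."""
    odd = [sum(row[0::2]) for row in responses]   # questions 1, 3, 5, ...
    even = [sum(row[1::2]) for row in responses]  # questions 2, 4, 6, ...
    return pearson_r(odd, even)


def cronbach_alpha(responses):
    """Standard alpha: (k / (k - 1)) * (1 - sum of item variances / variance of totals)."""
    k = len(responses[0])
    item_vars = [statistics.variance([row[i] for row in responses])
                 for i in range(k)]
    total_var = statistics.variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Test-retest: the same five respondents measured twice (made-up scores).
time1 = [3, 4, 2, 5, 4]
time2 = [3, 4, 3, 5, 4]
print(round(pearson_r(time1, time2), 3))

# Split-half and alpha: five respondents answering four items (made-up scores).
responses = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 5, 4, 4],
    [5, 5, 4, 5],
]
print(round(split_half_r(responses), 3))
print(round(cronbach_alpha(responses), 3))
```

A parallel forms check would reuse `pearson_r` on the totals from the two question sets. In each case, values closer to 1 indicate more consistent measurement.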
If several individuals are responsible for collecting and coding data, it is also good practice to assess inter-rater reliability. Inter-rater reliability is based on the premise that the application of a measurement scheme should not vary depending on who is doing the measuring (see above). For example, in screening potential applicants for a food and clothing assistance program, a nonprofit community center might use a 10-item checklist for assessing the level of need for each client. To determine whether agency staff are interpreting and applying the checklist consistently, we could ask five employees to screen the same group of 20 clients using the checklist. High inter-rater reliability would exist if all five employees came up with very similarly scored (or even identical) checklists for each client. Alternatively, if the scored checklists for each client turned out to be dramatically different, we would have low inter-rater reliability. Low inter-rater reliability can indicate that confusion exists over how a measurement instrument should be applied and interpreted.
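One simple way to put a number on the checklist example is to compare every pair of raters on every client and report the share of comparisons that agree. The sketch below does that for checklist totals on a 0 to 10 scale. It is an illustration only: the text does not prescribe a statistic, the `tolerance` parameter and the scores are invented, and the data cover three raters and four clients rather than the five employees and 20 clients described above. Formal studies often use chance-corrected statistics instead of raw agreement.

```python
from itertools import combinations


def percent_agreement(scores_by_rater, tolerance=0):
    """Share of (rater pair, client) comparisons that differ by at most `tolerance`.

    scores_by_rater: one list per rater, each holding that rater's checklist
    total for the same clients in the same order.
    """
    agree = 0
    total = 0
    for rater_a, rater_b in combinations(scores_by_rater, 2):
        for score_a, score_b in zip(rater_a, rater_b):
            total += 1
            if abs(score_a - score_b) <= tolerance:
                agree += 1
    return agree / total


# Made-up checklist totals for four clients.
consistent = [
    [7, 3, 9, 5],
    [7, 3, 8, 5],
    [7, 4, 9, 5],
]
inconsistent = [
    [7, 3, 9, 5],
    [2, 8, 4, 10],
    [5, 5, 1, 0],
]

print(percent_agreement(consistent))       # exact agreement
print(percent_agreement(consistent, 1))    # agreement within one point
print(percent_agreement(inconsistent, 1))
```

A low value for one particular pair of raters can help locate where the confusion over the instrument lies.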
Types of Measures
We have already presented examples of two types of indicators: subjective and objective. The subjective indicator requires some judgment to assign a value, whereas the objective indicator seeks to minimize discretion. Assume that the city manager wants to know the amount of city services delivered to each neighborhood in the city. Objective measures of city services would be acres of city parks, number of tons of trash collected, number of police patrols, and so on. Subjective measures of city services could be obtained by asking citizens whether the levels of various city services were adequate. A subjective measure of nonprofit organizational effectiveness (a difficult concept to assess) might be the reputation of these organizations as assessed by local funding agencies.
A third type of measure, an unobtrusive indicator, is intended to circumvent the so-called Hawthorne effect, in which the act of measuring a phenomenon can alter the behavior being assessed. In the Hawthorne studies, employees who were observed by a research team seemed to change their workplace behavior as a result of being observed. For example, when you call the county services hotline and hear the message that “your call may be monitored and recorded,” you are likely to receive different (better) treatment than if no message were issued. As another example, asking city residents about the quality of police services may sensitize them to police actions. If an individual is asked his or her opinion again, the answer may be biased by earlier sensitizing. A city employment counselor, for example, will likely know that her evaluation is based on the number of individuals who are placed in jobs. She may then focus her efforts on the easiest persons to place, rather than devote time to all clients, to build up a favorable record. Any reactive measure (a measure that affects behavior when it is taken) has some inherent reliability and validity problems.
One way to circumvent this problem is through the use of unobtrusive measures (see Webb et al., 1999). A library, for example, could determine its most useful reference books (which do not circulate) by asking patrons which reference books they use most frequently. Among the problems in this situation is that many people who do not use reference books might answer the question nevertheless. An unobtrusive measure of reference book popularity would be the amount of wear on each book. To gauge public interest in its operations, a nonprofit organization could count the number of Web “hits” on its home page.
Suppose the head of the Alcoholic Beverage Control Board in a state wants to know how much liquor is consumed “by the drink.” Because it is illegal to serve liquor by the drink in many counties in certain states, sending a survey questionnaire to private clubs would yield little (valid) response. An unobtrusive measure would be to count the number of empty liquor bottles found in the trash of private clubs. An unobtrusive measure of the interest of service volunteers in the governance of a nonprofit board would be a simple count of how many volunteers attend open board sessions over a calendar year.
Unobtrusive measures can be used in a variety of situations and can take on as many different forms as the creative manager can devise. They do, however, have some limitations. Unless care is taken in selection, the measures may lack validity. For example, a manager may decide that she can determine the amount of time an office spends in nonproductive socializing by measuring the office’s consumption of coffee and fountain water (this fountain uses bottled water). She assumes that more coffee and more water fountain meetings imply less productivity. In fact, one office might consume more coffee than another because it has older workers (who are more likely to drink coffee) or because the office puts in more overtime and needs more coffee to make it through the night.
Levels of Measurement
In many cases public and nonprofit managers can use actual numbers to measure phenomena: tons of garbage collected in a given town, number of arrests made by the police per week, response times in minutes of a local fire department, number of children attending a daily church after-school program, miles driven in a week by Meals on Wheels volunteers, and so forth. Because this information consists of real numbers, it is possible to perform all types of arithmetic calculations with the data: addition, subtraction, multiplication, and division. As we will learn in Chapter 5 on measures of central tendency, when we have numerical data, we can readily compute average scores, such as the mean or average number of tons of garbage collected per week, the average response time of the fire department, and so forth.
Unfortunately for public and nonprofit administrators, available data are often not measured in nearly as precise a fashion as are these variables. There are several reasons for the lack of precision. In some cases it is a reflection of the state of the art of measurement. For instance, although it may be possible to say that a client is “very satisfied,” “satisfied,” “neutral,” “dissatisfied,” or “very dissatisfied” with a new job training program contracted out to a nonprofit agency, it usually is not possible to state that his or her level of satisfaction is exactly 2.3, or 5, or 9.856, or 1.003. Most measures of attitudes and opinions do not allow this level of exactitude. In other instances, loss of precision results from errors in measurement or, perhaps, from lack of foresight. For example, you may be interested in the number of traffic fatalities in the town of Berrysville over the past few years. As a consequence of incomplete records or spotty reporting in the past, you may not be able to arrive at the exact number of fatalities in each of these years, but you may be quite confident in determining that there have been fewer fatalities this year than last year.
Finally, some variables inherently lack numerical precision: One could classify the citizens of a community according to race (white, African American, Hispanic, or other), gender (male or female), religion (Protestant, Catholic, Jewish, Buddhist, or other), and many other attributes. It would be futile to attempt to calculate the arithmetic average of race or religion, however, and it would be meaningless to say that a citizen is more female than male: A person is classified as either one or the other.
In discussing these different types of variables, social scientists usually refer to the concept of levels of measurement. Social scientists conventionally speak of three levels of measurement. The first or highest (most precise) level is known as the interval level of measurement. The name derives from the fact that the measurement is based on a unit or interval that is accepted as a common standard and that yields identical results in repeated applications. Weight is measured in pounds or grams, height in feet and inches, distance in miles or kilometers, time in seconds or minutes, and so on. The variables discussed at the beginning of this section are all measured at the interval level: tons of garbage, number of arrests, response times in minutes. As a consequence of these standard units, it is possible to state not only that there were more arrests last week than this week but also that there were exactly 18 more arrests. (Some texts discuss a fourth level of measurement, ratio, but for our purposes it is effectively the same as interval measurement.)
The second level of measurement is called ordinal. At this level of measure- ment it is possible to say that one unit or observation (or event or phenomenon) has more or less of a given characteristic than another, but it is not possible to say how much more or less. Generally, we lack an agreed-on standard or metric (“interval”) at this level of measurement. Almost all assessments of attitudes and opinions are at the ordinal level.
Consider the previous example, which focused on client satisfaction with a new job training program contracted out to a nonprofit agency specializing in this function. At this writing no one is quite sure how to measure satisfaction or how a unit of satisfaction may be defined. Nevertheless, an interviewer could be dispatched to the field to ask clients: “How satisfied are you with the new job training program recently instituted in this community? Very satisfied, satisfied, neutral, dissatisfied, or very dissatisfied?” To create an ordinal-level variable, we might attach numbers to the response categories for this survey question as a rank ordering. The numbered categories might look like those displayed in Table 2.1.
A participant who is “very satisfied” is assigned a score of “1,” a participant who is “satisfied” is assigned a score of “2,” and so on. Table 2.2 shows what the satisfaction variable would look like if we entered data for a small sample of participants into a spreadsheet or statistical software package. Of course, it would not be possible to ascertain from a client his or her exact numerical level of satisfaction (e.g., 4.37; 16.23). For example, if a “1” is assigned to the response of one participant and a “3” to another, the precise magnitude of difference between the participants cannot be determined (for more on why precise distances across cases cannot be determined using nominal- and ordinal-level measures, see the section titled “Some Cautions” in Chapter 5). If one citizen answers that he or she is “very satisfied” with the program, however, it is safe to conclude that he or she is more satisfied than if he or she had answered “satisfied” (or “neutral,” “dissatisfied,” or “very dissatisfied”); similarly, a response of “very dissatisfied” indicates less satisfaction than one of “dissatisfied” (or “neutral,” “satisfied,” or “very satisfied”). How much more or less remains a mystery. As another example, a polling firm might ask a representative sample of citizens how good a job they believe the mayor is doing in running the city (very good, good, average, poor, or very poor) and to what extent they are interested in community affairs (very interested, interested, neutral, uninterested, or very uninterested). These variables are also ordinal-level and are subject to the same limitations as is the measure of satisfaction with the job training program.
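The coding scheme described above can be sketched in a few lines of Python. The codes 1 through 5 follow the ordering given in the text; the sample responses are hypothetical and are not the values shown in Table 2.2, and the function names are illustrative only.

```python
# Ordinal coding of a satisfaction item. Codes follow the text:
# "very satisfied" = 1, "satisfied" = 2, and so on.
SATISFACTION_CODES = {
    "very satisfied": 1,
    "satisfied": 2,
    "neutral": 3,
    "dissatisfied": 4,
    "very dissatisfied": 5,
}


def code_response(response):
    """Return the ordinal code for a survey response (case-insensitive)."""
    return SATISFACTION_CODES[response.strip().lower()]


def more_satisfied(a, b):
    """True if response a indicates more satisfaction than response b.

    Only the ordering of the codes is meaningful; under this scheme a
    lower code means more satisfaction.
    """
    return code_response(a) < code_response(b)


# Hypothetical responses, for illustration only.
responses = ["Satisfied", "Very satisfied", "Neutral", "Dissatisfied", "Satisfied"]
codes = [code_response(r) for r in responses]

print(codes)                                          # [2, 1, 3, 4, 2]
print(more_satisfied("very satisfied", "satisfied"))  # True
```

The comparison is the only operation the sketch relies on. Subtracting one code from another would run without error, but the result would not be a meaningful distance, which is the limitation the paragraph describes.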
The name ordinal measurement derives from the ordinal numbers: first, second, third, and so on. These numbers allow the ranking of a set of units or observations (or events or phenomena) with respect to some characteristic or attribute, but they do not indicate the exact distances or differences between the objects. For example, in an election the order of finish of the candidates does not say anything about the number of votes each one received. The order of finish indicates only that the winner received more votes than did the runner-up, who in turn received more votes than the third-place finisher. By contrast, the exact vote totals of the candidates are interval information that reflects how many more or fewer votes each candidate received than the other candidates.
At the third level of measurement one loses not only the ability to state exactly how much of a trait or characteristic an object or event possesses (interval measurement) but also the ability to state that it has more or less of the characteristic than has another object or event (ordinal measurement). In short, the nominal level of measurement lacks any sense of relative size or magnitude: It allows one to say only that things are the same or different. Measurement notwithstanding, some of the most important variables in the social sciences are nominal. These were mentioned before: race, gender, and religion. It is easy to expand this list to public and nonprofit management: occupation, type of housing, job classification, sector of the economy, employment status. A nominal coding scheme for employee gender appears in Table 2.3.
If we entered data for a number of employees of a nonprofit agency into a spreadsheet or statistical software package, the values for employee gender would look like those displayed in Table 2.4. In this example, employee Jones is female, whereas R. Smith is male.
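A nominal variable can be sketched the same way. The numeric codes below are assumptions made for illustration (the codes in Table 2.3 are not reproduced here); only the two employees named in the text are used.

```python
from collections import Counter

# Hypothetical nominal codes; the numbers are labels, not quantities.
GENDER_LABELS = {0: "female", 1: "male"}

# Jones (female) and R. Smith (male) come from the text; the codes are assumed.
employees = {"Jones": 0, "R. Smith": 1}


def category_counts(coded):
    """Count how many cases fall into each category label."""
    return Counter(GENDER_LABELS[code] for code in coded.values())


counts = category_counts(employees)
print(counts["female"], counts["male"])  # 1 1
```

Counting cases per category is a sensible summary at the nominal level. Averaging the codes would produce a number, but it would have no interpretation, because the codes only say whether two cases are the same or different.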
Now that you have some idea of the three levels of measurement, write several examples of interval, ordinal, and nominal variables in the margins. After you have finished, fill in the level of measurement for each of the variables listed in Table 2.5.
Whether you were aware of it or not, if you are like most people, before you read this chapter, you probably assumed that measurement was easy and accurate: All you needed to do was to count whatever it is that interests you or rely on technology to do the same thing. From the number of dollars a person makes (income) to the number of people in your city (population) to the number of degrees registered on a thermometer (temperature) to the number of miles recorded by your automobile odometer as you drive to and from work (commute mileage), measurement was straightforward and precise. In (management) practice, measurement is more challenging, and more interesting.
When deciding what level of measurement to collect and use as a public or nonprofit manager, keep in mind that variables originally coded at a higher level of measurement can be transformed into variables at lower levels of measurement because more precise measurements can be grouped into broader categories. The opposite is generally not true, however. Variables originally coded at lower levels of measurement cannot be transformed into variables at higher levels of measurement because higher levels of measurement call for more information and more precise information than is built into lower levels.
For example, let’s say that the director of a nonprofit organization decides to collect data on the dollar amount of supplies and services spent on each client to get a sense of how efficiently services are provided. He can measure the cost per client at either the interval or the ordinal level. The director takes data for 10 recent clients and constructs both ordinal and interval variables just to see what the different data would look like. The results are displayed in Table 2.6.
If you were the director, how would you measure the cost variable? Think about the advantages of measuring cost at the interval level. With the interval-level variable, we know exactly how much money was spent on each client. With the ordinal-level variable, we only know that a client falls into a particular cost range. If we construct the measure at the interval level, we can also determine the exact distance or numerical difference between individual cases. For example, it cost $19 more to serve the first client than it did to serve the second client. In contrast, the ordinal-level measure would tell us only that the first client was in the “moderate” cost range, whereas the second client was in the “low” cost range. In terms of flexibility, note that we can always recode or convert the interval-level version of the variable into the ordinal-level variable if we want or need to present the data differently at a later time. But what if we originally constructed the variable at the ordinal level and later found it necessary to report data on the actual dollar amount spent per client (for example, for a program evaluation or grant proposal)? We would be unable to obtain this information because ordinal-level data cannot be transformed into interval-level data. Although we would know that the cost per client was between $50 and $100 in four cases, we would be unable to determine the exact cost per client if our data were originally measured at the ordinal level.
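The one-way nature of recoding can be sketched as follows. The cutoffs and the dollar amounts are assumptions for illustration and are not the values in Table 2.6; the only details taken from the text are the $50 to $100 range, the “low” and “moderate” labels, and the $19 difference between the first two clients.

```python
def cost_category(cost):
    """Recode a dollar cost (interval level) into a cost range (ordinal level).

    The cutoffs are assumed: under $50 is "low", $50 to $100 is
    "moderate", and anything above $100 is "high".
    """
    if cost < 50:
        return "low"
    if cost <= 100:
        return "moderate"
    return "high"


# Hypothetical cost per client, in dollars.
costs = [62.0, 43.0, 118.0, 75.0]
categories = [cost_category(c) for c in costs]

print(categories)           # ['moderate', 'low', 'high', 'moderate']
print(costs[0] - costs[1])  # 19.0
```

Going from `costs` to `categories` is always possible. Going back is not: both 62.0 and 75.0 map to “moderate,” so the category alone cannot tell us which dollar amount was spent.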
As a public or nonprofit manager, sometimes you will not be able to choose the level of measurement for the variables to be examined. This limitation often arises when working with data originally collected by another party, such as a government agency or private firm. In those cases where you are able to choose, keep in mind the many advantages of collecting and constructing variables at higher rather than lower levels of measurement.
In Parts 5 and 6 of this book, you will see that levels of measurement have important implications for data analysis. If you plan on using a particular statistical method to analyze data, you need to make sure to use an appropriate level of measurement when constructing the variables. Otherwise you may spend a lot of time collecting data only to find that you are unable to analyze the data with the intended statistical technique. Contingency tables (the topic of Chapters 14–16) are used to analyze nominal- and ordinal-level data. Regression analysis (covered in Chapters 17–21), one of the most commonly used techniques in statistical analysis, generally requires the use of interval-level data.
In public or nonprofit administration, measurement can be a challenge. In the first place, as elaborated in the theory of measurement discussed earlier in the chapter, measurement often contains error. We do not have perfect measures of bureaucratic or organizational performance or employee morale or citizen satisfaction, or of many of the other concepts that you may want to measure and use. Evaluating the quality and accuracy of measurement through validity and reliability assessment is desirable and frequently demanding. In the second place, the richness of public and nonprofit administration calls for several different types of measures. Thus, you will need to use subjective, objective, and unobtrusive indicators. In addition, you will confront different levels of measurement. The interval level corresponds most closely to our typical understanding of measurement as counting things: for example, the number of hours of class attended in a week or the number of pages of reading required for an assignment are interval measures. The other two levels of measurement, ordinal and nominal, do not allow such ready accounting. Nevertheless, you may have to assess work attitudes in your organization (ordinal) for an organizational development effort or identify five areas of social need in your community (nominal) for a grant proposal. Bringing the full menu of measurement concepts to bear on a problem or issue is a useful technique in public and nonprofit administration.
Performance Measurement Techniques
No treatment of measurement in public or nonprofit organizations can ignore performance measurement. Performance measurement has become increasingly important in both government and nonprofit settings. Performance measurement yields information necessary for explaining program results to external audiences and stakeholders. Donors to nonprofit organizations want to see evidence that their contributions are being spent on worthwhile activities. Citizens want to know that government agencies are putting their tax dollars to good use. The efficient use of funding is only one dimension of performance. External constituencies also want to know whether government and nonprofit organizations are making progress toward their stated goals. Is a government agency adequately addressing the problem it was created to address? Is a nonprofit organization generating results consistent with its mission statement, or is it having problems meeting key objectives?
In addition to presenting program operations to external constituencies, performance measurement provides organizations with information that can be used to adjust internal operations. Performance measurement can reveal how efficiently resources such as time and money are being used. Performance data can also help managers spot problems such as inconsistencies in how policies are being applied or implemented.
To apply statistical tools to performance measurement the public or nonprofit manager first needs to know what to measure and why.
In performance measurement the critical variables are inputs, outputs, outcomes, and efficiency.
Inputs are resources that an organization uses to achieve its goals. Common examples of inputs include the following:
• annual agency appropriations or funding for the organization
• dollar amount of donations received annually by the organization
• number of employees at the organization
• number of volunteers at the organization
Organizations acquire resources so that they can pursue their goals. Collecting data on resource variables such as these is crucial if we want to see how resources affect performance. Information about core resources like money and the work of employees provides the starting point for understanding organizational performance.
Outputs versus Outcomes
How do we measure or operationalize the concept of performance? To understand performance, we must first explain the difference between outputs and outcomes. Outputs are tangible indicators that show how an organization uses its resources. Common examples of outputs in public and nonprofit organizations include the following:
• number of cases processed per employee
• number of forms processed per day
• number of clients each employee has served
• number of overtime hours worked per week
• number of times an agency stays within its spending targets
• average cost of serving a client
• average amount of time (minutes, hours, days) spent per client
Outputs are sometimes called workload measures because they show what organizations have produced with the inputs they have. Are outputs indicators of performance? They can be, but measuring performance solely in terms of outputs has some serious drawbacks. Although outputs provide tangible evidence of what inputs such as money or employees can produce, they generally provide little information about quality. For example, an agency that processes 25% more forms than it did last month might appear to be making large gains in efficiency, but if the error rate in processing forms has tripled as a result, the higher output would not be very impressive. No organization wants to live by the motto “We may make a lot of errors, but at least we’re fast.”
When employees are sensitized to the fact that performance is being measured in terms of outputs, they may also feel pressured to produce desirable levels of output while ignoring the larger question of whether organizational goals are actually being achieved. For example, if counselors at a domestic abuse shelter are evaluated primarily by how quickly they serve each client, they may feel pressure to move from one client to the next without adequately assessing the circumstances of each client. Similarly, if teachers are evaluated primarily by student scores on standardized achievement tests, they may “teach to the test” instead of focusing on broader learning objectives. When too much attention is placed on outputs, we sometimes lose sight of whether key organizational goals are being met.
Outcomes are more precise indicators of performance than outputs because they focus more on quality than on quantity and more on results outside the organization than inside. Impressive output or workload statistics are not necessarily evidence that organizational goals are being achieved. For example, an output for a parole officer is the number of parolees supervised per year, a measure of organizational workload. By contrast, an outcome is the recidivism rate for the parolees being supervised. If a parole officer is supervising a large number of parolees (output) but more than 90% return to jail within a year (outcome), it is questionable whether progress toward core organizational goals is being made.
How do we determine outcomes? A nonprofit organization can identify important outcomes by looking at its mission statement. Foundations and major grantmakers sometimes define outcomes that must be measured to ensure continued funding. In the case of government agencies, legislative bodies such as city councils and state legislatures often select the outcomes to be achieved. Legislatures appropriate funds to public agencies with the expectation that certain goals will be achieved. Examples of outcomes include the following:
• percentage of job training participants (out of all program participants) who receive full-time jobs
• percentage of alcohol treatment patients who remain sober for 2 years after completing the program
• percentage of low-income students who graduate from college
• percentage of new immigrants who pass an English-language proficiency test after 1 year
• percentage of sixth-grade students who maintain healthy eating habits for 3 months
Measuring outcomes is particularly important for human services organizations that deal with complex social problems. Domestic abuse shelters, alcoholism treatment centers, homeless shelters, unemployment offices, agencies for the blind or disabled, and other social welfare agencies are all examples of human services organizations. The number of clients processed usually is not as important as whether the cases have been handled correctly to achieve the desired results when understanding and addressing complex social problems.
Defining performance in terms of outputs is usually more acceptable for “bottom-line” organizations, where tasks are very clear and little controversy exists over the meaning of performance. For example, the tasks of a municipal recycling facility are well-defined (collecting and processing recyclable material). A public transit department may have slightly more complicated performance goals, but the outputs (riders transported and fares collected) are still fairly easy to measure. Outputs and outcomes are valid indicators of organizational performance, but outputs are generally much easier to measure than outcomes. Outcomes are more challenging because they typically unfold over time, take more effort to compile and verify, and pertain to results achieved external to the organization (e.g., overcoming drug dependency). Measuring performance in organizations addressing complex social problems typically goes beyond simple output or workload measures to outcome, the extent to which organizational goals have been achieved.
Inputs, Outputs, and Efficiency
Organizations often measure efficiency using data on inputs and outputs. One of the most common methods for evaluating efficiency involves dividing inputs by outputs to determine the average level of inputs per output (sometimes referred to as unit cost). For example, if a drug abuse treatment facility spends $750,000 in staff time and supplies (inputs) to serve 580 patients (outputs), the average cost per patient is $1,293 (i.e., 750,000/580). A public works department that spends $340,000 in salaries and supplies to complete 297 minor patching and repair jobs on city streets (outputs) spends an average of about $1,145 per job (i.e., 340,000/297 ≈ 1,144.78).
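The unit cost calculation is simple enough to sketch directly. The dollar amounts and counts below are the two examples from the text; the function name is illustrative only.

```python
def unit_cost(total_inputs, total_outputs):
    """Average input per unit of output (inputs divided by outputs)."""
    if total_outputs <= 0:
        raise ValueError("total_outputs must be positive")
    return total_inputs / total_outputs


treatment = unit_cost(750_000, 580)  # drug abuse treatment facility
repairs = unit_cost(340_000, 297)    # street patching and repair jobs

print(f"{treatment:.2f}")  # 1293.10
print(f"{repairs:.2f}")    # 1144.78
```

The guard against a zero or negative output count is an added precaution, not something the text discusses; it simply avoids a division error when no work has been completed in the reporting period.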
Efficiency measures provide information about costs on a per-client or per-event basis. They can be especially useful for comparison and assessment over time. For example, if a drug abuse treatment center spent $1,293 per client last year and $1,480 per client this year, officials would probably want to examine the factors responsible for the increase.
Relying heavily on efficiency measures to evaluate performance has serious drawbacks. First, efficiency measures generally cannot measure quality. If the cost of treating a client improves (decreases) from $1,293 to $920 but the percentage of cases successfully resolved goes down (worsens), the improvement in efficiency comes at a high price. Average cost data can also be misleading if the level of resources used varies dramatically per client or event. Costs for minor street repairs might vary from $200 to $4,900. Small changes in the number of high- or low-cost repairs from year to year could result in substantial fluctuations in the average cost per job.
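A small sketch shows how sensitive an average can be to a single high-cost job. The individual repair costs are hypothetical; they were chosen only to stay within the $200 to $4,900 range mentioned in the text.

```python
from statistics import mean

# Hypothetical repair costs in dollars for two years with five jobs each.
year_1 = [200, 250, 300, 350, 400]   # all low-cost repairs
year_2 = [200, 250, 300, 350, 4900]  # one high-cost repair replaces a low-cost one

avg_1 = mean(year_1)
avg_2 = mean(year_2)

print(avg_1)          # 300
print(avg_2)          # 1200
print(avg_2 / avg_1)  # 4.0
```

Four of the five jobs cost the same in both years, yet the average cost per job quadruples. A manager looking only at the average might conclude that efficiency collapsed when the mix of jobs simply changed.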
Measuring efficiency can help a manager determine whether resources are being wasted. Efficiency, however, is rarely the central focus of most nonprofit and government organizations. If the average cost of putting out a fire goes up from one year to the next, it would be silly to tell the fire department to find a cheaper way to fight fires. If the average cost of treating a mental health patient increases by 10% from one year to the next, cutting corners to improve efficiency may result in reduced quality of care. Measuring organizational performance is more concerned with whether the job is done right rather than with how inexpensively work is completed.
Outcome Measures from External Sources
One limitation of outcome measures generated by organizations is that employees may overstate the number of successes and understate the number of failures to portray themselves and their agencies in a better light. An advantage of externally generated outcome measures is that individuals who do not have a vested interest in the success of an organization may provide more honest assessments of organizational performance.
Many agencies use client surveys for this reason. Survey data can provide a wealth of information about performance because clients may view an agency differently from how the agency’s employees view the agency. Clients can raise awareness of problems that managers might not know exist. For example, client survey questions might focus on how promptly services were provided, whether the client would be willing to use agency services in the future, and whether counselors did a good job explaining agency policies and answering client questions. Similarly, the head of a public works department could validate job completion data reported by snowplowing crews by asking citizens when and how often their streets were plowed and whether the crews did a good job clearing the roads.
In addition to survey data, other examples of outcome measures based on external data might include the number of complaints an agency receives each month, the number of newspaper articles about an organization that take a negative tone, certification by an accreditation agency, any awards or citations received by the agency, and the number of deficiencies in performance identified by an outside auditor.
The Importance of Using Multiple Output and Outcome Measures
A single output or outcome measure is rarely adequate for assessing the performance of an entire organization. Organizations typically address a variety of problems and thus have a variety of goals.
Having multiple output and outcome measures makes it possible to measure performance more comprehensively and to spot inconsistencies more easily. If we know that two output measures should be highly correlated but the data reveal that they are not, performance problems may exist. If a social services agency asks both agency staff and clients to answer the question “Was the problem adequately solved?” large differences in the response patterns across the two groups would be undesirable. Public schools often measure performance in terms of the percentage of students who pass state-mandated standardized tests. A comparison performance measure in this case would be dropout rates. Increasing pass rates accompanied by increasing dropout rates might mean that performance gains are taking place not because the quality of education is improving but because weak test takers are dropping out of school.
Finally, performance is usually a multidimensional concept. Organizations sometimes have short-, medium-, and long-term performance goals. If a nonprofit agency is receiving grant money from several different foundations, each foundation may require information on different outputs and outcomes. Defining performance using multiple measures helps avoid an all-or-nothing mentality, where success is determined by looking at only a single criterion.
The rationale for benchmarking is that a public or nonprofit organization can gain a better understanding of its own performance by comparing itself to similar organizations. The following examples illustrate the logic of benchmarking:
• A police department for a city of 500,000 residents compares its crime clearance rate to those of police departments in 10 similarly sized cities.
• A sanitation department compares its average cost for collecting a ton of trash to cost data for five other municipal sanitation departments within the county.
• A drug treatment center compares its success rate in treating patients to those of three other drug treatment facilities across the state.
• A YMCA branch compares its customer satisfaction rates to those of 10 other branch locations across the state.
• A nonprofit job training center compares its job placement rate with the placement rates of 12 other job training centers in similar communities in the region.
Comparative performance data allow an organization to determine how well it is doing relative to its peers. Once top performers have been identified, organizations with performance problems can study the strategies used by their more successful peers and adopt similar strategies themselves. The term best practice is often used in conjunction with benchmarking because the process can provide organizations with strategies that can lead to better performance. The steps in the benchmarking process are as follows:
Step 1. Select the variable or performance measure to be used as a benchmark. Benchmarking is not feasible if data for a performance measure are unavailable for the organizations in the peer group. Before selecting the measures to be used as benchmarks, the public or nonprofit manager must make sure that the desired data are available.
Because the results obtained from benchmarking are often presented to external audiences, benchmarks should be defined in units or increments that are easily understood and interpreted. For example, a highly aggregated measure such as total expenditures in millions of dollars is more difficult to interpret than average expenditures per person (or citizen or client, etc.).
Descriptive statistics are useful in benchmarking because we want to know how close or far values for a particular variable or performance indicator are to a desired value or target. Averages or measures of central tendency such as means (see Chapter 5) and measures of dispersion such as standard deviations (Chapter 6) convey summary information about performance in ways that most people understand. The mean can often serve as a useful performance target.
Step 2. Select a peer group of organizations. You should always be prepared to explain why particular organizations have been chosen. Because benchmarking is about making comparisons, the peer group should consist of similar organizations. For example, a police department in a rural Illinois town of 5,000 people would not want to benchmark its performance against that of the police department in the city of Chicago because the scale and nature of policing in a large metropolitan area are much different than in a small rural town. Similarly, a homeless shelter that handles 25,000 clients a year might have unique scale advantages in purchasing food and supplies over a shelter that serves only 2,500 people a year. If the circumstances of different organizations vary dramatically, it is usually difficult to make valid comparisons.
Some examples of criteria that can be used to select peer organizations are the number of employees (such as organizations with 1,000 employees or more), size in dollars (cities with annual budgets between $10 and $15 million), function (human services), location (urban, suburban, or rural), and program status (established programs versus new demonstration or pilot programs).
Step 3. Collect the benchmark data for each member (often organization) in the peer group. You may need to consult a variety of sources to obtain the desired information on performance. Budgets and comprehensive annual financial reports (CAFRs) provide detailed information about the activities of city and county governments. References such as the Statistical Abstract of the United States, the Book of the States, and the Municipal Year Book provide a wide array of data useful for benchmarking purposes. Detailed information on the financial performance of nonprofit organizations can be obtained from annual reports and Internal Revenue Service (IRS) Form 990 tax returns. Other systematic information on nonprofit organizations can be found in The New Nonprofit Almanac and Desk Reference: The Essential Facts and Figures for Managers, Researchers, and Volunteers (Jossey-Bass).
Step 4. Assess where your organization ranks in the peer group. The data for each member of the peer group should be summarized and displayed in a table. Comparing your organization’s score to the average score for the peer group can be very helpful in assessing its performance.
Step 5. Determine what steps should be taken to correct performance deficiencies if they occur. Look to better-performing peers for strategies that might help improve performance.
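The ranking and comparison part of the process can be sketched as follows. Every number and organization name here is hypothetical. The sketch also makes two assumptions the text does not settle: a higher score is treated as better, and the organization's own score is included when the peer group mean is computed.

```python
from statistics import mean, stdev


def benchmark_summary(scores, own):
    """Summarize where `own` stands in a peer group.

    `scores` maps organization name to its value on the benchmark.
    Assumes a higher score is better and includes `own` in the mean.
    """
    values = list(scores.values())
    ranked = sorted(scores, key=scores.get, reverse=True)
    peer_mean = mean(values)
    return {
        "rank": ranked.index(own) + 1,
        "peer_mean": peer_mean,
        "peer_stdev": stdev(values),
        "gap_from_mean": scores[own] - peer_mean,
    }


# Hypothetical job placement rates (percent) for a training center and three peers.
placement_rates = {"Our center": 62.0, "Peer A": 70.0, "Peer B": 58.0, "Peer C": 66.0}

summary = benchmark_summary(placement_rates, "Our center")
print(summary["rank"])           # 3
print(summary["peer_mean"])      # 64.0
print(summary["gap_from_mean"])  # -2.0
```

For a benchmark where lower is better, such as cost per client, the sort order would need to be reversed. The standard deviation is returned alongside the mean so that a gap of 2 points can be judged against how much the peers differ from one another.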
Benchmarking is best illustrated with an example. A group called Citizens Against Taxing and Spending (CATS) has gone before the media to complain about the “bloated bureaucracy” in the city of Dorchester, Michigan. The group feels that the city employs too many people compared to similar cities in the area. Mayor Sandra Jackson believes that CATS’s claims are exaggerated and that staffing levels in Dorchester are quite reasonable. She decides to benchmark Dorchester against nearby cities, using the number of government employees for every 1,000 citizens as the performance measure.
The mayor uses two criteria in choosing the other cities for comparison: population and size in square miles. Mayor Jackson decides that the population of each city should be within 10% above or below 52,000, the population of Dorchester. The size of the city also must be within 10% of Dorchester’s 42.5 square miles because geographic size has a major impact on the number of police and fire employees a city needs.
The mayor’s assistant reviews the annual budgets and CAFRs for 12 nearby cities and finds four that meet the above criteria. Data on the number of employees were also obtained from each city’s annual budget.
The assistant divides the number of employees by the total residents for Dorchester and obtains a result of 0.0056. He sees nothing wrong with presenting the results in terms of the number of employees per resident, but the mayor explains that 0.0056 employees per resident is not a number the typical citizen is likely to find very meaningful. He agrees and multiplies the results for all cities by 1,000 to facilitate interpretation so that the measure used becomes government employees per 1,000 residents. The final results are displayed in Table 2.7.
The mayor is pleased with the results because they support the view that Dorchester is not that much different from nearby peer cities when it comes to the ratio of city employees to residents. In fact, Dorchester is slightly below the average of 5.78 employees per 1,000 residents for cities in the peer group (see Figure 2.1). The mayor’s assistant comments that “it looks like CATS has been declawed,” but the mayor expects CATS to raise other questions. She begins thinking about additional benchmarks that would help illustrate how fiscally responsible her administration has been, such as the ratio of supervisory personnel to line personnel and tax revenues per 1,000 residents. Can you think of any other benchmarks to use?
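The rescaling step in the Dorchester example can be sketched with the two figures the text provides: 0.0056 employees per resident for Dorchester and a peer group average of 5.78 employees per 1,000 residents. The values for the individual peer cities appear only in Table 2.7 and are not reproduced here.

```python
def per_thousand(rate_per_resident):
    """Convert a per-resident rate into a rate per 1,000 residents."""
    return rate_per_resident * 1_000


PEER_GROUP_AVERAGE = 5.78  # employees per 1,000 residents, from the text

dorchester = per_thousand(0.0056)

print(round(dorchester, 2))             # 5.6
print(dorchester < PEER_GROUP_AVERAGE)  # True
```

The rescaling changes nothing about the comparison itself. It only expresses the same ratio in units that a typical citizen can read at a glance, which is the mayor's point.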
Measurement is the assignment of numbers or category labels to some phenomenon. Some variables cannot be measured so precisely with numbers but require category labels. An operational definition tells the analyst how a concept will be measured. An indicator is a variable linked to a concept through an operational definition. The two key issues in measurement are reliability, or consistency of measurement, and validity, or meaningfulness of measurement (are we measuring what we think we are measuring?). The chapter discusses types of validity, types of reliability, and kinds of measures (subjective, objective, and unobtrusive).
In public and nonprofit management we use three levels of measurement: interval, ordinal, and nominal. Interval measurements are based on a standard unit or interval and are the most precise (e.g., population, income, tax assessments). Ordinal measurements lack such an agreed-on standard or unit but allow rank ordering. Nominal measurements lack any sense of relative size or magnitude; they allow us to say only that observations fall into the same or different categories. Despite the differences in the precision of measurement, many important variables in public and nonprofit management are measured at the ordinal level (e.g., attitudes and opinions) or at the nominal level (race, gender, religion, etc.). The chapter concludes with a discussion of performance measurement and benchmarking in public and nonprofit organizations. It explains inputs, outputs, outcomes, efficiency, and other aspects of measuring organizational performance.
CH.1 & 2 Cronk
Cronk, Brian C. (2018). How to use SPSS®: A step-by-step guide to analysis and interpretation (11th ed.). New York, NY: Routledge. ISBN: 978-0367355692.
Section 1.1 Starting SPSS
Startup procedures for SPSS will differ slightly, depending on the configuration of the machine on which it is installed. If you have difficulty finding it, look for an IBM SPSS Statistics section of your Start Menu. This text uses screenshots from the Windows version of SPSS. The MacOS and Unix versions will have the same functionality, but could appear differently than what is depicted herein.
When SPSS is started, you may be presented with the dialog box above, depending on the options your system administrator selected for your version of the program. If you have the dialog box, click Type in data and OK, which will present a blank data window.
If you were not presented with the dialog box above, SPSS should open automatically with a blank data window.
The data window and the output window provide the basic interface for SPSS. A blank data window is shown on page 3.
Section 1.2 Entering Data
One of the keys to success with SPSS is knowing how it stores and uses your data. To illustrate the basics of data entry with SPSS, we will use Example 1.2.1.
Example 1.2.1 A survey was given to several students from four different classes (Tues/Thurs mornings, Tues/Thurs afternoons, Mon/Wed/Fri mornings, and Mon/Wed/Fri afternoons). The students were asked whether or not they were “morning people” and whether or not they worked. This survey also asked for their final grade in the class (100% being the highest grade possible).
Our goal is to enter the data from the two students into SPSS for use in future analyses. The first step is to determine the variables that need to be entered. Any information that can vary among participants is a variable that needs to be considered. Example 1.2.2 lists the variables we will use.
ID
Day of class
Class time
Whether or not the student is a morning person
Final grade
Whether or not the student works outside school
In the SPSS data window, columns represent variables, and rows represent participants. Therefore, we will be creating a data file with six columns (variables) and two rows (students/participants).
Section 1.3 Defining Variables
Before we can enter any data, we must first enter some basic information about each variable into SPSS. For instance, variables must first be given names that
begin with a letter, and
do not contain a space.
Thus, the variable name “Q7” is acceptable, while the variable name “7Q” is not. Similarly, the variable name “PRE_TEST” is acceptable, but the variable name “PRE TEST” is not. Capitalization does not matter, but variable names are capitalized in this text to make it clear when we are referring to a variable name, even if the variable name is not necessarily capitalized in screenshots.
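The two naming rules stated above are easy to express as a check. The sketch below tests only those two rules; the function name is hypothetical, and SPSS may enforce further restrictions that are not covered here.

```python
def is_valid_name(name):
    """Check only the two rules given above: starts with a letter, no spaces."""
    return bool(name) and name[0].isalpha() and " " not in name


for candidate in ["Q7", "7Q", "PRE_TEST", "PRE TEST"]:
    print(candidate, is_valid_name(candidate))
```

The four names are the ones used in the text: Q7 and PRE_TEST pass, while 7Q and PRE TEST fail.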
To define a variable, click on the Variable View tab at the bottom of the main screen. This will show you the Variable View window. To return to the Data View window, click on the Data View tab.
From the Variable View screen, SPSS allows you to create and edit all of the variables in your data file. Each column represents some property of a variable, and each row represents a variable. All variables must be given a name. To do that, click on the first empty cell in the Name column and type a valid SPSS variable name. The program will then fill in default values for most of the other properties.
One useful function of SPSS is the ability to define variable and value labels. Variable labels allow you to associate a description with each variable.
Value labels allow you to associate a description with each value of a variable. For instance, for most procedures, SPSS requires numerical values. Thus, for data such as the day of the class (i.e., Mon/Wed/Fri and Tues/Thurs), we need to first code the values as numbers. We can assign the number 1 to Mon/Wed/Fri and the number 2 to Tues/Thurs. To help us keep track of the numbers we have assigned to the values, we use value labels.
To assign value labels, click in the cell you want to assign values to in the Values column (in this case, for Variable 2). This will bring up a small gray button (shown below). Click on that button to bring up the Value Labels dialog box.
When you enter a value label, you must click Add after each entry. This will move the value and its associated label into the bottom section of the window. When all labels have been added, click OK to return to the Variable View window.
In addition to naming and labeling the variable, you have the option of defining the variable type. To do so, simply click on the Type, Width, or Decimals columns in the Variable View window. The default value is a numeric field that is eight digits wide with two decimal places displayed. If your data are more than eight digits to the left of the decimal place, they will be displayed in scientific notation (e.g., the number 2,000,000,000 will be displayed as 2.00E+09). SPSS maintains accuracy beyond two decimal places, but all output will be rounded to two decimal places unless otherwise indicated in the Decimals column.
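The difference between what is stored and what is displayed can be illustrated outside SPSS. The following sketch uses Python string formatting as an analogy only; it does not reproduce how SPSS formats values internally, and the stored value 3.14159 is an arbitrary example.

```python
value = 2_000_000_000
scientific = f"{value:.2E}"   # two decimals, scientific notation
print(scientific)             # 2.00E+09

stored = 3.14159              # arbitrary example value
displayed = f"{stored:.2f}"   # rounded for display only
print(displayed)
print(stored)                 # the stored value is unchanged
```

Rounding affects only the displayed string; any later calculation still uses the full stored value.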
There are several other options available in this screen, which are beyond the scope of this text. In our example, we will be using numeric variables with all the default values.
Create a data file for the six variables and two sample students presented in Example 1.2.1. Name your variables: ID, DAY, TIME, MORNING, GRADE, and WORK. You should code DAY as 1 = Mon/Wed/Fri, 2 = Tues/Thurs. Code TIME as 1 = morning, 2 = afternoon. Code MORNING as 0 = No, 1 = Yes. Code WORK as 0 = No, 1 = Part-time, 2 = Full-time. Be sure you enter value labels for the different variables. Note that because value labels are not appropriate for ID and GRADE (because the values themselves serve as labels), these are not coded. When complete, your Variable View window should look like the screenshot below.
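The coding scheme in this exercise amounts to a lookup table from numeric codes to labels. Here is a minimal sketch of that idea: the codes and labels are the ones listed above, while the student record and the helper function are hypothetical.

```python
# Codes and labels as specified in the exercise.
VALUE_LABELS = {
    "DAY": {1: "Mon/Wed/Fri", 2: "Tues/Thurs"},
    "TIME": {1: "morning", 2: "afternoon"},
    "MORNING": {0: "No", 1: "Yes"},
    "WORK": {0: "No", 1: "Part-time", 2: "Full-time"},
}


def label_for(variable, code):
    """Return the label for a code; unlabeled variables show the value itself."""
    return VALUE_LABELS.get(variable, {}).get(code, str(code))


# Hypothetical record for one student (values invented for illustration).
student = {"ID": 1001, "DAY": 2, "TIME": 1, "MORNING": 0, "GRADE": 88, "WORK": 1}

labeled = {name: label_for(name, code) for name, code in student.items()}
print(labeled)
```

ID and GRADE have no entry in the table, so their values pass through unchanged, matching the note that value labels are not appropriate for them.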
Click on the Data View tab to open the data-entry screen. Enter data horizontally, beginning with the first student’s ID number. Enter the code for each variable in the appropriate column. To enter the GRADE variable value, enter the student’s class grade.
The previous data window can be changed to look like the screenshot on the next page by clicking on the Value Labels icon (see below). In this case, the cells display value labels rather than the corresponding codes. If data are entered in this mode, it is not necessary to enter codes, as clicking the button that appears in each cell as the cell is selected will present a drop-down list of the predefined labels. You may use whichever method you prefer.
Instead of clicking the Value Labels icon, you may toggle between views by clicking Value Labels under the View menu.
Section 1.4 Loading and Saving Data Files
Once you have entered your data, you will need to save it with a unique name so that you can retrieve it when necessary for later use.
Loading and saving SPSS data files works in the same way as most Windows-based software. Under the File menu, there are Open, Save, and Save As commands. SPSS data files have a “.sav” extension, which is added by default to the end of the filename (that is, do not type “.sav” after the filename; SPSS will add it automatically). This tells Windows that the file is an SPSS data file. Other SPSS extensions include “.spv” for saved output files and “.sps” for saved syntax files.
Save Your Data
When you save your data file (by clicking File, then clicking Save or Save As to specify a unique name), pay special attention to where you save it. You will probably want to save your data on a removable USB drive so that you can take the file with you.
Load Your Data
When you load your data (by clicking File, then clicking Open, then Data, or by clicking the open file folder icon), you get a similar window. This window lists all files with the “.sav” extension. If you have trouble locating your saved file, make sure you are looking in the right directory.
To be sure that you have mastered saving and opening data files, name your sample data file “SAMPLE” and save it to a removable storage medium. Once it is saved, SPSS will display the name of the file at the top of the data window.
It is wise to save your work frequently, in case of computer crashes. Note that filenames may be uppercase or lowercase. In this text, uppercase is used for clarity. In naming files, though, screenshots may show lowercase only.
After you have saved your data, exit SPSS (by clicking File, then Exit). Restart SPSS and load your data by selecting the “SAMPLE.sav” file you just created.
Section 1.5 Running Your First Analysis
Any time you open a data window, you can run any of the analyses available. To get started, we will calculate the students’ average grade. (With only two students, you can easily check your answer by hand, but imagine a data file with 10,000 student records.)
The majority of the available statistical tests are under the Analyze menu. This menu displays all the options available for your version of the SPSS program (the menus in this book were created with SPSS Statistics Version 26). Other versions may have slightly different sets of options.
To calculate a mean (average), we are asking the computer to summarize our dataset. Therefore, we run the command by clicking Analyze, then Descriptive Statistics, then Descriptives.
This brings up the Descriptives dialog box. Note that the left side of the box contains a list of all the variables in our data file. On the right is an area labeled Variable(s), where we can specify the variables we would like to use in this particular analysis.
We want to compute the mean for the variable called GRADE. Thus, we need to select the variable name in the left window (by clicking on it). To transfer it to the right window, click on the right arrow between the two windows. The arrow always points to the window opposite the highlighted item and can be used to transfer selected variables in either direction. Note that double-clicking on the variable name will also transfer the variable to the opposite window. Standard Windows conventions of “Shift” clicking or “Ctrl” clicking to select multiple variables can be used as well. Note: Some configurations of SPSS show the variable names, and others show the variable labels (if any). This can be changed under Edit → Options → General.
When we click on the OK button, the analysis will be conducted, and we will be ready to examine our output.
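For a check by hand, the same kind of summary can be computed in a few lines. This is a sketch only: the two grades are hypothetical stand-ins for the sample students, and the set of statistics shown is an assumption about what a basic descriptive summary typically reports.

```python
import statistics

# Hypothetical grades for the two sample students.
grades = [85, 83]

summary = {
    "N": len(grades),
    "Minimum": min(grades),
    "Maximum": max(grades),
    "Mean": statistics.mean(grades),
    # Sample standard deviation (divides by n - 1).
    "Std. Deviation": statistics.stdev(grades),
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

With only two values the mean is easy to verify by hand, which is the point of starting with a small file before moving to thousands of records.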
Section 1.6 Examining and Printing Output Files
After an analysis is performed, the output is placed in the output window, and the output window becomes the active window. If this is the first analysis you have conducted since starting SPSS, then a new output window will be created. If you have run previous analyses and saved them, your output is added to the end of your previous output.
To switch back and forth between the data window and the output window, select the desired window from the Window menu bar. Alternatively, you can select the window using the taskbar at the bottom of the screen.
The output window is split into two sections. The left section is an outline of the output (SPSS refers to this as the outline view). The right section is the output itself.
The section on the left of the output window provides an outline of the entire output window. All of the analyses are listed in the order in which they were conducted. Note that this outline can be used to quickly locate a section of the output. Simply click on the section you would like to see, and the right window will jump to the appropriate place.
Clicking on a statistical procedure also selects all of the output for that command. By pressing the Delete key, that output can be deleted from the output window. This is a quick way to be sure that the output window contains only the desired output. Output can also be selected and pasted into a word processor or spreadsheet by clicking Edit, then Copy to copy the output. You can then switch to your word processor and click Edit, then Paste.
To print your output, simply click File, then Print, or click on the printer icon on the toolbar. You will have the option of printing all of your output or just the currently selected section. Be careful when printing! Each time you run a command, the output is added to the end of your previous output. Thus, you could be printing a very large output file containing information you may not want or need.
One way to ensure that your output window contains only the results of the current command is to create a new output window just before running the command. To do this, click File, then New, then Output. All your subsequent commands will go into your new output window.
You can also save your output files as SPSS format files (.spv extension). Note that SPSS saves whatever window you have open. If you are on a data window, you will save your data. If you are on an output window, you will save your output.
Load the sample data file you created earlier (SAMPLE.sav). Run the Descriptives command for the variable GRADE, and print the output. Next, select the data window and print it.
Section 1.7 Modifying Data Files
Once you have created a data file, it is really quite simple to add additional cases (rows/participants) or additional variables (columns).
To add these data, simply place two additional rows in the Data View window (after loading your sample data). Notice that as new participants are added, the row numbers become bold. When done, the screen should look like the screenshot above.
New variables can also be added. For example, if the first two participants were given special training on time management, and the two new participants were not, the data file can be changed to reflect this additional information. The new variable could be called TRAINING (whether or not the participant received training), and it would be coded so that 0 = No and 1 = Yes. Thus, the first two participants would be assigned a “1” and the last two participants a “0.” To do this, switch to the Variable View window, then add the TRAINING variable to the bottom of the list. Then switch back to the Data View window to update the data.
Adding data and variables are logical extensions of the procedures we used to originally create the data file. Save this new data file. We will be using it again later in this book.
Follow the previous example (where TRAINING is the new variable). Make the modifications to your SAMPLE.sav data file and save it.
Entering and Modifying Data
In Chapter 1, we learned how to create and save a simple data file, perform a basic analysis, and examine the output. In this chapter, we will go into more detail about variables and data.
Section 2.1 Variables and Data Representation
In SPSS, variables are represented as columns in the data file. Participants are represented as rows. Thus, if we collect four pieces of information from 100 participants, we will have a data file with four columns and 100 rows.
There are four types of measurement scales: nominal, ordinal, interval, and ratio. While the measurement scale will determine which statistical technique is appropriate for a given set of data, SPSS generally does not discriminate. Thus, we start this section with this warning: If you ask it to, SPSS may conduct an analysis that is not appropriate for your data. For a more complete description of these four measurement scales, consult your statistics text or the Glossary in Appendix E.
Newer versions of SPSS allow you to indicate which types of data you have when you define your variable. You do this using the Measure column. You can indicate Scale, Ordinal, or Nominal (SPSS does not distinguish between interval and ratio scales). Look at the SAMPLE.sav data file we created in Chapter 1. We calculated a mean for the variable GRADE. GRADE was measured on a ratio scale, and the mean is an acceptable summary statistic (assuming that the distribution is normal).
We could have had SPSS calculate a mean for the variable TIME instead of GRADE. If we did, we would get the output presented on the next page.
The output indicates that the average TIME was 1.25. Remember that TIME was coded as an ordinal variable (1 = morning class, 2 = afternoon class). Though the mean is not an appropriate statistic for an ordinal scale, SPSS calculated it anyway. The importance of considering the type of data cannot be overemphasized. Just because SPSS will compute a statistic for you does not mean that you should use it. Later in the text, when specific statistical procedures are discussed, the conditions under which they are appropriate will be addressed. Please note that there are some procedures (e.g., graphs and nonparametric tests) where SPSS limits what you can do based on the measurement scale. However, more often than not, it is up to the user to make that decision.
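The warning can be made concrete with the TIME codes. In the sketch below, the four codes are an assumption: with four participants, three 1s and one 2 reproduce the reported mean of 1.25, but the order of the cases is invented.

```python
import statistics
from collections import Counter

# 1 = morning class, 2 = afternoon class (coding from the text).
# Three 1s and one 2 give the reported mean; the case order is assumed.
time_codes = [2, 1, 1, 1]

mean_time = statistics.mean(time_codes)
print(mean_time)  # computable, but there is no "1.25" class time

labels = {1: "morning", 2: "afternoon"}
counts = Counter(labels[code] for code in time_codes)
print(counts)     # a frequency count is easier to interpret
```

The software happily averages the codes, but a count of how many students fall into each category says something a reader can actually use.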
Often, participants do not provide complete data. For example, for some students, you may have a pretest score but not a posttest score. Perhaps one student left one question blank on a survey, or perhaps she did not state her age. Missing data can weaken any analysis. Often, a single missing answer can eliminate a subject from all analyses.
If you have missing data in your dataset, leave that cell blank. In the example shown above, the fourth subject did not complete Question 2 (q2). Note that the total score (which is calculated from both questions) is also blank because of the missing data for Question 2. SPSS represents missing data in the data window with a period (although you should not enter a period—just leave it blank). It is NOT good practice to create a filler value (e.g., “999” or “0”) to represent blank scores, because SPSS will see it as a value with meaning, whereas it will treat truly blank values as missing.
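The effect of a filler value is easy to demonstrate. This is a minimal sketch with hypothetical scores: None stands in for a blank cell, and the helper functions are illustrative rather than a description of how SPSS handles missing values internally.

```python
import statistics


def mean_of_present(values):
    """Average the values that are present, skipping None (a blank cell)."""
    present = [v for v in values if v is not None]
    return statistics.mean(present)


def total_score(q1, q2):
    """A total is missing whenever either component is missing."""
    if q1 is None or q2 is None:
        return None
    return q1 + q2


# Hypothetical Question 2 scores; the fourth subject left it blank.
q2_blank = [3, 4, 2, None]
q2_filler = [3, 4, 2, 999]   # same data with a filler value instead

mean_blank = mean_of_present(q2_blank)
mean_filler = mean_of_present(q2_filler)
print(mean_blank, mean_filler)
print(total_score(2, None))
```

The blank is skipped and the mean reflects the three real answers, whereas the filler value is treated as a genuine score and distorts the result.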
Section 2.2 Selection and Transformation of Data
We often have more data in a data file than we want to include in a specific analysis. For instance, our sample data file contains data from four participants, two of whom received special training and two of whom did not. If we wanted to conduct an analysis using only the two participants who did not receive the training, we would need to specify the appropriate subset.
Selecting a Subset
We can use the Select Cases command to specify a subset of our data. The Select Cases command is located under the Data menu. When you select this command, the dialog box below will appear. (Note the icons next to the variable names that indicate that all variables were defined as being measured on a nominal scale except grade, which was defined as scale.)
You can specify which cases (participants) you want to select by using the selection criteria, which appear on the right side of the Select Cases dialog box. By default, All cases will be selected. The most common way to select a subset is to click If condition is satisfied, then click on the button labeled If. This will bring up a new dialog box that allows you to indicate which cases you would like to use.
You can enter the logic used to select the subset in the upper section. If the logical statement is true for a given case, then that case will be selected. If the logical statement is false, that case will not be selected. For instance, you can select all cases that were coded as Mon/Wed/Fri by entering the formula DAY = 1 in the upper-left part of the window. If DAY is 1, then the statement will be true, and SPSS will select the case. If DAY is anything other than 1, the statement will be false, and the case will not be selected. Once you have entered the logical statement, click Continue to return to the Select Cases dialog box. Then, click OK to return to the data window.
After you have selected the cases, the data window will slightly change. The cases that were not selected will be marked with a diagonal line through the case number. For instance, for our sample data, the first and third cases are not selected. Only the second and fourth cases are selected for this subset.
An additional variable will also be created in your data file. The new variable is called FILTER_$ and indicates whether a case was selected or not.
If we calculate a mean GRADE using the subset we just selected, we will receive the output here. Notice that we now have a mean of 78.00 with a sample size (N) of 2 instead of 4.
Be careful when you select subsets. The subset remains in effect until you run the command again and select all cases. You can tell if you have a subset selected because the bottom of the data window will indicate that a filter is on. In addition, when you examine your output, N will be less than the total number of records in your dataset if a subset is selected. The diagonal lines through some cases will also be evident when a subset is selected. Be careful not to save your data file with a subset selected, as this can cause considerable confusion later.
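The logic of selecting a subset with DAY = 1 can be sketched as follows. The data are hypothetical, chosen only to be consistent with the results described above (the second and fourth cases selected, a subset mean of 78.00), and the 1/0 coding of the filter variable is an assumption.

```python
import statistics

# Hypothetical four-participant file (IDs, DAY codes, and grades are invented).
cases = [
    {"ID": 1, "DAY": 2, "GRADE": 85},
    {"ID": 2, "DAY": 1, "GRADE": 83},
    {"ID": 3, "DAY": 2, "GRADE": 80},
    {"ID": 4, "DAY": 1, "GRADE": 73},
]

# Mimic the filter variable: 1 = selected, 0 = not selected (assumed coding).
for case in cases:
    case["FILTER_$"] = 1 if case["DAY"] == 1 else 0

selected = [case for case in cases if case["FILTER_$"] == 1]
subset_mean = statistics.mean(case["GRADE"] for case in selected)

print(len(selected), f"{subset_mean:.2f}")
print(len(cases))  # the full file is untouched; only the analysis is filtered
```

The unselected cases remain in the file, which is why a forgotten filter can quietly shrink N in every later analysis.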
Computing a New Variable
SPSS can also be used to compute a new variable or manipulate your existing variables. To illustrate this, we will create a new data file. This file will contain data for four participants and three variables (Q1, Q2, and Q3). The variables represent the number of points each participant received on three different questions. Now enter the data shown on the screen below. When done, save this data file as “QUESTIONS.sav.” We will be using it again in later chapters.
Now you will calculate the total score for each subject. We could do this manually, but if the data file were large, or if there were a lot of questions, this would take a long time. It is more efficient (and more accurate) to have SPSS compute the totals for you. To do this, click Transform, and then click Compute Variable.
After clicking the Compute Variable command, we get the dialog box shown below.
The blank field marked Target Variable is where we enter the name of the new variable we want to create. In this example, we are creating a variable called TOTAL, so type the word total.
Notice that there is an equals sign between the Target Variable blank and the Numeric Expression blank. These two blank areas are the two sides of an equation that SPSS will calculate. For instance, total = Q1 + Q2 + Q3 is the equation that is entered in the sample presented here (screenshot shown above). Note that it is possible to create any equation here simply by using the number and operational keypad at the bottom of the dialog box. When we click OK, SPSS will create a new variable called TOTAL and make it equal to the sum of the three questions.
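The equation total = Q1 + Q2 + Q3 is applied once per row. A minimal sketch of that row-by-row calculation, using hypothetical points for the four participants:

```python
# Hypothetical points for four participants on three questions.
rows = [
    {"Q1": 3, "Q2": 3, "Q3": 4},
    {"Q1": 4, "Q2": 2, "Q3": 3},
    {"Q1": 2, "Q2": 2, "Q3": 3},
    {"Q1": 1, "Q2": 3, "Q3": 1},
]

# Add the new variable to every row, as a computed column would be.
for row in rows:
    row["TOTAL"] = row["Q1"] + row["Q2"] + row["Q3"]

totals = [row["TOTAL"] for row in rows]
print(totals)
```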
Save your data file again so that the new variable will be available for future sessions.
Recoding a Variable—Different Variable
SPSS can create a new variable based upon data from another variable. Say we want to split our participants on the basis of their total score. We want to create a variable called GROUP, which is coded 1 if the total score is low (less than or equal to 8) or 2 if the total score is high (9 or larger). To do this, we click Transform, then Recode into Different Variables.
This will bring up the Recode into Different Variables dialog box shown above. Transfer the variable TOTAL to the middle blank. Type group in the Name field under Output Variable. Click Change, and the middle blank will show that TOTAL is becoming GROUP, as shown below.
Click Old and New Values. This will bring up the Recode dialog box below.
In the example shown here, we have entered a 9 in the Range, value through HIGHEST field, and a 2 in the Value field under New Value. When we click Add, the blank on the right displays the recoding formula. We next entered an 8 on the left in the Range, LOWEST through value blank, and a 1 in the Value field under New Value. Click Add, then Continue.
Click OK. You will be redirected to the data window shown below. A new variable (GROUP) will have been added and coded as 1 or 2, based on TOTAL.
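The two recoding rules (LOWEST through 8 becomes 1, 9 through HIGHEST becomes 2) can be sketched as a small function. The TOTAL scores are hypothetical, and returning None for a value that matches neither range is an assumption made for this illustration.

```python
def recode_group(total):
    """LOWEST through 8 -> 1; 9 through HIGHEST -> 2; otherwise no group."""
    if total <= 8:
        return 1
    if total >= 9:
        return 2
    return None  # a value between 8 and 9 matches neither rule


totals = [10, 9, 7, 5]  # hypothetical TOTAL scores
groups = [recode_group(total) for total in totals]
print(groups)
```

Because the original variable is left alone and the result goes into GROUP, the recode can be checked against TOTAL row by row.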