A CURRICULUM FRAMEWORK FOR

 

PREK-12 STATISTICS EDUCATION

 

 

 

Writers

Christine Franklin

Gary Kader

Denise S. Mewborn

Jerry Moreno

Roxy Peck

Mike Perry

Richard Schaeffer

 

 

Advisors

Susan Friel

Landy Godbold

Brad Hartlaub

Peter Holmes

Cliff Konold

 

 

Presented to the American Statistical Association

Board of Directors for Endorsement

 

March 2005

 

 


Table of Contents

 

Introduction

Level A

Level B

Level C

References

Appendix for Level A

Appendix for Level B

Appendix for Level C

 

 

 

 



 

Introduction

 

The Ultimate Goal:  Statistical Literacy

 

Every morning the newspaper or other media confront us with statistical information on topics which range from the economy to education, from movies to sports, from food to medicine, from public opinion to social behavior; such information informs decisions in our personal lives and enables us to meet our responsibilities as citizens.  At work we may be confronted by quantitative information on budgets, supplies, manufacturing specifications, market demands, sales forecasts or workloads. Teachers may be confronted with educational statistics concerning student performance or their own accountability. Medical scientists must understand the statistical results of experiments used for testing the effectiveness and safety of drugs. Law enforcement professionals depend on crime statistics. If we consider changing jobs and moving to another community, then our decision can be informed by statistics about cost of living, crime rate, and educational quality.

 

Our lives are governed by numbers. Every high school graduate should be able to use sound statistical reasoning in order to cope intelligently with the requirements of citizenship, employment and family, and to be prepared for a healthy, happy and productive life.

 

Citizenship

 

Public opinion polls are the most visible examples of a statistical application that has an impact on our lives. In addition to informing individual citizens directly, polls are used by others in ways that affect us. The political process employs opinion polls in several ways. Candidates for office use polling to guide campaign strategy. A poll can determine a candidate's strengths with voters, which can in turn be emphasized in the campaign. Citizens might also be suspicious that poll results influence a candidate to take positions just because they are popular.

 

A citizen informed by polls needs to understand that the results were determined from a sample of the population under study, that the reliability of the results depends on how the sample was selected, and that the results are subject to sampling error. The statistically literate citizen should understand the behavior of "random" samples and be able to interpret a "margin of sampling error".

 

The Federal Government has been in the statistics business from its very inception. The U.S. Census was established in 1790 to provide an official count of the population for the purpose of allocating representatives to Congress. Not only has the role of the Census Bureau greatly expanded to include the collection of a broad spectrum of socio-economic data, but other Federal departments also produce extensive "official" statistics concerned with agriculture, health, education, environment and commerce. The information gathered by these agencies influences policy making, helps to determine priorities for government spending, and is also available for general use by individuals or private groups. Thus, statistics compiled by government agencies have a tremendous impact on the life of the ordinary citizen.

 

Personal Choices

 

Statistical literacy is required for daily personal choices. Statistics provides information on the composition of foods and thus informs our choices at the grocery store. Statistics helps to establish the safety and effectiveness of drugs to help us choose a treatment. Statistics helps to establish the safety of toys to assure that our little ones are not at risk. Our investment choices are guided by a plethora of statistical information about stocks and bonds. The Nielsen ratings decide which shows will survive on television and thus affect what is available. Many products have a previous statistical history, and our choices of products can be affected by awareness of this history. The design of an automobile is aided by anthropometrics, the statistics of the human body, to enhance passenger comfort. Statistical ratings of fuel efficiency, safety and reliability are available to help us select a vehicle.

 

The Workplace and Professions

 

Individuals who are prepared to use statistical thinking in their careers will have the opportunity to advance to more rewarding and challenging positions. A statistically competent work force will allow the United States to be more competitive in the global marketplace and improve its position in the international economy. An investment in statistical literacy is an investment in our nation's economic future as well as the well-being of individuals.

 

Efforts to improve quality and accountability are prominent among the many ways that statistical thinking and tools can be used to enhance productivity. The competitive marketplace demands quality. Quality control practices such as the statistical monitoring of design and manufacturing processes identify where improvement can be made and lead to better product quality. Systems of accountability can help produce more effective employees and organizations, but many accountability systems now in place are not based on sound statistical principles and may, in fact, have the opposite effect from the one desired.  Good accountability systems require proper use of statistical tools to determine and apply appropriate criteria.

 

Science

 

Life expectancy in the USA almost doubled during the 20th century and this rapid increase in life spans is the consequence of science. Science has enabled us to improve medical care and procedures, food production, and the detection and prevention of epidemics. And statistics plays a prominent role in this scientific progress.

 

The Food and Drug Administration requires extensive testing of drugs to determine effectiveness and side effects before they can be sold. A recent advertisement for a drug designed to reduce blood clots stated "PLAVIX, added to aspirin and your current medications, helps raise your protection against heart attack or stroke".  But the advertisement also warns that "The risk of bleeding may increase with PLAVIX..."

This was determined by a clinical trial involving over 12,000 subjects. Among the 6,259 taking PLAVIX plus aspirin, 3.7% showed major bleeding problems, while only 2.7% of the 6,303 taking the placebo had major bleeding. This is viewed as a "statistically significant" result.
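One common way to check a claim of this kind is a two-proportion z-test. The sketch below is only an illustration under stated assumptions: it uses the rounded percentages and group sizes quoted above, and the function name `two_proportion_z` is hypothetical. The text does not say which test the trial's investigators actually used.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(p1, n1, p2, n2):
    """Return (z, two-sided p-value) for the hypothesis that two groups
    share the same underlying proportion (pooled standard error)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Rounded figures quoted in the text: 3.7% of 6,259 versus 2.7% of 6,303.
z, p_value = two_proportion_z(0.037, 6259, 0.027, 6303)
print(f"z = {z:.2f}, two-sided p-value = {p_value:.4f}")
```

A z value this far from zero would be unusual if the two groups really had the same bleeding rate, which is the sense in which the difference is called statistically significant.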

 

Statistical literacy involves a healthy dose of skepticism about "scientific" findings.  Is the information about side effects of PLAVIX treatment reliable? A statistically literate person should ask such questions and be able to answer them intelligently. A statistically literate high school graduate will be able to understand the conclusions from scientific investigations and to offer an informed opinion about the legitimacy of the reported results.  To quote from Mathematics and Democracy: The Case for Quantitative Literacy (Steen, 2001), such knowledge "empowers people by giving them tools to think for themselves, to ask intelligent questions of experts, and to confront authority confidently.  These are skills required to survive in the modern world".

 

Summary

 

Statistical literacy is essential in our personal lives as consumers, citizens and professionals. Statistics plays a role in our health and happiness. Sound statistical reasoning skills take a long time to develop. They cannot be honed to the level needed in the modern world through one high school course. The surest way to reach the necessary skill level is to begin the educational process in the elementary grades and keep strengthening and expanding these skills throughout the middle and high school years. A statistically literate high school graduate will know how to interpret the data in the morning newspaper and will ask the right questions about statistical claims. He or she will be comfortable handling quantitative decisions that come up on the job, and will be able to make informed decisions about quality-of-life issues.

 

The remainder of this document lays out a framework for educational programs designed to help students achieve this noble end.

 

The Case for Statistics Education

 

Over the past quarter century, statistics (often labeled data analysis and probability) has become a key component of the K-12 mathematics curriculum.  Advances in technology and in modern methods of data analysis in the 1980s, coupled with the data richness of society in the information age, led to the development of curriculum materials geared toward introducing statistical concepts into the school curriculum as early as the elementary grades.  This grass-roots effort was given sanction by the National Council of Teachers of Mathematics (NCTM) when their influential document Curriculum and Evaluation Standards for School Mathematics (NCTM, 1989) included Data Analysis and Probability as one of the five content strands.  As this document and its 2000 replacement, entitled Principles and Standards for School Mathematics (NCTM, 2000), became the basis for reform of mathematics curricula in many states, the acceptance of and interest in statistics as part of mathematics education gained strength.  In recent years many mathematics educators and statisticians have devoted large segments of their careers to the improvement of statistics education materials and pedagogical techniques.

           

NCTM is not the only group calling for improved statistics education beginning at the school level.  The National Assessment of Educational Progress (NAEP, 2005) is developed around the same strands as in the NCTM Standards, with data analysis and probability questions playing an increasingly prominent role in the NAEP exam. 

 

The emerging quantitative literacy movement calls for greater emphasis on practical quantitative skills that will help assure success for high school graduates in life and work; many of these skills are statistical in nature. To quote from Mathematics and Democracy: The Case for Quantitative Literacy (Steen, 2001):

 

•  Quantitative literacy, also called numeracy, is the natural tool for comprehending information in the computer age. The expectation that ordinary citizens be quantitatively literate is primarily a phenomenon of the late twentieth century.

•  Unfortunately, despite years of study and life experience in an environment immersed in data, many educated adults remain functionally innumerate.

•  Quantitative literacy empowers people by giving them tools to think for themselves, to ask intelligent questions of experts, and to confront authority confidently.  These are the skills required to thrive in the modern world.

 

A recent study from the American Diploma Project, entitled Ready or Not: Creating a High School Diploma That Counts, recommends that the "must have" competencies needed for high school graduates "to succeed in postsecondary education or in high-performance, high-growth jobs" include, in addition to algebra and geometry, aspects of data analysis, statistics, and other applications that are vitally important for other subjects as well as for employment in today's data-rich economy.

 

Statistics education as proposed in this Framework can enable the "must have" competencies for graduates to "thrive in the modern world".

 

NCTM Standards and the Framework

 

The main objective of this document is to provide a conceptual Framework for K-12 statistics education. The foundation for this Framework rests on the NCTM Principles and Standards for School Mathematics (2000).

 

The Framework is intended to support the objectives of the NCTM Principles and Standards.   It is intended to complement the NCTM recommendations, not to supplant them.

 

The NCTM Principles and Standards describes the statistics content strand as follows.

 

Data Analysis and Probability

 

Instructional programs from pre-kindergarten through grade 12 should enable all students to:

 

•  formulate questions that can be addressed with data and collect, organize, and display relevant data to answer them;

•  select and use appropriate statistical methods to analyze data;

•  develop and evaluate inferences and predictions that are based on data;

•  understand and apply basic concepts of probability.

 

The Data Analysis and Probability Standard recommends that students formulate questions that can be answered using data and addresses what is involved in gathering and using the data wisely. Students should learn how to collect data, organize their own or others' data, and display the data in graphs and charts that will be useful in answering their questions. This Standard also includes learning some methods for analyzing data and some ways of making inferences and drawing conclusions from data. The basic concepts and applications of probability are also addressed, with an emphasis on the way that probability and statistics are related.

 

The NCTM Standards elaborates on these themes somewhat and provides examples of the types of lessons and activities that might be used in a classroom. More complete examples can be found in the NCTM Navigation Series on Data Analysis and Probability (2002-2004). Statistics, however, is a relatively new subject for many teachers, who have not had an opportunity to develop sound knowledge of the principles and concepts underlying the practices of data analysis that they are now called upon to teach. These teachers may not clearly understand the difference between statistics and mathematics. They may not see the statistics curriculum for grades K-12 as a cohesive and coherent curriculum strand, or see how the overall statistics curriculum provides a developmental sequence of learning experiences.

 

This Framework provides a conceptual structure for statistics education which gives a coherent picture of the overall curriculum. This structure adds to but does not replace the NCTM recommendations.

 

The Difference between Statistics and Mathematics

 

"Statistics is a methodological discipline. It exists not for itself but rather to offer to other fields of study a coherent set of ideas and tools for dealing with data. The need for such a discipline arises from the omnipresence of variability" (Cobb and Moore, 1997).

 

A major objective of statistics education is to help students develop statistical thinking.  Statistical thinking, in large part, must deal with this omnipresence of variability; statistical problem solving and decision making depend on understanding, explaining and quantifying the variability in the data.

 

It is this focus on variability in data that sets statistics apart from mathematics.

 

The Nature of Variability

 

There are many different sources of variability in data. Some of the important sources are described below.

 

Measurement Variability

Repeated measurements on the same individual vary. Sometimes two measurements vary because the measuring device produces unreliable results, as when we try to measure a large distance with a small ruler. Other times variability results from changes in the system being measured. For example, even with a very precise measuring device, your recorded blood pressure would differ from one moment to the next.

 

Natural Variability

Variability is inherent in nature. Individuals are different. When we measure the same quantity across several individuals we are bound to get some differences in the measurements. Although some of this may be due to our measuring instrument, most of it is simply due to the fact that individuals differ. People naturally have different heights, different aptitudes and abilities, or different opinions and emotional responses. When we measure any one of these traits we are bound to get variability in the measurements. Different seeds for the same variety of bean will grow to different sizes when subjected to the same environment because no two seeds are exactly alike; there is bound to be variability from seed to seed in the measurements of growth.

 

Induced Variability

If we plant one pack of bean seeds in one field, and another pack of seeds in another location with a different climate, then an observed difference in growth between the seeds in one location and those in the other might be due to inherent differences in the seeds (natural variability), or it might be due to the fact that the locations are not the same.  If one type of fertilizer is used on one field and another type on the other, then observed differences might be due to the difference in fertilizers. For that matter, the observed difference might be due to a factor that we haven't even thought about. A more carefully designed experiment can help us to determine the effects of different factors.

 

This one basic idea, comparing natural variability to the variability induced by other factors, forms the heart of modern statistics. It has allowed medical science to conclude that some drugs are effective and safe, whereas others are ineffective or have harmful side effects. It has been employed by agricultural scientists to demonstrate that a variety of corn grows better in one climate than another, that one fertilizer is more effective than another, or one type of feed is better for beef cattle than another.

 

Sampling Variability

In a voter poll, it seems reasonable to use the proportion of voters surveyed who support a particular candidate (a sample statistic) as an estimate of the unknown proportion of all voters who support that candidate. But if a second sample of the same size is used, it is almost certain that there would not be exactly the same proportion of voters in the sample who support the candidate. The value of the sample proportion will vary from sample to sample. This is called sampling variability. So what is to keep one sample from estimating that the true proportion is 0.60 and another from saying it is 0.40? This is possible but unlikely if proper sampling techniques are used. Poll results are useful because these techniques and an adequate sample size can assure that unacceptable differences among samples are quite unlikely.
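Sampling variability can be seen directly in a small simulation. The following is a minimal sketch, not a description of any real poll: the population proportion (0.50), the sample size (1,000), the number of repeated samples (200) and the helper name `sample_proportion` are all assumptions chosen for illustration.

```python
import random


def sample_proportion(true_p, n, rng):
    """Proportion of supporters in one simple random sample of size n."""
    return sum(rng.random() < true_p for _ in range(n)) / n


rng = random.Random(2005)  # fixed seed so the sketch is reproducible
TRUE_P = 0.50              # hypothetical proportion of all voters in favor
N = 1000                   # hypothetical number of voters per sample

# Draw 200 independent samples and record the estimate from each one.
proportions = [sample_proportion(TRUE_P, N, rng) for _ in range(200)]

print(f"smallest estimate: {min(proportions):.3f}")
print(f"largest estimate:  {max(proportions):.3f}")
```

The estimates differ from sample to sample, but with random sampling and 1,000 voters per sample they cluster near the true proportion; values as far away as 0.40 or 0.60 are extremely unlikely under these assumptions.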

 

An excellent discussion of the nature of variability is given in Utts (1999).

 

The Role of Context

 

"The focus on variability naturally gives statistics a particular content that sets it apart from mathematics itself and from other mathematical sciences, but there is more than just content that distinguishes statistical thinking from mathematics. Statistics requires a different kind of thinking, because data are not just numbers, they are numbers with a context" (Cobb and Moore, 1997).

 

Many mathematics problems arise from applied contexts, but the context is removed to reveal mathematical patterns.

 

Statisticians, like mathematicians, look for patterns, but the meaning of the patterns depends on the context.

 

"In mathematics, context obscures structure. In data analysis, context provides meaning" (Cobb and Moore, 1997).

 

A graph that appears occasionally in the business section of newspapers shows a plot of the Dow Jones Industrial Average (DJIA) over a ten-year period.  The variability of stock prices draws the attention of an investor. This stock index may go up or down over some intervals of time, and may fall or rise sharply over a short period.  In context the graph raises questions. A serious investor is not only interested in when or how rapidly the index goes up or down, but also why. What was going on in the world when the market went up, and what was going on when it went down? But strip away the context. Remove time (years) from the horizontal axis and call it "X", remove stock value (DJIA) from the vertical axis and call it "Y", and there remains a graph of very little interest or mathematical content!

 

Probability

 

Probability is a tool for statistics

Probability is an important part of any mathematical education. It is a part of mathematics that enriches the subject as a whole by its interactions with other uses of mathematics. Probability is an essential tool in applied mathematics and mathematical modeling. It is also an essential tool in statistics.

 

The use of probability as a mathematical model and the use of probability as a tool in statistics employ not only different approaches, but also different kinds of reasoning.

Two problems and the nature of the solutions will illustrate the difference.

 

Problem 1

Assume a coin is "fair".

Question: If we toss the coin 5 times, how many heads will we get?

 

Problem 2

You pick up a coin.

Question: Is this a fair coin?

 

Problem 1 is a mathematical probability problem.

Problem 2 is a statistics problem that can use the mathematical probability model determined in problem 1 as a tool to seek a solution.

 

The answer to neither question is deterministic. Coin tossing produces random outcomes, which suggests that the answer is probabilistic. The solution to problem 1 starts with the assumption that the coin is fair and proceeds to logically deduce the numerical probabilities for each possible number of heads: 0, 1, ..., 5.
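The deduction for problem 1 can be written out with the binomial model: for a fair coin, the probability of k heads in 5 tosses is the number of ways to choose which k tosses are heads, divided by the 2^5 = 32 equally likely sequences. The short sketch below computes these probabilities; the function name `heads_distribution` is an illustrative choice, not something defined in this document.

```python
from math import comb


def heads_distribution(n_tosses, p_heads=0.5):
    """Binomial probabilities for 0, 1, ..., n_tosses heads."""
    return {
        k: comb(n_tosses, k) * p_heads**k * (1 - p_heads) ** (n_tosses - k)
        for k in range(n_tosses + 1)
    }


fair_model = heads_distribution(5)
for k, prob in fair_model.items():
    print(f"P({k} heads) = {prob:.5f}")
```

The six probabilities are 1/32, 5/32, 10/32, 10/32, 5/32 and 1/32, so 2 or 3 heads are the most likely outcomes, but no single outcome is certain.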

 

The solution to problem 2 starts with an unfamiliar coin; we don't know if it is fair or biased. The search for an answer is experimental: toss the coin and see what happens. Examine the resulting data to see if they look like they came from a fair coin or a biased coin. There are several possible approaches, including the following. Toss the coin 5 times and record the number of heads. Then do it again: toss the coin 5 times and record the number of heads. Repeat 100 times. Compile the frequencies of outcomes for each possible number of heads. Compare these results to the frequencies predicted by the mathematical model for a fair coin in problem 1. If the empirical frequencies from the experiment are quite dissimilar from those predicted by the mathematical model for a fair coin, and the differences are not likely to be caused by random variation in coin tosses, then we conclude the coin is not fair. In this case we induce an answer by making a general conclusion from observations of experimental results.
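The experiment described above can be sketched in code. Everything specific here is an assumption made for illustration: a simulated coin with a hypothetical heads probability of 0.8 stands in for the unfamiliar coin, and a chi-square goodness-of-fit statistic is used as one possible way to judge whether the observed frequencies are "quite dissimilar" from the fair-coin model. The text does not prescribe a particular comparison method, and with expected counts as small as 3.125 the chi-square approximation is rough.

```python
import random
from math import comb


def expected_counts(n_tosses, n_reps):
    """Counts predicted by the fair-coin model for each number of heads."""
    return [n_reps * comb(n_tosses, k) / 2**n_tosses for k in range(n_tosses + 1)]


def observed_counts(p_heads, n_tosses, n_reps, rng):
    """Repeat 'toss n_tosses times, record the number of heads' n_reps times."""
    counts = [0] * (n_tosses + 1)
    for _ in range(n_reps):
        heads = sum(rng.random() < p_heads for _ in range(n_tosses))
        counts[heads] += 1
    return counts


def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all outcomes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


rng = random.Random(1)
expected = expected_counts(5, 100)
observed = observed_counts(0.8, 5, 100, rng)  # hypothetical biased coin
statistic = chi_square(observed, expected)

CRITICAL_5PCT_DF5 = 11.07  # chi-square 5% critical value, 5 degrees of freedom
print("observed:", observed)
print("expected:", expected)
print("looks unfair" if statistic > CRITICAL_5PCT_DF5 else "consistent with fair")
```

Rerunning `observed_counts` with `p_heads=0.5` shows the other side of the argument: a fair coin also produces frequencies that differ from the model, but usually only by amounts that random variation can explain.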

 

Probability and Chance Variability

Two important uses of "randomization" in statistical work occur in sampling and experimental design. When sampling we "select at random", and in experiments we "randomly assign individuals to different treatments". Randomization does much more than remove bias in selections and assignments. Randomization leads to chance variability in outcomes that can be described with probability models.

 

The probability of an event tells us approximately what percentage of the time the event is expected to happen when the basic process is repeated over and over again.

 

Probability theory does not say very much about one toss of the coin; it makes predictions about the long-run behavior of the coin tosses.
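The long-run interpretation can be illustrated by simulating a fair coin and tracking the running proportion of heads. This is a sketch under assumptions: the coin is simulated with Python's pseudo-random generator, and the seed and checkpoints are arbitrary choices.

```python
import random

rng = random.Random(42)  # arbitrary seed for reproducibility
heads = 0
tosses = 0
running = {}  # checkpoint -> proportion of heads observed so far

for checkpoint in (10, 100, 1_000, 10_000, 100_000):
    while tosses < checkpoint:
        heads += rng.random() < 0.5  # True counts as 1, False as 0
        tosses += 1
    running[checkpoint] = heads / tosses

for n, proportion in running.items():
    print(f"after {n:>7} tosses: proportion of heads = {proportion:.4f}")
```

After only 10 tosses the proportion can be far from 0.5, while after 100,000 tosses it is very close to 0.5, which is the behavior the probability model predicts.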

 

Probability tells us little about the consequences of random selection for one sample but describes the variation we expect to see in samples when the sampling process is repeated a large number of times.

 

Probability tells us little about the consequences of random assignment for one experiment but describes the variation we expect to see in the results when the experiment is replicated a large number of times.

 

When randomness is present, the statistician wants to know if the observed result is due to chance, or something else.  This is the idea of statistical significance.

 

The Role of Mathematics in Statistics Education

 

The evidence that statistics is different from mathematics is not presented to argue that mathematics is not important to statistics education or that statistics education should not be a part of mathematics education. To the contrary, statistics education becomes increasingly mathematical as the level of understanding goes up.

 

But data collection design, exploration of data, and the interpretation of results should be emphasized in statistics education for statistical literacy. These are heavily dependent on context, but at the introductory level involve limited formal mathematics.

 

Probability plays an important role in statistical analysis, but formal mathematical probability should have its own place in the curriculum.  Pre-college statistics education should emphasize the ways that probability is used in statistical thinking; an intuitive grasp of probability will suffice at these levels.

 

The Framework

 

Underlying Principles

 

Statistical Problem Solving    

 

Statistical problem solving is an investigative process that involves four components:

 

Formulate Questions

•  clarify the problem at hand

•  formulate one (or more) questions that can be answered with data

 

Collect Data

•  design a plan to collect appropriate data

•  employ the plan to collect the data

 

Analyze Data

•  select appropriate graphical or numerical methods

•  use these methods to analyze the data

 

Interpret Results

•  interpret the analysis

•  relate the interpretation to the original question.

 

The Role of Variability in the Problem Solving Process

 

Formulate Question

Anticipating Variability - Making the statistics question distinction

 

The formulation of a statistics question requires an understanding of the difference between a question that anticipates a deterministic answer and a question that anticipates an answer based on data that vary. 

 

The question "How tall am I?" will be answered with a single height. It is not a statistics question. The question "How tall are adult men in the USA?" would not be a statistics question if all these men were exactly the same height! The fact that there are differing heights, however, implies that we anticipate an answer based on measurements of height that vary. This is a statistics question.

 

The poser of the question "How does sunlight affect the growth of a plant?" should anticipate that the growth of two plants of the same type exposed to the same sunlight will likely differ. This is a statistics question.

 

The anticipation of variability is the basis for understanding the statistics question distinction; both are required for proper question formulation.

 

Collect Data

Acknowledging Variability - Designing for differences

 

Data collection designs must acknowledge variability in data and frequently are intended to reduce variability. Random sampling is intended to reduce the differences between sample and population, and the sample size influences the effect of sampling variability (error). Experimental designs are chosen to acknowledge the differences between groups subjected to different treatments. Random assignment to the groups is intended to reduce differences between the groups due to factors that are not manipulated in the experiment.  Some experimental designs pair subjects so that they are similar. Twins are frequently paired in medical experiments so that observed differences might be more likely attributed to the difference in treatments rather than differences in the subjects.
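Random assignment can be sketched in a few lines. The example below is hypothetical throughout: 200 simulated subjects with made-up ages are shuffled and split into two equal groups, and age plays the role of a factor that is not manipulated in the experiment.

```python
import random
from statistics import mean

rng = random.Random(11)  # arbitrary seed for reproducibility

# Hypothetical subjects, each with an age that might influence the response.
ages = [rng.randint(20, 80) for _ in range(200)]

# Random assignment: shuffle the subject indices, then split them in half.
subjects = list(range(len(ages)))
rng.shuffle(subjects)
treatment, control = subjects[:100], subjects[100:]

mean_treatment = mean(ages[i] for i in treatment)
mean_control = mean(ages[i] for i in control)
print(f"mean age, treatment group: {mean_treatment:.1f}")
print(f"mean age, control group:   {mean_control:.1f}")
```

Because chance alone decides who lands in each group, the groups tend to have similar age profiles, so a later difference in outcomes is less likely to be explained by age.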

 

The understanding of data collection designs that acknowledge differences is required for effective collection of data.

 

Analyze Data

Accounting for Variability - Using distributions

 

The main purpose of statistical analysis is to give an accounting of the variability in the data. When results of an election poll state that "42% of those polled support a particular candidate with margin of error +/- 3% at the 95% confidence level", the focus is on sampling variability. The poll gives an estimate of the support among all voters. The margin of error (+/- 3%) indicates how far the sample result (42%) might differ from the actual percentage of all voters who support the candidate. The confidence level tells us how often the method employed will produce estimates that fall within the margin of error of the actual percentage. This analysis is based on the distribution of estimates from repeated random sampling.
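The poll statement can be connected to a calculation. The sketch below uses the usual large-sample formula for the margin of error of a proportion, z * sqrt(p(1 - p) / n) with z = 1.96 for 95% confidence. The sample size of 1,040 is an assumption chosen because it gives roughly a 3% margin; the text does not state how many voters were polled. The repeated-sampling loop then estimates how often the interval captures a hypothetical true proportion of 0.42.

```python
import random
from math import sqrt

Z_95 = 1.96  # standard normal multiplier for 95% confidence


def margin_of_error(p_hat, n, z=Z_95):
    """Large-sample margin of error for a sample proportion."""
    return z * sqrt(p_hat * (1 - p_hat) / n)


N = 1040  # assumed sample size, not given in the text
moe = margin_of_error(0.42, N)
print(f"margin of error: +/- {100 * moe:.1f} percentage points")


def coverage(true_p, n, n_polls, rng):
    """Fraction of simulated polls whose interval contains true_p."""
    hits = 0
    for _ in range(n_polls):
        p_hat = sum(rng.random() < true_p for _ in range(n)) / n
        if abs(p_hat - true_p) <= margin_of_error(p_hat, n):
            hits += 1
    return hits / n_polls


rate = coverage(0.42, N, 1000, random.Random(7))
print(f"intervals containing the true value: {100 * rate:.1f}%")
```

The coverage rate from the simulation should be near 95%, which is what the confidence level describes: a property of the method over many repeated samples, not of any single poll.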

 

When test scores are described as "normally distributed with mean 450 and standard deviation 100" the focus is on how the scores differ from the mean. The normal distribution describes a bell-shaped pattern of scores and the standard deviation indicates the level of variation of the scores from the mean.
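The description "normally distributed with mean 450 and standard deviation 100" determines what share of scores falls in any range. A minimal sketch using Python's standard-library `statistics.NormalDist` follows; the particular cutoffs (350, 550, 650) are illustrative choices.

```python
from statistics import NormalDist

scores = NormalDist(mu=450, sigma=100)

# Share of scores within one standard deviation of the mean (350 to 550).
within_one_sd = scores.cdf(550) - scores.cdf(350)

# Share of scores more than two standard deviations above the mean.
above_650 = 1 - scores.cdf(650)

print(f"between 350 and 550: {100 * within_one_sd:.1f}%")
print(f"above 650:           {100 * above_650:.1f}%")
```

About 68% of scores lie within one standard deviation of the mean and only about 2% lie more than two standard deviations above it, which is how the standard deviation conveys the level of variation around the mean.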

 

Accounting for variability with the use of distributions is the key idea in the analysis of data.

 

Interpret Results

Allowing for Variability - Looking beyond the data

 

Statistical interpretations are made in the presence of variability and must allow for it.

The result of an election poll must be interpreted as an estimate that can vary from sample to sample. The generalization of the poll results to the entire population of voters looks beyond the sample of voters surveyed and must allow for the possibility of variability of results among different samples. The results of a randomized comparative medical experiment must be interpreted in the presence of variability due to the fact that different individuals respond differently to the same treatment as well as the variability due to randomization. The generalization of the results looks beyond the data collected from the subjects who participated in the experiment and must allow for these sources of variability.

 

Looking beyond the data to make generalizations must allow for variability in the data.

 

Maturing over Levels

 

The mature statistician understands the role of variability in the statistical problem solving process. At the point of question formulation, the statistician anticipates the data collection, the nature of the analysis, and the possible interpretations, all of which must consider possible sources of variability. In the end, the mature practitioner reflects upon all aspects of data collection and analysis as well as the question itself when interpreting results. Likewise, the mature practitioner links data collection and analysis to each other and to the other two components.

 

The beginning student cannot be expected to make all of these linkages. They require years of experience as well as training. Statistical education should be viewed as a developmental process.  To meet the proposed goals, this report will provide a framework for statistical education over three levels. If the goal were to produce a mature practicing statistician, there would certainly be several levels beyond these. There is no attempt to tie these levels to specific grade levels.

 

The Framework uses three developmental Levels, A, B, and C. Although these three levels may parallel grade levels, they are based on development, not age. Thus, a middle school student who has had no prior experience with statistics will need to begin with Level A concepts and activities before moving to Level B. This holds true for a secondary student as well. If a student has not had Level A and B experiences prior to high school, then it is not appropriate to jump into Level C expectations. The learning is more teacher-driven at Level A, but becomes student-driven at Levels B and C.

 

The Framework Model

 

The conceptual structure for statistics education is provided in the two-dimensional model shown in Figure 1. One dimension is defined by the problem-solving process components plus the nature of the variability considered and how we focus on variability. The second dimension comprises the three developmental levels.

 

Each of the first four rows describes a process component as it develops across levels. The fifth row indicates the nature of the variability considered at a given level. It is understood that work at Level B assumes and develops further the concepts from Level A, and likewise Level C assumes and uses concepts from the lower levels.

 

Reading down a column will describe a complete problem investigation for a particular level along with the nature of the variability considered.

 

 

Figure 1: The Framework

 

Process Component: Formulate Question

Level A: Beginning awareness of the statistics question distinction. Teachers pose questions of interest. Questions restricted to classroom.

Level B: Increased awareness of the statistics question distinction. Students begin to pose their own questions of interest. Questions not restricted to classroom.

Level C: Students can make the statistics question distinction. Students pose their own questions of interest. Questions seek generalization.

Process Component: Collect Data

Level A: Do not yet design for differences. Census of classroom. Simple experiment.

Level B: Beginning awareness of design for differences. Sample surveys; begin to use random selection. Comparative experiment; begin to use random allocation.

Level C: Students make designs for differences. Sampling designs with random selection. Experimental designs with randomization.

 


 

Process Component: Analyze Data

Level A: Use particular properties of distributions in context of specific example. Display variability within a group. Compare individual to individual. Compare individual to group.

Level B: Learn to use particular properties of distributions as tools of analysis. Quantify variability within a group. Compare group to group in displays. Acknowledge sampling error. Some quantification of association; simple models for association.

Level C: Understand and use distributions in analysis as a global concept. Measure variability within a group; measure variability between groups. Compare group to group using displays and measures of variability. Describe and quantify sampling error. Quantification of association; fitting of models for association.

 

 


 

Process Component: Interpret Results

Level A: Do not look beyond the data. No generalization beyond the classroom. Note difference between two individuals with different conditions. Observe association in displays.

Level B: Acknowledge that looking beyond the data is feasible. Acknowledge that a sample may or may not be representative of larger population. Note difference between two groups with different conditions. Aware of distinction between observational study and experiment. Note differences in strength of association. Basic interpretation of models for association. Aware of the distinction between "association" and "cause and effect".

Level C: Are able to look beyond the data in some contexts. Generalize from sample to population. Aware of the effect of randomization on the results of experiments. Understand the difference between observational studies and experiments. Interpret measures of strength of association. Interpret models for association. Distinguishes between conclusions from association studies and experiments.

 

 


 

Nature of Variability

Level A: Measurement variability. Natural variability. Induced variability.

Level B: Sampling variability.

Level C: Chance variability.

Focus on Variability

Level A: Variability within a group.

Level B: Variability within a group and variability between groups. Co-variability.

Level C: Variability in model fitting.

 

 

Illustrations

All four steps of the problem solving process are used at all three levels, but the depth of understanding and the sophistication of the methods used increase across Levels A, B, and C. This maturation in understanding the problem solving process and its underlying concepts is paralleled by an increasing complexity in the role of variability. The illustrations of learning activities given here are intended to clarify the differences across the developmental levels for each component of the problem solving process. A later section in this report will give illustrations of the complete problem solving process for learning activities at each level.

 

Formulate Question

 

Example 1

 

A: How long are the words on this page?

 

B: Are the words in a chapter of a fifth grade book longer than the words in a chapter of a third grade book?

 

C: Do fifth grade books use longer words than third grade books?

 

Example 2

 

A:  What type of music is most popular among students in our class?

 

B:  How do the favorite types of music compare among different classes?

 

C:  What type of music is most popular among students in our school?

Example 3

 

A: In our class, are the heights and arm spans of students approximately the same?

 

B: Is the relationship between arm span and height for the students in our class the same as the relationship between arm span and height for the students in another class?

 

C: Is height a useful predictor of arm span for the students in our school?

 

Example 4

 

A: Will a plant placed by the window grow taller than a plant placed away from the window?

 

B: Will five plants placed by the window grow taller than five plants placed away from the window?

 

C: How does the level of sunlight affect the growth of a plant?

 

Collect Data

 

Example 1

 

A: How long are the words on this page?

 

The length of every word on the page is determined and recorded.

 

B: Are the words in a chapter of a fifth grade book longer than the words in a chapter of a third grade book?

 

A simple random sample of words from each chapter is used.

 

C: Do fifth grade books use longer words than third grade books?

 

Other sampling designs are considered and compared, and some are used. For example, rather than selecting words in a simple random sample, a simple random sample of pages from the book is selected, and all of the words on the chosen pages are used for the sample.
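
The two sampling designs described for Levels B and C can be contrasted in a short sketch. This is only an illustration under stated assumptions: the book text, the function names, the sample sizes, and the seed are all hypothetical and do not come from the Framework.

```python
import random

def simple_random_sample_of_words(pages, n, rng):
    """Pool every word in the book, then draw n words without replacement."""
    all_words = [word for page in pages for word in page]
    return rng.sample(all_words, n)

def sample_of_whole_pages(pages, n_pages, rng):
    """Draw n_pages pages at random and keep every word on the chosen pages."""
    chosen_pages = rng.sample(pages, n_pages)
    return [word for page in chosen_pages for word in page]

def mean_word_length(words):
    """Average number of characters per word."""
    return sum(len(word) for word in words) / len(words)

# Hypothetical book: each page is a list of words (made-up text).
book = [
    "the cat sat on the mat".split(),
    "a curious student measured every word".split(),
    "statistics helps us look beyond the data".split(),
    "random selection gives each word a fair chance".split(),
]

rng = random.Random(2005)  # fixed seed so the sketch is repeatable
word_sample = simple_random_sample_of_words(book, 10, rng)
page_sample = sample_of_whole_pages(book, 2, rng)

print("mean length, random sample of words:", mean_word_length(word_sample))
print("mean length, random sample of pages:", mean_word_length(page_sample))
```

Sampling whole pages is usually easier to carry out by hand, but words on the same page may resemble one another, which is one reason students might compare the designs before choosing one.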

 

Note- At each level, issues of measurement should be addressed. The length of word depends on the definition of "word". For instance, is a number a word? Consistency of definition is important to reduce measurement variability.

 

Example 2

 

A: Will a plant placed by the window grow taller than a plant placed away from the window?

 

A seedling is planted in a pot that is placed on the window sill. A second seedling of the same type and size is planted in a pot that is placed away from the window sill. After six weeks the change in height for each is measured and recorded.

 

B: Will five plants of a particular type placed by the window grow taller than five plants of the same type placed away from the window?

 

Five seedlings of the same type and size are planted in a pan which is placed on the window sill. Five seedlings of the same type and size are planted in a pan which is placed away from the window sill.  Random numbers are used to decide which plants go in the window.  After six weeks the change in height for each seedling is measured and recorded.

 

C: How does the level of sunlight affect the growth of plants?

 

Fifteen seedlings of the same type and size are selected. Three pans are used, with five of these seedlings  planted in each. Fifteen seedlings of another variety are selected to determine if the effect of sunlight is the same on different types of plants. Five of these are planted in each of the three pans. The three pans are placed in locations with three different levels of light. Random numbers are used to decide which plants go in which pan.  After six weeks the change in height for each seedling is measured and recorded.
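
The phrase "random numbers are used to decide which plants go in which pan" can be made concrete with a small sketch. It assumes equal group sizes, as in the Level B and Level C designs above; the seedling labels, light-level names, function name, and seed are hypothetical.

```python
import random

def randomly_allocate(units, group_names, rng):
    """Shuffle the units, then deal them out into equal-sized groups."""
    if len(units) % len(group_names) != 0:
        raise ValueError("units must divide evenly among the groups")
    shuffled = list(units)
    rng.shuffle(shuffled)
    size = len(shuffled) // len(group_names)
    return {
        name: shuffled[i * size:(i + 1) * size]
        for i, name in enumerate(group_names)
    }

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Level B: ten seedlings, five by the window and five away from it.
seedlings = [f"seedling_{i}" for i in range(1, 11)]
level_b = randomly_allocate(seedlings, ["window", "away"], rng)

# Level C: fifteen seedlings of each of two varieties, five per pan,
# with the three pans placed at three (hypothetically named) light levels.
pans = ["low_light", "medium_light", "high_light"]
level_c = {
    variety: randomly_allocate(
        [f"{variety}_{i}" for i in range(1, 16)], pans, rng
    )
    for variety in ["variety_1", "variety_2"]
}

print(level_b)
print(level_c)
```

Allocating each variety separately keeps five seedlings of each variety in every pan, which matches the Level C description.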

 

Note- At each level, issues of measurement should be addressed. The method of measuring change in height must be clearly understood and applied in order to reduce measurement variability.

 

Analyze Data

 

Example 1

 

A:  What type of music is most popular among students in our class?

 

A bar graph is used to display the number of students who choose each music category.

 

B:  How do the favorite types of music compare among different classes?

 

For each class, a bar graph is used to display the percentage of students who choose each music category. The same scales are used for both graphs so that they can easily be compared.

 

C:  What type of music is most popular among students in our school?

 

A bar graph is used to display the percentage of students who choose each music category. Because a random sample is used, an estimate of the margin of error is given.
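
The Framework does not say here how the margin of error is estimated. One common choice, assumed in the sketch below, is the approximate 95% margin for a proportion from a simple random sample, 1.96 times the square root of p(1 - p)/n. The music categories and counts are hypothetical.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    Assumes a simple random sample and the normal approximation;
    z=1.96 corresponds to roughly 95% confidence.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# Hypothetical survey of a random sample of 100 students.
counts = {"rock": 40, "rap": 35, "country": 25}
n = sum(counts.values())

for category, count in counts.items():
    p_hat = count / n
    moe = margin_of_error(p_hat, n)
    print(f"{category}: {p_hat:.0%} plus or minus {moe:.1%}")
```

The margin shrinks as the sample size grows, which gives students a way to connect sample size with the precision of the estimate.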

 

Note- At each level, issues of measurement should be addressed. A questionnaire will be used to gather students' music preferences. The design and wording of the questionnaire must be carefully considered to avoid possible biases in the responses. The choice of music categories could also affect results.

 

Example 2

 

A: In our class, are the heights and arm spans of students approximately the same?

 

The difference between height and arm span is determined for each individual.

An X-Y plot is constructed with X=height, Y=arm span. The line Y=X is drawn on this graph.

 

B: Is the relationship between arm span and height for the students in our class the same as the relationship between arm span and height for the students in another class?

 

For each class, an X-Y plot is constructed with X=height, Y=arm span. An "eyeball" line is drawn on each graph to describe the relationship between height and arm span. The equation of this line is determined. An elementary measure of association is determined.

 

C: Is height a useful predictor of arm span for the students in our school?

 

The least squares regression line is determined and assessed for use as a prediction model.
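
As a minimal sketch of this Level C analysis, the least squares line can be computed directly from the usual formulas: the slope is the sum of cross-deviations divided by the sum of squared deviations in X, and the intercept makes the line pass through the point of means. The measurements and function names below are hypothetical, and using R-squared to assess the line is one possible choice, not one prescribed by the Framework.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(xs, ys, slope, intercept):
    """Fraction of the variability in y accounted for by the line."""
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - ss_resid / ss_total

# Hypothetical measurements in centimeters for a sample of students.
heights = [150, 155, 160, 165, 170, 175, 180]
arm_spans = [148, 156, 159, 167, 169, 177, 181]

slope, intercept = least_squares_line(heights, arm_spans)
fit = r_squared(heights, arm_spans, slope, intercept)

print(f"predicted arm span = {intercept:.1f} + {slope:.2f} * height")
print(f"R-squared = {fit:.3f}")
```

Here the slope estimates how much arm span is expected to differ between two students whose heights differ by one centimeter.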

 

Note- At each level, issues of measurement should be addressed. The methods used to measure height and arm span must be clearly understood and applied in order to reduce measurement variability. For instance, do we measure height with shoes on or off?

 

Interpret Results

 

Example 1

 

A: How long are the words on this page?

 

The frequency plot of all word lengths is examined and summarized. In particular, students will note the longest and shortest word lengths, the most common lengths and least common lengths, and the length in the middle.

 

B: Are the words in a chapter of a fifth grade book longer than the words in a chapter of a third grade book?

 

The students interpret a comparison of the distribution of a sample of word lengths from the fifth grade book with the distribution of word lengths from the third grade book using a boxplot to represent each of these. The students also acknowledge that samples are being used which may or may not be representative of the complete chapters.

 

 

The boxplot for a sample of word lengths from the fifth grade book is placed beside the boxplot of the sample from the third grade book.

 

C: Do fifth grade books use longer words than third grade books?

 

The interpretation at Level C includes the interpretation at Level B, but also must consider generalizing from the books included in the study to a greater population of books.

 

Example 2

 

A: Will a plant placed by the window grow taller than a plant placed away from the window?

 

In this simple experiment, the interpretation is just a matter of comparing one measurement of change in size to another.

 

B: Will five plants placed by the window grow taller than five plants placed away from the window?

 

In this experiment, the student must interpret a comparison of one group of five measurements with another group.

 

If a difference is noted, then the student acknowledges that it is likely caused by the differences in light conditions.

 

C: How does the level of sunlight affect the growth of a plant?

 

There are several comparisons of groups possible with this design. If a difference is noted, then the student acknowledges that it is likely caused by the differences in light conditions or the differences in types of plants. It is also acknowledged that the randomization used in the experiment can possibly cause some of the observed differences.

 

Nature of Variability

 

Variability Within a Group

 

This is the only type considered at Level A. In Example 1, differences among word lengths on a single page are considered; this is variability within a group of word lengths. In Example 2, differences among how many students choose each category of music are considered; this is variability within a group of frequencies.

 

Variability Within a Group and Variability Between Groups

 

At Level B, students begin to make comparisons of groups of measurements. In Example 1, a group of word lengths from a fifth grade book is compared to a group from a third grade book. Such a comparison not only notes differences between the two groups, such as the difference between median or mean word lengths, but must also take into consideration how much word lengths differ within each group.

 

Induced Variability

 

In Example 4, Level B, the experiment is designed to determine if there will  be a difference between the growth of plants in sunlight and the growth of those away from sunlight. We want to determine if an imposed difference on the environments will induce a difference in growth.

 

Sampling Variability

 

In Example 1, Level B, samples of words from a chapter are used. Students observe that two different samples will produce different groups of word lengths. This is sampling variability.
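
A short simulation can let students see this for themselves. The sketch below is an illustration only: the list of word lengths standing in for a chapter, the sample size of 10, the number of repetitions, and the seed are all hypothetical.

```python
import random

def sample_mean_length(word_lengths, n, rng):
    """Draw n word lengths without replacement and return their mean."""
    sample = rng.sample(word_lengths, n)
    return sum(sample) / n

# Hypothetical chapter of 100 words, recorded as word lengths.
chapter_word_lengths = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4,
                        5, 5, 5, 6, 6, 7, 8, 9, 10, 12] * 5

rng = random.Random(7)  # fixed seed so the sketch is repeatable
means = [sample_mean_length(chapter_word_lengths, 10, rng) for _ in range(200)]

print("first two sample means:", means[0], means[1])
print("smallest and largest sample mean:", min(means), max(means))
```

Each repetition uses the same chapter and the same sample size, so the spread in the sample means comes only from which words happened to be selected.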

 

Co-variability

 

Example 3, Level B or C, investigates the "statistical" relationship between height and arm span. The nature of this statistical relationship is described in terms of how the two variables "co-vary". For instance, if the heights of two students differ by 2 centimeters then we would like for our model of the relationship to tell us by how much we might expect their arm spans to differ.


Random Variability from Sampling

 

When random selection is used, differences between samples will be random. Understanding this random variation is what leads to the predictability of results. In Example 2, Level C, this random variation is not only considered but is also the basis for understanding the concept of margin of error.

 

Random Variability Resulting from Assignment to Groups in Experiments

 

In Example 4, Level C, plants are randomly assigned to groups. Students cons