Mathematical Sciences Education Board

National Research Council

"You can't fatten a hog by weighing it." So said a farmer to a governor at a public hearing in order to explain in plain language the dilemma of educational assessment. To be useful to society, assessment must advance education, not merely record its status.

Assessment is a way of measuring what students know and of expressing what students should learn. As the role of mathematics in society has changed, so mathematics education is changing, based on new national standards for curriculum and instruction. Mathematics assessment must also change to ensure consistency with the goals of education.

Three fundamental educational principles form the foundation of all assessment that supports effective education:

**THE CONTENT PRINCIPLE**

Assessment should reflect the mathematics that is most important for students to learn.

**THE LEARNING PRINCIPLE**

Assessment should enhance mathematics learning and support good instructional practice.

**THE EQUITY PRINCIPLE**

Assessment should support every student's opportunity to learn important mathematics.

Despite their benign appearance, these principles contain the seeds of revolution. Few assessments given to students in America today reflect any of these vital principles. For educational reform to succeed, the yardsticks of progress must be rooted in the principles of content, learning, and equity.

The pressures to change mathematics education reflect society's disappointment with the lack of interest and accomplishment of so many students in today's schools. In the background of public debate is the steady criticism that school mathematics is out of step with today's world and is neither well taught nor well learned.

Unfortunately, these pressures often suggest inconsistent courses of action, with standards-based curriculum and instruction moving in one direction while mandated tests remain aimed in another direction, at an older, more traditional target. Too often, teachers are caught in the middle. To be effective, mathematics education must be rooted in the practice of mathematics, in the art of teaching, and in the needs of society. These pivotal forces drive the current movement to improve mathematics education:

- A more comprehensive view of mathematics and its role in society: mathematics is no longer just a prerequisite subject for prospective scientists and engineers but is a fundamental aspect of literacy for the twenty-first century.

- A recommitment to the traditional wisdom that mathematics must be made meaningful to students if it is to be learned, retained, and used.

- The growing recognition that in this technological era, all students should learn more and better mathematics.

Assessment is the guidance system of education just as standards are the guidance system of reform. It helps teachers and parents determine what students know and what they need to learn. Assessment can play a powerful role in conveying clearly and directly how well students are learning and how well school systems are responding to the national call for higher educational standards.

At its root, assessment is a communication process that tells students, teachers, parents, and policymakers some things, but not everything, about what students have learned. Assessment provides information that can be used to award grades, to evaluate a curriculum, or to decide whether to review fractions. Internal assessment communicates to teachers critical aspects of their students' performance, helping them to adjust their instructional techniques accordingly. External assessment provides information about mathematics programs to parents, state and local education agencies, funding bodies, and policymakers.

Many reformers see assessment as much more than an educational report card. Assessment can be the engine that propels reform forward, but only if we make education rather than measurement the driving force in the development of new assessments. By setting a public and highly visible target to which all can aspire, assessment can inform students, parents, and teachers about the real performance-based meaning of curriculum guidelines. Assessments not only measure what students know but provide concrete illustrations of the important goals to which students and teachers can aspire.

Improved assessment is required to complement and support the changes under way in mathematics education: both in the kinds of mathematics that are taught and in the ways in which they are taught. As such, assessment is an integral part of an interlocking triad of reforms along with curriculum and professional development of teachers. Because assessment is key to determining what students learn and how teachers teach, it must be reshaped in a manner consistent with the new vision of teaching and learning.

Students learn important mathematics when they use mathematics in relevant contexts in ways that require them to apply what they know and extend their thinking. Students think when they are learning and they learn when they are thinking. Good teachers have long recognized that mathematics comes alive for students when it is learned through experiences they find meaningful and valuable. Students learn best and most enduringly by engaging mathematics actively, by reflecting on their experience, and by communicating with others about it. Students want to make sense of the world, and mathematics is a wonderful tool to use in this eternal quest.

Because teamwork is important on the job and in the home, mathematics students learn important lessons when they work in teams, combining their knowledge and discovering new ways of solving problems. Often there is no single right answer, only several possibilities that unfold into new questions. Students need opportunities to advance hypotheses, to construct mathematical models, and to test their inferences by using the mathematics of estimation and uncertainty alongside more traditional techniques of school mathematics. Hand-held graphing calculators allow, for the first time, thorough exploration of complex, real-life problems. Computational impediments need no longer block the development of problem-solving or mathematical modeling skills.

This new vision of learning and teaching is now being tried in some classrooms across the country. Current assessment does not support this vision and often works against it. For decades, educational assessment in the United States has been driven largely by practical and technical concerns rather than by educational priorities. Testing as we know it today arose because very efficient methods were found for assessing large numbers of people at low cost. A premium was placed on assessments that were easily administered and that made frugal use of resources. The constraints of efficiency meant that mathematics assessment tasks could not tap a student's ability to estimate the answer to an arithmetic calculation, construct a geometric figure, use a calculator or ruler, or produce a complex deductive argument.

A narrow focus on technical criteria, primarily reliability, also worked against good assessment. For too long, reliability meant that examinations composed of a small number of complex problems were devalued in favor of tests made up of many short items. Students were asked to perform large numbers of smaller tasks, each eliciting information on one facet of their understanding, rather than to engage in complex problem solving or modeling, the mathematics that is most important.

In the absence of expressly articulated educational principles to guide assessment, technical and practical criteria have become de facto ruling principles. The content, learning, and equity principles are proposed not to challenge the importance of these criteria, but to challenge their dominance and to strike a better balance between educational and measurement concerns. An increased emphasis on validity (with its attention to fidelity between assessments, high-quality curriculum and instruction, and consequences) is the tool by which the necessary balance can be achieved.

In some ways, test developers do acknowledge the importance of curricular and educational issues. However, their concern is usually about coverage, so they design tests by following check-off lists of mathematical topics (e.g., fractions, single-digit multiplication). This way of determining test content matched fairly well the old vision of mathematics instruction. In this view you could look at little pieces of learning, add them up, and get the big picture of how well someone knew mathematics.

Today we recognize that students must learn to reason, create models, prove theorems, and argue points of view. Assessments must reflect this recognition by adhering to the three principles of content, learning, and equity. You cannot get at this kind of deep understanding and use of mathematics by examining little pieces of learning. Assessments that are appropriately rich in breadth and depth provide opportunities for students to demonstrate their deep mathematical understanding. Mathematics education and mathematics assessment must be guided by a common vision.

Any assessment of mathematics learning should first and foremost be anchored in important mathematics. Assessment should do much more than test discrete procedural skills so typical of today's topic-by-process frameworks for formal assessments. Many current assessments distort mathematical reality by presenting mathematics as a set of isolated, disconnected fragments, facts, and procedures. The goal ought to be assessment tasks that elicit student work on the meaning, process, and uses of mathematics.

Important mathematics must shape and define the content of assessment. Appropriate tasks emphasize connections within mathematics, embed mathematics in relevant external contexts, require students to communicate clearly their mathematical thinking, and promote facility in solving nonroutine problems. Considerations of connections, communication, and nonroutine problems raise many thorny issues that testmakers and teachers are only beginning to explore. However, these considerations are essential if students are to meet the new expectations of mathematics education standards.

The content principle has profound implications for those who design, score, and use mathematics assessments. Many of the assessments used today, such as standardized multiple-choice tests, have reinforced the view that the mathematics curriculum should be constructed from lists of narrow, isolated skills that can be easily disassembled for appraisal. The new vision of school mathematics requires a curriculum and matching assessment that is both broader and more integrated.

The mathematics in an assessment must never be distorted or trivialized for the convenience of assessment. Assessment should emphasize problem solving, thinking, and reasoning. In assessment as in curriculum activities, students should build models that connect mathematics to complex, real-world situations and regularly formulate problems on their own, not just solve those structured by others. Rather than forcing mathematics to fit assessment, assessment must be tailored to the mathematics that is important to learn.

Implications of the content principle extend as well to the scoring and reporting of assessments. New assessments will require new kinds of scoring guides and ways of reporting student performance that more accurately reflect the richness and diversity of mathematical learning than do the typical single-number scores of today.

To be effective as part of the educational process, assessment should be seen as an integral part of learning and teaching rather than as the culmination of the process. Time spent on assessment will then contribute to the goal of improving the mathematics learning of all students.

If assessment is going to support learning, then assessment tasks must provide genuine opportunities for all students to learn significant mathematics. Too often a sharp line has been drawn between assessment and instruction. Teachers teach, then instruction stops and assessment occurs. In the past, for example, students' learning was often viewed as a passive process whereby students remember what teachers tell them to remember. Consistent with this view, assessment has often been thought of as the end of learning. The student is assessed on material learned previously to see if he or she remembers it. Earlier conceptions of the mathematics curriculum as a collection of fragmented knowledge led to assessment that reinforced the use of memorization as a principal learning strategy.

Today we recognize that students make their own mathematics learning individually meaningful. Learning is a process of continually restructuring prior knowledge, not just adding to it. Good education provides opportunities for students to connect what is being learned to prior knowledge. Students know mathematics if they have developed the structures and meanings of the content for themselves.

If assessment is going to support good instructional practice, then assessment and instruction must be better integrated than is commonly the case today. Assessment must enable students to construct new knowledge from what they know. The best way to provide opportunities for the construction of mathematical knowledge is through assessment tasks that resemble learning tasks in that they promote strategies such as analyzing data, drawing contrasts, and making connections. This can be done, for example, by basing assessment on a portfolio of work that the student has done as part of the regular instructional program, by integrating the use of scoring guides into instruction so that students will begin to internalize the standards against which the work will be evaluated, or by using two-stage testing in which students have an extended opportunity to revise their initial responses to an assessment task.

Not only should all students learn some mathematics from assessment tasks, but the results should yield information that can be used to improve students' access to subsequent mathematical knowledge. The results must be timely and clearly communicated to students, teachers, and parents. School time is precious. When students are not informed of their errors and misconceptions, let alone helped to correct them, the assessment may both reinforce misunderstandings and waste valuable instructional time.

When the line between assessment and instruction is blurred, students can engage in mathematical tasks that not only are meaningful and contribute to learning, but also yield information the student, the teacher, and perhaps others can use. In fact, an oft-stated goal of reform efforts in mathematics education is that visitors to classrooms will be unable to distinguish instructional activities from assessment activities.

The idea that some students can learn mathematics and others cannot must end; mathematics is not reserved for the talented few, but is required of all to live and work in the twenty-first century. Assessment should be used to determine what students have learned and what they still need to learn to use mathematics well. It should not be used to filter students out of educational opportunity.

Designing assessments to enhance equity will require conscientious rethinking not just of what we assess and how we do it but also of how different individuals and groups are affected by assessment design and procedures. The challenge posed by the equity principle is to devise tasks with sufficient flexibility to give students a sense of accomplishment, to challenge the upper reaches of every student's mathematical understanding, and to provide a window on each student's mathematical thinking.

Some design strategies are critical to meeting this challenge, particularly permitting students multiple entry and exit points in assessment tasks and allowing students to respond in ways that reflect different levels of mathematics knowledge or sophistication. But there are no guarantees that new assessment will be fairer to every student, that every student will perform better on new assessments, or that differences between ethnic, linguistic, and socioeconomic groups will disappear. While this is the hope of the educational reform community, it seems clear that hope must be balanced by a spirit of empiricism: there is much more to be learned about how changes in assessment will affect longstanding group differences.

Equity implies that every student must have an opportunity to learn the important mathematics that is assessed. Obviously, students who have experience reflecting on the mathematics they are learning, presenting and defending their ideas, or organizing, executing, and reporting on a complex piece of work will have an advantage when called upon to do so in an assessment situation. Especially when assessments are used to make high-stakes decisions on matters such as graduation and promotion, the equity principle requires that students be guaranteed certain basic safeguards. Students cannot be assessed fairly on mathematics content that they have not had an opportunity to learn.

Assessments can contribute to students' opportunities to learn important mathematics only if they are based on standards that reflect high expectations for all students. There can be no equity in assessment as long as excellence is not demanded of all. If we want excellence, the level of expectation must be set high enough so that, with effort and good instruction, every student will learn important mathematics.

We have much to learn about how to maintain uniformly high performance standards while allowing for assessment approaches that are tailored to diverse backgrounds. Uniform application of standards to a diverse set of tasks and responses poses an enormous challenge that we do not yet know how to meet fairly and effectively. Nonetheless, the challenge is surely worth accepting.

The boldness of our vision for mathematics assessment should not blind us to either the obstacles educators will face or the limitations on resources we possess for making it come about. Even if new assessments were to magically appear and be implemented across the nation, many substantial problems would remain. Examples of important, unresolved issues abound:

- Open-ended problems are not necessarily better
than well-defined tasks. The mere labels "performance assessment"
and "open ended" do not guarantee that a task meets
sound educational principles. For example, open-ended problems
can be interesting and engaging but mathematically trivial. Performance
tasks can be realistic and mathematically appropriate but out
of harmony with certain students' cultural backgrounds.

- The equity principle implies that students must
be provided an opportunity to learn the mathematics that is assessed
and that schools must be held to "school delivery standards"
to ensure that students are provided with appropriate preparation,
particularly for any high-stakes assessment. However, many would
argue that past remedies designed to improve schools often failed
precisely because the emphasis was placed on the resources schools
should provide rather than the outcomes that schools should achieve.

- The equity principle also requires some consideration
of consequences for schools of the way assessments are used. Fair
inferences can be drawn and comparisons can be made only when
assessment data include information on the nature of the students
served by the school, students' opportunities to learn the mathematics
assessed, and the adequacy of resources available to the school.
Assessments based only on partial data (typically outcome scores
on basic skills) can seriously mislead the public about how schools
are performing and how to improve them.

- On the job and in the real world, knowledge is
frequently constructed and validated in group settings rather
than through individual exploration. Mathematics is no exception:
learning and performance are frequently improved in group settings.
Hence assessment of learning must reflect the value of group interaction.
The challenge of fairly appraising an individual's contribution
to group efforts is immense, posing unresolved problems both for
industry and education.

- New performance-based assessments introduce significant
challenges both for the mathematical expertise of those who score
assessments and for the guidelines used in scoring. Problem solving
legitimately may involve some false starts or blind alleys; students
whose work includes such things are doing important mathematics
and their grades need to communicate this in an appropriate fashion.
All graders must be alert to the unconventional, unexpected answer
that, in fact, may contain insights that the assessor had not
anticipated. Of course, the greater the chances of unanticipated
responses, the greater the mathematical sophistication needed
by those grading the tasks.

- As assessments become more complex and more connected
to real-world tasks, there is a greater chance that the underlying
assumptions and points of view may not apply equally to all students,
particularly when differences in background and instructional
histories are involved. Despite good intentions and best efforts
to make new assessments fairer to all students than traditional
forms of testing, preliminary research does not confirm the corollary
expectation that group differences in achievement will diminish.
Indeed, recent studies suggest that differences may be magnified
when performance assessment tasks are used.

- Teachers are a fundamental key to assessment
reform. As evaluation of student achievement moves away from short-answer
recall of facts and algorithms, teachers will have to become skilled
in using and interpreting new forms of assessment. As a result,
teachers' professional development, at both the preservice and
inservice levels, will become increasingly important.

- To the extent that communication is a part of
mathematics, differences in communication skill must be seen as
differences in mathematical power. To what extent are differences
in ability to communicate to be considered legitimate differences
in mathematical power?

- Current assessment frameworks, derived as they
were from a measurement-based tradition largely divorced from
mathematics itself, rarely conform to the principles of content,
learning, and equity. Today's mathematics reveals the paramount
importance of interconnections among mathematical topics and of
connections between mathematics and other domains. Much assessment
tradition, however, is based on an atomistic approach that hides
connections both within mathematics and among mathematical and
other domains.

Assessments based on the principles of content, learning, and equity are already being tested in numerous schools and jurisdictions in the United States. It is clear already that despite obstacles and challenges, many benefits accrue even beyond the central goal of improved assessment.

Assessments represent an unparalleled tool for communicating the goals and substance of mathematics education reform to various stakeholders. Assessments make the goals for mathematics learning real to students, teachers, parents, policymakers, and the general public, all of whom need to understand clearly where mathematics reform will take America's children and why they should support the effort. Assessments can be enormously helpful in this re-education campaign, especially if the context and rationale for various tasks are explained in terms that the public can understand.

Improved assessment can lead to improved instruction. Assessment can play a key role in exemplifying the new types of mathematics learning students must achieve. Assessments can indicate to students not only what they should learn but also the criteria that will be used in judging their performance. For example, a classroom discussion of an assessment in which students grade some (perhaps fictional) work provides a purely instructional use of an assessment device. The goal is not to teach answers to questions that are likely to arise, but to engage students in thinking about performance expectations.

Assessment can also be a powerful tool for professional development as teachers work together to understand new expectations and synchronize their expectations and grades. Teachers are rich sources of information about their students. With training on methods of scoring new assessments, teachers can become even better judges of student performance.

Improved assessment is not a panacea for the problems in mathematics education. Our findings neither diminish nor reject important, time-honored measurement criteria for evaluating assessment; nor do they suggest that changes in assessment alone will bring about education reform. Clearly, they will not.

What we can say with assurance is that if old assessments remain in use, new curriculum and teaching methods will have little impact. Moreover, if new assessments are used as inappropriately as some old assessments, little good will come of changes in assessment.

It will take courage and vision to stay the course. As changes in curriculum and assessment begin to infiltrate the many jurisdictions of the U.S. educational system, these changes will at the outset increase the likelihood of mismatches among the key components of education: curriculum, teaching, and assessment. It is not unlikely that performance will decline initially if assessment reform is not tightly aligned with reform in curriculum and teaching.

Mathematics education is entering a period of transition in which there will be considerable exploration. Inevitably there will be both successes and failures. No one can determine in advance the full shape of the emerging assessments. Mathematics education is in this respect an experimental science, in which careful observers learn as much from failure, and from the unexpected, as from anticipated success. The necessary change will be neither swift nor straightforward. Nevertheless, we cannot afford to wait until all questions are resolved. It is time to put educational principles at the forefront of mathematics assessment.