Students Placed At Risk in Elementary and Middle Schools
Robert E. Slavin
This is a time of great opportunity, but also of great danger, in the education of students placed at risk for school failure. Students may be placed at risk for many reasons, among which are low socioeconomic status, minority status, and limited English proficiency, if they attend schools that are not prepared to build on their strengths. While individual low-income and minority students may excel, and individual schools may have great success with high-poverty students, on average such students perform significantly worse in school than do advantaged students in well-funded schools (Knapp & Woolverton, 1995; National Center for Education Statistics, 1993, 1994). In particular, African American and Latino students have, as a group, performed significantly lower than other groups. While the gap between these groups and white students on the National Assessment of Educational Progress (NAEP) and other tests has gradually diminished since the early 1970s, the gap remains substantial, and in the most recent NAEP assessments the white-minority gap actually increased slightly, for the first time since NAEP has been given (NCES, 1994).

The downturn in the progress of African American and Latino students on NAEP may be a statistical artifact or a temporary setback, but there are other troubling trends. One is the disappointing findings of a national evaluation of Chapter 1/Title I (Puma, Jones, Rock, & Fernandez, 1993). Title I, formerly called Chapter 1, is by far the largest federal investment in the education of disadvantaged, low-achieving children. Controlling for background factors, the Puma et al. (1993) study found few benefits for students who received services paid for by Title I. Recently, a substantial cut in Title I was narrowly averted in the U.S. Congress; without better evidence of effectiveness, it is unlikely that Title I will fare as well in the future.
Cutbacks in other funds, such as funds for bilingual education and for professional development, further limit the capacity of schools to undergo significant reform, just as state and local funding for education is being reduced in many areas. After more than a decade of reform, many policy-makers and private citizens are frustrated and pessimistic about the pace of reform. Continued political pressure toward vouchers, charter schools, and other alternatives to the public school monopoly on education speaks to this frustration.

Much as there are dangers and frustrated expectations in the education of students placed at risk, there are also significant new opportunities for fundamental positive change. The recent "summit" of governors and business leaders indicated continuing support for reform, at least at the rhetorical level. Adoption of new state standards and assessments has focused many state education systems on more ambitious goals for all students, often coupled with professional development and supplementary funding programs specifically designed to accelerate the performance of low-achieving schools. Successful funding equity or funding adequacy suits in several states have provided additional funding to high-poverty schools, and in some states this funding has been tied to adoption of promising reforms. The 1994 reauthorization of Chapter 1 as Title I (of the Improving America's Schools Act) introduced many opportunities for significant school-by-school change for disadvantaged children. It focused Title I assessment on broader performance measures that will ultimately incorporate all subjects, correcting some of the unintended consequences of reliance on narrowly focused standardized tests (Slavin & Madden, 1991).
The new Title I opens the opportunity for many more schools to become schoolwide projects, able to use Title I funds flexibly to meet the needs of the whole school population in schools in which 50% of students are in poverty (down from a 75% criterion). Title I is encouraging a focus on adoption of proven practices, moving away from its long-term focus on remediation and classroom aides. An emphasis on site-based management within Title I and in education reform more broadly is giving school staffs the resources and authority to select their own paths to reform. Never before have individual schools serving many low-income and minority children had as much opportunity to adopt comprehensive changes.

At the same time, there has been an expansion in the availability of national reform models, particularly comprehensive, ambitious schoolwide reforms such as Comer's (1988) School Development Program, Levin's (1987) Accelerated Schools Program, Sizer's (1992) Coalition of Essential Schools, and our own Success for All and Roots & Wings programs (Slavin, Madden, Dolan, & Wasik, 1996a). These and other programs provide well-developed models for change, extensive networks of professional development and mutual assistance among schools, and teacher and student materials. The New American Schools Development Corporation, now called New American Schools (NAS), has funded the development of seven new comprehensive models, now beginning to be disseminated broadly (Stringfield, Ross, & Smith, 1996). In addition to these schoolwide reform models, there are many reform networks built around innovations focusing on particular subjects and grade levels. Some of these, such as writing process models (Hillocks, 1984) and mathematics reforms connected to the National Council of Teachers of Mathematics standards, provide general guidelines and many sources of professional development.
Others are connected to more formal organizations and typically provide assistance in specific approaches to instruction and curriculum. The prototypical example of such a program is Reading Recovery (Pinnell, 1989), a one-to-one tutoring model for at-risk first graders, which is serving thousands of schools nationwide. Increasingly, schools are creating their own models of schoolwide reform by phasing in a variety of these innovative approaches to curriculum, instruction, assessment, and school organization.

Schools and school districts serving many students placed at risk have increased opportunities to implement well-developed models, as well as increased responsibility (and pressure) to improve student performance. This opportunity and responsibility create a need for schools to have the highest quality of evidence possible about the effectiveness of alternative approaches. Implementing the kinds of innovative approaches likely to make a substantial difference in the performance of students requires a great deal of time, effort, and money. School staffs must have a well-founded confidence that if they implement a given program with care, intelligence, and energy, students' achievement will show a significant improvement.

The purpose of this review is to describe the current state of evidence of effectiveness for replicable programs available to elementary and middle schools serving many students placed at risk. In particular, it is intended to give Title I schools comparable information on the instructional outcomes of programs they might adopt to meet the needs of their children. Research on the traditional uses of Title I funds finds few educationally meaningful positive achievement effects (Puma et al., 1993). Some widely used alternatives to traditional methods have good evidence of effectiveness, but many do not (see, for example, Herman & Stringfield, 1995).
As Title I schools take advantage of the new opportunities presented to them, they must have independent, reliable evidence on which to base their decisions. Earlier research
reviews by Slavin, Karweit, & Madden (1989) and Slavin, Karweit,
& Wasik (1994) have also synthesized research on effective programs
for students placed at risk. Fashola, Slavin, Calderón, and Durán (1996)
reviewed programs of particular relevance to schools serving many Latino
students. This paper considerably updates and extends these reviews,
focusing on programs that are likely to be of particular interest to
Title I schoolwide programs seeking comprehensive approaches to helping
all of their children achieve their full potential.

Scope of the Review

The focus of this
review is on the identification of programs that have been shown to
be effective in rigorous evaluations, that are replicable across a broad
range of elementary and middle schools, and that have been successfully
evaluated or at least frequently applied to schools serving many low-income
and minority students. The criteria applied in this review are described
in the following sections.

Literature Search Procedures

The broadest possible search was carried out for programs that had been evaluated and/or applied to disadvantaged students. In addition to searches of the ERIC system and of education journals, we obtained reports on promising programs listed by the National Diffusion Network (NDN) and by Title VII grantees. The NDN was part of the U.S. Department of Education that identified promising programs, disseminated information about them through a system of state facilitators, and provided "developer/disseminator" grants to help developers prepare their products for dissemination and then to carry out a dissemination plan. To be listed by NDN, a program must have presented evidence of effectiveness to a Program Effectiveness Panel (PEP), or formerly to the Joint Dissemination Review Panel (JDRP). PEP or JDRP panel members reviewed the data for educationally significant effects. However, the evaluation requirements for PEP/JDRP have been low, and more than 500 programs of all kinds have been approved, mostly on the basis of NCE-gain designs (see National Diffusion Network, 1995).

Selection for Review

Ideally, programs emphasized in this review would be those that present rigorous evaluation evidence in comparison to control groups showing significant and lasting impacts on the achievement of students placed at risk, have active dissemination programs that have implemented the program in many schools serving at-risk students, and have evidence of effectiveness in dissemination sites, ideally from studies conducted by third parties. To require all of these conditions would limit this review to very few programs. To include a much broader range of programs, we have had to compromise on one or more criteria. For example, we have included programs that have excellent data that show positive effects for students placed at risk even if the program has not been widely replicated (as long as there is no obvious reason it could not be replicated).
As noted earlier, we have included programs with excellent outcome data and evidence of replicability with middle class non-minority students if the program has also been replicated in schools serving many low-income students. We have included programs that have shakier evidence of effectiveness if they are particularly well-known, widely replicated, and appropriate to the needs of low-income and minority students. In particular, we have listed widely known comprehensive, schoolwide programs, even if the evidence supporting them is more anecdotal than conclusive. Thus, our listing of a program in this review is by no means a statement that we believe the program to be highly effective, replicable, and uniquely adapted to the needs of students placed at risk. Instead, it is an indication that among the hundreds of programs we have reviewed, these were the ones we felt to be most appropriate to be considered by elementary and middle schools serving many at-risk students, especially Title I schools. We have tried to present the evidence that school and district staff would need to begin a process leading to an informed choice from among effective and promising programs capable of being replicated in their settings.

Effect Sizes

The outcomes of the evaluations summarized in this review are quantified as "effect sizes." These are computed as the difference between experimental and control group means divided by the control group's standard deviation (Glass, McGaw, & Smith, 1981). To give a sense of scale, an effect size of +1.0 would be equivalent to 100 points on the SAT scale, two stanines, 15 points of IQ, or about 21 NCEs. In general, an effect size of +0.25 or more would be considered educationally significant. When means and standard deviations are not known, they can usually be estimated from t-tests, F's, chi squares, or exact p values. If effect sizes cannot be computed, study outcomes are still included if they meet all other inclusion criteria.
Because of differences between measures, experimental designs, and other factors, effect sizes should be interpreted with great caution. For example, effect sizes are almost always higher on experimenter-made tests closely aligned with program curricula than on more general standardized tests (see Rosenshine & Meister, 1994). However, effect sizes do provide a useful indication of programs' effects on student achievement that can be compared (with caution) across studies and programs.
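
The effect size definition above can be illustrated with a minimal Python sketch. It is offered only as an illustration under stated assumptions, not as the procedure used in the studies reviewed: the scores are invented, the function names are hypothetical, the sample (n-1) standard deviation is assumed, and the NCE standard deviation is taken as 21.06 (the "about 21" in the text). The conversion from a t statistic is the common textbook formula, which standardizes by a pooled standard deviation rather than the control group's, so it only approximates the definition given here.

```python
from math import sqrt
from statistics import mean, stdev

# Standard deviation of the NCE scale; the text rounds this to "about 21".
NCE_SD = 21.06
# Threshold the review treats as educationally significant.
SIGNIFICANCE_THRESHOLD = 0.25


def effect_size(experimental, control):
    """Difference in group means divided by the control group's SD."""
    return (mean(experimental) - mean(control)) / stdev(control)


def effect_size_from_t(t, n_experimental, n_control):
    """Approximate an effect size from an independent-samples t statistic.

    Assumption: this conversion standardizes by the pooled SD, not the
    control group's SD, so it only approximates the definition above.
    """
    return t * sqrt(1 / n_experimental + 1 / n_control)


def to_nce(es):
    """Express an effect size as an approximate difference in NCE points."""
    return es * NCE_SD


def is_educationally_significant(es):
    """Apply the review's +0.25 rule of thumb."""
    return es >= SIGNIFICANCE_THRESHOLD


# Hypothetical posttest scores, invented for illustration only.
experimental_scores = [45, 55, 65]
control_scores = [40, 50, 60]

es = effect_size(experimental_scores, control_scores)
print(f"effect size: {es:+.2f}")
print(f"approx. NCE difference: {to_nce(es):.1f}")
print(f"educationally significant: {is_educationally_significant(es)}")
```

Dividing by the control group's standard deviation, rather than a pooled value, keeps the yardstick independent of any change the program itself makes to the spread of scores. As the text cautions, values computed this way are comparable across studies only roughly, since they depend on the measure and design used.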
Some of the most promising programs for students placed at risk are programs designed to reform the entire school, touching on everything from curriculum and instruction to school organization and assessment.

Success for All/Lee Conmigo

The schoolwide reform program that has been most extensively evaluated in schools serving many students placed at risk is Success for All, a comprehensive reform program for elementary schools (Slavin, Madden, Dolan, & Wasik, 1996a). Success for All provides schools with innovative curricula and instructional methods in reading, writing, and language arts from kindergarten to grade six, with extensive professional development. The curriculum emphasizes a balance between phonics and meaning in beginning reading and extensive use of cooperative learning throughout the grades. Recently, programs in mathematics, social studies, and science have been added to Success for All, making up a program called Roots & Wings (Slavin, Madden, & Wasik, 1996), described below. One-to-one tutoring, usually from certified teachers, is provided to children who are having difficulties in learning to read, with an emphasis on first graders. Family support services provided in each school build positive home-school relations and solve problems such as truancy, behavior problems, or needs for eyeglasses or health services. A program facilitator works with all teachers on continuing professional development and coaching, manages an assessment program to keep track of student progress, and ensures close coordination among all program components.

In schools with Spanish bilingual programs, Success for All uses a beginning reading curriculum called Lee Conmigo, which applies instructional strategies similar to those used in the English program (Reading Roots), but uses a curriculum sequence and materials appropriate to Spanish language and Latino culture.
Beginning in second grade, students use a Spanish adaptation of Cooperative Integrated Reading and Composition (BCIRC), described later in this paper. A different adaptation of Success for All is made in schools with many limited English proficient students but no native-language instruction. In these schools, the English curriculum is used, but there is close coordination between ESL and classroom reading programs to infuse effective ESL strategies into the reading approach. ESL teachers usually provide classroom reading instruction and, often, tutoring to LEP children.

Research on the Success for All program in general has taken place in 23 schools in nine districts throughout the U.S. In each case Success for All schools were matched with similar comparison schools. Students were pretested to establish comparability and then individually posttested each year on scales from the Woodcock Reading Mastery Test and the Durrell Oral Reading Test. Results show consistent, substantial positive effects of the program, averaging an effect size of about +0.50 at each grade level. For the most at-risk students, those in the lowest 25% of their grades, effect sizes have averaged more than a full standard deviation (ES=+1.00 or more). In grade equivalent terms, differences between Success for All and control students have averaged three months in the first grade, increasing to more than a full grade equivalent by fifth grade (Slavin, Madden, Dolan, Wasik, Ross, Smith, & Dianda, 1996b). Follow-up studies have found that this difference is maintained into sixth and seventh grades, after students have left the program schools. For language minority students, the effects of Success for All have been particularly positive (Slavin & Madden, 1995).
Bilingual schools using Lee Conmigo in Philadelphia found substantial differences between Success for All and control schools on scales from the Spanish Woodcock, with an effect size at the end of second grade of +1.81 (almost a full grade equivalent difference). A study in two California bilingual schools (Dianda & Flaherty, 1995) also found very positive effects of Success for All/Lee Conmigo. At the end of first grade, Success for All students exceeded control students by an effect size of +1.03, or about five months. Dianda and Flaherty (1995) also reported an effect size of +1.02 for Spanish-dominant LEP students in a sheltered English adaptation of Success for All in a third California school. In addition, a five-year study of the ESL adaptation of Success for All to limited English proficient Cambodian students in Philadelphia also found extremely positive outcomes, averaging an effect size of +1.44 and a grade equivalent difference of almost three years by the end of fifth grade (Slavin & Madden, 1995).

As of fall 1996, Success for All is in use in more than 450 schools in the U.S., nearly all Title I schools. In Houston, the largest-ever implementation of Success for All is being carried out in 70 schools. English and Spanish outcomes will be assessed in comparison to control schools in 35 bilingual schools. A training staff in Baltimore, with regional training programs in California, Arizona, Texas, Florida, and New York, disseminates the program nationally.

Roots & Wings

Roots & Wings (Slavin, Madden, Dolan, & Wasik, 1994; Slavin, Madden, & Wasik, 1996) is a comprehensive reform design for elementary schools that adds to Success for All innovative programs in mathematics, social studies, and science. Funded by New American Schools, Roots & Wings has only recently begun to be disseminated and evaluated beyond its four pilot schools. Roots & Wings schools begin by implementing all components of Success for All, described above.
In the second year of implementation they typically begin to incorporate the additional major components.

MathWings is the name of the mathematics program used in grades 1-5. It is a constructivist approach to mathematics based on NCTM standards, but designed to be practical and effective in schools serving many students placed at risk. MathWings makes extensive use of cooperative learning, games, discovery, creative problem solving, manipulatives, and calculators.

WorldLab is an integrated approach to social studies and science that engages students in simulations and group investigations. Students take on roles as various people in history, in different parts of the world, or in various occupations. For example, they work as engineers to design and test efficient vehicles, they form a state legislature to enact environmental legislation, they repeat Benjamin Franklin's experiments, and they solve problems of agriculture in Africa. In each activity students work in cooperative groups, do extensive writing, and use reading, mathematics, and fine arts skills learned in other parts of the program.

A study of Roots & Wings (Slavin, Madden, & Wasik, 1996) was carried out in four pilot schools in rural southern Maryland. All four schools were Title I schools; on average, 48% of these students qualified for free lunch, and 21% were Title I eligible. The assessment tracked growth over time on the Maryland School Performance Assessment Program (MSPAP), compared to growth in the state as a whole. The MSPAP is a performance measure on which students are asked to solve complex problems, set up experiments, write in various genres, and read extended text. It uses matrix sampling, which means that different students take different forms of the test. In both third and fifth grade assessments in all subjects tested (reading, language, writing, math, science, and social studies), Roots & Wings students showed substantial growth.
The State of Maryland also gained in average performance on the MSPAP over the same time period, but the number of Roots & Wings students achieving at the satisfactory or excellent level increased by more than twice the state's rate on every measure at both grade levels.

As of fall 1996, approximately 30 schools in ten states are adding either MathWings or WorldLab to their implementations of Success for All, making themselves into Roots & Wings schools. Additional demonstration sites for the program are being established in Baltimore County, MD; Memphis; Dade County (Miami); Cincinnati; and Modesto, CA.

Accelerated Schools

Accelerated Schools (Levin, 1987; Hopfenberg & Levin, 1993) is an approach to school reform built around three central principles. One is unity of purpose, a common vision of what the school should become, agreed to and worked toward by all school staff, parents, students, and community. A second is empowerment coupled with responsibility, which means that staff, parents, and students find their own way to transform themselves, with freedom to experiment but also a responsibility to carry out their decisions. The third, building on strengths, means identifying the strengths of students, of staff, and of the school as an organization, and then using these as a basis for reform. One of the key ideas behind Accelerated Schools is that rather than remediating students' deficits, students who are placed at risk of school failure must be accelerated, given the kind of high-expectations curriculum typical of programs for gifted and talented students.

The school implements these principles by establishing a set of "cadres," which include a steering committee and work groups focused on particular areas of reform. The program has no specific instructional approaches and provides no curriculum material; instead, school staff are encouraged to search for methods that help them realize their vision. However, there is an emphasis both on reducing all uses of remedial activities and on adopting constructivist, engaging teaching strategies (such as project-based learning).

The evaluation evidence on Accelerated Schools is quite limited and largely anecdotal. The program's developers state that the program takes five years to fully implement and that it is unfair to evaluate program outcomes until that much time has passed. No evaluation evidence has yet been reported from schools in the program this long. However, data from a few individual schools earlier in their implementations have been reported.
There have been three evaluations of individual Accelerated Schools, all including significant numbers of Latino students (but none separately reporting results for these students). McCarthy and Still (1993) reported on one Texas school with a large Latino majority that showed gains over time in its fifth-grade standardized test scores (other grades were not mentioned). A similar comparison school showed losses over the same period. Another Texas evaluation of Accelerated Schools was reported by Knight and Stallings (1995). This study compared a school with a 25% Latino population to a matched comparison school over a two-year period. On standardized tests in reading the Accelerated School students gained more than the comparison students in grades 1-3, but not 4-5. In language, the Accelerated School scored better in grades 1-2, but the comparison school did as well or better in 3-5. On a statewide accountability test, Accelerated School students passed at a higher rate than comparison students on math and reading tests in grade 3, but the opposite was true in grade 5. On a writing measure students in the Accelerated School performed slightly higher at both grade levels. Finally, in a Sacramento school with students speaking 13 languages, Chasin and Levin (1995) reported gains on standardized tests for sixth graders but did not mention changes in other grades. Even these gains are difficult to attribute to the program, as the school's population also increased substantially over the same period.

More than 900 schools in 39 states are currently involved in the Accelerated Schools network, and there are four regional training sites for the program in addition to the original training site at Stanford.

School Development Program

The School Development Program (Comer, 1980, 1988; Comer, Haynes, Joyner, & Ben-Avie, 1996) is a comprehensive approach to school reform in elementary and middle schools.
The program's focus is on building a sense of common purpose among school staff, parents, and community, and engaging school staff and others in a planning process intended to change school practices to improve student outcomes. Each SDP school creates three teams that take particular responsibility for moving the reform agenda forward. A School Planning and Management Team, made up of representatives of teachers, parents, and administration, develops and monitors implementation of a comprehensive school improvement plan. A Mental Health Team, principally composed of school staff concerned with mental health such as school psychologists, social workers, counselors, and selected teachers, plans programs focusing on prevention, building positive child development, positive interpersonal relations, and so on. The third major component of the SDP is a Parent Program, designed to build a sense of community among school staff, parents, and students. The Parent Program incorporates existing parent participation activities (such as the PTA) and implements further activities to draw parents into the school, to increase opportunities for parents to provide volunteer services, and to design ways for the school to respect and celebrate the ethnic backgrounds of its students.

The three teams in SDP schools work together to create comprehensive plans for school reform. Although the main focus is on mental health and parent involvement, schools are also encouraged to examine their instructional programs and to look for ways to serve children's academic needs more effectively. The SDP was originally designed especially to meet the needs of African-American children and families, but large numbers of Latino and white students also attend SDP schools.

Evaluations of the effects of SDP have taken place in a number of locations.
The first was a longitudinal evaluation of the first two SDP schools in New Haven, Connecticut, which showed marked improvements in student performance on standardized tests over a 14-year period (Comer, 1988). A recent independent evaluation following first graders in two SDP schools also showed positive effects (Stringfield & Herman, 1995). Other evaluations comparing SDP to matched control schools have found mixed, inconsistent effects, with substantial site-to-site variation. Outcomes emphasized by the program, such as self-concept and school climate, have been more consistently associated with the program than have achievement gains (Becker & Hedges, 1992; Haynes, 1991, 1994).

The SDP is currently involved with more than 565 schools, mostly elementary and middle schools, in 22 states. It has regional training programs in several states.

Consistency Management and Cooperative Discipline (CMCD)

Consistency Management and Cooperative Discipline (CMCD) (Freiberg, Prokosch, & Treister, 1990) is a school-wide reform program designed to improve discipline in inner-city schools at grade levels K-6, to provide an appropriate environment for learning and improve academic achievement. CMCD emphasizes shared responsibility for classroom discipline between students and teachers, turning classrooms into communities of ownership, where the teachers and students together make the rules for classroom management. The idea is that if students have a hand in creating and enforcing the rules, then acting up to defy the teacher would not work anymore, "because (students) would also be breaking their own laws" (Freiberg, Prokosch, & Treister, 1990). CMCD exists as a stand-alone program, or can be used along with other innovative programs directed at improving curriculum and instruction.

Teachers initially assess the needs of their classrooms in the spring. During the summer, they attend various workshops on CMCD, and also work with facilitators from their own schools. During the school year, teachers return for workshops and follow-up sessions. CMCD provides a framework of regulations, which schools adapt to fit their needs. The main components or themes of CMCD that exist at every school are prevention, caring, cooperation, organization, and community. At the initial implementation stages of CMCD, the teachers engage in a series of interviews and assessment sessions, whose goals are to evaluate the school's strengths and weaknesses and adapt the program to fit their school.

CMCD has primarily been evaluated in inner-city schools in Houston, with many African-American and Latino students. The main evaluation of CMCD followed five CMCD and five matched control schools in Houston over a period of five years (Freiberg, Stein, & Huang, 1995).
Eighty-three percent of the students qualified for reduced-price lunches. Discipline problems and mobility rates were high. The first study (Freiberg, Prokosch, & Treister, 1990) was an evaluation of the five schools after the first two years of implementation. This study showed that the students of teachers from the five program schools who had attended at least seven sessions of CMCD outperformed students in the comparison schools, with minimal effect sizes ranging from +.09 to +.29 on the MAT-6.

The second evaluation involved the comparison of one of the CMCD schools with its control school (Freiberg, Stein, & Huang, 1995). Although both schools had equal scores on the ITBS composite pre-test, the experimental group outperformed the control group on the composite MAT-6 for the next three years, with effect sizes ranging from +.44 to +.95. The treatment group outscored the comparison group in math (ES=+.15), reading (ES=+.52), and writing (ES=+.84). CMCD students also exceeded control students on the Texas Educational Assessment of Minimum Skills (TEAMS), with effect sizes ranging from +.40 to +1.24 in math, from +.13 to +.40 in reading, and from +.14 to +.45 in writing. Measures of motivation, self-concept, and other measurements of positive attitudes toward school and learning also favored the CMCD school.

Freiberg and Huang (1994) followed an initial cohort of students who entered before the first year the program was implemented (1985-86) and stayed in their respective schools until they reached the sixth grade (1991-92). Students who later attended the school, students of teachers who were trained at times other than the initial year of implementation, students who did not have scores from all tests (TEAMS and MAT-6) for all five years, and students who were retained at least a year were not included.
Because of these restrictions and the high mobility rates, the final number of students who were followed for all six years was very small: 27 in the control school and 27 in the experimental school. Students who had remained in the CMCD program for six years showed significantly higher test scores on TEAMS. Adjusting for the pre-test differences, effect sizes for the differences between the two groups ranged from +.08 to +.99. The students in the program schools also scored consistently higher than the students in the comparison schools on the MAT-6. Students involved in CMCD scored above the 50th percentile on the MAT-6 during all of the years they were assessed, while the scores of students in the comparison group dropped. Adjusting for pre-test differences, effect sizes ranged from +1.13 to +1.23 in reading, +1.44 to +1.51 in math, and +.99 to +1.71 in writing.

The most recent study of CMCD (Freiberg, 1996) compared the performances of students in schools implementing a mathematics program with those in schools implementing a combination of CMCD and the mathematics program. All of the schools involved in this study were majority Latino. The students in the combined program outperformed students involved in the mathematics-only program, with an effect size of +.33.

CMCD currently exists
in over twenty-five schools in three Texas districts, as well as abroad.
New American Schools Designs
The development of comprehensive, schoolwide designs for school reform has been greatly advanced by the New American Schools Development Corporation (NASDC), now called New American Schools (NAS). Founded in 1991, NAS is a foundation primarily funded by large corporations to support the development and dissemination of ambitious school designs for the 21st century. Initially, eleven design teams were funded to develop school designs. Four were discontinued for various reasons. The remaining seven are, at this writing, just beginning national dissemination. With the exception of our own Roots & Wings program, described earlier, the NAS designs are at too early a stage of implementation and evaluation to have produced outcome data that would meet the inclusion criterion applied in this review. Most have anecdotal data noting outstanding gains in one or two schools (among many that might be using the program). However, while the achievement data supporting them are limited so far, these designs have several features that make them attractive alternatives for schools seeking fundamental reform. First, these designs are very comprehensive. To one degree or another, all address curriculum, instruction, school operation, assessments, and parent/community involvement. Second, all are built for replication. All of the designs provide trainers, well-specified professional development strategies, and networks of implementing schools that help mentor new schools into the network. In addition to Roots & Wings, the New American Schools designs are as follows.
ATLAS Communities
The ATLAS Communities (Comer, Gardner, Sizer, & Whitla, 1996) is a design based on a collaboration among four school reform organizations, those led by James Comer, Howard Gardner, Theodore Sizer, and Jane Whitla.
ATLAS incorporates elements of Comer's (1988) School Development Project, described earlier, but also adds elements from the other reform networks and has several features unique to it. One of these is a focus on working with pathways, feeder systems of elementary, middle, and high schools whose staff work with each other to create coordinated and continuous experiences for students. The emphasis of the design is on helping school staffs create classroom environments in which students are active participants in their own learning, putting into practice a model (following Sizer's (1992) Coalition of Essential Schools) of student as worker, teacher as coach. Project-based learning is extensively used. Assessment in ATLAS schools emphasizes portfolios, performance examinations, and exhibitions. Preliminary data
from implementing schools show some gains. In Prince George's County,
Maryland, reading test scores increased by up to 30% in one ATLAS elementary
school, and a middle school reported increases on test scores in math,
language arts, science, and social studies on the Maryland School Performance
Assessment Program.
Audrey Cohen
The Audrey Cohen College System of Education (Cohen & Jordan, 1996) is based on the teaching methods used at the Audrey Cohen College in New York City. This design attempts to have all learning relate to a purpose that contributes to the community or world at large. Each semester's work is built around a purpose, such as using science and technology to shape a just and productive society, or helping people through the arts. Curriculum materials appropriate to the semester's purpose are identified or adapted for schools' use. Academic activities build toward "constructive action" projects in which children apply knowledge to contribute to real community needs. Anecdotal reports of early outcomes have identified
individual schools implementing the Audrey Cohen design in San Diego, Phoenix,
and Miami that have reported above-average gains on standardized achievement
tests.
Co-NECT
Co-NECT (Goldberg & Richards, 1996) is a design created by a Cambridge (MA) consulting firm, Bolt, Beranek, and Newman. The design focuses on complex interdisciplinary projects that extensively incorporate technology and connect students with ongoing scientific investigations, information resources, and other students beyond their own school. Cross-disciplinary teaching teams work with clusters of students. Performance-based assessments are extensively used. On a battery of performance
items, one of the original pilot schools for Co-NECT, a middle school
in Worcester, MA, showed significant gains from 1994 to 1995 in reading
scores. Other schools also showed gains in selected areas.
Expeditionary Learning
Expeditionary Learning Outward Bound (Campbell, Farrell, Kamii, Lam, Rugen, & Udall, 1996) is a design built around learning expeditions, explorations within and beyond school walls. The program is affiliated with Outward Bound, and incorporates many of its principles of active learning, challenge, and teamwork. It makes extensive use of project-based learning, cooperative learning, and performance assessments. Expeditionary Learning
schools in Boston, Dubuque, and New York City have shown significant
increases over time on standardized test scores.
Modern Red Schoolhouse
The Modern Red Schoolhouse (Kilgore, Doyle, & Linkowsky, 1996) is a project of the Hudson Institute, a conservative think tank in Indianapolis. The program emphasizes strong core academic subjects, and in the elementary and middle grades is based on the E.D. Hirsch (1993) Core Curriculum. It makes extensive use of technology in instruction and assessment, and has established benchmarks for academic performance that all students must achieve to be advanced into the next unit or grade. Several elementary schools involved in the Modern Red Schoolhouse design have shown improvement on NCEs in the early grades. In particular, a school in the Bronx showed substantial gains on a state essential skills test in reading and math.
National Alliance
The National Alliance
for Restructuring Education (Rothman, 1996) is a partnership of states,
school districts, and national organizations affiliated with the New
Standards Project. The National Alliance is different from all other
NAS designs in that its emphasis is more on systemic reform than on
specific school-by-school restructuring. In particular, the National
Alliance works to help states and districts establish standards, performance
assessments, and accountability methods, and then helps schools design
their own approaches to meet those standards. Districts are also urged
to give schools greater autonomy and control over resources to find
their own ways to meet high standards. In the state of Kentucky, a key
National Alliance partner, schools engaged with the National Alliance
were much more likely than other Kentucky schools to earn awards for
improving their students' performance.
Most of the programs currently available to schools for replication are classroom instructional programs, often focusing on a single subject. For example, the National Diffusion Network (1995) listed more than 500 replicable programs with some evidence of effectiveness, and the great majority of these are classroom innovations. The following sections discuss replicable classroom programs that have been researched and/or extensively applied with students placed at risk.
Cooperative learning refers to a broad range of instructional methods in which students work together to learn academic content. Research comparing cooperative learning and traditional methods has found positive effects on the achievement of elementary and secondary students, especially when two key conditions are fulfilled. First, groups must be working toward a common goal, such as the opportunity to earn recognition or rewards based on group performance. Second, the success of the groups must depend on the individual learning of all group members, not on a single group product (see Slavin, 1995). Cooperative learning
methods are widely used throughout the U.S. and other countries with
all kinds of schools and children, and the research on these methods
has equally involved a broad diversity of schools and students.
Cooperative Integrated Reading and Composition (CIRC) and Bilingual Cooperative Integrated Reading and Composition (BCIRC)
Cooperative Integrated Reading and Composition, or CIRC (Stevens, Madden, Slavin, & Farnish, 1987), used in grades 2-8, involves a series of activities derived from research on reading comprehension and writing strategies. Students work in four-member heterogeneous learning teams. After the teacher introduces a story from a basal text or trade book, students work in their teams on a prescribed series of activities relating to the story. These include partner reading, where students take turns reading to each other in pairs; "treasure hunt" activities, in which students work together to identify characters, settings, problems, and problem solutions in narratives; and summarization activities. Students write "meaningful sentences" to show the meaning of new vocabulary words, and write compositions that relate to their reading. The program includes a curriculum for teaching main idea, figurative language, and other comprehension skills, and includes a home reading and book report component. The writing/language arts component of CIRC uses a cooperative writing process approach in which students work together to plan, draft, revise, edit, and publish compositions in a variety of genres. Students master language mechanics skills in their teams, and these are then added to editing checklists to ensure their application in students' own writing. Teams earn recognition based on the performance of their members on quizzes, compositions, book reports, and other products (see Madden, Slavin, Farnish, Livingston, Calderón, & Stevens, 1996). BCIRC (Calderón, Hertz-Lazarowitz, Ivory, & Slavin, 1996; Calderón, Tinajero, & Hertz-Lazarowitz, 1992) adds to the CIRC structure several adaptations to make it appropriate to bilingual settings.
It is built around Spanish reading materials in the younger grades and then uses transitional reading materials as students begin to transition from Spanish to English. The age of transition depends on district policies; materials to accompany Spanish basals and novels have been developed through the sixth grade, but most such materials are used in transitional bilingual education programs only through the third or fourth grades. In addition, effective ESL strategies designed to engage students in negotiating meaning in two languages and increase authentic oral communication are built into the training program. The original CIRC program has been evaluated in three studies in elementary schools (Stevens, Madden, Slavin, & Farnish, 1987; Stevens & Slavin, 1995) and one study in two middle schools (Stevens & Durkin, 1992). In each case, CIRC students made significantly greater gains than control students on standardized tests of reading achievement. Two studies in Israel, one in Hebrew and one in Arabic, also found positive effects of CIRC compared to traditional methods (Hertz-Lazarowitz et al., in press; Schaedel et al., in press). A four-year study of BCIRC was conducted in 24 bilingual grade 2-4 classes in El Paso, Texas (Calderón, 1994; Hertz-Lazarowitz, Ivory, & Calderón, 1993). Experimental and control classes were carefully matched. Students transitioned from mostly-Spanish instruction in second grade to mostly-English instruction in the fourth grade. At the end of second grade, BCIRC students scored significantly better than control students on the Spanish TAAS (Texas Assessment of Academic Skills) in both reading (ES=+.43) and writing (ES=+.47). In third grade, students were tested on the English Norm-Referenced Assessment Program for Texas (NAPT), and again BCIRC students outperformed controls in reading (ES=+.59) and language (ES=+.29). Finally, fourth graders in BCIRC scored higher than controls in NAPT reading (ES=+.19), but not language.
However, these differences were depressed by the transfer of students out of the bilingual classes into English-only classes, which happened with four times as many BCIRC as control students. Students who were moved out of the bilingual classes early tended to be the highest achievers, so deleting them from the sample reduced the apparent experimental-control differences. CIRC is used in several
hundred schools nationally, and BCIRC is used in more than a hundred,
including Success for All programs with bilingual programs which use
an adaptation of BCIRC. Training programs for CIRC and BCIRC exist at
Johns Hopkins University in Baltimore and El Paso, and additional trainers
in both models are located in many parts of the United States.
Complex Instruction/Finding Out/Descubrimiento
Complex Instruction is the name given to a set of cooperative learning approaches developed and researched by Elizabeth Cohen (1994a) and her associates at Stanford University. From its inception, the program has focused on Spanish bilingual classes. It was first built around a discovery-oriented science and mathematics program called Finding Out/Descubrimiento, developed by DeAvila and Duncan (1980). Finding Out/Descubrimiento provides students with a series of activity cards in English and Spanish which direct them to do experiments, take measurements, solve problems, and so on. Students work in small, heterogeneous groups to do experiments and answer questions intended to evoke high-level thinking. As it relates to bilingual education, a major focus of the program is to get students to use complex, sophisticated language to express, debate, and defend their ideas, thereby building language fluency first in their home language and then in English. Whenever possible, each group contains monolingual Spanish, monolingual English, and bilingual children, who freely translate ideas for each other. Complex Instruction adds to Finding Out/Descubrimiento a group structure, in which students take on specified roles (e.g., facilitator, checker, reporter) and learn group process skills, such as active listening, maintaining a positive group atmosphere, and ensuring equal participation. The program also emphasizes building positive expectations for all students; for example, by giving low-status children opportunities to be the group expert and constantly reinforcing the idea that all children have different abilities, each of which is worthy of respect (Cohen, 1994a). The evaluations of Complex Instruction/Finding Out/Descubrimiento have not generally met the standards established in this review.
Most have reported positive correlations between the frequency of students' talking and working together and gains in student achievement (Cohen & Intili, 1981; Cohen, Lotan, & Leechor, 1989; Cohen, 1984; Stevenson, 1982). This could be taken as an indication that better implementers of the program get better results, but it does not indicate that the children are performing better than they would have without the program. Similarly, reports of NCE gains in the program classes (see Cohen, 1994b) are inadequate indicators of program impacts. Still, the accumulation of imperfect but supportive evidence and the clear focus on improving the higher-order thinking of students in bilingual programs make this program appealing. The Complex Instruction
program at Stanford provides materials and professional development
to support program adoption in elementary and middle schools, and it
is used in many schools, particularly in California.
Student Teams-Achievement Divisions and Teams-Games-Tournaments
Two related cooperative learning programs developed at Johns Hopkins University are among the most thoroughly evaluated of all cooperative methods, and have been extensively disseminated. These are Student Teams-Achievement Divisions (STAD) and Teams-Games-Tournaments (TGT) (Slavin, 1994, 1995). In STAD, students work in four-member, heterogeneous learning teams. First the teacher provides the lesson content through direct instruction. Then students work in their teams to help each other master the content, using study guides, worksheets, or other material as a basis for discussion, tutoring, and assessment among students. Following this, students take brief quizzes, on which they cannot help each other. Teams can earn recognition or privileges based on the improvement made by each team member over his or her own past record. TGT is the same as STAD except that students play academic games with members of the other teams to add points to an overall team score. Both programs emphasize the use of group goals (in this case, recognition) in which teams can only achieve success if each team member can perform well on an independent assessment. This motivates team members to do a good job of teaching and assessing each other. Both STAD and TGT have been extensively evaluated in comparison to control groups in a wide variety of subjects, mostly in schools serving many African American and/or Latino students. Across 26 such studies of at least four weeks' duration, there was a median effect size of +.32 for STAD; in 7 studies of TGT, the median effect size was +.38 (Slavin, 1995). STAD and TGT are
used in thousands of classrooms nationwide. A training program at Johns
Hopkins University and certified trainers throughout the U.S. provide
professional development in these methods.
Jigsaw
Jigsaw (Aronson et al., 1978) is a cooperative learning technique in which students work in small groups to study text, usually social studies or science. In this method, each group member is assigned to become an "expert" on some aspect of a unit of study. After reading about their area of expertise, the experts from different groups meet to discuss their topic, and then return to their groups and take turns teaching their topics to their groupmates. In a variation of Jigsaw called Jigsaw II (Slavin, 1994), students are given topics in a common reading, such as a text chapter, biography, or short book. After they have read the material, discussed it with their counterparts in other groups, and shared their topics with their own group, they take a quiz on all topics, as in STAD. The first brief Jigsaw evaluation (Lucker, Rosenfield, Sikes, & Aronson, 1976) found positive effects of the program for "minority students" (Latino and African American students analyzed together), but not for Anglos. A study in bilingual classes (Gonzales, 1981) and one in majority-Latino schools (Tomblin & Davis, 1985) found no achievement benefits. Outcomes for Jigsaw II have been more positive (Mattingly & VanSickle, 1991; Ziegler, 1981). Jigsaw is widely used nationwide. Training in numerous Jigsaw variations is provided by Spencer Kagan and his colleagues (Kagan, 1995) among others.
Learning Together
David and Roger Johnson's (1994) Learning Together models of cooperative learning are among the most widely used of all cooperative learning models. In these methods, students work in small groups on common assignments, typically creating a single group product. All group members are evaluated based on this product. In some applications of this method, groups may earn recognition or grades based on either overall group performance or on the sum of individual performances.
Many evaluations of Learning Together models have been very brief and artificial, but among those of at least four weeks' duration, evidence supports the achievement effects of forms of the Learning Together model that incorporate group goals and individual accountability (i.e., group success depends on the sum of individual performances). The Johnsons' methods
are widely used throughout the world. Trainers in these methods are
located at the University of Minnesota and in many other parts of the
United States.
Group Investigation
Group Investigation is a form of project-based learning developed by Shlomo and Yael Sharan (1992) and their colleagues in Israel. In this method, students form their own two- to six-member groups. The groups choose topics from a unit being studied by the entire class, break these topics into individual tasks, and carry out activities necessary to prepare and present group reports. Studies of Group Investigation have generally supported the effectiveness of this approach, especially on higher-order skills (Sharan & Shachar, 1988).
Reading, Writing, and Language Arts
There are many well-evaluated
and replicable programs designed for use in specific grades and subjects.
In reading, positive effects have been found in the Success for All
and CIRC programs, described earlier, and in three additional programs
described in this section. Positive reading effects have also been found
for tutoring programs, described in a later section. In writing and
language arts, effective methods generally include some form of process
writing, in which students work together to plan, draft, revise, edit,
and publish compositions. A general review of process writing models
(Hillocks, 1984) found consistently positive effects on quality of students'
writing. CIRC and BCIRC, described earlier, use process writing methods.
Other approaches to writing that have been successfully researched and/or
disseminated with students placed at risk are described below.
Direct Instruction (DISTAR)
DISTAR (Bereiter & Engelmann, 1966) is an early elementary school program originally designed to extend the Direct Instruction early childhood curriculum into the elementary grades as part of a federal program called Follow Through, which funded the development and evaluation of programs to continue the positive effects of early childhood programs. The primary goal of both the early childhood program and DISTAR was to provide low-SES children with opportunities to succeed academically by utilizing a scripted program that stresses structured direct instruction. Revisions of DISTAR have been disseminated in recent years under the titles of Reading Mastery/Math Mastery and Direct Instruction. Teachers involved in DISTAR have specific instructions on how to teach each of the units presented to the students, as well as what units to teach them. Students begin DISTAR in either kindergarten or first grade. Progress in DISTAR is usually monitored by evaluating academic performance of students in the program, using both criterion-referenced and norm-referenced measures. The most comprehensive evaluation of DISTAR compared the results of nine Follow Through programs that also had early-childhood education programs. Each program was compared to control groups that were not implementing Follow Through (Abt, 1977). The total number of subjects was 9,255 for the Follow Through (experimental) group and 6,485 for the non-Follow Through students. All of the children were from similar socioeconomic backgrounds. The study evaluated the effects of the programs on academic achievement, cognitive achievement, and self-esteem, as measured by performance on norm-referenced tests such as MAT, Raven's Progressive Matrices (1956), Coopersmith Self Esteem Inventory, and the Intellectual Achievement Responsibility Scale.
Programs were clustered in three groups in terms of their overall goals for the children. The first cluster of programs stressed individualized, child-initiated activities, and focused on the development of the whole child. Examples of these programs included the Open Education Model, Tucson Early Education Model, Cognitively Oriented Curriculum, Responsive Education Model, and the Bank Street College Model. The second cluster of programs stressed direct instruction with the specific goal of developing and improving students' academic skills. These two programs were the Behavior Analysis model and the Direct Instruction model. The final cluster included programs whose goals were to improve specific areas related to the performance of the children. These programs were the Florida Parent Education Model and the Language Development (Bilingual Model). Direct Instruction and Behavior Analysis were the only models that showed substantial effects both when compared to non-Follow Through programs and when compared to other programs. Other programs evaluated showed either effects of zero, or negative effects when all three of the skills (basic, cognitive, or affective) were measured. The Direct Instruction group did better than all of the other groups on the MAT language (ES=+.84) and mathematics computation (ES=+.57). Direct Instruction students also scored somewhat higher in reading comprehension (ES=+.07) and mathematics problem solving (ES=+.17), and were also higher in self-esteem. Becker and Gersten (1982) studied the lasting effects of Direct Instruction on students in fifth and sixth grades. This study followed up students who had been in DISTAR in grades 1-3 in five sites. The students were matched with control groups based on income level, gender, primary home language, and mother's education level, and these factors were used as covariates.
Students in this site were pre-tested in Spring 1975, using all subtests of the MAT, and also the Language Acquisition Scale test. This study was then replicated using post-test scores in 1976. Overall results show that DISTAR students outperformed non-DISTAR students on the overall WRAT (ES=+.53), and on all of the subtests of the MAT. Meyer (1984) investigated the long-term effects of DISTAR on children who had had three and four years of the program, and compared their achievements to those of matched control groups. The study involved three cohorts of students from a New York City elementary school. Students in the Direct Instruction Follow Through school in New York City were matched with control group students based on achievement scores on the MAT, free lunch eligibility, and ethnicity (African-American and Puerto Rican). Evaluators compared the two groups of students on high school graduation rate, ninth grade reading score, ninth grade math score, student's application to college, student's acceptance to college, student's special education placement, and student's school attendance for the previous year. Students who had been involved in the program in 1968-1969, 1969-1970, or 1970-1971 were followed up in 1981, when they were high school seniors. Over the three cohorts, more than 63% of the Direct Instruction students graduated from high school, as opposed to 38% of the control group. An average of 21% of the Direct Instruction students were retained compared to 33% of the control students. The Direct Instruction students had a lower dropout rate (28%) than the control group (46%) over the three cohorts. More of the Direct Instruction group students (34%) applied to college than the control group (22%), and more of the Direct Instruction group students were accepted for admission to college over the three cohorts (34%) than were the control group students (17%).
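The cohort comparisons reported by Meyer (1984) can be restated as percentage-point gaps. The short Python sketch below simply subtracts each control-group percentage from the corresponding Direct Instruction percentage. The variable and function names are illustrative only, and the graduation figure of 63 is a lower bound, since the text reports "more than 63%".

```python
# Percentages reported for Meyer (1984), as (Direct Instruction, control).
# "graduated" uses 63 as a lower bound because the text says "more than 63%".
outcomes = {
    "graduated": (63, 38),
    "retained": (21, 33),
    "dropped_out": (28, 46),
    "applied_to_college": (34, 22),
    "accepted_to_college": (34, 17),
}


def percentage_point_gaps(table):
    """Return Direct Instruction minus control, in percentage points."""
    return {name: di - control for name, (di, control) in table.items()}


gaps = percentage_point_gaps(outcomes)
for name, gap in gaps.items():
    print(f"{name}: {gap:+d} percentage points")
```

Negative gaps for retention and dropout favor the Direct Instruction group, because lower rates are the better outcome on those two measures.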
The follow-up evaluation also compared ninth-grade mean reading and math scores in grade equivalents. Overall, students in the Direct Instruction cohort outperformed students in the control group in reading (ES=+.41) and in math (ES=+.29). The current version of DISTAR, called Reading Mastery, is commercially published and used throughout the U.S.
Exemplary Center for Reading Instruction
The goal of the Exemplary Center for Reading Instruction (ECRI; Reid, 1989) is to improve elementary students' reading ability. This program emphasizes such reading-related skills as word recognition, study skills, spelling, penmanship, proofing, and writing skills, leading to improvement in decoding, comprehension, and vocabulary. ECRI teachers expect all students to excel. The lessons for ECRI are scripted, and incorporate multi-sensory and sequential methods and strategies of teaching. In a typical lesson, teachers introduce new concepts in lessons using at least seven methods of instruction, teaching at least one comprehension skill, one study skill, and a grammar/creative writing skill. Initially, students are prompted for answers by teachers. As the students begin to master the information presented, fewer and fewer prompts are provided until students can perform independently. In one evaluation of ECRI (Reid, 1989), researchers investigated the effects of ECRI on students in grades 2-7 in Morgan County, Tennessee, and compared them to students in a control group who were using a commercial reading program. Both schools were tested using Stanford Achievement Test reading comprehension and vocabulary subtests. ECRI students outperformed those in the control group, with effect sizes ranging from +.48 to +.90 in reading comprehension, and from +.31 to +1.40 in vocabulary. Another evaluation of the effectiveness of ECRI on Latino bilingual students in Oceanside, California; Killeen, Texas; and Calexico, California (Reid, 1989) showed NCE gains that ranged from +6.4 to +25.7.
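NCE gains of this kind are easier to read with the scale's definition in hand. The sketch below assumes the standard definition of the normal curve equivalent, NCE = 50 + 21.06z, where z is the normal deviate corresponding to a national percentile rank. The function names are illustrative and are not drawn from Reid (1989) or any other source cited here.

```python
from statistics import NormalDist

# Scale factor chosen so that NCEs of 1, 50, and 99 line up with
# percentile ranks of 1, 50, and 99 (standard NCE definition, assumed here).
NCE_SCALE = 21.06


def percentile_to_nce(percentile):
    """Convert a percentile rank (strictly between 0 and 100) to an NCE."""
    z = NormalDist().inv_cdf(percentile / 100.0)
    return 50.0 + NCE_SCALE * z


def nce_to_percentile(nce):
    """Convert an NCE score back to a percentile rank."""
    z = (nce - 50.0) / NCE_SCALE
    return 100.0 * NormalDist().cdf(z)


# Hypothetical example: a student starting at the 30th percentile
# who gains 6.4 NCEs (the smallest gain in the range reported above).
start_nce = percentile_to_nce(30)
end_percentile = nce_to_percentile(start_nce + 6.4)
print(round(start_nce, 1), round(end_percentile, 1))
```

Because NCEs form an equal-interval scale, gains can be averaged across students in a way that percentile ranks cannot. An NCE gain without a control group still does not show what students would have achieved without the program.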
ECRI is used in hundreds of schools nationwide.
Reciprocal Teaching
Reciprocal Teaching (Palincsar & Brown, 1984) is a reading program designed to improve the reading comprehension of children in elementary and middle schools that emphasizes cognitive strategies of scaffolding through dialogue. The two main components of Reciprocal Teaching are comprehension fostering, which includes the four strategies of question generation, summarization, prediction, and clarification; and dialogue, which includes prepared conversations and questions that guide the comprehension process and product. The program uses a scaffolding process, in which teachers are initially more responsible for producing questions, guiding the dialogue, and showing the students how to comprehend text. Eventually, the students become more responsible for the products, creating questions for each other and guiding the dialogue with less teacher input. A typical Reciprocal Teaching session begins with students reading an initial paragraph of expository material, with the teacher modeling how to comprehend the paragraph. The students then practice the strategies on the next section of the text, and the teacher supports each student's participation through specific feedback, additional modeling, coaching, hints, and explanation. The strategies include commenting and elaborating on summaries of paragraphs, suggesting additional questions, providing feedback on their peers' predictions, and requesting clarification of material not understood. Although Reciprocal Teaching has several important components that distinguish it from other reading approaches, it is flexible. For example, in some forms of Reciprocal Teaching, the cognitive dialogue precedes the text reading exercise, and in other forms cognitive dialogue takes place while the students are reading the text. A meta-analysis of the achievement effects of Reciprocal Teaching was carried out by Rosenshine & Meister (1994).
Sixteen studies representing different levels of implementation (high, medium, and low) and different methods of teaching were synthesized. High implementation studies included dialogue, questions, and assessment of student learning strategies; medium level studies included dialogue but did not include assessments; and low level studies had neither dialogue nor assessment information. The meta-analysis investigated how Reciprocal Teaching students performed on standardized and experimenter-made tests as compared to their control-group peers. The overall effect size for performance on standardized tests was +.32, but only in two cases did the Reciprocal Teaching students do significantly better on standardized tests than their control group counterparts. Effect sizes were much higher on the experimenter-made tests (ES=+.88). In several cases, effect sizes were lower in studies in which implementations were rated as low in quality, but there were few differences between the outcomes of high and medium quality implementations.
Profile Approach to Writing (PAW)
The Profile Approach to Writing (PAW, 1995; Hughey & Hartfiel, 1979; Jacobs, Zinkgraf, Wormuth, Hartfiel, & Hughey, 1981; Hughey, Wormuth, Hartfiel, & Jacobs, 1985; Hartfiel, Hughey, Wormuth, & Jacobs, 1985) is a program that provides professional development in creative writing to students in grades 3-12. The program emphasizes a process of drafting and revision of compositions, and makes use of a writing profile to assess and guide student writing performance. The profile is a holistic/analytic scale that assesses content, organization, vocabulary use, language use, and mechanics in students' compositions. Several evaluations of the Profile Approach to Writing have been carried out by the program developers (Profile Approach to Writing, 1995). One of these compared students in a predominantly (55%) Latino middle school in Texas to a control group.
Students in the experimental and control groups were pre- and posttested on the project's own Composition Profile, the 100-point holistic/analytic scale used in the instructional program. Experimental and control students were similar in scores at pretest. Students in the PAW school gained significantly more than those in the control group (ES=+.69) in a year-long comparison. Other less well-controlled evaluations on district-administered tests also found positive effects of PAW in middle and high schools. A methodological
limitation of the main experimental-control comparison is the fact that
it used the project's own evaluation instrument, which teachers and
students had been using all year. However, holistic/analytic writing
comparisons of this kind are common in many writing performance measures
and are widely accepted by writing curriculum experts. The replicability
of PAW has been amply demonstrated. The program is in use in more than
1000 schools and has certified trainers in seven states.

Multi-Cultural Reading and Thinking (McRAT)
Multi-Cultural Reading and Thinking (McRAT) is a writing program that trains teachers to improve students' academic achievement by adding multi-cultural themes to all areas of the curriculum in grades 3-8. The program, developed by the Arkansas Department of Education (Quellmalz & Hoskyn, 1988; Arkansas Department of Education, 1992; Quellmalz, 1987), is intended to make students better readers and writers by adding multi-cultural and problem-solving components to all areas of the curriculum. McRAT does not exist as a stand-alone program, but works with the existing school curriculum. It strives to teach children to think critically about what they read in class, so that they can apply these critical processes to their writing and to real-life situations in which people of different backgrounds have to learn to work and live together. Specific skills that the children are taught include analysis, comparison, inference/interpretation, and evaluation, and these skills are used in all areas of the curriculum. In the study that evaluated the effects of McRAT on achievement, students represented a range of socio-economic status backgrounds, achievement levels, and ethnic backgrounds. This evaluation (Arkansas Department of Education, 1992) studied the effects of McRAT on achievement scores in the specific cognitive areas that the students were taught in the program. McRAT students were compared to matched control students. The students in the treatment group were 32% minority, 15% gifted and talented, and 25% Title I students. In the control group, the students were 30% minority, 15% gifted and talented, and 10% Title I students. Students in both the experimental and control groups were using the same curriculum, the only difference being that students in the experimental group had McRAT-trained teachers.
Students in this sample included 234 fourth-, fifth-, and sixth-grade McRAT students, and 106 fourth-, fifth-, and sixth-grade non-McRAT students. Teachers in the treatment group were either in their first or second years of McRAT implementation. Students in both groups were assessed using an assessment measure created by the treatment group in September and again in May. The McRAT students outperformed the control students in the areas of analysis (ES=+.41), inference (ES=+.57), comparison (ES=+.65), and evaluation (ES=+.45). McRAT joined the National Diffusion Network in 1993, is currently used in 44 schools in Arkansas, and is also being disseminated nationally.
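The effect sizes (ES) reported for these programs can be read as standardized mean differences. The following is a minimal sketch of that arithmetic, assuming an effect size is the experimental-minus-control difference in posttest means divided by the control group's standard deviation. That convention is an assumption here, since the passages above report only the resulting values, and the individual scores in the example are hypothetical, not data from any study cited. The last lines simply average the four McRAT effect sizes given above.

```python
from statistics import mean, stdev


def effect_size(experimental, control):
    """Standardized mean difference: (mean_E - mean_C) / SD_C.

    Assumes the control group's sample standard deviation is the
    denominator; individual studies may use a pooled SD instead.
    """
    return (mean(experimental) - mean(control)) / stdev(control)


# Hypothetical posttest scores, for illustration only.
experimental_scores = [14, 16, 15, 17, 18]
control_scores = [12, 14, 13, 15, 16]

es = effect_size(experimental_scores, control_scores)
print(f"Illustrative effect size: {es:+.2f}")

# The four McRAT effect sizes reported in the text above.
mcrat = {"analysis": 0.41, "inference": 0.57, "comparison": 0.65, "evaluation": 0.45}
mcrat_mean = mean(mcrat.values())
print(f"Mean McRAT effect size: {mcrat_mean:+.2f}")
```

An effect size of +.50 on this definition means the average experimental student scored half a control-group standard deviation above the average control student.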
Four mathematics
programs met the inclusion standards applied in this review.

Comprehensive School Mathematics Program
The Comprehensive School Mathematics Program (CSMP, 1995) is a math program for grades K-6 that emphasizes problem solving rather than drill and practice lessons. CSMP strives to teach children the mathematical thinking skills and concepts that they need to use when confronted with new math problems. The contents of the CSMP curriculum range from basic skills such as addition and subtraction to more abstract skills such as probability, statistics, and classification using higher order thinking skills, understanding of concepts, and algorithmic thinking. The program incorporates the use of calculators and computers. CSMP uses different types of "languages" for performing different types of mathematical functions. The language of strings, for example, is used to gather data, the language of arrows places the different components of the mathematical problem into sets, and the language of a minicomputer allows the children to compute different problems using an abacus. Students also use manipulatives, such as tiles and blocks, to solve their problems. The materials used in CSMP were developed in classrooms in the Carbondale, Illinois, and University City, Missouri school districts, both of which are integrated (20-50% African-American) middle class communities. Disproportionately high numbers of both high- and low-performing students were included in the program development. After the initial pilot testing, the materials were tested nationwide. CSMP was developed, evaluated, and initially disseminated by CEMREL, a former education laboratory in St. Louis. Two research designs were used to evaluate CSMP (Comprehensive School Mathematics Program, 1995). The first design controlled for teacher effects: Teachers taught the regular curriculum during the first year and the CSMP curriculum during the second year.
In the second design, CSMP classes were matched with a control group studying the regular curriculum. In both designs, students were given a problem solving test called the Mathematics Applied to Novel Situations (MANS) test, which was created by CEMREL. The CSMP students outscored the control students in the second, third, and sixth grades, with effect sizes of +1.26, +.22, and +.30, respectively. In the fourth and fifth grades, the non-CSMP students outperformed the CSMP students, with effect sizes of -.16 and -.32, respectively. CSMP, which has been
an NDN program since 1978, is now disseminated by another educational
laboratory (MCREL, in Aurora, CO) and has been used in districts throughout
the United States.

Cognitively Guided Instruction
Cognitively Guided Instruction (CGI) (Carpenter, Fennema, Peterson, Chiang, & Loef, 1989; Carey, Fennema, Carpenter, & Franke, 1993) is a mathematics program designed to develop student problem solving in the early elementary grades. CGI was created to teach the teachers of first grade students about problem-solving processes that their students use when solving simple arithmetic and complex mathematics problems and to train the teachers to create curricula consistent with new understandings of how children learn. Following extensive training, CGI teachers create units and themes to last the entire school year. In an evaluation of CGI (Carpenter, Fennema, Peterson, Chiang, & Loef, 1989), forty teachers were randomly assigned to either a control or a treatment group. CGI as well as control teachers had volunteered to participate in a summer in-service program that would last four weeks, and also to be observed in their classrooms during mathematics instruction in the following year. Teachers in both of the groups were involved in problem solving workshops, but one was a CGI workshop, and the other was a generic problem-solving workshop. Teachers in the CGI workshop, for instance, learned that they should closely relate problem solving to basic skills competency, and that problem solving should be the main focus of the mathematics lessons. They also learned that students should use prior knowledge when solving problems and be able to link what they already know to new problems that they may be solving. Teachers in the CGI workshop learned about teaching children conceptual problem solving, and the teachers were familiarized with curricular materials available for instruction. Finally, CGI teachers were asked to write a mathematics curriculum, based on what they had learned at the CGI workshops, that would span the academic year.
Teachers in the control groups also participated in problem solving exercises for a similar amount of time. The teachers learned about the general concept of problem solving, but did not discuss how to understand how children solve problems or how to write a curriculum that would help children to solve problems based on this information. All students were given the Iowa Test of Basic Skills (ITBS) level 6 as a pretest in September, and the computation subtest of the ITBS level 7 was used as the written posttest of computation in April-May. Interviews were also conducted with the students. Student achievement results showed that CGI students outscored their control group counterparts in computations (specifically in number facts) and in problem-solving that involved complex addition/subtraction. Interviews found that treatment students also had better attitudes toward math and felt more confident that they could perform complex mathematics. A second study of CGI evaluated the effectiveness of the program among low-income minority students (Villaseñor & Kepner, 1993). Twelve experimental and twelve control teachers were randomly assigned to CGI and control classes in Milwaukee. Minority populations ranged from 57% to 99%, primarily Latino or African-American. A 14-item arithmetic word-problem test focusing on higher level cognitive processes (Carpenter, Fennema, Peterson, Chiang, & Loef, 1989), developed by the creators of CGI, was administered as a pretest in early October, and again as a posttest in late February and early March. On the pretest, students in the experimental group had a slightly higher mean score than did students in the control group. Controlling for these differences, the experimental students still outscored their control-group counterparts. CGI is currently being implemented in several states, and training programs for the model have been established in Wisconsin, North Carolina, and Ohio.

Project SEED
Project SEED (Johntz, 1966, 1975; Phillips & Ebrahimi, 1993; Hollins, Smiler, & Spencer, 1994; Project SEED, 1995) is an enrichment mathematics program designed to teach elementary school students, particularly low-income and minority students, to develop confidence in their ability to be successful in all academic work, giving them the grounding to help them to face challenging academic situations. Students participating in Project SEED are helped to improve their mathematics achievement skills and to continue to take classes in abstract and advanced mathematics. Project SEED hires and trains mathematicians, scientists, and engineers to teach students in the targeted population. Project SEED mathematics specialists then go into the classroom and introduce abstract mathematical concepts using a discovery method based on Socratic questioning, always making students active participants in the lessons. The Project SEED curriculum does not take the place of the regular mathematics curriculum, but is a supplement to it. When the Project SEED mathematics specialists teach the students, the regular classroom teachers remain in the classroom and observe and participate in what is being taught. Students involved in the program are expected to learn using dialogue, choral responses, discussion, and debates. In addition to teaching the students, the Project SEED mathematics specialists conduct workshops with the regular classroom teachers. Part of ongoing staff development includes Project SEED mathematics specialists observing and critiquing each other in the classroom at work and attending internal workshops.
A study that evaluated the effects of one semester of Project SEED in Detroit (Webster & Chadbourn, 1992) compared the California Achievement Test (CAT) scores of 244 fourth grade students in SEED classrooms to those of 244 fourth grade students in SEED schools, but not in SEED classrooms (non-SEED), and to those of 244 fourth grade students neither in SEED schools nor in SEED classrooms (comparison group) during the 1991-92 academic year. Students in all three groups were matched based on gender, ethnicity, free or reduced lunch status, and third grade CAT scores. The SEED students outscored comparison group students in total math scores (ES=+.37), math computation (ES=+.38), and math concepts (ES=+.32). The non-SEED students in SEED schools also outscored the comparison students in all three areas, with effect sizes of +.17, +.23, and +.13 respectively. When the SEED and non-SEED schools were compared, students in the SEED group also outperformed students in the non-SEED groups on math total (ES=+.19), math computation (ES=+.16) and math concepts (ES=+.19). The effect of one semester of SEED was also evaluated in Dallas in a Project SEED longitudinal evaluation study (Webster & Chadbourn, 1992). The Dallas evaluation involved eleven elementary learning centers (South Dallas Learning Centers and West Dallas Learning Centers). Students in the South Dallas Learning Centers were 80% African-American, and students in the West Dallas Learning Centers were mostly Latino. There was a total of 10,890 Project SEED and matched comparison students. The treatment students were those who had been involved in Project SEED for at least one semester between 1982 and 1991. Students were administered either the Iowa Tests of Basic Skills (ITBS) or the Norm-Referenced Achievement Program for Texas (NAPT). The test scores between the control and experimental groups were equivalent at the beginning of the experiment. 
Students were tested on three ITBS scales: Concepts and problem solving, computation, and mathematics total. As with the Detroit study, SEED students significantly outscored the non-SEED students on all scales. The cumulative effects of Project SEED on students after one, two, and three semesters of involvement were also investigated (Webster & Chadbourn, 1992). A total of 3,092 students in five different settings were matched with control students on the basis of grade level, total mathematics achievement score, gender, ethnicity, and socioeconomic status, as determined by free lunch program participation. Beginning in the fourth grade, students in the treatment group received either one, two, or three semesters of Project SEED. Students were matched with students in other schools who did not receive SEED instruction, but may have received other types of intervention. Students were pre-tested on the ITBS, and after 1991, the Norm Referenced Achievement Test for Texas (NAPT). In every case except one out of thirty comparisons, the Project SEED students significantly outperformed the students in the control groups on the post-tests for both the NAPT and the ITBS, and the more semesters that a student had been involved in Project SEED (up to the maximum of three semesters), the greater the cumulative effect of Project SEED. A follow-up study (Webster & Russell, 1992) sought to evaluate the retention of mathematics skills after students had left Project SEED. This study included a total of 1,215 matched students from the previous study. Students who had been involved in the project for only one semester were followed for five years after their involvement, and students who had received three semesters of Project SEED in grades 4-6 were followed through the 1991-1992 school year. 
Overall, all Project SEED students, regardless of how long they had been in the program, still outscored the non-SEED students on the ITBS/NAPT up to two years after their Project SEED participation ended. More specifically, students who had been involved in Project SEED for one semester retained their mathematics skills for at least two years after they had left the program, and students who had been involved in the program for three semesters still retained their skills between two and five years after they had left the program. In the final Dallas longitudinal follow-up study of Project SEED (Webster, 1995), results showed that students who had been involved in Project SEED were more likely to enroll in advanced mathematics classes in the 9th, 10th, and 11th grades than were students who had not been involved in Project SEED. Project SEED currently exists in Texas, Michigan, Indiana, Pennsylvania, and California, and is validated by the National Diffusion Network (see Project SEED, 1995).

Skills Reinforcement Project (SRP)
The Skills Reinforcement Project (Mills, 1992; Skills Reinforcement Project, 1984, 1992, 1995) was developed by the Johns Hopkins University Center for Talented Youth (CTY). CTY began as a program for gifted or "highly able" students, but it later added SRP, which is specifically designed for use with minority or low socioeconomic students who are likely to be underrepresented in advanced mathematics. The program was written to prepare fifth through eighth grade students to succeed in advanced level mathematics, with hopes that they would eventually become involved in mathematics and science careers. Staff in schools that adopt SRP attend training sessions before the implementation and during the year. SRP schools have a coordinator, who oversees the general management of the program at the school, and also oversees teacher training, curriculum development, and program evaluation.
In addition to this, SRP schools involve a site director who acts as a facilitator for the program. Students involved in the SRP program are volunteers. They attend Saturday school during the school year, and then they participate in a two-week summer residential program. Students are initially assessed and then teaching is based on the results of this testing. The SRP program provides a balance of individualized instruction and cooperative learning. The content of the SRP curriculum ranges from arithmetic concepts and skills to more advanced areas of study such as algebra, geometry, and statistics. Research on SRP has been done at two sites in California, at schools with populations that are 40% African-American, 40% Latino, and 20% other, with a majority of the minority students qualifying for free lunches. The research design for all of the evaluations consisted of pre-post experimental/control comparisons. Student participants in both the control and treatment groups were volunteered by parents, and had to score between the 80th and 95th percentiles on the California Achievement Test (CAT). Students who met the criteria were randomly assigned to the SRP and control conditions, where the experimental students received substantial additional mathematics instruction, and the control group students received no extra mathematics instruction. The students were also equivalent on the basis of gender, ethnicity, income-level, and mean pretest scores. In addition to the CAT, the Sequential Tests of Educational Progress II (STEP) were used as pre- and posttests. The School and College Ability Test (SCAT) was used to assess mathematical reasoning ability. The first evaluation was done in Pasadena, California (Lynch & Mills, 1990). In this study, 32 SRP and 32 control sixth graders were administered the CAT and the STEP while they were in the sixth grade, and again nine months later in the fall of seventh grade.
Adjusting for pretest differences, SRP students outperformed their control group counterparts (ES=+.41). A replication study was also done in Pasadena (SRP, 1992). This study involved 38 students; 19 in the control group, and 19 in the experimental group. In this study also, SRP students outscored the control group on both the SCAT (ES=+.72) and on the CAT/STEP tests (ES=+.73). The third evaluation was done in Los Angeles (Mills, Stork, & Krug, 1992). This study involved 54 students; 28 SRP students and 26 students in the control group. Once again, SRP students outscored the students in the control group on the SCAT (ES=+.55) and on the CAT/STEP (ES=+1.35). It is important to note that the evaluation of SRP does not compare one instructional method to another, but instead compares additional mathematics instruction to no extra instruction. SRP is currently
being used in three California school districts.

Maneuvers With Mathematics
Maneuvers With Mathematics (MWM) was founded at the University of Illinois at Chicago (Page, 1989; Long, 1993; MWM, 1995). This program was designed to teach students in grades 5-8 advanced mathematics problem solving. The goal of MWM is to motivate students to use mathematics in a creative manner, while still learning basic arithmetic skills. MWM trainers attend training sessions in summer institutes. An emphasis of MWM is on training both the teachers and students to use calculators to solve both simple arithmetic and complex geometry and advanced mathematics problems. Students are shown how math is used every day, for example in cooking, traveling, building houses, and using money. They use specific books created by MWM, which stress problem solving, re-checking answers, and using mathematics in real-life situations. Teacher guides provide alternative ways of presenting topics and concepts to the students. The main evaluation of this program was done in 1991. This evaluation involved 617 MWM students matched with 223 control students (MWM, 1991). The students in both groups exceeded the state norms in mobility and in the number of low-income and limited English proficient (LEP) students. At the beginning of the year, students in both groups were administered pretests created by the Second International Mathematics Study (SIMS) and the National Assessment of Educational Progress (NAEP). The same tests were also used as posttests at the end of the school year. Students were not allowed to use calculators on these tests. Adjusting for pretest differences, the MWM students outperformed the students in the control group (ES=+.47). At each individual grade level, MWM students made better gains than the students in the control groups (ES = +.12, +.54, +.59, and +.86 in the fifth, sixth, seventh, and eighth grades, respectively).
MWM is validated by the National Diffusion Network and is currently used in all fifty states.
One way to increase the probability that students will succeed in school is to provide them with high-quality experiences before they enter school. This section briefly reviews research on Head Start, the source of prekindergarten programs for most disadvantaged students, and on two specific approaches to early childhood education. In addition, preschool and kindergarten curricula are part of the Success for All/Lee Conmigo and Roots & Wings programs, described earlier.

Head Start
The largest Federal investment in early childhood education is Project Head Start (Zigler & Muenchow, 1992; Zigler & Valentine, 1973). Head Start began as one of President Johnson's War On Poverty programs in 1965. The goal of Head Start was to provide young children (mainly four year olds) with social and cognitive competence, by addressing certain specific outcomes felt to increase the likelihood that students would succeed when they entered elementary school. It was designed to achieve these outcomes through seven service components: education, parent involvement, mental health, physical health, nutrition, social services, and disabled student services or special needs. Head Start has served millions of children since its inception in 1965, and its effects have been extensively evaluated. Like Title I, Head Start is a funding source, not a specific program. Thus it is difficult to evaluate Head Start as a whole, as many different Head Start centers have different curriculum goals. Studies have shown that overall, the program is effective in helping children to adjust to kindergarten and elementary school (McKey, Condelli, Ganson, Barrett, McConkey, & Plantz, 1985), in including parents as participants in their children's education, and in seeing that children are up-to-date on their immunizations. Evaluations of the academic achievements of students who attend Head Start schools and centers generally find positive effects on early cognitive measures (such as IQ tests).
Karweit (1989, 1994)
and Stein, Leinhardt, & Bickel (1989) reviewed the effects of Head
Start programs, and their syntheses found that Head Start showed immediate
improvement on cognitive functioning (ES=+.52). After the first year,
the effects decreased substantially (ES=+.10), and decreased further
during the second and third years (ES =+.08 and +.02 respectively).
Longitudinal studies of the Perry Preschool program, described below,
have found positive effects of preschool participation on such outcomes
as high school graduation and delinquency, but there is little indication at
any age that attending Head Start or other early childhood programs
increases performance on measures of school achievement, such as reading
or math scores.

Perry Preschool/High Scope
One of the most extensively researched curriculum-specific early childhood education programs is the Perry Preschool Curriculum (Weikart, Rogers, Adcock, & McClelland, 1971). The creators of the Perry Preschool Curriculum believe in empowering the family, the child, and the teacher, as in Head Start programs, but the Perry Preschool program also has specific academic goals for participants in the program and its developers created a specific curriculum to accomplish these goals. Based on Piaget's theories of cognition, the Perry Preschool curriculum seeks to increase academic achievement and reduce students' chances of being placed in special education classes by teaching them to become active learners. The teacher acts as a facilitator of knowledge who sets up the classroom in such a way that the student is provided with the opportunity to learn math, science, reading, art, music, social studies, and movement every day. Students choose what they wish to study or work with, but the teacher is expected to be available to answer any questions and clarify any misunderstandings that students may have. The Perry Preschool model has been evaluated to investigate both short-term and long-term outcomes with at-risk preschoolers. As with other preschool programs, the Perry Preschool program has shown immediate (end of the year) positive effects on cognitive measures such as I.Q., but these effects are not maintained into elementary school.
In addition to the cognitive gains made by students who had attended Perry Preschool programs, a longitudinal evaluation of the effects of the Perry Preschool program on at-risk students (Schweinhart & Weikart, 1980; Schweinhart, Weikart, & Larner, 1986a; Schweinhart, Weikart, & Larner, 1986b) showed that children involved in these programs tended to stay in school longer, had fewer cases of teenage pregnancies and juvenile arrests, were retained less, were less likely to drop out of school, were more literate, were more likely to be employed, and were more likely to attend college or vocational school than students in control groups who had had no preschool experience. Evaluations of the long-term effects of the program on social adjustment showed that when students in three preschool groups (Direct Instruction, High/Scope, and nursery) were compared on self-reported delinquency, High/Scope students were least likely to have committed delinquent acts, followed by students who had attended traditional nursery school, and then by students involved in Direct Instruction. A twenty-two year follow-up study done on 95% of the participants involved in the original High Scope study (Schweinhart, Barnes, Weikart, Barnett, & Epstein, 1993) showed that High/Scope graduates still had a smaller chance of being arrested than the control group (35%), earned approximately $2,000 per month more than non-program members, were more likely to own their own home (36%) than non-program participants (13%), and had a higher rate of high school graduation (71%) than the control group students (54%). The High Scope curriculum
exists today in all 50 states. The program also provides an early elementary
curriculum that is used around the nation.

Early Intervention for School Success (EISS)
Project Early Intervention for School Success (EISS, 1986; Rogers, 1993) is an early intervention program developed under special funding from the California Legislature to provide low-income children with early education opportunities to help them become successful learners and thinkers. The legislative intent of this program was threefold: First, to establish a system to identify pupils at the ages of 4 to 7 who may be at-risk; second, to implement appropriate instructional programs to reduce the frequency and severit |