
An analysis of the concurrent and predictive validity of curriculum-based measures (CBM), the Measures of Academic Progress (MAP), and the New England Common Assessment Program (NECAP) for reading

Dissertation
Author: Kristina J. Andren
Abstract:
This study examined the concurrent validity of four different reading assessments that are commonly used to screen students at risk for reading difficulties by measuring the correlation of the third grade Measures of Academic Progress (MAP) in Reading with three specific versions of curriculum based measurement: DIBELS oral reading fluency (ORF), AIMSweb ORF, and AIMSweb Maze. In addition, correlations were calculated among each of these measures with the third grade New England Common Assessment Program (NECAP) measure of reading achievement. Multiple regression analyses also were conducted to provide information on the predictive validity of CBM and MAP in determining risk for reading difficulty, as measured by a high stakes assessment (e.g., NECAP). Reading performance data were collected on 137 third grade students in the fall and winter. Significant correlations were found among each measure of reading at each point in time (p < .001). Correlations ranged from .972 (DIBELS ORF and AIMSweb ORF) to .621 (Maze and NECAP). Within each measure, ORF had the highest correlations between fall and winter measures (r = .952), followed by the MAP (r = .872) and Maze (r = .746), respectively. Regression analyses revealed that the MAP assessment in the fall best predicted MAP scores in the winter (p < .001), followed by oral reading fluency (p < .05). MAP was also the best predictor of NECAP scores for the general population of students (p < .001), as well as those students receiving supplemental reading support (p < .001). When MAP was removed from the equation, ORF was the most significant predictor of performance on the NECAP for general education (p < .001) and at-risk readers (p < .001). Educational implications and suggestions for further research are discussed.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS
LIST OF TABLES
Chapter
1. INTRODUCTION & LITERATURE REVIEW
2. METHOD
   Participants
   Measures
   Procedures
3. RESULTS
   Descriptive Statistics
   Correlation Analysis
   Regression Analysis
4. DISCUSSION
   Concurrent and Predictive Validity of Reading Measures
   Educational Implications
   Limitations and Future Research
5. SUMMARY
REFERENCES
BIOGRAPHY OF THE AUTHOR

LIST OF TABLES

Table 1: Demographic Data
Table 2: Means and Standard Deviations (SD) for Each Measure of Reading by Group
Table 3: Intercorrelations Between Reading Measures in September and January, Overall Sample
Table 4: Intercorrelations Between Reading Measures in September and January for Students in Tier 1
Table 5: Summary of Multiple Regression Analysis for Variables Predicting Winter MAP Scores
Table 6: Summary of Multiple Regression Analysis for CBM Variables Predicting Winter MAP Scores
Table 7: Summary of Multiple Regression Analysis for Variables Predicting NECAP Scores (N = 137)
Table 8: Summary of Multiple Regression Analysis for CBM Variables Predicting NECAP Scores
Table 9: Summary of Multiple Regression Analysis for Variables Predicting NECAP Scores in Sample of Lower Performing Readers
Table 10: Summary of Multiple Regression Analysis for CBM Variables Predicting NECAP Scores in Sample of Lower Performing Readers

Introduction and Literature Review

With recent educational reforms, schools nationwide are experiencing increased demands for assessment, accountability, and evidence-based practices in education. The No Child Left Behind Act (NCLB) of 2001 includes a requirement that academic instruction and assessment methods in both general and special education be scientifically based (U.S. Department of Education, 2002). The 2004 reauthorization of the Individuals with Disabilities Education Improvement Act (IDEIA) incorporated requirements for Response to Intervention (RTI) initiatives, including scientifically based reading instruction, evaluation of students' response to intervention, and data-based decision making (IDEIA, 2004). RTI is a three-tiered model of assessment and intervention that involves systematic, data-based decision making. The process includes universal screening, early intervention, and ongoing progress monitoring (Brown-Chidsey & Steege, 2005). A wide variety of assessment measures have been used within an RTI model; these measures differ in technical adequacy, administration setting and procedures, skills tested, and time and financial requirements. Curriculum-based measurement (CBM) is a method of formative evaluation within a problem-solving model that uses brief, standardized tests of basic academic skills (Deno, 2005; Shinn, 2002). The Northwest Evaluation Association (NWEA) Measures of Academic Progress (MAP) are computerized adaptive tests that measure academic achievement in the areas of mathematics, reading, and language usage. The MAP is designed to provide academic achievement data so that educators can gather school-wide data and develop targeted instruction for individuals and groups of students. The purpose of this study was to examine the correlation of the MAP in Reading with three specific versions of curriculum-based measurement using the Dynamic Indicators of Basic Early Literacy Skills (DIBELS) and AIMSweb CBM
reading items. In addition, the predictive qualities of both the MAP and CBM in relation to student performance on the New England Common Assessment Program (NECAP) were examined. This analysis provides information on the concurrent and predictive validity of these four reading measures.

A cornerstone of RTI is a technically sound method for gathering data on student progress, in order to identify students in need of additional instruction and to determine whether the instruction is effective. It is important that the assessment method is valid and reliable, efficient and cost-effective to administer, and sensitive to student growth (Deno, 2005; Shinn, 2002). Curriculum-based measurement is one tool for gathering universal screening and progress monitoring data that meets these three requirements (Deno et al., 2009; Stecker, Lembke, & Foegen, 2008). In addition, teachers' use of CBM within a data-based decision-making model has been associated with improvements in student achievement, providing evidence for its utility in guiding instructional decisions (Deno et al., 2009; Stecker, Fuchs, & Fuchs, 2005).

In reading, there are two primary curriculum-based measures: oral reading fluency (ORF) and maze tasks. Examples of oral reading fluency probes are the DIBELS and AIMSweb. For these instruments, students are given a grade-level reading passage and are asked to read aloud for one minute. The metric for this assessment is the number of words read correctly in one minute (Good, Gruba, & Kaminski, 2002). Maze procedures use reading probes in which every seventh word is removed and replaced with a choice of three words, one of which is the correct word and two of which are distracters. The task is administered in a group setting; students are instructed to read the passage for three minutes, circling the correct word for each blank space. Scores on a maze
task represent the total number of words circled correctly in the given time period (AIMSweb, 2008).
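The construction rule just described (delete every seventh word; offer the original plus two distracters) is simple enough to sketch in code. The following Python fragment is purely illustrative and not part of the study's materials; the function name and distracter pool are hypothetical.

```python
import random

def build_maze_probe(passage, distracter_pool):
    """Build a maze probe: every seventh word is replaced by a choice
    of three words -- the original plus two distracters -- following
    the maze format described above (AIMSweb, 2008)."""
    words = passage.split()
    probe = []
    for position, word in enumerate(words, start=1):
        if position % 7 == 0:
            choices = [word] + random.sample(distracter_pool, 2)
            random.shuffle(choices)  # randomize where the correct word appears
            probe.append({"choices": choices, "answer": word})
        else:
            probe.append(word)
    return probe
```

A published probe generator would also screen distracters so that neither fits the sentence, but the every-seventh-word counting rule is the heart of the format.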
Numerous studies have demonstrated a relationship between CBM data and student performance on standardized measures of achievement (Deno et al., 2009; Hintze & Silberglitt, 2005; Silberglitt, Burns, Madyun, & Lail, 2006). These findings are important because they suggest that brief CBM can help predict performance on high stakes tests, thereby providing opportunities for early identification and intervention. Ardoin et al. (2004), for example, investigated the correlations among oral reading fluency, maze, a group-administered achievement test, and reading subtests of the Woodcock-Johnson Tests of Achievement, Third Edition (WJ-III). Seventy-seven third grade students were given all four assessments, and correlations, t-tests to measure differences in correlations, and multiple regression analyses were conducted. All correlations between ORF, maze, and the WJ-III subtests were statistically significant. ORF was more closely related than the maze to the WJ-III, and adding the maze did not significantly increase the predictive power of ORF. The authors concluded that although both curriculum-based measures correlated significantly with the WJ-III, ORF was a better predictor of overall reading achievement and reading comprehension.

Wiley & Deno (2005) compared the predictive value of ORF and maze tasks by administering both to a group of third and fifth grade students and correlating their scores with a state standards test. Moderate correlations were found between both CBM measures and the state assessment. Furthermore, combining ORF and maze increased the predictive power. This finding did not hold, however, for English language learners (ELLs). For ELLs, maze was a better predictor than ORF, while ORF better predicted statewide test scores for non-ELL students. Overall, these results support the use of CBM in reading for screening and progress monitoring.

Similarly, Deno et al. (2009) investigated the use of a maze task as a universal screening measure by examining the relationship between performance on the maze and a standardized test of reading. Correlations between the two reading measures ranged from .61 to .77. In addition, school-wide data indicated that maze scores increased steadily with each grade level over the course of two school years, providing support for its use as a progress monitoring measure. The authors concluded that, given its evidence of validity and utility in identifying students at risk and its group administration format, the maze procedure is an efficient and effective universal screening measure that provides clear data within a school-wide RTI model.

Several studies have examined the correlation between ORF and standardized reading tests over a longer time period and found similar results. In a replication of a study by Stage & Jacobsen (2001), which found significant correlations between oral reading fluency and a state reading test for fourth grade students, McGlinchey & Hixson (2004) examined a larger sample of students and tracked performance over eight years. The researchers found moderately strong correlations (.67 on average) between a single ORF probe and performance on the Michigan Educational Assessment Program assessment of reading. In another longitudinal study, Keller-Margulis, Shapiro, & Hintze (2008) found moderate to strong correlations between CBM data and achievement test scores in reading one and two years later. Evidence from longitudinal studies suggests that the magnitude of the correlation between ORF and state reading achievement test data tends to decrease with advancing grade levels (Silberglitt, Burns, Madyun, & Lail, 2006).

There is a strong research base providing evidence that measures of oral reading fluency (Ardoin & Christ, 2008; Baker et al., 2008; Hintze & Silberglitt, 2005) and maze procedures (Begeny & Martens, 2006; Graney, Missall, Martinez, & Bergstrom, 2009) are sensitive to student growth over time. These findings suggest that CBM can be administered on an ongoing basis to track student growth. When used for this purpose, the slope, or rate of growth, in addition to the student's reading level, can predict performance on standardized reading measures (Baker et al., 2008; Keller-Margulis, Shapiro, & Hintze, 2008). In a study of the relationship between ORF level and slope and high stakes reading tests, Baker et al. (2008) found that the best fitting predictive model for performance on high stakes reading tests included ORF level and slope, and the state reading assessment scores from the previous year. Taken together, these studies provide strong evidence for the predictive value and measurement sensitivity of CBM in reading.

Standardized achievement tests have also been used to assess students' reading skills. The Northwest Evaluation Association (NWEA) has developed a computerized achievement test known as the Measures of Academic Progress (MAP). The MAP is a group-administered, computerized adaptive test that measures academic achievement in the areas of mathematics, reading, and language usage. A comparison of student performance on computerized adaptive tests and paper-and-pencil achievement tests indicated no significant differences in reading scores (Kingsbury, 2002). Overall, test modality had very little observed effect (less than one scale score) on student performance. The Measures of Academic Progress in Reading include items assessing word recognition and vocabulary, reading comprehension, and literary analysis. Scores on the MAP are reported as Rasch Unit (RIT) scores, percentiles, and growth scores.

According to studies published by the NWEA (2004), there is evidence of concurrent validity of the MAP with numerous state achievement tests. For measures of reading in third grade, the correlations range from .66 with the Texas Assessment of Knowledge and Skills to .87 with the Stanford Achievement Test, 9th Edition. Another study, examining the predictive value of the MAP on the Delaware State Testing Program, found a correlation of .54 between the two reading measures (Hall-Michalcewiz, 2008). Delong (2007) investigated the relationship between the MAP and state assessments in a different way, comparing scores on the Indiana Statewide Testing for Educational Progress-Plus (ISTEP+) for schools that utilize MAP testing and those that do not. Results indicated no significant correlation between ISTEP+ scores and the use of the MAP or the level of implementation of MAP testing.

In the state of Maine, NWEA conducted a study to determine how well RIT scores on the MAP correlated with student performance on the Maine Educational Assessment (MEA), and to identify RIT cut-scores that would predict success on the MEA (Cronin, 2004). Results indicated a correlation of .74 for fourth grade reading scores. This suggests that the MAP has the capacity to identify students at risk for academic difficulties and, given individual subtest scores, to determine the subject area(s) where support is most needed. In 2009, Maine changed the state-required test and began using the New England Common Assessment Program (NECAP) instead of the MEA. Despite the widespread use of MAP in Maine schools, no studies documenting the correlation between MAP and NECAP scores were found.

The goals of universal screening and progress monitoring measures are to identify students at risk for learning difficulties, so that they can receive early, targeted intervention, and to evaluate the effectiveness of the instruction. To do this, it is important to identify assessment measures that best describe and predict future reading ability. Previous research has separately
examined the concurrent and predictive validity of CBM and MAP with numerous state reading assessments and standardized achievement measures, yet the direct relationship between CBM and MAP has not been explored. This information is important to school personnel who make decisions about universal and benchmark screening, because the MAP and CBM are often used together for these purposes. This study addressed the following research questions:

1. What is the concurrent validity of the MAP, DIBELS and AIMSweb ORF, and AIMSweb maze measures of reading?

2. What are the correlations of each measure at fall and winter benchmarks?

3. How well do CBM and MAP assessments in the fall predict future reading performance, as measured by the post-test MAP and the New England Common Assessment Program (NECAP), the newly adopted statewide achievement test in reading?

First, it was hypothesized that the DIBELS and AIMSweb measures are more closely correlated with one another than with the MAP measure of reading. Second, this study tested the hypothesis that CBM of reading, in the form of oral reading fluency and maze, predicts mid-year scores on the MAP and achievement scores on the NECAP as well as or better than fall MAP scores.

Method

Participants

The participants in this study were 137 third grade students from two schools in a suburban public school district in the northeastern United States. Table 1 summarizes data concerning the participants. The two elementary schools enrolled 448 and 199 students, respectively. The racial/ethnic breakdown of the schools was 96% Caucasian, 1% African
American, 1% Hispanic, and 2% Asian. The percentage of students who were eligible for free or reduced-price lunch in each school was 19%.

Table 1

Demographic Data

Boys    Girls    Tier 2/3    Special Education    Total
70      67       11          11                   137

Data were gathered on 150 students, although 13 students did not participate in every measure at each point in time. Only those students with complete datasets were included in the analysis. Of the final sample, 51% (n = 70) were boys and 49% (n = 67) were girls. Students were coded according to whether they were receiving Tier 1 instruction only, supplemental Tier 2 or 3 reading instruction, or special education services. Tier 1 is the general education curriculum within a regular classroom. Students in Tiers 2 and 3 typically received Tier 1 instruction as well as small group reading instruction for 20–30 minutes per day outside of the regular classroom. Eleven of the students had Individual Education Plans (IEPs), and 11 students received supplemental Tier 2 or 3 reading instruction. A university human subjects review board approved all procedures.

Measures

Five different measures of reading were used in this study: (a) Dynamic Indicators of Basic Early Literacy Skills (DIBELS) oral reading fluency (ORF), (b) AIMSweb ORF, (c) AIMSweb maze, (d) the NWEA Measures of Academic Progress (MAP) reading subtest, and (e) the New England Common Assessment Program (NECAP) reading subtest.

DIBELS. Third grade Dynamic Indicators of Basic Early Literacy Skills (DIBELS) were used to measure oral reading fluency (ORF). Three grade-level reading probes were
administered to each student individually. Each student was given a one-page reading passage and instructed to read aloud for one minute. Performance was measured by the number of words read correctly in one minute. The median of the three scores was used in this analysis (Good, Gruba, & Kaminski, 2002).

AIMSweb. AIMSweb is a commercial assessment and web-based data management system that provides oral reading fluency and maze assessments for reading. The ORF measure consisted of three one-minute reading probes that were administered to students individually, in a fashion very similar to the DIBELS. Each student was given a one-page reading passage and instructed to read aloud for one minute. Performance was measured by the number of words read correctly in one minute. The median of the three scores was used in this analysis. For the maze assessment, students were given a reading passage in which every seventh word had been deleted and replaced with three multiple-choice alternatives. Administered in classroom groups, the maze task asked students to read the passage silently for three minutes and choose the correct word at each set of alternatives. Performance was based on the number of correct choices made within three minutes. Because the maze task includes three minutes of reading, only one passage was used per administration (AIMSweb, 2008).
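In code, the ORF scoring rule just described reduces to a few lines. This is a sketch for illustration only; the function name and example numbers are ours, not part of the DIBELS or AIMSweb materials.

```python
from statistics import median

def orf_probe_score(words_attempted: int, errors: int) -> int:
    """Score for one ORF probe: words read correctly in one minute."""
    return words_attempted - errors

# Three probes per student; the benchmark score is the median of the three.
probe_scores = [orf_probe_score(103, 5),   # 98
                orf_probe_score(96, 5),    # 91
                orf_probe_score(110, 5)]   # 105
print(median(probe_scores))  # 98 words correct per minute
```

The maze score is even simpler: the count of correct word choices made within the three-minute limit.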
MAP. The Northwest Evaluation Association (NWEA) Measures of Academic Progress (MAP) are computerized adaptive tests that measure academic achievement in the areas of mathematics, reading, and language usage. The test was given in a group setting using computers. The difficulty of each question was based on the student's accuracy on prior questions, so that each test adjusted to the individual student's performance level. Scores on the MAP were reported as Rasch Unit (RIT) scores, percentiles, and analyses of progress. The RIT scale is an equal-interval scale that estimates student achievement based on the difficulty of individual items. Using this scale, results of the MAP can also be reported as improvement scores, which represent the number of RIT points gained by a student since the previous assessment and the extent to which the student exceeds or falls short of average growth (Northwest Evaluation Association, n.d.).

The MAP for Reading included five subtests: Word Recognition and Vocabulary, Reading Comprehension – Literal, Reading Comprehension – Inferential/Interpretive, Reading Comprehension – Evaluation, and Literary Response and Analysis. Word Recognition and Vocabulary items measured a student's ability to use context cues to understand word meanings and relationships between words. The reading comprehension subtests measured students' ability to recall, identify, classify, and sequence stated content; make predictions and inferences; synthesize information; and evaluate, compare, and apply what they have read. Literary Response and Analysis items required students to respond to questions about a story's characters, themes, plot, and setting (Northwest Evaluation Association, n.d.).

NECAP. The New England Common Assessment Program (NECAP) was developed by New Hampshire, Rhode Island, and Vermont, based on grade level expectations and the requirements of the No Child Left Behind Act (NCLB). Maine joined the program in 2009 and administers the test to students in grades three through eight annually in the fall. The NECAP measures academic achievement in reading, writing, and mathematics. The reading test consisted of multiple-choice and constructed-response questions in six content areas: Word Identification Skills and Strategies; Vocabulary Strategies and Breadth of Vocabulary; Initial Understanding of Literary Texts; Analysis and Interpretation of Literary Text, Citing Information; Initial Understanding of Informational Text; and Analysis and Interpretation of Informational Text, Citing Evidence (Measured Progress, 2009).
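The adaptive mechanism described for the MAP can be illustrated with a toy staircase rule. NWEA's operational item-selection algorithm is Rasch-model based and not public, so the sketch below, with hypothetical names and step size, shows only the general principle of item difficulty tracking a student's accuracy.

```python
def next_item_difficulty(current_difficulty: float,
                         last_answer_correct: bool,
                         step: float = 2.0) -> float:
    """Toy staircase rule: raise the target difficulty after a correct
    answer and lower it after an error. This is NOT the MAP algorithm,
    only an illustration of the adaptive principle described above."""
    if last_answer_correct:
        return current_difficulty + step
    return current_difficulty - step

# Example: a student at an estimated RIT of 195 answers correctly,
# so the next item is drawn near difficulty 197.
print(next_item_difficulty(195.0, True))  # 197.0
```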
Procedures

MAP and CBM data were collected at two time points: September 2009 and January 2010. The MAP was administered in a group setting in the computer lab at each student's school. Standardized procedures, as provided by NWEA, were followed. Within two weeks of each MAP administration, the DIBELS and AIMSweb measures were administered; all students were given these tests in the same one-week period. DIBELS and AIMSweb ORF were administered individually, in a single session, in private testing rooms located in the students' schools. Researchers used the administration procedures published in the AIMSweb and DIBELS administration and scoring guides. AIMSweb maze assessments were administered in a group setting in the students' regular classrooms, again following the procedures published in the AIMSweb administration and scoring guide. All curriculum-based measures were administered by school psychology graduate students and faculty. The NECAP was administered to all students in October 2009, according to the procedures specified by NECAP and the Maine Department of Education.

To ensure that all individually administered reading assessments were conducted in the same manner with each student and followed the published administration methods, interobserver agreement (IOA) was calculated. To compute IOA, 20% of the oral reading fluency (ORF) CBM administrations were observed and scored by a second researcher, who sat behind the first researcher, listened to the ORF session, and scored the student's oral reading in the same manner. The dual scores for the IOA sessions were compared to determine the number of times the two scores matched (agreements) and the number of times they differed (disagreements). Agreements and disagreements were tallied on a word-by-word basis. IOA was then calculated by dividing the total number
of agreements by the number of agreements plus disagreements and multiplying by 100 to determine the percentage of agreement. Total IOA across the sample of double-scored ORF reading sessions was 98%.
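As a worked example of that computation (a sketch only; the counts below are hypothetical, not the study's raw data):

```python
def interobserver_agreement(agreements: int, disagreements: int) -> float:
    """IOA = agreements / (agreements + disagreements) * 100,
    tallied word by word across a double-scored ORF session."""
    return agreements / (agreements + disagreements) * 100

# e.g., 245 word-level agreements and 5 disagreements -> 98.0%
print(interobserver_agreement(245, 5))  # 98.0
```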
Results

Table 2 presents descriptive statistics for each measure of reading by instructional group. On average, this sample exceeded national norms by 10 points on the oral reading fluency measures (Good et al., 2002; AIMSweb, 2010b) and by four points on the MAP (NWEA, 2008). Performance on the maze task was at or within one point below national averages (AIMSweb, 2010a). When compared to state averages on the NECAP, the participants in this study achieved a mean score four points higher (NECAP, 2010). Notable were the lower scores of students receiving Tier 2 or 3 instruction, even when compared to those receiving special education services.

Table 2

Means and Standard Deviations (SD) for Each Measure of Reading by Group

               Overall (N = 137)    Tier 1 (N = 115)     Tiers 2 & 3 (N = 11)  Special Ed. (N = 11)
Measure        Fall      Winter     Fall      Winter     Fall      Winter      Fall      Winter
DIBELS ORF     97 (40)   111 (41)   102 (38)  116 (38)   47 (27)   57 (18)     93 (41)   103 (46)
AIMSweb ORF    94 (41)   116 (41)   100 (39)  122 (38)   45 (27)   64 (28)     85 (42)   102 (45)
Maze           12 (6)    15 (7)     13 (6)    16 (7)     6 (3)     8 (3)       9 (8)     13 (5)
MAP            195 (15)  201 (14)   198 (13)  204 (12)   170 (9)   182 (14)    191 (18)  197 (14)
NECAP          350 (10)             351 (9)              336 (8)               350 (7)

Note: Values are Mean (SD). The NECAP was administered once, in October 2009, so each group has a single NECAP entry.

Correlation Analysis

To determine the relationships among the measures of reading, Pearson product-moment correlations were calculated. These results are presented in Table 3. Significant correlations were found among all measures of reading at each point in time (p ≤ .01). Not surprisingly, the strongest correlation was between the two measures of oral reading fluency (r = .972). The next highest correlations were between MAP and NECAP (r = .819), ORF and maze (r = .812), and MAP and ORF (r = .809). The lowest correlation was between maze and NECAP (r = .621). In the fall, ORF correlated more strongly with the MAP than with the maze, while the reverse was true in the winter. Correlations were also calculated within each measure between the fall and winter administrations. ORF had the highest fall-to-winter correlation (r = .952), followed by the MAP (r = .872) and maze (r = .746), respectively.

In addition to the correlation analyses on the overall sample, the data were disaggregated by tier (Tier 1 versus Tiers 2 and 3), and a separate analysis was conducted to examine differences between the two groups (Table 4). For students in Tier 1, or general education, correlations among the measures of reading remained significant (p ≤ .01), although they were not as high as for the overall sample. Overall, all of the measures correlated strongly with one another, and with themselves when given more than once. The consistently high correlations suggest that all of the measures tapped into a common skill area: reading.
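The computation behind Table 3 is a pairwise Pearson matrix over the nine score columns. A pandas-based sketch follows; the data file and column names are illustrative, not the study's actual materials.

```python
import pandas as pd

# Hypothetical layout: one row per student, one column per measure, e.g.
# fall_dibels_orf, fall_aimsweb_orf, fall_maze, fall_map,
# winter_dibels_orf, winter_aimsweb_orf, winter_maze, winter_map, necap
scores = pd.read_csv("reading_scores.csv")

# Pairwise Pearson product-moment correlations (the structure of Table 3).
correlation_matrix = scores.corr(method="pearson")
print(correlation_matrix.round(3))
```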

Table 3

Intercorrelations Between Reading Measures in September and January, Overall Sample (N = 137)

Measure              1      2      3      4      5      6      7      8      9
September
1. DIBELS ORF        -   .972   .761   .809   .952   .945   .778   .767   .718
2. AIMSweb ORF              -   .760   .807   .947   .949   .773   .762   .714
3. Maze                            -   .683   .763   .740   .746   .652   .621
4. MAP                                    -   .806   .803   .743   .872   .819
January
5. DIBELS ORF                                    -   .961   .812   .765   .705
6. AIMSweb ORF                                          -   .793   .769   .678
7. Maze                                                        -   .708   .713
8. MAP                                                                -   .777
9. NECAP                                                                     -

Table 4

Intercorrelations Between Reading Measures in September and January for Students in Tier 1 (N = 115)

Measure              1      2      3      4      5      6      7      8      9
September
1. DIBELS ORF        -   .967   .730   .779   .945   .942   .753   .726   .663
2. AIMSweb ORF              -   .729   .778   .937   .943   .742   .725   .666
3. Maze                            -   .624   .723   .706   .723   .591   .579
4. MAP                                    -   .765   .758   .724   .852   .785
January
5. DIBELS ORF                                    -   .961   .785   .727   .655
6. AIMSweb ORF                                          -   .769   .715   .617
7. Maze                                                        -   .680   .693
8. MAP                                                                -   .725
9. NECAP                                                                     -

Regression Analysis

Several regression models were computed to determine which reading scores best predicted later scores. Stepwise (with removal) multiple regression analyses were conducted with the winter MAP scores as the dependent variable and all other reading scores as the independent variables (Table 5). The results indicated that the MAP assessment in the fall best predicted MAP scores in the winter (p < .001), and that oral reading fluency accounted for significant variance in winter MAP scores beyond performance on the fall MAP (p < .05). In a second analysis, in which the fall MAP scores were removed, oral reading fluency was a significant predictor of winter MAP performance (p < .001). These results are presented in Table 6.
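The kind of model behind Tables 5 and 6 can be sketched as ordinary least squares with backward removal of nonsignificant predictors. This is an illustrative reconstruction under that assumption, not the study's actual code; the statsmodels-based function and the column names are ours.

```python
import statsmodels.api as sm

def backward_elimination(X, y, alpha=0.05):
    """Fit an OLS model and repeatedly drop the least significant
    predictor until every remaining p-value is below alpha -- one
    simple reading of a stepwise-with-removal procedure."""
    predictors = list(X.columns)
    while predictors:
        model = sm.OLS(y, sm.add_constant(X[predictors])).fit()
        pvalues = model.pvalues.drop("const")
        worst = pvalues.idxmax()
        if pvalues[worst] < alpha:
            return model  # all remaining predictors are significant
        predictors.remove(worst)
    return None  # no predictor survived removal

# Hypothetical usage, reusing the data frame from the correlation sketch:
# X = scores[["fall_map", "fall_dibels_orf", "fall_aimsweb_orf", "fall_maze"]]
# y = scores["winter_map"]
# print(backward_elimination(X, y).summary())
```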

Table 5

Summary of Multiple Regression Analysis for Variables Predicting Winter MAP Scores (N = 137)

Variable        B       SE B    β
Fall MAP        .652    .063    .728
DIBELS ORF      .060    .024    .178

Note: R² = .77 (p < .05).

Table 6

Summary of Multiple Regression Analysis for CBM Variables Predicting Winter MAP Scores (N = 137)

Variable        B       SE B    β
DIBELS ORF      .260    .019    .767

Note: R² = .59 (p < .001).

A multiple regression analysis also was performed to determine which measures of reading best predicted NECAP scores (Table 7). Results showed that MAP was the best predictor, accounting for 67% of the variance in NECAP scores (p < .001). To determine whether ORF contributed unique variance to NECAP scores when MAP data were not present, an additional multiple regression analysis was conducted. As shown in Table 8, when MAP was removed from the equation, ORF also predicted NECAP scores, accounting for a statistically significant amount of variance (p < .001).

Table 7
