== The Composition Effect - What is the Source of Difficulty in Problems which Require Application of Several Skills? ==

Ido Roll, Yvonne Kao, Kenneth E. Koedinger

=== Abstract ===

This study found that the presence of distracters creates significant difficulty for students solving geometry area problems, but that practice on composite area problems improves students’ ability to ignore distracters. In addition, this study found some support for the hypothesis that the increased spatial processing demands of a complex diagram can negatively impact performance and could be a source of a composition effect in geometry.

=== Glossary ===

- Composite problems: Problems which require the application of several skills, such as solving 3x+6=0 for x.

- Single-step problems: Problems which require the application of a single skill, such as y+6=0 or 3x=-6.

- DFA (Difficulty Factor Analysis): A test that includes pairs of items varying along one dimension only. It allows one to evaluate the difficulty contributed by each dimension along which the problems differ.

- The Composition Effect: The effect whereby composite problems are harder than a set of single-step problems using the same skills.

=== Research question ===

What is the main source of difficulty in composite problems?

=== Background and Significance ===

Although much work has been done to improve students’ math achievement in the United States, geometry achievement appears to be stagnant. While the 2003 TIMSS found significant gains in U.S. eighth-graders’ algebra performance between 1999 and 2003, it did not find a significant improvement on geometry items over the same period (Gonzales et al., 2004). Furthermore, of the five mathematics content areas assessed by TIMSS, geometry was the weakest for U.S. eighth-graders (Mullis, Martin, Gonzalez, & Chrostowski, 2004). While students have often demonstrated reasonable skill in “basic, one-step problems” (Wilson & Blank, 1999, p. 41), they often struggle with extended, multi-step problems in which they have to construct a free response rather than selecting a multiple-choice item. Thus our goal is to examine the sources of difficulty in multi-step geometry problems and to determine how to address these difficulties during instruction.

Heffernan and Koedinger (1997) found a composition effect in multi-step algebraic story problems—the probability of correctly completing the multi-step problem was less than the product of the probabilities of correctly completing each of the subproblems, P(Composite) < P(Subproblem A) × P(Subproblem B). They suggested that this difference could be due to an exponential increase in the number of possible problem-solving actions as a problem became more complex, or it could be due to a missing or over-specialized knowledge component, such as students not understanding that whole subexpressions could be manipulated like single numbers or variables. Our research questions are: is there a composition effect in multi-step geometry area problems, e.g., a problem in which the student must subtract the area of an inner shape from the area of an outer shape to find the area of a shaded region, and if so, what might be the source of the composition effect? Would it be a missing or over-specialized knowledge component, as concluded by Heffernan and Koedinger, or would it be a combinatorial search?

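To make this comparison concrete, the following minimal sketch (in Python, with invented illustrative probabilities rather than data from either study) shows how a composition effect is detected by comparing composite accuracy against the product predicted by independently composing subskills:

<pre>
# Hypothetical success probabilities on the two single-step subproblems,
# as would be estimated from single-step test items.
p_subproblem_a = 0.80
p_subproblem_b = 0.75

# If the subskills composed independently, we would expect:
p_predicted = p_subproblem_a * p_subproblem_b  # 0.60

# Observed accuracy on the composite problem (illustrative value).
p_composite = 0.45

# Heffernan and Koedinger's composition effect:
# P(Composite) < P(Subproblem A) x P(Subproblem B)
if p_composite < p_predicted:
    print("Composition effect: composite is harder than independence predicts")
elif p_composite > p_predicted:
    print("Reverse pattern: composite is easier than independence predicts")
else:
    print("Subskills compose independently")
</pre>
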
In order to answer these questions, we first needed to assess the difficulty of a single-step area problem. Koedinger and Cross (2000) found that the presence of distracter numbers on parallelogram problems—the length of a side was given in addition to the lengths of the height and base—significantly increased the difficulty of the problems due to students’ shallow application of area knowledge. In particular, students seemed to have over-generalized a procedure for finding the area of rectangles—multiplying the lengths of adjacent sides—to parallelograms. In addition, Koedinger and Cross conjectured that non-standard orientations for shapes—non-horizontal bases and non-vertical heights—would also expose students’ shallow knowledge. Given that a multi-step area problem inherently contains distracters and often features shapes that are rotated from their standard orientations, it will be important for us to follow Koedinger and Cross’s lead and get a baseline measure of how distracters and orientation affect performance on single-step area problems. Then we will study how combining single-step area problems into a typical “find the shaded area” composite area problem affects performance. In these types of problems, students are required to perform three steps: calculate the area of the outer shape, calculate the area of the inner shape, and then subtract the two areas.

We believe that we will find a composition effect in these types of geometry area problems. One possible source of the effect is the additional spatial-processing demands placed by a more complex diagram. Koedinger and Anderson (1990) found that a hallmark of geometry expertise was the ability to parse a complex diagram into perceptual chunks that could be used to guide a search of problem-solving schemas. Geometry novices most likely cannot parse complex diagrams into meaningful perceptual chunks quickly or efficiently, and thus increasing diagram complexity could result in increased problem difficulty. This explanation would be more consistent with the combinatorial-search explanation for the composition effect than with the missing-skill explanation favored by Heffernan and Koedinger. This conjecture leads to an interesting prediction: in contrast to the composition effect found by Heffernan and Koedinger, the probability of correctly completing a composite problem should be greater than the product of the probabilities of correctly completing each of its three subproblems. This is because in completing a single composite problem, the act of parsing the complex diagram need only be performed once, whereas it needs to be performed at least twice when completing the three subproblems separately. This prediction has two corollaries: performance on the Outer Shape subproblem should be lower than performance on a mathematically equivalent problem using a simple diagram, and the probability of correctly completing a composite problem should be equal to the product of the probabilities of correctly completing the Subtraction subproblem, the Inner Shape subproblem, and a simple-diagram equivalent of the Outer Shape subproblem.

=== Independent Variables ===

Instruction in the form of a solved example, targeting a common misconception: identifying the base and height in a cluttered environment.

=== Dependent variables ===

Three tests are used in the study:
* Pre-test: given before all instruction
* Mid-test: given after students have learned about single-step problems but before composite problems
* Post-test: given after students have learned and practiced all material

The tests include the following item types (a schematic sketch of this design follows the list). Some of these are [[transfer]] items, evaluating robust learning, since they require an adaptive application of the knowledge learned and practiced in class.

* Simple diagram:
*# no distracters, canonical orientation
*# distracters, canonical orientation
*# no distracters, tilted orientation
*# distracters, tilted orientation
* Complex diagram:
*# Given complex diagram, ask for skill A
*# Given complex diagram, ask for skill B
*# Given steps A and B, ask for skill C (which requires A and B)
*# Given complex diagram, ask for C (which requires A and B)

=== Hypothesis ===

# Adding distracters to a basic area problem and rotating the figure from its standard orientation will make the problem more difficult.
# We will find a composition effect in area problems, in that the probability of correctly completing a composite problem is not equal to the product of the probabilities of correctly completing the three subskills: P(Composite) ≠ P(Outer) × P(Inner) × P(Subtract).
# Due to the demands of spatially parsing the diagram, P(Composite) > P(Outer) × P(Inner) × P(Subtract), P(Outer) < P(Simple Outer Eq.), and P(Composite) = P(Simple Outer Eq.) × P(Inner) × P(Subtract).

=== Findings ===

An alpha value of .05 was used for all statistical tests.

==== Comparison of Mid-test and Post-test Performance ====

Scores on the pre-test were at floor, ranging from 0 to 50% correct (M = 14.94%, SD = 13.61%). Pre-test scores did not correlate significantly with either mid-test scores or post-test scores, so we did not analyze the pre-test further. Performance on the mid-test and the post-test was significantly correlated (r² = 0.239, p < .001). A paired t-test found significant gains in overall performance from mid-test to post-test, t(65) = 3.115, p = 0.003, 2-tailed, with highly significant gains on Simple problems, t(65) = 3.104, p = 0.003, 2-tailed, and significant gains on Complex problems, t(65) = 2.308, p = 0.024. Participants performed better on the Simple problems than on the Complex problems. This difference was significant at mid-test, t(65) = 2.214, p = .030, 2-tailed, and at post-test, t(65) = 2.355, p = .022, 2-tailed. These results are presented in Table 1.

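For readers who want to reproduce this style of analysis, here is a minimal sketch of a paired, 2-tailed t-test using SciPy; the score arrays are randomly generated placeholders, not the study's data:

<pre>
import numpy as np
from scipy import stats

# Placeholder per-student proportion-correct scores (the study had n = 66).
rng = np.random.default_rng(0)
mid_test = rng.uniform(0.3, 0.9, size=66)
post_test = np.clip(mid_test + rng.normal(0.10, 0.25, size=66), 0.0, 1.0)

# Paired, 2-tailed t-test of post-test scores against mid-test scores.
t_stat, p_value = stats.ttest_rel(post_test, mid_test)
print(f"t({len(mid_test) - 1}) = {t_stat:.3f}, p = {p_value:.3f}")
</pre>
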
==== Effects of Distracters, Orientation, and Shape on Simple Problem Performance ====

We used a binary logistic regression analysis to predict the probability that students would answer a Simple problem correctly on the mid-test and the post-test. Distracters, Rotation, and three dummy variables coding diagram shape—parallelogram, pentagon, trapezoid, or triangle—were entered into the equation as predictors.

Table 1: Mean performance on mid-test and post-test by diagram type (*p < .05, **p < .01)

{| border="1" cellpadding="4"
! rowspan="2" | Type !! colspan="2" | Mid-test (%) !! colspan="2" | Post-test (%) !! colspan="2" | Gain (%)
|-
! M !! SD !! M !! SD !! M !! SD
|-
| Overall || 65.34 || 27.79 || 75.41 || 23.66 || 10.07** || 26.26
|-
| Simple || 68.56 || 30.79 || 79.17 || 23.03 || 10.61** || 27.76
|-
| Complex || 61.74 || 30.13 || 71.21 || 31.08 || 9.47* || 33.33
|-
| Simple-Complex || 6.82* || 25.03 || 7.96* || 27.44 || ||
|}

At mid-test, the full model differed significantly from a model with intercept only, χ² (5, N = 264) = 17.884, Nagelkerke R² = .092, p = .003. Overall, this model had a success rate of 69.3%, correctly classifying 90.1% of correct responses and 24.1% of incorrect responses. The presence of distracters was a significant predictor of problem accuracy at mid-test: students were only .541 times as likely to respond correctly when the problem contained distracters. Shape was also a significant predictor: students were only .484 times as likely to respond correctly when the problem involved a pentagon rather than a triangle. Rotation was not a significant predictor of problem accuracy.

At post-test, the full model differed significantly from a model with intercept only, χ² (5, N = 264) = 15.533, Nagelkerke R² = .089, p = .008. Overall, this model had a success rate of 79.2%, correctly classifying 100% of the correct responses but 0% of incorrect responses. Distracters were no longer a significant predictor at post-test, and Rotation remained a non-significant predictor. Shape remained a significant predictor of problem accuracy: students were only .335 times as likely to respond correctly when the problem involved a pentagon rather than a triangle.

==== Effects of Skill Composition and Shape on Complex Problem Performance ====

We used a binary logistic regression analysis to predict the probability that students would answer a Complex problem correctly on the mid-test and the post-test. Indicators of whether the problem required an Outer calculation, an Inner calculation, or Subtraction were entered into the equation as predictors. As before, three dummy variables coding diagram shape were entered as well. At mid-test, the full model differed significantly from a model with intercept only, χ² (6, N = 264) = 12.862, Nagelkerke R² = .065, p = .045. Overall, this model had a success rate of 64.0%, correctly classifying 85.9% of correct responses and 28.7% of incorrect responses. At post-test, the full model also differed significantly from a model with intercept only, χ² (6, N = 264) = 25.019, Nagelkerke R² = .129, p < .001. Overall, this model had a success rate of 70.5%, correctly classifying 95.7% of correct responses and 7.9% of incorrect responses. In both models, the Outer calculation was the only significant predictor of a correct response. When the problem required an Outer calculation—in the Outer and Composite conditions—students were only .439 and .258 times as likely to respond correctly at mid-test and post-test, respectively.

==== Predictors of Performance on Composite Problems ====

We used a binary logistic regression to predict the probability that a student would answer a Composite Complex problem correctly on the mid-test and the post-test, given his/her success on Inner, Outer, Subtract, and the Distracters+Rotation (DR) Simple problem that is mathematically equivalent to Outer, and given the shape of the diagram—parallelogram, pentagon, trapezoid, or triangle—coded using three dummy variables. This model differed significantly from a model with intercept only for both mid-test, χ² (7, N = 66) = 26.567, Nagelkerke R² = .442, p < .001, and post-test, χ² (7, N = 66) = 30.466, Nagelkerke R² = .495, p < .001. The mid-test model had an overall success rate of 75.8%, correctly classifying 75.0% of correct responses and 76.5% of incorrect responses. The post-test model had an overall success rate of 77.3%, correctly classifying 83.8% of correct responses and 69.0% of incorrect responses.

The significant predictor variables in the model changed from mid-test to post-test. At mid-test, Subtract and DR success were significant predictors of a correct response on Composite problems. Students who answered Subtract correctly were 9.168 times more likely to answer Composite correctly. Students who answered DR correctly were 5.891 times more likely to answer Composite correctly. At post-test, Subtract and DR success remained significant predictors, with odds ratios of 20.532 and 9.277, respectively. In addition, Outer success became a significant predictor: students who answered Outer correctly were 6.366 times more likely to answer Composite correctly. Shape was not a significant predictor at either mid-test or post-test. These results are presented in Table 4.

==== Assessing the Difficulty of Spatial Parsing ====

We took Accuracy(Outer) × Accuracy(Inner) × Accuracy(Subtract) and compared this to Accuracy(Composite) for each student using a paired, 2-tailed t-test. This difference was significant at mid-test, t(65) = 2.193, p = .032. Students performed better on Composite (M = 48.48%, SD = 50.36%) than was predicted by the product (M = 33.33%, SD = 47.52%). This difference was no longer significant at post-test. In contrast, Accuracy(DR) × Accuracy(Inner) × Accuracy(Subtract) did not differ significantly from Accuracy(Composite) at either mid-test or post-test. Paired, 2-tailed t-tests found that performance did not differ significantly between Outer and DR at either mid-test, t(65) = -.851, p = .398, or post-test, t(65) = -1.356, p = .180.

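A minimal sketch of this per-student comparison (placeholder 0/1 accuracies, not the study's data): the product of the three subskill accuracies is the prediction under independent composition, and a paired t-test compares it against observed Composite accuracy:

<pre>
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 66  # students

# Placeholder per-student accuracies (1 = correct, 0 = incorrect).
outer = rng.integers(0, 2, n)
inner = rng.integers(0, 2, n)
subtract = rng.integers(0, 2, n)
composite = rng.integers(0, 2, n)

# Prediction under independent composition of the three subskills.
predicted = outer * inner * subtract

# Paired, 2-tailed t-test: observed Composite vs. predicted product.
t_stat, p_value = stats.ttest_rel(composite, predicted)
print(f"t({n - 1}) = {t_stat:.3f}, p = {p_value:.3f}")
print(f"M(Composite) = {composite.mean():.1%}, M(product) = {predicted.mean():.1%}")
</pre>
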
=== Explanation ===

We will return to our original hypotheses to begin the discussion. It was clear that distracters had a negative impact on Simple performance at mid-test, although this effect had largely disappeared by post-test. Although we did not find significant effects of Rotation on Simple performance, we did find evidence that many students simply rotated the paper until they were viewing the figure in a standard orientation, effectively negating our Rotation manipulation. Thus, our first hypothesis is partially supported.

We did find a composition effect in area problems, and the probability of success on Composite problems could not be predicted by simply multiplying the probabilities of success for the three subproblems. Thus, our second hypothesis is supported.

We found only partial support for our hypothesis that the source of the composition effect is the diagram-parsing load. We did find that the probability of success on Composite problems was greater than the product of probabilities for the three subproblems, but only at mid-test. In addition, there were no significant differences between Outer and DR at either mid-test or post-test, although we feel it is worth noting that the data trend in the predicted direction, with Outer being more difficult than DR on both mid-test and post-test. Finally, our P(Composite) = P(Simple Outer Eq.) × P(Inner) × P(Subtract) model did a good job of predicting actual performance on Composite problems.

To conclude our discussion, we would like to address the differences between performance at mid-test and performance at post-test and the implications for designing instructional interventions. First, we would like to note that instruction in the Area Composition unit of the Geometry Cognitive Tutor was able to improve performance on all skills, not just the skills new to composite problems. This suggests that students may not have fully mastered the single-step area skills prior to beginning Area Composition, but that Area Composition continues to provide practice on these skills. Furthermore, the single-step skill practice in the Area Composition unit seems particularly effective at removing the effects of distracters on performance. This makes a great deal of intuitive sense if you consider that composite area problems inherently contain distracters.

Second, although we did not find strong support for our contention that spatial parsing is difficult for students, we feel that training students to quickly identify important perceptual chunks can still have a positive impact on performance. If, for example, students were trained to look for a “T” or “L” structure and map the segments onto the base and height of a shape, students might be less prone to developing shallow knowledge about area.

+ | |||
+ | === Descendents === | ||
+ | |||
+ | |||
+ | |||
=== Annotated bibliography ===

Bransford, J. D., Brown, A. L., & Cocking, R. R. (Eds.) (2000). How people learn: Brain, mind, experience, and school. Washington, DC: National Academy Press.

Gonzales, P., Guzman, J. C., Partelow, L., Pahlke, E., Jocelyn, L., Kastberg, D., & Williams, T. (2004). Highlights from the Trends in International Mathematics and Science Study: TIMSS 2003. Washington, DC: National Center for Education Statistics.

Heffernan, N. T., & Koedinger, K. R. (1997). The composition effect in symbolizing: The role of symbol production vs. text comprehension. In Proceedings of the Nineteenth Annual Conference of the Cognitive Science Society, 307-312. Hillsdale, NJ: Erlbaum.

Koedinger, K. R., & Anderson, J. R. (1990). Abstract planning and perceptual chunks: Elements of expertise in geometry. Cognitive Science, 14, 511-550.

Koedinger, K. R., Anderson, J. R., Hadley, W. H., & Mark, M. (1997). Intelligent tutoring goes to school in the big city. International Journal of Artificial Intelligence in Education, 8, 30-43.

Koedinger, K. R., & Cross, K. (2000). Making informed decisions in educational technology design: Toward meta-cognitive support in a cognitive tutor for geometry. Presented at the Annual Meeting of the American Educational Research Association (AERA), New Orleans, LA.

Mullis, I. V. S., Martin, M. O., Gonzalez, E. J., & Chrostowski, S. J. (2004). Findings from IEA’s Trends in International Mathematics and Science Study at the fourth and eighth grades. Chestnut Hill, MA: TIMSS & PIRLS International Study Center, Boston College.

Owen, E., & Sweller, J. (1985). What do students learn while solving mathematics problems? Journal of Educational Psychology, 77, 272-284.

Simon, H. A., & Lea, G. (1974). Problem solving and rule induction: A unified view. In L. W. Gregg (Ed.), Knowledge and cognition. Hillsdale, NJ: Erlbaum.

Wilson, L. D., & Blank, R. K. (1999). Improving mathematics education using results from NAEP and TIMSS. Washington, DC: Council of Chief State School Officers, State Education Assessment Center.

+ | |||
+ | |||
+ | [[Category:Empirical Study]] |