Difference between revisions of "Using learning curves to optimize problem assignment"

From LearnLab
Revision as of 17:12, 27 April 2007

Abstract

This study examined the effectiveness of an educational data mining method, Learning Factors Analysis (LFA), in improving learning efficiency in the Cognitive Tutor curriculum. LFA uses a statistical model to predict how students perform at each practice opportunity for a knowledge component (KC), and identifies over-practiced and under-practiced KCs. Using the LFA findings on the Cognitive Tutor geometry curriculum, we optimized the curriculum with the goal of improving student learning efficiency. With a control group design, we analyzed the learning performance and the learning time of high school students participating in the Optimized Cognitive Tutor geometry curriculum, and compared the results to those of students participating in the traditional Cognitive Tutor geometry curriculum. Analyses indicated that students in the optimized condition saved a significant amount of time in the optimized curriculum units, compared with the time spent by the control group. There was no significant difference in the learning performance of the two groups on either an immediate post-test or a retention test given two weeks later. The findings support the use of this data mining technique to improve learning efficiency in other computer-tutor-based curricula.

Glossary

Data mining, intelligent tutoring systems, learning efficiency

Research question

Over-practice is not necessary for either short-term or long-term retention. Reducing over-practice can improve learning efficiency.

Background and significance

Much intelligent tutoring system (ITS) research has focused on designing new features to improve learning gains, measured by the difference between pre-test and post-test scores. However, learning time is another principal measure in the summative evaluation of an ITS. Intelligent tutors contribute more to education when they accelerate learning [9]. Bloom’s “Two Sigma” effect of a model human tutor [4] has been one of the ultimate goals for most intelligent tutors to achieve. So should be the “Accelerated Learning” effect shown by SHERLOCK, which offered four years’ worth of troubleshooting experience in the space of seven days of practice [12].

Cognitive Tutors are an ITS based on cognitive psychology results [11]. Students spend about 40% of their class time using the software. The software is built on cognitive models, which represent the knowledge a student might possess about a given subject. The software assesses students’ knowledge step by step and presents curricula tailored to individual skill levels [11]. According to Carnegie Learning Inc., by 2006 Cognitive Tutors were in use in over 1300 school districts in the U.S. by over 475,000 secondary school students. With such a large user base, the learning efficiency of the Tutor is of great importance. If every student saves four hours of learning over one year, nearly two million hours will be saved in total. To ensure adequate yearly progress, many schools are calling for an increase in instructional time. However, students have a limited amount of total learning time, and teachers have a limited amount of instructional time. Saving one hour of learning time can be better than adding one hour of instructional time because it does not increase students’ or teachers’ workload. Moreover, if the saved hours are devoted to other time-consuming subjects, they can improve the learning gains in those subjects.

Educational data mining is an emerging area that offers many potential insights for improving education theory and learning outcomes. Much educational data mining to date has stopped at the point of yielding new insights, and has not yet come full circle to show how such insights can yield a better intelligent tutoring system (ITS) that improves student learning [2, 3].

Learning Factors Analysis (LFA) [6, 5] is a data-mining method for evaluating cognitive models and analyzing student-tutor log data. Combining a statistical model [10], human expertise and a combinatorial search, LFA is able to measure the difficulty and the learning rates of knowledge components (KCs), predict student performance at each KC practice opportunity, identify over-practiced or under-practiced KCs, and discover “hidden” KCs interpretable to humans. The statistical model is shown in Eq. (1).

$$\ln\frac{P_{ijt}}{1 - P_{ijt}} = \theta_i + \beta_j + \gamma_j\, t \qquad (1)$$

Pijt is the probability that the ith student gets a step in a tutoring question right at his or her tth opportunity to practice the jth KC. The model says that the log odds of Pijt equal the overall “smarts” of that student (θi), plus the “easiness” of that KC (βj), plus the amount gained (γj) for each practice opportunity. With this model, we can show the learning growth of students at any current or past moment.
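As a minimal sketch of how the model described above turns parameters into a predicted learning curve, the following Python code computes the probability of a correct step and the corresponding error rate at each opportunity. The function names and any parameter values passed to them are hypothetical illustrations, not fitted values from the study, and the code assumes the practice effect enters as γj multiplied by the opportunity count t.

```python
import math


def p_correct(theta, beta, gamma, t):
    """Predicted probability of a correct step.

    theta: student "smarts", beta: KC "easiness",
    gamma: gain per practice opportunity, t: opportunity count.
    Assumes log odds = theta + beta + gamma * t.
    """
    log_odds = theta + beta + gamma * t
    return 1.0 / (1.0 + math.exp(-log_odds))


def error_curve(theta, beta, gamma, opportunities):
    """Predicted error rate at opportunities 1..opportunities."""
    return [1.0 - p_correct(theta, beta, gamma, t)
            for t in range(1, opportunities + 1)]


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to show the shape of a curve.
    for t, err in enumerate(error_curve(0.0, -1.0, 0.3, 6), start=1):
        print(f"opportunity {t}: predicted error rate {err:.2f}")
```

With a positive gain per opportunity the predicted error rate falls monotonically, and a larger γj makes it fall faster.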

By applying LFA to the student log data from the Area unit of the 1997 Geometry Cognitive Tutor, we found two interesting phenomena. On the one hand, some easy (i.e. high βj) KCs with low learning rates (i.e. low γj) are practiced many times, and little improvement is made in the later stages of that practice. The KC rectangle-area is an example. This KC characterizes the skill of finding the area of a rectangle, given the base and height. As shown in Figure 1, students have an initial error rate of around 12%. After 18 practice opportunities, the error rate drops only to 8%. The average number of practice opportunities per student is 10. Spending this much practice on an easy skill is not a good use of student time. Reducing the amount of practice for this skill may save student time without compromising performance. Other over-practiced KCs include square-area and parallelogram-area. On the other hand, some difficult (i.e. low βj) KCs with high learning rates (i.e. high γj) do not receive enough practice. Trapezoid-area is an example in this unit: students received at most 6 practice opportunities. Its initial error rate is 76%, and by the end of the 6th opportunity the error rate remains as high as 40%, far from the level of mastery. More practice on this KC is needed for students to reach mastery. Other under-practiced KCs include pentagon-area and triangle-area.
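The contrast between the two KCs can be made concrete with a back-of-envelope calculation. The sketch below converts the error rates quoted above into log odds of a correct answer and divides the change by the number of practice opportunities quoted. This is only an illustration: it assumes the quoted figures lie on the model's curve and averages over student differences, and the function names are hypothetical. The resulting numbers are not the fitted LFA estimates.

```python
import math


def log_odds_correct(error_rate):
    """Log odds of a correct answer for a given error rate."""
    p = 1.0 - error_rate
    return math.log(p / (1.0 - p))


def implied_gain(initial_error, final_error, opportunities):
    """Average log-odds gain per opportunity implied by two error rates."""
    change = log_odds_correct(final_error) - log_odds_correct(initial_error)
    return change / opportunities


# Error rates and opportunity counts as quoted in the text.
rectangle_gain = implied_gain(0.12, 0.08, 18)
trapezoid_gain = implied_gain(0.76, 0.40, 6)

if __name__ == "__main__":
    print(f"rectangle-area: {rectangle_gain:.3f} log-odds units per opportunity")
    print(f"trapezoid-area: {trapezoid_gain:.3f} log-odds units per opportunity")
```

Under these assumptions the implied gain is roughly 0.025 per opportunity for rectangle-area and roughly 0.26 for trapezoid-area, about a tenfold difference, which is in line with the low-γj versus high-γj description above.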

Having students practice less than needed is clearly undesirable in the curriculum. But is over-practice necessary? The old idiom “practice makes perfect” suggests that the more we practice a skill, the better we can apply it. Many teachers believe that giving students more practice problems is beneficial and “would like to have the students work on more practice problems”, even when “[students] were not making any mistakes and were progressing through the tutor quickly” [7]. We believe that more practice is necessary if teachers want more problems so that their students can practice unmastered KCs or useful KCs not covered by the curriculum. More practice is also necessary to support long-term retention of a KC, but it needs to be spread out on an optimal schedule [1, 14]. In the rectangle-area example, where all the practice for the KC is allocated in a short period, more practice becomes over-practice, which is unnecessary once the KC is mastered.

Dependent variables

Normal post-test

Long-term retention


Independent variables

Hypothesis

Findings

Explanation

Knowledge component hypothesis

Descendants

Annotated bibliography

  • Cen, H., Koedinger, K. R., & Junker, B. (2006). Learning Factors Analysis: A general method for cognitive model evaluation and improvement. In M. Ikeda, K. D. Ashley, T.-W. Chan (Eds.) Proceedings of the 8th International Conference on Intelligent Tutoring Systems, 164-175. Berlin: Springer-Verlag.