Application of SimStudent for Error Analysis

From LearnLab
Revision as of 21:16, 14 May 2009 by Nmatsuda (talk | contribs) (Towards a theory of learning errors)

Towards a theory of learning errors

Personnel

  • PI: Noboru Matsuda
  • Key Faculty: William W. Cohen, Kenneth R. Koedinger

Abstract

The purpose of this project is to study how students learn errors from examples. We apply a computational model of learning called SimStudent, which learns cognitive skills inductively either from worked-out examples or by being tutored. In this study, we use SimStudent to study how and when erroneous skills (skills that produce errors when applied) are learned.

We are particularly interested in how differences in prior knowledge affect the nature and rate of learning. We hypothesize that when students rely on shallow, domain-general features (which we call "weak" features) as opposed to deep, more domain-specific features ("strong" features), they are more likely to make induction errors.

To test this hypothesis, we give SimStudent different sets of prior knowledge and analyze learning outcomes.

Overview of SimStudent

The fundamental technology underlying SimStudent is Inductive Logic Programming (Muggleton, 1999), applied to programming by demonstration (Cypher, 1993). Prior to learning, SimStudent is given a set of operators and feature predicates as prior knowledge.

A feature predicate is a Boolean function that tests for the existence of a certain feature. For example, isPolynomial("3x+1") returns true, but isConstantTerm("3x") returns false. An operator, on the other hand, is a more generic function that manipulates the various forms of objects involved in a target task. For example, addTerm("3x", "2x") returns "5x" and getCoefficient("-4y") returns "-4".
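The distinction between feature predicates and operators can be illustrated with a minimal Python sketch. This is not SimStudent's actual implementation: the snake_case names mirror the functions mentioned above, while the term format, the helper parse_term, and the reading of "polynomial" as "two or more simple terms" are assumptions made only for illustration.

```python
import re

# Assumed term format: optional sign, optional integer coefficient,
# optional single-letter variable (e.g. "3x", "-4y", "+1").
_TERM = re.compile(r"^([+-]?)(\d*)([a-z]?)$")


def parse_term(term):
    """Hypothetical helper: split a term into (coefficient, variable)."""
    match = _TERM.match(term.replace(" ", ""))
    if match is None or not (match.group(2) or match.group(3)):
        raise ValueError(f"not a simple term: {term!r}")
    sign, digits, variable = match.groups()
    coefficient = int(digits) if digits else 1
    if sign == "-":
        coefficient = -coefficient
    return coefficient, variable


def is_constant_term(term):
    """Feature predicate: true when the term has no variable."""
    try:
        return parse_term(term)[1] == ""
    except ValueError:
        return False


def is_polynomial(expr):
    """Feature predicate: true for a sum of two or more simple terms (assumed definition)."""
    compact = expr.replace(" ", "")
    parts = re.findall(r"[+-]?[^+-]+", compact)
    if len(parts) < 2 or "".join(parts) != compact:
        return False
    try:
        for part in parts:
            parse_term(part)
    except ValueError:
        return False
    return True


def add_term(left, right):
    """Operator: add two like terms and return the resulting term as a string."""
    left_coeff, left_var = parse_term(left)
    right_coeff, right_var = parse_term(right)
    if left_var != right_var:
        raise ValueError("terms are not like terms")
    return f"{left_coeff + right_coeff}{left_var}"


def get_coefficient(term):
    """Operator: return the coefficient of a term as a string."""
    return str(parse_term(term)[0])
```

Predicates only answer yes or no about an expression, whereas operators return a new value that a learned skill can use when deciding what to do.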

To learn cognitive skills, SimStudent generalizes from examples of individual skill applications. Two types of examples must be given to SimStudent: (1) positive examples that show when to apply a particular skill, and (2) negative examples that show when not to apply a particular skill.

Positive examples are acquired from (1) steps demonstrated in worked-out examples, (2) steps demonstrated as hints during tutoring, and (3) steps performed correctly by SimStudent itself during tutoring. In each case, the context of the skill application (i.e., the problem status) is stored as a positive example for that particular skill.

Negative examples are acquired either when (1) a positive example is generated, or (2) SimStudent makes an error during tutoring. When a positive example is generated for a certain skill, say S, the example also becomes a negative example for all skills other than S. Such an example is called an implicit negative example. An implicit negative example for a skill becomes a positive example if that skill is later applied in the same situation.
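The bookkeeping for positive, implicit negative, and error-driven negative examples can be sketched as follows. The class ExampleStore, its method names, and the skill names used in the usage note are hypothetical; problem states are represented as plain strings purely for illustration.

```python
from collections import defaultdict


class ExampleStore:
    """Hypothetical container for per-skill positive and negative examples."""

    def __init__(self, skills):
        self.skills = set(skills)
        self.positive = defaultdict(set)
        self.implicit_negative = defaultdict(set)
        self.explicit_negative = defaultdict(set)

    def add_positive(self, skill, state):
        """Record a demonstrated or correctly performed step for `skill`."""
        self.skills.add(skill)
        self.positive[skill].add(state)
        # An implicit negative is promoted once the skill is applied there.
        self.implicit_negative[skill].discard(state)
        # The same state becomes an implicit negative for every other skill,
        # unless that skill already has it as a positive example.
        for other in self.skills - {skill}:
            if state not in self.positive[other]:
                self.implicit_negative[other].add(state)

    def add_error(self, skill, state):
        """Record that applying `skill` in `state` was flagged as an error."""
        self.explicit_negative[skill].add(state)

    def negatives(self, skill):
        """All negative examples currently held for `skill`."""
        return self.implicit_negative[skill] | self.explicit_negative[skill]
```

For instance, after `store = ExampleStore(["subtract-constant", "divide-coefficient"])` and `store.add_positive("subtract-constant", "2x+3=7")`, the state "2x+3=7" is an implicit negative for "divide-coefficient". A later `store.add_positive("divide-coefficient", "2x+3=7")` removes it from that skill's negatives.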

Given a set of positive and negative examples for a skill, SimStudent generates a hypothesis (in the form of a production rule) representing when and how to apply the skill. The hypothesis is generated so that it applies to all positive examples and none of the negative examples.
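The consistency requirement (cover every positive example, cover no negative example) can be sketched with a brute-force search over conjunctions of feature predicates. This is a simplified stand-in, not the Inductive Logic Programming procedure SimStudent uses, and it covers only the "when to apply" part of a rule. The function learn_precondition and the feature names in FEATURES are hypothetical.

```python
from itertools import combinations


def learn_precondition(features, positives, negatives):
    """Return the smallest conjunction of feature names that holds for every
    positive example and fails for every negative example, or None."""
    # Only features true of all positives can be part of the conjunction.
    candidates = [
        name for name, test in features.items()
        if all(test(example) for example in positives)
    ]
    for size in range(len(candidates) + 1):
        for combo in combinations(candidates, size):
            covers_a_negative = any(
                all(features[name](example) for name in combo)
                for example in negatives
            )
            if not covers_a_negative:
                return combo
    return None


# Illustrative features over equations written as strings (assumed encoding).
FEATURES = {
    "lhs_has_digit": lambda eq: any(c.isdigit() for c in eq.split("=")[0]),
    "lhs_has_plus": lambda eq: "+" in eq.split("=")[0],
    "lhs_starts_with_digit": lambda eq: eq[0].isdigit(),
}

rule = learn_precondition(
    FEATURES,
    positives=["3x=6", "5y=10"],
    negatives=["x+2=6", "x/3=5"],
)
print(rule)  # ('lhs_starts_with_digit',)
```

With these examples, "lhs_has_digit" alone is rejected because it also holds for both negative examples, so the search settles on "lhs_starts_with_digit". Which features are available to the search determines which hypotheses can be formed at all.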

Background and Significance

A number of models of student errors have been proposed (Brown & Burton, 1978; Langley & Ohlsson, 1984; Sleeman, Kelly, Martinak, Ward, & Moore, 1989; Weber, 1996; Young & O'Shea, 1981). Our effort builds on this past work by exploring how differences in prior knowledge affect the nature of the incorrect skills acquired and the errors derived. We are particularly interested in errors that are made by applying incorrect skills, and our computational model explains the process of learning such incorrect skills as incorrect induction from examples.

We hypothesize that incorrect generalizations are more likely when students have weaker, more general prior knowledge for encoding incoming information. This knowledge is typically perceptually grounded and is in contrast to deeper or more abstract encoding knowledge. An example of such perceptually grounded prior knowledge is recognizing the 3 in x/3 simply as a number instead of as a denominator. Such an interpretation might lead students to learn an inappropriate generalization such as "multiply both sides by a number in the left-hand side of the equation" after observing that x/3=5 becomes x=15. If this generalization is applied to an equation like 4x=2, the error of multiplying both sides by 4 is produced.
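The x/3=5 example can be traced in a small sketch. Both encodings below explain the demonstrated step (multiply by 3), but only the weak one transfers incorrectly to 4x. The functions numbers_in, denominators_in, and induce_multiplier_rule are hypothetical illustrations, not part of SimStudent.

```python
import re


def numbers_in(lhs):
    """Weak encoding: every number that appears in the left-hand side."""
    return [int(n) for n in re.findall(r"\d+", lhs)]


def denominators_in(lhs):
    """Strong encoding: only numbers that appear as a denominator."""
    return [int(n) for n in re.findall(r"/(\d+)", lhs)]


def induce_multiplier_rule(encode, lhs, multiplier):
    """Generalize one demonstration into a rule that picks a multiplier.

    Returns None when the encoding cannot explain where the demonstrated
    multiplier came from.
    """
    if multiplier not in encode(lhs):
        return None

    def rule(new_lhs):
        found = encode(new_lhs)
        return found[0] if found else None  # None means the rule does not fire

    return rule


# Demonstration: x/3=5 becomes x=15, i.e. both sides were multiplied by 3.
weak_rule = induce_multiplier_rule(numbers_in, "x/3", 3)
strong_rule = induce_multiplier_rule(denominators_in, "x/3", 3)

print(weak_rule("4x"))    # 4: the rule fires and multiplies both sides by 4
print(strong_rule("4x"))  # None: no denominator, so the rule does not fire
```

Both rules agree on the demonstrated equation, so the single worked example cannot tell them apart. The difference only shows up on a new problem such as 4x=2.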

We call this type of perceptually grounded prior knowledge "weak" prior knowledge, in a sense similar to Newell and Simon’s weak reasoning methods (1972). Weak knowledge can apply across domains and can yield successful results prior to domain-specific instruction. However, in contrast to "strong" domain-specific knowledge, weak knowledge is more likely to lead to incorrect conclusions.

Research Question

Hypothesis

Study Variables

Independent Variable

Prior knowledge: implemented as "operators" and "feature predicates" for SimStudent.

Dependent Variables

Findings

Impact of having "weak" prior knowledge in learning errors

Publications

  • Matsuda, N., Lee, A., Cohen, W. W., & Koedinger, K. R. (2009; to appear). A Computational Model of How Learner Errors Arise from Weak Prior Knowledge. In Conference of the Cognitive Science Society.

References

  • Booth, J. L., & Koedinger, K. R. (2008). Key misconceptions in algebraic problem solving. In B. C. Love, K. McRae & V. M. Sloutsky (Eds.), Proceedings of the 30th Annual Conference of the Cognitive Science Society (pp. 571-576). Austin, TX: Cognitive Science Society.
  • Muggleton, S. (1999). Inductive Logic Programming: Issues, results and the challenge of Learning Language in Logic. Artificial Intelligence, 114(1-2), 283-296.
  • Cypher, A. (Ed.). (1993). Watch what I do: Programming by Demonstration. Cambridge, MA: MIT Press.