Difference between revisions of "Chi - Induction of Adaptive Pedagogical Tutorial Tactics"

== Project Overview ==
This project will address goal 3 of the CMDM thrust and, in particular, investigate the application of a general data-driven methodology, Reinforcement Learning (RL), to derive adaptive pedagogical tutorial tactics directly from pre-existing interaction data. More specifically, this project is designed to: 1) help computer tutors employ effective, adaptive pedagogical tutorial tactics; 2) test the viability of using RL, especially POMDPs, to induce pedagogical tactics; 3) show that pedagogical tutorial tactics are a potential source of learning power for computer tutors to improve students' learning; and 4) explore the underlying causes of the effectiveness of the induced pedagogical tactics.
In designing e-learning environments that effectively support student learning, one faces many decisions about how the system should interact with the student at any given point. For any form of e-learning environment, the system's behavior can be viewed as a sequential decision process in which, at each discrete step, the system is responsible for selecting the next action to take. Each of these system decisions affects the user's subsequent actions and performance. Pedagogical strategies are defined as policies for deciding the system's next action when multiple actions are available. It is often unclear how to make each of these decisions effectively, because its impact on learning often cannot be observed immediately and the effectiveness of one decision also depends on the effectiveness of subsequent decisions. Ideally, an effective learning environment should craft and adapt its actions to the user's needs. However, there is no well-established theory on how to make these system decisions effectively. Typically, system designers design pedagogical strategies by hand and have to make many nontrivial design choices. It is also often difficult to evaluate these hand-coded rules, as their performance depends on a number of factors, such as the content difficulty, the student's incoming competence, and the system's usability.
 
  
One form of highly interactive e-learning environment that lies at the center of our interest is the Intelligent Tutoring System (ITS). Existing ITSs typically employ hand-coded pedagogical rules that seek to implement existing cognitive or instructional theories. These theories may or may not have been well evaluated. For example, in both the CTAT \cite{Anderson1995,koedingerintelligent1997} and Andes \cite{AndesJAIED2005} systems, help is provided upon request because it is assumed that students know when they need help and will only process help when they desire it. Research on gaming, however, has raised some doubts about this assumption by showing that students sometimes exploit these mechanisms for shallow gains, thus voiding the value of the help \cite{DBLP:conf/chi/BakerCKW04,DBLP:conf/its/BakerCK04}.
  
  
 

Revision as of 20:42, 31 August 2010

Project Overview

The goal of this project is to investigate the application of Reinforcement Learning (RL) to derive adaptive pedagogical strategies directly from pre-existing interaction data. Pedagogical strategies are policies for deciding the system's next action when multiple actions are available. More specifically, this project is designed to: 1) help computer tutors employ effective, adaptive pedagogical policies; 2) test the viability of using RL, especially partially observable Markov decision processes (POMDPs), to induce pedagogical policies; 3) show that pedagogical policies are a potential source of learning power for computer tutors to improve students' learning; and 4) explore the underlying causes of the effectiveness of the induced policies.

For any form of learning environment, including Intelligent Tutoring Systems (ITSs), the system's behavior can be viewed as a sequential decision process in which, at each discrete step, the system is responsible for selecting the next action to take. Each of these system decisions affects the user's subsequent actions and performance. It is unclear how to make each decision effectively, because its impact on learning often cannot be observed immediately and the effectiveness of one decision also depends on the effectiveness of subsequent decisions. Ideally, an effective learning environment should craft and adapt its actions to users' needs. However, there is no well-established theory on how to make these system decisions effectively. Most existing ITSs, for example, either employ fixed pedagogical policies that provide little adaptability or employ hand-coded pedagogical rules that seek to implement existing cognitive or instructional theories. These theories may or may not have been well evaluated.

In this project, we apply RL to improve the effectiveness of an ITS by inducing pedagogical policies directly from pre-existing student-computer interactivity data. More specifically, we focus on two types of tutorial decisions: Elicit vs. Tell (ET) and Justify vs. Skip-Justify (JS). When making ET decisions, the tutor decides whether to elicit the next step from the student or to tell them the step directly. JS decisions address points where the tutor may optionally ask students to justify an answer they have given or an entry they have made. Neither type of decision is well understood: there are many theories but no widespread consensus on how or when each action should be taken. We therefore apply RL, and evaluate its use, to induce pedagogical tutorial tactics from pre-existing interactivity data.
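
As a rough illustration of what inducing an ET policy from logged data could look like, the following minimal sketch estimates a small tabular model from (state, action, reward, next state) records and solves it with value iteration. It is not the project's method: it assumes a fully observable MDP rather than a POMDP, and the state names, rewards, discount factor, and the function `induce_policy` are all hypothetical choices made for this example.

```python
from collections import defaultdict

# The two tutor actions of an Elicit vs. Tell (ET) decision.
ACTIONS = ("elicit", "tell")


def induce_policy(transitions, gamma=0.9, sweeps=200):
    """Estimate a tabular MDP from logged tuples and solve it by value iteration.

    transitions: iterable of (state, action, reward, next_state) records.
    Returns (policy, values); states with no logged action get no policy entry.
    """
    counts = defaultdict(lambda: defaultdict(int))  # (s, a) -> {s_next: count}
    reward_sum = defaultdict(float)                 # (s, a) -> summed reward
    visits = defaultdict(int)                       # (s, a) -> number of records
    states = set()
    for s, a, r, s_next in transitions:
        counts[(s, a)][s_next] += 1
        reward_sum[(s, a)] += r
        visits[(s, a)] += 1
        states.update((s, s_next))

    def action_values(s, values):
        """Return [(q, action)] for every action observed in state s."""
        scored = []
        for a in ACTIONS:
            n = visits.get((s, a), 0)
            if n == 0:
                continue  # never observed in the log, so it cannot be evaluated
            expected_next = sum(c / n * values[s2]
                                for s2, c in counts[(s, a)].items())
            scored.append((reward_sum[(s, a)] / n + gamma * expected_next, a))
        return scored

    values = {s: 0.0 for s in states}
    for _ in range(sweeps):
        updated = {}
        for s in states:
            scored = action_values(s, values)
            updated[s] = max(q for q, _ in scored) if scored else 0.0
        values = updated

    policy = {}
    for s in states:
        scored = action_values(s, values)
        if scored:
            policy[s] = max(scored)[1]
    return policy, values


# Hypothetical log: "low"/"high" stand for coarse student-competence states,
# "done" ends the session, and rewards stand in for a learning-gain measure.
log = [
    ("low", "elicit", 0.0, "low"),
    ("low", "tell", 0.0, "high"),
    ("high", "elicit", 1.0, "done"),
    ("high", "tell", 0.2, "done"),
]
policy, values = induce_policy(log)
```

In this toy log, telling in the "low" state earns no immediate reward but leads to a state where eliciting pays off, which is the kind of delayed effect that makes individual tutorial decisions hard to evaluate in isolation.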

Planned accomplishments for PSLC Year 6

1. Develop code and human-computer interfaces for applying, comparing and interpreting cognitive model discovery algorithms across multiple data sets in DataShop. We will document processes for how the algorithms, like LFA, combine automation and human input to discover or improve cognitive models of specific learning domains.
2. Demonstrate the use of the model discovery infrastructure (#1) for at least two discovery algorithms applied to at least 4 DataShop data sets. We will target at least one math (Geometry area and/or Algebra equation solving), one science (Physics kinematics), and one language (English articles) domain.
3. For at least one of these data sets, work with associated researchers to perform a “close the loop” experiment whereby we test whether a better cognitive model leads to better or more efficient student learning.

Integrated Research Results and High Profile Publication

Establishing that cognitive models of academic domain knowledge in math, science, and language can be discovered from data would be an important scientific achievement. The achievement will be greater to the extent that the discovered models involve deep or integrative knowledge components not directly apparent in surface task structure (e.g., model discovery in the Geometry area domain isolated a problem decomposition skill). The statistical model structure of competing discovery algorithms promises to shed new light on the nature or extent of regularities or laws of learning, such as the power or exponential shape of learning curves, whether the complexity of task behavior is due to human or domain characteristics (the ant on the beach question), and whether or not there are systematic individual differences in student learning rates. We expect that integrative results of this project can be published in high-profile general journals (e.g., Science or Nature), in more specific technical journals (e.g., Machine Learning or JMLR), or in psychological journals (e.g., Cognitive Science or Learning Science).

Year 6 Project Deliverables

  • Develop code and human-computer interfaces for applying, comparing and interpreting cognitive model discovery algorithms across multiple data sets in DataShop.
  • Demonstrate the use of the model discovery infrastructure for at least two discovery algorithms applied to at least 4 DataShop data sets.
  • For at least one of these data sets, work with associated researchers to perform a “close the loop” experiment whereby we test that a better cognitive model leads to better or more efficient student learning.

6th Month Milestone

By March 2010, we will 1) be able to run the LFA algorithm on PSLC data sets from the DataShop web services, 2) have run model discovery using at least one algorithm on at least two data sets, and 3) have designed and ideally run the close-the-loop experiment.