Difference between revisions of "REAP Study on Personalization of Readings by Topic (Fall 2006)"

From LearnLab
(Findings)
(Explanation)
 
(47 intermediate revisions by 2 users not shown)
Line 1: Line 1:
 
== REAP Study on Personalization of Readings for Increased Interest ==
 
== REAP Study on Personalization of Readings for Increased Interest ==
 
   
 
   
 +
=== Logistical Information ===
 +
 +
{| border="1"
 +
|+
 +
|-
 +
| '''Contributors''' || Maxine Eskenazi, Alan Juffs, Michael Heilman, Kevyn Collins-Thompson, Lois Wilson, Jamie Callan 
 +
|-
 +
| '''Study Start Date''' || September 11, 2006 
 +
|-
 +
| '''Study End Date''' || November 21, 2006 
 +
|-
 +
| '''Learnlab Courses''' || English Language Institute Reading 4 (ESL LearnLab)
 +
|-
 +
| '''Number of Students''' || 35
 +
|-
 +
| '''Total Participant Hours (est.)''' || 270
 +
|-
 +
| '''Data in Datashop''' || no
 +
|}
 +
 
=== Abstract ===
 
=== Abstract ===
  
Intrinsic motivation and personal interest have been shown to lead to deeper processing, the use of learning [[strategies]], and better learning outcomes (Lepper, 1988). Therefore, it seems reasonable to try to increase interest in a tutoring environment. Previous studies have shown both positive (Cordova & Lepper, 1997) and negative (Clark & Mayer, 2003) effects of attempts to increase interest.  Clark and Mayer (2003) state that adding interesting but irrelevant material may distract or disrupt the learner.
+
In this work, the term “[[personalization]]” refers to the selection of practice readings in order to match a student’s interests.
 +
 
 +
During each training session with the REAP Tutor, students work through a series of readings, each of which is followed by
 +
practice exercises for the target words in the reading. While reading a passage, students are able to access
 +
dictionary definitions for any word in a reading either by clicking on a highlighted target word or by typing a
 +
word into a box in the lower-left corner of the screen. The target words in the readings are also highlighted
 +
because highlighting may increase the use of dictionary definitions, thus encouraging students to
 +
coordinate multiple sources of information about a word’s meaning—namely, the implicit examples from context around
 +
words and the explicit generalizations in the definitions of words (as exemplified in the figure below).
  
This study investigated the effects of personalizing practice readings by topic.  The REAP tutor for ESL vocabulary can prefer practice readings that match personal interests as specified in a questionnaire.  However, if this preference is included as a factor in reading selection, then other factors, such as the density of practice opportunities in a reading, are necessarily given less weight.  Finding a reading for a student is like searching for an optimal point in a multidimensional space.  In practice, the tutor never finds optimal readings, but must weigh different factors against each other (e.g., reading difficulty, length, density of practice opportunities, etc.).  Adding personalization as another factor may significantly affect the tutor's ability to find readings of high quality according to other factors.
+
[[Image:Reap context definition.jpg|500px]]
  
This study investigated the use of personalization of readings by topic as an attempt to reduce gaming and increase the likelihood that students would more deeply process the context around target words, rather than just access definitions for them. As such, this study primarily addresses the category of passive, implicit instruction in the following table of types of instruction and learning.
+
A problem discovered in past studies with REAP is that many students spend only a brief amount of time
 +
on a reading and do not deeply process the text. Students often only read the dictionary definition for target
 +
words rather than attempting to process the entire context around the words. Inferring the meaning of
 +
vocabulary from context is a seemingly important strategy that is not used by such students. This behavior is likely due to a desire to perform well on post-reading practice exercises and post-test, which can be viewed as forms of extrinsic motivation. Intrinsically
 +
motivated students who are more interested in a reading are more likely to read the entire text and to use
 +
context to learn the meaning of unknown vocabulary. Therefore, [[personalization]] that increases intrinsic
 +
motivation could lead to deeper processing of context and better learning of vocabulary.
  
 
{| border="1"
 
{| border="1"
Line 14: Line 48:
 
| ||'''Passive''' || '''Active''' || '''Interactive'''
 
| ||'''Passive''' || '''Active''' || '''Interactive'''
 
|-
 
|-
| '''Explicit (general)''' || ||  ||  
+
| '''Explicit (general)''' || Dictionary Definitions ||  || Practice Exercises
 
|-
 
|-
| '''Implicit (instance)''' || Interpreting meaning in context while reading || ||  
+
| '''Implicit (instance)''' || Interpreting meaning in context while reading || Sentence Production (assessment) || Practice Exercises
 
|}
 
|}
  
Line 24: Line 58:
  
 
''Extrinsic Motivation:'' Motivation to learn in order to satisfy an external goal, such as completing a task or passing an assessment.
 
''Extrinsic Motivation:'' Motivation to learn in order to satisfy an external goal, such as completing a task or passing an assessment.
 
''Personalization:'' The preference of practice materials to match the personal interests of the learner.  In this context, this term is used as it is in information retrieval rather than as in Mayer's work (Clark & Mayer, 2003) where it means using casual, direct language.
 
  
 
=== Research question ===
 
=== Research question ===
  
Do the benefits of personalization of practice readings by topics of interest outweigh the costs in a tutoring system for ESL vocabulary practice?
+
Does [[personalization]] of practice readings to match students' personal interests increase ESL vocabulary learning?
 
+
  
 
=== Dependent variables ===
 
=== Dependent variables ===
Immediate post-test scores for practiced words.
+
[[Normal post-test]] scores  
 
+
Number of words practiced
+
  
Overall post-test scores (essentially a product of the previous two)
+
[[Normal post-test]] scores for practiced words only
  
[[Long-term retention]] test scores similar to post-test but administered months later.
+
[[Long-term retention]] test scores, same post-test but administered months later.
  
[[Transfer]] of knowledge: sentence production tasks for target words, correct use of words in writing assignments for other courses.
+
Evidence of [[Transfer]]: sentence production tasks for target words, correct use of words in writing assignments for other courses.
  
 
=== Independent variables ===
 
=== Independent variables ===
Personalization of readings by topics of interest.  In the control condition, the tutor did not use potential personal interest as a factor in its selection of reading materials.  In the treatment condition, the tutor did use interest as a factor.  All other selection criteria were the same in both conditions.  Time on task was also the same.
+
[[Personalization]] of readings by topics of interest.  In the control condition, the tutor did not use potential personal interest as a factor in its selection of reading materials.  In the treatment condition, the tutor did use interest as a factor.  All other selection criteria were the same in both conditions.  Time on task was also the same.
  
 
=== Hypotheses ===
 
=== Hypotheses ===
  
Since intrinsic motivation seems to be important in language learning, the benefits of personalization will outweigh the costs.
+
Since intrinsic motivation seems to be important in language learning, the benefits of [[personalization]] will outweigh the costs.
  
 
=== Findings ===
 
=== Findings ===
  
  
 +
First, personalization to match interests can lead to improved learning of the relevant knowledge components in a tutoring environment for vocabulary learning.  Students in the treatment group correctly answered a higher proportion of questions on target words that were practiced in the REAP tutor.
  
The interest level of students using the REAP tutor is evident in their response to a questionnaire following each reading that asked them to indicate on a scale from one to five their interest in the preceding text.  The distributions of post-reading interest ratings for students in the treatment and control conditions are shown in Figures 1 and 2.
+
Second, personalization can compromise domain-based goals.  In the REAP tutor, an important domain-based goal is to give the student practice opportunities for many new target words.  However, students receiving personalization practiced fewer target words.  The difficulty in achieving the domain-based goal of practicing many unknown words is due to the fact that the REAP tutor often could not find texts that included multiple target words and also matched personal interests.
  
Figures 1&2:
+
Third and finally, if the challenges of negotiating personalization and domain-based goals are met, personalization can lead to improvements in overall learning.  Students with personalization appeared to learn the words they practiced with greater frequency but practiced fewer target words, and as a result did not perform reliably differently than their controls on the overall post-test measure for cloze questions.  The researchers attributed this lack of a difference to the fact that, in many cases, the tutor had to choose between interesting readings and those with more practice opportunities.  However, the availability of readings that are both interesting and provide ample practice is a technical issue which can be solved in a straightforward manner by increasing the size and coverage of the corpus of available practice reading passages.
[[Image:Interest_combined.PNG|700px]]
+
  
 +
=== Explanation ===
 +
 +
Students in the treatment condition with [[personalization]] performed slightly better on average (M=35.5%, SD=14.9%) in terms of overall post-test scores compared to students in the control condition (M=27.1%, SD=17.2%).  However, this difference was not statistically significant. 
 +
 +
[[Image:graph40.jpg|500px]]
 +
 +
There is evidence that the difference in post-test scores is due to increased interest leading to deeper processing of the reading practice texts.
 +
 +
Responses to questionnaires following each reading show the interest level of students using the REAP tutor.  The questionnaires asked students to indicate on a scale from one to five their interest in the preceding text.  The distributions of post-reading interest ratings for students in the treatment and control conditions are shown below.
 +
 +
[[Image:Interest_combined.jpg|700px]]
  
 
Students were also given an exit survey during their last week of practice with the tutor that asked them, among other questions, to indicate whether they agreed with the statement, “Most of the readings were interesting.”  The ratings were on a scale from one to five, with five indicating strong agreement and one indicating strong disagreement.  Exit survey interest ratings by students in the treatment condition were significantly higher (p<0.05) than the ratings by students in the control condition.  The mean response for students who received personalized readings was 3.18, while it was 2.65 for students in the control condition.
 
Students were also given an exit survey during their last week of practice with the tutor that asked them, among other questions, to indicate whether they agreed with the statement, “Most of the readings were interesting.”  The ratings were on a scale from one to five, with five indicating strong agreement and one indicating strong disagreement.  Exit survey interest ratings by students in the treatment condition were significantly higher (p<0.05) than the ratings by students in the control condition.  The mean response for students who received personalized readings was 3.18, while it was 2.65 for students in the control condition.
  
 +
Further analysis of post-test scores reveals that students did learn more of the words that they actually practiced in REAP.  The post-test contained 40 questions for target vocabulary words.  Many of the students did not practice 40 words, so performance on practiced words alone was analyzed.  Students in the treatment condition scored higher (N=16, M=50.3, SD=20.1) on questions for words seen in readings than did students in the control condition (N=19, M=32.4, SD=18.9).  A two-tailed t-test for independent means verified that this result is statistically significant (t=2.719, df=33, p=0.005).  The difference of scores between the two groups was 17.9% (95% CI = 4.5%, 31.3%), which corresponds to a large effect size of 0.85.  This result indicates that [[personalization]] improved learning for the words that students saw in readings, which is in line with previous findings that intrinsic motivation leads to improved learning.
  
The effect of personalization on student learning was measured by post-test cloze questions for vocabulary words.  Two measurements were made for each group.  First, the percentage correct of post-test questions was measured for target words that had appeared in the readings for each student.  Second, the overall post-test score was measured.  Additionally, student progress through the curriculum was measured for each student by the total number of words from his or her individual target vocabulary list that had appeared in at least one reading.
 
Students in the treatment condition scored higher (N=16, M=50.3, SD=20.1) on questions for words seen in readings than did students in the control condition (N=19, M=32.4, SD=18.9).  A two-tailed t-test for independent means verified that this result is statistically significant (t=2.719, df=33, p=0.005).  The difference of scores between the two groups was 17.9% (95% CI = 4.5%, 31.3%), which corresponds to a large effect size of 0.85.  This result indicates that personalization improved learning for the words that students saw in readings, which is in line with previous findings that intrinsic motivation leads to improved learning.
 
  
Also, students who were in the treatment condition spent slightly (though not significantly) longer on each reading. Students in the treatment group scored higher on post-reading reading-check questions aimed at verifying that the student actually read the text, rather than just accessing definitions for highlighted target words, which was a gaming behavior witnessed in previous studies.  The reading-check questions were multiple-choice questions of the form, "Which set of words occurred in the passage?" The correct answer contained only salient words (defined by the tf.idf measure from information retrieval) that appeared in the text. Distractors contained some salient words from the text, but also words that were not in the text.  There is some evidence from REAP studies that performance on this type of question correlates with post-test vocabulary scores (which are unrelated to the content of readings).  Thus, it seems that the students in the treatment group were processing the context around the target words to a greater degree.  However, the difference in reading-check question performance is only marginally significant (2-sided independent samples t-test, p<0.1).
+
[[Image:post_just_practice.jpg|400px]]
  
Figure 3
+
However, students in the treatment condition that included [[personalization]] saw fewer words in their training sessions (N=16, M=12.0, SD=1.13) than students in the control condition (N=19, M=16.3, SD=0.87) (t=-2.9, df=33, p=0.006).  Average time on task was essentially the same for students in both conditions.  Students in the treatment condition spent slightly longer on each reading.  The main reason, however, for the difference in the average total number of words practiced was that students for whom the tutor provided personalized instruction saw fewer words (M=3.41, SD=0.55) per practice reading passage than students in the control condition (M=4.07, SD=0.83) (t=2.929, df=33, p=0.006).
  
[[Image:Readingcheck0.PNG|350px]]
+
Thus, when the tutor used [[personalization]] as a factor in the selection of readings, it chose readings that were less valuable according to other factors. Specifically, this result shows that by personalizing instruction, the tutor was not able to provide practice for as many words.  Of course, the practice that it did provide was better, as is shown in the previous result that for words students did practice, [[personalization]] appeared to increase learning.
  
 +
The reduced number of target words per text with personalization is a technical issue which can be avoided in a straightforward manner by increasing the size of the database of readings.  With more readings, the tutor can find texts that both have ample target words and cover topics of personal interest.
  
  
However, students in the treatment condition that included personalization saw fewer words in their training sessions (N=16, M=12.0, SD=1.13) than students in the control condition (N=19, M=16.3, SD=0.87) (t=-2.9, df=33, p=0.006).  Average time on task was essentially the same for students in both conditions.  Students in the treatment condition spent slightly longer on each reading.  The main reason, however, for the difference in the average total number of words practiced was that students for whom the tutor provided personalized instruction saw fewer words (M=3.41, SD=0.55) per practice reading passage than students in the control condition (M=4.07, SD=0.83) (t=2.929, df=33, p=0.006).
+
[[Image:words_per_reading.jpg|400px]]
Thus, when the tutor used personalization as a factor in the selection of readings, it chose readings that were less valuable according to other factors.  Specifically, this result shows that by personalizing instruction, the tutor was not able to provide practice for as many words.  Of course, the practice that it did provide was better, as is shown in the previous result that for words students did practice, personalization appeared to increase learning.
+
  
Figures 4&5:
+
There is a possibility that the students in the treatment condition who were seeing fewer words in each reading were learning more of the words simply because they had fewer to learn per reading. To rule out this hypothesis, multiple linear regression analyses were conducted with overall post-test performance and performance for practiced words as the dependent variables.  In both regression analyses, the number of target words per reading was not a significant predictor of performance.  In fact, the number of target words per document was slightly positively correlated with post-test performance in both cases.  This result seems to rule out the possibility that students were learning more target words in the treatment condition because they were seeing fewer words.
[[Image:Graph combined.PNG|700px]]
+
  
Despite practicing fewer words, students in the treatment condition with personalization still performed better on average (M=35.5%, SD=14.9%) in terms of overall post-test scores compared to students in the control condition (M=27.1%, SD=17.2%).  The improvement of average overall post-test scores in the treatment condition was only 8.4% (95% CI = -2.8%, 19.5%), which corresponds to a medium effect size of 0.51.  However, this difference is not statistically significant (p=0.14).  Therefore, the null hypothesis that personalization has no effect on overall post-test scores cannot be rejected.
 
  
=== Explanation ===
+
[[Long-term retention]] test results showed no reliable differences because of a small sample size.  The test was administered to students who stayed in the ELI in the subsequent semester, which constituted only a fraction of the original sample.
Although personalization did have positive effects on the learning of words that were practiced with the REAP tutor, the personalization forced the tutor to choose readings that had fewer target words.  The student’s progress through the curriculum was thus impeded by incorporating personalization into the algorithm for selecting tasks to give students.  Overall scores suggest that personalization still improved overall learning as measured by post-test scores for all words tested.  However, the overall effect was smaller and not statistically reliable.
+
In the REAP tutor, attempts to increase interest required meaningful changes to the tasks given to students.  For example, if a student was interested in science, then the tutor had to assign higher values to passages about science in its search for reading material containing the target vocabulary.  This change necessarily meant that sometimes suboptimal tasks with respect to curriculum progress had to be given to students.  Additionally, in choosing readings for specific topics, the tutor was selecting readings with different sets of non-target words, grammatical features, discourse, etc.  Thus, learning opportunities other than the target vocabulary words were affected because deep changes to the task were required for personalization. 
+
By presenting the student with readings of interest, the REAP tutor was potentially introducing extraneous material, which would go against Clark and Mayer’s Coherence Principle (Clark & Mayer, 2003, pp. 111-129).  It is plausible that students would be more interested in the content of the readings than the target vocabulary words in those readings, and the content would distract and disrupt their learning of vocabulary.  In the case of acquiring vocabulary, however, the content of readings was likely not extraneous but important because it provided valuable contextual clues to word meanings.  The goal of providing job-relevant practice to increase transfer and links with prior knowledge (Clark & Mayer, 2003, pp. 156-157) may also play an important role.  The ultimate “job” or goal of the students practicing vocabulary with REAP is to read texts that they are interested in (e.g., material for future courses, employment, or personal enjoyment).  Thus, by motivating students to more closely read texts and by giving them texts related to their personal interests, the REAP tutor is giving students job-relevant practice.
+
Increasing motivation is a challenging task for the developers of tutoring systems.  In the experiment described in this paper, the benefits of personalization appeared to outweigh the costs, but such may not always be the case.  Previous research has indicated that intrinsic motivation has positive effects on learning.  Therefore, personalization and other possible ways of increasing intrinsic motivation should be considered when designing a tutoring system or curriculum for students.  However, attempts to increase interest by adding interesting but extraneous material may distract and disrupt students.  Therefore, changes to increase motivation should be relevant to the immediate task as well as the ultimate goals of the individual student.  However, even when material is altered in a relevant way, incorporating personalization and motivation as a factor for selecting tasks may negatively affect other curriculum goals.  In this study, task-relevant personalization improved learning for target words students saw in reading passages, but it also decreased the number of target vocabulary words per reading.  Overall learning results were improved slightly but not significantly.  In short, ways of increasing the intrinsic motivation of students should be considered, but also weighed against their potential negative effects.
+
  
NOTE: [[Long-term retention]] test results are pending.
+
=== Further Information ===
  
=== Descendents ===
+
The following study addresses a different form of personalization, by which interactions with the learner (e.g., instructions, directions) are conducted using casual and direct rather than formal language:
  
Submitted paper to appear here.
+
[[Stoichiometry_Study | Studying the Learning Effect of Personalization and Worked Examples in the Solving of Stoichiometry Problems (McLaren, Koedinger & Yaron)]]
  
 
=== Annotated bibliography ===
 
=== Annotated bibliography ===
 +
 +
Note: a paper on this study has been submitted to International Journal of Artificial Intelligence in Education.
 +
 +
[http://reap.cs.cmu.edu/Papers/heilman_topic_choice_AIED2007_poster_final.pdf Heilman, M., Juffs, A., & Eskenazi, M. (2007). Choosing Reading Passages for Vocabulary Learning by Topic to Increase Intrinsic Motivation. Proceedings of the 13th International Conference on Artificial Intelligence in Education. Marina del Rey, CA. (poster)]
  
 
Clark, R. C. and Mayer, R. E. (2003). e-Learning and the Science of Instruction.  Jossey-Bass/Pfeiffer.
 
Clark, R. C. and Mayer, R. E. (2003). e-Learning and the Science of Instruction.  Jossey-Bass/Pfeiffer.

Latest revision as of 12:54, 30 May 2008

REAP Study on Personalization of Readings for Increased Interest

Logistical Information

Contributors Maxine Eskenazi, Alan Juffs, Michael Heilman, Kevyn Collins-Thompson, Lois Wilson, Jamie Callan
Study Start Date September 11, 2006
Study End Date November 21, 2006
Learnlab Courses English Language Institute Reading 4 (ESL LearnLab)
Number of Students 35
Total Participant Hours (est.) 270
Data in Datashop no

Abstract

In this work, the term “personalization” refers to the selection of practice readings in order to match a student’s interests.

During each training session with the REAP Tutor, students work through a series of readings, each of which is followed by practice exercises for the target words in the reading. While reading a passage, students are able to access dictionary definitions for any word in a reading either by clicking on a highlighted target word or by typing a word into a box in the lower-left corner of the screen. The target words in the readings are also highlighted because highlighting may increase the use of dictionary definitions, thus encouraging students to coordinate multiple sources of information about a word’s meaning, namely the implicit examples from context around words and the explicit generalizations in the definitions of words (as exemplified in the figure below).

Reap context definition.jpg

A problem discovered in past studies with REAP is that many students spend only a brief amount of time on a reading and do not deeply process the text. Students often only read the dictionary definition for target words rather than attempting to process the entire context around the words. Inferring the meaning of vocabulary from context is a seemingly important strategy that is not used by such students. This behavior is likely due to a desire to perform well on post-reading practice exercises and post-test, which can be viewed as forms of extrinsic motivation. Intrinsically motivated students who are more interested in a reading are more likely to read the entire text and to use context to learn the meaning of unknown vocabulary. Therefore, personalization that increases intrinsic motivation could lead to deeper processing of context and better learning of vocabulary.

{| border="1"
| || '''Passive''' || '''Active''' || '''Interactive'''
|-
| '''Explicit (general)''' || Dictionary Definitions || || Practice Exercises
|-
| '''Implicit (instance)''' || Interpreting meaning in context while reading || Sentence Production (assessment) || Practice Exercises
|}

Glossary

Intrinsic Motivation: Motivation to learn for learning's own sake rather than some external goal.

Extrinsic Motivation: Motivation to learn in order to satisfy an external goal, such as completing a task or passing an assessment.

Research question

Does personalization of practice readings to match students' personal interests increase ESL vocabulary learning?

Dependent variables

Normal post-test scores

Normal post-test scores for practiced words only

Long-term retention test scores, same post-test but administered months later.

Evidence of Transfer: sentence production tasks for target words, correct use of words in writing assignments for other courses.

Independent variables

Personalization of readings by topics of interest. In the control condition, the tutor did not use potential personal interest as a factor in its selection of reading materials. In the treatment condition, the tutor did use interest as a factor. All other selection criteria were the same in both conditions. Time on task was also the same.

Hypotheses

Since intrinsic motivation seems to be important in language learning, the benefits of personalization will outweigh the costs.

Findings

First, personalization to match interests can lead to improved learning of the relevant knowledge components in a tutoring environment for vocabulary learning. Students in the treatment group correctly answered a higher proportion of questions on target words that were practiced in the REAP tutor.

Second, personalization can compromise domain-based goals. In the REAP tutor, an important domain-based goal is to give the student practice opportunities for many new target words. However, students receiving personalization practiced fewer target words. The difficulty in achieving the domain-based goal of practicing many unknown words is due to the fact that the REAP tutor often could not find texts that included multiple target words and also matched personal interests.

Third and finally, if the challenges of negotiating personalization and domain-based goals are met, personalization can lead to improvements in overall learning. Students with personalization appeared to learn the words they practiced with greater frequency but practiced fewer target words, and as a result did not perform reliably differently than their controls on the overall post-test measure for cloze questions. The researchers attributed this lack of a difference to the fact that, in many cases, the tutor had to choose between interesting readings and those with more practice opportunities. However, the availability of readings that are both interesting and provide ample practice is a technical issue which can be solved in a straightforward manner by increasing the size and coverage of the corpus of available practice reading passages.

Explanation

Students in the treatment condition with personalization performed slightly better on average (M=35.5%, SD=14.9%) in terms of overall post-test scores compared to students in the control condition (M=27.1%, SD=17.2%). However, this difference was not statistically significant.

Graph40.jpg

There is evidence that the difference in post-test scores is due to increased interest leading to deeper processing of the reading practice texts.

Responses to questionnaires following each reading show the interest level of students using the REAP tutor. The questionnaires asked students to indicate on a scale from one to five their interest in the preceding text. The distributions of post-reading interest ratings for students in the treatment and control conditions are shown below.

Interest combined.jpg

Students were also given an exit survey during their last week of practice with the tutor that asked them, among other questions, to indicate whether they agreed with the statement, “Most of the readings were interesting.” The ratings were on a scale from one to five, with five indicating strong agreement and one indicating strong disagreement. Exit survey interest ratings by students in the treatment condition were significantly higher (p<0.05) than the ratings by students in the control condition. The mean response for students who received personalized readings was 3.18, while it was 2.65 for students in the control condition.

Further analysis of post-test scores reveals that students did learn more of the words that they actually practiced in REAP. The post-test contained 40 questions for target vocabulary words. Many of the students did not practice 40 words, so performance on practiced words alone was analyzed. Students in the treatment condition scored higher (N=16, M=50.3, SD=20.1) on questions for words seen in readings than did students in the control condition (N=19, M=32.4, SD=18.9). A two-tailed t-test for independent means verified that this result is statistically significant (t=2.719, df=33, p=0.005). The difference of scores between the two groups was 17.9% (95% CI = 4.5%, 31.3%), which corresponds to a large effect size of 0.85. This result indicates that personalization improved learning for the words that students saw in readings, which is in line with previous findings that intrinsic motivation leads to improved learning.
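
The comparison above can be rechecked from the reported summary statistics alone. The following is a minimal sketch, assuming a pooled-variance (equal-variance) Student's t-test, which the reported df=33 suggests but the text does not state outright. The function name and the hard-coded critical value (about 2.0345 for a two-sided 95% interval at df=33) are illustrative choices, not part of the original analysis.

```python
import math

def pooled_t_from_summary(n1, m1, s1, n2, m2, s2):
    """Student's t for two independent groups, from summary statistics.

    Assumes equal population variances (pooled estimate).
    Returns (mean difference, standard error, t statistic, degrees of freedom).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    diff = m1 - m2
    return diff, se, diff / se, df

# Assumed two-sided 95% critical value of the t distribution for df = 33.
T_CRIT_DF33 = 2.0345

# Treatment: N=16, M=50.3, SD=20.1; control: N=19, M=32.4, SD=18.9 (from the text).
diff, se, t_stat, df = pooled_t_from_summary(16, 50.3, 20.1, 19, 32.4, 18.9)
ci = (diff - T_CRIT_DF33 * se, diff + T_CRIT_DF33 * se)

print(f"difference = {diff:.1f}, t = {t_stat:.2f}, df = {df}")
print(f"95% CI = ({ci[0]:.1f}, {ci[1]:.1f})")
```

Under these assumptions the sketch gives a t statistic of roughly 2.71 and an interval of roughly 4.5 to 31.3 percentage points, close to the figures reported above; small discrepancies are expected because the inputs are rounded.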


Post just practice.jpg

However, students in the treatment condition, which included personalization, saw fewer words in their training sessions (N=16, M=12.0, SD=1.13) than students in the control condition (N=19, M=16.3, SD=0.87) (t=-2.9, df=33, p=0.006). Average time on task was essentially the same for students in both conditions. Students in the treatment condition spent slightly longer on each reading. The main reason, however, for the difference in the average total number of words practiced was that students for whom the tutor provided personalized instruction saw fewer words (M=3.41, SD=0.55) per practice reading passage than students in the control condition (M=4.07, SD=0.83) (t=2.929, df=33, p=0.006).

Thus, when the tutor used personalization as a factor in the selection of readings, it chose readings that were less valuable according to other factors. Specifically, this result shows that by personalizing instruction, the tutor was not able to provide practice for as many words. Of course, the practice that it did provide was better, as shown by the previous result that, for words students did practice, personalization appeared to increase learning.

The reduced number of target words per text with personalization is a technical issue which can be avoided in a straightforward manner by increasing the size of the database of readings. With more readings, the tutor can find texts that both have ample target words and cover topics of personal interest.


Words per reading.jpg

There is a possibility that the students in the treatment condition who were seeing fewer words in each reading were learning more of the words simply because they had fewer to learn per reading. To rule out this hypothesis, multiple linear regression analyses were performed with overall post-test performance and performance for practiced words as the dependent variables. In both regression analyses, the number of target words per reading was not a significant predictor of performance. In fact, the number of target words per reading was slightly positively correlated with post-test performance in both cases. This result seems to rule out the possibility that students were learning more target words in the treatment condition because they were seeing fewer words.


Long-term retention test results showed no reliable differences because of a small sample size. The test was administered to students who stayed in the ELI in the subsequent semester, which constituted only a fraction of the original sample.

Further Information

The following study addresses a different form of personalization, by which interactions with the learner (e.g., instructions, directions) are conducted using casual and direct rather than formal language:

Studying the Learning Effect of Personalization and Worked Examples in the Solving of Stoichiometry Problems (McLaren, Koedinger & Yaron)

Annotated bibliography

Note: a paper on this study has been submitted to the International Journal of Artificial Intelligence in Education.

Heilman, M., Juffs, A., & Eskenazi, M. (2007). Choosing Reading Passages for Vocabulary Learning by Topic to Increase Intrinsic Motivation. Proceedings of the 13th International Conference on Artificial Intelligence in Education. Marina del Rey, CA. (poster)

Clark, R. C. and Mayer, R. E. (2003). e-Learning and the Science of Instruction. Jossey-Bass/Pfeiffer.

Cordova, D. I. & Lepper, M. R. (1996). Intrinsic Motivation and the Process of Learning: Beneficial Effects of Contextualization, Personalization, and Choice. Journal of Educational Psychology. Vol. 88, No. 4, 715-730.

Lepper, M. (1988). Motivational Considerations in the Study of Instruction. Cognition and Instruction. 5(4), 289-309.
