Difference between revisions of "REAP Study on Personalization of Readings by Topic (Fall 2006)"


Latest revision as of 17:54, 30 May 2008

REAP Study on Personalization of Readings for Increased Interest

Logistical Information

{| border="1"
|-
| '''Contributors''' || Maxine Eskenazi, Alan Juffs, Michael Heilman, Kevyn Collins-Thompson, Lois Wilson, Jamie Callan
|-
| '''Study Start Date''' || September 11, 2006
|-
| '''Study End Date''' || November 21, 2006
|-
| '''Learnlab Courses''' || English Language Institute Reading 4 (ESL LearnLab)
|-
| '''Number of Students''' || 35
|-
| '''Total Participant Hours (est.)''' || 270
|-
| '''Data in Datashop''' || no
|}

Abstract

In this work, the term “personalization” refers to the selection of practice readings in order to match a student’s interests.

During each training session with the REAP Tutor, students work through a series of readings, each of which is followed by practice exercises for the target words in the reading. While reading a passage, students can access dictionary definitions for any word in a reading, either by clicking on a highlighted target word or by typing a word into a box in the lower-left corner of the screen. The target words in the readings are highlighted because highlighting may increase the use of dictionary definitions, thus encouraging students to coordinate multiple sources of information about a word’s meaning, namely the implicit examples from the context around words and the explicit generalizations in the definitions of words (as exemplified in the figure below).

[Figure: Reap context definition.jpg]

A problem discovered in past studies with REAP is that many students spend only a brief amount of time on a reading and do not deeply process the text. Students often only read the dictionary definition for target words rather than attempting to process the entire context around the words. Inferring the meaning of vocabulary from context is a seemingly important strategy that is not used by such students. This behavior is likely due to a desire to perform well on post-reading practice exercises and post-test, which can be viewed as forms of extrinsic motivation. Intrinsically motivated students who are more interested in a reading are more likely to read the entire text and to use context to learn the meaning of unknown vocabulary. Therefore, personalization that increases intrinsic motivation could lead to deeper processing of context and better learning of vocabulary.

{| border="1"
|-
| || '''Passive''' || '''Active''' || '''Interactive'''
|-
| '''Explicit (general)''' || Dictionary Definitions || || Practice Exercises
|-
| '''Implicit (instance)''' || Interpreting meaning in context while reading || Sentence Production (assessment) || Practice Exercises
|}

Glossary

Intrinsic Motivation: Motivation to learn for learning's own sake rather than some external goal.

Extrinsic Motivation: Motivation to learn in order to satisfy an external goal, such as completing a task or passing an assessment.

Research question

Does personalization of practice readings to match students' personal interests increase ESL vocabulary learning?

Dependent variables

Normal post-test scores

Normal post-test scores for practiced words only

Long-term retention test scores (the same post-test, administered months later).

Evidence of Transfer: sentence production tasks for target words, correct use of words in writing assignments for other courses.

Independent variables

Personalization of readings by topics of interest. In the control condition, the tutor did not use potential personal interest as a factor in its selection of reading materials. In the treatment condition, the tutor did use interest as a factor. All other selection criteria were the same in both conditions. Time on task was also the same.

Hypotheses

Since intrinsic motivation seems to be important in language learning, the benefits of personalization will outweigh the costs.

Findings

First, personalization to match interests can lead to improved learning of the relevant knowledge components in a tutoring environment for vocabulary learning. Students in the treatment group correctly answered a higher proportion of questions on target words that were practiced in the REAP tutor.

Second, personalization can compromise domain-based goals. In the REAP tutor, an important domain-based goal is to give the student practice opportunities for many new target words. However, students receiving personalization practiced fewer target words. The difficulty in achieving the domain-based goal of practicing many unknown words is due to the fact that the REAP tutor often could not find texts that included multiple target words and also matched personal interests.

Third and finally, if the challenges of negotiating personalization and domain-based goals are met, personalization can lead to improvements in overall learning. Students with personalization appeared to learn a larger share of the words they practiced but practiced fewer target words, and as a result did not perform reliably differently from their controls on the overall post-test measure for cloze questions. The researchers attributed this lack of a difference to the fact that, in many cases, the tutor had to choose between interesting readings and those with more practice opportunities. However, the availability of readings that are both interesting and provide ample practice is a technical issue that can be solved in a straightforward manner by increasing the size and coverage of the corpus of available practice reading passages.

Explanation

Students in the treatment condition with personalization performed slightly better on average (M=35.5%, SD=14.9%) in terms of overall post-test scores compared to students in the control condition (M=27.1%, SD=17.2%). However, this difference was not statistically significant.

[Figure: Graph40.jpg]

There is evidence that the difference in post-test scores is due to increased interest leading to deeper processing of the reading practice texts.

Responses to questionnaires following each reading show the interest level of students using the REAP tutor. The questionnaires asked students to indicate on a scale from one to five their interest in the preceding text. The distributions of post-reading interest ratings for students in the treatment and control conditions are shown below.

[Figure: Interest combined.jpg]

Students were also given an exit survey during their last week of practice with the tutor that asked them, among other questions, to indicate whether they agreed with the statement, “Most of the readings were interesting.” The ratings were on a scale from one to five, with five indicating strong agreement and one indicating strong disagreement. Exit survey interest ratings by students in the treatment condition were significantly higher (p<0.05) than the ratings by students in the control condition. The mean response for students who received personalized readings was 3.18, while it was 2.65 for students in the control condition.

Further analysis of post-test scores reveals that students did learn more of the words that they actually practiced in REAP. The post-test contained 40 questions for target vocabulary words. Many of the students did not practice 40 words, so performance on practiced words alone was analyzed. Students in the treatment condition scored higher (N=16, M=50.3%, SD=20.1%) on questions for words seen in readings than did students in the control condition (N=19, M=32.4%, SD=18.9%). A two-tailed t-test for independent means verified that this result is statistically significant (t=2.719, df=33, p=0.005). The difference in scores between the two groups was 17.9% (95% CI = 4.5%, 31.3%), which corresponds to a large effect size of 0.85. This result indicates that personalization improved learning for the words that students saw in readings, which is in line with previous findings that intrinsic motivation leads to improved learning.
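The group sizes, means, and standard deviations reported above are enough to recompute the test statistic and the confidence interval. The following is a minimal sketch, assuming a pooled-variance (Student's) t-test; the text only says "t-test for independent means", so the choice of variant is an assumption. The function name `pooled_t` and the critical value 2.0345 (read from a standard t table for df=33) are also not from the study.

```python
import math


def pooled_t(n1, m1, s1, n2, m2, s2):
    """Student's t for two independent groups, from summary statistics only."""
    df = n1 + n2 - 2
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df, se


# Scores on practiced words (percent correct): treatment vs. control.
t, df, se = pooled_t(16, 50.3, 20.1, 19, 32.4, 18.9)

# Assumption: two-sided 95% critical value for df=33, taken from a t table.
T_CRIT_DF33 = 2.0345
diff = 50.3 - 32.4
low, high = diff - T_CRIT_DF33 * se, diff + T_CRIT_DF33 * se

print(f"t = {t:.2f}, df = {df}")
print(f"95% CI for the difference: {low:.1f} to {high:.1f}")
```

Under these assumptions the sketch gives a t of about 2.71 and an interval of roughly 4.5 to 31.3 percentage points, which is in line with the reported t=2.719 and 95% CI; the small gap in t is plausibly due to rounding of the published means and standard deviations.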


[Figure: Post just practice.jpg]

However, students in the treatment condition, which included personalization, saw fewer words in their training sessions (N=16, M=12.0, SD=1.13) than students in the control condition (N=19, M=16.3, SD=0.87) (t=-2.9, df=33, p=0.006). Average time on task was essentially the same for students in both conditions. Students in the treatment condition spent slightly longer on each reading. The main reason, however, for the difference in the average total number of words practiced was that students for whom the tutor provided personalized instruction saw fewer words (M=3.41, SD=0.55) per practice reading passage than students in the control condition (M=4.07, SD=0.83) (t=2.929, df=33, p=0.006).

Thus, when the tutor used personalization as a factor in the selection of readings, it chose readings that were less valuable according to other factors. Specifically, this result shows that by personalizing instruction, the tutor was not able to provide practice for as many words. Of course, the practice that it did provide was better, as is shown in the previous result that, for the words students did practice, personalization appeared to increase learning.

The reduced number of target words per text with personalization is a technical issue which can be avoided in a straightforward manner by increasing the size of the database of readings. With more readings, the tutor can find texts that both have ample target words and cover topics of personal interest.
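The trade-off described above can be illustrated with a toy selection rule. This is a hypothetical sketch, not REAP's actual selection algorithm: the `Reading` class, the linear scoring formula, the interest weight, and the three example readings are all invented for illustration. It shows how weighting interest can pull the choice toward a text with fewer target words, and how a larger corpus can remove that conflict.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    title: str
    target_words: int  # how many of the student's target words the text contains
    interest: float    # hypothetical 0..1 match with the student's chosen topics


def score(reading, interest_weight):
    # Hypothetical linear trade-off between practice opportunities and interest.
    return reading.target_words + interest_weight * reading.interest


def pick(readings, interest_weight):
    """Return the highest-scoring reading for the given interest weight."""
    return max(readings, key=lambda r: score(r, interest_weight))


# Invented corpora: the larger one adds a text that is both rich and interesting.
small_corpus = [Reading("A", 5, 0.1), Reading("B", 3, 0.9)]
large_corpus = small_corpus + [Reading("C", 5, 0.8)]

print(pick(small_corpus, 0).title)  # interest ignored (control-like)
print(pick(small_corpus, 4).title)  # interest weighted (treatment-like)
print(pick(large_corpus, 4).title)  # same weight, larger corpus
```

With interest ignored, the rule picks the reading with the most target words; with interest weighted, it picks the more interesting reading with fewer target words; with the extra reading available, it no longer has to give up target words to match interest.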


[Figure: Words per reading.jpg]

There is a possibility that the students in the treatment condition who were seeing fewer words in each reading were learning more of the words simply because they had fewer to learn per reading. To rule out this hypothesis, regression analyses (multiple linear regression) were conducted with overall post-test performance and performance for practiced words as the dependent variables. In both regression analyses, the number of target words per reading was not a significant predictor of performance. In fact, the number of target words per document was slightly positively correlated with post-test performance in both cases. This result seems to rule out the possibility that students were learning more target words in the treatment condition because they were seeing fewer words.


Long-term retention test results showed no reliable differences because of a small sample size. The test was administered to students who stayed in the ELI in the subsequent semester, which constituted only a fraction of the original sample.

Further Information

The following study addresses a different form of personalization, by which interactions with the learner (e.g., instructions, directions) are conducted using casual and direct rather than formal language:

Studying the Learning Effect of Personalization and Worked Examples in the Solving of Stoichiometry Problems (McLaren, Koedinger & Yaron)

Annotated bibliography

Note: a paper on this study has been submitted to International Journal of Artificial Intelligence in Education.

Heilman, M., Juffs, A., & Eskenazi, M. (2007). Choosing Reading Passages for Vocabulary Learning by Topic to Increase Intrinsic Motivation. Proceedings of the 13th International Conference on Artificial Intelligence in Education. Marina del Rey, CA. (poster)

Clark, R. C. and Mayer, R. E. (2003). e-Learning and the Science of Instruction. Jossey-Bass/Pfeiffer.

Cordova, D. I. & Lepper, M. R. (1996). Intrinsic Motivation and the Process of Learning: Beneficial Effects of Contextualization, Personalization, and Choice. Journal of Educational Psychology. Vol. 88, No. 4, 715-730.

Lepper, M. (1988). Motivational Considerations in the Study of Instruction. Cognition and Instruction. 5(4), 289-309.
