Difference between revisions of "Co-training of Chinese characters"

(Independent variables)
 
(13 intermediate revisions by 2 users not shown)
Line 1: Line 1:
 
----
 
----
*Node Title: Learning to read Chinese: [[Co-training]] in human
+
'''Summary Table'''
 +
*Node Title: Learning to read Chinese: [[Co-training]] in human (Study 1)
 
*Researchers: Ying Liu, Charles Perfetti, Susan Dunlap, Gusheng Zi, Tom Mitchell
 
*Researchers: Ying Liu, Charles Perfetti, Susan Dunlap, Gusheng Zi, Tom Mitchell
 
*PIs: Ying Liu, Charles Perfetti, Tom Mitchell
 
*PIs: Ying Liu, Charles Perfetti, Tom Mitchell
Line 7: Line 8:
 
*Graduate Students: Derek Chan
 
*Graduate Students: Derek Chan
 
*Study Start Date Sep 1, 2005
 
*Study Start Date Sep 1, 2005
*Study End Date Dec 31, 2006
+
*Study End Date Dec 31, 2005
*LearnLab Site and Courses , CMU Chinese Online
+
*LearnLab Site and Courses: LRDC, pull out study
*Number of Students: 20
+
*Number of Students: 44
*Total Participant Hours for the study: 20
+
*Total Participant Hours for the study: 44
 
*Data in the Data Shop: Yes
 
*Data in the Data Shop: Yes
 
----
 
----
  
 
== Abstract ==
 
== Abstract ==
This study was designed to explore how native English speakers learn to speak and read Chinese. The experiment consisted of two parts. The first part was training, which was used to teach the input (Chinese fonts and sounds) to output (English translations) mapping of 16 Chinese characters. Training methods were manipulated in this part. A quarter of the subjects only received labeled training trials (English translation provided), the others received extra training trials with [[unlabeled examples|non-labeled trials]] (only the orthography or/and phonology without English translation). The non-labeled trials were further separated into three types: unpaired, correlated paired and uncorrelated paired, with each type used for one quarter of subjects.
+
The present study explored how native English speakers learn to speak and read Chinese in a cotraining environment. The experiment consisted of two parts. The first part was training, which was used to teach the input (Chinese fonts and sounds) to output (English translations) mapping of 16 Chinese characters. Training methods were manipulated in this part. A quarter of the subjects only received labeled training trials (English translation provided), the others received extra training trials with [[unlabeled examples|non-labeled trials]] (only the orthography or/and phonology without English translation). The non-labeled trials were further separated into three types: unpaired, correlated paired and uncorrelated paired, with each type used for one quarter of subjects.
 
+
The second part was posttest, in which students produced the English translation when they saw the Chinese fonts or hear the Chinese sounds one by one. The accuracy of translation was recorded. It showed that [[unlabeled examples]] did help the learning, and uncorrelated paired examples did the best among all three types of unlabeled examples.
The second part was testing, in which students produced the English translation when they saw the Chinese fonts or hear the Chinese sounds one by one. The accuracy of translation was recorded. It showed that [[unlabeled examples]] did help the learning, and uncorrelated paired examples did the best among all three types of unlabeled examples.
 
 
 
In the fall of 2006, we conducted Experiment 2 of this study as an [[in-vivo experiment]] and focused on the pairing effect by using the Chinese online course students. A within subject 2 by 2 design (labeling x pairing) was applied to the online course students. The labeling factor tested the effectiveness of unlabeled trials in learning the mapping from visual and auditory forms to meaning. The pairing factor tested the difference between paired and unpaired inputs. The paired inputs have been found to be better in our previous experiment using lab learners.
 
 
 
In the spring of 2007, we finished experiment 3 which explored the effect of variation and correlation in a [[cotraining]] setup (same as experiment 1 and 2).
 
  
 
== Glossary ==
 
== Glossary ==
Line 28: Line 24:
 
labeling; source pairing; source correlation.
 
labeling; source pairing; source correlation.
  
3. The research question stated as concisely as possible, usually in a single sentence;
+
== Research question ==
  
 
How native English speakers learn to speak and read Chinese under various coordinative learning conditions.  
 
How native English speakers learn to speak and read Chinese under various coordinative learning conditions.  
  
4. A background and significance section that briefly summarizes prior work on the research question and why it is important to answer it;
+
== Background ==
  
 
In machine learning research, it has been found that multiple-strategies and multiple modalities facilitate learning (Blum and Mitchell, 1998). However, the effectiveness of the properties of “co-training” theory have not been tested in human learners yet. We carried out this study to directly test two important properties of this theory in human learners. There are two results from the finished experiment and one non-result of interest. Most dramatic is the advantage of written over spoken input. This has nothing to do with co-training but is interesting and important for L2 word learning (translation). Second is the pairs effect, the advantage of spoken + written input presented during unlabelled training compared with either one separately. The independence of the surface features of these inputs (specific speaker, specific font) was not a factor.
 
In machine learning research, it has been found that multiple-strategies and multiple modalities facilitate learning (Blum and Mitchell, 1998). However, the effectiveness of the properties of “co-training” theory have not been tested in human learners yet. We carried out this study to directly test two important properties of this theory in human learners. There are two results from the finished experiment and one non-result of interest. Most dramatic is the advantage of written over spoken input. This has nothing to do with co-training but is interesting and important for L2 word learning (translation). Second is the pairs effect, the advantage of spoken + written input presented during unlabelled training compared with either one separately. The independence of the surface features of these inputs (specific speaker, specific font) was not a factor.
Line 40: Line 36:
 
To understand the correlation feature better, we are testing the correlation feature in an in-vivo setup with more learning sessions.
 
To understand the correlation feature better, we are testing the correlation feature in an in-vivo setup with more learning sessions.
  
5. The dependent variables, which are observable and typically measure competence, motivation, interaction, meta-learning, or some other pedagogically desirable outcome;
+
== Dependent variables ==
  
 
[[Normal post-test]]: Accuracy of producing the English word under reading and/or listening situation.
 
[[Normal post-test]]: Accuracy of producing the English word under reading and/or listening situation.
  
6. The independent variables, which are typically include instructional environment, activity or method, and perhaps some student characteristics, such as gender or first language;  
+
== Independent variables ==
 +
Labeling variable and correction variable.
 +
Four training conditions, between subject design. All subjects received 48 Labeled examples, then followed by
 +
A) none;
 +
B) 192 unpaired unlabeled examples;
 +
C) correlated paired unlabeled examples;
 +
D) uncorrelated paired unlabeled examples.
 +
[[Image:study1.jpg]]
  
Labeling
+
== Hypothesis ==
Pairing
 
Variation
 
Correlation
 
 
7. The hypothesis, which is a concise statement of the relationship among the variables that answers the research question;
 
  
 
Pairing of visual font and auditory sound of Chinese characters should enhance learning under both labeled and unlabeled trials, but the benefit is most significant when the trials are unlabeled.
 
Pairing of visual font and auditory sound of Chinese characters should enhance learning under both labeled and unlabeled trials, but the benefit is most significant when the trials are unlabeled.
 +
*
 +
[[Image:cotraining1.jpg]]
  
8. The findings, which are the results of the study if any are currently available;
+
== Findings ==
  
There are two results from the first experiment and one non-result of interest. Most dramatic is the advantage of written over spoken input. This has nothing to do with co-training but is interesting and important for L2 word learning (translation). Second is the pairs effect, the advantage of spoken + written input presented during unlabeled training compared with either one separately. The independence of the surface features of these inputs (specific speaker, specific font) was not a factor.
+
*“Unlabelled paired” trials may aid learning. Learning meanings was facilitated by the addition of unlabeled paired trials that did not provide meaning.
Experiment 2 is under analysis and experiment 3 is collecting data.
+
**However, this unlabeled-trials effect was restricted to cross-modal pairs (spoken syllable and written character); it was absent when only one (spoken syllable) or the other (written character) modality was presented.
 +
**Implication: Cross-modal inputs in this situation can establish multiple representations (speech-writing pairs) from which meaning links are more readily retrieved.
 +
*Written form learned better than spoken form Large advantage for the presentation of written characters compared with their corresponding spoken syllables for learning a form-meaning pair.
 +
*Benefits of uncorrelated examples was not observed.  
 +
**Correlated examples: Given font and given speaker always co-occur (conditional dependent)
 +
**Uncorrelated examples: Given font occurs with all speakers; and given speaker occurs with all fonts (conditional independent)
 +
**This is still being assessed by using  multiple learning sessions.  
  
9. An explanation, which is short (a paragraph or two) and typically mentions unobservable, hypothetical attributes of the students (e.g., the students’ knowledge or motivation) and cognitive or social processes that affect them;
+
[[Image:cotraining2.jpg]]
 +
 
 +
== Explanation ==
  
 
Learning meanings was facilitated by the addition of unlabeled paired trials that did not provide meaning implicates that predictions of the label are generated for unlabeled trials, so they serve as self-generated labeled trials and work as meaningful materials for learning. This effect is especially significant in multiple input situation (paired trials) because the establishment of multiple representations (speech-writing pairs) makes the “label prediction” more accurate.
 
Learning meanings was facilitated by the addition of unlabeled paired trials that did not provide meaning implicates that predictions of the label are generated for unlabeled trials, so they serve as self-generated labeled trials and work as meaningful materials for learning. This effect is especially significant in multiple input situation (paired trials) because the establishment of multiple representations (speech-writing pairs) makes the “label prediction” more accurate.
  
10. The descendents, which lists links to descendent nodes of this one, if there are any;
+
== Descendents ==
  
 
None.
 
None.
  
11. A further information section that points to documents using hyper links and/or references in APA format. Each indicates briefly the document's relationship to the node (e.g., whether the document is a paper reporting the node in full detail, a proposal describing the motivation and design of the study in more detail, the node for a similar PSLC research study, etc.).
+
== Further information ==
 
 
www.pitt.edu/~liuying/pslc_plan.doc
 

Latest revision as of 03:13, 3 November 2008


Summary Table

  • Node Title: Learning to read Chinese: Co-training in human (Study 1)
  • Researchers: Ying Liu, Charles Perfetti, Susan Dunlap, Gusheng Zi, Tom Mitchell
  • PIs: Ying Liu, Charles Perfetti, Tom Mitchell
  • Others who have contributed 160 hours or more:
  • Post-Docs: Gusheng Zi
  • Graduate Students: Derek Chan
  • Study Start Date: Sep 1, 2005
  • Study End Date: Dec 31, 2005
  • LearnLab Site and Courses: LRDC, pull out study
  • Number of Students: 44
  • Total Participant Hours for the study: 44
  • Data in the Data Shop: Yes

Abstract

The present study explored how native English speakers learn to speak and read Chinese in a co-training environment. The experiment consisted of two parts. The first part was training, which taught the mapping from input (the written and spoken forms of Chinese characters) to output (English translations) for 16 Chinese characters. The training method was manipulated in this part. A quarter of the subjects received only labeled training trials (English translation provided); the others received additional unlabeled trials (orthography and/or phonology only, without the English translation). The unlabeled trials were of three types: unpaired, correlated paired, and uncorrelated paired, with each type used for one quarter of the subjects. The second part was a posttest, in which students produced the English translation when they saw the written characters or heard the spoken syllables one by one. Translation accuracy was recorded. The results showed that unlabeled examples helped learning, and that uncorrelated paired examples performed best among the three types of unlabeled examples.

Glossary


labeling; source pairing; source correlation.

Research question

How do native English speakers learn to speak and read Chinese under various coordinative learning conditions?

Background

In machine learning research, it has been found that multiple strategies and multiple modalities facilitate learning (Blum and Mitchell, 1998). However, the properties of “co-training” theory have not yet been tested in human learners. We carried out this study to directly test two important properties of this theory in human learners. There are two results from the finished experiment and one non-result of interest. Most dramatic is the advantage of written over spoken input. This has nothing to do with co-training but is interesting and important for L2 word learning (translation). Second is the pairs effect: the advantage of spoken + written input presented together during unlabeled training compared with either one presented separately. The independence of the surface features of these inputs (specific speaker, specific font) was not a factor.

To understand the pairs effect, we have to know whether it is restricted to, or larger for, unlabeled trials. Experiment 1 did not manipulate pairing in labeled trials. In the fall of 2006, we tested the pairing property under both labeled and unlabeled trials.

To understand the correlation feature better, we are testing it in an in vivo setup with more learning sessions.

Dependent variables

Normal post-test: Accuracy of producing the English word in the reading and/or listening condition.

Independent variables

Labeling variable and correlation variable. Four training conditions in a between-subjects design. All subjects received 48 labeled examples, followed by:

  • A) none;
  • B) 192 unpaired unlabeled examples;
  • C) correlated paired unlabeled examples;
  • D) uncorrelated paired unlabeled examples.

Study1.jpg

Hypothesis

Pairing the visual font and the auditory sound of Chinese characters should enhance learning in both labeled and unlabeled trials, but the benefit should be greatest when the trials are unlabeled.

Cotraining1.jpg

Findings

  • “Unlabeled paired” trials may aid learning. Learning of meanings was facilitated by the addition of unlabeled paired trials that did not provide meaning.
    • However, this unlabeled-trials effect was restricted to cross-modal pairs (spoken syllable and written character); it was absent when only one (spoken syllable) or the other (written character) modality was presented.
    • Implication: Cross-modal inputs in this situation can establish multiple representations (speech-writing pairs) from which meaning links are more readily retrieved.
  • The written form was learned better than the spoken form. There was a large advantage for the presentation of written characters compared with their corresponding spoken syllables in learning a form-meaning pair.
  • A benefit of uncorrelated examples was not observed.
    • Correlated examples: A given font and a given speaker always co-occur (conditionally dependent).
    • Uncorrelated examples: A given font occurs with all speakers, and a given speaker occurs with all fonts (conditionally independent).
    • This is still being assessed using multiple learning sessions.
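
The distinction between correlated and uncorrelated paired examples can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stimulus names, the use of four fonts and four speakers, and the helper functions are all hypothetical; only the count of 16 characters comes from the study description. The sketch does not attempt to equate the number of trials across conditions.

```python
from itertools import product


def correlated_pairs(fonts, speakers):
    """Each font always co-occurs with one fixed speaker (one-to-one)."""
    return list(zip(fonts, speakers))


def uncorrelated_pairs(fonts, speakers):
    """Every font occurs with every speaker (fully crossed)."""
    return list(product(fonts, speakers))


def paired_trials(characters, surface_pairs):
    """Build one unlabeled paired trial per character and (font, speaker) pair."""
    return [
        {"character": ch, "font": font, "speaker": spk}
        for ch in characters
        for font, spk in surface_pairs
    ]


# Hypothetical stimulus names; the font and speaker counts are assumptions.
characters = [f"char_{i:02d}" for i in range(16)]
fonts = ["font_a", "font_b", "font_c", "font_d"]
speakers = ["speaker_1", "speaker_2", "speaker_3", "speaker_4"]

correlated = paired_trials(characters, correlated_pairs(fonts, speakers))
uncorrelated = paired_trials(characters, uncorrelated_pairs(fonts, speakers))
```

In the correlated list, knowing the font of a trial determines its speaker; in the uncorrelated list, the font carries no information about the speaker.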

Cotraining2.jpg

Explanation

The finding that learning of meanings was facilitated by the addition of unlabeled paired trials that did not provide meaning implies that learners generate predictions of the label for unlabeled trials, so that these trials serve as self-generated labeled trials and work as meaningful materials for learning. This effect is especially strong in the multiple-input situation (paired trials) because the establishment of multiple representations (speech-writing pairs) makes the “label prediction” more accurate.
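
The self-labeling account above can be illustrated with a toy two-view learner. This is a minimal sketch, not the model used in the study: the class, the additive strength rule, and the gain values are all assumptions chosen for illustration, and the stimulus names are hypothetical.

```python
from collections import defaultdict


class TwoViewLearner:
    """Toy learner that stores form-to-meaning strengths per modality."""

    def __init__(self):
        # strength[view][form][meaning] -> accumulated association strength
        self.strength = {
            "written": defaultdict(lambda: defaultdict(float)),
            "spoken": defaultdict(lambda: defaultdict(float)),
        }

    def labeled_trial(self, view, form, meaning, gain=1.0):
        """Strengthen the link between a form and the provided meaning."""
        self.strength[view][form][meaning] += gain

    def predict(self, inputs):
        """Sum evidence over the presented views; return the best meaning or None."""
        votes = defaultdict(float)
        for view, form in inputs.items():
            for meaning, s in self.strength[view].get(form, {}).items():
                votes[meaning] += s
        if not votes:
            return None
        return max(votes, key=votes.get)

    def unlabeled_trial(self, inputs, gain=0.5):
        """Predict a label, then treat it as a weaker self-generated label."""
        guess = self.predict(inputs)
        if guess is not None:
            for view, form in inputs.items():
                self.labeled_trial(view, form, guess, gain)
        return guess


learner = TwoViewLearner()
learner.labeled_trial("written", "char_01", "water")

# The spoken form has never been labeled, so no meaning can be retrieved yet.
before = learner.predict({"spoken": "syll_01"})

# An unpaired unlabeled trial gives the spoken view nothing to learn from.
unpaired_guess = learner.unlabeled_trial({"spoken": "syll_01"})

# A paired unlabeled trial lets the written view supply the label for both views.
paired_guess = learner.unlabeled_trial({"written": "char_01", "spoken": "syll_01"})
after = learner.predict({"spoken": "syll_01"})
```

In this toy setup, only the paired unlabeled trial transfers the meaning to the spoken form, which mirrors the restriction of the unlabeled-trials effect to cross-modal pairs.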

Descendents

None.

Further information