Learning Chinese pronunciation from a “talking head”

From LearnLab
Revision as of 17:16, 22 January 2007 by Julie-Booth (talk | contribs)


Node title: Learning Chinese pronunciation from a “talking head”

Researchers: Ying Liu, Dominic Massaro, Susan Dunlap, Suemei Wu, Trevor Chen, Derek Chan, Charles Perfetti

1. An abstract that briefly describes the research encompassed by the node;

In this study, we compared the learning of Chinese pronunciation under three online instruction methods: audio only, a human “talking head”, and a computer-generated synthetic “talking head”. Learning took place through a web site developed specifically for students learning Chinese. In both “talking head” conditions, the speaker's face occupied two thirds of the video screen. When students viewed the human “talking head”, the main information came from the shape of the mouth and the movement of the lips, accompanied by audio. The synthetic “talking head”, by contrast, was transparent so as to reveal the internal articulators, and it was accompanied by slower-than-normal speech to match its articulation.

2. A glossary that defines terms used elsewhere in this node but not defined in the nodes that are parents, grandparents, etc. of this node;

Visual; audio; video

3. The research question stated as concisely as possible, usually in a single sentence;

Does visual input of a “talking head” enhance the learning of Chinese pronunciation?

4. A background and significance section that briefly summarizes prior work on the research question and why it is important to answer it;

Multimedia technology has been used in second language learning for many years. Currently available technology makes it possible to deliver not only text but also auditory and visual information through the Internet. Multiple strategies and multiple modalities have been found to facilitate learning (Blum and Mitchell, 1998). For example, research in English showed that visual information about the vertical separation between the lips and the degree of lip spreading or rounding helps the understanding of spoken language (Massaro and Cohen, 1990; Cohen and Massaro, 1994). So, does a “talking head” that presents both auditory and visual information help Chinese character learning, especially the robust learning of Chinese pronunciations, which contain difficult consonants and tones? The method has not yet been tested in a well-designed experiment. However, based on a study in which we used a real-person “talking head” to train true beginners on Chinese characters, we believe it is a very effective learning method. Dr. Massaro’s research group is currently developing an animated 3D Chinese virtual speaker, Bao (Massaro, Ouni, Cohen, and Clark, in press). In a perceptual recognition experiment, they found that both the animated video (Baldi) and the natural video were perceived better than the voice-only condition, and that the two video conditions performed equally well. We will compare audio only, Bao, and real-person talking heads with our Chinese learners.


3. The dependent variables, which are observable and typically measure competence, motivation, interaction, meta-learning, or some other pedagogically desirable outcome;

Accuracy of pronouncing Chinese syllables.

4. The independent variables, which typically include instructional environment, activity or method, and perhaps some student characteristics, such as gender or first language;

Three learning methods: audio only (control), human “talking head”, and computer-synthesized “talking head”.

5. The hypothesis, which is a concise statement of the relationship among the variables that answers the research question;

We predict that visual input, when used appropriately, can support more robust learning of the pronunciation of Chinese sounds.

6. The findings, which are the results of the study if any are currently available;

No advantage was obtained for either the human “talking head” or “Bao”, which might be due to the limited power of the between-subject design. This fall we changed the experimental design by adding a quiz (pretest) between the audio and video trainings. The interaction between condition and test (pretest vs. posttest) will be the key effect to test.
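One way to look at a condition-by-test interaction with only two test occasions is to compare gain scores (posttest minus pretest) across the three conditions; with a two-level within-subject factor, the interaction test is equivalent to a one-way ANOVA on those gains. The sketch below is only an illustration of that idea, not the analysis used in this study. The function names, the condition labels, and every score are hypothetical placeholders.

```python
from statistics import mean


def gain_scores(pre, post):
    """Per-learner gain: posttest score minus pretest score."""
    if len(pre) != len(post):
        raise ValueError("pre and post need one score per learner")
    return [b - a for a, b in zip(pre, post)]


def one_way_f(groups):
    """F statistic of a one-way ANOVA over lists of gain scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    k = len(groups)       # number of conditions
    n = len(all_scores)   # total number of learners
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Placeholder percent-correct scores as (pretest, posttest) per condition.
# These numbers are invented for illustration and are NOT study data.
scores = {
    "audio_only": ([50, 60, 40], [60, 60, 50]),
    "human_head": ([50, 40, 60], [70, 60, 70]),
    "synthetic_head": ([40, 60, 50], [60, 70, 80]),
}

gains = {name: gain_scores(pre, post) for name, (pre, post) in scores.items()}
mean_gain = {name: mean(g) for name, g in gains.items()}
f_stat = one_way_f(list(gains.values()))

for name in gains:
    print(f"{name}: mean gain = {mean_gain[name]:.2f}")
print(f"F for condition differences in gain = {f_stat:.2f}")
```

A larger F here would indicate that improvement from pretest to posttest differs by condition, which is what the interaction is meant to capture; a real analysis would also need a significance test and a check of the ANOVA assumptions.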


7. An explanation, which is short (a paragraph or two) and typically mentions unobservable, hypothetical attributes of the students (e.g., the students’ knowledge or motivation) and cognitive or social processes that affect them;


It is difficult to learn to speak a language just by listening to it, especially for a second language learner at the beginner’s level. Visual cues can provide extra information for reaching the goal of speaking “natively”. Imitation is best achieved by understanding how the articulatory organs produce the sound.

This node falls under the coordinative learning cluster: the coordination of visual and audio inputs is the cognitive process that leads to more robust learning.

8. The descendants, which list links to descendant nodes of this one, if there are any;

None.

9. A further information section that points to documents using hyperlinks and/or references in APA format. Each indicates briefly the document's relationship to the node (e.g., whether the document is a paper reporting the node in full detail, a proposal describing the motivation and design of the study in more detail, the node for a similar PSLC research study, etc.).

www.pitt.edu/~liuying/pslc_plan.doc