Learning French gender cues with prototypes

From LearnLab
PIs: Presson, MacWhinney
Faculty: MacWhinney
Postdocs: Pavlik
Others with > 160 hours: n/a
Study Start Date: 10/03/07
Study End Date: 11/28/07
Learnlab: French
Number of participants (total): 18
Number of participants (treatment): 18
Total Participant Hours: ~40
Datashop?: In Prep


The goal of this project is to improve the ability of students of Elementary French to determine the gender of French nouns. This improvement is attained through large amounts of adaptive practice, and is measured in terms of ability to generalize to novel nouns. This work emphasizes the role of scheduling in attaining mastery, using the optimized practice scheduling software designed by Pavlik. The study uses the same system (the FaCT System) and the same basic research design as Presson, MacWhinney & Pavlik's previous gender learning study.


Research question

This research is designed to discover the best method of producing robust learning of French nominal gender, as well as the factors that make this learning more difficult.

Background and significance

Tucker, Lambert and Rigault (1977) evaluated the L1 (first language) learning of cues to gender in French. More recently, Holmes and Dejean de la Batie (1999) produced the first study of the acquisition of grammatical gender by L2 learners. Holmes and Segui (2004) have extended the detail of these analyses, but so far only with native speakers. Carroll (1999) and Lyster (2006) have explored the role of cue validity and availability in predicting usage by learners. All of these studies underscore the importance of high validity cues for the general vocabulary. However, these cues are only marginally useful for the highest frequency forms, whose gender must be learned more or less by rote. These analyses are in very close accord with the claims of the Competition Model (MacWhinney 1978, 2006).

In the Competition Model, each cue has a strength that is based on its reliability in signaling information (as in, for example, the use of spelling to predict grammatical gender). Some cues are more reliable than others: for instance, in the case of nouns that refer to people, semantic cues (the gender of a person) are more reliable than spelling cues. Over time, a learner picks up on these reliabilities, first acquiring the most clearly reliable cues, then later pulling apart conflicting but frequently co-occurring ones. Cue conflicts are then resolved through a process of competition. A full discussion of cue conflict is found in MacDonald and MacWhinney (1991).
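As a rough illustration of cue reliability, the sketch below computes the proportion of nouns carrying a spelling cue whose gender matches the cue's prediction. This is one simple reading of reliability, not the Competition Model's full formulation. The function name `cue_reliability` and the small lexicon are assumptions made for this example; the nouns are real French words, but they are not the study's materials.

```python
# Toy lexicon of (noun, gender) pairs. Illustrative only, not the study's items.
LEXICON = [
    ("fromage", "m"), ("garage", "m"), ("village", "m"), ("voyage", "m"),
    ("plage", "f"), ("image", "f"),
    ("nation", "f"), ("station", "f"), ("question", "f"),
    ("avion", "m"),
]


def cue_reliability(suffix, predicted_gender, lexicon):
    """Proportion of nouns ending in `suffix` whose gender matches the cue.

    Returns None when no noun in the lexicon carries the cue.
    """
    genders = [gender for noun, gender in lexicon if noun.endswith(suffix)]
    if not genders:
        return None
    return sum(g == predicted_gender for g in genders) / len(genders)


age_reliability = cue_reliability("age", "m", LEXICON)    # 4 of the 6 -age nouns
tion_reliability = cue_reliability("tion", "f", LEXICON)  # 3 of the 3 -tion nouns
```

In this toy lexicon the -tion cue is fully reliable while the -age cue is not, which is the kind of difference a learner is assumed to pick up over time.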

Our goal here is to use these findings to guide effective instruction. One way of doing so is to aim for mastery of a single grammatical structure in an L2, in this case grammatical gender, and to show that efficient, optimized practice can produce large learning gains. We do this using an optimized schedule designed by Pavlik (2005) in the FaCT System and inspired by the memory schedules of Pimsleur (1967). We expect that, with a sufficient amount of practice under the right conditions, grammatical gender assignment can become proceduralized. Although grammatical gender is a relatively simple grammatical structure that (for English L1 speakers) should show little interference from structures in the native language, this is an important first step toward optimizing grammar learning overall, as well as toward learning more about the mechanisms available for learning an L2.

The instructional manipulation here is whether the relevant cue is presented with a prototype (made salient by its frequency and simplicity) or with a larger number of varied exemplars. In both cases, instruction is explicit.

Dependent variables

One primary dependent variable is the percentage of correct gender judgments for a given rule. Because French has only two genders, chance performance is 50%. For pre- and post-test performance, where response bias is likely, we use the signal detection measure d-prime to assess sensitivity to gender cues.
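The following is a minimal sketch of a d-prime computation for a two-choice gender judgment, not the analysis code used in the study. It assumes that "feminine" is treated as the signal, so a hit is a feminine noun judged feminine and a false alarm is a masculine noun judged feminine. The function name `d_prime` and the 0.5 cell correction are assumptions made for this example.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Adds 0.5 to each cell (a log-linear correction, an assumption of this
    sketch) so that rates of exactly 0 or 1 do not produce infinite z-scores.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# A learner who answers "feminine" on every trial scores 50% correct on a
# balanced test but shows no sensitivity to the cues.
always_feminine = d_prime(hits=20, misses=0, false_alarms=20, correct_rejections=0)

# A learner who discriminates the two genders well.
sensitive = d_prime(hits=18, misses=2, false_alarms=2, correct_rejections=18)
```

The first case shows why percentage correct alone can mislead: a strong bias toward one response yields a d-prime of zero regardless of which gender is favored.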

In this study, because we are concerned with increase in skilled performance over time, it is important to consider response latency as a dependent variable. To test the consistency of student responding over time, we use the coefficient of variability (of response latencies) as defined by Segalowitz (e.g., Segalowitz, Segalowitz & Wood, 1998). This tells us whether student responses are highly variable (even if accuracy or average speed are high) or, more like native speakers and other proficient performers, consistently fast and accurate.
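As a sketch of the measure, the coefficient of variability can be computed as the standard deviation of a learner's response latencies divided by their mean. The latencies below are hypothetical values in milliseconds, and the use of the sample standard deviation is an assumption of this example.

```python
from statistics import mean, stdev


def coefficient_of_variability(latencies):
    """Standard deviation of the latencies divided by their mean."""
    return stdev(latencies) / mean(latencies)


# Hypothetical latencies (ms) for two learners with the same mean speed.
steady = [900, 950, 1000, 1050, 1100]
erratic = [500, 700, 1000, 1300, 1500]

cv_steady = coefficient_of_variability(steady)
cv_erratic = coefficient_of_variability(erratic)
```

Both learners average 1000 ms, but the second has a much larger coefficient, which is the pattern the measure is meant to separate from consistently fast responding.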

Independent variables

First, to ensure that the training is working, we use a pretest-posttest design to measure the overall effects of the online training. As in previous studies, the improvement from pre-test to post-test is significantly different from zero and appears to be bimodal, with some high-gain learners and a larger number of low- or no-gain learners.

The main manipulation in this study is the presentation of half the rules with a salient prototype word that follows the cue being taught explicitly. For such "prototype rules," 50% of trials for that rule consist of the prototype, while the other 50% consist of practice with 7 other equally practiced exemplars. In the baseline "exemplar rules," all practice is evenly distributed among 14 exemplars. The assignment of rules to conditions is randomized for each subject, creating a counterbalanced, within-subjects design in which all students see 14 prototype rules and 14 exemplar rules.
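The two practice distributions can be sketched as follows. This is a simplified illustration with a fixed number of trials per rule; in the study itself, practice was scheduled adaptively by the FaCT System. The function name `build_trials`, the trial count of 28, and the placeholder item names are assumptions made for this example.

```python
import random
from collections import Counter


def build_trials(exemplars, n_trials, prototype=None, seed=None):
    """Return a shuffled list of practice items for one rule.

    With a prototype, half of the trials show the prototype and the rest are
    split evenly over `exemplars`; without one, all trials are split evenly.
    """
    n_prototype = n_trials // 2 if prototype is not None else 0
    n_rest = n_trials - n_prototype
    if n_rest % len(exemplars) != 0:
        raise ValueError("trials do not divide evenly over the exemplars")
    per_item = n_rest // len(exemplars)
    trials = [prototype] * n_prototype
    trials += [word for word in exemplars for _ in range(per_item)]
    random.Random(seed).shuffle(trials)
    return trials


# Placeholder item names stand in for real nouns.
prototype_rule = build_trials([f"word{i}" for i in range(1, 8)], 28, prototype="proto", seed=0)
exemplar_rule = build_trials([f"word{i}" for i in range(1, 15)], 28, seed=0)

prototype_counts = Counter(prototype_rule)  # prototype: 14 trials, 7 others: 2 each
exemplar_counts = Counter(exemplar_rule)    # 14 exemplars: 2 trials each
```

With equal total practice per rule, the prototype word receives seven times as many trials as any other single word, which matters when interpreting accuracy on the prototype itself.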

The rules of the study involve a set of 34 cues summarized here.

4 of these cues involve tighter specifications of the shape of more general cues. For example final -té is more specific than final -é. In these specificity relations, the more specific cue should dominate over the less specific cue, but this relation must be learned.

  1. -té f, but
  2. -é m
  3. -tion/-xion/-sion f, but
  4. -on m
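The specificity relation above can be sketched as a lookup in which the longest matching ending wins. Treating suffix length as the measure of specificity is an assumption of this example, as are the names `CUES` and `assign_gender`; the cue set is limited to the endings in the list above.

```python
# The specificity cues listed above, as (suffix, gender) pairs.
CUES = [
    ("té", "f"), ("é", "m"),
    ("tion", "f"), ("xion", "f"), ("sion", "f"), ("on", "m"),
]


def assign_gender(noun, cues=CUES):
    """Return the gender predicted by the longest matching suffix, or None."""
    matching = [(suffix, gender) for suffix, gender in cues if noun.endswith(suffix)]
    if not matching:
        return None
    # The more specific (longer) cue dominates the more general one.
    return max(matching, key=lambda pair: len(pair[0]))[1]


liberte = assign_gender("liberté")  # matches -té and -é; -té wins
cafe = assign_gender("café")        # matches only -é
nation = assign_gender("nation")    # matches -tion and -on; -tion wins
chanson = assign_gender("chanson")  # matches only -on
```

Note that the lookup predicts masculine for chanson, which is in fact feminine, so even a correctly ordered cue set leaves lexical exceptions that must be learned individually.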

6 other cues stand in pairs with the shorter forms being masculine and the longer feminine.

  1. -ine/-aine f
  2. -in/-ain m
  3. -ais/-ois m
  4. -aise/-oise f
  5. -en/-ien m
  6. -nne f

14 other cues are highly derivational in nature:

  1. -age m
  2. -isme/-me m
  3. -tre m
  4. -se/-euse/-esse/-ise f
  5. -tte f
  6. -ère f
  7. -ie f
  8. -ée f
  9. -ance f
  10. -ure f
  11. -trice f
  12. -eur (agential) m
  13. -ier m

6 further cues are simply phonological:

  1. -u/-au/-ou/-eau m
  2. -l m
  3. -rd m
  4. -i/-oi m
  5. -e f
  6. final C m

Note that final c, r, f, and l tend to be pronounced (mnemonic: CaReFuL).

4 cues are semantic in content:

  1. borrowing m
  2. fem gender f
  3. masc gender m
  4. time m

In the first-semester vocabulary, the exceptions to these cues were téléphone, vie, basket, chanson, and groupe.


Hypotheses

  1. The presence of a salient prototype will accelerate learning.
  2. Within the prototype condition, differences between cues (specifically, whether the exemplars of a cue are similar and tightly grouped or very dissimilar in form) will predict the effectiveness of prototype presentation. For example, one cue states that words referring to a person of a specific gender always take that gender (semantic gender). The words covered by this cue can be very dissimilar in form: frère (brother) is masculine, whereas tante (aunt) is feminine. We predict that a prototype will do little to help with this sort of rule. On the other hand, a cue such as -tion → F, whose examples are orthographically very similar, should show a stronger effect of the prototype.
  3. Learning will be most robust in the prototype condition: when tested in a follow-up session with no prototype words presented at all, performance on the prototype will generalize to non-prototype words.

These predictions derive from the Competition Model (MacWhinney, 2006). Also, in line with the literature on the use of extensive practice toward proceduralization (e.g., Anderson & Fincham, 1994), we will use the Segalowitz measure as well as general increases in accuracy and speed to show some level of proceduralization with optimized practice.


Findings

Accuracy increased by approximately 40% between pre- and post-test after four 15-minute practice sessions.

The prototype manipulation showed a main effect: "prototype" cues were learned to higher accuracy (t = 2.53, p < .005) and with shorter latencies (p < .05). However, this advantage reflects strong learning of the prototype word itself rather than generalization to the other examples of the cue, as the following table shows (standard deviations in parentheses):

Condition                            Mean Accuracy   Mean Latency (correct responses only)
Exemplar                             0.81 (0.39)     1442 (1069)
Prototype ("Prototype" word itself)  0.89 (0.31)     1246 (965)
Prototype (Other words)              0.79 (0.41)     1432 (1037)


Explanation

Although the prototype manipulation was effective in increasing overall accuracy for cues presented with prototypes, this seems to be an artifact of the overlearning of the prototype word itself. Because of the small sample size and questionable compliance with the practice schedule, however, we are hesitant to conclude that the prototype has no beneficial effect.

One possible alternative explanation for the lack of a generalization advantage with this highly frequent exemplar is the explicit cue instruction. Because all cues were presented with explicit study trials and feedback prompts (e.g., -age → M), it could be that stimulus manipulations are ineffective in this case. The follow-up (laboratory) study uses both implicit and explicit instruction, crossed with the prototype and exemplar conditions of the current study, and its results seem to go some way toward disambiguating the relative effects of the two instructional manipulations.


Annotated bibliography

  • Anderson, J. R., & Fincham, J. M. (1994). Acquisition of procedural skills from examples. Journal of Experimental Psychology: Learning, Memory, & Cognition, 20(6), 1322-1340.
  • Carroll, S. (1999). Input and SLA: Adults' sensitivity to different sorts of cues to French gender. Language Learning, 49, 37-92.
  • DeKeyser, R. M. (2005). What Makes Learning Second-Language Grammar Difficult? A Review of Issues. Language Learning, 55(Suppl1), 1-25.
  • Holmes, V. M., & Dejean de la Batie, B. (1999). Assignment of grammatical gender by native speakers and foreign learners of French. Applied Psycholinguistics, 20, 479-506.
  • Holmes, V. M., & Segui, J. (2004). Sublexical and lexical influences on gender assignment in French. Journal of Psycholinguistic Research, 33, 425-457.
  • Lyster, R. (2006). Predictability in French gender attribution: A corpus analysis. French Language Studies, 16, 69-92.
  • MacDonald, J. L., & MacWhinney, B. (1991). Levels of learning: A microdevelopmental study of concept formation. Journal of Memory and Language, 30, 407-430.
  • MacWhinney, B. (2006). A unified model. In N. Ellis & P. Robinson (Eds.), Handbook of Cognitive Linguistics and Second Language Acquisition. Mahwah, NJ: Lawrence Erlbaum Press.
  • Pavlik Jr., P. (2005). Modeling order effects in the learning of information.
  • Pavlik Jr., P., & Anderson, J. R. (2005). Practice and forgetting effects on vocabulary memory: An activation-based model of the spacing effect. Cognitive Science, 29(4), 559-586.
  • Pimsleur, P. (1967). A memory schedule. The Modern Language Journal, 51(2), 73-75.
  • Robinson, P. (1997). Generalizability and automaticity of second language learning under implicit, incidental, enhanced, and instructed conditions. Studies in Second Language Acquisition, 19(2), 223-247.
  • Segalowitz, S., Segalowitz, N., & Wood, A. (1998). Assessing the development of automaticity in second language word recognition. Applied Psycholinguistics, 19, 53-67.
  • Ullman, M. T. (2001). The neural basis of lexicon and grammar in first and second language: the declarative/procedural model. Bilingualism: Language and Cognition, 4(1), 105-122.