Rose - Integrated framework for analysis of classroom discussions

From LearnLab
Revision as of 04:25, 31 January 2010 by Cprose

Within the field of computer-supported collaborative learning, the question of what makes group discussions productive for learning and community building has been explored under different names, with very similar findings and perhaps subtle distinctions: transactivity (Berkowitz & Gibbs, 1983; Teasley, 1997; Weinberger & Fischer, 2006) in the cognitive learning community, and uptake (Suthers, 2006), group cognition (Stahl, 2006), or productive agency (Schwartz, 1998) in the socio-cultural learning community. Despite differences in orientation between the cognitive and socio-cultural learning communities, the conversational behaviors that have been identified as valuable are very similar. Building on these common findings, the field of Computer-Supported Collaborative Learning has developed support for collaborative learning that addresses observed weaknesses in conversational behavior related to this phenomenon.

In order to deepen and expand our understanding of what has been called ‘transactivity’ in the literature on collaborative dyadic interaction, we are attempting to extend those ideas to student discourse in the context of classroom discussion. The de Lisi and Golbeck interpretation of Piaget’s theory models the process through which experiences with peers can play a critical role in the development of a child’s cognitive system. A key idea that has been appropriated from this theory is that when students come together to solve a problem, bringing with them different perspectives, the interaction causes the participants to consider questions that might not have occurred to them otherwise. Through this interaction, children operate on each other’s reasoning, and in so doing they become aware of inconsistencies between their reasoning and that of their partner, or even within their own model itself (Teasley). This process was termed transactive discussion after Dewey and Bentley (1949), and further formalized by Berkowitz and Gibbs (1980, 1983, manual). A transactive discussion is defined most simply as “reasoning that operates on the reasoning of another” (Berkowitz and Gibbs 1983), although the Berkowitz and Gibbs formulation also allows for transactive contributions to operate on formerly expressed reasoning of the speaker himself.

Explicitly articulated reasoning and transactive discussion are at the heart of what makes collaborative learning discussions valuable. When we shift to consider teacher-guided classroom discourse, we will still find similar collaborative exchanges between peers, but there they will be enriched by the pedagogical lead of the teacher. The teacher is responsible for orchestrating the discussion and setting up a structure that is used to elicit reasoned participation from the students.

Any transcript can be coded in limitless ways. Our choice of codes is driven by certain hypotheses about what kinds of peer-to-peer or teacher-student discourse will promote robust learning. We are seeking to make those hypotheses as precise as possible, so that we can operationalize the discourse categories into a codable form and study them systematically. However, in classroom situations, where the teacher plays the role of lead orchestrator of talk, there is a need to code teacher and student discourse differently, in order to develop quantifiable measures of the kinds of teaching and classroom interaction we think are productive. That way we can test hypotheses of various kinds, for example whether a certain sequence of teacher moves frequently leads to a certain kind of student talk, or whether the quantity of a particular kind of student talk is associated with better learning outcomes (e.g., pre- to post-test gains). We are not looking for the same thing in teacher and student discourse, so we do not code each utterance for the same things regardless of whether the speaker is teacher or student.
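
As a rough illustration of the first kind of hypothesis test, the sketch below estimates how often each student code follows each teacher move in a coded transcript. The transcript, the code labels ("revoice", "evaluate", "transactive", "other"), and the function name are all hypothetical placeholders, not the coding schemes under development.

```python
from collections import Counter

# Hypothetical coded transcript: one (speaker role, code) pair per turn.
# The labels are placeholders, not the project's actual coding scheme.
turns = [
    ("teacher", "revoice"),
    ("student", "transactive"),
    ("teacher", "evaluate"),
    ("student", "other"),
    ("teacher", "revoice"),
    ("student", "transactive"),
    ("teacher", "revoice"),
    ("student", "other"),
]

def followup_rates(turns):
    """Estimate P(student code | preceding teacher move) from adjacent turns."""
    pair_counts = Counter()
    move_counts = Counter()
    for (role_a, code_a), (role_b, code_b) in zip(turns, turns[1:]):
        if role_a == "teacher" and role_b == "student":
            pair_counts[(code_a, code_b)] += 1
            move_counts[code_a] += 1
    return {pair: n / move_counts[pair[0]] for pair, n in pair_counts.items()}

rates = followup_rates(turns)
for (move, reply), rate in sorted(rates.items()):
    print(f"{move} -> {reply}: {rate:.2f}")
```

With real data, rates like these would only be a starting point; a claim that a teacher move "leads to" a kind of student talk would still need an appropriate statistical test across classrooms.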

If transactivity is defined as actively using another person’s thinking to change or develop your own thinking, then the teacher is acting (in some classrooms) as a super proxy, creating the conditions for everybody to experience transactivity vicariously or directly. We talk about this as scaffolded transactivity. Our team, including our colleagues at Boston University, is working to develop two complementary coding schemes: one that tracks student talk (led by the CMU team), and one that tracks teacher moves that scaffold the development of transactivity in student talk (led by the BU team).

Recently we have reviewed a large amount of literature from the area of systemic functional linguistics. Several relevant lines of work come from this community. First, they have a long track record of work on the analysis of social interactions within traditional academic writing (Halliday, Martin, Rose, Christie, Hyland). In particular, their Appraisal system includes sub-systems for characterizing how people position themselves through language in relation to their listeners, the content they are communicating, and the relationship between that content and that of previous contributions (e.g., earlier publications). Most relevant is their work on a sub-system they refer to as Engagement. Some of this work has already been adapted to face-to-face conversation, including whole-group class discussions about math. There, in addition to an analysis of these interpersonal/relational aspects of language, there is an analysis of linguistic constructions that are useful for articulating math concepts and that can be used as an indicator of the level of sophistication in a student’s math articulation, which may be useful for our work from an assessment standpoint. Part of their goal has been to make the act of positioning, which Martin and Rose characterize as “power relationships” within the texts, explicit and therefore teachable. Another aspect of their work that builds on this is work on literacy issues, especially for traditionally “low power” populations, such as aboriginal communities in Australia.

A major aspect of the work we are doing involves automatic analysis of discussion data. Some publicly available tools we have produced are found at and

Machine-learning algorithms can learn mappings between a set of input features and a set of output categories. They do this by using statistical techniques to find characteristics of hand-coded “training examples” that exemplify each of the output categories. The goal of the algorithm is to learn rules by generalizing from these examples in such a way that the rules can be applied effectively to new examples. In order for this to work well, the set of input features provided must be sufficiently expressive, and the training examples must be representative. Typically, machine-learning researchers design a set of input features that they suspect will be expressive enough. At the most superficial level, these input features are simply the words in a document. But many other features are routinely used in a wide range of text-processing applications, such as word collocations and simple patterns involving part of speech tags and low-level lexical features; we will draw from this prior work.
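
The mapping from input features to output categories can be sketched in a few lines. The example below is a minimal illustration, not the TagHelper implementation: it represents each text as unigrams plus adjacent-word bigrams (a simple stand-in for collocations) and trains a multinomial Naive Bayes classifier with add-one smoothing. The training sentences, the category labels, and all function names are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

def extract_features(text):
    """Unigrams plus adjacent-word bigrams (a simple stand-in for collocations)."""
    words = text.lower().split()
    bigrams = [f"{a}_{b}" for a, b in zip(words, words[1:])]
    return words + bigrams

def train(examples):
    """Collect per-category feature counts from (text, label) training examples."""
    label_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for feat in extract_features(text):
            feature_counts[label][feat] += 1
            vocab.add(feat)
    return label_counts, feature_counts, vocab

def classify(model, text):
    """Return the category with the highest add-one-smoothed log probability."""
    label_counts, feature_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)
        denom = sum(feature_counts[label].values()) + len(vocab)
        for feat in extract_features(text):
            if feat in vocab:  # features never seen in training are ignored
                score += math.log((feature_counts[label][feat] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy hand-coded training examples; the labels are illustrative only.
training = [
    ("i disagree because your answer ignores friction", "reasoning"),
    ("that works because the slope is constant", "reasoning"),
    ("so the force doubles because mass doubles", "reasoning"),
    ("ok", "other"),
    ("can you repeat that", "other"),
    ("what page are we on", "other"),
]
model = train(training)
print(classify(model, "i think it is bigger because the slope is steeper"))
```

Six examples are far too few to be representative; the point is only to show where the hand-coded examples and the feature representation enter the process.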

Once candidate input features have been identified, analysts typically hand-code a large number of training examples. The previously developed TagHelper tool set (Rosé et al., 2008) allows users to define how texts will be represented and processed by making selections in the GUI. In addition to basic text-processing tools such as part-of-speech taggers and stemmers, which are used to construct a representation of the text that machine-learning algorithms can work with, a variety of algorithms from toolkits such as Weka (Witten & Frank, 2005) are included in order to provide many alternative ways to map between the input features and the output categories. Based on their understanding of the classification problem, machine-learning practitioners typically pick an algorithm that they expect to perform well. Often this is an iterative process of applying an algorithm, seeing where the trained classifier makes mistakes, and then adding input features, removing extraneous input features, or experimenting with other algorithms.
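
The "seeing where the trained classifier makes mistakes" step can be sketched as a small error report. This is an assumption about what such a report might contain, not a description of TagHelper's output: it tabulates gold versus predicted codes, lists the misclassified texts, and computes accuracy and Cohen's kappa. The texts, labels, and predictions are invented for the example.

```python
from collections import Counter

def error_report(texts, gold, predicted):
    """Summarize where predictions disagree with the hand-coded (gold) labels."""
    n = len(gold)
    confusion = Counter(zip(gold, predicted))  # (gold, predicted) -> count
    errors = [(t, g, p) for t, g, p in zip(texts, gold, predicted) if g != p]
    observed = sum(g == p for g, p in zip(gold, predicted)) / n
    # Agreement expected by chance, from the two marginal label distributions.
    gold_freq, pred_freq = Counter(gold), Counter(predicted)
    expected = sum(gold_freq[c] * pred_freq[c] for c in gold_freq) / (n * n)
    # Degenerate case: a single category everywhere leaves kappa undefined.
    kappa = (observed - expected) / (1 - expected) if expected < 1 else 1.0
    return {"confusion": confusion, "errors": errors,
            "accuracy": observed, "kappa": kappa}

# Hypothetical texts, hand-coded labels, and classifier predictions.
texts = ["because the slope is constant", "i disagree with that",
         "so it must be heavier", "ok", "what page", "why is that on the board"]
gold = ["reasoning", "reasoning", "reasoning", "other", "other", "other"]
predicted = ["reasoning", "reasoning", "other", "other", "other", "reasoning"]

report = error_report(texts, gold, predicted)
for text, g, p in report["errors"]:
    print(f"coded {g}, predicted {p}: {text}")
```

Reading the misclassified texts side by side is what suggests the next configuration to try, for instance a feature that captures "so ... must" when that pattern is being missed.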

Applying this iterative process requires insight and skill in linguistics and machine learning that the social scientists conducting corpus analysis are unlikely to possess. TagHelper tools supports this process by making it easy to define different processing configurations through the GUI and by providing reports about how each configuration performed and where processing may have broken down. The goal of our tool development is to make this process easier for social scientists. In particular, identifying where processing has broken down and how the configuration can be tuned to improve performance requires more expertise than typical social scientists would possess. Thus, the bulk of our development work will be in developing the machinery to bridge the gap between the natural structure of the input texts and the behaviors that social scientists are interested in cataloguing and coding, using bootstrapping approaches.

In our recent corpus-based experiments (Joshi & Rosé, 2009; Arora, Joshi, & Rosé, 2009) we have explored the effect of alternative types of syntactically motivated features on text classification performance. Our methodology is discussed extensively in our recent article in the International Journal of Computer-Supported Collaborative Learning, which investigates the use of text classification technology for automatic collaborative learning process analysis (Rosé et al., 2008). In more recent work we have experimented with learning paradigms such as genetic programming and genetic algorithms to “evolve” more powerful features that improve classification performance.
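
To give a flavor of what "evolving" features means, the sketch below runs a deliberately simple, mutation-only evolutionary search over word-pair features. It is far simpler than genetic programming and is not the method of the cited papers. A feature here is a pair of words that fires when both occur in a text, and fitness is the difference in firing rate between the two classes. The data, the feature form, the fitness measure, and all names are assumptions made for illustration.

```python
import random

# Toy labeled data; 1 = contains reasoning, 0 = does not (illustrative only).
data = [
    ("i disagree because the slope is wrong", 1),
    ("it doubles because mass doubles", 1),
    ("so the answer is wrong because of friction", 1),
    ("the slope is on page two", 0),
    ("is the answer on the board", 0),
    ("what page is it", 0),
]
vocab = sorted({w for text, _ in data for w in text.split()})

def fires(feature, text):
    """A feature is a pair of words; it fires when both occur in the text."""
    words = set(text.split())
    return feature[0] in words and feature[1] in words

def fitness(feature):
    """Absolute difference between the feature's firing rates in the two classes."""
    pos = [fires(feature, t) for t, y in data if y == 1]
    neg = [fires(feature, t) for t, y in data if y == 0]
    return abs(sum(pos) / len(pos) - sum(neg) / len(neg))

def mutate(feature, rng):
    """Replace one of the two words with a random vocabulary word."""
    words = list(feature)
    words[rng.randrange(2)] = rng.choice(vocab)
    return tuple(words)

def evolve(generations=20, pop_size=8, seed=0):
    """Keep the fitter half each generation and refill with mutated copies."""
    rng = random.Random(seed)
    population = [(rng.choice(vocab), rng.choice(vocab)) for _ in range(pop_size)]
    history = []  # best fitness at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        survivors = population[: pop_size // 2]  # elitist selection
        children = [mutate(rng.choice(survivors), rng) for _ in survivors]
        population = survivors + children
    best = max(population, key=fitness)
    return best, fitness(best), history

best, best_fit, history = evolve()
print(best, round(best_fit, 2))
```

Because the fitter half always survives, the best fitness never decreases from one generation to the next. A fuller approach would add crossover, richer feature forms such as syntactic patterns, and evaluation on held-out data to guard against overfitting.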