DataShop Pipeline

From LearnLab
Revision as of 17:57, 31 July 2007 by Bkb (talk | contribs) (LIVE!)

This page provides information on datasets in various stages of progress. If you see an error in any of this information, please feel free to correct it by editing this page. If you have a comment or concern about where a dataset sits in the DataShop pipeline, please contact the DataShop staff.

LIVE!

(on learnlab.org - ordered alphabetically by project)

Project Dataset Name Tool LearnLab P.I. School(s) Date Notes/Status Remarks
1 Algebra Course Algebra I 2005
Algebra I 2005-2006 (Hampton only)
CL-2005 Algebra Albert Corbett CWCTC
Hampton
Wilkinsburg HSs
2005-2006 2/20: Received DataMunger.

2/27: Started running data munger for wilkinsburg-algebra i.
3/5: Munger done, have 1,378,891 txs instead of 1,187,841.
5/4: Munger fixed, Jon Steinhart running the conversion, Tristan to QA. Supposed to have data week of 5/7
Load of Wilkinsburg student data delayed due to problems copying the student files.
6/25: Hampton numbers on production not right. CWCTC numbers on the-cooker not right. Wilks data doesn't load. Requested a new complete CD from CL.
7/9: Modifying analysis database to support improved pruning, waiting for new CD with latest version of munger.
7/23: Loaded onto production with all 3 schools, one student from Wilkinsburg failed to load.

1 Chemistry Buffer Chemistry_Buffer_Study OLI Chemistry Jodi Davenport CMU Spring 2006  
2 Chemistry Buffer CMU_sp07_BUFFERS OLI Chemistry Jodi Davenport CMU unknown  
3 Chemistry Buffer Chemistry_Buffer_Study_2007 OLI Chemistry Jodi Davenport CMU, UBC? Spring 2007
4 Chemistry Collaboration Effects of collaboration in virtual laboratory environments VLAB Chemistry Bruce McLaren Spring 2007 Files only.
5 Chinese Tone Study Chinese_tonestudy unknown Chinese Ying Liu CMU Summer 2006
6 Chinese Tone Study Two Chinese_toneperception unknown Chinese Ying Liu CMU 2005-2006
8 Contiguity-CWCTC Contiguity CWCTC Spring 2006 CL Geometry Vincent Aleven CWCTC HS Spring 2006
9 Contiguity-CWCTC Contiguity CWCTC Winter 2006 CL Geometry Vincent Aleven CWCTC HS Winter 2006
10 Contiguity-CWCTC Contiguity CWCTC Fall 2006 CL + CTAT Geometry Kirsten Butcher, Vincent Aleven CWCTC Fall 2006 6/21: Reloaded.

7/23: Received missing CTAT data from Frank. Kyle to work on 7/24.
07/18/07: v4.13 Loaded on test machine for reload verify

11 Contiguity-CWCTC Contiguity CMU Winter 2007 CL Geometry Kirsten Butcher, Vincent Aleven CMU Winter 2007 6/21: Reloaded.
12 Division Tutor division_study CTAT-Flash Geometry? Stefan King Perrysville Elem Summer 2007
Does Treating Student Uncertainty as a Learning Impasse Improve Learning in Spoken Dialogue Tutoring? WOZ Uncertainty Adaptation ITSPOKE Physics (intended) Kate Forbes-Riley; Diane Litman Lab experiment with Pitt students Winter-Spring 2006-7 7/31/07: Project & Dataset created.
13 Elementary Chinese Course ElemChineseFA06 OLI Chinese
14 Elementary Chinese Course ElemChineseSU07 OLI Chinese
15 Elementary Chinese Course ElemChineseSP07 OLI Chinese
16 Example Study - Freiburg and CMU Example CMU Summer 2006 CL-modified Geometry Vincent Aleven CMU Summer 2006
18 Example Study - Freiburg and CMU Example Freiburg Summer 2006 CL Geometry Alexander Renkl unknown Summer 2006
19 Example Study - Freiburg and CMU Example CWCTC Winter 2007 CL+CTAT Geometry Vincent Aleven CMU Summer 2006 6/19: Need to reload CL data. Need to reload CTAT data.

6/21: Reloading CTAT to QA machine. CL will follow.
6/21: Data reloaded to production.
6/25: Bad CTAT logs (not all students were anonymized correctly) removed from production.
7/18: CL data: v4.13 Loaded on test machine for reload verify by Octav.
7/24: CTAT data: ready to load
7/31: CTAT data reloaded.

20 Example Study - Freiburg and CMU Example Steel Valley Spring 2006 CL Geometry Vincent Aleven Steel Valley HS Spring 2006 6/19: Need to reload CL data.

6/21: Data reloaded to production.

Example Study - Freiburg and CMU Example Spring 2007 CL-Lisp tutor (no CTAT) Geometry Vincent Aleven, Ron Salden CWCTC & Wilkinsburg Spring 2007 07/06/07: Will receive from CL week of 7/10/07

07/09/07: Octav received data
07/18/07: Loaded on cooker
07/21/07: QUESTION: is this 3 datasets or 1 with 3 schools?
7/29/07: Loaded to production.

Example Study - Freiburg and CMU Example Freiburg 2007 CL-Lisp tutor (no CTAT) Geometry Vincent Aleven, Ron Salden Gymnasium Spring 2007 07/10/07: Octav verifying conversion

7/29/07: Loaded to production.

21 French Course French Languag2 OLI French
22 French Course French Language OLI French
23 French Course French Language2 OLI French
24 French Culture Study French Online CTAT? French Amy Ogan CMU Fall 2005
25 French Culture Study FrenchTutor_Demo CTAT? French Amy Ogan CMU Spring 2005
26 French Culture Study French_Culture_Tutor CTAT? French Amy Ogan CMU Spring 2005
27 French Culture Study French_Culture_Tutor_Fall_2005 CTAT? French Amy Ogan CMU Fall 2005
28 Geometry Course Geometry Angles - North Hills Spring 2003 CL Geometry Vincent Aleven a North Hills HS Spring 2003 Paper: Aleven & Koedinger, 2002 in Cognitive Science?? Or Popescu
29 Geometry Course Hampton Fall 2005 CL Geometry Vincent Aleven Hampton HS 2005-2006 Redoing this dataset with bug fixes in Octav's converter.
37 Geometry Course Geometry Area (1996-97) CL Geometry Ken Koedinger unknown 1996-97 Paper written -- get reference!!
38 Geometry Course Geometry-AllStudents CL Geometry Ken Koedinger unknown unknown Bad Version
38 Geometry Course Geometry Angles - Fox Chapel 1998 CL Geometry Vincent Aleven unknown unknown Files-only

7/23: Low priority to do: get raw logs and distill to transaction level.

30 IERI: Learning Oriented Dialogue Project Learning Oriented Dialogue Project - original
2 Improving Algebra Learning and Collaboration CPS Algebra I 2005 CL-2005 Algebra Bruce McLaren, Nikol Rummel Hampton HS 2005-2006
2 Improving Algebra Learning and Collaboration PTS Algebra I 2005 CL-2005 Algebra Bruce McLaren, Erin Walker 2005-2006 raw data files only
31 Intelligent Writing Tutor iwt_course
32 Intelligent Writing Tutor iwt retention course
40 Knowledge Tracking Chinese Vocabulary Spring 2006 Phil's own Chinese Phil Pavlik CMU Spring 2006
Knowledge Tracking Spanish Vocabulary Spring 2006 Phil's own Spanish Phil Pavlik Winter 2007
Knowledge Tracking Chinese Vocabulary Transfer Lab Study Spring 2006 Phil's own Chinese Phil Pavlik Spring 2006 6/22: Loaded to production
33 MacWhinney Dictation Studies Chinese Dictation Fall 2005 unknown Chinese Brian MacWhinney unknown Fall 2005
34 MacWhinney Dictation Studies French Dictation Fall 2005 unknown French Brian MacWhinney unknown Fall 2005
35 MacWhinney Dictation Studies Spanish Dictation Fall 2005 unknown Other Brian MacWhinney unknown Fall 2005
36 OLI Statistics 07Meyer201 OLI Other CMU 2007 Loaded on Jun-08-2007
Physics Physics - USNA - Fall 2006 Andes Physics Kurt VanLehn US Naval Academy Fall 2006 6/19/07: Received 18,000 files from Anders.
6/20/07: Loaded to cooker, awaiting verification.

6/22/07: Tim to work on better skill model with Brett van de Sande & Anders. Waiting on result of this.
6/27/07: Anders regenerating raw logs
6/28/07: Files distilled; ~100 have invalid XML. Anders to fix.
7/3/07: Reloading to the-cooker, some tutor semantic names are off. Also asked Anders to include units.
7/24/07: Received new data from Anders. Fixing bugs in distiller.
7/29/07: Data loaded to production - will be reloaded for changes to distiller.

39 Public Pre_Summer_School_01_Jun_05 CTAT-Flash Chemistry unknown unknown June 2005 Used to show DataShop features, example data only.
42 The Effect of Generation and Interaction on Robust Learning Self Explanation - Electric Fields - USNA - Spring 2006 Andes Physics Bob Hausmann USNA Spring 2006 Loaded on Jun-08-2007.


Previously known as 'Hausmann-Experiment'.

Robust Learning of Vocabulary REAP ELI Reading 4 Summer 2006 REAP English Maxine Eskenazi Pitt-ELI Summer 2006 3/29/2007 - Sent message to Michael, Maxine to determine status

4/4/07 - Waiting for Michael's studies to finish, he will then convert to DS format.
6/12/07 - Waiting for Michael to add <conditions> to the datasets
6/25/07 - Loading datasets to the-cooker
6/26/07 - Ready to load to production
6/27/07 - Loaded to production

Robust Learning of Vocabulary REAP ELI Reading 4 Spring 2006 REAP English Maxine Eskenazi Pitt-ELI Spring 2006 3/29/2007 - Sent message to Michael, Maxine to determine status

4/4/07 - Waiting for Michael's studies to finish, he will then convert to DS format.
6/12/07 - Waiting for Michael to add <conditions> to the datasets
6/25/07 - Loading datasets to the-cooker
6/26/07 - Reloading to the-cooker
6/27/07 - Ready to load to production
6/27/07 - Loaded to production

43 Stoichometry Study PSLC Stoichiometry Study 1, 2 and 3 CTAT-Flash Chemistry Bruce McLaren UBC, a NJ HS, Hampton HS 2005+ Paper written. Problem name labels include the condition and probably should not; this makes the error report difficult. Problem descriptions are also missing.
44 Stoichometry Study SummerSchool2005 Chemistry Bruce McLaren
45 Stoichometry Study Winter_Workshop01 Chemistry Bruce McLaren
46 Stoichometry Study PSLC Stoichiometry Study Demo Chemistry Bruce McLaren
47 Thermodynamics Thermo Fall 2005 unknown Other Vincent Aleven unknown Fall 2005
48 Unclassified TWS_Group_01
49 WPI-Assistments Assistments - 8th Grade Math - 2004-2005 (200 students) Assistments Other Neil Heffernan Mass Public 2004-2005

6/20: Need to reload.
6/25: Reloaded.

50 WPI-Assistments Assistments - 8th Grade Math - 2004-2005 (762 students) Assistments Other Neil Heffernan Mass Public 2004-2005

6/20: Need to reload.
6/25: Ready to load (on production)
6/26: Reloaded.

LFA - OutOfMemory on all 4 KC models.
WPI-Assistments Assistments - 8th Grade Math - 2005-2006 (200 students) Assistments Other Neil Heffernan Mass Public 2005-2006 Loaded on 7/03/07
10 WPI-Assistments Assistments - 8th Grade Math - 2005-2006 (all students) Assistments Other Neil Heffernan Mass Public 2005-2006 Loaded on 7/03/07

READY TO GO LIVE

(datasets that are ready to be loaded to learnlab.web, highest priority on top)

Project Dataset Name Tool LearnLab P.I. School(s) Date Notes/Status Remarks
None at the moment

IN TESTING

(datasets that have been loaded and are to be reviewed for correctness, highest priority on top)

Project Dataset Name Tool LearnLab P.I. School(s) Date Notes/Status Remarks
Robust learning with a Meta-Cognitive Tutor Help Tutor - Angles and Quads - CWCTC - 2006 (cognitive) CTAT+CL Geometry Ido Roll CWCTC HS 2005-2006 7/18: Loaded on the-cooker, can be reviewed.

7/18: Files received from Octav, modifying DataShop code to improve import speed.
7/26/07: Loaded to the-cooker.

Robust learning with a Meta-Cognitive Tutor Help Tutor - Angles and Quads - CWCTC - 2006 (meta) CTAT+CL Geometry Ido Roll CWCTC HS 2005-2006 7/18: Files received from Octav, modifying DataShop code to improve import speed.

7/26/07: Loaded to the-cooker.

Robust learning with a Meta-Cognitive Tutor Short Hints - Circles - Wilkinsburg - 2006 (cognitive) CTAT+CL Geometry Ido Roll Wilkinsburg HS 2005-2006 7/18: Files received from Octav, modifying DataShop code to improve import speed.

7/26/07: Loaded to the-cooker.

Robust learning with a Meta-Cognitive Tutor Short Hints - Circles - Wilkinsburg - 2006 (meta) CTAT+CL Geometry Ido Roll Wilkinsburg HS 2005-2006 7/18: Files received from Octav, modifying DataShop code to improve import speed.

7/26/07: Loaded to the-cooker.

Contiguity - CWCTC Contiguity Spring 2007 CL-Lisp tutor (no CTAT) Geometry Vincent Aleven, Kirsten Butcher CWCTC 07/06/07: Will receive from CL week of 7/10/07

07/09/07: Octav received data

07/18/07: Loaded on cooker

IN PROGRESS

(datasets that require additional conversion work or tweaking, highest priority on top)

Project Dataset Name Tool LearnLab P.I. School(s) Date Notes/Status Remarks
need a real name PSLC-OLI Physics Brett van de Sande ONR Fall 2006
Physics Physics - USNA - Fall 2005 Andes Physics Kurt VanLehn US Naval Academy Fall 2005 Waiting to receive files from Anders.
Physics Physics - USNA - Spring 2006 Andes Physics Kurt VanLehn US Naval Academy Spring 2006 Waiting to receive files from Anders.
Physics Physics - USNA - Spring 2007 Andes Physics Kurt VanLehn US Naval Academy Spring 2006 Waiting to receive files from Anders.
Geometry 2005-2006 CL-2005 (lisp) Geometry Vincent Aleven CWCTC, Hampton, Wilkinsburg HSs 2005-2006 Octav needs to convert this data into XML.

UPCOMING

(datasets we expect to receive soon - may or may not require additional processing on our part)

Project Dataset Name Tool LearnLab P.I. School(s) Date Notes/Status Remarks
1 unknown Physics Data Andes Physics Kurt VanLehn Not sure how much data we'll be getting aside from what is listed in the "In Progress" section.
Algebra Bridge to Algebra 2005/2006 CL-Munger Algebra 6/6/07: Data is good to munge.

Request sent to CL to break the dataset into smaller pieces.
07/09/2007: Confirmed that the munger can load sections only. Steve Ritter wishes to speak with Ken/Kurt to cover legal issues with this dataset.

Improving Skill at Solving Equations via Better Encoding of Algebraic Concepts Corrective Self Explanation CL-2006 Algebra Julie Booth Golden Valley High School Feb, 2007 07/06/07: Being QA'd by Tristan
A Multimodal Interface for Solving Equations Handwriting Examples - Winter 2006 CL-2006 Algebra Lisa Anthony Winter 2006 07/06/07: Being QA'd by Tristan
A Multimodal Interface for Solving Equations Handwriting - Spring 2007 CL-2006 Algebra Lisa Anthony Spring 2007 07/06/07: Being QA'd by Tristan
7 Stoichiometry VLAB data from Stoich studies Chemistry Bruce McLaren Oliver Scheuer to oversee conversion in Germany.

6/20/07: Mike Karabinos to work up examples for each type of VLAB Action, DataShop to fill in dummy and real <tutor_message>. Will then send this info to Oliver.
7/27/07: Kyle completed transformation to XML. Sent out for review.

8 ESL ESL data 7/27/07: Moving the Online Search System to PSLC servers in the near future. Will tie in to DataShop with single sign-on. No current plan to move any ESL data into DataShop itself.
9 Improving Algebra Learning and Collaboration PTS Algebra I 2005 CL Algebra Tutor Algebra Erin Walker ? 2005-2006 (not sure) Researcher has raw logs and has written a converter for analysis. Need to find out if it's in DataShop format.
10 Improving Algebra Learning and Collaboration PTS Algebra I Spring 2007 CL Algebra Tutor Algebra Erin Walker CWCTC Spring 2007 Researcher has raw logs and is responsible for submitting them to DataShop for import.

07/06/07: Frank sent anonymized raw logs to Erin.

11 Algebra I Course Algebra I 2006-2007 CL Algebra - Mungable Algebra Albert Corbett CWCTC, Hampton, Wilkinsburg, 3 LA schools 2006-2007 school year Waiting for harvest from CL. Also need a new version of munger to load.
12 Geometry Course Geometry 2006-2007 CL Geometry Geometry Vincent Aleven CWCTC, Hampton, Wilkinsburg 2006-2007 school year Waiting for harvest from CL. Will need a conversion from Octav.
13 Knowledge Tracking Chinese Vocabulary Fall 2006 Phil's own Chinese Phil Pavlik Fall 2006 07/06/07: Phil working on conversion. ETA 4-6 weeks.
14 Knowledge Tracking Chinese Vocabulary Winter 2007 Phil's own Other Phil Pavlik Spring 2006 07/06/07: Phil working on conversion. ETA 4-6 weeks.
15 Knowledge Tracking French Vocabulary Fall 2006 Phil's own French Phil Pavlik Fall 2006 07/06/07: Nora Presson to send data, no ETA yet.
15 Robust Learning of Vocabulary Fall 2006 Personalization Study REAP Tutor English Juffs English Language Institute Fall 2006 No ETA from Michael yet.
15 Robust Learning of Vocabulary Spring 2007 Reading 4 REAP Tutor English Juffs English Language Institute Fall 2006 No ETA from Michael yet.
Robust Learning of Vocabulary Summer 2007 Word Sense and Pronunciation Audio Study REAP Tutor English Juffs English Language Institute Fall 2006 No ETA from Michael yet.
Fostering fluency in second language learning: Testing two types of instruction Repetition in fluency training, Study 1 4/3/2 training English De Jong English Language Institute Fall 2006
Fostering fluency in second language learning: Testing two types of instruction Formulaic sequences in fluency training, Study 2 4/3/2 training English De Jong English Language Institute Spring 2007
Fostering fluency in second language learning: Testing two types of instruction Formulaic sequences in fluency training, Study 3 shadowing training English De Jong English Language Institute Spring 2007
Training oral production in learning second language grammar The order of French pronouns Flash online audio recording French De Jong Regular French courses at Pitt and CMU Spring and Summer 2007
Training oral production in learning second language grammar The use of French conditionals Flash online audio recording French De Jong Regular French courses at Pitt and CMU Spring and Summer 2007

Octav to regenerate all of his datasets?