DataShop Pipeline

From LearnLab
Revision as of 02:47, 23 January 2008 by Alida (talk | contribs) (IN TESTING)

This page provides information on datasets in various stages of progress. If you see an error in any of this information, please feel free to correct it by editing this page. If you have a comment or concern about where a dataset sits in the DataShop pipeline, please contact the DataShop staff.

ANALYSIS REPORTED

(on learnlab.org - ordered alphabetically by project; researchers get a gold star when their datasets move up into this table!)


Project Dataset Name DB ID Tool LearnLab P.I. School(s) Date Notes/Status Remarks
Chemistry Collaboration Effects of collaboration in virtual laboratory environments 95 VLAB Chemistry Bruce McLaren Spring 2007 "The CoChemEx Project: Conceptual Chemistry Learning through Experimentation and Adaptive Collaboration"
Contiguity Contiguity CWCTC Spring 2006 79 CL Geometry Vincent Aleven CWCTC HS Spring 2006
Geometry Course Geometry Area (1996-97) 76 CL Geometry Ken Koedinger unknown 1996-97 Paper written -- get reference!!
8/17: Primary analysis complete. Cen (2006), Cen (2007)
Geometry Course Geometry Angles - Fox Chapel 1998 122 CL Geometry Vincent Aleven unknown unknown Files-only

7/23: Low priority to do: get raw logs and distill to transaction level.
8/17: Two best paper awards!

Improving Algebra Learning and Collaboration CPS Algebra 1 2005 109 CL-2005 Algebra Bruce McLaren, Nikol Rummel Hampton HS 2005-2006
Knowledge Tracking Chinese Vocabulary Fall 2006 160 FaCT Chinese Phil Pavlik CMU Spring 2006 2 papers
Physics Course USNA Physics Fall 2006 126 Andes Physics Kurt VanLehn US Naval Academy Fall 2006
Stoichiometry Study PSLC Stoichiometry Study 1 2 CTAT-Flash Chemistry Bruce McLaren UBC, Hampton HS, NJ High school 2005+
The Effect of Generation and Interaction on Robust Learning Self Explanation - Electric Fields - USNA - Spring 2006 104 Andes Physics Kurt VanLehn/Bob Hausmann US Naval Academy Spring 2006


LIVE!

(on learnlab.org - ordered alphabetically by project)


Project Dataset Name DB ID Tool LearnLab P.I. School(s) Date Notes/Status Remarks

A Multimodal Interface for Solving Equations Handwriting Examples - Winter 2006 145 CL-2006 (Munger) Algebra Lisa Anthony
Winter 2006 07/06/07: Being QA'd by Tristan

09/22/07: Data loaded to production. 01/18/08: Reloaded on production with bug fixes.


A Multimodal Interface for Solving Equations Handwriting - Spring 2007 165 CL-2006 (Munger) Algebra Lisa Anthony
Spring 2007 07/06/07: Being QA'd by Tristan

09/24/07: Data loaded to the-cooker.
09/28/07: Updated on cooker to v2.3, waiting for new version of munger to load on production with v2.3 analysis_db
10/26/07: Bug fixes completed in Munger, waiting to receive new version.
11/5/07: New munger version had errors on run; awaiting word from Tristan on causes.
1/18/08: Loaded onto production.


Algebra Course Algebra I 2005
Algebra I 2005-2006 (Hampton only)
123,110 CL-2005 Algebra Albert Corbett CWCTC
Hampton
Wilkinsburg HSs
2005-2006 2/20: Received DataMunger.

2/27: Started running data munger for wilkinsburg-algebra i.
3/5: Munger done, have 1,378,891 txs instead of 1,187,841.
5/4: Munger fixed, Jon Steinhart running the conversion, Tristan to QA. Supposed to have data week of 5/7
Load of Wilkinsburg student data delayed due to problems copying the student files.
6/25: Hampton numbers on production not right. CWCTC numbers on the-cooker not right. Wilks data doesn't load. Requested a new complete CD from CL.
7/9: Modifying analysis database to support improved pruning, waiting for new CD with latest version of munger.
7/23: Loaded onto production with all 3 schools, one student from Wilkinsburg failed to load.


Chemistry Buffer Chemistry_Buffer_Study 63 OLI Chemistry Jodi Davenport CMU Spring 2006  

Chemistry Buffer CMU_sp07_BUFFERS 94 OLI Chemistry Jodi Davenport CMU unknown  

Chemistry Buffer Chemistry_Buffer_Study_2007 84 OLI Chemistry Jodi Davenport CMU, UBC? Spring 2007

Chinese Tone Study Chinese_tonestudy 1 unknown Chinese Ying Liu CMU Summer 2006

Chinese Tone Study Two Chinese_toneperception 64 unknown Chinese Ying Liu CMU 2005-2006

Contiguity-CWCTC Contiguity Difficulty Factors Analysis Winter 2006 153
Geometry Vincent Aleven
Winter 2006 Files only

Contiguity-CWCTC Contiguity CWCTC Winter 2006 80 CL Geometry Vincent Aleven CWCTC HS Winter 2006

Contiguity-CWCTC Contiguity CWCTC Fall 2006 102 CL + CTAT Geometry Kirsten Butcher, Vincent Aleven CWCTC Fall 2006 6/21: Reloaded.

07/18/07: v4.13 Loaded on test machine for reload verify
7/23: Received missing CTAT data from Frank. Kyle to work on 7/24.
7/26: Missing CTAT uploaded to production.
11/14 reloaded CL data, CTAT data needs reload


Contiguity-CWCTC Contiguity CWCTC Spring 2007 162 CL Geometry Vincent Aleven CWCTC Spring 2006 11/14: Loaded to production

Contiguity-CWCTC Contiguity CMU Winter 2007 113 CL Geometry Kirsten Butcher, Vincent Aleven CMU Winter 2007

6/21: Reloaded.
11/14/07: Reloaded with converter v4.13


Division Tutor division_study 98 CTAT-Flash Geometry? Stefan King Perrysville Elem Summer 2007

Does Treating Student Uncertainty as a Learning Impasse Improve Learning in Spoken Dialogue Tutoring? WOZ Uncertainty Adaptation 128 ITSPOKE Physics (intended) Kate Forbes-Riley; Diane Litman Lab experiment with Pitt students Winter-Spring 2006-7 7/31/07: Project & Dataset created.

Elementary Chinese Course ElemChineseFA06 75 OLI Chinese



Elementary Chinese Course ElemChineseSU07 96 OLI Chinese




Elementary Chinese Course ElemChineseSP07 83 OLI Chinese



Example Study - Freiburg and CMU Example Freiburg Spring 2006 77 CL Geometry Vincent Aleven CMU Spring 2006

Example Study - Freiburg and CMU Example CMU Summer 2006 88 CL-modified Geometry Vincent Aleven CMU Summer 2006

Example Study - Freiburg and CMU Example Freiburg Summer 2006 78 CL Geometry Alexander Renkl unknown Summer 2006

Example Study - Freiburg and CMU Example CWCTC Winter 2007 99 CL+CTAT Geometry Vincent Aleven CMU Summer 2006 6/19: Need to reload CL data. Need to reload CTAT data.

6/21: Reloading CTAT to QA machine. CL will follow.
6/21: Data reloaded to production.
6/25: Bad CTAT logs (not all students were anonymized correctly) removed from production.
7/18: CL data: v4.13 Loaded on test machine for reload verify by Octav.
7/24: CTAT data: ready to load
7/31: CTAT data reloaded.


Example Study - Freiburg and CMU Example Steel Valley Spring 2007 114 CL Geometry Vincent Aleven Steel Valley HS Spring 2006 6/19: Need to reload CL data.

6/21: Data reloaded to production.


Example Study - Freiburg and CMU Example CWCTC Spring 2007 125 CL-Lisp tutor (no CTAT) Geometry Vincent Aleven, Ron Salden CWCTC & Wilkinsburg Spring 2007 07/06/07: Will receive from CL week of 7/10/07

07/09/07: Octav received data
07/18/07: Loaded on cooker
07/21/07: QUESTION: is this 3 datasets or 1 with 3 schools?
7/29/07: Loaded to production.

11/14: reloaded onto production.


Example Study - Freiburg and CMU Example Wilkinsburg Spring 2007 124 CL-Lisp tutor (no CTAT) Geometry Vincent Aleven, Ron Salden CWCTC & Wilkinsburg Spring 2007 7/29/07: Loaded to production.
11/14: reloaded onto production

Example Study - Freiburg and CMU Example Freiburg Spring 2007 127 CL-Lisp tutor (no CTAT) Geometry Vincent Aleven, Ron Salden Gymnasium Spring 2007 07/10/07: Octav verifying conversion

7/29/07: Loaded to production.


Fostering fluency in second language learning: Testing two types of instruction Repetition in fluency training, Study 1 129 4/3/2 training English De Jong English Language Institute Fall 2006 7/31/07: Dataset created on production. No files yet.

Fostering fluency in second language learning: Testing two types of instruction Formulaic sequences in fluency training, Study 2 130 4/3/2 training English De Jong English Language Institute Spring 2007 7/31/07: Dataset created on production. No files yet.

Fostering fluency in second language learning: Testing two types of instruction Formulaic sequences in fluency training, Study 3 131 shadowing training English De Jong English Language Institute Spring 2007 7/31/07: Dataset created on production. No files yet.
21 French Course FrenchLanguag2 82 OLI French



22 French Course FrenchLanguage 74 OLI French



23 French Course FrenchLanguage2 81 OLI French




French Course pitteiffel 151 OLI French




French Course toureiffel 152 OLI French



28 Geometry Course Geometry Angles - North Hills Spring 2003 68 CL Geometry Vincent Aleven North Hills HS Spring 2003 Paper: Aleven & Koedinger, 2002 in Cognitive Science?? Or Popescu
29 Geometry Course Hampton Fall 2005 66 CL Geometry Vincent Aleven Hampton HS 2005-2006 Redoing this dataset with bug fixes in Octav's converter.
11/14: Reloaded onto production
38 Geometry Course Geometry-AllStudents 6 CL Geometry Ken Koedinger unknown unknown Bad Version
30 IERI: Learning Oriented Dialogue Project Learning Oriented Dialogue Project - original 62




2 Improving Algebra Learning and Collaboration PTS Algebra I 2005 112 CL-2005 Algebra Bruce McLaren, Erin Walker
2005-2006 raw data files only

Improving Skill at Solving Equations via Better Encoding of Algebraic Concepts Corrective Self Explanation 147, 149 CL-2006 (Munger) + CTAT Algebra Julie Booth Golden Valley High School Feb, 2007 07/06/07: Being QA'd by Tristan

8/13/07: CTAT files received from Frank. Need to insert course_name, check for errors.
8/15/07: CTAT logs loaded to production. Some dataset info updated as well.
9/21/07: Munger failure due to CTAT logs already loaded. Need to clean out CTAT logs and try again.
9/22/07: Data loaded to production.
9/25/07: CTAT logs loaded to production (were removed for munger reload)
1/18/2008: Reloaded to production.

24 Improving cultural learning by predicting in French film FrenchOnline 32 CTAT? French Amy Ogan CMU Fall 2005
25 Improving cultural learning by predicting in French film FrenchTutor_Demo 16 CTAT? French Amy Ogan CMU Spring 2005
26 Improving cultural learning by predicting in French film French_Culture_Tutor 9 CTAT? French Amy Ogan CMU Spring 2005
27 Improving cultural learning by predicting in French film French_Culture_Tutor_Fall_2005 8 CTAT? French Amy Ogan CMU Fall 2005

Improving cultural learning by predicting in French film French Discussion Board 154 CTAT French Amy Ogan CMU Spring 2007 8/2/07: Received log files from Jonathan Sewall. Will load to QA.

8/07/07: Files loaded to QA. Some still need to be fixed (they contain invalid XML).
8/16/07: All CTAT logs free of errors. All have been loaded to QA - waiting for researcher verification.
8/24/07: Need to get logs from Discussion Board from Erin Walker. Let's ping them again late Sept.
9/25/07: Files are on QA - waiting for go-ahead to move to prod from the researchers.


Improving cultural learning by predicting in French film French Discussion Board ITS 155 CTAT French Amy Ogan CMU Spring 2007

Improving cultural learning by predicting in French film French Verb Tutor 156 CTAT French Amy Ogan CMU Spring 2007
31 Intelligent Writing Tutor iwt_course 86
english Ruth Wylie


32 Intelligent Writing Tutor iwt retention course 89
english Ruth Wylie

10/9/2007: this dataset is garbage.
32 Intelligent Writing Tutor eli_study 148
english Ruth Wylie Fall 2007
10/9/2007: should be renamed to IWT Article Tutor Level 5 Study Fall 2007 when data collection is complete

Knowledge Tracking Chinese Vocabulary Spring 2006 107 Phil's own Chinese Phil Pavlik CMU Spring 2006

Knowledge Tracking Chinese Vocabulary Spring 2007 161 Phil's own Chinese Phil Pavlik CMU Spring 2007 7/06/07: Phil working on conversion. ETA 4-6 weeks.

9/26/07: Loaded to the-cooker
10/9/07: Problem with the number of observations in Learning Curves. DS team looking into it.
11/12/2007: Data reloaded to bunny.
Loaded to production on 11/13/07


Knowledge Tracking Spanish Vocabulary Spring 2006 108 Phil's own Spanish Phil Pavlik
Winter 2007

Knowledge Tracking Chinese Vocabulary Transfer Lab Study Spring 2006 115 Phil's own Chinese Phil Pavlik
Spring 2006 6/22: Loaded to production
33 MacWhinney Dictation Studies Chinese Dictation Fall 2005 73 unknown Chinese Brian MacWhinney unknown Fall 2005
34 MacWhinney Dictation Studies French Dictation Fall 2005 71 unknown French Brian MacWhinney unknown Fall 2005
35 MacWhinney Dictation Studies Spanish Dictation Fall 2005 72 unknown Other Brian MacWhinney unknown Fall 2005
36 OLI Statistics 07Meyer201 103 OLI Other
CMU 2007 Loaded on Jun-08-2007

Physics Course Physics - USNA - Spring 2007 157 Andes Physics Kurt VanLehn US Naval Academy Spring 2006 Waiting to receive files from Anders.

8/14/07: files on Cooker, waiting for verification.
09/28/07: Loaded on the cooker under v2.3, waiting for verification
10/26/07: Will reload to production once new release is live.
11/5/07: Loaded to production


Physics Joint Explanation - Electric Fields - Pitt - Spring 2007 163 Andes Physics Bob Hausmann US Naval Academy Spring 2007 10/26/07: Bob to send files to Anders for conversion.

10/30/07: Anders sent files to DataShop team.
11/2/07: Loaded to the cooker, waiting for verify.
Renamed from the previous name, "The Effects of Interaction on Robust Learning - Spring 2007".

11/14/07: Loaded onto production

39 Public Pre_Summer_School_01_Jun_05 10 CTAT-Flash Chemistry unknown unknown June 2005 Used to show DataShop features, example data only.

Robust Learning of Vocabulary REAP ELI Reading 4 Summer 2006 118 REAP English Maxine Eskenazi Pitt-ELI Summer 2006 3/29/2007 - Sent message to Michael, Maxine to determine status

4/4/07 - Waiting for Michael's studies to finish, he will then convert to DS format.
6/12/07 - Waiting for Michael to add <conditions> to the datasets
6/25/07 - Loading datasets to the-cooker
6/26/07 - Ready to load to production
6/27/07 - Loaded to production


Robust Learning of Vocabulary REAP ELI Reading 4 Spring 2006 117 REAP English Maxine Eskenazi Pitt-ELI Spring 2006 3/29/2007 - Sent message to Michael, Maxine to determine status

4/4/07 - Waiting for Michael's studies to finish, he will then convert to DS format.
6/12/07 - Waiting for Michael to add <conditions> to the datasets
6/25/07 - Loading datasets to the-cooker
6/26/07 - Reloading to the-cooker
6/27/07 - Ready to load to production
6/27/07 - Loaded to production


Robust learning with a Meta-Cognitive Tutor Help Tutor CWCTC Spring 2006 (cognitive) 134 CTAT+CL Geometry Ido Roll CWCTC HS 2005-2006

7/18/07: Files received from Octav, modifying DataShop code to improve import speed.
7/26/07: Loaded to the-cooker.
8/13/07: Loaded onto production!


Robust learning with a Meta-Cognitive Tutor Help Tutor CWCTC Spring 2006 (meta) 135 CTAT+CL Geometry Ido Roll CWCTC HS 2005-2006

7/18/07: Files received from Octav, modifying DataShop code to improve import speed.
7/26/07: Loaded to the-cooker.
8/13/07: Loaded onto production!


Robust learning with a Meta-Cognitive Tutor Short Hints Wilkinsburg Spring 2006 (cognitive) 137 CTAT+CL Geometry Ido Roll Wilkinsburg HS 2005-2006

7/18/07: Files received from Octav, modifying DataShop code to improve import speed.
7/26/07: Loaded to the-cooker.
8/13/07: Loaded onto production!


Robust learning with a Meta-Cognitive Tutor Short Hints Wilkinsburg Spring 2006 (meta) 136 CTAT+CL Geometry Ido Roll Wilkinsburg HS 2005-2006

7/18/07: Files received from Octav, modifying DataShop code to improve import speed.
7/26/07: Loaded to the-cooker.
8/13/07: Loaded onto production!


Robust learning with a Meta-Cognitive Tutor
Help Tutor - Angles and Quads - 2006
159
CTAT only
Geometry
Ido Roll

2005-2006
11/12/07: CTAT logs loaded to production. 
44 Stoichiometry Study SummerSchool2005 11
Chemistry Bruce McLaren


45 Stoichiometry Study Winter_Workshop01 59
Chemistry Bruce McLaren


46 Stoichiometry Study PSLC Stoichiometry Study Demo 3
Chemistry Bruce McLaren


47 Thermodynamics Thermo Fall 2005 61 unknown Other Vincent Aleven unknown Fall 2005

Training oral production in learning second language grammar The order of French pronouns 132 Flash online audio recording French De Jong Regular French courses at Pitt and CMU Spring and Summer 2007 7/31/07: Dataset created on production. No files yet.

Training oral production in learning second language grammar The use of French conditionals 133 Flash online audio recording French De Jong Regular French courses at Pitt and CMU Spring and Summer 2007 7/31/07: Dataset created on production. No files yet.
48 Unclassified TWS_Group_01 65





49 WPI-Assistments Assistments - 8th Grade Math - 2004-2005 (200 students) 90 Assistments Other Neil Heffernan Mass Public 2004-2005 6/20: Need to reload.
6/25: Reloaded.
50 WPI-Assistments Assistments - 8th Grade Math - 2004-2005 (762 students) 92 Assistments Other Neil Heffernan Mass Public 2004-2005 6/20: Need to reload.
6/25: Ready to load (on production)
6/26: Reloaded.
LFA - OutOfMemory on all 4 KC models.

WPI-Assistments Assistments - 8th Grade Math - 2005-2006 (200 students) 119 Assistments Other Neil Heffernan Mass Public 2005-2006 Loaded on 7/03/07
10 WPI-Assistments Assistments - 8th Grade Math - 2005-2006 (1582 students) 120 Assistments Other Neil Heffernan Mass Public 2005-2006 Loaded on 7/03/07

READY TO GO LIVE

(datasets that are ready to be loaded to learnlab.web, highest priority on top)


Project Dataset Name Tool LearnLab P.I. School(s) Date Notes/Status Remarks
None at the moment.

IN TESTING

(datasets that have been loaded and are to be reviewed for correctness, highest priority on top)


Project Dataset Name Tool LearnLab P.I. School(s) Date Notes/Status Remarks

Algebra I Course Algebra I 2006-2007 CL Algebra - Mungable Algebra Albert Corbett CWCTC, Hampton, Wilkinsburg, 3 LA schools 2006-2007 school year Waiting for harvest from CL. Also need a new version of munger to load.
Loaded to the-cooker, had multiple errors on load.

Robust Learning of Vocabulary Fall 2006 Personalization Study REAP Tutor English Juffs English Language Institute Fall 2006 No ETA from Michael yet.

9/25/07: Sent follow-up email to Michael
10/03/07: Waiting for files from REAP Team.
12/10/07: Loaded to the-cooker for verification.


Robust Learning of Vocabulary Spring 2007 Reading 4 REAP Tutor English Juffs English Language Institute Spring 2007 No ETA from Michael yet.

9/25/07: Sent follow-up email to Michael
10/03/07: Waiting for files from REAP Team.
12/10/07: Loaded to the-cooker for verification.


Robust Learning of Vocabulary Summer 2007 Word Sense and Pronunciation Audio Study REAP Tutor English Juffs English Language Institute Summer 2007 No ETA from Michael yet.

9/25/07: Sent follow-up email to Michael
10/03/07: Waiting for files from REAP Team.
12/10/07: Loaded to the-cooker for verification.

IN PROGRESS

(datasets that require additional conversion work or tweaking, highest priority on top)


Project Dataset Name Tool LearnLab P.I. School(s) Date Notes/Status Remarks

Geometry Course Geometry 2005-2006 CL-2005 (lisp) Geometry Vincent Aleven CWCTC, Hampton, Wilkinsburg HSs 2005-2006 Octav needs to convert this data into XML.

UPCOMING

(datasets we expect to receive soon - may or may not require additional processing on our part)


Project Dataset Name Tool LearnLab P.I. School(s) Date Notes/Status Remarks

Algebra Bridge to Algebra 2005/2006 CL-Munger Algebra


6/6/07: Data is good to munge.

Request sent to CL to break the dataset into smaller pieces
07/09/2007: Confirmed that the munger can load sections only. Steve Ritter wishes to speak with Ken/Kurt to cover legal issues with this dataset.

7 Stoichiometry VLAB data from Stoich studies
Chemistry Bruce McLaren

Oliver Scheuer to oversee conversion in Germany.

6/20/07: Mike Karabinos to work up examples for each type of VLAB Action, DataShop to fill in dummy and real <tutor_message>. Will then send this info to Oliver.
7/27/07: Kyle completed transformation to XML. Sent out for review.
8/14/07: Working on storage of replay data - waiting for response from CTAT and Germany
9/25/07: Jonathan Sewall wants to see what logs look like in DataShop.
10/8/07: Alida loaded sample Vlab dataset to QA (Dataset: Chemistry VLAB Example Data Fall 2007).

8 ESL ESL data




7/27/07: Moving the Online Search System to PSLC servers in the near future. Will tie in to DataShop with single-sign on. No current plan to move any ESL data into DataShop itself.

Improving Algebra Learning and Collaboration PTS Algebra I 2005 CL Algebra Tutor Algebra Erin Walker  ? 2005-2006 (not sure) Researcher has raw logs and has written a converter for analysis. Need to find out if it's in DataShop format.

Improving Algebra Learning and Collaboration PTS Algebra I Spring 2007 CL Algebra Tutor Algebra Erin Walker CWCTC Spring 2007 Researcher has raw logs and is responsible for submitting them to DataShop for import.

7/06/07: Frank sent anonymized raw logs to Erin.


Geometry Course Geometry 2006-2007 CL Geometry Geometry Vincent Aleven CWCTC, Hampton, Wilkinsburg 2006-2007 school year Waiting for harvest from CL. Will need a conversion from Octav.










Physics Physics - USNA - Spring 2006 Andes Physics Kurt VanLehn US Naval Academy Spring 2006 10/26/07: Check with Anders to see when DataShop can get this data.

Fluency Study Fluency Study 4 - Winter 2008 CTAT ESL Nel de Jong CMU Winter 2008

Octav to regenerate all of his datasets?