Difference between revisions of "Flat File Importer"

From LearnLab

Revision as of 15:53, 13 January 2011

Status: Requirements and Estimate done, 12 weeks, 2011

User Story

As a DataShop administrator, I want the Dataset Import Tool redesigned to load a dataset into the database by processing it column by column, so that the import runs faster.

Notes/Comments

  • The current Dataset Import Tool processes a dataset row by row and goes through the Hibernate layer, so importing a dataset takes a long time.
  • The import has sometimes failed for several large datasets, and the tool has some bugs as well.
  • We'd like to rewrite the import tool so that it processes data column by column and bypasses the Hibernate layer, making the import faster.
  • The goal is to process 1 million rows per minute for the import step alone. Shanwen 10:09, 12 October 2010 (EDT)
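
The column-by-column idea in the notes above could be sketched roughly as follows. This is only an illustration, not the DataShop implementation: the class name ColumnWiseSketch, its methods, and the sample column names are hypothetical, and the sketch assumes a tab-delimited file with a single header row and unique column names. It splits the rows into one value list per column, reduces a column to its distinct values (the kind of set a rewritten importer might load in bulk instead of row by row), and works out how long a file would take at the 1 million rows per minute target.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of column-wise processing; not DataShop code. */
public class ColumnWiseSketch {

    /**
     * Splits tab-delimited lines into one value list per column.
     * Assumes the first line is the header and header names are unique.
     * Short rows are padded with empty strings.
     */
    public static Map<String, List<String>> toColumns(List<String> lines) {
        Map<String, List<String>> columns = new LinkedHashMap<>();
        if (lines.isEmpty()) {
            return columns;
        }
        String[] headers = lines.get(0).split("\t", -1);
        for (String header : headers) {
            columns.put(header, new ArrayList<>());
        }
        for (String line : lines.subList(1, lines.size())) {
            String[] cells = line.split("\t", -1);
            for (int i = 0; i < headers.length; i++) {
                columns.get(headers[i]).add(i < cells.length ? cells[i] : "");
            }
        }
        return columns;
    }

    /** Distinct values of one column, in first-seen order. */
    public static Set<String> distinctValues(List<String> column) {
        return new LinkedHashSet<>(column);
    }

    /** Seconds needed for the given row count at a rows-per-minute rate. */
    public static double secondsForRows(long rows, long rowsPerMinute) {
        return rows * 60.0 / rowsPerMinute;
    }

    public static void main(String[] args) {
        // Illustrative sample data; the column names are assumptions.
        List<String> lines = Arrays.asList(
            "Student\tProblem\tOutcome",
            "s1\tp1\tCORRECT",
            "s1\tp2\tINCORRECT",
            "s2\tp1\tCORRECT");

        Map<String, List<String>> columns = toColumns(lines);
        for (Map.Entry<String, List<String>> entry : columns.entrySet()) {
            System.out.println(entry.getKey() + ": " + distinctValues(entry.getValue()));
        }
        System.out.println("Seconds for 5,000,000 rows at the target rate: "
            + secondsForRows(5_000_000L, 1_000_000L));
    }
}
```

In a real importer the distinct values of each column would presumably be written with plain JDBC batch statements rather than through Hibernate entities, but that part depends on the DataShop schema and is left out here.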



See completed DataShop 3.x Features
See on-going DataShop 4.x Features
See prioritized DataShop Feature Wish List
See unordered Collected User Requests