Difference between revisions of "Flat File Importer"
Latest revision as of 12:14, 28 September 2011
Status: Done (DataShop v5.0 May 2011)
User Story
As a DataShop administrator, I want to redesign the Dataset Import Tool to load a dataset into the database by processing it column by column, so that the import process is faster.
Summary from Release Notes
This release of DataShop concludes work improving the tool used to import tab-delimited text files into DataShop. With these improvements, loading large tab-delimited text files of transaction data is now possible. It's fast, too.
As part of this release, we have used the new import tool to load 6 datasets that we had been unable to load. These datasets range in size from 122,000 to 870,000 transactions.
Notes/Comments
- The current Dataset Import Tool processes the dataset row by row and goes through the Hibernate layer, so importing a dataset takes a long time.
- The import has sometimes failed for several large datasets, and the tool has some bugs as well.
- We'd like to rewrite the import tool to process the dataset column by column and bypass the Hibernate layer, which should make the import faster.
- The goal is to process 1 million rows per minute on import only. Shanwen 10:09, 12 October 2010 (EDT)
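The column-by-column idea in the notes above can be illustrated with a minimal sketch. It is not the DataShop implementation: the function names (`read_columns`, `distinct_ids`, `import_by_column`), the sample column names, and the choice to map each distinct value to an integer id are all assumptions made for illustration. The point it shows is that once a tab-delimited file is held as columns, each column's distinct values can be resolved once, rather than once per row.

```python
import csv
import io


def read_columns(tsv_text):
    """Parse tab-delimited text into a dict of column name -> list of values.

    Assumes every row has as many fields as the header; extra fields are
    ignored and short rows leave the trailing columns shorter.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    header = next(reader)
    columns = {name: [] for name in header}
    for row in reader:
        for name, value in zip(header, row):
            columns[name].append(value)
    return columns


def distinct_ids(values):
    """Assign one integer id per distinct value, in first-seen order."""
    ids = {}
    for value in values:
        if value not in ids:
            ids[value] = len(ids) + 1
    return ids


def import_by_column(tsv_text):
    """Resolve each column's distinct values once, then encode the rows.

    In a real importer the lookup step might become one bulk insert per
    column (hypothetical here); this sketch only builds in-memory maps.
    """
    columns = read_columns(tsv_text)
    lookup = {name: distinct_ids(values) for name, values in columns.items()}
    encoded = {
        name: [lookup[name][value] for value in values]
        for name, values in columns.items()
    }
    return lookup, encoded


# Hypothetical three-transaction sample; the column names are illustrative.
SAMPLE = (
    "Student\tProblem\tOutcome\n"
    "s1\tp1\tCORRECT\n"
    "s1\tp2\tINCORRECT\n"
    "s2\tp1\tCORRECT\n"
)

lookup, encoded = import_by_column(SAMPLE)
print(lookup["Student"])
print(encoded["Outcome"])
```

In this sketch the work per column scales with the number of distinct values rather than the number of rows, which is one plausible reason a column-oriented import could outperform a row-by-row import through an object-relational layer.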
See completed features
See on-going features
See unordered Collected User Requests
See the DataShop Glossary