Data Download module and Data Repository tools: Difference between pages

From MIPAV
(Difference between pages)
m (1 revision imported)
 
MIPAV>Olga Vovk
 
== Introduction==
In order to help researchers upload and download data to and from the data repository, BRICS provides the Data Repository tool that includes the following modules (or sub-tools):
[[Data Repository tools|The data repository]] users can download selected datasets from the repository to their local machines. The Data Download tool (available via the Data Download module) assists users in this task. The tool runs locally as a Java Web Start application on a user's computer (requires the Java runtime environment).
# [[#ImagingSubmission|Imaging data submission and validation]] module should be used for imaging data to create the image submission package.
# [[#DataValidation| Data Validation]] module verifies that data conforms to the required format and range values defined in the data dictionary. It also creates a data submission package ([http://en.wikipedia.org/wiki/XML XML]) and submission ticket ([http://en.wikipedia.org/wiki/XML XML]) that can be uploaded to the data repository via the [[#DataUpload| Data Upload]] module.
# [[#DataUpload| Data Upload]] module assists researchers in uploading their data to the data repository (in the form of a submission package and submission ticket).
# [[#DataDownload| Data Download]] module assists researchers in downloading data from the data repository.


See also: [[Data Repository tools| Data Repository tools]]
== Steps in data preparation, validation, submission, and download ==
== System requirements ==
'''For [[Image submission plug-in| imaging data]]:'''
The most recent version of [http://java.com/en/download/index.jsp Java Runtime Environment (JRE)] (6 or 7) is required in order to run the Imaging Data Submission and Validation module.
# Create and pre-validate the image submission package using the [[#ImagingSubmission|Imaging data submission and validation]] module.
# Validate the image submission package using the [[#DataValidation| Data Validation]] module and create the data submission package/ticket using the same module.
# Use the data submission ticket from the previous step to upload the data submission package with the help of the [[#DataUpload| Data Upload]] module.
# Download the data using the means provided by the  [[#DataDownload| Data Download]] module.
 
'''For non-imaging data:'''
# Create and validate the data submission package using the [[#DataValidation| Data Validation]] module and create the submission ticket using the same module (the ticket is created automatically as soon as the data pass validation).
# Use the data submission ticket from the previous step to upload the data submission package with the help of the [[#DataUpload| Data Upload]] module.
# Download the data using the means provided by the  [[#DataDownload| Data Download]] module.
 
== Data for upload ==
<div id="ImagingData"><div>
=== Imaging data ===
Imaging data can be uploaded in the form of a brain image file and a corresponding [http://en.wikipedia.org/wiki/Comma-separated_values CSV] file that contains some additional patient/subject/visit information (not stored in the image header) as well as image related metadata. See also [[Image submission plug-in#Required image information|Required image information]].
 
The following information is required for all [[Image submission plug-in|image submissions]] and must be included in a CSV file:
# The patient/subject information including  - GUID, a patient/subject age, a study site name, and a visit date;
# The image information including - imaging study date and time, imaging file name, imaging file itself, imaging file format and modality, image QA/QC information.
 
The brain image file can be:
* a single file in one of [[Supported Formats| supported formats]],
* a [[Other formats supported by MIPAV#DicomFormat|DICOM multifile]] or some other multifile in one of the supported formats,
* a ZIP archive that contains an image dataset (e.g. [[Other formats supported by MIPAV#DicomFormat|DICOM dataset]] with images of multiple slices stored in separate image files - [[Image submission plug-in#Multifiles|a multifile]]).
 
For more information refer to: [[Image submission plug-in|Imaging data submission and validation module]].
 
=== Clinical  data ===
In order to upload clinical data to the data repository, the data should be submitted to the [[#DataValidation|Data Validation module]] as [[Data Validation module#CSVFiles|CSV]] files (in tab-delimited format). 
 
The [[#DataValidation|Data Validation module]] accepts [[Data Validation module#CSVFiles|CSV]] files from a researcher and validates the files' content against the values defined in the data dictionary. For  those data that pass validation, the [[#DataValidation| Data Validation module]] creates a submission package and submission ticket both in XML format.
 
The submission ticket is used by the [[#DataUpload| Data Upload module]] to upload the data (in the form of a corresponding submission package) to the data repository.
 
For more information about the structure of CSV files, data submission and validation, refer to [[Data Validation module]].
 
=== Genomics data ===
 
TBD.
 
<div id="ImagingSubmission"><div>
== Imaging data submission and validation module ==
The Imaging Data Submission and Validation module (also known as MIPAV Image Submission and Validation tool or MIPAV [[Image submission plug-in]]) is designed to help researchers validate and submit their data. Data validation is a necessary step that must be done prior to data submission in order to ensure the quality of uploaded data and to make data queryable. The module runs as a Java Web Start application locally on a user's computer (Java runtime environment is required).
 
'''Module input:'''
# a brain image(s) in one of the supported formats,
# a corresponding CSV file with metadata, see [[#ImagingData| Imaging data]].
 
'''Module output:'''
# the brain image(s) in one of the supported formats,
# the CSV file confirmed against the data dictionary and ready for validation by the [[#DataValidation|Data Validation module]],
# [[Image submission plug-in#Output log|the Output log]] that lists all brain image files and CSV files added to the image submission package. It also displays the path(s) to the directory where the image package(s) is stored. 
 
Read [[Image submission plug-in| more]]...
 
<div id="DataValidation"><div>
 
== Data Validation module ==
The [[Data Validation module]] assists researchers with the submission of both imaging and non-imaging data into the repository. The Data Validation module verifies that data conforms to the required format and range values defined in the data dictionary (note that for the [[Image submission plug-in|imaging data]] this is a second validation, needed to ensure the quality of data). The [[Data Validation module]] validates the metadata associated with the data files identified by the user for submission against the data dictionary. If the data pass validation, the Data Validation module creates a submission ticket and submission package, and the data are ready for upload. If errors are found, the module provides a detailed report of any data discrepancies, errors, and warnings. The module runs as a Java Web Start application locally on a user's computer (Java runtime environment is required).


== Module input and output ==
'''Module input:'''
'''Module input:'''
# Data submitted to the repository by the [[#DataUpload|Data Upload module]].
# CSV file with clinical data or [[#ImagingData|imaging metadata]].


'''Module output:'''
'''Module output:'''
# Data downloaded to your computer- TBD.
# a submission package and submission ticket (XML) ready for submission by the [[#DataUpload| Data Upload module]].
# [[Data Validation module#Error log|an error log]] with validation errors and warnings (if any).
 
Read [[Data Validation module|more]] ...


== Running the Data Download tool ==
<div id="DataUpload"><div>
[[File:DataDownloadDownloadTool.png|200px|thumb|left|The Data Repository tools]]
== Data Upload module ==
[[File:DataDownloadDownloadTool1.png|400px|thumb|left|A list of datasets from a selected study]]
After the submission package has been created using either the [[Image submission plug-in|Imaging Data Submission and Validation module]] (for imaging data) or by the [[Data Validation module]] (for non-imaging data), data can be submitted to the data repository. The Data Upload module facilitates this process. The module runs as a Java Web Start application locally on a user's computer (Java runtime environment is required).
* First, [[#StepsDownloadQueue|create and populate a download queue]] with datasets for download.
* Then, [[#StepsRunDownloadTool|run the Data Download tool]] to download the datasets from [[#DownloadQueue|the download queue]].


<div id="StepsDownloadQueue"><div>
'''Module input:'''
=== Steps to populate a download queue ===
# a submission package and submission ticket (XML) from the [[#DataValidation|Data Validation module]].
# Navigate to Data Repository > Manage Studies.
# Click View Studies.
# In the list of studies that appears, find the study from which you wish to download a dataset(s). If the study does not appear in the list, make sure you have permissions to view that study and download data from it. Contact the Operations Team (TBD) to find out your permissions.
# Make sure that the data type icons next to the study name are highlighted in [[#DataTypes|color]]. Highlighted icons mean that [[Data Upload module#DataTypes|the study has datasets uploaded]] of the corresponding types.
# Click on the study name/title.
# The study page appears.
# Navigate to the bottom of the page and click on  the "+" sign to expand the Data Set Submissions section.
# The list of datasets submitted to the study appears. '''Note:''' in order to view and download datasets from a selected study you must have permissions to do that. Contact the Operations Team (TBD) in order to know your permissions.
# Click on the dataset you wish to download.
# A pop-up window opens with information regarding the dataset. Note the big green Add to Download Queue button in the upper right corner.
# Click the Add to Download Queue button to add the dataset to the download queue. The Add to Download Queue button changes to the "Added to download queue" status message and the selected dataset appears in [[#DownloadQueue|the download queue table]].
# To download the dataset(s) from the download queue to your computer, [[#StepsRunDownloadTool|run the Data Download tool]].


<div id="DownloadQueue"><div>
'''Module output:'''
==== Where to see the download queue ====
# Data submitted to the data repository.
* To view the download queue, navigate to Data Repository > Download Tool and scroll down the page. The list of datasets to download appears under the Download Queue section of the page.


'''Note:'''  the download queue table does not show download entries created using the query tool.  
Read [[Data Upload module| more]]...
==== What you can do with the entries in the download queue ====
* To delete a single dataset from the download queue, select it (using a checkbox) and press Delete Selected.
* To delete multiple datasets from the download queue, select them using the checkboxes (or select all) and press Delete Selected.
* To sort the download queue table by the dataset name, use the small triangle next to the Dataset column name.  


<div id="StepsRunDownloadTool"><div>
<div id="DataDownload"><div>
=== Steps to run the Data Download tool ===
== Data Download module ==
[[File:DownloadManager.png|400px|thumb|left|The Download Manager window]]
The Data Download module helps users select and download datasets from the data repository to their own systems. The module runs as a Java Web Start application locally on a user's computer (Java runtime environment is required).
The Data Download module runs locally on your machine. In order to launch the module, navigate to the Data Repository > Download Data page and click the Launch the Download Tool link.
'''Note:''' the most recent version of [http://java.com/en/download/index.jsp Java Runtime Environment (JRE)] (6 or 7) is required in order to run the module. Make sure your computer has it installed.
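
One way to confirm the Java requirement before launching the module is to look at the text printed by the <code>java -version</code> command. The Python sketch below is only an illustration and is not part of BRICS or MIPAV. The function name is hypothetical, and the sketch assumes the usual <code>java version "1.7.0_25"</code> style of output.

```python
import re


def java_major_version(version_output):
    """Extract the major Java version from the text printed by `java -version`.

    Returns None when no version string is found.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first, second = match.group(1), match.group(2)
    # Releases up to Java 8 report themselves as 1.x (1.7.0_25 is Java 7).
    if first == "1" and second is not None:
        return int(second)
    return int(first)
```

To use it, run <code>java -version</code> in a terminal, copy the first line of the output, and pass it to the function as a string. A result of 6 or 7 matches the requirement stated in the note above.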


* Click Launch the Download Tool. In the Opening downloadTool.jnlp window that appears, select Open with Java(TM) Web Start Launcher (default) and click OK. In the Java Runtime Environment window that appears next saying "Do you want to run this application?", click Run.
'''Module input:'''
* The EULE Agreement window appears displaying the data privacy user agreement. Read the agreement and click Accept if you agree.
# Data submitted to the repository by the [[#DataUpload|Data Upload module]].
* The Download Manager window appears showing the list of datasets you selected for download.
'''In the Download Manager window:'''
# Use the Browse button to select a directory where to download datasets.
# Navigate to the location on your computer of the working directory where you want the file to be downloaded. Select the folder, and then click Open.
# Select the files you want to download by clicking the check box next to the file name. Note that the status for  selected datasets changes to "Ready".
# Click Start Download. The screen will update as file(s) are being downloaded. The download progress will appear in the Progress column.
# For successful downloads, the status will be designated as Completed.
# Use the Refresh button to refresh the list.
# Navigate to the working directory and open downloaded files.


* To stop download(s), use the Stop Download(s) button.
'''Module output:'''
* To delete download(s), use the Delete Download(s) button.
# Data downloaded to your computer- TBD.
* To clear completed download(s), use the Clear Completed Downloads button.


== See also ==
Read [[Data Download module|more]]...
*[[Data Repository tools| Data Repository tools]]
*[[Image submission plug-in|Imaging data submission and validation module]]
*[[Data Upload module]]
*[[Data Validation module]]


[[Category:Help:Stub]]
[[Category:Help]]
[[Category:BRICS]]

Revision as of 17:27, 26 July 2013

In order to help researchers upload and download data to and from the data repository, BRICS provides the Data Repository tool that includes the following modules (or sub-tools):

  1. Imaging data submission and validation module should be used for imaging data to create the image submission package.
  2. Data Validation module verifies that data conforms to the required format and range values defined in the data dictionary. It also creates a data submission package (XML) and submission ticket (XML) that can be uploaded to the data repository via the Data Upload module.
  3. Data Upload module assists researchers in uploading their data to the data repository (in the form of a submission package and submission ticket).
  4. Data Download module assists researchers in downloading data from the data repository.

Steps in data preparation, validation, submission, and download

For imaging data:

  1. Create and pre-validate the image submission package using the Imaging data submission and validation module.
  2. Validate the image submission package using the Data Validation module and create the data submission package/ticket using the same module.
  3. Use the data submission ticket from the previous step to upload the data submission package with the help of the Data Upload module.
  4. Download the data using the means provided by the Data Download module.

For non-imaging data:

  1. Create and validate the data submission package using the Data Validation module and create the submission ticket using the same module (the ticket is created automatically as soon as the data pass validation).
  2. Use the data submission ticket from the previous step to upload the data submission package with the help of the Data Upload module.
  3. Download the data using the means provided by the Data Download module.

Data for upload

Imaging data

Imaging data can be uploaded in the form of a brain image file and a corresponding CSV file that contains some additional patient/subject/visit information (not stored in the image header) as well as image related metadata. See also Required image information.

The following information is required for all image submissions and must be included in a CSV file:

  1. The patient/subject information including - GUID, a patient/subject age, a study site name, and a visit date;
  2. The image information including - imaging study date and time, imaging file name, imaging file itself, imaging file format and modality, image QA/QC information.
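
As a rough illustration of the requirement above, the following Python sketch checks a metadata CSV file for the required items before it is passed to the module. It is not the module's own check. The column names are hypothetical placeholders (the real names are defined in the data dictionary), and the sketch assumes a comma-separated file with a single header row.

```python
import csv
import io

# Hypothetical column names; the real ones are defined in the data dictionary.
REQUIRED_COLUMNS = [
    "GUID", "AgeYrs", "SiteName", "VisitDate",
    "ImgStudyDateTime", "ImgFile", "ImgFileFormat", "ImgModality", "ImgQAQC",
]


def missing_required_columns(csv_text, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


def rows_with_empty_required(csv_text, required=REQUIRED_COLUMNS):
    """Return 1-based data row numbers that leave a required column empty."""
    reader = csv.DictReader(io.StringIO(csv_text))
    bad_rows = []
    for number, row in enumerate(reader, start=1):
        # A missing or blank cell counts as empty.
        if any(not (row.get(name) or "").strip() for name in required):
            bad_rows.append(number)
    return bad_rows
```

Both functions take the file content as a string, so a file can be checked with <code>missing_required_columns(open(path).read())</code>. An empty list from each function means every required item is present in every row.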

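The brain image file can be:

  * a single file in one of the supported formats,
  * a DICOM multifile or some other multifile in one of the supported formats,
  * a ZIP archive that contains an image dataset (e.g. DICOM dataset with images of multiple slices stored in separate image files - a multifile).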

For more information refer to: Imaging data submission and validation module.

Clinical data

In order to upload clinical data to the data repository, the data should be submitted to the Data Validation module as CSV files (in tab-delimited format).

The Data Validation module accepts CSV files from a researcher and validates the files' content against the values defined in the data dictionary. For those data that pass validation, the Data Validation module creates a submission package and submission ticket both in XML format.
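
The kind of check described above can be sketched in a few lines of Python. This is only a simplified illustration and not the Data Validation module itself, which is a Java Web Start application. The dictionary structure, the element names, and the ranges are hypothetical, and the sketch assumes a tab-delimited file with one header row and numeric values.

```python
import csv
import io

# Hypothetical data dictionary: element name -> (minimum, maximum) allowed value.
DATA_DICTIONARY = {
    "AgeYrs": (0, 120),
    "GCSTotal": (3, 15),
}


def validate_rows(tsv_text, dictionary=DATA_DICTIONARY):
    """Return (row_number, column, message) for each value that fails a check."""
    errors = []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for number, row in enumerate(reader, start=1):
        for column, (low, high) in dictionary.items():
            raw = (row.get(column) or "").strip()
            try:
                value = float(raw)
            except ValueError:
                # Empty or non-numeric cells are reported, not skipped.
                errors.append((number, column, "not a number: %r" % raw))
                continue
            if not low <= value <= high:
                errors.append((number, column, "out of range: %r" % raw))
    return errors
```

An empty result corresponds to data that pass validation. A non-empty result plays the role of the error log, listing each row and column that needs to be corrected before a submission package can be created.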

The submission ticket is used by the Data Upload module to upload the data (in the form of a corresponding submission package) to the data repository.

For more information about the structure of CSV files, data submission and validation, refer to Data Validation module.

Genomics data

TBD.

Imaging data submission and validation module

The Imaging Data Submission and Validation module (also known as MIPAV Image Submission and Validation tool or MIPAV Image submission plug-in) is designed to help researchers validate and submit their data. Data validation is a necessary step that must be done prior to data submission in order to ensure the quality of uploaded data and to make data queryable. The module runs as a Java Web Start application locally on a user's computer (Java runtime environment is required).

Module input:

  1. a brain image(s) in one of the supported formats,
  2. a corresponding CSV file with metadata, see Imaging data.

Module output:

  1. the brain image(s) in one of the supported formats,
  2. the CSV file confirmed against the data dictionary and ready for validation by the Data Validation module,
  3. the Output log that lists all brain image files and CSV files added to the image submission package. It also displays the path(s) to the directory where the image package(s) is stored.

Read more...

Data Validation module

The Data Validation module assists researchers with the submission of both imaging and non-imaging data into the repository. The Data Validation module verifies that data conforms to the required format and range values defined in the data dictionary (note that for the imaging data this is a second validation, needed to ensure the quality of data). The Data Validation module validates the metadata associated with the data files identified by the user for submission against the data dictionary. If the data pass validation, the Data Validation module creates a submission ticket and submission package, and the data are ready for upload. If errors are found, the module provides a detailed report of any data discrepancies, errors, and warnings. The module runs as a Java Web Start application locally on a user's computer (Java runtime environment is required).

Module input:

  1. CSV file with clinical data or imaging metadata.

Module output:

  1. a submission package and submission ticket (XML) ready for submission by the Data Upload module.
  2. an error log with validation errors and warnings (if any).

Read more ...

Data Upload module

After the submission package has been created using either the Imaging Data Submission and Validation module (for imaging data) or by the Data Validation module (for non-imaging data), data can be submitted to the data repository. The Data Upload module facilitates this process. The module runs as a Java Web Start application locally on a user's computer (Java runtime environment is required).

Module input:

  1. a submission package and submission ticket (XML) from the Data Validation module.

Module output:

  1. Data submitted to the data repository.

Read more...

Data Download module

The Data Download module helps users select and download datasets from the data repository to their own systems. The module runs as a Java Web Start application locally on a user's computer (Java runtime environment is required).

Module input:

  1. Data submitted to the repository by the Data Upload module.

Module output:

  1. Data downloaded to your computer- TBD.

Read more...