Data Repository tools

From MIPAV

To help researchers upload data to and download data from the data repository, BRICS provides the Data Repository tool, which includes the following modules (or sub-tools):

  1. The Imaging data submission and validation module is used to create the image submission package for imaging data.
  2. The Data Validation module verifies that data conform to the required format and range values defined in the data dictionary. It also creates a data submission package (XML) and a submission ticket (XML) that can be uploaded to the data repository via the Data Upload module.
  3. The Data Upload module assists researchers in uploading their data to the data repository (in the form of a submission package and submission ticket).
  4. The Data Download module assists researchers in downloading data from the data repository.

Steps in data preparation, validation, submission, and download

For imaging data:

  1. Create and pre-validate the image submission package using the Imaging data submission and validation module.
  2. Validate the image submission package using the Data Validation module, and create the data submission package and ticket using the same module.
  3. Use the data submission ticket from the previous step to upload the data submission package with the help of the Data Upload module.
  4. Download the data using the means provided by the Data Download module.

For non-imaging data:

  1. Create and validate the data submission package using the Data Validation module. The same module creates the submission ticket automatically as soon as the data pass validation.
  2. Use the data submission ticket from the previous step to upload the data submission package with the help of the Data Upload module.
  3. Download the data using the means provided by the Data Download module.
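The two step lists above can be written down as ordered data, which makes the difference between the imaging and non-imaging paths explicit. This is only an illustrative sketch: the dictionary keys, the step descriptions, and the `next_step` helper are assumptions made for the example and are not part of BRICS or MIPAV.

```python
# Ordered (module, action) pairs paraphrasing the step lists above.
# The keys "imaging" / "non-imaging" and the helper name are hypothetical.
WORKFLOWS = {
    "imaging": [
        ("Imaging data submission and validation module",
         "create and pre-validate the image submission package"),
        ("Data Validation module",
         "validate the package and create the submission package and ticket"),
        ("Data Upload module",
         "upload the submission package using the submission ticket"),
        ("Data Download module",
         "download data from the data repository"),
    ],
    "non-imaging": [
        ("Data Validation module",
         "create and validate the submission package; the ticket is created on success"),
        ("Data Upload module",
         "upload the submission package using the submission ticket"),
        ("Data Download module",
         "download data from the data repository"),
    ],
}


def next_step(data_kind, completed_steps):
    """Return the next (module, action) pair, or None when the workflow is done."""
    steps = WORKFLOWS[data_kind]
    if completed_steps >= len(steps):
        return None
    return steps[completed_steps]
```

For example, `next_step("imaging", 0)` returns the pre-validation step, while `next_step("non-imaging", 0)` starts directly with the Data Validation module.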

Data for upload

Imaging data

Imaging data can be uploaded in the form of a brain image file and a corresponding CSV file that contains additional patient/subject/visit information (not stored in the image header) as well as image-related metadata. See also Required image information.

The following information is required for all image submissions and must be included in a CSV file:

  1. Patient/subject information: GUID, patient/subject age, study site name, and visit date.
  2. Image information: imaging study date and time, imaging file name, the imaging file itself, imaging file format and modality, and image QA/QC information.
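Before building an image submission package, it can be useful to confirm that the CSV file actually carries the required fields listed above. The following is a minimal sketch of such a check. The column names are hypothetical placeholders chosen for the example (the real element names are defined by the data dictionary), and the functions are not part of any BRICS or MIPAV module.

```python
import csv
import io

# Hypothetical column names standing in for the required fields above;
# the actual names come from the data dictionary.
REQUIRED_COLUMNS = [
    "GUID", "AgeYrs", "SiteName", "VisitDate",
    "ImgStudyDateTime", "ImgFileName", "ImgFileFormat", "ImgModality", "ImgQAQC",
]


def missing_columns(csv_text, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


def rows_with_blanks(csv_text, required=REQUIRED_COLUMNS):
    """Return 1-based data row numbers in which any required field is empty."""
    reader = csv.DictReader(io.StringIO(csv_text))
    bad_rows = []
    for number, row in enumerate(reader, start=1):
        if any(not (row.get(name) or "").strip() for name in required):
            bad_rows.append(number)
    return bad_rows
```

A file passes this pre-check when `missing_columns` returns an empty list and `rows_with_blanks` reports no rows. The check covers only the presence of the fields, not their format or range.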

The brain image file can be:

For more information refer to: Imaging data submission and validation module.

Clinical data

In order to upload clinical data to the data repository, the data should be submitted to the Data Validation module as CSV files (in tab-delimited format).

The Data Validation module accepts CSV files from a researcher and validates the files' content against the values defined in the data dictionary. For those data that pass validation, the Data Validation module creates a submission package and submission ticket both in XML format.
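To illustrate what validating file content against a data dictionary can involve, the sketch below checks each value for type, range, and permitted values. It is not the Data Validation module's implementation: the dictionary layout, the element names `AgeYrs` and `Sex`, and both functions are assumptions made for the example. The delimiter is a parameter because the files are described above as CSV files in tab-delimited format.

```python
import csv
import io

# Hypothetical data dictionary: element name -> validation rule.
DATA_DICTIONARY = {
    "AgeYrs": {"type": "numeric", "min": 0, "max": 120},
    "Sex": {"type": "choice", "allowed": {"Male", "Female", "Unknown"}},
}


def validate_value(name, value, dictionary=DATA_DICTIONARY):
    """Return an error message for one value, or None if it is acceptable."""
    rule = dictionary.get(name)
    if rule is None:
        return f"{name}: not defined in the data dictionary"
    if rule["type"] == "numeric":
        try:
            number = float(value)
        except ValueError:
            return f"{name}: '{value}' is not numeric"
        if not rule["min"] <= number <= rule["max"]:
            return f"{name}: {value} is outside {rule['min']}-{rule['max']}"
    elif rule["type"] == "choice" and value not in rule["allowed"]:
        return f"{name}: '{value}' is not a permitted value"
    return None


def validate_csv(text, delimiter="\t", dictionary=DATA_DICTIONARY):
    """Return a list of error messages; an empty list means the data passed."""
    errors = []
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    for row_number, row in enumerate(reader, start=1):
        for name, value in row.items():
            if name is None:
                # DictReader stores surplus fields under the key None.
                errors.append(f"row {row_number}: more fields than header columns")
                continue
            problem = validate_value(name, (value or "").strip(), dictionary)
            if problem:
                errors.append(f"row {row_number}: {problem}")
    return errors
```

In this sketch an empty error list corresponds to the case in which a submission package and ticket would be created, and a non-empty list corresponds to the error report.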

The submission ticket is used by the Data Upload module to upload the data (in the form of a corresponding submission package) to the data repository.

For more information about the structure of CSV files, data submission and validation, refer to Data Validation module.

Genomics data

TBD.

Imaging data submission and validation module

The Imaging Data Submission and Validation module (also known as the MIPAV Image Submission and Validation tool or the MIPAV Image submission plug-in) is designed to help researchers validate and submit their data. Data validation is a necessary step that must be done prior to data submission in order to ensure the quality of uploaded data and to make the data queryable. The module runs as a Java Web Start application locally on a user's computer (a Java runtime environment is required).

Module input:

  1. one or more brain images in one of the supported formats,
  2. a corresponding CSV file with metadata, see Imaging data.

Module output:

  1. the brain image(s) in one of the supported formats,
  2. the CSV file checked against the data dictionary and ready for validation by the Data Validation module,
  3. the Output log, which lists all brain image files and CSV files added to the image submission package and displays the path(s) to the directory where the image package(s) are stored.


Data Validation module

The Data Validation module assists researchers with the submission of both imaging and non-imaging data into the repository. It validates the metadata associated with the data files that the user identifies for submission, verifying that the data conform to the required format and range values defined in the data dictionary (for imaging data, this is a second validation, needed to ensure the quality of the data). If validation succeeds, the module creates a submission ticket and a submission package, and the data are ready for upload. If errors are found, the module provides a detailed report of any data discrepancies, errors, and warnings. The module runs as a Java Web Start application locally on a user's computer (a Java runtime environment is required).

Module input:

  1. CSV file with clinical data or imaging metadata.

Module output:

  1. a submission package and submission ticket (XML) ready for submission by the Data Upload module.
  2. an error log with validation errors and warnings (if any).


Data Upload module

After the submission package has been created using either the Imaging Data Submission and Validation module (for imaging data) or the Data Validation module (for non-imaging data), the data can be submitted to the data repository. The Data Upload module facilitates this process. The module runs as a Java Web Start application locally on a user's computer (a Java runtime environment is required).

Module input:

  1. a submission package and submission ticket (XML) from the Data Validation module.

Module output:

  1. Data submitted to the data repository.


Data Download module

The Data Download module helps users select and download datasets from the data repository to their own systems. The module runs as a Java Web Start application locally on a user's computer (a Java runtime environment is required).

Module input:

  1. Data submitted to the repository by the Data Upload module.

Module output:

  1. Data downloaded to your computer (TBD).