Difference between revisions of "Data Validation module"

Revision as of 16:41, 23 July 2013

To ensure the quality of uploaded data and to make the data easy to query, data should be submitted in a specific format, and values should comply with those defined in the data dictionary. All research data must be validated against the data dictionary prior to submission. To facilitate this process, we provide the Data Validation module, which assists researchers with the submission of their data.

Introduction

The Data Validation module accepts data as CSV files from a researcher and validates the file content against the values defined in the data dictionary. If validation succeeds, the module creates a submission package, which includes a submission ticket. After that, the data are ready for uploading.

If any validation errors or warnings are found, the module provides a detailed report of any data discrepancies, errors, and warnings received.

Validation warnings are informational only and do not prevent the creation of the submission package. However, if any validation errors are found, a submission package cannot be created. In that case, the researcher should first edit the data to fix all errors and then re-validate the data.
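The rule above (warnings are tolerated, errors block the package) can be sketched as follows. This is only an illustration: the ValidationReport class, its field names, and the sample messages are hypothetical and are not part of the Data Validation module.

```python
from dataclasses import dataclass, field


@dataclass
class ValidationReport:
    """Collects the messages produced while validating one CSV file (hypothetical)."""
    errors: list = field(default_factory=list)
    warnings: list = field(default_factory=list)

    def can_create_package(self) -> bool:
        # Warnings are informational; only errors block package creation.
        return not self.errors


report = ValidationReport()
report.warnings.append("row 3: optional field is empty")
print(report.can_create_package())

report.errors.append("row 7: value is not permitted by the data dictionary")
print(report.can_create_package())
```

Keeping errors and warnings in separate lists makes the gating decision a single check, while the full report can still list both kinds of messages.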

System requirements

The most recent release of the Java Runtime Environment (JRE), version 6 or 7, is required to run the Data Validation module.

CSV files

The structure of a CSV file should match a corresponding form structure queryable by the query tool.
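As a rough illustration of what "matching a form structure" could mean, the sketch below compares the header row of a CSV file with a list of expected data element names. The function name, the column names, and the assumption that the first row holds the element names are all hypothetical; the actual CSV layout expected by the module is not specified here.

```python
import csv
import io


def check_header(csv_text, expected_elements):
    """Return (missing, unexpected) column names for the first row of csv_text."""
    reader = csv.reader(io.StringIO(csv_text))
    header = [name.strip() for name in next(reader, [])]
    missing = [name for name in expected_elements if name not in header]
    unexpected = [name for name in header if name not in expected_elements]
    return missing, unexpected


# Hypothetical form structure and CSV content, for illustration only.
expected = ["GUID", "AgeYrs", "Gender"]
sample = "GUID,AgeYrs,Sex\nTBI_001,34,Male\n"
missing, unexpected = check_header(sample, expected)
print("missing:", missing)
print("unexpected:", unexpected)
```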

For more information about CSV files for data uploading, contact the data dictionary operations team - TBD.

Form structures

A form structure represents a grouping/collection of data elements used in the BRICS data dictionary. A form structure is analogous to a case report form (CRF) (electronic or paper) where data elements are linked together for collection and display.

A data element is a logical unit of data used in BRICS. It contains a single piece of information of one kind. A data element has a name, a precise definition, and, if applicable, a set of permissible values (codes). A data element is not necessarily the smallest unit of data; it can be a unique combination of one or more smaller units. A data element occupies the space provided by field(s) on a paper/electronic case report form (CRF) or field(s) in a database record. BRICS supports two types of data elements:

  • Common data elements (CDEs), which are used across multiple studies and diseases/domains,
  • Unique data elements (UDEs), which are used to gather information for a particular study.

Both types of data elements are used in form structures to collect data.
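The relationship between form structures, data elements, and permissible values can be modeled roughly as below. This is a minimal sketch, not the BRICS data model: the class names, fields, and the sample elements are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class DataElement:
    name: str
    definition: str
    # None means the element has no permissible-value list (any value is accepted).
    permissible_values: Optional[FrozenSet[str]] = None
    common: bool = True  # True for a CDE, False for a UDE

    def accepts(self, value: str) -> bool:
        if self.permissible_values is None:
            return True
        return value in self.permissible_values


@dataclass(frozen=True)
class FormStructure:
    name: str
    elements: tuple

    def invalid_fields(self, record: dict) -> list:
        """Names of elements whose value in the record is not permissible."""
        return [e.name for e in self.elements
                if e.name in record and not e.accepts(record[e.name])]


# Hypothetical elements: one CDE with codes, one UDE without.
gender = DataElement("Gender", "Self-reported gender",
                     frozenset({"Male", "Female", "Unknown"}))
site_note = DataElement("SiteNote", "Free-text note", common=False)
form = FormStructure("Demographics", (gender, site_note))
print(form.invalid_fields({"Gender": "M", "SiteNote": "ok"}))
```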

Submission package

The submission package includes:

  • A submission ticket (XML), see an example below,
  • A data file (XML).

An example of a submission ticket:

<?xml version="1.0" encoding="UTF-8" standalone="true"?>
<submissionTicket environment="production" version="2.0.2.108">
  <submissionPackage types="CLINICAL" crcHash="55830a2aa77164ea834942e65e319a38" dataFileBytes="241686" bytes="19233" name="dataFile-1373248220203">
    <datasets>
      <dataset crcHash="82ff11c787fc3086ce5bbd9e7518e279" bytes="19233" name="WardMinus2DemoGUIDS.csv" type="CLINICAL" path="C:\Users\user1\Documents\TBI 2013\CSV\sampleCSV.csv"/>
    </datasets>
    <associatedFiles/>
  </submissionPackage>
</submissionTicket>
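To show how the fields of such a ticket might be read programmatically, the sketch below parses the example with Python's standard xml.etree.ElementTree module. The summarize_ticket function and the keys of the returned dictionary are hypothetical; only the element and attribute names come from the example above. The XML declaration is left out of the embedded string, since only the element tree is needed here.

```python
import xml.etree.ElementTree as ET

# Raw string so the backslashes in the Windows path are kept literally.
TICKET = r"""<submissionTicket environment="production" version="2.0.2.108">
  <submissionPackage types="CLINICAL" crcHash="55830a2aa77164ea834942e65e319a38"
      dataFileBytes="241686" bytes="19233" name="dataFile-1373248220203">
    <datasets>
      <dataset crcHash="82ff11c787fc3086ce5bbd9e7518e279" bytes="19233"
          name="WardMinus2DemoGUIDS.csv" type="CLINICAL"
          path="C:\Users\user1\Documents\TBI 2013\CSV\sampleCSV.csv"/>
    </datasets>
    <associatedFiles/>
  </submissionPackage>
</submissionTicket>"""


def summarize_ticket(xml_text):
    """Extract a few attributes from a submission ticket (illustrative only)."""
    root = ET.fromstring(xml_text)
    package = root.find("submissionPackage")
    datasets = [
        {"name": d.get("name"), "type": d.get("type"), "bytes": int(d.get("bytes"))}
        for d in package.iter("dataset")
    ]
    return {
        "environment": root.get("environment"),
        "package": package.get("name"),
        "data_file_bytes": int(package.get("dataFileBytes")),
        "datasets": datasets,
    }


summary = summarize_ticket(TICKET)
print(summary["package"], summary["datasets"][0]["name"])
```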

Running the Data Validation module

TBD


Error log

TBD.