Data Validation module and Degrees of freedom
From MIPAV

= Data Validation module =
To ensure the quality of uploaded data and to make data easy to query, data should be submitted in a specific format, and values should comply with the ranges defined in the data dictionary. All submitted research data must be validated against the values defined in the data dictionary prior to submission. To facilitate this process, we provide the Data Validation module, which assists researchers with the submission of their data.
== Introduction ==
The Data Validation module accepts data as [[#CSVFiles|CSV]] files from a researcher and validates the file content against the values defined in the data dictionary.


For those CSV files that pass validation, the Data Validation module creates [[#SubmissionPackage|a submission ticket and submission package]], both in [http://en.wikipedia.org/wiki/XML XML format]; the data are then ready for uploading. The submission ticket is used by the [[#DataUpload|Data Upload module]] to upload the data (in the form of a corresponding submission package) to the repository.


If any validation errors or warnings are found, the module provides [[#ErrorLog|a detailed report]] of the data discrepancies, [[#ErrorLog|errors]], and warnings received.


Validation warnings are only warnings; they do not prevent the creation of the submission package. However, if any validation errors are found, [[#SubmissionPackage|a submission package]] cannot be created. In that case, the researcher should first edit the data to fix all errors, and then re-validate the data.


'''See also:'''
*[[Data Repository tools|Data Repository tools]]
*[[Image submission plug-in|Imaging data submission and validation]] module
*[[Data Upload module|Data Upload]] module
*[[Data Download module|Data Download]] module


== System requirements ==
The most recent version of [http://java.com/en/download/index.jsp Java Runtime Environment (JRE)] (6 or 7) is required in order to run the Data Validation module.


== Module input and output ==
'''Module input:'''
# [[#CSVFiles|CSV]] files with clinical data or [[#ImagingData|imaging metadata]].

'''Module output:'''
# [[#SubmissionPackage|a submission package and submission ticket]] ([http://en.wikipedia.org/wiki/XML XML]) ready for submission by the [[#DataUpload|Data Upload module]];
# [[Data Validation module#Error log|an error log]] with validation errors and warnings (if any).


<div id="CSVFiles"></div>
== CSV files ==
The structure of a [http://en.wikipedia.org/wiki/Comma-separated_values CSV file] should match a corresponding [[#FormStructure|form structure]] queryable by [http://fitbir-demo.cit.nih.gov the query tool].

For more information about CSV files for data uploading, contact the data dictionary operations team - TBD.
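As a sketch of what matching a CSV file against a form structure involves, the check below compares a file's header row to an expected column list. The form-structure columns here are hypothetical; the real definitions come from the data dictionary.

```python
import csv
import io

# Hypothetical form-structure columns; real lists come from the data dictionary.
EXPECTED_COLUMNS = ["GUID", "VisitDate", "AgeYrs"]

def check_header(csv_text):
    """Return the expected columns that are missing from the CSV header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    return [col for col in EXPECTED_COLUMNS if col not in header]

sample = "GUID,VisitDate\nTBI000001,2013-07-01\n"
print(check_header(sample))  # ['AgeYrs'] - one required column is missing
```

A file whose header contains every expected column would return an empty list.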


<div id="FormStructure"></div>
=== Form structures ===
'''A form structure''' represents a grouping/collection of data elements used in the [http://ibis.nih.gov/jsp/tools/about-brics.jsp BRICS data dictionary]. A form structure is analogous to [http://en.wikipedia.org/wiki/Case_report_form a case report form (CRF)] (electronic or paper), where data elements are linked together for collection and display.

'''A data element''' is a logical unit of data used in [http://ibis.nih.gov/jsp/tools/about-brics.jsp BRICS]. It contains a single piece of information of one kind. A data element has a name, a precise definition, and a set of permissible values (codes), if applicable. A data element is not necessarily the smallest unit of data; it can be a unique combination of one or more smaller units. A data element occupies the space provided by field(s) on a paper/electronic case report form (CRF) or field(s) in a database record. [http://ibis.nih.gov/jsp/tools/about-brics.jsp BRICS] allows the use of two types of data elements:

* Common data elements (CDEs), which are used across multiple studies and diseases/domains;
* Unique data elements (UDEs), which are used to gather information for a particular study.

Both types of data elements are used in form structures and to collect data.
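As an illustration, a data element's definition can be modeled as a name, a precise definition, a type, and an optional set of permissible values. The element shown here is hypothetical, not an actual BRICS data element.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DataElement:
    # A single logical unit of data: name, precise definition,
    # data type, and permissible values (codes), if applicable.
    name: str
    definition: str
    data_type: str                      # e.g. "alphanumeric", "numeric", "date"
    permissible_values: Optional[frozenset] = None

    def allows(self, value: str) -> bool:
        """A value is permissible if no code list is defined, or it is listed."""
        return self.permissible_values is None or value in self.permissible_values

# Hypothetical data element with a code list.
handedness = DataElement("Handedness", "Dominant hand of the participant",
                         "alphanumeric", frozenset({"Left", "Right", "Ambidextrous"}))
print(handedness.allows("Right"))   # True
print(handedness.allows("Both"))    # False - not among the permissible values
```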


<div id="SubmissionPackage"></div>
== Submission package ==
The submission package includes:
* A submission ticket ([http://en.wikipedia.org/wiki/XML XML]), see the example below;
* A data file ([http://en.wikipedia.org/wiki/XML XML]).

'''An example of a submission ticket:'''

<code>
<?xml version="1.0" encoding="UTF-8" standalone="true"?>
<submissionTicket environment="production" version="2.0.2.108">
  <submissionPackage types="CLINICAL" crcHash="55830a2aa77164ea834942e65e319a38" dataFileBytes="241686" bytes="19233" name="dataFile-1373248220203">
    <datasets>
      <dataset crcHash="82ff11c787fc3086ce5bbd9e7518e279" bytes="19233" name="WardMinus2DemoGUIDS.csv" type="CLINICAL" path="C:\Users\user1\Documents\TBI 2013\CSV\sampleCSV.csv"/>
    </datasets>
    <associatedFiles/>
  </submissionPackage>
</submissionTicket>
</code>
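Because the submission ticket is plain XML, its attributes can be read with any XML parser. A minimal sketch with Python's standard-library ElementTree, using an abridged copy of the example ticket above:

```python
import xml.etree.ElementTree as ET

# Abridged copy of the example submission ticket shown above.
ticket_xml = """<?xml version="1.0" encoding="UTF-8"?>
<submissionTicket environment="production" version="2.0.2.108">
  <submissionPackage types="CLINICAL" crcHash="55830a2aa77164ea834942e65e319a38"
                     bytes="19233" name="dataFile-1373248220203">
    <datasets>
      <dataset crcHash="82ff11c787fc3086ce5bbd9e7518e279" bytes="19233"
               name="WardMinus2DemoGUIDS.csv" type="CLINICAL"/>
    </datasets>
    <associatedFiles/>
  </submissionPackage>
</submissionTicket>"""

root = ET.fromstring(ticket_xml)
package = root.find("submissionPackage")
print(package.get("types"))            # CLINICAL
for dataset in package.iter("dataset"):
    print(dataset.get("name"), dataset.get("crcHash"))
```

The `crcHash` attribute lets the upload tool verify file integrity; the hashing algorithm itself is not specified in the ticket.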


<div id="RunningDataValidation"></div>
== Running the Data Validation module ==
 
[[File:ValidationTool1.png|200px|thumb|left|Selecting the Working Directory and loading files]]
[[File:ValidationTool2.png|200px|thumb|left|Including and excluding files from/to validation]]
[[File:ValidationTool3.png|200px|thumb|left|Validation warnings (if any) appear in the Result Details table. Files with warnings can be included into a submission package]]
The Data Validation module runs locally on your machine. To launch the module, navigate to the Data Repository > Validate Data page and click the Launch the Validation Tool link.
 
'''Note''' that the most recent version of [http://java.com/en/download/index.jsp Java Runtime Environment (JRE)] (6 or 7) is required in order to run the module. Make sure your computer has it installed.
 
# Click Launch Validation Tool. In the Opening window that appears, select Open with Java(TM) Web Start Launcher (default) and click OK. In the Java Runtime Environment window that appears next saying "Do you want to run this application?", click Run.
# The module dialog box appears. Click Browse (under Working Directory) to navigate to the directory where the files for submission (CSVs) are located. We call it your Working Directory.
# Select the directory and click Load files to load CSVs into the dialog box. The Loading Files window appears showing the progress.
# At some point the list of files from your working directory appears in the dialog box under Files.
# If your Working Directory contains only the CSV files designated for validation, select the files (click to highlight) to be validated and click Validate Files. See [[#IncludingExcludingFiles|Including and excluding files]].
# The Validation module begins to run, and validation errors/warnings (if any) appear in the Result Details table. Note: Files with warnings can be included in a submission package. However, [[#ErrorLog|files with errors]] must be fixed and re-validated.
# If no errors are found in your CSV file(s), the following information appears in the Files table for each file that passes validation: 1) the form structure name appears in the Structure column, 2) the word PASSED appears in the Result column, and 3) the Summary column contains only warnings, no errors. Note that a file that passes validation can still have many warnings; that is OK. For more information about validation errors and warnings, refer to [[#ErrorLog|Error log]].
# Click Build Submission Package.
# For each validated CSV file, [[#SubmissionPackage|the submission package and submission ticket]] will be deposited in the same Working Directory as the original files submitted to the Validation Tool.
 
<div id="IncludingExcludingFiles"></div>
 
== Including and excluding files ==
Ideally, your Working Directory should contain only the CSV files for validation. Often, however, it also contains other files (such as error logs, notes, etc.). In that case, you need to:
 
# Exclude from validation those files (and directories) that are not designated for validation. These files usually appear with Type= UNKNOWN under Files in the Working Directory;
# Include into validation the CSV data files that you would like  to validate. These files have Type=CSV in the Files table.
 
Refer to [[#ExcludingFiles|Excluding files from validation]] and [[#IncludingFiles| Including files for validation]].
 
<div id="ExcludingFiles"></div>
=== Excluding files from validation ===
To exclude files from validation, select the individual file(s) (click to highlight) that are of Type UNKNOWN or otherwise not needed for the submission. Hold Ctrl while clicking to highlight multiple files. Click Exclude Files.
 
<div id="IncludingFiles"></div>
=== Including files for validation ===
To include files for validation, select the CSV files you want to validate and click Include Files. Hold Ctrl while clicking to highlight multiple files.
 
<div id="ErrorLog"></div>
 
== Error log ==
Validation errors and warnings appear in the Result Details table. Files with warnings can still pass validation. However, files with errors must be fixed and then re-validated.
 
Validation errors appear when a CSV file has entries that:
* Are of a different type than defined in the data dictionary, e.g. numbers instead of alphanumeric values;
* Are in a different format than defined in the data dictionary for this data element;
* Are not listed among the permissible values for this particular data element;
* Have more than one permissible value separated by a semicolon (";");
* Contain some other errors.
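A minimal sketch of these checks, assuming hypothetical per-field rules; the real Data Validation module reads its rules from the data dictionary:

```python
import re

# Hypothetical per-field rules: a format pattern and/or a permissible-value list.
RULES = {
    "AgeYrs": {"pattern": re.compile(r"^\d+$"), "codes": None},
    "Handedness": {"pattern": None, "codes": {"Left", "Right"}},
}

def validate_row(row):
    """Return a list of error strings for one CSV row (a dict of field -> value)."""
    errors = []
    for field, value in row.items():
        rule = RULES.get(field)
        if rule is None:
            continue
        if rule["pattern"] and not rule["pattern"].match(value):
            errors.append(f"{field}: value {value!r} has the wrong type/format")
        if rule["codes"] and value not in rule["codes"]:
            errors.append(f"{field}: {value!r} is not a permissible value")
        if ";" in value:
            errors.append(f"{field}: more than one value separated by ';'")
    return errors

print(validate_row({"AgeYrs": "42", "Handedness": "Right"}))  # [] - row passes
print(validate_row({"AgeYrs": "forty-two", "Handedness": "Both"}))  # two errors
```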
 
Validation warnings mostly appear when a data entry defined as Required in the corresponding form structure is missing from the CSV file.
 
Do not be surprised if your validation error log contains a large number of warnings; these can usually be ignored.
 
=== Fixing validation errors===
[[File:ValidationTool4.png|200px|thumb|left|Saving validation errors and warnings in a TXT file]]
Validation errors and warnings can be exported into a text file - that makes working with them and fixing errors much easier.
 
==== To export validation errors or warnings, or both ====
# Click Export Result Details.
# In the Save dialog box that appears, select a directory where you would like to save the validation log, and specify which types of log entries you would like to export: both errors and warnings (recommended only for smaller log files), errors only (recommended), or warnings only.
# Type in your own file name and press Save.
# The log file will be saved in the designated directory under the chosen name.
 
'''Recommendations:'''
* By default, an error log file is created and stored in the same directory as your working files. We recommend that you create a designated error log directory and save validation logs there.
* By default, an error log is saved under the name "resultDetail.txt". We recommend that you choose your own file name for the error log, related to the name of your data file. E.g., for a data file named "MyData.csv", name the corresponding error log file "MyDataErrorLog.txt".
 
After you have exported all validation errors:
# Open the log file in a text editor (MS Word, Notepad, Crimson, Notepad++ - all of these will work).
# Open your CSV file in MS Excel or another editor that can work with CSV (not MS Word!).
# Go through each entry in the error log and fix it in the CSV file. Save the changes, making sure the file is saved as CSV.
# [[#RunningDataValidation|Re-validate]] the fixed CSV file. Make sure that all errors are gone.
# Create the submission package.
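When a log is large, a quick tally of its entries helps with fixing errors in batches. A sketch assuming one entry per line containing the word ERROR or WARNING; the exact format of the exported log may differ:

```python
def tally_log(log_text):
    """Count ERROR and WARNING lines in an exported result-details log."""
    counts = {"ERROR": 0, "WARNING": 0}
    for line in log_text.splitlines():
        for level in counts:
            if level in line.upper():
                counts[level] += 1
                break
    return counts

# Hypothetical log entries, for illustration only.
sample_log = """WARNING: AgeYrs recommended but empty
ERROR: Handedness value 'Both' not permissible
WARNING: VisitDate recommended but empty"""
print(tally_log(sample_log))  # {'ERROR': 1, 'WARNING': 2}
```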
 
'''Recommendations:'''
* If you received many validation errors, we recommend fixing them in batches. Fix a few errors, save the fixed CSV file, and [[#RunningDataValidation|re-run it through the Data Validation module]]. It will still report errors, but fewer than before. Save the new error log and go through it, fixing a few more errors. [[#RunningDataValidation|Re-run validation]]. Repeat these steps until you get 0 (zero) errors.
 
== Next step - data upload ==
 
After all validation errors have been fixed and a submission package has been created, the data can be submitted to the system. To upload the submission package, use the [[Data Upload module]]. The module runs locally on your computer as [http://java.com/en/download/faq/java_webstart.xml a Java Web Start application] (the latest version of the [http://java.com/en/download/index.jsp Java Runtime Environment] is required).
 
Read [[Data Upload module|more]]...
 
== See also ==

*[[Data Repository tools|Data Repository tools]]
*[[Image submission plug-in|Imaging data submission and validation module]]
*[[Data Upload module]]
*[[Data Download module]]




[[Category:Help]]
[[Category:BRICS]]
[[Category:Help:Algorithms]]

= Degrees of freedom =
Revision as of 18:07, 18 May 2012

The number of independent pieces of information that go into the estimate of a parameter is called '''the degrees of freedom (DOF)'''. In this guide, DOF are given for 3D images.

== Basics ==
In general, the degrees of freedom of an estimate is equal to the number of independent scores that go into the estimate minus the number of parameters estimated as intermediate steps in the estimation of the parameter itself. In image registration, a transformation matrix establishes geometrical correspondence between the coordinate systems of different images. It is used to transform one image into the space of the other.

=== Transformations generally used in biomedical imaging ===

==== Rigid-body transformations ====
Rigid-body transformations include translations and rotations. They preserve all lengths and angles. These are '''6 DOF''' transformations, and the transformation matrix is as follows:

<math>
\begin{bmatrix}
  R_x & R_y & R_z \\
  T_x & T_y & T_z
\end{bmatrix}
</math>
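The length-preserving property can be checked numerically with homogeneous coordinates: a 3D rotation plus a translation leaves the distance between any two points unchanged. A NumPy sketch, independent of MIPAV's own matrix notation:

```python
import numpy as np

theta = np.radians(30.0)
# A 6 DOF rigid-body transform in homogeneous coordinates:
# rotation about the z axis plus a translation (2, -1, 3).
T = np.array([[np.cos(theta), -np.sin(theta), 0.0,  2.0],
              [np.sin(theta),  np.cos(theta), 0.0, -1.0],
              [0.0,            0.0,           1.0,  3.0],
              [0.0,            0.0,           0.0,  1.0]])

p = np.array([1.0, 2.0, 3.0, 1.0])   # points in homogeneous form
q = np.array([4.0, 0.0, -2.0, 1.0])

dist_before = np.linalg.norm((p - q)[:3])
dist_after = np.linalg.norm((T @ p - T @ q)[:3])
print(np.isclose(dist_before, dist_after))  # True: lengths are preserved
```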

==== Global rescale transformations ====
Global rescale transformations include translations, rotations, and a single scale parameter S = S<sub>x</sub> = S<sub>y</sub> = S<sub>z</sub>. They preserve all angles and relative lengths. These are '''7 DOF''' transformations, and the transformation matrix is as follows:

<math>
\begin{bmatrix}
  R_x & R_y & R_z \\
  T_x & T_y & T_z \\
  S
\end{bmatrix}
</math>

<div id="AffineTransformations"></div>
==== Affine transformations ====
Affine transformations include translations, rotations, scales, and/or skewing parameters. They preserve straight lines, but not necessarily angles or lengths. The transformation matrices for affine transformations are as follows.

The '''9 DOF''' transformation matrix, which includes the scale parameters S<sub>x</sub>, S<sub>y</sub>, and S<sub>z</sub>, looks as follows:

<math>
\begin{bmatrix}
  R_x & R_y & R_z \\
  T_x & T_y & T_z \\
  S_x & S_y & S_z
\end{bmatrix}
</math>

The '''12 DOF''' transformation matrix includes both scale and skew parameters. In the matrix, the skewing parameters are presented as Sk<sub>x</sub>, Sk<sub>y</sub>, and Sk<sub>z</sub>:

<math>
\begin{bmatrix}
  R_x & R_y & R_z \\
  T_x & T_y & T_z \\
  S_x & S_y & S_z \\
  Sk_x & Sk_y & Sk_z
\end{bmatrix}
</math>
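By contrast, an affine transform with a skew term keeps straight lines straight but does not preserve lengths or angles. Another NumPy sketch (again independent of MIPAV's own notation):

```python
import numpy as np

# Affine transform in homogeneous coordinates with one skew term:
# x' = x + 0.5 * y, identity elsewhere.
A = np.array([[1.0, 0.5, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])

# A unit vector along y is stretched by the skew: length becomes sqrt(1.25).
u = np.array([0.0, 1.0, 0.0, 1.0])
origin = np.array([0.0, 0.0, 0.0, 1.0])
length_after = np.linalg.norm((A @ u)[:3] - (A @ origin)[:3])
print(length_after)  # ~1.118, so lengths are not preserved

# But the midpoint of a segment still maps to the midpoint of the mapped
# endpoints: straight lines are preserved.
p, q = np.array([1.0, 2.0, 0.0, 1.0]), np.array([3.0, 6.0, 0.0, 1.0])
mid = (p + q) / 2.0
print(np.allclose(A @ mid, (A @ p + A @ q) / 2.0))  # True
```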

== References ==

* Lisa Gottesfeld Brown, "A survey of image registration techniques," ACM Computing Surveys (CSUR), Volume 24, Issue 4, December 1992, pp. 325-376.
* Simonson, K., Drescher, S., Tanner, F., "A Statistics Based Approach to Binary Image Registration with Uncertainty Analysis," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 29, No. 1, January 2007.
* Domokos, C., Kato, Z., Francos, J., "Parametric estimation of affine deformations of binary images," Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, 2008.

== See also ==

*[[Using MIPAV Algorithms]]
*[[Cost functions used in MIPAV algorithms]]