diff options
Diffstat (limited to 'uploader')
-rw-r--r-- | uploader/templates/phenotypes/add-phenotypes-raw-files.html | 148 |
1 files changed, 143 insertions, 5 deletions
diff --git a/uploader/templates/phenotypes/add-phenotypes-raw-files.html b/uploader/templates/phenotypes/add-phenotypes-raw-files.html index 612bff7..a39ace8 100644 --- a/uploader/templates/phenotypes/add-phenotypes-raw-files.html +++ b/uploader/templates/phenotypes/add-phenotypes-raw-files.html @@ -43,7 +43,11 @@ <span class="form-text text-muted"> Provide the character that separates the fields in your file(s). It should be the same character for all files (if more than one is provided).<br /> - A tab character will be assumed if you leave this field blank.</span> + A tab character will be assumed if you leave this field blank. See + <a href="#docs-file-separator" + title="Documentation for file-separator characters"> + documentation for more information</a>. + </span> </div> <div class="form-group"> @@ -62,7 +66,11 @@ </div> <span class="form-text text-muted"> This specifies that lines that begin with the character provided will be - considered comment lines and ignored in their entirety.</span> + considered comment lines and ignored in their entirety. See + <a href="#docs-file-comment-character" + title="Documentation for comment characters"> + documentation for more information</a>. + </span> </div> <div class="form-group"> @@ -81,7 +89,9 @@ This specifies strings in your file indicate that there is no value for a particular cell (a cell is where a column and row intersect). Provide a space-separated list of strings if you have more than one way of - indicating no values.</span> + indicating no values. See + <a href="#docs-file-na" title="Documentation for no-value fields"> + documentation for more information</a>.</span> </div> </fieldset> @@ -112,10 +122,11 @@ required="required" /> <span class="form-text text-muted"> Provide a file that contains only the phenotype data. See - <a href="#docs-file-example" + <a href="#docs-phenotype-data" title="Documentation of the phenotype data file format."> the documentation for the expected format of the file</a>.</span> </div> + {%if population.Family in families_with_se_and_n%} <div class="form-group"> <label for="finput-phenotype-se" class="form-label">Phenotype: Standard Errors</label> @@ -146,7 +157,134 @@ {%block page_documentation%} -page documentation goes here!!! +<div class="row"> + <h2 class="heading" id="docs-help">Help</h2> + <h3 class="subheading">Common Features</h3> + <p>The following are the common expectations for <strong>ALL</strong> the + files provided in the form above: + <ul> + <li>The file <strong>MUST</strong> be character-separated values (CSV) + text file</li> + <li>The first row in the file <strong>MUST</strong> be a heading row, and + will be composed of the list identifiers for all of + samples/individuals/cases involved in your study.</li> + <li>The first column of data in the file <strong>MUST</strong> be the + identifiers for all of the phenotypes you wish to upload.</li> + </ul> + </p> + + <p>If you do not specify the separator character, then we will assume a + <strong>TAB</strong> character was used as your separator.</p> + + <p>We also assume you might include comments lines in your files. In that + case, if you do not specify what character denotes that a line in your files + is a comment line, we will assume the <strong>#</strong> character.<br /> + A comment <strong>MUST ALWAYS</strong> begin at the start of the line marked + with the comment character specified.</p> + + <h3 class="subheading" id="docs-file-metadata">File Metadata</h3> + <p>We request some details about your files to help us parse and process the + files correctly. The details we collect are:</p> + <dl> + <dt id="docs-file-separator">File separator</dt> + <dd>The files you provide should be character-separated value (CSV) files. + We need to know what character you used to separate the values in your + file. Some common ones are the Tab character, the comma, etc.<br /> + Providing that information makes it possible for the system to parse and + process your files correctly.<br> + <strong>NOTE:</strong> All the files you upload MUST use the same + separator.</dd> + + <dt id="docs-file-comment-character">Comment character</dt> + <dd>We support use of comment lines in your files. We only support one type + of comment style, the <em>line comment</em>.<br /> + This mean the comment begins at the start of the line, and the end of that + line indicates the end of that comment. If you have a really long comment, + then you need to break it across multiple lines, marking each line a + comment line.<br /> + The "comment character" is the character at the start of the line that + indicates that the line is a line comment.</dd> + + <dt id="docs-file-na">No-Value indicator(s)</dt> + <dd>Data in the real world is messy, and in some cases, entirely absent. You + need to indicate, in your files, that a particular field did not have a + value, and once you do that, you then need to let the system know how you + mark such fields. Common ways of indicating "empty values" are, leaving + the field blank, using a character such as '-', or using strings like + "NA", "N/A", "NULL", etc.<br /> + Providing this information will help with parsing and processing such + no-value fields the correct way.</dd> + </dl> + + <h3 class="subheading" id="docs-file-phenotype-description"> + file: Phenotypes Descriptions</h3> + <p>The data in this file is a matrix of <em>phenotypes × metadata-fields</em>. + Please note we use the term "metadata-fields" above loosely, due to lack of + a good word for this.</p> + <p>The file <strong>MUST</strong> have columns in this order: + <dl> + <dt>Phenotype Identifiers</dt> + <dd>These are the names/identifiers for your phenotypes. These + names/identifiers are the same ones you will have in all the other files you are + uploading.</dd> + + <dt>Descriptions</dt> + <dd>Each phenotype will need a description. Good description are necessary + to inform other people of what the data is about. Good description are + hard to construct, so we provide + <a href="https://info.genenetwork.org/faq.php#q-22" + title="How to write phenotype descriptions"> + advice on describing your phenotypes.</a></dd> + + <dt>Units</dt> + <dd>Each phenotype will need units for the measurements taken. If there are + none, then indicate the field is a no-value field.</dd> + </dl></p> + <p>You can add more columns after those three if you want to, but these 3 + <strong>MUST</strong> be present.</p> + <p>The file would, for example, look like the following:</p> + <code>id,description,units,…<br /> + pheno10001|Central nervous system, behavior, cognition; …|mg|…<br /> + pheno10002|Aging, metabolism, central nervous system: …|mg|…<br /> + ⋮<br /></code> + + <p><strong>Note 01</strong>: The first usable row is the heading row.</p> + <p><strong>Note 02: </strong>This example demonstrates a subtle issue that + could make your CSV file invalid — the choice of your field separator + character.<br > + In the example above, we use the pipe character (<code>|</code>) as our + field separator. This is because, if we follow the advice on how to write + good descriptions, then we cannot use the comma as our separator – if + we did, then our CSV file would be invalid because the system would have no + way to tell the difference between the comma as a field separator, and the + comma as a way to separate the "general category and ontology terms".</p> + + <h3 class="subheading">file: Phenotype Data, Standard Errors and/or Sample Counts</h3> + <span id="docs-phenotype-data"></span> + <span id="docs-phenotype-se"></span> + <span id="docs-phenotype-n"></span> + <p>The data is a matrix of <em>phenotypes × individuals</em>, e.g.</p> + <code> + # num-cases: 2549 + # num-phenos: 13 + id,IND001,IND002,IND003,IND004,…<br /> + pheno10001,61.400002,54.099998,483,49.799999,…<br /> + pheno10002,49,50.099998,403,45.5,…<br /> + pheno10003,62.5,53.299999,501,62.900002,…<br /> + pheno10004,53.099998,55.099998,403,NA,…<br /> + ⋮<br /></code> + + <p>where <code>IND001,IND002,IND003,IND004,…</code> are the + samples/individuals/cases in your study, and + <code>pheno10001,pheno10002,pheno10004,pheno10004,…</code> are the + identifiers for your phenotypes.</p> + <p>The lines beginning with the "<em>#</em>" symbol (i.e. + <code># num-cases: 2549</code> and <code># num-phenos: 13</code> are comment + lines and will be ignored</p> + <p>In this example, the comma (,) is used as the file separator.</p> +</div> + +{%endblock%} {%block more_javascript%} |