author: Frederick Muriuki Muriithi, 2024-12-13 13:57:37 -0600
committer: Frederick Muriuki Muriithi, 2024-12-13 14:44:03 -0600
commit: f9c2780c228a8f8e0290f758e19ea6985be9883e
tree: e5ad9f18f91f2ad2baa57b6c3a69fcb0e11d3fab
parent: 903fb847a03f9d3d814065349eca08e3bdc24865
download: gn-uploader-f9c2780c228a8f8e0290f758e19ea6985be9883e.tar.gz
Add page documentation.
-rw-r--r-- uploader/templates/phenotypes/add-phenotypes-raw-files.html | 148
1 files changed, 143 insertions, 5 deletions
diff --git a/uploader/templates/phenotypes/add-phenotypes-raw-files.html b/uploader/templates/phenotypes/add-phenotypes-raw-files.html
index 612bff7..a39ace8 100644
--- a/uploader/templates/phenotypes/add-phenotypes-raw-files.html
+++ b/uploader/templates/phenotypes/add-phenotypes-raw-files.html
@@ -43,7 +43,11 @@
<span class="form-text text-muted">
Provide the character that separates the fields in your file(s). It should
be the same character for all files (if more than one is provided).<br />
- A tab character will be assumed if you leave this field blank.</span>
+ A tab character will be assumed if you leave this field blank. See
+ <a href="#docs-file-separator"
+ title="Documentation for file-separator characters">
+ documentation for more information</a>.
+ </span>
</div>
<div class="form-group">
@@ -62,7 +66,11 @@
</div>
<span class="form-text text-muted">
This specifies that lines that begin with the character provided will be
- considered comment lines and ignored in their entirety.</span>
+ considered comment lines and ignored in their entirety. See
+ <a href="#docs-file-comment-character"
+ title="Documentation for comment characters">
+ documentation for more information</a>.
+ </span>
</div>
<div class="form-group">
@@ -81,7 +89,9 @@
This specifies the strings in your file that indicate that there is no value for a
particular cell (a cell is where a column and row intersect). Provide a
space-separated list of strings if you have more than one way of
- indicating no values.</span>
+ indicating no values. See
+ <a href="#docs-file-na" title="Documentation for no-value fields">
+ documentation for more information</a>.</span>
</div>
</fieldset>
@@ -112,10 +122,11 @@
required="required" />
<span class="form-text text-muted">
Provide a file that contains only the phenotype data. See
- <a href="#docs-file-example"
+ <a href="#docs-phenotype-data"
title="Documentation of the phenotype data file format.">
the documentation for the expected format of the file</a>.</span>
</div>
+
{%if population.Family in families_with_se_and_n%}
<div class="form-group">
<label for="finput-phenotype-se" class="form-label">Phenotype: Standard Errors</label>
@@ -146,7 +157,134 @@
{%block page_documentation%}
-page documentation goes here!!!
+<div class="row">
+ <h2 class="heading" id="docs-help">Help</h2>
+ <h3 class="subheading">Common Features</h3>
+ <p>The following are the common expectations for <strong>ALL</strong> the
+ files provided in the form above:
+ <ul>
+ <li>The file <strong>MUST</strong> be a character-separated values (CSV)
+ text file.</li>
+ <li>The first row in the file <strong>MUST</strong> be a heading row, and
+ will be composed of the list of identifiers for all of the
+ samples/individuals/cases involved in your study.</li>
+ <li>The first column of data in the file <strong>MUST</strong> be the
+ identifiers for all of the phenotypes you wish to upload.</li>
+ </ul>
+ </p>
+
+ <p>If you do not specify the separator character, then we will assume a
+ <strong>TAB</strong> character was used as your separator.</p>
+
+ <p>We also assume you might include comment lines in your files. In that
+ case, if you do not specify what character denotes that a line in your files
+ is a comment line, we will assume the <strong>#</strong> character.<br />
+ A comment <strong>MUST ALWAYS</strong> begin at the start of the line marked
+ with the comment character specified.</p>
+
+ <h3 class="subheading" id="docs-file-metadata">File Metadata</h3>
+ <p>We request some details about your files to help us parse and process the
+ files correctly. The details we collect are:</p>
+ <dl>
+ <dt id="docs-file-separator">File separator</dt>
+ <dd>The files you provide should be character-separated value (CSV) files.
+ We need to know what character you used to separate the values in your
+ file. Some common ones are the Tab character, the comma, etc.<br />
+ Providing that information makes it possible for the system to parse and
+ process your files correctly.<br />
+ <strong>NOTE:</strong> All the files you upload MUST use the same
+ separator.</dd>
+
+ <dt id="docs-file-comment-character">Comment character</dt>
+ <dd>We support use of comment lines in your files. We only support one type
+ of comment style, the <em>line comment</em>.<br />
+ This means the comment begins at the start of the line, and the end of that
+ line indicates the end of that comment. If you have a really long comment,
+ then you need to break it across multiple lines, marking each line as a
+ comment line.<br />
+ The "comment character" is the character at the start of the line that
+ indicates that the line is a line comment.</dd>
+
+ <dt id="docs-file-na">No-Value indicator(s)</dt>
+ <dd>Data in the real world is messy, and in some cases, entirely absent. You
+ need to indicate, in your files, that a particular field did not have a
+ value, and once you do that, you then need to let the system know how you
+ mark such fields. Common ways of indicating "empty values" are: leaving
+ the field blank, using a character such as '-', or using strings like
+ "NA", "N/A", "NULL", etc.<br />
+ Providing this information will help with parsing and processing such
+ no-value fields the correct way.</dd>
+ </dl>
+
+ <h3 class="subheading" id="docs-file-phenotype-description">
+ file: Phenotypes Descriptions</h3>
+ <p>The data in this file is a matrix of <em>phenotypes × metadata-fields</em>.
+ Please note we use the term "metadata-fields" above loosely, due to lack of
+ a good word for this.</p>
+ <p>The file <strong>MUST</strong> have columns in this order:
+ <dl>
+ <dt>Phenotype Identifiers</dt>
+ <dd>These are the names/identifiers for your phenotypes. These
+ names/identifiers are the same ones you will have in all the other files you are
+ uploading.</dd>
+
+ <dt>Descriptions</dt>
+ <dd>Each phenotype will need a description. Good descriptions are necessary
+ to inform other people of what the data is about. Good descriptions are
+ hard to construct, so we provide
+ <a href="https://info.genenetwork.org/faq.php#q-22"
+ title="How to write phenotype descriptions">
+ advice on describing your phenotypes.</a></dd>
+
+ <dt>Units</dt>
+ <dd>Each phenotype will need units for the measurements taken. If there are
+ none, then indicate the field is a no-value field.</dd>
+ </dl></p>
+ <p>You can add more columns after those three if you want to, but these 3
+ <strong>MUST</strong> be present.</p>
+ <p>The file would, for example, look like the following:</p>
+ <code>id|description|units|…<br />
+ pheno10001|Central nervous system, behavior, cognition; …|mg|…<br />
+ pheno10002|Aging, metabolism, central nervous system: …|mg|…<br />
+ ⋮<br /></code>
+
+ <p><strong>Note 01</strong>: The first usable row is the heading row.</p>
+ <p><strong>Note 02: </strong>This example demonstrates a subtle issue that
+ could make your CSV file invalid &mdash; the choice of your field separator
+ character.<br />
+ In the example above, we use the pipe character (<code>|</code>) as our
+ field separator. This is because, if we follow the advice on how to write
+ good descriptions, then we cannot use the comma as our separator &ndash; if
+ we did, then our CSV file would be invalid because the system would have no
+ way to tell the difference between the comma as a field separator, and the
+ comma as a way to separate the "general category and ontology terms".</p>
+
+ <h3 class="subheading">file: Phenotype Data, Standard Errors and/or Sample Counts</h3>
+ <span id="docs-phenotype-data"></span>
+ <span id="docs-phenotype-se"></span>
+ <span id="docs-phenotype-n"></span>
+ <p>The data is a matrix of <em>phenotypes × individuals</em>, e.g.</p>
+ <code>
+ # num-cases: 2549<br />
+ # num-phenos: 13<br />
+ id,IND001,IND002,IND003,IND004,…<br />
+ pheno10001,61.400002,54.099998,483,49.799999,…<br />
+ pheno10002,49,50.099998,403,45.5,…<br />
+ pheno10003,62.5,53.299999,501,62.900002,…<br />
+ pheno10004,53.099998,55.099998,403,NA,…<br />
+ ⋮<br /></code>
+
+ <p>where <code>IND001,IND002,IND003,IND004,…</code> are the
+ samples/individuals/cases in your study, and
+ <code>pheno10001,pheno10002,pheno10003,pheno10004,…</code> are the
+ identifiers for your phenotypes.</p>
+ <p>The lines beginning with the "<em>#</em>" symbol (i.e.
+ <code># num-cases: 2549</code> and <code># num-phenos: 13</code>) are comment
+ lines and will be ignored.</p>
+ <p>In this example, the comma (,) is used as the file separator.</p>
+</div>
+
+{%endblock%}
{%block more_javascript%}
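
The parsing rules documented in this template (a single separator character, line comments that begin at the start of a line, and a list of no-value strings) could be applied roughly as in the sketch below. It is an illustration only and is not taken from the gn-uploader code: the function name `parse_phenotype_matrix` and its parameters are hypothetical, and it assumes that every cell that is not a no-value string is numeric.

```python
import csv
import io


def parse_phenotype_matrix(text, separator="\t", comment_char="#",
                           na_strings=("NA",)):
    """Hypothetical helper: parse a phenotypes x individuals matrix.

    Returns (sample_ids, data), where data maps each phenotype identifier
    to a list of floats, with None standing in for no-value cells.
    """
    # A comment must begin at the start of the line, so only the first
    # character(s) are checked; blank lines are skipped as well.
    lines = [
        line for line in text.splitlines()
        if line.strip() and not line.startswith(comment_char)
    ]
    reader = csv.reader(io.StringIO("\n".join(lines)), delimiter=separator)

    # The first usable row is the heading row: an id column followed by
    # the sample/individual/case identifiers.
    heading = next(reader)
    sample_ids = heading[1:]

    data = {}
    for row in reader:
        pheno_id, cells = row[0], row[1:]
        # Assumption: any cell not listed in na_strings parses as a float.
        data[pheno_id] = [
            None if cell.strip() in na_strings else float(cell)
            for cell in cells
        ]
    return sample_ids, data


example = (
    "# num-cases: 2549\n"
    "# num-phenos: 13\n"
    "id,IND001,IND002,IND003,IND004\n"
    "pheno10004,53.099998,55.099998,403,NA\n"
)
sample_ids, data = parse_phenotype_matrix(example, separator=",")
```

The defaults mirror the page text: a tab separator and `#` as the comment character when nothing is specified. The default no-value string `NA` is an assumption made for this sketch.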