Issues: (phenotype-correlation-error) Add more information

* Add notes on failures against the "BXD Genotypes" dataset
* Add some reflections
author: Frederick Muriuki Muriithi 2022-10-04 11:56:55 +0300
committer: Frederick Muriuki Muriithi 2022-10-04 11:56:55 +0300
commit: fafce2504a5f716daa42c7c25e8dcdca2cec2c7a (patch)
tree: df70423e9f39d232cf45fe8c97030f6dd51bf825
parent: 92b92785405f0cf7c6d88e1a5dbbe71e4d95632d (diff)
download: gn-gemtext-fafce2504a5f716daa42c7c25e8dcdca2cec2c7a.tar.gz
1 files changed, 13 insertions, 3 deletions
diff --git a/issues/phenotype-correlation-error.gmi b/issues/phenotype-correlation-error.gmi
index 4d34dec..8b37975 100644
--- a/issues/phenotype-correlation-error.gmi
+++ b/issues/phenotype-correlation-error.gmi
@@ -41,12 +41,14 @@ KeyError: "'1' information is not found in the database."
 
 The error above was caused by processing the data for output way too early. This has been fixed.
 
-## Tissue Correlation: Probeset Trait Against Publish Dataset
+## Tissue Correlation: Probeset Trait Against Publish/Genotype Dataset
 
 Running "Tissue" correlations on
 => https://genenetwork.org/show_trait?trait_id=1442370_at&dataset=HC_M2_0606_P
 against the "BXD Published Phenotypes" database fails with the error:
 
+This also fails if you run it against the "BXD Genotypes" dataset.
+
 ```
 Traceback (most recent call last):
   File "/usr/local/guix-profiles/gn-latest-20220820/lib/python3.9/site-packages/flask/app.py", line 1523, in full_dispatch_request
@@ -71,7 +73,7 @@ so far, triangulated the issue to possibly being the fact that the "target_datas
 
 @zsloan and @alexm: any ideas?
 
-## Literature Correlation: ...
+## Literature Correlation: Probeset Trait Against Publish/Genotype Dataset
 
 Run literature correlation for
 => http://localhost:5033/show_trait?trait_id=1442370_at&dataset=HC_M2_0606_P this trait
@@ -101,7 +103,9 @@ AttributeError: 'PhenotypeDataSet' object has no attribute 'retrieve_genes'
 
 The literature correlations computation calls the `retrieve_genes` method, which is only present in the `base.data_set.mrnaassaydataset.MrnaAssayDataSet` class, which handles traits of type "ProbeSet".
 
-==================
+The code seems to imply that we should not run literature correlations against any dataset that is not of type "ProbeSet".
+
+## Some Reflections
 
 In my (fredm) work on partial correlations, before doing the computations,
 => https://github.com/genenetwork/genenetwork3/blob/ff34aee0f39c2e91db243461d7d67405e7aea0e3/gn3/computations/partial_correlations.py#L704-L750 there were error checks
@@ -109,6 +113,12 @@ that were run.
 
 Should these be present for the full correlations too?
 
+The failures above with the Publish/Genotype datasets imply one of two things:
+* The code is not general enough, or
+* We need to handle the exceptions, and present the selection errors as appropriate.
+
+Better yet, we should probably not present invalid data to the user, i.e. do not present the user with a dataset that would lead to errors if a correlation of a particular type is run against it with the given trait.
+
 ## Tags
 * assigned: alexm, fredm, zsloan
 * type: bug
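
The idea in the reflections, of only offering datasets that a given correlation type can actually run against, could be sketched roughly as below. This is a minimal illustration, not code from GeneNetwork: `SUPPORTED_TARGET_TYPES`, `valid_target_datasets` and the dataset dictionaries are all hypothetical. The "lit" entry follows from `retrieve_genes` only existing for "ProbeSet" datasets. The "tissue" and "sample" entries are assumptions, since the notes only report that tissue correlations fail against the Publish and Genotype datasets.

```python
# Hypothetical sketch: none of these names come from the GeneNetwork code base.

# Target dataset types each correlation type is assumed to support.
# "lit" needs `retrieve_genes`, which the notes say only exists for "ProbeSet".
# The "tissue" and "sample" entries are assumptions made for illustration.
SUPPORTED_TARGET_TYPES = {
    "sample": {"ProbeSet", "Publish", "Geno"},
    "tissue": {"ProbeSet"},
    "lit": {"ProbeSet"},
}


def valid_target_datasets(correlation_type, datasets):
    """Return only the datasets the given correlation type can run against."""
    allowed = SUPPORTED_TARGET_TYPES.get(correlation_type, set())
    return [ds for ds in datasets if ds["type"] in allowed]


# Illustrative dataset records; the names are placeholders.
datasets = [
    {"name": "probeset-dataset", "type": "ProbeSet"},
    {"name": "publish-dataset", "type": "Publish"},
    {"name": "genotype-dataset", "type": "Geno"},
]

# Only the ProbeSet dataset would be offered for literature correlations.
lit_targets = valid_target_datasets("lit", datasets)
```

The same table could also back a server-side check that rejects an invalid selection with a clear message, in the spirit of the partial correlations error checks, rather than letting the request fail with an `AttributeError`.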