aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-09-27Narrow the exception and add commentsFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * Only catch the `FileExistsError` allowing any other exception to pass through. This tries to conform a little to the review at https://github.com/genenetwork/genenetwork3/pull/37#discussion_r714552696
2021-09-27Update terminology: `riset` to `group`Frederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * Update terminology to use the appropriate domain terminology according to Zachary's direction at https://github.com/genenetwork/genenetwork3/pull/37#issuecomment-926041744
2021-09-27Update terminology: `strain` to `sample`Frederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * Update the terminology used: use `sample` in place of `strain` according to Zachary's direction at https://github.com/genenetwork/genenetwork3/pull/37#issuecomment-926043306
2021-09-23Add missing dependencies causing pylint to failFrederick Muriuki Muriithi
* Add some dependencies used by the system that were missing in the test environment, leading to the pylint step failing.
2021-09-23Refactor: Move common sample data to separate fileFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * Move common sample test data into a separate file where it can be imported from, to prevent pylint error R0801 which proved tricky to silence in any other way.
2021-09-22Fix more pylint errorsFrederick Muriuki Muriithi
2021-09-22Fix typing issuesFrederick Muriuki Muriithi
* Ignore some errors * Update typing definitions for some portions of code * Add missing imports
2021-09-22Fix pylint errorsFrederick Muriuki Muriithi
* Add missing function and module docstrings * Remove unused imports * Fix import order * Rework some code sections to fix issues * Disable some pylint errors.
2021-09-22Update check: Heatmaps need at least 2 itemsFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * Update the check to look for at least 2 traits before trying to generate the heatmap.
2021-09-22Return serialized plotly figureFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * gn3/api/heatmaps.py: Serialize the figure to JSON * gn3/heatmaps.py: Return the figure object Serialize the Plotly figure into JSON, and return that, so that it can be used on the client to display the image.
2021-09-20Enable Cross-Origin Resource SharingFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * gn3/api/heatmaps.py: Fix bugs in data parsing * gn3/app.py: enable CORS * gn3/settings.py: add flask-cors configurations * guix.scm: Add flask-cors dependency For easier testing of the heatmaps generation feature, this commit activates the cross-origin resource sharing for all "localhost" origins.
2021-09-20Remove proof-of-concept test codeFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * Remove the proof-of-concept CLI-only module that was used to learn how the heatmaps work and identify the appropriate data for use with them.
2021-09-20Return only the dataFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * gn3/api/heatmaps.py: Parse incoming data to build up correct trait names and respond with only the computed heatmap data. * gn3/heatmaps.py: Return only the computed data for heatmaps and clustering. Since GN3 is supposed to handle only the data, and db-access, this commit ensures that GN3 responds to the client with only the computed heatmap data, and does not try to generate the heatmaps themselves. The generation of the heatmaps will be delegated to the UI clients, such as GeneNetwork2.
2021-09-17Fix a number of linting issuesFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi
2021-09-17Return path to generated filename for nowFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * To help with demonstrating that the code is producing the expected output, for now, we return the path to the generated html file that displays the interactive heatmap. At this point, it is mostly useful in the development environment. Moving forward, we might have to actually stream the raw html, or if we can get the Kaleido library packaged for GNU Guix, stream the images binary data instead.
2021-09-17Fix some layout issues and update colorscaleFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * Update the plot layouts and size to display the dendrogram and individual chromosome heatmaps side by side * Update the colour scale to begin with the grays rather than absolute black
2021-09-17Create dendrogram to show clustering treeFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * Provide the clustering data to be used for the creation of the clustering dendrogram in the final clustered heatmap plot.
2021-09-16Intergrate the heatmap generation with the APIFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * Intergrate the heatmap generation code on the /api/heatmaps/clustered endpoint. The endpoint should take a json query of the form: {"traits_names": [ ... ] } where the "traits_name" value is a list of the full names of traits. A sample query to the endpoint could be something like the following: curl -i -X POST "http://localhost:8080/api/heatmaps/clustered" \ -H "Accept: application/json" \ -H "Content-Type: application/json" \ -d '{ "traits_names": [ "UCLA_BXDBXH_CARTILAGE_V2::ILM103710672", "UCLA_BXDBXH_CARTILAGE_V2::ILM2260338", "UCLA_BXDBXH_CARTILAGE_V2::ILM3140576", "UCLA_BXDBXH_CARTILAGE_V2::ILM5670577", "UCLA_BXDBXH_CARTILAGE_V2::ILM2070121", "UCLA_BXDBXH_CARTILAGE_V2::ILM103990541", "UCLA_BXDBXH_CARTILAGE_V2::ILM1190722", "UCLA_BXDBXH_CARTILAGE_V2::ILM6590722", "UCLA_BXDBXH_CARTILAGE_V2::ILM4200064", "UCLA_BXDBXH_CARTILAGE_V2::ILM3140463" ] }' which should respond with a json response containing the raw binary string for the png format and possibly another for the svg format.
2021-09-16Fix minor bugsFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * Fix a few minor bugs left over from the integration of code from the proof-of-concept code.
2021-09-16Add missing importsFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * Add missing imports that are needed in the code.
2021-09-15Update entry-point function for heatmap generationFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * Copy over code from the proof-of-concept implementation and clean it up a little for the entry-point function for heatmap generation via the API
2021-09-15Generate heatmaps in a single plotFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * Add a function to generate the heatmaps for each chromosome into a single plot.
2021-09-15Process data into format usable by heatmapsFrederick Muriuki Muriithi
* gn3/heatmaps.py: implement `process_traits_data_for_heatmap` function, that will process the data into a form usable by heatmaps. * tests/unit/test_heatmaps.py: check that the function processes the data into the correct form.
2021-09-15Integrate get_lsr_from_chr functionFrederick Muriuki Muriithi
* gn3/heatmaps.py: copy over function * tests/unit/test_heatmaps.py: add tests Copy function over from proof of concept and add some tests to ensure it works as expected.
2021-09-15Reorganise modulesFrederick Muriuki Muriithi
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * The heatmap generation does not fall cleanly within the computations or db modules. This commit moves it to the higher level gn3 module.
2021-09-15Fix format of arguments and expected valuesFrederick Muriuki Muriithi
* tests/unit/computations/test_heatmap.py: ordering is not longer provided as a list of tuples; the ordering values are just a list of numbers now. This commit updates the test to take this into consideration. * tests/unit/computations/test_qtlreaper.py: the 'Chr' value if numeric, is represented by an actual number, not a string. This commit updates the code to take this into consideration.
2021-09-15Add missing sample file for testsFrederick Muriuki Muriithi
* tests/unit/db/data/genotypes/genotype_sample1.geno: new file Add a missing sample data file needed for unit tests.
2021-09-09Update proof-of-concept codeMuriithi Frederick Muriuki
* Add individual heatmaps * Add dendograms * Merge multiple heatmaps to single plot Updated the proof of concept code to provide a sample of what is needed to generate the appropriate heatmaps. To generate the sample heatmaps, one can run something like: env SQL_URI="mysql://webqtlout:webqtlout@127.0.0.1:3306/db_webqtl" \ python3 qtlfilesexport.py assuming that the database can be accessed at 127.0.0.1:3306
2021-09-08Ease search for traits and chromosomesMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * Return a dict of values rather than list for the traits and chromosomes to ease searching through the data.
2021-09-08Parse Chr value as int where possibleMuriithi Frederick Muriuki
* To ease sorting of data by numerical order down the line, sort the "Chr" values by numerical order.
2021-09-08Remove extraneous text to ease sortingMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * Change the id from 'T<n>' to simply '<n>' to ease sorting of the trait results by numerical order rather than string order.
2021-09-08Fix the traits order computations for clusteringMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * gn3/computations/heatmap.py: Fix ordering function * tests/unit/computations/test_heatmap.py: update test The order of the traits is important for the clustering algorithm, since the clustering seems to use the distance of one trait from another to determine how to order them. This commit also gets rid of the xoffset argument that is not important to the ordering, and was used in the older GN1 to determine how to draw the clustering lines.
2021-09-06Provide function to organise parsed QTLReaper resultsMuriithi Frederick Muriuki
* gn3/computations/qtlreaper.py: Provide a function to organise the results by trait for easier use down the line. * tests/unit/computations/test_qtlreaper.py: provide a test to ensure that the organising function works as expected.
2021-09-06Leave "Chr" value as string when parsingMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * The "Chr" value seems to be mostly a name of some sort, despite it being, seemingly an number. This commit parses the "Chr" value as a string. It also updates the tests to expec a string, rather than a number for "Chr" values.
2021-09-06Handle type-coercion exceptionsMuriithi Frederick Muriuki
* gn3/computations/qtlreaper.py: handle exceptions Sometimes, the values being parsed are plain strings and cannot be cast to the float types. This commit handles that by casting only those values that can be cast to float, and returning the others as strings.
2021-09-06Find nearest markerMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * Migrate the `web.webqtl.heatmap.Heatmap.getNearestMarker` function in GN1 to GN3.
2021-09-01Fix linting and typing issuesMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi
2021-09-01Built top-level genotype file parsing functionMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * gn3/db/genotypes.py: parse genotype files * tests/unit/db/test_genotypes.py: test parsing is correct Add the overall genotype files parsing function and tests to check that the parsing works as expected.
2021-09-01Parse data lines into markersMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * gn3/db/genotypes.py: parse data lines in file to genetic markers. * tests/unit/db/test_genotypes.py: test that parsing works. Add some tests to check that the parsing of the markers works as expected, and add the code to actually parse the markers.
2021-09-01Parse the genotype file's data headerMuriithi Frederick Muriuki
* gn3/db/genotypes.py: parse data header * tests/unit/db/test_genotypes.py: check that header's parse works correctly. Add tests to check that the parser works as expected. Add code to implement the parsing and pass the tests.
2021-09-01Implement parsing of genotype labelsMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * gn3/db/genotypes.py: parse genotype labels * tests/unit/db/test_genotypes.py: test that genotype labels are parsed correctly As part of parsing the genotype files into usable python data structures, this commit adds a function to parse the label lines (beginning with "@") into the appropriate values.
2021-08-31Fix linting errors, minor bugs and reorganise codeMuriithi Frederick Muriuki
* Fix some linting errors and some minor bugs caught by the linter. Move the `random_string` function to separate module for use in multiple places in the code.
2021-08-31Update `heatmap_data` function: remove extraneous dataMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * gn3/computations/heatmap.py: update function * gn3/db/traits.py: new function Remove extraneous data and arguments from the function. - Load the genotype file - Generate traits file - Provide both raw traits data, and exported traits data in return
2021-08-31Fix testMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * The number of the arguments to the function changed, and so the tests for the function needed to be updated.
2021-08-31Parse QTLReaper outputsMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * gn3/computations/qtlreaper.py: pass output files * tests/unit/computations/data/qtlreaper/main_output_sample.txt: sample test data * tests/unit/computations/data/qtlreaper/permu_output_sample.txt: sample test data * tests/unit/computations/test_qtlreaper.py: add tests Add code to parse the QTLReaper output data files.
2021-08-31Merge branch 'main' of github.com:genenetwork/genenetwork3 into ↵Muriithi Frederick Muriuki
heatmap_generation
2021-08-31Fix bugs with `run_reaper` functionMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * gn3/computations/qtlreaper.py: Fix some bugs * qtlfilesexport.py: Test out running rust-qtlreaper Test out the qtlreaper interface code and fix some bugs caught in the process.
2021-08-31Provide utilities for genotype filesMuriithi Frederick Muriuki
Issue: https://github.com/genenetwork/gn-gemtext-threads/blob/main/topics/gn1-migration-to-gn2/clustering.gmi * gn3/db/genotypes.py: New module * gn3/settings.py: Add new configuration variable * qtlfilesexport.py: Test out new code Add a module containing functions dealing with the genotype files. Add a configuration variable to point to the location of the genotype files.
2021-08-30Update documentation on genotype filesMuriithi Frederick Muriuki
* Provide documentation on downloading and using the genotype files.
2021-08-30Fix issues with traits file formatMuriithi Frederick Muriuki
* README.md: update header: Traits ==> Trait * gn3/computations/qtlreaper.py: update header: Traits ==> Trait * qtlfilesexport.py: Choose only BXD strains Rename the first column header from "Traits" to "Trait" to correspond with what `rust-qtlreaper` expects. Choose only the BXD strains for the proof-of-concept example - this helped bring out the fact that the traits file SHOULD NOT contain a strain column for a strain that does not exist in the genotype file in consideration. If the traits file has a strain column which does not exist in the genotype file, then `rust-qtlreaper` fails with a panic, since, from what I can tell, it tries to get a value from the genotype file for the non-existent strain, which results to a `None` type. Subsequent attempts at running an operation on the `None` type lead to the panic.