From 4734eff0a9a517c84cb36d28abda2adb65ba3219 Mon Sep 17 00:00:00 2001
From: rupertoverall
Date: Mon, 26 Jun 2023 22:51:40 +0200
Subject: Add idea of data versions.

---
 api/alternative-API-structure.md | 37 ++++++++++++++++++++++++-------------
 1 file changed, 24 insertions(+), 13 deletions(-)

(limited to 'api/alternative-API-structure.md')

diff --git a/api/alternative-API-structure.md b/api/alternative-API-structure.md
index 208ec0e..fc97698 100644
--- a/api/alternative-API-structure.md
+++ b/api/alternative-API-structure.md
@@ -23,13 +23,23 @@
 
 - There may be several alternative genotypes for a species (e.g. mm9, mm10 or GRCm39 for mouse).
 - These may also be served in different formats. The content and the format are conceptually different, so they should be handled as such.
-- `http://genenetwork.org/api/v_pre1/genotypes/bimbam/BXD` Is wrong.
-- `http://genenetwork.org/api/v_pre1/genotypes/BXD.bimbam` Is better.
-- `http://genenetwork.org/api/v_pre1/mouse/BXD/genotypes/mm10.bimbam` [**! not a real URL**] Is ideal.
-
+ - `http://genenetwork.org/api/v_pre1/genotypes/bimbam/BXD` Is wrong.
+ - `http://genenetwork.org/api/v_pre1/genotypes/BXD.bimbam` Is better.
+ - `http://genenetwork.org/api/v_pre1/mouse/BXD/genotypes/mm10.bimbam` [**! not a real URL**] Is ideal.
+- The genotypes are conceptually at the same level as datasets, and could be seen as a special type of dataset.
+
 ### Datasets
 - Again, why are these not nested under 'population'?
 
+### Data versions (?)
+Perhaps another hierarchy level below dataset is needed to accommodate versions/releases.
+- For microarray experiments, this would be the processing variant (e.g. PDNN/RMA).
+- For genotypes, this would be the genome build.
+Each version would be associated with different raw sample data.
+
+Currently, all versions have unique IDs so they can be (and are) accessed as distinct datasets.
+There may be no advantage to making this level of the hierarchy explicit.
+
 ### Sample data
 - Surely should be nested under dataset? Does this need to be called 'sample_data" rather than just 'data'?
 
@@ -53,15 +63,16 @@ The user should be able to retrieve some sample-level data in one deep query and
 - genotypes
 - dataset
 - dataset_info
- trait
- trait_info
- trait_data
- trait_qtl
- whole-dataset_data_matrix
- whole-dataset_qtl_matrix
-
-### URL construction
-It should be unnecessary to include the class terms 'species', 'genotypes', 'datasets' etc. in the URL.
+ - versions (?)
+ - trait
+ - trait_info
+ - trait_data
+ - trait_qtl
+ - whole-dataset_data_matrix
+ - whole-dataset_qtl_matrix
+
+## URL construction
+It should be unnecessary to include the class terms 'species', 'populations', 'datasets' etc. in the URL.
 These levels should be implicit in the nesting structure.
 If used, these can refer to a listing of all available options.
 
--
cgit v1.2.3
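
The "ideal" genotype URL proposed in this patch puts the content (the genome build) in the nested path and the serving format in the file extension. The Python sketch below shows one way such a URL could be assembled and split again. It is not part of the GeneNetwork code base: the function names `genotype_url` and `split_content_and_format` are hypothetical, the base URL is copied from the example URLs in the patch, and the patch itself flags the nested URL as not real.

```python
from urllib.parse import quote

# Base taken from the example URLs in the patch; treat it as an assumption.
BASE = "http://genenetwork.org/api/v_pre1"


def genotype_url(species, population, build, fmt, base=BASE):
    """Build a nested genotype URL (hypothetical helper).

    The species and population levels are implicit in the nesting,
    the genome build names the content, and the extension names the format.
    """
    path = "/".join(quote(part, safe="") for part in (species, population, "genotypes"))
    return f"{base}/{path}/{quote(build, safe='')}.{fmt}"


def split_content_and_format(resource):
    """Split the last path segment into (content, format) (hypothetical helper).

    Returns (resource, None) when no extension is present. A build name
    that itself contains a dot would need a stricter rule than this.
    """
    content, dot, fmt = resource.rpartition(".")
    if not dot:
        return resource, None
    return content, fmt
```

For example, `genotype_url("mouse", "BXD", "mm10", "bimbam")` reproduces the nested URL shown in the patch, and `split_content_and_format("mm10.bimbam")` separates the build from the format so the two can be handled independently.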