- github Document reduction issue
The small test database (2GB)
The default install comes with a smaller database which includes a number of the BSD's and the Human liver dataset (GSE9588).
GeneNetwork database
Estimated table sizes
select tablename,round(((datalength + indexlength) / 1024 / 1024), 2) `Size in MB` from informationschema.TABLES where tableschema = "dbwebqtl" order by datalength;
table_name | Size in MB |
ProbeSetData SnpAll ProbeData SnpPattern ProbeSetSE QuickSearch ProbeSetXRef LCorrRamin3 ProbeSE ProbeSet Probe GenoData CeleraINFO_mm6 pubmedsearch ProbeXRef GeneRIF_BASIC BXDSnpPosition EnsemblProbe EnsemblProbeLocation Genbank TissueProbeSetData AccessLog GeneList Geno MachineAccessLog IndelAll PublishData TissueProbeSetXRef ProbeH2 GenoXRef TempData GeneList_rn3 GORef Phenotype temporary InfoFiles Publication Homologene Datasets GeneList_rn33 PublishSE GeneRIF Vlookup H2 PublishXRef NStrain IndelXRef Strain GeneMap_cuiyan user_collection CaseAttributeXRef StrainXRef GeneIDXRef Docs News ProbeSetFreeze GeneRIFXRef Sample login user TableFieldAnnotation DatasetMapInvestigator User ProbeFreeze TableComments Investigators DBList Tissue GeneChip GeneCategory SampleXRef InbredSet SnpAllele_to_be_deleted Organizations PublishFreeze GenoFreeze Chr_Length SnpSource AvgMethod Species Dataset_mbat TissueProbeFreeze EnsemblChip TissueProbeSetFreeze UserPrivilege CaseAttribute MappingMethod DBType InfoFilesUser_md5 GenoCode DatasetStatus GeneChipEnsemblXRef GenoSE user_openids roles_users role Temp |
59358.80 15484.67 22405.44 9177.05 14551.02 5972.86 4532.89 18506.53 6263.83 2880.21 2150.30 3291.91 989.80 1032.50 743.38 448.54 224.44 133.66 105.49 37.71 74.42 42.38 34.11 33.90 28.34 22.42 22.54 14.73 13.26 22.83 8.35 5.54 4.97 6.50 3.59 3.32 3.42 5.69 2.31 2.61 4.71 2.18 1.87 2.18 2.18 4.80 2.91 1.07 0.51 0.30 0.44 0.56 0.77 0.17 0.17 0.22 0.24 0.06 0.06 0.04 0.05 0.05 0.04 0.06 0.02 0.02 0.03 0.02 0.01 0.01 0.01 0.01 0.00 0.01 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 NULL |
97 rows in set, 1 warning (0.01 sec)
All *Data tables are large
User access
According to the meta data:
This table tracks access time and IP addresses. Used for logging in registered users and tracking cookies.
select * from AccessLog limit 5;
id | accesstime | ip_address |
12174 12173 3 4 5 |
2003-10-28 02:17:41 2003-10-28 02:16:27 2003-02-22 07:38:33 2003-02-22 07:49:13 2003-02-22 07:51:08 |
130.120.104.71 130.120.104.71 192.117.159.1 192.117.159.1 192.117.159.1 |
select * from AccessLog order by accesstime desc limit 5;
id | accesstime | ip_address |
1025735 1025734 1025733 1025732 1025731 |
2016-02-08 14:23:29 2016-02-08 13:54:28 2016-02-08 13:43:37 2016-02-08 13:39:50 2016-02-08 13:15:46 |
100.43.81.157 180.76.15.144 66.249.65.217 66.249.65.217 66.249.65.217 |
Quite a few trait page hits:
select count(*) from AccessLog;
count(*) |
1025685 |
show indexes from AccessLog;
Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
AccessLog | 0 | PRIMARY | 1 | id | A | 1025685 | NULL | NULL | BTREE |
This table is being used by both GN1 and GN2 from the trait pages!
grep -ir AccessLog *|grep -e "^gn1\|^gn2"|grep \.py|grep -v doc
gn1/web/webqtl/showTrait/ShowTraitPage.py: query = "SELECT count(id) FROM AccessLog WHERE ipaddress = %s and \ gn1/web/webqtl/showTrait/ShowTraitPage.py: self.cursor.execute("insert into AccessLog(accesstime,ipaddress) values(Now(),%s)" ,userip) gn1/web/webqtl/textUI/cmdClass.py: query = """SELECT count(id) FROM AccessLog WHERE ipaddress = %s AND UNIXTIMESTAMP()-UNIXTIMESTAMP(accesstime)<86400""" gn1/web/webqtl/textUI/cmdClass.py: query = """INSERT INTO AccessLog(accesstime,ipaddress) values(Now(),%s)""" gn2/wqflask/wqflask/showtrait/showtraitpage.py: query = "SELECT count(id) FROM AccessLog WHERE ipaddress = %s and \ gn2/wqflask/wqflask/showtrait/showtraitpage.py: self.cursor.execute("insert into AccessLog(accesstime,ipaddress) values(Now(),%s)", userip)
When looking at the code in GN1 and GN2 it restricts the daily use of the trait data page (set to 1,000 - whoever reaches that?). Unlike mentioned in the schema description, this table does not keep track of cookies.
From the code it looks like GN2 uses a mixture of Redis and sqlalchemy to keep track of logged in sessions (see gn2/wqflask/wqflask/usermanager.py) and cookies through a useruuid in model.py.
In gn2/wqflask/wqflask/templates/collections/viewanonymous.html it showtraitpage appears to be loaded (need to check).
AvgMethod
Probesetfreeze refers to AvgMethod
BXDSnPosition
Snp table (all snps)
Mapping in GN1 shows snps when you select a chromosome.
CaseAttribute(XRef)
Metadata
CeleralINFOmm6
?
ChrLength
Default mm9, column for mm8
Datasetmbat
Menu for BXD (linkouts)
DatasetMapInvestigator
Arthur?
DataSets
Information/metadata
DatasetStatus
Arthur private/public
DBList and DBType
Hooked in API (URL encoding)
Docs
GN2 only (see menu bar)
Ensembl*
Probe information
(will be deprecated)
Genbank
Linkout and not important
GeneCategory
Not important. GeneWiki notes function classification.
Deprecate.
GeneChip
GeneIDXRef
Interspecies gene comparison
GeneList
Track info
Genlistrn3(3)
Rat list
GeneMapcuiyan
Link outs
GeneRIF
Wiki info (nightly updated from NCBI)
XRef should be foreign keys
Geno
SNP or marker info
GenoCode
Belongs to someone else
GenoData
Allele info
GenoFreeze
Big menu (Freeze refers to menu)
GenoSE
SE standard err, not used
GenoXREF
Very important. Key links between Geno, GenoData
GORef
GO terms
H2
Heritability for probeset(?)
Homologene
Homology, not used much
InbredSet
Group in menu
Indelall, SnpAll, SnpPattern, SnpSource
Indel Snp browser (variant browser Gn1)
Info*
Infra system PhP
Data Info button
Infosystem users has separate entries
Also Investigators, User, Organizations,
LCorrRamin3
Lit. Correlations Prof. Ramin
Login
GN2 login info
MachineAccessLog
Old
MappingMethod
GN1
News
GN2
NStrain
pheno publishfreeze (menu) xref (keys) xref links to publish (pubmed), phenotype, pubishdata geno genofreeze xref (keys) xref links to publish (pubmed), genotype, genodata probeset/expr. probesetfreeze xref (keys) xref links to publish (pubmed), probeset, probesetdata probe/expr. probefreeze xref (keys) xref links to publish (pubmed), probe, probedata
Each dataset has 3 values (real value (1), number of samples (2), stderr (3))
NStrain = number of phenotype samples
ProbesetFreeze contains all data, incl. metabolomic.
Phenotype
This table contains names, full descriptions, and short symbols for traits and phenotype used primarily in the Published Phenotypes databases.
Contains 10k rows, March 2016, of which 5000 are for the BXDs).
Id | Prepublicationdescription | Postpublicationdescription | Originaldescription | Units | Prepublicationabbreviation | Postpublicationabbreviation | Labcode | Submitter | Owner | AuthorizedUsers |
1 2 3 4 5 |
NULL NULL NULL NULL NULL |
Hippocampus weight Cerebellum weight Interleukin 1 activity by peritoneal macrophages stimulated with 10 ug/ml lipopolysaccharide [units/100 ug protein] Central nervous system, morphology: Cerebellum weight, whole, bilateral in adults of both sexes [mg] The coat color of 79 BXD RI strain |
Original post publication description: Hippocampus weight Original post publication description: Cerebellum weight Original post publication description: Interleukin 1 activity by peritoneal macrophages stimulated with 10 ug/ml lipopolysaccharide [units/100 ug protein] Original post publication description: Cerebellum weight [mg] Original post publication description: The coat color of 79 BXD RI strain |
Unknown mg units/100 ug protein mg Unknown |
NULL NULL NULL NULL NULL |
HPCWT CBLWT IL1Activity CBLWT2 CoatColor |
NULL NULL NULL NULL NULL |
robwilliams robwilliams robwilliams robwilliams robwilliams |
NULL NULL NULL NULL NULL |
robwilliams robwilliams robwilliams robwilliams robwilliams |
5 rows in set (0.00 sec)
ProbeData
Table with fine-grained probe level Affymetrix data only. Contains 1 billion rows March 2016. This table may be deletable since it is only used by the Probe Table display in GN1. Not used in GN2 (double-check).