path: root/scripts/index-genenetwork
Age | Commit message | Author
2024-06-12 | Add method to check the validity of the tables+RDF checksums. | Munyoki Kilyungi
* scripts/index-genenetwork (verify_checksums): New function.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2024-06-12 | Generate a SHA256 checksum for the generif graph. | Munyoki Kilyungi
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
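The commit message does not say how the graph is serialized before hashing. As a minimal sketch, assuming the graph is available as (subject, predicate, object) string triples, an order-independent SHA256 digest could be computed as follows. The function name `graph_checksum` and the tab/newline serialization are illustrative assumptions, not taken from the script.

```python
import hashlib
from typing import Iterable, Tuple

Triple = Tuple[str, str, str]


def graph_checksum(triples: Iterable[Triple]) -> str:
    """Return a SHA256 hex digest that does not depend on triple order.

    Hypothetical helper: the real script may hash a different
    serialization of the generif graph.
    """
    digest = hashlib.sha256()
    # Sort first so that the same set of triples always yields the same digest.
    for triple in sorted(triples):
        digest.update("\t".join(triple).encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()
```

Sorting before hashing means the checksum changes only when the content of the graph changes, not when a query happens to return rows in a different order.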
2024-06-01 | Use global cache to store generif metadata. | Munyoki Kilyungi
This global cache has 3,528 entries and there's no expectation for it to grow significantly. Since child processes inherit the parent’s memory, we can pass the global cache to them, reducing fetch times from 0.001s to 0.00001s, significantly boosting performance when indexing the entire database and enriching results with RDF metadata.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
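The inheritance described in this commit can be illustrated with a small sketch. It assumes a POSIX system where the `fork` start method is available; the names `CACHE`, `build_cache`, `worker` and `lookup_in_child` are hypothetical and the cache contents are placeholders, not the real generif metadata.

```python
import multiprocessing
from typing import Dict, List

# Hypothetical stand-in for the global generif metadata cache.
CACHE: Dict[str, str] = {}


def build_cache() -> None:
    """Populate the module-level cache once, in the parent process."""
    CACHE.update({f"gene{i}": f"comment {i}" for i in range(5)})


def worker(keys: List[str], results) -> None:
    """Read from CACHE, which the forked child inherited from the parent."""
    results.put([CACHE[key] for key in keys])


def lookup_in_child(keys: List[str]) -> List[str]:
    """Fork a child process and let it answer lookups from the inherited cache."""
    ctx = multiprocessing.get_context("fork")  # assumption: POSIX only
    results = ctx.SimpleQueue()
    child = ctx.Process(target=worker, args=(keys, results))
    child.start()
    values = results.get()  # read before join so the child can finish writing
    child.join()
    return values
```

The cache must be filled before the child is forked. Entries added in the parent afterwards are not visible to children that already exist.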
2024-06-01 | Add geneRIF to gene index. | Munyoki Kilyungi
* scripts/index-genenetwork: Import Template, lru_cache, SPARQLWrapper, JSON.
(get_rif_metadata): New function.
(index_rif_comments): New function.
(index_genes): Add rif comments to probeset index.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
2023-05-31 | scripts: Write table checksums into index. | Arun Isaac
* scripts/index-genenetwork (main): Write table checksums into index.
2023-05-31 | scripts: Introduce SQLTableClause. | Arun Isaac
* scripts/index-genenetwork (SQLTableClause): New variable.
(genes_query, phenotypes_query): Express tables using SQLTableClause.
(serialize_sql): Serialize SQLTableClause.
2023-05-31 | scripts: Fold long lines. | Arun Isaac
* scripts/index-genenetwork (write_document, index_query): Fold long lines.
2023-05-31 | scripts: Ensure only one indexing job may run at a time. | Arun Isaac
* scripts/index-genenetwork (main): Ensure no other indexing job is running.
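The log does not record which mechanism the script uses to detect a running job. One common approach is an exclusively created lock file, sketched below. The name `single_job_lock` and the lock path are hypothetical.

```python
import contextlib
import os


@contextlib.contextmanager
def single_job_lock(lock_path: str):
    """Hold an exclusive lock file for the duration of the with-block.

    Hypothetical helper: O_CREAT | O_EXCL makes creation fail if the file
    already exists, so a second job cannot acquire the lock.
    """
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise RuntimeError(f"another indexing job holds {lock_path}") from None
    try:
        # Record the owner's PID to help with manual cleanup of stale locks.
        os.write(fd, str(os.getpid()).encode("ascii"))
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)
```

A job that is killed before its cleanup runs leaves the lock file behind, so this approach needs a manual or PID-based cleanup step for stale locks.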
2023-05-22 | Make directory at "path" and all intermediate ones. | Frederick Muriuki Muriithi
Make the directory at the given path, and any intermediate ones, to avoid errors in the indexing code when the directory or its parent(s) do not exist.
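In Python, this behaviour is available from the standard library. A minimal sketch follows, with the hypothetical helper name `ensure_directory`; the script itself may use `os.makedirs` or another call.

```python
import pathlib


def ensure_directory(path: str) -> pathlib.Path:
    """Create the directory at `path` along with any missing parents.

    exist_ok=True makes the call a no-op when the directory already exists.
    """
    directory = pathlib.Path(path)
    directory.mkdir(parents=True, exist_ok=True)
    return directory
```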
2023-04-05 | Enable use of `database_connection` in scripts without current_app | Frederick Muriuki Muriithi
There is a need to run external scripts using the same configurations as the application but without the need to couple the script to the application. In this case, we provide the needed configuration directly in the CLI, and modify the existing `gn3.db_utils.database_connection` function to allow it to work coupled to the app or otherwise.
2023-02-13 | scripts: Fallback to 1 worker when indexing. | Arun Isaac
* scripts/index-genenetwork (worker_queue): Set default number of workers to 1 if the number of CPUs cannot be determined.
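`os.cpu_count()` returns `None` when the number of CPUs cannot be determined, so a fallback of this kind can be a one-liner. The sketch below uses the hypothetical names `worker_count` and `number_of_workers`; the actual default in `worker_queue` may be written differently.

```python
import os
from typing import Optional


def worker_count(detected: Optional[int]) -> int:
    """Return the detected CPU count, or 1 when it is unknown (None)."""
    return detected or 1


# os.cpu_count() may return None on platforms where the count is unavailable.
number_of_workers = worker_count(os.cpu_count())
```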
2023-02-13 | scripts: Type hint xapian indexing script. | Arun Isaac
* scripts/index-genenetwork: Import Callable, Generator, Iterable and List from typing. Type hint all functions.
2022-10-18 | Add xapian indexing script. | Arun Isaac
* scripts/index-genenetwork: New file.
* setup.py (install_requires): Add click, pymonad and xapian-bindings.
(scripts): Add scripts/index-genenetwork.