Age | Commit message | Author | |
---|---|---|---|
2024-07-03 | Add type-hints to hash_generif_graph. | Munyoki Kilyungi | |
* scripts/index-genenetwork (hash_generif_graph): Add proper type hints. Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-07-03 | Refactor how the generif md5 sum is calculated and stored in XAPIAN. | Munyoki Kilyungi | |
* scripts/index-genenetwork (hash_generif_graph): Build the generif checksum by directly building it from the file. (is_data_modified): Update how generif-checksums are verified. (create_xapian_index): Update how generif-checksums are stored in XAPIAN. Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-07-03 | Use correct cache for RIF/Wiki entries. | Munyoki Kilyungi | |
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-07-03 | feat: drop intermediate folders when running parallel xapian compact | John Nduli | |
2024-07-03 | feat: add support for parallel xapian compact | John Nduli | |
2024-07-03 | feat: index rif and wiki without positions | John Nduli | |
2024-07-03 | feat: drop common words when building rdf caches | John Nduli | |
2024-07-03 | feat: set 67 parallel processes to run in prod | John Nduli | |
2024-07-03 | fix: remove namespaces since child processes copy the rdf caches | John Nduli | |
2024-07-03 | fix: use correct prefix and index key; group wiki cache query | John Nduli | |
2024-07-03 | feat: add wikidata prefix to search api | John Nduli | |
2024-07-03 | feat: add wikidata indexing | John Nduli | |
2024-07-03 | feat: add global wikicache | John Nduli | |
2024-07-03 | feat: add sparql query to get wikidata | John Nduli | |
2024-06-26 | Increase max number of results to 50000 for Xapian search | zsloan | |
This change needs to be accompanied by a change in GN2! If it's lower than the GN2 MAX_SEARCH_RESULTS value, searches will throw an error. | |||
2024-06-24 | Use dataset Name instead of FullName for indexing | zsloan | |
The Name is generally used as the identifier, while the FullName can contain spaces, which can cause problems. | |||
2024-06-18 | Revert "Set the file path for the logger." | Munyoki Kilyungi | |
This reverts commit b21102bc4ad3678173e7c94d3e66333ec7c1d40a. | |||
2024-06-18 | refactor: drop global variables | John Nduli | |
2024-06-17 | Check table names in Xapian; if not, default to "-1". | Munyoki Kilyungi | |
Without this check, there will always be an error when this script is run with the "is-data-modified" flag should there be no database in the XAPIAN_DIRECTORY. Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-06-17 | Fetch distinct comments. | Munyoki Kilyungi | |
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-06-14 | fix: typehints in index-genenetwork script | John Nduli | |
2024-06-14 | fix: fix incorrect parameters in index_query function | John Nduli | |
2024-06-12 | Move the generated xapian files to the correct directory. | Munyoki Kilyungi | |
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-06-12 | Set the file path for the logger. | Munyoki Kilyungi | |
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-06-12 | Change the date format for the logger. | Munyoki Kilyungi | |
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-06-12 | Log how long it takes to run the indexing script. | Munyoki Kilyungi | |
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-06-12 | Check for a running process by viewing the build dir's contents. | Munyoki Kilyungi | |
In the CI build, the actual build is run in the xapian_directory/build, which is seen as the xapian_directory in this script. The CI handles clean up WRT removing files related to the build process. * scripts/index-genenetwork (create_xapian_index): Create the xapian directory if it doesn't exist. If the xapian directory has files, exit. Create the temporary directory inside the xapian_directory. Remove "build_directory.rmdir()" Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-06-12 | Return 0 if data changes, else exit with 1. | Munyoki Kilyungi | |
* scripts/index-genenetwork (is_data_modified): Replace click.echo with the respective sys.exit call. Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-06-12 | Explicitly pass sparql_uri to script. | Munyoki Kilyungi | |
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-06-12 | Rework how the indexes are built. | Munyoki Kilyungi | |
Right now, the checks are done in Guix's build expression. This moves that work to the index-genenetwork script. | |||
2024-06-12 | Add method to check the validity of the tables+RDF checksums. | Munyoki Kilyungi | |
* scripts/index-genenetwork (verify_checksums): New function. Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-06-12 | Generate a SHA256 checksum for the generif graph. | Munyoki Kilyungi | |
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-06-12 | refactor: add db_utils global logger that will be the default | John Nduli | |
2024-06-12 | fix: use current_app's logger to log db errors | John Nduli | |
2024-06-12 | fix: log errors when an exception occurs due to db_utils | John Nduli | |
2024-06-01 | Use global cache to store generif metadata. | Munyoki Kilyungi | |
This global cache has 3,528 entries and there's no expectation for it to grow significantly. Since child processes inherit the parent’s memory, we can pass the global cache to them, reducing fetch times from 0.001s to 0.00001s, significantly boosting performance when indexing the entire database and enriching results with RDF metadata. Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-06-01 | Add geneRIF prefix. | Munyoki Kilyungi | |
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-06-01 | Add geneRIF to gene index. | Munyoki Kilyungi | |
* scripts/index-genenetwork: Import Template, lru_cache, SPARQLWrapper, JSON (get_rif_metadata): New function. (index_rif_comments): New function. (index_genes): Add rif comments to probeset index. Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-05-20 | Delete search endpoints for datasets/publications. | Munyoki Kilyungi | |
* gn3/api/metadata.py: Delete "query_and_frame" import. (search_datasets): Delete. (search_publications): Ditto. Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-05-20 | Remove unused variable. | Munyoki Kilyungi | |
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-05-09 | Improve error messaging for use of invalid fahamu token. | Alexander_Kabui | |
2024-05-01 | pep8 formatting | Alexander_Kabui | |
2024-05-01 | Delete debug code | Alexander_Kabui | |
2024-05-01 | Debug: check for if config is loaded in gn3. | Alexander_Kabui | |
2024-05-01 | Fix: fix string formatting error and remove unused imports. | Alexander_Kabui | |
2024-05-01 | Refactoring | Alexander_Kabui | |
* General cleanup for debug code. * Improve error messaging for successful rating. | |||
2024-05-01 | Debug: fix issue; use current_app to fetch config | Alexander_Kabui | |
2024-05-01 | Debug Process for LLM_DB_PATH | Alexander_Kabui | |
* This commit is a debugging step for llm_path on CD. * Issue: writes to the db, but not to the correct path. | |||
2024-05-01 | Load LLM_DB_PATH as a setting. | Alexander_Kabui | |
2024-05-01 | Add more error info for Database Open error raised | Alexander_Kabui | |
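
Several of the commits above (the generif checksum refactor, the `"-1"` default when no Xapian database exists, and the exit-code change in `is_data_modified`) describe one pattern: hash a file, compare the result with a stored checksum, and signal the outcome through the process exit status. The following is a minimal sketch of that pattern, not the code in `scripts/index-genenetwork`. The names `hash_file` and `data_modified_exit_code` are hypothetical, and the use of MD5 over the raw file bytes is an assumption based on the commit subjects.

```python
import hashlib
from pathlib import Path


def hash_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks.

    Hypothetical helper: the real script's hashing function is not shown
    in the log, so this only illustrates hashing directly from the file.
    """
    digest = hashlib.md5()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def data_modified_exit_code(current: str, stored: str = "-1") -> int:
    """Return 0 when the data changed, else 1.

    `stored` defaults to "-1" (assumed sentinel, taken from the log) so a
    missing index always compares as "modified".
    """
    return 0 if current != stored else 1
```

A caller could pass the result to `sys.exit`, so that a wrapper such as a CI job can branch on the status: 0 means the data changed and a rebuild is needed, 1 means nothing changed.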