Age | Commit message | Author | |
---|---|---|---|
2024-07-23 | fix: resolve duplicate errors when updating data | John Nduli | |
2024-07-23 | refactor: clean query for insert | John Nduli | |
2024-07-23 | refactor: clean up insert query | John Nduli | |
2024-07-23 | refactor: reorganize update_rif script to be more pythonic | John Nduli | |
2024-07-23 | chore: fix pylint errors | John Nduli | |
2024-07-23 | refactor: rename addRif to update_rif_table.py | John Nduli | |
2024-07-12 | Rename hash_rdf_graph -> md5hash_ttl_dir. | Munyoki Kilyungi | |
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-07-12 | Use correct ttl-dir path when generating checksums. | Munyoki Kilyungi | |
* scripts/index-genenetwork (is_data_modified): Provide directory instead of specific ttl file. (create_xapian_index): Ditto. Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-07-12 | fix: remove .py extension for addRif to prevent pylint checks | John Nduli | |
2024-07-12 | refactor: fix mypy and pylint errors | John Nduli | |
2024-07-12 | feat: copy addRif script from genenetwork1 | John Nduli | |
Copied from https://github.com/genenetwork/genenetwork1/blob/master/web/webqtl/maintainance/addRif.py with some changes to make it Python 3 compatible. | |||
2024-07-08 | Pass output directory to R/qtl script instead of pulling it from the environment | zsloan | |
Also fixes issue where the control marker keyword was wrong | |||
2024-07-05 | fix: return query error message from xapian | John Nduli | |
2024-07-03 | Return a "-1" if the turtle directory does not exist. | Munyoki Kilyungi | |
* scripts/index-genenetwork (hash_rdf_graph): Remove check for the turtle directory. (is_data_modified): Ditto. (create_xapian_index): Ditto. Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-07-03 | Generate a checksum for all the ttl files. | Munyoki Kilyungi | |
* scripts/index-genenetwork (hash_generif_graph): Rename to hash_rdf_graph. Generate a checksum of all the turtle files inside the ttl directory that's the basis for the GN virtuoso graph. (create_xapian_index): Rename hash_generif_graph -> hash_rdf_graph. Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-07-03 | Add type-hints to hash_generif_graph. | Munyoki Kilyungi | |
* scripts/index-genenetwork (hash_generif_graph): Add proper type hints. Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-07-03 | Refactor how the generif md5 sum is calculated and stored in XAPIAN. | Munyoki Kilyungi | |
* scripts/index-genenetwork (hash_generif_graph): Build the generif checksum by directly building it from the file. (is_data_modified): Update how generif-checksums are verified. (create_xapian_index): Update how generif-checksums are stored in XAPIAN. Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-07-03 | Use correct cache for RIF/Wiki entries. | Munyoki Kilyungi | |
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-07-03 | feat: drop intermediate folders when running parallel xapian compact | John Nduli | |
2024-07-03 | feat: add support for parallel xapian compact | John Nduli | |
2024-07-03 | feat: index rif and wiki without positions | John Nduli | |
2024-07-03 | feat: drop common words when building rdf caches | John Nduli | |
2024-07-03 | feat: set 67 parallel processes to run in prod | John Nduli | |
2024-07-03 | fix: remove namespaces since child processes copy the rdf caches | John Nduli | |
2024-07-03 | fix: use correct prefix and index key; group wiki cache query | John Nduli | |
2024-07-03 | feat: add wikidata prefix to search api | John Nduli | |
2024-07-03 | feat: add wikidata indexing | John Nduli | |
2024-07-03 | feat: add global wikicache | John Nduli | |
2024-07-03 | feat: add sparql query to get wikidata | John Nduli | |
2024-06-26 | Increase max number of results to 50000 for Xapian search | zsloan | |
This change needs to be accompanied by a change in GN2! If it's lower than the GN2 MAX_SEARCH_RESULTS value, searches will throw an error. | |||
2024-06-24 | Use dataset Name instead of FullName for indexing | zsloan | |
The Name is generally used as the identifier, while the FullName can contain spaces, which can cause problems | |||
2024-06-18 | Revert "Set the file path for the logger." | Munyoki Kilyungi | |
This reverts commit b21102bc4ad3678173e7c94d3e66333ec7c1d40a. | |||
2024-06-18 | refactor: drop global variables | John Nduli | |
2024-06-17 | Check table names in Xapian; if not, default to "-1". | Munyoki Kilyungi | |
Without this check, there will always be an error when this script is run with the "is-data-modified" flag should there be no database in the XAPIAN_DIRECTORY. Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-06-17 | Fetch distinct comments. | Munyoki Kilyungi | |
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-06-14 | fix: typehints in index-genenetwork script | John Nduli | |
2024-06-14 | fix: fix incorrect parameters in index_query function | John Nduli | |
2024-06-12 | Move the generated xapian files to the correct directory. | Munyoki Kilyungi | |
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-06-12 | Set the file path for the logger. | Munyoki Kilyungi | |
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-06-12 | Change the date format for the logger. | Munyoki Kilyungi | |
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-06-12 | Log how long it takes to run the indexing script. | Munyoki Kilyungi | |
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-06-12 | Check for a running process by viewing the build dir's contents. | Munyoki Kilyungi | |
In the CI build, the actual build is run in the xapian_directory/build, which is seen as the xapian_directory in this script. The CI handles clean up WRT removing files related to the build process. * scripts/index-genenetwork (create_xapian_index): Create the xapian directory if it doesn't exist. If the xapian directory has files, exit. Create the temporary directory inside the xapian_directory. Remove "build_directory.rmdir()" Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-06-12 | Return 0 if data changes, else exit with 1. | Munyoki Kilyungi | |
* scripts/index-genenetwork (is_data_modified): Replace click.echo with the respective sys.exit call. Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-06-12 | Explicitly pass sparql_uri to script. | Munyoki Kilyungi | |
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-06-12 | Rework how the indexes are built. | Munyoki Kilyungi | |
Right now, the checks are done in Guix's build expression. This moves that work to the index-genenetwork script. | |||
2024-06-12 | Add method to check the validity of the tables+RDF checksums. | Munyoki Kilyungi | |
* scripts/index-genenetwork (verify_checksums): New function. Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-06-12 | Generate a SHA256 checksum for the generif graph. | Munyoki Kilyungi | |
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com> | |||
2024-06-12 | refactor: add db_utils global logger that will be the default | John Nduli | |
2024-06-12 | fix: use current_app's logger to log db errors | John Nduli | |
2024-06-12 | fix: log errors when an exception occurs due to db_utils | John Nduli | |
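
Several of the 2024-07-03 and 2024-07-12 entries above describe building a single checksum over every turtle file in the ttl directory and returning "-1" when that directory does not exist. The following is a minimal sketch of that idea, not the code in scripts/index-genenetwork: the function name `hash_ttl_dir_sketch`, the `*.ttl` glob, the sorted file order, and mixing file names into the digest are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def hash_ttl_dir_sketch(ttl_dir: Path) -> str:
    """Return one MD5 hex digest covering every .ttl file in ttl_dir.

    Hypothetical helper: returns "-1" when the directory is missing,
    mirroring the behaviour described in the commit log.
    """
    if not ttl_dir.is_dir():
        return "-1"
    digest = hashlib.md5()
    # Sort so the digest does not depend on filesystem listing order.
    for ttl_file in sorted(ttl_dir.glob("*.ttl")):
        # Assumption: include the file name so renames change the digest.
        digest.update(ttl_file.name.encode("utf-8"))
        with ttl_file.open("rb") as handle:
            # Read in chunks so large turtle files are not loaded whole.
            for chunk in iter(lambda: handle.read(65536), b""):
                digest.update(chunk)
    return digest.hexdigest()
```

A caller could compare the returned digest against a previously stored value to decide whether the data changed and the index needs rebuilding; how and where the real script stores that value is not shown in this log.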