Age | Commit message | Author |
|
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
|
|
* scripts/index-genenetwork (is_data_modified): Provide directory
instead of specific ttl file.
(create_xapian_index): Ditto.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
|
|
* scripts/index-genenetwork (hash_rdf_graph): Remove check for the
turtle directory.
(is_data_modified): Ditto.
(create_xapian_index): Ditto.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
|
|
* scripts/index-genenetwork (hash_generif_graph): Rename to
hash_rdf_graph. Generate a checksum of all the turtle files inside
the ttl directory that's the basis for the GN virtuoso graph.
(create_xapian_index): Rename hash_generif_graph -> hash_rdf_graph.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
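A minimal sketch of the directory-wide checksum described above. It assumes the turtle files sit directly in one directory and end in ".ttl". The function name hash_ttl_directory and the choice of SHA-256 are illustrative assumptions, not necessarily what hash_rdf_graph does.

```python
import hashlib
from pathlib import Path


def hash_ttl_directory(ttl_dir: Path) -> str:
    """Return one hex digest covering every *.ttl file in ttl_dir.

    Hypothetical helper: the real hash_rdf_graph may use a different
    algorithm or file selection.
    """
    digest = hashlib.sha256()
    # Sort so the digest does not depend on directory listing order.
    for ttl_file in sorted(ttl_dir.glob("*.ttl")):
        # Mix in the file name so that renames change the checksum too.
        digest.update(ttl_file.name.encode("utf-8"))
        with ttl_file.open("rb") as handle:
            # Read in chunks to keep memory use flat on large dumps.
            for chunk in iter(lambda: handle.read(65536), b""):
                digest.update(chunk)
    return digest.hexdigest()
```

Comparing this digest with the one stored at the previous indexing run is enough to tell whether any turtle file changed.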
|
|
* scripts/index-genenetwork (hash_generif_graph): Add proper type
hints.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
|
|
* scripts/index-genenetwork (hash_generif_graph): Build the generif
checksum by directly building it from the file.
(is_data_modified): Update how generif-checksums are verified.
(create_xapian_index): Update how generif-checksums are stored in
XAPIAN.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
|
|
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
|
|
The Name is generally used as the identifier, while the FullName can contain spaces, which can cause problems.
|
|
This reverts commit b21102bc4ad3678173e7c94d3e66333ec7c1d40a.
|
|
Without this check, running this script with the "is-data-modified"
flag always fails when there is no database in the
XAPIAN_DIRECTORY.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
|
|
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
|
|
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
|
|
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
|
|
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
|
|
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
|
|
In the CI build, the actual build is run in the
xapian_directory/build, which is seen as the xapian_directory in this
script. The CI handles clean up WRT removing files related to the
build process.
* scripts/index-genenetwork (create_xapian_index): Create the xapian
directory if it doesn't exist. If the xapian directory has files,
exit. Create the temporary directory inside the xapian_directory.
Remove "build_directory.rmdir()"
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
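The directory handling listed above can be sketched roughly as follows. The helper name prepare_build_directory and the exit message are assumptions made for illustration; the actual create_xapian_index may differ in detail.

```python
import sys
import tempfile
from pathlib import Path


def prepare_build_directory(xapian_directory: Path) -> tempfile.TemporaryDirectory:
    """Create xapian_directory if needed and return a temp dir inside it.

    Hypothetical helper illustrating the steps in the commit message.
    """
    # Create the target directory, and any missing parents.
    xapian_directory.mkdir(parents=True, exist_ok=True)
    # Refuse to build into a directory that already holds files.
    if any(xapian_directory.iterdir()):
        sys.exit(f"{xapian_directory} is not empty; refusing to index.")
    # The temporary build directory lives inside the xapian directory.
    return tempfile.TemporaryDirectory(dir=str(xapian_directory))
```

The caller is expected to use the returned object as a context manager, or call its cleanup() method, once the index has been built.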
|
|
* scripts/index-genenetwork (is_data_modified): Replace click.echo
with the respective sys.exit call.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
|
|
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
|
|
Right now, the checks are done in Guix's build expression. This moves
that work to the index-genenetwork script.
|
|
* scripts/index-genenetwork (verify_checksums): New function.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
|
|
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
|
|
This global cache has 3,528 entries and there's no expectation for it
to grow significantly. Since child processes inherit the parent’s
memory, we can pass the global cache to them, reducing fetch times
from 0.001s to 0.00001s, significantly boosting performance when
indexing the entire database and enriching results with RDF metadata.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
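A small sketch of the inheritance the message relies on, assuming a Unix system where os.fork is available. The names METADATA_CACHE, warm_cache and lookup_in_child are hypothetical; the script itself may spawn its workers differently.

```python
import os

# Module-level cache, filled once in the parent before any child is forked.
METADATA_CACHE: dict[str, str] = {}


def warm_cache(entries: dict[str, str]) -> None:
    """Populate the global cache in the parent process."""
    METADATA_CACHE.update(entries)


def lookup_in_child(key: str) -> str:
    """Fork a child, read key from the inherited cache, return the value."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: the cache is already in memory, nothing is re-fetched.
        status = 0
        try:
            os.close(read_fd)
            os.write(write_fd, METADATA_CACHE.get(key, "").encode("utf-8"))
        except BaseException:
            status = 1
        finally:
            # Leave immediately without running the parent's cleanup code.
            os._exit(status)
    # Parent: close the unused write end, then read until the child exits.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        payload = reader.read()
    os.waitpid(pid, 0)
    return payload.decode("utf-8")
```

Because the child starts with a copy of the parent's memory, the lookup is a plain dictionary access rather than a network round trip.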
|
|
* scripts/index-genenetwork: Import Template, lru_cache,
SPARQLWrapper, JSON
(get_rif_metadata): New function.
(index_rif_comments): New function.
(index_genes): Add rif comments to probeset index.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
|
|
* scripts/index-genenetwork (main): Write table checksums into index.
|
|
* scripts/index-genenetwork (SQLTableClause): New variable.
(genes_query, phenotypes_query): Express tables using SQLTableClause.
(serialize_sql): Serialize SQLTableClause.
|
|
* scripts/index-genenetwork (write_document, index_query): Fold long lines.
|
|
* scripts/index-genenetwork (main): Ensure no other indexing job is running.
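A sketch of one common way to enforce a single running job on Unix, using an advisory lock file. This is not necessarily the mechanism the script uses; single_instance and the lock path are hypothetical.

```python
import fcntl
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


@contextmanager
def single_instance(lock_path: Path) -> Iterator[None]:
    """Hold an exclusive advisory lock on lock_path for the block's duration."""
    with open(lock_path, "w") as lock_file:
        try:
            # LOCK_NB makes the call fail at once instead of waiting.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            raise RuntimeError(
                f"Another indexing job holds {lock_path}") from exc
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

Wrapping the body of main in "with single_instance(...)" makes a second invocation fail fast rather than write to the same index concurrently.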
|
|
Make the directory at the given path, and any intermediate ones, to avoid
errors in the indexing code when the directory or its parents do not exist.
|
|
There is a need to run external scripts using the same configuration as the
application, but without coupling the script to the application.
In this case, we provide the needed configuration directly on the CLI, and
modify the existing `gn3.db_utils.database_connection` function so that it
works whether or not it is coupled to the app.
|
|
* scripts/index-genenetwork (worker_queue): Set default number of workers to 1
if the number of CPUs cannot be determined.
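The fallback can be sketched as below. os.cpu_count() returns None when the count cannot be determined; the helper name default_worker_count and its detect parameter are illustrative assumptions.

```python
import os
from typing import Callable, Optional


def default_worker_count(
        detect: Callable[[], Optional[int]] = os.cpu_count) -> int:
    """Return the detected CPU count, or 1 when detection yields None."""
    return detect() or 1
```

Passing the detector as a parameter is only there to make the fallback easy to exercise; calling default_worker_count() with no arguments uses os.cpu_count.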
|
|
* scripts/index-genenetwork: Import Callable, Generator, Iterable and List
from typing. Type hint all functions.
|
|
* scripts/index-genenetwork: New file.
* setup.py (install_requires): Add click, pymonad and xapian-bindings.
(scripts): Add scripts/index-genenetwork.
|