author | Munyoki Kilyungi | 2024-06-21 19:50:33 +0300 |
---|---|---|
committer | Munyoki Kilyungi | 2024-06-21 19:50:33 +0300 |
commit | eae3d25e056db22faf98da4a6dca016381378138 (patch) | |
tree | 79aaa3d25e96ec9ff05a0440e24dbddcfc1c5938 | |
parent | 02a6d9ebc2e2c160874e2f52412cfc7be67b7231 (diff) | |
doc: document our current xapian search issues.
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
-rw-r--r-- | issues/rdf/search-indexing-general-issues.gmi | 32 |
1 files changed, 32 insertions, 0 deletions
diff --git a/issues/rdf/search-indexing-general-issues.gmi b/issues/rdf/search-indexing-general-issues.gmi
new file mode 100644
index 0000000..3bcc36a
--- /dev/null
+++ b/issues/rdf/search-indexing-general-issues.gmi
@@ -0,0 +1,32 @@
+
+# Xapian Search General Issues
+
+* assigned: bonfacem
+
+## Dataset Search Issues
+
+The following search on a full dataset name yields no results:
+
+> dataset:"BXD Published Phenotypes"
+
+In the indexer, we index the dataset name using "index_text", which splits the name into separate terms:
+
+> index_dataset = lambda dataset: termgenerator.index_text(dataset, 0, "XDS")
+
+Yet in the search, we use a boolean prefix, which treats the quoted name as one exact term:
+
+> queryparser.add_boolean_prefix("dataset", "XDS")
+
+Currently, to search for "BXD Published Phenotypes", one has to write:
+
+> dataset:bxd dataset:published dataset:phenotypes
+
+Note that the search terms are all lower-case. This is because "index_text" lower-cases the terms it indexes, while the query parser takes a boolean-prefixed term verbatim: it is neither lower-cased nor stemmed, even though we set:
+
+> queryparser.set_stemming_strategy(queryparser.STEM_SOME)
+
+A fix for this would be to replace "add_boolean_prefix" with "add_prefix", so that the query text is tokenised the same way as the indexed text.
+
+## CIS/TRANS Searches
+
+The challenge with this search is that we would have to compare values for each possible result against one another, necessitating the generation of position values separately for every possible result. Also, the devs (jnduli, bonfacem) need a better understanding of how this works; our current understanding is vague.
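
The index/query mismatch described under "Dataset Search Issues" can be illustrated with a toy model in plain Python. This is only a sketch and does not call Xapian: the helper names (index_text_terms, boolean_filter_term, free_text_terms) are hypothetical stand-ins, the exact term encoding inside Xapian may differ, and stemming and phrase positions are ignored.

```python
import re

PREFIX = "XDS"  # term prefix used for dataset names in the issue above


def index_text_terms(text, prefix):
    """Toy stand-in for TermGenerator.index_text: split on non-word
    characters, lower-case each word, and prepend the prefix."""
    return {prefix + word.lower() for word in re.findall(r"\w+", text)}


def boolean_filter_term(value, prefix):
    """Toy stand-in for a boolean-prefix filter: the value is kept
    verbatim (no splitting, no lower-casing) behind the prefix."""
    return prefix + value


def free_text_terms(value, prefix):
    """Toy stand-in for a free-text (add_prefix) field: the value is
    tokenised and lower-cased the same way as at index time."""
    return index_text_terms(value, prefix)


# Terms produced at index time for the dataset name.
indexed = index_text_terms("BXD Published Phenotypes", PREFIX)

# Boolean prefix, full name: one verbatim term that was never indexed.
full_name_hit = boolean_filter_term("BXD Published Phenotypes", PREFIX) in indexed

# Boolean prefix, one lower-case word per filter: every term is indexed.
per_word_hit = all(
    boolean_filter_term(word, PREFIX) in indexed
    for word in ("bxd", "published", "phenotypes")
)

# Free-text prefix: the query is tokenised like the indexed text.
free_text_hit = free_text_terms("BXD Published Phenotypes", PREFIX) <= indexed

print(full_name_hit, per_word_hit, free_text_hit)  # False True True
```

In this model the full-name boolean filter misses because it looks for a single term that the indexer never produced, while the per-word lower-case filters and the free-text field both line up with the indexed terms. A real fix would still need to be checked against the actual index, since phrase matching also depends on positional data.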