From 08d350bab237ece0ee2d71ca561d334b7c5d832b Mon Sep 17 00:00:00 2001
From: Arun Isaac
Date: Tue, 14 Mar 2023 20:01:42 +0000
Subject: Add brca search issue.

---
 issues/search-for-brca.gmi | 10 ++++++++++
 1 file changed, 10 insertions(+)
 create mode 100644 issues/search-for-brca.gmi
 (limited to 'issues')

diff --git a/issues/search-for-brca.gmi b/issues/search-for-brca.gmi
new file mode 100644
index 0000000..c42c745
--- /dev/null
+++ b/issues/search-for-brca.gmi
@@ -0,0 +1,10 @@
+# Search for brca
+
+* assigned: arun
+
+Search for brca does not return results for brca1 and brca2. It should.
+=> https://cd.genenetwork.org/gsearch?type=gene&terms=brca
+
+The xapian stemmer does not stem brca1 to brca. That's why when one searches for brca, results for brca1 are not returned.
+
+Perhaps we should write a custom stemmer that stems brca1 to brca. But, at the same time, we should be wary of stemming terms like p450 to p. Pjotr suggests the heuristic that we look for at least 2 or 3 alphabetic characters at the beginning. Another approach is to hard-code a list of candidates to look for.
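
The two approaches mentioned in the issue text (a minimum alphabetic prefix, or a hard-coded candidate list) could be sketched roughly as follows. This is only an illustration of the heuristic, not the implementation used by the search code: the function name stem_symbol, the min_alpha and candidates parameters, and the assumption that a symbol is letters followed only by digits are all hypothetical, and hooking the result into xapian is not shown.

```python
import re

# Assumption: a symbol is leading ASCII letters followed only by trailing digits.
_SYMBOL = re.compile(r"([A-Za-z]+)([0-9]+)")


def stem_symbol(term, min_alpha=3, candidates=None):
    """Strip trailing digits from term when the heuristic allows it.

    min_alpha: minimum number of leading letters required before stemming.
    candidates: optional set of lowercase prefixes; when given, only these
    prefixes are stemmed and min_alpha is ignored.
    """
    match = _SYMBOL.fullmatch(term)
    if match is None:
        # Not of the form letters+digits (for example plain "brca"): leave as is.
        return term
    prefix = match.group(1)
    if candidates is not None:
        # Hard-coded list approach.
        return prefix if prefix.lower() in candidates else term
    # Prefix-length approach: refuse to stem short prefixes such as "p".
    return prefix if len(prefix) >= min_alpha else term


if __name__ == "__main__":
    for term in ("brca1", "brca2", "p450", "brca"):
        print(term, "->", stem_symbol(term))
```

With min_alpha set to 3, brca1 and brca2 both reduce to brca while p450 is left untouched. Lowering it to 2 would also stem two-letter prefixes, which is the trade-off between the "2 or 3" options in the issue.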