author Munyoki Kilyungi 2023-03-31 16:00:03 +0300
committer Munyoki Kilyungi 2023-04-03 10:48:05 +0300
commit 95614b282d8bc201161282c053ac406e2558dd76 (patch)
tree 70dadb5f54d0809cb40a21dc8aa8b234cd5a4e33 /issues
parent 222fc5196623f177db77d00439dcbb367ed69b7b (diff)
List broken utf-8 characters during genewiki dump
Signed-off-by: Munyoki Kilyungi <me@bonfacemunyoki.com>
Diffstat (limited to 'issues')
-rw-r--r-- issues/dump-genewiki-metadata.gmi 18
1 files changed, 18 insertions, 0 deletions
diff --git a/issues/dump-genewiki-metadata.gmi b/issues/dump-genewiki-metadata.gmi
index 398fb67..a665dee 100644
--- a/issues/dump-genewiki-metadata.gmi
+++ b/issues/dump-genewiki-metadata.gmi
@@ -43,3 +43,21 @@ To query these entries:
```
SELECT * FROM GeneRIF_BASIC WHERE symbol = 'NEWENTRY'\G
```
+
+* Broken UTF-8 character sequences that made rapper error out and had to be fixed manually. Here's a list:
+
+```
+'(("\x28" . "")
+ ("\x29" . "")
+ ("\xa0" . " ")
+ ("â\x81„" . "/")
+ ("â€\x9d" . #\")
+ ("’" . #\')
+ ("\x02" . "")
+ ("\x01" . "")
+ ("β" . "β")
+ ("α-Â\xad" . "α")
+ ("Â\xad" . "")
+ ("α" . "α")
+ ("–" . "-"))
+```
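
As a rough illustration of how a replacement table like the one in this patch could be applied before handing text to rapper, here is a minimal Python sketch. It is not the code used for the dump: `REPLACEMENTS` and `clean_text` are hypothetical names, and the table carries over only a subset of the pairs listed above. Pairs are applied in listed order, so the longer `α-Â\xad` pattern is handled before the shorter `Â\xad` one.

```python
# Hypothetical sketch: a subset of the (broken . fixed) pairs from the
# patch above, transliterated into Python tuples.
REPLACEMENTS = [
    ("\x28", ""),        # "(" is dropped
    ("\x29", ""),        # ")" is dropped
    ("\xa0", " "),       # non-breaking space becomes a plain space
    ("â\x81„", "/"),     # mis-decoded fraction slash
    ("â€\x9d", '"'),     # mis-decoded right double quotation mark
    ("\x02", ""),        # control character
    ("\x01", ""),        # control character
    ("α-Â\xad", "α"),    # must run before the shorter "Â\xad" pattern
    ("Â\xad", ""),       # mis-decoded soft hyphen
]


def clean_text(text: str) -> str:
    """Apply each replacement in order and return the cleaned string."""
    for broken, fixed in REPLACEMENTS:
        text = text.replace(broken, fixed)
    return text


if __name__ == "__main__":
    print(clean_text("TNF-α-Â\xad induced"))
```

Keeping the table as an ordered list rather than a dictionary makes the ordering dependency between overlapping patterns explicit.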