From e1cfcced925e5f290c67b0194817b1af907ddeb6 Mon Sep 17 00:00:00 2001
From: Frederick Muriuki Muriithi
Date: Tue, 16 Apr 2024 11:58:51 +0300
Subject: New issue: Data integrity issue with table identifiers

---
 .../gn-uploader/data-integrity-for-table-ids.gmi   | 39 ++++++++++++++++++++++
 1 file changed, 39 insertions(+)
 create mode 100644 issues/gn-uploader/data-integrity-for-table-ids.gmi

(limited to 'issues/gn-uploader')

diff --git a/issues/gn-uploader/data-integrity-for-table-ids.gmi b/issues/gn-uploader/data-integrity-for-table-ids.gmi
new file mode 100644
index 0000000..7e9d815
--- /dev/null
+++ b/issues/gn-uploader/data-integrity-for-table-ids.gmi
@@ -0,0 +1,39 @@
+# Data Integrity for Table Identifiers
+
+## Tags
+
+* type: bug
+* status: open
+* assigned: fredm
+* priority: critical
+* keywords: data integrity, mariadb, gn-uploader
+
+## Description
+
+The use of
+
+```
+SELECT MAX(Id) FROM …
+```
+
+to get the identifier for a new row is not very robust, especially for auto-increment fields.
+
+It also leads to problems when we have tables that have multiple identifiers, e.g table 'InbredSet' with identifiers 'Id' and 'InbredSetId'.
+
+Instead, we need to do something like:
+
+```
+︙
+cursor.execute("INSERT INTO InbredSet(…) …")
+new_id = cursor.lastrowid
+cursor.execute("UPDATE InbredSet SET InbredSetId=%s WHERE Id=%s",
+               (new_id, new_id))
+︙
+```
+
+To see the modules that need to be updated, do
+
+```
+$ cd /path/to/gn-uploader/
+$ find ./ -name '*.py' -print0 | xargs --no-run-if-empty --null --max-procs=7 grep 'MAX(Id)'
+```
-- 
cgit v1.2.3