summaryrefslogtreecommitdiff
path: root/issues
diff options
context:
space:
mode:
Diffstat (limited to 'issues')
-rw-r--r--issues/gn-uploader/data-integrity-for-table-ids.gmi39
1 files changed, 39 insertions, 0 deletions
diff --git a/issues/gn-uploader/data-integrity-for-table-ids.gmi b/issues/gn-uploader/data-integrity-for-table-ids.gmi
new file mode 100644
index 0000000..7e9d815
--- /dev/null
+++ b/issues/gn-uploader/data-integrity-for-table-ids.gmi
@@ -0,0 +1,39 @@
+# Data Integrity for Table Identifiers
+
+## Tags
+
+* type: bug
+* status: open
+* assigned: fredm
+* priority: critical
+* keywords: data integrity, mariadb, gn-uploader
+
+## Description
+
+The use of
+
+```
+SELECT MAX(Id) FROM …
+```
+
+to get the identifier for a new row is not very robust, especially for auto-increment fields.
+
+It also leads to problems when we have tables that have multiple identifiers, e.g table 'InbredSet' with identifiers 'Id' and 'InbredSetId'.
+
+Instead, we need to do something like:
+
+```
+︙
+cursor.execute("INSERT INTO InbredSet(…) …")
+new_id = cursor.lastrowid
+cursor.execute("UPDATE InbredSet SET InbredSetId=%s WHERE Id=%s",
+ (new_id, new_id))
+︙
+```
+
+To see the modules that need to be updated, do
+
+```
+$ cd /path/to/gn-uploader/
+$ find ./ -name '*.py' -print0 | xargs --no-run-if-empty --null --max-procs=7 grep 'MAX(Id)'
+```