From 2e2e1b6a472c5453b5cf37cdb463ec1bcb29f446 Mon Sep 17 00:00:00 2001
From: Arun Isaac
Date: Mon, 10 Oct 2022 14:08:23 +0530
Subject: Comment on xapian scalability.

---
 topics/xapian-index-building-scalability.svg | 299 +++++++++++++++++++++++++++
 topics/xapian-scalability.gmi                |  18 ++
 2 files changed, 317 insertions(+)
 create mode 100644 topics/xapian-index-building-scalability.svg
 create mode 100644 topics/xapian-scalability.gmi

diff --git a/topics/xapian-index-building-scalability.svg b/topics/xapian-index-building-scalability.svg
new file mode 100644
index 0000000..5452117
--- /dev/null
+++ b/topics/xapian-index-building-scalability.svg
@@ -0,0 +1,299 @@
[SVG plot, produced by GNUPLOT 5.4 patchlevel 4. Title: "Time per document to build an index of various sizes". x-axis: "Index size (in number of documents)", ticks at 10000, 100000, 1x10^6, 1x10^7. y-axis: "Time (in ms)", ticks from 0.5 to 0.6 in steps of 0.01. Data series: 'xapian-times' using 1:(1000*$2/$1).]

diff --git a/topics/xapian-scalability.gmi b/topics/xapian-scalability.gmi
new file mode 100644
index 0000000..f6eb768
--- /dev/null
+++ b/topics/xapian-scalability.gmi
@@ -0,0 +1,18 @@
+# Xapian scalability
+
+As the index grows larger, Xapian takes longer to insert new documents. Shown below is the time (in seconds) taken to build indices of various sizes (in number of documents).
+
+* 10k: 5.0009455639999985
+* 20k: 10.747144626
+* 40k: 21.052039352999998
+* 80k: 42.342509834
+* 160k: 87.152875767
+* 320k: 176.353327516
+* 640k: 353.9213599920001
+* 1280k: 727.506412363
+* 2560k: 1494.2583154410001
+* 5120k: 3037.993937756
+
+Notice that it takes 607x, not 512x, more time to build the 5120k index than it takes to build the 10k index. In terms of time, the 10k index takes on average 0.5 ms per document while the 5120k index takes on average 0.59 ms per document. We show this graphically below.
+
+=> xapian-index-building-scalability.svg
-- 
cgit v1.2.3
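
The 607x and per-document figures quoted in the note can be rechecked from the listed build times. The following Python is a minimal sketch for that purpose only and is not part of the commit: the names `build_times` and `ms_per_document` are invented here for illustration, and the per-document formula mirrors the `1000*$2/$1` expression used for the plot.

```python
# Build times in seconds, copied from the list in topics/xapian-scalability.gmi.
# Keys are index sizes in number of documents ("10k" is taken to mean 10,000).
build_times = {
    10_000: 5.0009455639999985,
    20_000: 10.747144626,
    40_000: 21.052039352999998,
    80_000: 42.342509834,
    160_000: 87.152875767,
    320_000: 176.353327516,
    640_000: 353.9213599920001,
    1_280_000: 727.506412363,
    2_560_000: 1494.2583154410001,
    5_120_000: 3037.993937756,
}


def ms_per_document(size, seconds):
    """Average build time per document, in milliseconds (hypothetical helper)."""
    return 1000 * seconds / size


per_doc = {size: ms_per_document(size, t) for size, t in build_times.items()}

smallest, largest = min(build_times), max(build_times)
size_ratio = largest / smallest
time_ratio = build_times[largest] / build_times[smallest]

for size, ms in per_doc.items():
    print(f"{size:>9} documents: {ms:.3f} ms per document")
print(f"size ratio: {size_ratio:.0f}x, time ratio: {time_ratio:.0f}x")
```

The last line prints `size ratio: 512x, time ratio: 607x`, and the per-document averages for the smallest and largest index come out at roughly 0.50 ms and 0.59 ms, matching the figures in the note.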