# Pre-Cache Datasets

## Tags

* assigned:
* priority: medium
* type: enhancement
* status: open
* keywords: cache, optimisation

## Description

To improve the performance of the system when running computations (correlations, mappings, etc.), we should pre-cache the datasets in text files.

The triggers for pre-caching could be:
* creation of a new dataset
* changes to the data for one or more traits in the dataset
* changes to the sample list of a dataset

I propose that an external job be triggered whenever any of the triggers above occurs. The job could, among other things:
* Delete existing cache file
* Create new cache file with new data
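
As a rough illustration of the two job steps above, here is a minimal Python sketch. The function name `rebuild_cache`, the tab-separated layout and the one-file-per-dataset naming are all assumptions made for the example; the issue does not specify any of them.

```python
import csv
import os
import tempfile
from pathlib import Path


def rebuild_cache(cache_dir, dataset_name, sample_names, trait_rows):
    """Replace the cache file for one dataset with freshly written data.

    Hypothetical helper: trait_rows maps a trait name to its values,
    ordered like sample_names. The TSV layout is an assumption.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / f"{dataset_name}.tsv"

    # Write to a temporary file in the same directory first, so readers
    # never see a half-written cache file.
    fd, tmp_name = tempfile.mkstemp(dir=str(cache_dir), suffix=".tmp")
    try:
        with os.fdopen(fd, "w", newline="") as tmp:
            writer = csv.writer(tmp, delimiter="\t", lineterminator="\n")
            writer.writerow(["trait", *sample_names])
            for trait, values in trait_rows.items():
                writer.writerow([trait, *values])
        # os.replace overwrites any existing file in one step, which covers
        # both "delete existing cache file" and "create new cache file".
        os.replace(tmp_name, target)
    except BaseException:
        # Clean up the temporary file if writing or replacing failed.
        os.unlink(tmp_name)
        raise
    return target
```

Replacing the file in one step, rather than deleting and then recreating it, means a computation that starts while the job is running still finds either the old or the new cache file.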

If possible, the pre-cache service could instead generate cache files keyed on the date of the latest change and keep the older cache files rather than deleting them. Whether this is feasible still needs to be investigated.
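
One way the date-based idea might look, sketched in Python under stated assumptions: the helper names `cache_path` and `latest_cache`, the `<dataset>-<UTC timestamp>.tsv` naming scheme, and the availability of a "last changed" time for each dataset are all hypothetical and not part of this issue.

```python
import re
from datetime import datetime, timezone
from pathlib import Path


def cache_path(cache_dir, dataset_name, last_changed):
    """Build a cache file name that embeds the dataset's last-change time.

    last_changed should be a timezone-aware datetime; it is converted to UTC.
    """
    stamp = last_changed.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return Path(cache_dir) / f"{dataset_name}-{stamp}.tsv"


def latest_cache(cache_dir, dataset_name):
    """Return the newest cache file for a dataset, or None if there is none."""
    pattern = re.compile(re.escape(dataset_name) + r"-\d{8}T\d{6}Z\.tsv")
    names = [
        entry.name
        for entry in Path(cache_dir).iterdir()
        if pattern.fullmatch(entry.name)
    ]
    # The timestamp is fixed-width, so sorting the names as strings also
    # sorts them chronologically.
    return Path(cache_dir) / max(names) if names else None
```

With this scheme a new cache file never overwrites an older one, so a job could compare `cache_path(...)` for the current change date against `latest_cache(...)` to decide whether a rebuild is needed. Old files would still have to be pruned at some point.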

## Related Issues

=> /issues/sample-data-caching-problem Sample Data Caching Bug