You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

8.8 KiB

Integration guide/process

Here I'll be documenting the process of replacing the SQL queries in GN2's, including creating the proxy resource types, and populating Redis with resource entries.


These are all taken from

Probe and ProbeSet queries

This query is used twice, starting at line 78, then 97. Line 78 and 97

query1 = """SELECT Probe.Sequence, Probe.Name
            FROM Probe, ProbeSet, ProbeSetFreeze, ProbeSetXRef
            WHERE ProbeSetXRef.ProbeSetFreezeId = ProbeSetFreeze.Id AND
                  ProbeSetXRef.ProbeSetId = ProbeSet.Id AND
                  ProbeSetFreeze.Name = '%s' AND
                  ProbeSet.Name = '%s' AND
                  Probe.ProbeSetId = ProbeSet.Id order by Probe.SerialOrder"""
            % (,

To define a proxy resource type to take the place of this, we need to define an action set, and associate it with the resource type name in the resource-types hashmap in resource.rkt. For convenience it's also good to have a constructor function to create new instances of the resource in Racket, though it's also possible to directly craft the JSON and put it in Redis.

First we need to translate this Python query into Racket. Racket's db library provides a simple and effective way (though not very secure) of constructing queries from strings akin to Python's string format method.

The function for running the above query in Racket is as follows:

(define (select-probe dataset-name trait-name)
    (query-row (mysql-conn)
               "SELECT Probe.Sequence, Probe.Name
                FROM Probe, ProbeSet, ProbeSetFreeze, ProbeSetXRef
                WHERE ProbeSetXRef.ProbeSetFreezeId = ProbeSetFreeze.Id AND
                      ProbeSetXRef.ProbeSetId = ProbeSet.Id AND
                      ProbeSetFreeze.Name = ? AND
                      ProbeSet.Name = ? AND
                      Probe.ProbeSetId = ProbeSet.Id order by Probe.SerialOrder"

dataset-name corresponds to in the Python version, trait-name to query-row takes a database connection, a query string, and a number of values to interpolate into the string. Question marks in the query string are replaced with those values.

mysql-conn is a Racket parameter, defined in db.rkt, and can be viewed as a function that returns the global database connection, though it can be changed on a per-thread basis – nothing we need to worry about quite yet!

sql-result->json is a function defined in resource.rkt that transforms the query result into a JSON array, transforming SQL nulls into JSON nulls. In the future, we may have to extend this function to transform other SQL types into JSON, but it's enough for now.

select-probe is just a function, now we need to make it into an action. More specifically, we need to define a value of the action struct (defined in privileges.rkt) that tells the proxy how the function should be used in the context of a resource.

An action consists of a function that takes two parameters, data and params, where data is whatever is required from the resource itself to perform an action, and params is whatever is required from the user. Actions also have a name, and a list of symbols that describe what params are required.

In this case, the action is only retrieving data from the database, so no params are required. The definition of the view-probe action is:

(define view-probe
  (action "view"  ; the name
          (lambda (data     ;
                   params)  ; the wrapped function
            (select-probe (hash-ref data 'dataset)
                          (hash-ref data 'trait)))
          '()))   ; the empty params list

Even though no user-provided parameters are required, the wrapped function still needs to take two arguments for now. This action is simple, and just extracts the two strings from the data provided.

Next we need to create an "action set" that contains this action. An action set defines the access control structure for a resource type. An action set can be viewed as a tree that only branches at the root, where each branch is a list of actions, each requiring a higher level of access to execute than the last.

In this case, our action set consists only of one branch, and we really have only one action. However, to allow for limited access, we need a "no access" action that precedes the database query action. This action is already defined in resource.rkt as no-access-action, and can be used in any action branch without modification.

(define dataset-probe-data
  (list (cons "no-access" no-access-action)
        (cons "view" view-probe)))

(define dataset-probe-actions
  (hasheq 'data dataset-probe-data))

dataset-probe-data is the branch, dataset-probe-actions is the action set. As can be seen, the action set is simply a structure that defines the shape of the access control on a resource type, as a sort of dictionary of dictionaries.

With the action set defined, it must be added to the resource-types hashmap, so the proxy knows to associate the new resource type with this action set. All that's needed is to add 'dataset-probe dataset-probe actions to resource-types, and the 'dataset-probe type is fully defined:

(define resource-types
  (hash 'dataset-file dataset-file-actions
        'dataset-publish dataset-publish-actions
        'dataset-geno dataset-geno-actions
        'dataset-probe dataset-probe-actions))

Finally, it's nice to be able to create resources of this type in Racket, so we want a constructor function:

(define (new-probe-resource name
  (resource name
            (hasheq 'dataset dataset-name
                    'trait trait-name)

Other queries

The chromosome queries are almost, but not quite the same:

Line 305

query = """SELECT chromosome, txStart, txEnd
        FROM GeneList
        WHERE geneSymbol = '{}'""".format(self.this_trait.symbol)

Line 324

query = """SELECT kgID, chromosome, txStart, txEnd
        FROM GeneList_rn33
        WHERE geneSymbol = '{}'""".format(self.this_trait.symbol)

The only differences are the presence of the kgID column in the 2nd query, and the table that's being searched.

There's also this geno query, that's very similar to what already exists:

Line 503

query = """SELECT Geno.Name
           FROM Geno, GenoXRef, GenoFreeze
           WHERE Geno.Chr = '{}' AND
                 GenoXRef.GenoId = Geno.Id AND
                 GenoFreeze.Id = GenoXRef.GenoFreezeId AND
                 GenoFreeze.Name = '{}'
           ORDER BY ABS( Geno.Mb - {}) LIMIT 1""".format(this_chr,"Geno", this_mb)

Thinking about it, it probably doesn't make sense to have the proxy handle these requests, so I won't waste time on them for now.

Populating Redis

For the proxy to be able to execute actions on these resources, they need to have entries in the Redis database. I've added functions to resource.rkt that take the required IDs/names and add them to Redis if they don't exist, add-probe-resource, add-publish-resource and add-geno-resource.

To add a dataset-publish resource with the dataset ID 1, trait ID 17465, with name "resource1" at Redis ID 0:

(add-publish-resource "0"

Using the API

There are two REST endpoints, available and run-action.


Takes two parameters, resource and user. An example query URL would look like:


A JSON representation of the actions available to the given user on that resource is returned. For instance, if the resource in question is one of the above, that either allows or disallows access to the data action, and the user has full access, the result would be:



Takes at least three parameters, resource, user, and action. An example query URL would look like:


Actions may require additional parameters, which are provided like any other URL parameters. What the returned value actually is obviously depends on the action in question.