Here I'll be documenting the process of replacing the SQL queries in GN2's show_trait.py, including creating the proxy resource types, and populating Redis with resource entries.
These are all taken from show_trait.py.
This query is used twice, starting at line 78, then 97. Line 78 and 97
query1 = """SELECT Probe.Sequence, Probe.Name FROM Probe, ProbeSet, ProbeSetFreeze, ProbeSetXRef WHERE ProbeSetXRef.ProbeSetFreezeId = ProbeSetFreeze.Id AND ProbeSetXRef.ProbeSetId = ProbeSet.Id AND ProbeSetFreeze.Name = '%s' AND ProbeSet.Name = '%s' AND Probe.ProbeSetId = ProbeSet.Id order by Probe.SerialOrder""" % (self.this_trait.dataset.name, self.this_trait.name)
To define a proxy resource type to take the place of this, we need to
define an action set, and associate it with the resource type name
in the resource-types
hashmap in resource.rkt
. For convenience
it's also good to have a constructor function to create new instances
of the resource in Racket, though it's also possible to directly craft
the JSON and put it in Redis.
First we need to translate this Python query into Racket. Racket's
db
library provides a simple and effective way (though not very
secure) of constructing queries from strings akin to Python's string
format method.
The function for running the above query in Racket is as follows:
(define (select-probe dataset-name trait-name) (sql-result->json (query-row (mysql-conn) "SELECT Probe.Sequence, Probe.Name FROM Probe, ProbeSet, ProbeSetFreeze, ProbeSetXRef WHERE ProbeSetXRef.ProbeSetFreezeId = ProbeSetFreeze.Id AND ProbeSetXRef.ProbeSetId = ProbeSet.Id AND ProbeSetFreeze.Name = ? AND ProbeSet.Name = ? AND Probe.ProbeSetId = ProbeSet.Id order by Probe.SerialOrder" dataset-name trait-name)))
dataset-name
corresponds to self.this_trait.dataset.name
in the
Python version, trait-name
to self.this_trait.name
. query-row
takes a database connection, a query string, and a number of values to
interpolate into the string. Question marks in the query string are
replaced with those values.
mysql-conn
is a Racket parameter, defined in db.rkt
, and can
be viewed as a function that returns the global database connection,
though it can be changed on a per-thread basis – nothing we need
to worry about quite yet!
sql-result->json
is a function defined in resource.rkt
that
transforms the query result into a JSON array, transforming
SQL nulls into JSON nulls. In the future, we may have to extend
this function to transform other SQL types into JSON, but it's
enough for now.
select-probe
is just a function, now we need to make it into an
action. More specifically, we need to define a value of the action
struct (defined in privileges.rkt
) that tells the proxy how the
function should be used in the context of a resource.
An action
consists of a function that takes two parameters, data
and params
, where data
is whatever is required from the resource
itself to perform an action, and params
is whatever is required
from the user. Actions also have a name, and a list of symbols that
describe what params
are required.
In this case, the action is only retrieving data from the database,
so no params
are required. The definition of the view-probe
action is:
(define view-probe (action "view" ; the name (lambda (data ; params) ; the wrapped function (select-probe (hash-ref data 'dataset) (hash-ref data 'trait))) '())) ; the empty params list
Even though no user-provided parameters are required, the wrapped
function still needs to take two arguments for now. This action is
simple, and just extracts the two strings from the data
provided.
Next we need to create an "action set" that contains this action. An action set defines the access control structure for a resource type. An action set can be viewed as a tree that only branches at the root, where each branch is a list of actions, each requiring a higher level of access to execute than the last.
In this case, our action set consists only of one branch, and we
really have only one action. However, to allow for limited access, we
need a "no access" action that precedes the database query action.
This action is already defined in resource.rkt
as
no-access-action
, and can be used in any action branch without
modification.
(define dataset-probe-data (list (cons "no-access" no-access-action) (cons "view" view-probe))) (define dataset-probe-actions (hasheq 'data dataset-probe-data))
dataset-probe-data
is the branch, dataset-probe-actions
is the
action set. As can be seen, the action set is simply a structure
that defines the shape of the access control on a resource type,
as a sort of dictionary of dictionaries.
With the action set defined, it must be added to the resource-types
hashmap, so the proxy knows to associate the new resource type with
this action set. All that's needed is to add 'dataset-probe
dataset-probe actions
to resource-types
, and the 'dataset-probe
type is fully defined:
(define resource-types (hash 'dataset-file dataset-file-actions 'dataset-publish dataset-publish-actions 'dataset-geno dataset-geno-actions 'dataset-probe dataset-probe-actions))
Finally, it's nice to be able to create resources of this type in Racket, so we want a constructor function:
(define (new-probe-resource name owner-id dataset-name trait-name default-mask) (resource name owner-id (hasheq 'dataset dataset-name 'trait trait-name) 'dataset-probe default-mask (hasheq)))
The chromosome queries are almost, but not quite the same:
Line 305
query = """SELECT chromosome, txStart, txEnd FROM GeneList WHERE geneSymbol = '{}'""".format(self.this_trait.symbol)
Line 324
query = """SELECT kgID, chromosome, txStart, txEnd FROM GeneList_rn33 WHERE geneSymbol = '{}'""".format(self.this_trait.symbol)
The only differences are the presence of the kgID
column in the 2nd
query, and the table that's being searched.
There's also this geno query, that's very similar to what already exists:
Line 503
query = """SELECT Geno.Name FROM Geno, GenoXRef, GenoFreeze WHERE Geno.Chr = '{}' AND GenoXRef.GenoId = Geno.Id AND GenoFreeze.Id = GenoXRef.GenoFreezeId AND GenoFreeze.Name = '{}' ORDER BY ABS( Geno.Mb - {}) LIMIT 1""".format(this_chr, this_db.group.name+"Geno", this_mb)
Thinking about it, it probably doesn't make sense to have the proxy handle these requests, so I won't waste time on them for now.
For the proxy to be able to execute actions on these resources, they
need to have entries in the Redis database. I've added functions
to resource.rkt
that take the required IDs/names and add them to
Redis if they don't exist, add-probe-resource
, add-publish-resource
and add-geno-resource
.
To add a dataset-publish resource with the dataset ID 1, trait ID 17465, with name "resource1" at Redis ID 0:
(add-publish-resource "0" "resource1" "1" "17465")
There are two REST endpoints, available and run-action.
Takes two parameters, resource
and user
. An example query URL
would look like:
http://localhost:8080/available/?resource=0&user=0
A JSON representation of the actions available to the given user
on that resource is returned. For instance, if the resource in
question is one of the above, that either allows or disallows
access to the data
action, and the user has full access, the
result would be:
{"data":["no-access","view"]}
Takes at least three parameters, resource
, user
, and action
. An
example query URL would look like:
http://localhost:8080/run-action/?resource=0&user=0&branch=data&action=view
Actions may require additional parameters, which are provided like any other URL parameters. What the returned value actually is obviously depends on the action in question.