#+TITLE: Running the Common Workflow Language on GNU Guix
* Introduction
The Common Workflow Language (CWL) describes workflows in YAML files
that a CWL runner can execute. Two key ideas: unlike shell scripts,
CWL workflows can be analysed and reasoned about, and CWL separates
concerns into (1) tools/scripts, (2) data and (3) the workflow itself,
i.e. how the steps connect up.
CWL is also agnostic about where the underlying tools come from. Docker
images are often provided as hints, but with ~--no-container~ a tool is
simply invoked from the PATH. This is great in the context of GNU Guix
environments!
* Install CWL using GNU Guix
If GNU Guix is not installed yet, install it first and see the README at
http://git.genenetwork.org/guix-bioinformatics/guix-bioinformatics
Recent versions of GNU Guix package =cwltool=, the reference CWL runner:
: guix pull
: ~/.config/guix/current/bin/guix package -A cwl
: cwltool 3.0.20201121085451 out gnu/packages/bioinformatics.scm:2627:2
Install with
: guix package -i cwltool
or in a special profile (I tend to do that)
: guix package -i cwltool -p ~/opt/CWL
Set the PATH and you should be able to run cwltool
: . ~/opt/CWL/etc/profile
: cwltool
* Set up a more advanced workflow
Let's run the workflow that was described in [[https://hpc.guix.info/blog/2019/01/creating-a-reproducible-workflow-with-cwl/][creating a reproducible
workflow with GNU Guix]]:
: git clone https://github.com/pjotrp/CWL-workflows
Build the trimmomatic package contained in the repository (if you are
unlucky this may take a while)
: cd CWL-workflows
: env GUIX_PACKAGE_PATH=. guix build trimmomatic-jar
Now let's rerun the workflow as set up in the [[https://hpc.guix.info/blog/2019/01/creating-a-reproducible-workflow-with-cwl/][blog post]] above (I created a
local version to skip IPFS). Make sure your PATH points to all the
tools and run
: cwltool --no-container Workflows/test-workflow.cwl Jobs/local-small.ERR034597.test-workflow.yml
The first run gives an error: =ERROR 'fastqc' not found=. We need to
add the tool to the environment. For this I created a file =.guix-deploy=
in the root of the repo:
: cat .guix-deploy
: env GUIX_PACKAGE_PATH=.:~/iwrk/opensource/guix/guix-bioinformatics/ ~/.config/guix/current/bin/guix environment -C guix --ad-hoc cwltool trimmomatic-jar bwa fastqc go-ipfs curl --network
You can see it requires guix-bioinformatics, so you may need to clone
that repo first. Next start the Guix container:
: . ./.guix-deploy
: cwltool --no-container Workflows/test-workflow.cwl Jobs/local-small.ERR034597.test-workflow.yml
Now the workflow should run fastqc. When it works it should say
: <lots of output>
: INFO Final process status is success
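The ='fastqc' not found= error above only shows up once cwltool is already running. A small pre-flight check can list the missing tools beforehand. This is a minimal sketch: the function name =missing_tools= is hypothetical, and the tool names in the usage comment are taken from the =.guix-deploy= command, so adjust them to your workflow.

```shell
# Hypothetical helper: print every tool named on the command line
# that cannot be found on the current PATH.
missing_tools() {
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example (inside or outside the Guix container):
#   missing_tools cwltool bwa fastqc curl
```

An empty result means all listed tools resolve; any name printed still needs to be added to the environment.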
The current workflow only partly works. It now complains with
: ILLUMINACLIP:/gnu/store/v2jys382g6j5b7lsxzh8v4vfhd414nhz-profile/lib/share/jar/adapters/TruSeq2-PE.fa:2:40:15.
: Error: Unable to access jarfile /gnu/store/v2jys382g6j5b7lsxzh8v4vfhd414nhz-profile/lib/share/jar/trimmomatic-0.38.jar
This is because I hard-coded two paths, which you need to point to your Guix
profile first:
: Tools/trimmomaticPE.cwl: valueFrom: /gnu/store/v2jys382g6j5b7lsxzh8v4vfhd414nhz-profile/lib/share/jar/trimmomatic-0.38.jar
: Tools/trimmomaticPE.cwl: valueFrom: 'ILLUMINACLIP:/gnu/store/v2jys382g6j5b7lsxzh8v4vfhd414nhz-profile/lib/share/jar/adapters/TruSeq2-PE.fa:2:40:15'
In the container the Guix profile can be found with
: echo $GUIX_ENVIRONMENT
Plug it into the values above. This is not typical and I should find a
proper way to do this. cwltool has a switch ~--preserve-environment
ENVVAR~. After modifying the source by splicing in the value of
~GUIX_ENVIRONMENT~ it worked.
#+begin_src diff
diff --git a/Tools/trimmomaticPE.cwl b/Tools/trimmomaticPE.cwl
index ed57eb5..aedd23a 100644
--- a/Tools/trimmomaticPE.cwl
+++ b/Tools/trimmomaticPE.cwl
@@ -55,7 +55,7 @@ outputs:
 arguments:
   - position: 1
-    valueFrom: /gnu/store/v2jys382g6j5b7lsxzh8v4vfhd414nhz-profile/lib/share/jar/trimmomatic-0.38.jar
+    valueFrom: /gnu/store/j1ljhxzaxmcqy8v6d4v1y37p48c68f5q-profile/lib/share/jar/trimmomatic-0.38.jar
   - position: 2
     valueFrom: PE
   - position: 5
@@ -67,4 +67,4 @@ arguments:
   - position: 8
     valueFrom: $(inputs.fq2.basename).trim.2U.fastq
   - position: 9
-    valueFrom: 'ILLUMINACLIP:/gnu/store/v2jys382g6j5b7lsxzh8v4vfhd414nhz-profile/lib/share/jar/adapters/TruSeq2-PE.fa:2:40:15'
+    valueFrom: 'ILLUMINACLIP:/gnu/store/j1ljhxzaxmcqy8v6d4v1y37p48c68f5q-profile/lib/share/jar/adapters/TruSeq2-PE.fa:2:40:15'
#+end_src
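Editing the store hash by hand has to be repeated whenever the profile changes, so the substitution can be scripted. The following is only a sketch of one way to do it, not something the repo provides: the helper name =repoint_profile= is hypothetical, and it assumes the hard-coded prefix always has the form =/gnu/store/<32-character hash>-profile=, as in the two lines above.

```shell
# Hypothetical helper (not part of cwltool or Guix): print FILE with every
# /gnu/store/<hash>-profile prefix replaced by a profile path.
# The profile is taken from the optional second argument, otherwise
# from $GUIX_ENVIRONMENT; the function aborts if neither is available.
repoint_profile() {
    file=$1
    profile=${2:-${GUIX_ENVIRONMENT:?GUIX_ENVIRONMENT is not set}}
    # Assumption: store hashes are 32 lowercase alphanumeric characters
    sed -E "s|/gnu/store/[0-9a-z]{32}-profile|${profile}|g" "$file"
}

# Example, run inside the container from the repo root:
#   repoint_profile Tools/trimmomaticPE.cwl > trimmomaticPE.cwl.new
#   mv trimmomaticPE.cwl.new Tools/trimmomaticPE.cwl
```

The function writes to standard output rather than editing in place, so the result can be inspected with =git diff= before it replaces the original file.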
Try
: . ./.guix-deploy
: cwltool --no-container --preserve-environment GUIX_ENVIRONMENT Workflows/test-workflow.cwl Jobs/local-small.ERR034597.test-workflow.yml
: (output)
: INFO Final process status is success
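For scripted runs it can be handy to test for the final status line instead of reading the output by eye. A minimal sketch, assuming the cwltool output was saved to a log file; the function name =workflow_succeeded= and the file name =run.log= are hypothetical.

```shell
# Hypothetical helper: succeed only if the saved log contains the
# final status line that cwltool prints for a successful run.
workflow_succeeded() {
    grep -q 'Final process status is success' "$1"
}

# Example, assuming the run was captured with something like
#   cwltool --no-container ... > run.log 2>&1
# then:
#   workflow_succeeded run.log && echo "workflow OK"
```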