Pjotr Prins df33db0880 -a 3 years ago
bio Compilation fixes for ldc 1.11 3 years ago
bio2 Compilation fixes for ldc 1.11 3 years ago
examples correct url for undead 4 years ago
src_ragel Moved Cigar functionality into its own module. Reran ragel. 4 years ago
test Getting rid of unittest noise 4 years ago
.gitignore Add dub config file 6 years ago
.travis.yml test osx build 5 years ago
LICENSE changed license to MIT 9 years ago
README.md update github account link 4 years ago
RELEASE-NOTES.md Changelog 3 years ago
VERSION -a 3 years ago
dub.json fix tabs/spaces in dub.json 5 years ago
meson.build Add Meson build definition 5 years ago

README.md

BioD

BioD is a fast and memory-efficient bioinformatics library written in the D programming language.

BioD aims to:

  • Provide a platform for writing high-performance bioinformatics applications in D. BioD achieves this by:
    • automatically parallelizing tasks where possible, for example reading and writing BAM files
    • reducing GC overhead by avoiding unnecessary memory allocations
  • Offer support for manipulating common biological data formats
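
To illustrate the parallelization point: a BAM file consists of independently compressed BGZF blocks, so per-block work can be spread over several threads. The following is a minimal sketch that uses only `std.parallelism` from the D standard library. It is not BioD's implementation; `Block` and `processBlock` are hypothetical stand-ins for a block and the work done on it.

```d
import std.parallelism : taskPool;
import std.stdio : writeln;

/// Hypothetical stand-in for one independently processable block.
struct Block
{
    int[] payload;
}

/// Hypothetical per-block work; here it only sums the payload.
long processBlock(Block b)
{
    long total = 0;
    foreach (x; b.payload)
        total += x;
    return total;
}

void main()
{
    auto blocks = [Block([1, 2, 3]), Block([4, 5]), Block([6])];

    // amap runs processBlock on the worker threads of the default
    // task pool and returns the results in input order.
    long[] results = taskPool.amap!processBlock(blocks);

    writeln(results); // prints [6, 9, 6]
}
```

Because `amap` preserves input order, the caller can treat the result as if the blocks had been processed sequentially.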

Why D?

D is a language that suits parallel programming because of the guarantees the compiler provides. D is both a low-level language and a high-level hybrid OOP/FP language. There is no other programming language that matches those features. Also, D templating/generics is far easier than that of C++ or, say, Scala.

That is not to say that D is an easy language. A powerful toolbox will be complicated. If you want to do everything with a hammer, maybe better choose Java instead ;).

For more information about D, see Andrei Alexandrescu's D book. It is a classic. Ali Çehreli's book is also recommended.

Current development

Our current focus is to provide a bamreader and bamwriter that are really fast and easy to use. We believe the BAM format is here to stay in pipelines for the foreseeable future. With D we have a good way to write high-performance parsers, particularly for three typical scenarios:

  1. Go through a BAM file a read at a time
  2. Go through a BAM file a nucleotide at a time (pileup)
  3. Go through a BAM file with a sliding window

The sliding window derives from the first two: the window advances either a read at a time or a nucleotide at a time.
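
As an illustration of the sliding-window scenario, here is a minimal sketch that is independent of BioD's actual API. It assumes a hypothetical, simplified `Read` that carries only a 0-based start position, and a read list sorted by that position. The hypothetical function `windowCounts` consumes the reads one at a time and reports how many reads start in each window.

```d
import std.stdio : writeln;

/// Hypothetical, simplified read: only a 0-based start position.
struct Read
{
    size_t position;
}

/// Counts the reads starting in each window [start, start + size),
/// advancing the window by `step`. Assumes `reads` is sorted by position.
size_t[] windowCounts(const Read[] reads, size_t regionLength,
                      size_t size, size_t step)
{
    assert(step > 0 && size > 0);
    size_t[] counts;
    size_t lo = 0; // index of the first read not left of the window
    size_t hi = 0; // index of the first read right of the window
    for (size_t start = 0; start < regionLength; start += step)
    {
        immutable end = start + size;
        while (lo < reads.length && reads[lo].position < start)
            ++lo;
        if (hi < lo)
            hi = lo;
        while (hi < reads.length && reads[hi].position < end)
            ++hi;
        counts ~= hi - lo;
    }
    return counts;
}

void main()
{
    auto reads = [Read(0), Read(2), Read(3), Read(7), Read(8)];
    // Windows of 4 positions, moved 2 positions at a time, over 10 positions.
    writeln(windowCounts(reads, 10, 4, 2)); // prints [3, 2, 1, 2, 1]
}
```

Both indices only move forward, so every read is visited a bounded number of times regardless of how much the windows overlap.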

At this point this functionality is mostly in BioD, but not in an intuitive way. We are building up this functionality and will give examples (WIP).
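
As an illustration of the pileup scenario, the following sketch shows the idea without using BioD's API. It assumes a hypothetical `Read` with a start position and bases that align to the reference without gaps; a real pileup has to follow each read's CIGAR string. The function name `pileup` is likewise chosen only for this example.

```d
import std.stdio : writeln;

/// Hypothetical, simplified read: start position plus bases,
/// assumed to align to the reference without insertions or deletions.
struct Read
{
    size_t position;
    string bases;
}

/// Returns, for every reference position in [0, regionLength),
/// the bases of all reads that cover that position.
string[] pileup(const Read[] reads, size_t regionLength)
{
    auto columns = new string[regionLength];
    foreach (read; reads)
    {
        foreach (offset, base; read.bases)
        {
            immutable pos = read.position + offset;
            if (pos < regionLength)
                columns[pos] ~= base;
        }
    }
    return columns;
}

void main()
{
    auto reads = [Read(0, "ACG"), Read(1, "CGT"), Read(2, "GTA")];
    // One column per reference position.
    writeln(pileup(reads, 5)); // prints ["A", "CC", "GGG", "TT", "A"]
}
```

This version materializes every column at once for clarity; a streaming variant would emit each column as soon as no later read can cover it.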

Install

The current default is to pass the path of the checked-out repo to the D compiler as an import path. For example, in sambamba we use

DFLAGS = -wi -I. -IBioD -g

Usage

See the examples directory for examples and usage.

BioD is also a crucial part of the sambamba tool.

Contributing

Simply clone the repository on GitHub and submit a pull request.

BioD contributors and support

See contributors. For support use the issue tracker or contact

License

BioD is licensed under the liberal MIT (expat) license.