-rw-r--r-- INSTALL.md 159
-rw-r--r-- README.md 196
2 files changed, 37 insertions, 318 deletions
diff --git a/INSTALL.md b/INSTALL.md
index 417cbea..4ac587b 100644
--- a/INSTALL.md
+++ b/INSTALL.md
@@ -1,158 +1,3 @@
-# INSTALL GEMMA: Genome-wide Efficient Mixed Model Association
+# INSTALL PanGEMMA: Genome-wide Efficient Mixed Model Association
-## Check version
-
-Simply run gemma once installed
-
- gemma
-
-and it should give you the version.
-
-## GEMMA dependencies
-
-GEMMA runs on Linux, MAC OSX and Windows (with Docker). The runtime
-has the following dependencies:
-
-* C++ tool chain >= 5.5.0 (see Travis CI and we test with file .guix-dev-gcc-older)
-* GNU Science library (GSL) 2.x (GEMMA dropped support for GSL 1.x)
-* blas/openblas
-* lapack
-* zlib
-
-See below for installation on Guix.
-
-## Install GEMMA
-
-### Debian and Ubuntu
-
-Travis-CI uses Ubuntu for testing. Check the test logs for version numbers.
-
-[![Build Status](https://travis-ci.org/genetics-statistics/GEMMA.svg?branch=master)](https://travis-ci.org/genetics-statistics/GEMMA)
-
-Current settings can be found in [travis.yml](.travis.yml).
-
-### Bioconda
-
-(Note Bioconda install is a work in [progress](https://github.com/genetics-statistics/GEMMA/issues/52)
-
-Recent versions of GEMMA can be installed with
-[BioConda](http://ddocent.com/bioconda/) without root permissions using the following
-command
-
- conda install gemma
-
-### FreeBSD
-
-Recent editions of FreeBSD ports include [GEMMA](https://www.freebsd.org/cgi/ports.cgi?query=gemma&stype=all)
-
-### GNU Guix
-
-The GNU Guix package manager can install recent versions of [GEMMA](https://www.gnu.org/software/guix/packages/g.html)
-using the following command
-
- guix package -i gemma
-
-A more recent version may be found in the guix-bioinformatics channel
-which is maintained by the authors. See the
-[README](http://git.genenetwork.org/guix-bioinformatics/guix-bioinformatics), e.g.
-
- env GUIX_PACKAGE_PATH=./guix-bioinformatics guix package -A gemma
-
-To build GEMMA from source you can opt to install the build tools with
-GNU Guix, the current build container is in [guix.scm](guix.scm). See the first lines on how to create one using `guix shell`.
-
-An alternative is the command line invocation of [guix-dev](./.guix-dev)
-
- source .guix-dev
- make
-
-Guix allows for easy versioning. To build with an older gcc, for
-example:
-
- guix environment -C guix --ad-hoc gcc-toolchain@9.3.0 gdb gsl openblas zlib bash ld-wrapper perl vim which
-
-### Install with Docker
-
-Recent version of GEMMA come with a 64-bit Docker image that should run
-on Linux, Windows and MacOS.
-
-### Install from source
-
-Install listed dependencies (you may want to take hints from
-the Travis-CI [tests](./.travis.yml)) and run
-
- make -j 4
-
-(the -j switch builds on 4 cores).
-
- time make check
-
-You can run gemma in the debugger with, for example
-
- gdb --args \
- ./bin/gemma -g example/mouse_hs1940.geno.txt.gz \
- -p example/mouse_hs1940.pheno.txt -a example/mouse_hs1940.anno.txt \
- -snps example/snps.txt -nind 400 -loco 1 -gk -debug -o myoutput
-
-Note that if you get <optimized out> warnings on inspecting variables you
-should compile with GCC_FLAGS="" to disable optimizations (-O3). E.g.
-
- make WITH_OPENBLAS=1 GCC_FLAGS=
-
-Other options, such as compiling with warnings, are listed in the
-Makefile.
-
-### GNU Guix commands used
-
-Some development examples. With git bisect build the older versions
-of gemma with openblas
-
- ~/.config/guix/current/bin/guix environment -C guix --ad-hoc gcc gdb gfortran:lib gsl lapack openblas zlib bash ld-wrapper perl ldc
- make clean ; make WITH_OPENBLAS=1 FORCE_DYNAMIC=1 -j 8
-
-or with atlas
-
- ~/.config/guix/current/bin/guix environment -C guix --ad-hoc gcc gdb gfortran:lib gsl lapack atlas zlib bash ld-wrapper perl ldc
- make clean ; make WITH_OPENBLAS= FORCE_DYNAMIC=1 -j 25
-
-## Run tests
-
-GEMMA includes the shunit2 test framework (version 2.0).
-
- make check
-
-or
-
- ./run_tests.sh
-
-## Releases
-
-### Docker release
-
-To distribute GEMMA I made static versions of the binary. A container
-can be made instead with, for example
-
-```sh
-env GUIX_PACKAGE_PATH=~/guix-bioinformatics ~/.config/guix/current/bin/guix \
- pack -f docker gemma-gn2 -S /bin=bin
-```
-
-which created a container in of size 51MB. Tiny! For more information
-see
-[GUIX-NOTES](http://git.genenetwork.org/guix-bioinformatics/guix-notes/CONTAINERS.org).
-
-
-### Static release
-
-To create a static release, locate the gfortran lib and use
-
- source .guix-dev-static
- make WITH_GFORTRAN=1 EXTRA_FLAGS=-L/gnu/store/741057r2x06zwg6zcmqmdyv51spm6n9i-gfortran-7.5.0-lib/lib static
-
-otherwise OpenBlas will complain with
-
- undefined reference to `_gfortran_concat_string'
-
-Note you can use guix.scm if you load with
-
- guix shell -L ~/guix-pjotr -C -D -f guix.scm
+WIP
diff --git a/README.md b/README.md
index 31837fa..e13ca36 100644
--- a/README.md
+++ b/README.md
@@ -1,39 +1,10 @@
-![Genetic associations identified in CFW mice using GEMMA (Parker et al,
-Nat. Genet., 2016)](cfw.gif)
-
-# GEMMA: Genome-wide Efficient Mixed Model Association
-
-[![Github-CI](https://github.com/genetics-statistics/GEMMA/actions/workflows/c-cpp.yml/badge.svg)](https://github.com/genetics-statistics/GEMMA/actions/workflows/c-cpp.yml) [![Anaconda-Server Badge](
-https://anaconda.org/bioconda/gemma/badges/platforms.svg)](https://anaconda.org/bioconda/gemma) [![DL](https://anaconda.org/bioconda/gemma/badges/downloads.svg)](https://anaconda.org/bioconda/gemma) [![BrewBadge](https://img.shields.io/badge/%F0%9F%8D%BAbrew-gemma--0.98-brightgreen.svg)](https://github.com/brewsci/homebrew-bio) [![GuixBadge](https://img.shields.io/badge/gnuguix-gemma-brightgreen.svg)](https://packages.guix.gnu.org/packages/gemma/) [![DebianBadge](https://badges.debian.net/badges/debian/testing/gemma/version.svg)](https://packages.debian.org/search?keywords=gemma&searchon=names&suite=all&section=all)
-
-GEMMA is a software toolkit for fast application of linear mixed
-models (LMMs) and related models to genome-wide association studies
-(GWAS) and other large-scale data sets.
-
-Check out [RELEASE-NOTES.md](RELEASE-NOTES.md) to see what's new in
-each GEMMA release.
-
-Please post suspected bugs to
-[Github issues](https://github.com/genetics-statistics/GEMMA/issues). For
-questions or other discussion, please post to the
-[GEMMA Google Group](https://groups.google.com/group/gemma-discussion). We
-also encourage contributions, for example, by forking the repository,
-making your changes to the code, and issuing a pull request.
-
-Currently, GEMMA provides a runnable Docker container for 64-bit
-MacOS, Windows and Linux platforms. GEMMA can be installed with
-Debian, Conda, Homebrew and GNU Guix. With Guix you find the latest
-version
-[here](http://git.genenetwork.org/guix-bioinformatics/guix-bioinformatics)
-as it is the version we use every day on http://genenetwork.org. For
-installation instructions see also [INSTALL.md](INSTALL.md). We use
-continous integration builds on Travis-CI for Linux (amd64 & arm64)
-and MacOS (amd64). GEMMA builds on multiple architectures, see the
-[Debian build farm](https://buildd.debian.org/status/package.php?p=gemma).
-
-*(The above image depicts physiological and behavioral trait
-loci identified in CFW mice using GEMMA, from [Parker et al, Nature
-Genetics, 2016](https://doi.org/10.1038/ng.3609).)
+# PanGEMMA: Genome-wide Efficient Mixed Model Association
+
+This repository is used to rewrite and modernize the original GEMMA tool. The idea it to upgrade the software, but keeping it going using ideas from Hanson and Sussman's book on *Software Design for Flexibility: How to Avoid Programming Yourself into a Corner*. It is work in progress (WIP). For more information see [PanGEMMA design](./doc/code/pangemma.md)
+
+GEMMA is the original software toolkit for fast application of linear mixed models (LMMs) and related models to genome-wide association studies (GWAS) and other large-scale data sets. You can find the original code on [github](https://github.com/genetics-statistics/GEMMA).
+
+Check out [RELEASE-NOTES.md](./RELEASE-NOTES.md) to see what's new in each release.
* [Key features](#key-features)
* [Installation](#installation)
@@ -53,57 +24,21 @@ Genetics, 2016](https://doi.org/10.1038/ng.3609).)
## Key features
-1. Fast assocation tests implemented using the univariate linear mixed
-model (LMM). In GWAS, this can correct for population structure and
-sample non-exchangeability. It also provides estimates of the
-proportion of variance in phenotypes explained by available genotypes
-(PVE), often called "chip heritability" or "SNP heritability".
+1. Fast assocation tests implemented using the univariate linear mixed model (LMM). In GWAS, this can correct for population structure and sample non-exchangeability. It also provides estimates of the proportion of variance in phenotypes explained by available genotypes (PVE), often called "chip heritability" or "SNP heritability".
-2. Fast association tests for multiple phenotypes implemented using a
-multivariate linear mixed model (mvLMM). In GWAS, this can correct for
-population structure and sample (non)exchangeability - jointly in
-multiple complex phenotypes.
+2. Fast association tests for multiple phenotypes implemented using a multivariate linear mixed model (mvLMM). In GWAS, this can correct for population structure and sample (non)exchangeability - jointly in multiple complex phenotypes.
-3. Bayesian sparse linear mixed model (BSLMM) for estimating PVE,
-phenotype prediction, and multi-marker modeling in GWAS.
+3. Bayesian sparse linear mixed model (BSLMM) for estimating PVE, phenotype prediction, and multi-marker modeling in GWAS.
-4. Estimation of variance components ("chip/SNP heritability") partitioned
-by different SNP functional categories from raw (individual-level)
-data or summary data. For raw data, HE regression or the REML AI
-algorithm can be used to estimate variance components when
-individual-level data are available. For summary data, GEMMA uses the
-MQS algorithm to estimate variance components.
+4. Estimation of variance components ("chip/SNP heritability") partitioned by different SNP functional categories from raw (individual-level) data or summary data. For raw data, HE regression or the REML AI algorithm can be used to estimate variance components when individual-level data are available. For summary data, GEMMA uses the MQS algorithm to estimate variance components.
## Installation
-To install GEMMA you can
-
-1. Download the precompiled or Docker binaries
- from [releases](https://github.com/genetics-statistics/GEMMA/releases).
-
-2. Use existing package managers, see [INSTALL.md](INSTALL.md).
-
-3. Compile GEMMA from source, see [INSTALL.md](INSTALL.md).
-
-Compiling from source takes more work, but can potentially boost
-performance of GEMMA when using specialized C++ compilers and
-numerical libraries.
+WIP
### Precompiled binaries
-1. Fetch the [latest stable release][latest_release] and download the
- file appropriate for your platform.
-
-2. For Docker images, install Docker, load the image into Docker and
- run with something like
-
- docker run -w /run -v ${PWD}:/run ed5bf7499691 gemma -gk -bfile example/mouse_hs1940
-
-3. For .gz files run `gunzip gemma.linux.gz` or `gunzip
-gemma.linux.gz` to unpack the file. And make sure it is executable with
-
- chmod u+x gemma-linux
- ./gemma-linux
+WIP
## Run GEMMA
@@ -163,9 +98,11 @@ the build optimization notes in [INSTALL.md](INSTALL.md).
+ [Tutorial on GEMMA for genome-wide association
analysis](https://github.com/rcc-uchicago/genetic-data-analysis-2).
-## Citing GEMMA
+## Citing PanGEMMA
-If you use GEMMA for published work, please cite our paper:
+PanGEMMA is not published yet.
+
+But if you use GEMMA for published work, please cite our paper:
+ Xiang Zhou and Matthew Stephens (2012). [Genome-wide efficient
mixed-model analysis for association studies.](http://doi.org/10.1038/ng.2310)
@@ -195,108 +132,45 @@ studies.](https://doi.org/10.1101/042846) *Annals of Applied Statistics*, in pre
## License
-Copyright (C) 2012–2021, Xiang Zhou, Pjotr Prins and team.
-
-The *GEMMA* source code repository is free software: you can
-redistribute it under the terms of the
-[GNU General Public License](http://www.gnu.org/licenses/gpl.html). All
-the files in this project are part of *GEMMA*. This project is
-distributed in the hope that it will be useful, but **without any
-warranty**; without even the implied warranty of **merchantability or
-fitness for a particular purpose**. See file [LICENSE](LICENSE) for
-the full text of the license.
-
-Both the source code for the
-[gzstream zlib wrapper](http://www.cs.unc.edu/Research/compgeom/gzstream/)
-and [shUnit2](https://github.com/genenetwork/shunit2) unit testing
-framework included in GEMMA are distributed under the
-[GNU Lesser General Public License](contrib/shunit2-2.0.3/doc/LGPL-2.1),
-either version 2.1 of the License, or (at your option) any later
-revision.
-
-The source code for the included [Catch](http://catch-lib.net) unit
-testing framework is distributed under the
-[Boost Software Licence version 1](https://github.com/philsquared/Catch/blob/master/LICENSE.txt).
-
-## Optimizing performance
-
-Precompiled binaries and libraries may not be optimal for your particular
-hardware. See [INSTALL.md](INSTALL.md) for speeding up tips.
-
-## Building from source
-
-More information on source code, dependencies and installation can be
-found in [INSTALL.md](INSTALL.md).
-
-## Input data formats
+PanGEMMA Copyright (C) 2012–2025, Pjotr Prins, Xiang Zhou and others (see the soure file headers and git log).
-Currently GEMMA takes two types of input formats
+The *PanGEMMA* and *GEMMA* source code repository is free software: you can redistribute it under the terms of the [GNU General Public License](http://www.gnu.org/licenses/gpl.html). All the files in this project are part of *GEMMA*. This project is distributed in the hope that it will be useful, but **without any warranty**; without even the implied warranty of **merchantability or fitness for a particular purpose**. See file [LICENSE](LICENSE) for the full text of the license.
-1. BIMBAM format (preferred)
-2. PLINK format
+Both the source code for the [gzstream zlib wrapper](http://www.cs.unc.edu/Research/compgeom/gzstream/) and [shUnit2](https://github.com/genenetwork/shunit2) unit testing framework included in GEMMA are distributed under the [GNU Lesser General Public License](contrib/shunit2-2.0.3/doc/LGPL-2.1), either version 2.1 of the License, or (at your option) any later revision.
-See this [example](./doc/example/data-munging.org) where we convert some
-spreadsheets for use in GEMMA.
+The source code for the included [Catch](http://catch-lib.net) unit testing framework is distributed under the [Boost Software Licence version 1](https://github.com/philsquared/Catch/blob/master/LICENSE.txt).
-## Reporting a GEMMA bug or issue
+## Optimizing performance
-For bugs GEMMA has an
-[issue tracker](https://github.com/genetics-statistics/GEMMA/issues)
-on github. For general support GEMMA has a mailing list at
-[gemma-discussion](https://groups.google.com/forum/#!forum/gemma-discussion)
+Precompiled binaries and libraries may not be optimal for your particular hardware. See [INSTALL.md](INSTALL.md) for speeding up tips.
-Before posting an issue search the issue tracker and mailing list
-first. It is likely someone may have encountered something
-similiar. Also try running the latest version of GEMMA to make sure it
-has not been fixed already. Support/installation questions should be
-aimed at the mailing list - it is the best resource to get answers.
+## Building from source
-The issue tracker is specifically meant for development issues around
-the software itself. When reporting an issue include the output of the
-program and the contents of the .log.txt file in the output directory.
+More information on source code, dependencies and installation can be found in [INSTALL.md](INSTALL.md).
-### Check list:
+## Input data formats
-1. [X] I have found an issue with GEMMA
-2. [ ] I have searched for it on the [issue tracker](https://github.com/genetics-statistics/GEMMA/issues?q=is%3Aissue) (incl. closed issues)
-3. [ ] I have searched for it on the [mailing list](https://groups.google.com/forum/#!forum/gemma-discussion)
-4. [ ] I have tried the latest [release](https://github.com/genetics-statistics/GEMMA/releases) of GEMMA
-5. [ ] I have read and agreed to below code of conduct
-6. [ ] If it is a support/install question I have posted it to the [mailing list](https://groups.google.com/forum/#!forum/gemma-discussion)
-7. [ ] If it is software development related I have posted a new issue on the [issue tracker](https://github.com/genetics-statistics/GEMMA/issues) or added to an existing one
-8. [ ] In the message I have included the output of my GEMMA run
-9. [ ] In the message I have included the relevant .log.txt file in the output directory
-10. [ ] I have made available the data to reproduce the problem (optional)
+## Contributing code, reporting a PanGEMMA bug or issue
-To find bugs the GEMMA software developers may ask to install a
-development version of the software. They may also ask you for your
-data and will treat it confidentially. Please always remember that
-GEMMA is written and maintained by volunteers with good
-intentions. Our time is valuable too. By helping us as much as
-possible we can provide this tool for everyone to use.
+WIP
## Code of conduct
-By using GEMMA and communicating with its communtity you implicitely
-agree to abide by the
-[code of conduct](https://software-carpentry.org/conduct/) as
-published by the Software Carpentry initiative.
+By using GEMMA and communicating with its communtity you implicitely agree to abide by the [code of conduct](https://software-carpentry.org/conduct/) as published by the Software Carpentry initiative.
## Credits
+The *PanGEMMA* software was developmed
+
+[Pjotr Prins](http://thebird.nl/)<br>
+Dept. of Genetics, Genomics and Informatics<br>
+University of Tennessee Health Science Center<br>
+
The *GEMMA* software was developed by:
[Xiang Zhou](http://www.xzlab.org)<br>
Dept. of Biostatistics<br>
University of Michigan<br>
-and
-
-[Pjotr Prins](http://thebird.nl/)<br>
-Dept. of Genetics, Genomics and Informatics<br>
-University of Tennessee Health Science Center<br>
-
with contributions from Peter Carbonetto, Tim Flutre, Matthew Stephens,
and [others](https://github.com/genetics-statistics/GEMMA/graphs/contributors).
-
-[latest_release]: https://github.com/genetics-statistics/GEMMA/releases "Most recent stable releases"