From f957ecc3a372298ffdd76af92c148c0a7b6aa16f Mon Sep 17 00:00:00 2001
From: DannyArends
Date: Wed, 28 Feb 2018 23:40:34 +0100
Subject: Compilation instruction for GEMMA using the R-toolchain under windows

---
 doc/compile_GEMMA_win64.txt | 48 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)
 create mode 100644 doc/compile_GEMMA_win64.txt

(limited to 'doc')

diff --git a/doc/compile_GEMMA_win64.txt b/doc/compile_GEMMA_win64.txt
new file mode 100644
index 0000000..84f16ec
--- /dev/null
+++ b/doc/compile_GEMMA_win64.txt
@@ -0,0 +1,48 @@
+// install R 3.4.3
+https://cran.r-project.org/bin/windows/base/
+
+// install Rtools 3.4
+https://cran.r-project.org/bin/windows/Rtools/
+
+// Download openblas (v0.2.19-Win64-int32)
+https://sourceforge.net/projects/openblas/files/v0.2.19/
+
+// Make a place to store the files
+mkdir Github
+cd Github/
+
+// Clone the required dependencies
+git clone https://github.com/eigenteam/eigen-git-mirror.git
+git clone https://github.com/genetics-statistics/GEMMA.git
+
+// Download gsl-2.4 and unzip/untar it into Github
+http://gnu.askapache.com/gsl/
+
+// Download and install msys from http://downloads.sourceforge.net/mingw/MSYS-1.0.11.exe
+cd c:/msys/1.0
+// Run msys
+msys.bat
+
+// Under msys, compile GSL-2.4
+cd /c/
+cd Github/gsl-2.4
+./configure --prefix=C:/MinGW
+make -j 4
+make install
+
+// Build GEMMA with the R toolchain under Windows, from CMD
+cd gemma
+make -j 2
+
+// Get all the DLLs from:
+- MinGW DLLs: https://sourceforge.net/projects/openblas/files/v0.2.12/mingw64_dll.zip/download
+- DLLs from the compiled gsl-2.4
+- DLLs from openBLAS
+
+// Required DLLs:
++ libgcc_s_seh-1.dll
++ libgfortran-3.dll
++ libgsl-23.dll
++ libgslcblas-0.dll
++ libopenblas.dll
++ libquadmath-0.dll
-- 
cgit v1.2.3
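
Note on the "Required DLLs" list in the patch above: one way to confirm that every listed DLL was actually collected is a small check script. The following is a minimal Python sketch and not part of the patch. The function name `missing_dlls` is hypothetical, and the directory to check (by default the current one) is an assumption about where the DLLs were copied.

```python
from pathlib import Path
import sys

# DLL names copied from the "Required DLLs" list in the patch.
REQUIRED_DLLS = [
    "libgcc_s_seh-1.dll",
    "libgfortran-3.dll",
    "libgsl-23.dll",
    "libgslcblas-0.dll",
    "libopenblas.dll",
    "libquadmath-0.dll",
]


def missing_dlls(directory):
    """Return the required DLL names that are not present in `directory`."""
    # Compare case-insensitively, since Windows file names ignore case.
    present = {p.name.lower() for p in Path(directory).iterdir() if p.is_file()}
    return [name for name in REQUIRED_DLLS if name.lower() not in present]


if __name__ == "__main__":
    # Assumed usage: pass the folder holding the DLLs as the first argument.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    missing = missing_dlls(target)
    if missing:
        print("Missing DLLs:", ", ".join(missing))
    else:
        print("All required DLLs found in", target)
```

The script only checks file names. It does not verify that the DLLs are 64-bit or that their versions match the ones used at build time.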


From 159f95233afd36c98335059e35cd6c51e4760d24 Mon Sep 17 00:00:00 2001
From: xiangzhou
Date: Fri, 15 Jun 2018 07:38:39 -0400
Subject: explain how to deal with population stratification in the reference
 panel

---
 doc/manual.pdf | Bin 269308 -> 319480 bytes
 doc/manual.tex |   4 ++++
 2 files changed, 4 insertions(+)

(limited to 'doc')

diff --git a/doc/manual.pdf b/doc/manual.pdf
index b760cc1..1b7dc5d 100644
Binary files a/doc/manual.pdf and b/doc/manual.pdf differ
diff --git a/doc/manual.tex b/doc/manual.tex
index 1e042e7..8e5efe2 100644
--- a/doc/manual.tex
+++ b/doc/manual.tex
@@ -1373,6 +1373,10 @@ format. In addition, to fit MQS-LDW, you will need to add "-wcat
 specifies the LD score file, which can be provided in a gzip
 compressed format.
 
+A feature of MQS-based variance component estimation is that one only needs to use a subset of samples to estimate certain quantities. Using a subset of samples dramatically improves computation speed while maintaining variance component estimation accuracy. To use this strategy, one can add ``-sample [num]'' to perform estimation with a fixed number of random samples.
+
+Instead of using the genotype data from the study, one can also use genotype data from a reference panel. For example, one can use the genotype data from the 1000 Genomes Project as the reference. However, any population stratification in the reference panel should be dealt with first. For example, the individuals with European ancestry in the 1000 Genomes Project come from five subpopulations: CEU, FIN, GBR, IBS, and TSI. MQS computes SNP correlations across all SNP pairs, as it should under the LMM assumption. Therefore, any population stratification in the reference panel would increase the overall SNP correlation estimate, leading to a downward bias in the final heritability estimate. To address the population stratification in the reference panel, one can include a few dummy variables as covariates in the model fitting step. These covariates represent, for example, the five subpopulations, and effectively center the genotype mean in each subpopulation separately. To do this, one can create a covariate file containing five columns (no header): the first column is all 1s, representing the intercept; the second column is 1 for CEU and 0 for others; the third column is 1 for FIN and 0 for others; ...; and the fifth column is 1 for IBS and 0 for others. Afterwards, one can add "-c [filename]" to the command line to include this covariate file.
+
 \subsubsection{Detailed Information}
 
 MQS-LDW uses an iterative procedure to update the variance
-- 
cgit v1.2.3
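
Note on the covariate file described in the manual.tex hunk above: the five-column layout can be generated from a list of per-individual population labels. The following is a minimal Python sketch and not part of the patch. It rests on several assumptions: that the elided fourth column is GBR and that TSI is the baseline absorbed by the intercept (inferred from five subpopulations and five columns, not stated in the text), that tab-separated values are acceptable, and that the labels are given in the same sample order as the reference genotype file. The function names and the example labels are hypothetical.

```python
# Dummy-coded populations, in the column order described in the manual text.
# Assumption: the elided fourth column is GBR, and TSI has no column of its
# own because it is the baseline captured by the intercept.
DUMMY_POPS = ["CEU", "FIN", "GBR", "IBS"]
BASELINE_POP = "TSI"


def covariate_rows(labels):
    """Yield one row per individual: intercept followed by four 0/1 dummies."""
    for label in labels:
        if label != BASELINE_POP and label not in DUMMY_POPS:
            raise ValueError(f"unexpected population label: {label!r}")
        yield [1] + [1 if label == pop else 0 for pop in DUMMY_POPS]


def write_covariates(labels, path):
    """Write the rows to `path` with no header, one individual per line."""
    with open(path, "w") as handle:
        for row in covariate_rows(labels):
            handle.write("\t".join(str(value) for value in row) + "\n")


if __name__ == "__main__":
    # Hypothetical labels; real ones must follow the genotype file's sample order.
    example = ["CEU", "FIN", "GBR", "IBS", "TSI"]
    for row in covariate_rows(example):
        print("\t".join(str(value) for value in row))
```

A file written with `write_covariates` would then be passed with "-c [filename]" as the manual text describes. Leaving one subpopulation without its own column keeps the dummies from being collinear with the intercept.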