@@ Line 1: / Line 1: @@
-This page contains information about the program '''asaMap''', a tool for ancestry-specific association mapping in large-scale genetic studies. It is based on called genotypes in the binary plink format (.bed). The program is written in C++.
+The program is available and described on GitHub:
-=Download=
-The program can be downloaded from GitHub:
 https://github.com/e-jorsboe/asaMap
-<pre>
-git clone https://github.com/e-jorsboe/asaMap.git;
-cd asaMap
-make
-</pre>
-So far it has only been tested on Linux systems. Use curl if you are on a Mac.
-=Example=
-To be added...
-=Input Files=
-Input files are called genotypes in the binary plink (*.bed) format [https://www.cog-genomics.org/plink2], together with estimated admixture proportions and population-specific allele frequencies. Both can be estimated with [http://software.genetics.ucla.edu/admixture/ ADMIXTURE], whose '''.Q''' and '''.P''' files, respectively, can be given directly to asaMap.
-A phenotype file also has to be provided. This should be a plain text file with one line for each individual in the .fam file, sorted in the same order:
-<pre>
--0.712027291121767
--0.158413122435864
--1.77167888612947
--0.800940619551485
-.3016297021294
-...
-</pre>
-A covariate file can also be provided, where each column is a covariate and each row is an individual. '''It should NOT have a column of 1s for the intercept (the intercept is included automatically).''' This file must have the same number of rows as the phenotype file and the .fam file.
-<pre>
-.0127096117618385 -0.0181281029917176 -0.0616739439849275 -0.0304606694443973
-.0109944672768584 -0.0205785925514037 -0.0547523583405743 -0.0208813157640705
-.0128395346453956 -0.0142116856067135 -0.0471689997039534 -0.0266186436009881
-.00816783754598649 -0.0189271733933446 -0.0302259313905976 -0.0222247658768436
-.00695928218989132 -0.0089960963981644 -0.0384886176827146 -0.012649019770168
-...
-</pre>
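The row-count and intercept requirements above can be checked before running asaMap. The following is a minimal Python sketch, not part of asaMap: the names <code>read_rows</code> and <code>check_inputs</code> are hypothetical, and it assumes whitespace-delimited files that contain only numeric values.

```python
from pathlib import Path


def read_rows(path):
    """Return the non-empty lines of a whitespace-delimited file as lists of floats."""
    rows = []
    for line in Path(path).read_text().splitlines():
        if line.strip():
            rows.append([float(value) for value in line.split()])
    return rows


def check_inputs(fam_path, pheno_path, cov_path=None):
    """Check row counts against the .fam file and reject an intercept column.

    Returns the number of individuals; raises ValueError on any mismatch.
    """
    fam_lines = Path(fam_path).read_text().splitlines()
    n_ind = sum(1 for line in fam_lines if line.strip())

    pheno = read_rows(pheno_path)
    if len(pheno) != n_ind:
        raise ValueError(f"phenotype file has {len(pheno)} rows, .fam has {n_ind}")
    if any(len(row) != 1 for row in pheno):
        raise ValueError("phenotype file must have exactly one value per line")

    if cov_path is not None:
        cov = read_rows(cov_path)
        if len(cov) != n_ind:
            raise ValueError(f"covariate file has {len(cov)} rows, .fam has {n_ind}")
        n_cols = len(cov[0]) if cov else 0
        if any(len(row) != n_cols for row in cov):
            raise ValueError("covariate rows have differing numbers of columns")
        for j in range(n_cols):
            # A column consisting only of 1s would duplicate the automatic intercept.
            if all(row[j] == 1.0 for row in cov):
                raise ValueError(f"covariate column {j + 1} is all 1s (intercept)")
    return n_ind
```

A call such as <code>check_inputs("plinkFile.fam", "pheno.files", "covariates.txt")</code> (file names are placeholders) returns the number of individuals when all checks pass.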
-=Running asaMap=
-Example of a command of how to run asaMap with covariates included and first running ADMIXTURE:
-<pre>
-#run admixture
-admixture plinkFile.bed 2
-#run asaMap with admix proportions
-./asaMap -p plinkFile  -o out -c $COV -y pheno.files -Q plinkFile.2.Q -f plinkFile.2.P
-</pre>
-This produces an '''out.log''' log file and an '''out.res''' file with results for each site (after filtering).
-The full list of options can be viewed by running asaMap without any arguments:
-<pre>
-./asaMap
-</pre>
-'''Must be specified:'''
-; -p <filename>
-Plink prefix filename of binary plink files - so without .bed/.fam/.bim suffixes.
-; -o <filename>
-Output filename - a .res file will be written with the results and a .log log file.
-; -y <filename>
-Phenotype file, has to be a plain text file with as many rows as the .fam file.
-; -Q <filename> (either -a or -Q)
-Admixture proportions, .Q file from ADMIXTURE. Either specify this or -a.
-; -a <filename> (either -a or -Q)
-Admixture proportions (for source pop1) - so first column from .Q file from ADMIXTURE. Either specify this or -Q.
-; -f <filename>
-Allele frequencies, .P file from ADMIXTURE.
-'''Optional:'''
-; -c <filename>
-Covariates, plain text file with one column for each covariate and the same number of rows as the .fam file. SHOULD NOT HAVE A COLUMN OF 1s (for the intercept), IT WILL BE ADDED AUTOMATICALLY!
-; -m <INT>
-Model, whether an additive genotype model, or a recessive genotype model should be used (0: additive, 1: recessive - default: 0).
-; -l <INT>
-Regression, whether linear or logistic regression should be used. Logistic regression is for binary phenotype data, linear regression is for quantitative phenotype data. (0: linear regression, 1: logistic regression - default: 0)
-; -b <filename>
-Text file containing a starting guess of the estimated coefficients.
-; -i <INT>
-The maximum number of iterations to run for the EM algorithm (default: 80).
-; -t <FLOAT>
-Tolerance for change in likelihood between EM iterations for finishing analysis (default: 0.0001).
-; -r <INT>
-Seed for generating the starting values of the coefficients.
-; -P <INT>
-Number of threads to be used for the analysis. Each thread will write to a temporary file in the path specified by "-o".
-; -e <INT>
-Estimate standard error of coefficients (0: no, 1: yes - default: 0).
-; -w <INT>
-Run the M0/R0 model, which models the effect of the other allele. Analyses are faster when M0/R0 is not run. (0: no, 1: yes - default: 1)
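The option list above can be turned into a small helper that assembles the command line and enforces the "either -a or -Q" rule. This is a hedged Python sketch: <code>build_asamap_command</code> is a hypothetical helper, not something shipped with asaMap, and it only uses flags documented on this page.

```python
def build_asamap_command(plink_prefix, out, pheno, freqs,
                         q_file=None, a_file=None, cov=None,
                         model=0, regression=0, threads=None,
                         binary="./asaMap"):
    """Assemble an asaMap argument list from the documented flags.

    Exactly one of q_file (-Q) and a_file (-a) must be given.
    """
    if (q_file is None) == (a_file is None):
        raise ValueError("specify exactly one of -Q or -a")

    # Required options: plink prefix, output prefix, phenotypes, allele frequencies.
    cmd = [binary, "-p", plink_prefix, "-o", out, "-y", pheno, "-f", freqs]
    # Admixture proportions: full .Q file, or a file with the pop1 proportions only.
    cmd += ["-Q", q_file] if q_file is not None else ["-a", a_file]

    # Optional settings.
    if cov is not None:
        cmd += ["-c", cov]
    cmd += ["-m", str(model)]        # 0: additive, 1: recessive
    cmd += ["-l", str(regression)]   # 0: linear, 1: logistic
    if threads is not None:
        cmd += ["-P", str(threads)]
    return cmd


if __name__ == "__main__":
    # File names below are placeholders taken from the example on this page.
    args = build_asamap_command("plinkFile", "out", "pheno.files",
                                "plinkFile.2.P", q_file="plinkFile.2.Q")
    print(" ".join(args))
```

The returned list can be passed to <code>subprocess.run(args, check=True)</code>; building a list rather than a single string avoids shell quoting problems with file names.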
-=Outputs=
-A '''.res''' file with the likelihoods of each model and the estimated coefficients in each model is produced, shown here for the additive model:
-<pre>
-Chromo  Position  nInd  f1        f2        llh(M0)      llh(M1)      llh(M2)      llh(M3)      llh(M4)      llh(M5)      b1(M1)     b2(M1)     b1(M2)     b2(M3)     b(M4)
-       9855422    1237  0.935997  0.537511  3242.099033  3242.214834  3243.033924  3242.812740  3243.019888  3243.115326  0.093018   -0.166907  -0.053931  0.047357   0.020093
-       10684283   1217  0.999990  0.509715  nan          nan          nan          3214.598952  3214.974638  3215.569371  nan        nan        nan        -0.110044  -0.054084
-       11247763   1237  0.856692  0.78175   3234.025418  3241.930891  3242.902363  3242.561728  3242.820387  3243.028131  -0.048894  0.108007   0.045277   -0.030582  -0.016838
-...
-</pre>
-For the recessive model it looks like this:
-<pre>
-Chromo  Position  nInd  f1        f2        llh(R0)      llh(R1)      llh(R2)      llh(R3)      llh(R4)      llh(R5)      llh(R6)      llh(R7)      b1(R1)     b2(R1)     bm(R1)     b1(R2)     b2m(R2)    b1m(R3)    b2(R3)     b1(R4)     b2(R5)     b(R6)
-       9855422    1237  0.935997  0.537511  3236.442376  3241.191367  3242.235364  3241.191468  3243.112239  3241.188747  3242.691370  3243.115326  0.023373   -2.082935  -0.027433  0.016608   -0.582318  0.004700   -2.083112  -0.046849  -2.083275  -0.259338
-       10684283   1217  0.999990  0.509715  nan          nan          nan          nan          3215.162291  3215.133559  3214.502575  3215.569371  nan        nan        nan        nan        nan        nan        nan        -0.529999  -0.721649  -0.438317
-       11247763   1237  0.856692  0.78175   3235.030514  3242.807127  3242.809076  3242.836233  3242.818987  3243.028431  3242.907072  3243.028131  0.064419   -0.047597  -0.004021  0.068119   -0.019760  0.042905   -0.078669  0.060373   -0.018537  0.029227
-...
-</pre>
-P-values can be obtained with a likelihood ratio test between the two models of interest.
-An R script, '''getPvalues.R''', is provided that makes it easy to obtain P-values from the '''.res''' file:
-<pre>
-Rscript R/getPvalues.R out.res
-</pre>
-This produces a file with the suffix '''.Pvalues''':
-<pre>
-Chromo  Position  nInd  f1        f2        M0vM1                 M1vM5              M1vM2              M1vM3              M1vM4              M2vM5              M3vM5              M4vM5
-       9855422    1237  0.935997  0.537511  0.630338505521655     0.40636967666779   0.200575362363081  0.274160334109282  0.204476621296224  0.686587953953705  0.436611450245155  0.662188528285713
-       10684283   1217  0.99999   0.509715  NA                    NA                 NA                 NA                 NA                 NA                 0.163577574260359  0.275437296874114
-       11247763   1237  0.856692  0.78175   6.99963946833027e-05  0.333791076895669  0.163349235419537  0.261334462945287  0.182273151757048  0.615995603296571  0.334134847663281  0.51919707427275
-...
-</pre>
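The likelihood ratio test behind these P-values can be sketched in a few lines of Python. This is an illustration, not a port of getPvalues.R. It assumes that the llh columns of the '''.res''' file hold negative log-likelihoods (the null model M5 has the largest value in the example rows), and the function names <code>chi2_sf</code> and <code>lrt_pvalue</code> are hypothetical. Only 1 and 2 degrees of freedom are handled, which covers the additive comparisons listed above.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution for df = 1 or 2."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only supports df = 1 or df = 2")


def lrt_pvalue(nll_full, nll_null, df):
    """Likelihood ratio test from two negative log-likelihoods.

    nll_full: value for the model with more free parameters
    nll_null: value for the nested model with fewer free parameters
    Returns NaN when either likelihood is missing (nan in the .res file).
    """
    if math.isnan(nll_full) or math.isnan(nll_null):
        return float("nan")
    # Test statistic; clamp tiny negative values caused by numerical noise.
    stat = max(0.0, 2.0 * (nll_null - nll_full))
    return chi2_sf(stat, df)


if __name__ == "__main__":
    # First example row of the additive .res output: llh(M1) and llh(M5).
    print(lrt_pvalue(3242.214834, 3243.115326, df=2))
```

With the first example row, the M1 versus M5 comparison (2 degrees of freedom) gives approximately 0.406, in line with the M1vM5 column above. For other degrees of freedom, <code>scipy.stats.chi2.sf</code> could replace <code>chi2_sf</code>.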
-=Models=
-asaMap implements a range of linear models, making it possible to test specific hypotheses.
-For the additive model there are 6 different models:
-{| class="wikitable"
-|-
-! scope="col"| Model
-! scope="col"| Parameters
-! scope="col"| Notes
-! scope="col"| Effect Parameters
-|-
-| M0
-| (beta_1, beta_2, delta_1) in R^3
-| effect of non-assumed effect allele
-| 1
-|-
-| M1
-| (beta_1, beta_2) in R^2
-| population specific effects
-| 2
-|-
-| M2
-| beta_1=0, beta_2 in R
-| no effect in population 1
-| 1
-|-
-| M3
-| beta_1 in R, beta_2=0
-| no effect in population 2
-| 1
-|-
-| M4
-| beta_1=beta_2 in R
-| same effect in both populations
-| 1
-|-
-| M5
-| beta_1=beta_2=0
-| no effect in any population
-| 0
-|}
-For the recessive model there are 8 different models:
-{| class="wikitable"
-|-
-! scope="col"| Model
-! scope="col"| Parameters
-! scope="col"| Notes
-! scope="col"| Effect Parameters
-|-
-| R0
-| (beta_1, beta_m, beta_2, delta_1, delta_2) in R^5
-| recessive effect of non-assumed effect alleles
-| 2
-|-
-| R1
-| (beta_1, beta_m, beta_2) in R^3
-| population specific effects
-| 3
-|-
-| R2
-| beta_1 in R, beta_m=beta_2 in R
-| same effect when one or both variant alleles are from pop 2
-| 2
-|-
-| R3
-| beta_1=beta_m in R, beta_2 in R
-| same effect when one or both variant alleles are from pop 1
-| 2
-|-
-| R4
-| beta_1 in R, beta_m=beta_2=0
-| only an effect when both variant alleles are from pop 1
-| 1
-|-
-| R5
-| beta_1=beta_m=0, beta_2 in R
-| only an effect when both variant alleles are from pop 2
-| 1
-|-
-| R6
-| beta_1=beta_m=beta_2 in R
-| same effect regardless of ancestry
-| 1
-|-
-| R7
-| beta_1=beta_m=beta_2=0
-| no effect in any population
-| 0
-|}
-'''beta_1''' and '''beta_2''' are the effects of the assumed effect allele in population 1 and population 2, respectively. '''beta_m''' is the recessive effect of carrying two copies of the assumed effect allele, one from population 1 and one from population 2. '''delta_1''' and '''delta_2''' are the effects of the assumed non-effect allele in population 1 and population 2, respectively.
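When two of these models are compared, the degrees of freedom can be read off the Parameters column as the difference in the number of free parameters. The Python sketch below encodes that column for both tables. It assumes this difference is the appropriate degrees of freedom, it does not check that the two models are actually nested, and the names <code>FREE_PARAMETERS</code> and <code>comparison_df</code> are hypothetical.

```python
# Number of free parameters per model, taken from the Parameters column
# of the two tables above (dimension of the parameter space).
FREE_PARAMETERS = {
    "M0": 3, "M1": 2, "M2": 1, "M3": 1, "M4": 1, "M5": 0,
    "R0": 5, "R1": 3, "R2": 2, "R3": 2, "R4": 1, "R5": 1, "R6": 1, "R7": 0,
}


def comparison_df(label):
    """Degrees of freedom for a comparison label such as 'M1vM5'.

    The label format follows the column names of the .Pvalues file:
    larger model first, then 'v', then the smaller model.
    """
    full, null = label.split("v")
    df = FREE_PARAMETERS[full] - FREE_PARAMETERS[null]
    if df <= 0:
        raise ValueError(f"{full} does not have more free parameters than {null}")
    return df


if __name__ == "__main__":
    for label in ["M0vM1", "M1vM5", "M1vM4", "R1vR7", "R6vR7"]:
        print(label, comparison_df(label))
```

The caller remains responsible for choosing nested pairs. For example, R2 and R5 differ by one free parameter, but R5 is not a special case of R2.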
-=Citation=

AsaMap: Difference between revisions

Latest revision as of 09:32, 24 March 2026
