AsaMap: Difference between revisions

The program is available and described on github:
=Download=
 
The program can be downloaded from GitHub:


https://github.com/e-jorsboe/asaMap
<pre>
git clone https://github.com/e-jorsboe/asaMap.git
cd asaMap
make
</pre>
So far it has only been tested on Linux systems. If you are on a Mac, use curl for the download.
=Example=
This is an example.
=Input Files=
The input consists of called genotypes in the binary PLINK format (*.bed) [https://www.cog-genomics.org/plink2], together with estimated admixture proportions and population-specific allele frequencies. Both can be estimated with [http://software.genetics.ucla.edu/admixture/ ADMIXTURE], whose .Q file (admixture proportions) and .P file (allele frequencies) can be given directly to asaMap.
A phenotype file also has to be provided. It should be a plain text file with one line for each individual in the .fam file, sorted in the same order:
<pre>
-0.712027291121767
-0.158413122435864
-1.77167888612947
-0.800940619551485
0.3016297021294
...
</pre>
A covariate file can also be provided, where each column is a covariate and each row is an individual. It should NOT contain a column of 1s for the intercept (the intercept is included automatically). This file must have the same number of rows as the phenotype file and the .fam file.
<pre>
0.0127096117618385 -0.0181281029917176 -0.0616739439849275 -0.0304606694443973
0.0109944672768584 -0.0205785925514037 -0.0547523583405743 -0.0208813157640705
0.0128395346453956 -0.0142116856067135 -0.0471689997039534 -0.0266186436009881
0.00816783754598649 -0.0189271733933446 -0.0302259313905976 -0.0222247658768436
0.00695928218989132 -0.0089960963981644 -0.0384886176827146 -0.012649019770168
...
</pre>
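The row-count requirements above can be checked before starting a run. The following is a minimal Python sketch and not part of asaMap; the function names <code>read_rows</code> and <code>check_inputs</code> are hypothetical, and it assumes whitespace-separated plain text files in which empty lines do not count as rows.

```python
def read_rows(path):
    """Return the non-empty lines of a text file, each split on whitespace."""
    with open(path) as handle:
        return [line.split() for line in handle if line.strip()]


def check_inputs(fam_path, pheno_path, cov_path=None):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    n_fam = len(read_rows(fam_path))

    pheno = read_rows(pheno_path)
    if len(pheno) != n_fam:
        problems.append(f"phenotype file has {len(pheno)} rows, .fam has {n_fam}")
    if any(len(row) != 1 for row in pheno):
        problems.append("phenotype file should have exactly one value per line")

    if cov_path is not None:
        cov = read_rows(cov_path)
        if len(cov) != n_fam:
            problems.append(f"covariate file has {len(cov)} rows, .fam has {n_fam}")
        widths = {len(row) for row in cov}
        if len(widths) > 1:
            problems.append("covariate rows have differing numbers of columns")
        elif cov:
            # Flag a column consisting only of 1s, which would duplicate the intercept.
            for j in range(widths.pop()):
                if all(float(row[j]) == 1.0 for row in cov):
                    problems.append(f"covariate column {j + 1} is all 1s (intercept)")
    return problems
```

Calling <code>check_inputs("plinkFile.fam", "pheno.files", "cov.txt")</code> (the covariate file name is a placeholder) returns an empty list when all files agree with the .fam file.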
=Running asaMap=
Example command for running asaMap with covariates included, after first running ADMIXTURE:
<pre>
# run ADMIXTURE with K=2; this writes plinkFile.2.Q and plinkFile.2.P
admixture plinkFile.bed 2
# run asaMap with the admixture proportions; $COV holds the path to the covariate file
./asaMap -p plinkFile -o out -c $COV -y pheno.files -Q plinkFile.2.Q -f plinkFile.2.P
</pre>
This produces a log file, out.log, and a results file, out.res, with the results for each site (after filtering).
The full list of options is printed when asaMap is run without any arguments:
<pre>
./asaMap
</pre>
'''Must be specified:'''
; -p <filename>     
Prefix of the binary PLINK files, i.e. the filename without the .bed/.fam/.bim suffix.
; -o <filename>     
Output filename - a .res file will be written with the results and a .log log file.
; -y <filename>     
Phenotype file; has to be a plain text file with as many rows as the .fam file.
; -Q <filename> (either -a or -Q)     
Admixture proportions, .Q file from ADMIXTURE. Either specify this or -a.
; -a <filename> (either -a or -Q)     
Admixture proportions (for source pop1) - so first column from .Q file from ADMIXTURE. Either specify this or -Q.
; -f <filename>     
Allele frequencies, .P file from ADMIXTURE.
'''Optional:'''
; -c <filename>     
Covariates, plain text file with one column for each covariate and the same number of rows as the .fam file. SHOULD NOT HAVE A COLUMN OF 1s (for the intercept); IT WILL BE ADDED AUTOMATICALLY!
; -m <INT>       
Model: whether an additive or a recessive genotype model should be used (0: additive, 1: recessive - default: 0).
; -l <INT>       
Regression: whether linear or logistic regression should be used. Logistic regression is for binary phenotype data, linear regression is for quantitative phenotype data (0: linear regression, 1: logistic regression - default: 0).
; -b <filename>     
Text file containing a starting guess of the estimated coefficients.
; -i <INT>     
The maximum number of iterations to run for the EM algorithm (default: 80).
; -t <FLOAT>         
Tolerance for change in likelihood between EM iterations for finishing analysis (default: 0.0001).
; -r <INT>         
Seed for generating the starting values of the coefficients.
; -P <INT>           
Number of threads to be used for the analysis. Each thread writes to a temporary file in the path specified by "-o".
; -e <INT>           
Estimate standard error of coefficients (0: no, 1: yes - default: 0).
; -w <INT>           
Run M0/R0 model that models effect of other allele. Analyses are faster without having to run M0/R0. (0: no, 1: yes - default: 1)
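When asaMap is launched from a script, the options above can be assembled programmatically. The following Python sketch is an illustration only; the function <code>build_asamap_command</code> and its parameter names are hypothetical, and it only uses flags listed above. It enforces the rule that exactly one of -Q and -a is given.

```python
def build_asamap_command(plink_prefix, out_prefix, pheno, freqs,
                         q_file=None, a_file=None, cov=None,
                         model=0, regression=0, threads=None,
                         binary="./asaMap"):
    """Assemble an asaMap argument list from the options listed above."""
    # Exactly one of -Q and -a has to be given.
    if (q_file is None) == (a_file is None):
        raise ValueError("specify exactly one of q_file (-Q) or a_file (-a)")
    if model not in (0, 1) or regression not in (0, 1):
        raise ValueError("model (-m) and regression (-l) must be 0 or 1")

    cmd = [binary, "-p", plink_prefix, "-o", out_prefix, "-y", pheno, "-f", freqs]
    cmd += ["-Q", q_file] if q_file is not None else ["-a", a_file]
    if cov is not None:
        cmd += ["-c", cov]
    cmd += ["-m", str(model), "-l", str(regression)]
    if threads is not None:
        cmd += ["-P", str(threads)]
    return cmd
```

The returned list can be passed to <code>subprocess.run(cmd, check=True)</code>; passing a list rather than a single string avoids shell quoting problems with file names.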
=Outputs=
A .res file is produced with the likelihood of each model and the estimated coefficients in each model. Here is an example for the additive model:
<pre>
Chromo  Position  nInd  f1        f2        llh(M0)      llh(M1)      llh(M2)      llh(M3)      llh(M4)      llh(M5)      b1(M1)    b2(M1)    b1(M2)    b2(M3)    b(M4)
1      980552    2737  0.935997  0.937511  3242.099033  3242.214834  3243.033924  3242.812740  3243.019888  3243.115326  0.093018  -0.166907  -0.053931  0.047357  0.020093
1      1068883  2717  0.999990  0.809715  nan          nan          nan          3214.598952  3214.974638  3215.569371  nan        nan        nan        -0.110044  -0.054084
1      1124663  2737  0.886692  0.388175  3234.025418  3241.930891  3242.902363  3242.561728  3242.820387  3243.028131  -0.048894  0.108007  0.045277  -0.030582  -0.016838
1      1171417  2736  0.999990  0.445701  nan          nan          nan          3239.320653  3239.524956  3239.641824  nan        nan        nan        -0.033530  -0.015845
1      1366830  2735  0.999990  0.374078  nan          nan          nan          3241.698019  3241.675158  3241.696793  nan        nan        nan        0.002135  0.007140
1      1450947  2738  0.659605  0.906222  3240.054094  3243.544587  3243.770254  3243.708934  3243.777517  3243.800524  -0.026101  0.044039  0.016671  -0.014242  -0.005544
1      1995211  2737  0.856699  0.982350  3235.516404  3242.070487  3242.928680  3242.571223  3242.756177  3242.941750  0.074805  -0.142018  -0.020892  0.039110  0.021462
1      2004098  2738  0.443711  0.815725  3241.253250  3242.382033  3243.741660  3242.955646  3243.532476  3243.800524  0.058767  -0.055806  -0.016451  0.041228  0.016158
1      2040898  2738  0.676808  0.610463  3242.664546  3243.371593  3243.574375  3243.801527  3243.787426  3243.800524  -0.024109  0.081087  0.047793  -0.001765  0.004108
</pre>
For the recessive model it looks like this:
<pre>
Chromo  Position  nInd  f1        f2        llh(R0)      llh(R1)      llh(R2)      llh(R3)      llh(R4)      llh(R5)      llh(R6)      llh(R7)      b1(R1)    b2(R1)    bm(R1)    b1(R2)    b2m(R2)    b1m(R3)    b2(R3)    b1(R4)    b2(R5)    b(R6)
1      980552    2737  0.935997  0.937511  3236.442376  3241.191367  3242.235364  3241.191468  3243.112239  3241.188747  3242.691370  3243.115326  0.023373  -2.082935  -0.027433  0.016608  -0.582318  0.004700  -2.083112  -0.046849  -2.083275  -0.259338
1      1068883  2717  0.999990  0.809715  nan          nan          nan          nan          3215.162291  3215.133559  3214.502575  3215.569371  nan        nan        nan        nan        nan        nan        nan        -0.529999  -0.721649  -0.438317
1      1124663  2737  0.886692  0.388175  3235.030514  3242.807127  3242.809076  3242.836233  3242.818987  3243.028431  3242.907072  3243.028131  0.064419  -0.047597  -0.004021  0.068119  -0.019760  0.042905  -0.078669  0.060373  -0.018537  0.029227
1      1171417  2736  0.999990  0.445701  nan          nan          nan          nan          3238.750760  3239.274351  3238.288964  3239.641824  nan        nan        nan        nan        nan        nan        nan        -0.210643  -0.267111  -0.144645
1      1366830  2735  0.999990  0.374078  nan          nan          nan          nan          3241.645871  3241.199416  3241.338290  3241.696793  nan        nan        nan        nan        nan        nan        nan        -0.045970  -0.273382  -0.070305
1      1450947  2738  0.659605  0.906222  3240.883715  3242.545834  3243.515375  3243.627600  3243.713843  3243.659336  3243.802228  3243.800524  0.047735  0.291966  -0.216232  0.044591  -0.069851  -0.016796  0.170637  0.032325  0.146528  0.002457
1      1995211  2737  0.856699  0.982350  3234.731598  3241.839632  3241.919398  3241.997812  3242.204980  3242.750902  3242.000261  3242.941750  0.072845  0.113462  0.601882  0.114683  0.366807  0.175891  0.261334  0.209120  0.516155  0.181162
1      2004098  2738  0.443711  0.815725  3238.336234  3238.488951  3241.228881  3243.661958  3242.407555  3243.783839  3243.676693  3243.800524  0.133629  0.236260  -0.298383  0.122912  -0.100454  0.025324  -0.013486  0.097341  0.030391  0.019042
1      2040898  2738  0.676808  0.610463  3241.442146  3242.449918  3242.502684  3243.202847  3243.802047  3243.233496  3243.496321  3243.800524  -0.065485  0.095602  0.207722  -0.057787  0.165752  0.014559  0.205258  0.003543  0.221293  0.037588
</pre>
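For downstream analysis the .res file can be read as a whitespace-delimited table. The following Python sketch is an illustration, not part of asaMap; the function names <code>parse_res</code> and <code>fitted</code> are hypothetical. It assumes the layout shown above: one header line followed by one row per site, with <code>nan</code> where a model has no value (in the sample output this happens at sites where f1 is 0.999990).

```python
import math


def parse_res(text):
    """Parse whitespace-delimited .res content into a list of dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != len(header):
            raise ValueError(f"expected {len(header)} fields, got {len(fields)}")
        row = {}
        for name, value in zip(header, fields):
            if name == "Chromo":
                row[name] = value          # keep chromosome labels as text
            elif name in ("Position", "nInd"):
                row[name] = int(value)
            else:
                row[name] = float(value)   # float("nan") handles the nan entries
        rows.append(row)
    return rows


def fitted(row, column):
    """True if the given column holds a value (not nan) for this site."""
    return not math.isnan(row[column])
```

For a file on disk, <code>parse_res(open("out.res").read())</code> gives one dict per site, keyed by the column names in the header.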
P-values can be obtained with a likelihood ratio test between the two models of interest.
An R script, "getPvalues.R", is provided that makes it easy to obtain P-values from the .res file:
<pre>
Rscript R/getPvalues.R out.res
</pre>
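To make the likelihood ratio test concrete, here is a minimal Python sketch. It is not a port of getPvalues.R, which remains the reference, and it rests on two assumptions: that the llh columns hold negative log-likelihoods (the sample output suggests this, since models with more coefficients have smaller values), and that the degrees of freedom equal the difference in the number of coefficients between the two models. The function name <code>lrt_pvalue</code> is hypothetical, and only 1 and 2 degrees of freedom are covered, because those have closed-form chi-square tail probabilities.

```python
import math


def lrt_pvalue(nll_null, nll_alt, df):
    """Likelihood ratio test P-value from two negative log-likelihoods."""
    if math.isnan(nll_null) or math.isnan(nll_alt):
        return float("nan")
    # Test statistic; clipped at 0 because the EM convergence tolerance can
    # leave the larger model with a marginally worse likelihood.
    stat = max(0.0, 2.0 * (nll_null - nll_alt))
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))   # chi-square survival, 1 df
    if df == 2:
        return math.exp(-stat / 2.0)              # chi-square survival, 2 df
    raise ValueError("this sketch only covers df = 1 and df = 2")
```

For the first site of the additive example, comparing M1 (two coefficient columns, b1 and b2) against M5 (none) with <code>lrt_pvalue(3243.115326, 3242.214834, 2)</code> gives a P-value of roughly 0.41 under these assumptions.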
=Models=
=Citation=

Latest revision as of 09:32, 24 March 2026

The program is available and described on github:

https://github.com/e-jorsboe/asaMap