NgsAdmix: Difference between revisions

(Replaced content with "NGSadmix is a tool for estimating individual admixture proportions low depth sequencing data based on genotype likelihoods The software including tutorials can be found here https://github.com/aalbrechtsen/NGSadmix")
Tag: Replaced
 
(102 intermediate revisions by 4 users not shown)
NGSadmix is a tool for estimating individual admixture proportions from low-depth sequencing data. It is based on genotype likelihoods or genotype probabilities.
It is a multithreaded C/C++ program.


=Installation=
The software, including tutorials, can be found at https://github.com/aalbrechtsen/NGSadmix
<pre>
wget popgen.dk/software/NGSadmix/ngsadmix32.cpp
g++ ngsadmix32.cpp -O3 -lpthread -lz -o NGSadmix
</pre>
 
=Input Files=
NGSadmix currently reads the widely used Beagle input files, or Beagle imputed output files [http://faculty.washington.edu/browning/beagle/beagle.html]. We recommend [[ANGSD]] for converting next-generation sequencing data to Beagle format.
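
As a rough sketch of that conversion, the shell function below wraps an ANGSD call that writes genotype likelihoods in Beagle format. The flags come from ANGSD's own documentation rather than from this page, so check them against your installed version. The names <code>bam.filelist</code> and <code>genolike</code> and the SNP p-value cutoff are hypothetical. Setting <code>DRY_RUN</code> prints the command instead of running it.

```shell
# Sketch: produce a Beagle genotype likelihood file with ANGSD.
# Assumptions: angsd is on PATH; $1 is a text file listing BAM paths
# (hypothetical name), $2 is an output prefix (hypothetical name).
make_beagle() {
    bamlist=$1
    prefix=$2
    # ${DRY_RUN:+echo} expands to "echo" only when DRY_RUN is set and non-empty,
    # so the command is printed instead of executed.
    ${DRY_RUN:+echo} angsd -GL 1 -doGlf 2 -doMajorMinor 1 -doMaf 1 \
        -SNP_pval 1e-6 -bam "$bamlist" -out "$prefix"
}
```

For example, <code>DRY_RUN=1 make_beagle bam.filelist genolike</code> only prints the command. Without <code>DRY_RUN</code>, ANGSD is expected to write <code>genolike.beagle.gz</code>, which can then be passed to <code>-likes</code>.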
 
=Options=
<pre>
./NGSadmix
Arguments:
-likes Beagle likelihood filename
-K Number of ancestral populations
Optional:
-fname Ancestral population frequencies
-qname Admixture proportions
-outfiles Prefix for output files
-printInfo print ID and mean maf for the SNPs that were analysed
Setup:
-seed Seed for initial guess in EM
-P Number of threads
-method If 0 no acceleration of EM algorithm
-misTol Tolerance for considering site as missing
Stop chriteria:
-tolLike50 Loglikelihood difference in 50 iterations
-tol Tolerance for convergence
-dymBound Use dymamic boundaries (1: yes (default) 0: no)
-maxiter Maximum number of EM iterations
Filtering
-minMaf Minimum minor allele frequency
-minLrt Minimum likelihood ratio value for maf>0
-minInd Minumum number of informative individuals
 
</pre>
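
A typical invocation combines the two required arguments with a few of the optional ones. The sketch below uses only flags listed in the usage text above. The input file name, the thread count, the MAF filter value and the output prefix pattern are hypothetical choices, not recommendations from the authors.

```shell
# Sketch: run NGSadmix for one value of K and one seed.
# Assumptions: the NGSadmix binary is in the current directory;
# -P 4 and -minMaf 0.05 are illustrative values only.
run_ngsadmix() {
    likes=$1   # Beagle likelihood file (-likes)
    k=$2       # number of ancestral populations (-K)
    seed=$3    # seed for the initial guess in EM (-seed)
    # With DRY_RUN set, the command is printed instead of executed.
    ${DRY_RUN:+echo} ./NGSadmix -likes "$likes" -K "$k" -seed "$seed" \
        -P 4 -minMaf 0.05 -outfiles "run.K${k}.s${seed}"
}
```

Calling <code>run_ngsadmix input.beagle.gz 3 1</code> would write its results under the prefix <code>run.K3.s1</code>. Encoding K and the seed in the prefix keeps repeated runs from overwriting each other.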
=Output Files=
The program outputs three files:

# PREFIX.log
# PREFIX.fopt.gz
# PREFIX.qopt
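
This page does not describe the layout of the output files. Assuming that <code>PREFIX.qopt</code> holds one row per individual with K whitespace-separated admixture proportions, a quick sanity check is that every row sums to 1. The function name and the tolerance below are hypothetical.

```shell
# Sketch: verify that each row of a .qopt file sums to 1.
# Assumption: one row per individual, K whitespace-separated proportions.
check_qopt() {
    awk '{
        s = 0
        for (i = 1; i <= NF; i++) s += $i      # sum the K proportions
        d = s - 1; if (d < 0) d = -d           # absolute deviation from 1
        if (d > 1e-4) { printf "row %d sums to %f\n", NR, s; bad = 1 }
    } END { exit bad + 0 }' "$1"
}
```

The function returns a non-zero exit status and reports the offending rows if any row deviates from 1 by more than the tolerance.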
 
==Log file==
 
Example contents of a log file:
<pre class="mw-collapsible-content">
-> Dumping file: tskSim/tsk6GL.beagle.s1.log
-> Dumping file: tskSim/tsk6GL.beagle.s1.filter
Input: lname=tskSim/tsk6GL.beagle nPop=3, fname=(null) qname=(null) outfiles=tskSim/tsk6GL.beagle.s1
Setup: seed=1 nThreads=10 method=1
Convergence: maxIter=2000 tol=0.000000 tolLike50=0.010000 dymBound=0
Filters: misTol=0.050000 minMaf=0.000000 minLrt=0.000000 minInd=0
Input file has dim: nsites=100000 nind=75
Input file has dim (AFTER filtering): nsites=100000 nind=75
iter[start] like is=9299805.984931
iter[50] like is=-6531138.892608 thres=0.002800
iter[100] like is=-6528710.773349 thres=0.001289
iter[150] like is=-6528405.896951 thres=0.001211
iter[200] like is=-6528306.803820 thres=0.000420
iter[250] like is=-6528277.160993 thres=0.000546
iter[300] like is=-6528271.925055 thres=0.000033
iter[350] like is=-6528271.177692 thres=0.000008
iter[400] like is=-6528270.876315 thres=0.000005
iter[450] like is=-6528270.772894 thres=0.000140
iter[500] like is=-6528270.747721 thres=0.000002
iter[550] like is=-6528270.740654 thres=0.000002
Convergence achived because log likelihooditer difference for 50 iteraction is less than 0.010000
best like=-6528270.740654 after 550 iterations
-> Dumping file: tskSim/tsk6GL.beagle.s1.qopt
-> Dumping file: tskSim/tsk6GL.beagle.s1.fopt.gz
[ALL done] cpu-time used =  671.82 sec
[ALL done] walltime used =  114.00 sec
</pre>
</div>
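
The final log likelihood is useful when comparing runs, for example runs started from different seeds. As a minimal sketch, assuming the log ends with a <code>best like=... after N iterations</code> line as in the example above, the hypothetical helper below extracts both numbers.

```shell
# Sketch: print "<best log likelihood> <iterations>" from an NGSadmix log.
# Relies on the "best like=... after N iterations" line shown in the example log.
best_like() {
    sed -n 's/^best like=\([-0-9.]*\) after \([0-9]*\) iterations.*/\1 \2/p' "$1"
}
```

Applied to the example log, this would print <code>-6528270.740654 550</code>. Among runs with the same K and the same filters, the run with the highest value fits best.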
 
=Change log=
* v32, June 25, 2013: modified the code so that it now compiles on OS X.
* v31, June 24, 2013: first public version.

Latest revision as of 09:28, 24 March 2026

NGSadmix is a tool for estimating individual admixture proportions from low-depth sequencing data, based on genotype likelihoods.

The software, including tutorials, can be found at https://github.com/aalbrechtsen/NGSadmix