ANGSD: Analysis of next generation Sequencing Data

Latest tar.gz version is (0.938/0.939 on github), see Change_log for changes, and download it here.

SnpFilters: Difference between revisions

The HWE statistic requires -dohwe, which in turn depends on -domaf.


Revision as of 19:13, 24 November 2016

ANGSD provides several SNP filters and per-site SNP statistics:

  • SB1 strand bias1
  • SB2 strand bias2
  • SB3 strand bias3
  • deviation from Hardy-Weinberg equilibrium (HWE)
  • Wilcoxon rank-sum test for base quality score bias


The three strand bias statistics are described here: http://www.biomedcentral.com/1471-2164/13/666
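
As a rough illustration, the sketch below computes three strand bias statistics from the four per-strand allele counts (+Major, +Minor, -Major, -Minor). The formulas are an assumption: they were chosen because they reproduce the SB1:SB2:SB3 values of the first row of the Example Output on this page (counts 45 0 22 4), and they are not taken from the ANGSD source code. The function names are hypothetical, and sites where a denominator would be zero are not handled.

```python
from math import comb


def sb1(a, b, c, d):
    """Signed difference in minor-allele fraction between strands,
    scaled by the overall minor-allele fraction (assumed form)."""
    total = a + b + c + d
    return (b / (a + b) - d / (c + d)) / ((b + d) / total)


def sb2(a, b, c, d):
    """Larger of two cross-strand products, scaled by the overall
    major-allele fraction (assumed form)."""
    total = a + b + c + d
    major_frac = (a + c) / total
    forward = (b / (a + b)) * (c / (c + d)) / major_frac
    reverse = (d / (c + d)) * (a / (a + b)) / major_frac
    return max(forward, reverse)


def sb3(a, b, c, d):
    """Two-sided Fisher exact test p-value for the 2x2 table
    [[a, b], [c, d]] (assumed form)."""
    plus, minus, minor = a + b, c + d, b + d
    denom = comb(plus + minus, minor)

    def prob(k):
        # Probability of k minor-allele reads on the plus strand,
        # given fixed row and column totals (hypergeometric).
        return comb(plus, k) * comb(minus, minor - k) / denom

    p_obs = prob(b)
    lo, hi = max(0, minor - minus), min(plus, minor)
    # Sum all tables that are at most as probable as the observed one.
    p = sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


# a = +Major, b = +Minor, c = -Major, d = -Minor
counts = (45, 0, 22, 4)
print(f"{sb1(*counts):.6f}:{sb2(*counts):.6f}:{sb3(*counts):.6f}")
```

If the assumed formulas hold, the printed triple should agree with the SB1:SB2:SB3 column of the corresponding row in PREFIX.snpStat.gz.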

The deviation from HWE is described in http://www.ncbi.nlm.nih.gov/pubmed/23950147

The Wilcoxon rank-sum test is not described anywhere.
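
The sketch below shows how the reported p-values appear to relate to their test statistics. It assumes that HWE_pval is the upper tail of a chi-square distribution with one degree of freedom evaluated at HWE_LRT, and that baseQ_pval is a two-sided normal p-value for baseQ_Z. Both relations are inferred from the Example Output on this page and are not confirmed against the ANGSD source; the function names are hypothetical.

```python
from math import erfc, sqrt


def hwe_pval_from_lrt(lrt):
    """Chi-square (1 df) upper-tail probability for a likelihood ratio
    statistic. Slightly negative values are clamped to zero."""
    return erfc(sqrt(max(lrt, 0.0) / 2.0))


def baseq_pval_from_z(z):
    """Two-sided p-value for a standard normal Z score."""
    return erfc(abs(z) / sqrt(2.0))


# Statistics taken from two rows of the Example Output on this page.
print(f"{hwe_pval_from_lrt(6.478374):.6e}")
print(f"{baseq_pval_from_z(-0.357063):.6e}")
```

Only the standard library is used, so the conversion can be checked without installing a statistics package.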

These statistics are calculated and written to a file called PREFIX.snpStat.gz.

./angsd -b list -domaf 1 -domajorminor 1 -gl 1 -snp_pval 1e-2  -P 5 -dosnpstat 1 

Please note that -doSnpStat 1 does not filter out sites; it only reports statistics. The command above therefore limits the analysis and output to sites that are likely to be truly variable (-snp_pval 1e-2). HWE deviations are calculated for all sites, but no sites are filtered on them (-hwe_pval 1).

Example Output

Chromo  Position        +Major +Minor -Major -Minor     SB1:SB2:SB3     HWE_LRT:HWE_pval        baseQ_Z:baseQ_pval
1       14000023        45 0 22 4       -2.730769:0.163031:0.015386     0.060143:8.062706e-01   -1.882799:5.972750e-02
1       14000072        58 0 43 1       -2.318182:0.022952:0.431373     -0.000006:1.000000e+00  -1.647226:9.951167e-02
1       14000202        33 0 24 15      -1.846154:0.485830:0.000023     -0.000021:1.000000e+00  -2.114540:3.446902e-02
1       14000873        41 20 56 21     0.185598:0.339238:0.574272      1.973686:1.600571e-01   -3.496682:4.711723e-04
1       14001018        37 14 32 11     0.070296:0.278303:1.000000      1.759127:1.847334e-01   -3.037824:2.383068e-03
1       14001501        80 1 66 1       -0.190897:0.014943:1.000000     -0.000002:1.000000e+00  -0.357063:7.210450e-01
1       14001867        46 21 52 13     0.440386:0.337740:0.165207      0.288166:5.913983e-01   -1.961795:4.978620e-02
1       14002342        52 1 53 3       -0.945670:0.054563:0.618547     0.659996:4.165614e-01   -0.161165:8.719638e-01
1       14002422        41 17 29 20     -0.332741:0.441037:0.228091     6.478374:1.091948e-02   -0.822001:4.110760e-01
1       14002474        66 6 46 5       -0.164439:0.098696:0.761125     -0.000012:1.000000e+00  -1.763711:7.778050e-02
1       14002970        47 0 50 4       -1.870370:0.077129:0.121143     -0.000094:1.000000e+00  -2.411706:1.587805e-02
1       14003581        59 22 53 18     0.068718:0.275157:0.854787      0.870476:3.508235e-01   -1.033482:3.013785e-01
1       14004473        57 2 59 1       0.683522:0.034195:0.618617      -0.000022:1.000000e+00  -1.067950:2.855431e-01
1       14004623        57 21 56 34     -0.331562:0.410438:0.142272     0.616781:4.322460e-01   -1.788061:7.376604e-02
1       14005069        73 4 77 1       1.212954:0.052991:0.209619      -0.000002:1.000000e+00  -1.002612:3.160481e-01
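
Because -doSnpStat only reports statistics, any filtering on them has to be done afterwards. The following is a minimal sketch of parsing rows in the layout shown above and applying thresholds after the fact. The names SnpStat, parse_snpstat_line, iter_snpstats and passes are hypothetical, as are the 0.05 cutoffs, and the nine-field layout is assumed from the example rows only.

```python
from typing import Iterable, Iterator, NamedTuple, Tuple


class SnpStat(NamedTuple):
    chrom: str
    pos: int
    counts: Tuple[int, int, int, int]  # +Major, +Minor, -Major, -Minor
    sb: Tuple[float, float, float]     # SB1, SB2, SB3
    hwe_lrt: float
    hwe_pval: float
    baseq_z: float
    baseq_pval: float


def parse_snpstat_line(line: str) -> SnpStat:
    """Parse one whitespace-separated data row; compound columns use ':'."""
    tok = line.split()
    if len(tok) != 9:
        raise ValueError(f"expected 9 fields, got {len(tok)}")
    counts = tuple(int(x) for x in tok[2:6])
    sb = tuple(float(x) for x in tok[6].split(":"))
    hwe_lrt, hwe_pval = (float(x) for x in tok[7].split(":"))
    baseq_z, baseq_pval = (float(x) for x in tok[8].split(":"))
    return SnpStat(tok[0], int(tok[1]), counts, sb,
                   hwe_lrt, hwe_pval, baseq_z, baseq_pval)


def iter_snpstats(lines: Iterable[str]) -> Iterator[SnpStat]:
    """Yield parsed rows, skipping blank lines and the header line."""
    for line in lines:
        tok = line.split()
        if len(tok) < 2 or not tok[1].isdigit():
            continue
        yield parse_snpstat_line(line)


def passes(rec: SnpStat, min_hwe_pval: float = 0.05,
           min_baseq_pval: float = 0.05) -> bool:
    """Hypothetical post hoc filter: keep sites whose p-values are not small."""
    return rec.hwe_pval >= min_hwe_pval and rec.baseq_pval >= min_baseq_pval


rows = [
    "1 14000023 45 0 22 4 -2.730769:0.163031:0.015386 0.060143:8.062706e-01 -1.882799:5.972750e-02",
    "1 14002422 41 17 29 20 -0.332741:0.441037:0.228091 6.478374:1.091948e-02 -0.822001:4.110760e-01",
]
for rec in iter_snpstats(rows):
    print(rec.chrom, rec.pos, "keep" if passes(rec) else "drop")
```

To read the real file, the same iterator can be fed from `gzip.open(path, "rt")`, which yields text lines from the compressed file.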

Source Code

The source code can be found here: https://github.com/ANGSD/angsd/blob/master/abcFilterSNP.cpp

NB: Please use the latest development version for these options.