ANGSD: Analysis of next generation Sequencing Data

Latest tar.gz version is (0.938/0.939 on github), see Change_log for changes, and download it here.

HWE test: Difference between revisions

Revision as of 14:50, 18 July 2017

Test for Hardy-Weinberg equilibrium (HWE) based on genotype likelihoods. This class both acts as a filter for all other classes and writes its results to a file.


This function has been updated to allow for all kinds of deviations from HWE, not just F>0. This approach works from version 0.912 onwards and in the latest development version on GitHub.


If you want to estimate inbreeding for individuals or include inbreeding information in your analysis try HWE_and_Inbreeding_estimates.

-doHWE [int]

Estimate the deviation from HWE for each site.

-doMajorMinor [int]

The method only works for diallelic sites, so you must choose a method for selecting the major and minor allele (see Inferring_Major_and_Minor_alleles).


Example

angsd -bam bam.filelist -doHWE 1 -doMajorMinor 1 -GL 1

Use as a filter

See [snpFilters].

Output

This function also prints the results for the selected sites.

Example of output *.hwe.gz

Chromo  Position        Major   Minor   hweFreq Freq    F       LRT     p-value
1       14000873        G       A       0.282473        0.263594        0.674624        3.140936e+00    7.634997e-02
1       14015890        A       G       0.283119        0.300032        0.999762        8.207572e+00    4.171594e-03
1       14018430        A       C       0.276112        0.299817        0.675018        2.780118e+00    9.544113e-02
1       14033343        A       G       0.295368        0.299442        0.999762        6.473824e+00    1.094747e-02
1       14037881        T       A       0.306003        0.341598        -0.518384       3.178415e+00    7.461710e-02
1       14038946        T       C       0.329113        0.333424        0.999775        6.925424e+00    8.497884e-03
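The table above is whitespace-separated with a single header row, so it can be read with a few lines of Python. The sketch below is only an illustration: the helper names `parse_hwe` and `read_hwe_gz` and the 0.05 cut-off are hypothetical and not part of ANGSD.

```python
import gzip

# Columns that hold floating-point values in the *.hwe.gz output.
NUMERIC_COLUMNS = ("hweFreq", "Freq", "F", "LRT", "p-value")


def parse_hwe(lines):
    """Parse whitespace-separated HWE output lines into a list of dicts."""
    rows = []
    header = None
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if header is None:
            header = fields  # first non-empty line is the header
            continue
        row = dict(zip(header, fields))
        row["Position"] = int(row["Position"])
        for key in NUMERIC_COLUMNS:
            row[key] = float(row[key])
        rows.append(row)
    return rows


def read_hwe_gz(path):
    """Read a gzip-compressed HWE output file (hypothetical helper)."""
    with gzip.open(path, "rt") as handle:
        return parse_hwe(handle)


# Two rows copied from the example output above.
sample = """Chromo  Position  Major  Minor  hweFreq  Freq  F  LRT  p-value
1  14000873  G  A  0.282473  0.263594  0.674624  3.140936e+00  7.634997e-02
1  14015890  A  G  0.283119  0.300032  0.999762  8.207572e+00  4.171594e-03
"""

rows = parse_hwe(sample.splitlines())
# Keep sites whose p-value falls below an illustrative 0.05 cut-off.
significant = [row for row in rows if row["p-value"] < 0.05]
print([row["Position"] for row in significant])  # prints [14015890]
```

For a real run, `read_hwe_gz("out.hwe.gz")` would return the same structure, where the file name depends on the output prefix you chose.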


Chromo is the chromosome

Position is the position

Major is the major allele

Minor is the minor allele

hweFreq is the allele frequency assuming HWE (same as -doMaf 1)

Freq is the allele frequency without HWE assumption

F is the scaled departure from HWE (inbreeding coefficient - see model)

LRT is the likelihood ratio statistic

p-value is the p-value based on a likelihood ratio test
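The p-values in the example output appear to be consistent with comparing the LRT statistic to a chi-square distribution with one degree of freedom. That reading is inferred from the example numbers and is not stated in the text, so treat the sketch below as a plausibility check, not as a description of the ANGSD code. The function name `lrt_pvalue` is hypothetical.

```python
import math


def lrt_pvalue(lrt):
    """Upper-tail probability of a chi-square variable with 1 df at `lrt`.

    For one degree of freedom the survival function reduces to
    erfc(sqrt(x / 2)), so no external library is needed.
    """
    if lrt <= 0.0:
        return 1.0
    return math.erfc(math.sqrt(lrt / 2.0))


# LRT values taken from the first two rows of the example output.
for lrt in (3.140936, 8.207572):
    print(f"LRT={lrt:.6f}  p={lrt_pvalue(lrt):.6e}")
```

The printed values should be close to the 7.63e-02 and 4.17e-03 reported for those two sites.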

Model

Probability of genotypes without assumption of HWE

n: total number of individuals
X: all sequencing data for a site
f: allele frequency
F: inbreeding coefficient*
G: true unobserved genotype
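
Written with the symbols above, a common inbreeding parameterisation of the genotype probabilities is sketched below. It assumes that G counts copies of the minor allele and that f is the minor allele frequency. Both conventions are assumptions made for this illustration, not something the text states.

```latex
\begin{aligned}
P(G=0 \mid f, F) &= (1-f)^2 + f(1-f)F \\
P(G=1 \mid f, F) &= 2f(1-f)(1-F) \\
P(G=2 \mid f, F) &= f^2 + f(1-f)F
\end{aligned}
```

Setting F = 0 recovers the usual HWE proportions, and the three probabilities sum to 1 for any F.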

total likelihood
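
Assuming individuals are independent and G takes the values 0, 1 and 2 (copies of the minor allele), the likelihood of the site data can be sketched as a product over individuals of a sum over the unobserved genotype. The symbol X_i, the sequencing data for individual i, is notation introduced here for illustration.

```latex
L(f, F) = P(X \mid f, F)
        = \prod_{i=1}^{n} \sum_{G \in \{0,1,2\}} P(X_i \mid G)\, P(G \mid f, F)
```

Here P(X_i | G) is the genotype likelihood of individual i and P(G | f, F) is the genotype probability without the HWE assumption.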


  • NB! We allow negative values of F in order to detect any deviation from HWE.
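
To show why negative values of F matter, here is a minimal Python sketch of a site likelihood under a common inbreeding parameterisation. The parameterisation, the function names and the toy genotype likelihoods are all assumptions chosen for illustration. The code does not reproduce ANGSD's implementation and does no optimisation over f and F.

```python
import math


def genotype_probs(f, F):
    """Return P(G=0), P(G=1), P(G=2), where G counts minor alleles.

    Assumed parameterisation: HWE proportions shifted by f(1-f)F.
    """
    hom_major = (1.0 - f) ** 2 + f * (1.0 - f) * F
    het = 2.0 * f * (1.0 - f) * (1.0 - F)
    hom_minor = f ** 2 + f * (1.0 - f) * F
    return hom_major, het, hom_minor


def site_log_likelihood(genotype_likelihoods, f, F):
    """Sum over individuals of log( sum_G P(X_i | G) * P(G | f, F) )."""
    priors = genotype_probs(f, F)
    total = 0.0
    for gl in genotype_likelihoods:
        total += math.log(sum(l * p for l, p in zip(gl, priors)))
    return total


# Toy data: ten individuals whose reads all strongly favour the heterozygote.
toy_gls = [(0.01, 0.98, 0.01)] * 10

ll_hwe = site_log_likelihood(toy_gls, f=0.5, F=0.0)       # HWE
ll_excess = site_log_likelihood(toy_gls, f=0.5, F=-0.5)   # heterozygote excess
print(ll_hwe < ll_excess)  # prints True
```

With an excess of heterozygotes the likelihood is higher at a negative F than at F = 0, a deviation that a model restricted to F > 0 could not capture.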