ANGSD: Analysis of next generation Sequencing Data

Latest tar.gz version is (0.938/0.939 on github), see Change_log for changes, and download it here.

ErrorSykMethod

Revision as of 16:12, 26 February 2014 by Albrecht

Method 1

The method is described in kim2011 and works by simultaneously estimating allele frequencies, genotype likelihoods and the error rates.


The likelihood of the sequencing data $D$ of $n$ individuals for $M$ sites can be described through the allele frequencies $f$ and the type specific error rates $e$

$$
\begin{aligned}
p(D|f,e) &= \prod_{i=1}^n \prod_{j=1}^M p(D_j^i|f_j,e)\\
&= \prod_{i=1}^n \prod_{j=1}^M \sum_{g=0}^2 p(g|f_j)\,p(D_j^i|g,e)
\end{aligned}
$$

by summing over the unknown genotypes $g$. The genotype likelihood $p(D_j^i|g,e)$ relies on the type specific error rates (see kim2011 p.14 for details). The type specific error rates are obtained alongside the allele frequencies by

$$
\hat{e},\hat{f} = \operatorname{argmax}_{f,e}\, p(D|f,e)
$$
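
The joint maximization above can be illustrated with a small Python sketch. It is not the ANGSD implementation and it does not reproduce the kim2011 error model. It rests on three simplifying assumptions made only for illustration: the genotype prior $p(g|f_j)$ follows Hardy-Weinberg proportions, the type specific error rates are collapsed into one symmetric error rate `e`, and the data for individual $i$ at site $j$ are reduced to a pair of read counts (minor allele reads, total reads). The function names (`genotype_likelihood`, `site_log_lik`, `joint_mle`) and the grid search are hypothetical choices made for this example.

```python
import math


def genotype_likelihood(minor_reads, total_reads, g, e):
    """p(D_j^i | g, e) under a toy model with a single symmetric error rate e.

    Assumption: this stands in for the type specific error model of kim2011.
    """
    # Probability that one read shows the minor allele given genotype g (0, 1 or 2).
    q = (g / 2.0) * (1.0 - e) + (1.0 - g / 2.0) * e
    return q ** minor_reads * (1.0 - q) ** (total_reads - minor_reads)


def site_log_lik(f, e, site_counts):
    """sum_i log sum_g p(g | f) p(D_j^i | g, e) for a single site j."""
    # Hardy-Weinberg genotype prior (an assumption of this sketch).
    prior = [(1.0 - f) ** 2, 2.0 * f * (1.0 - f), f ** 2]
    total = 0.0
    for minor_reads, total_reads in site_counts:
        lik = sum(
            prior[g] * genotype_likelihood(minor_reads, total_reads, g, e)
            for g in range(3)
        )
        total += math.log(lik)
    return total


def joint_mle(counts, f_grid, e_grid):
    """Grid search for (e_hat, f_hat) = argmax_{f,e} p(D | f, e).

    counts[j] is a list of (minor_reads, total_reads) pairs, one per individual.
    Every value in e_grid must lie strictly between 0 and 0.5.
    Returns (e_hat, f_hat, maximized log-likelihood).
    """
    best_ll, best_e, best_f = -math.inf, None, None
    for e in e_grid:
        # For a fixed e the sites are independent, so each f_j is maximized on its own.
        f_hat = [
            max(f_grid, key=lambda f: site_log_lik(f, e, site_counts))
            for site_counts in counts
        ]
        log_lik = sum(
            site_log_lik(f, e, site_counts)
            for f, site_counts in zip(f_hat, counts)
        )
        if log_lik > best_ll:
            best_ll, best_e, best_f = log_lik, e, f_hat
    return best_e, best_f, best_ll


if __name__ == "__main__":
    # Two sites, three individuals each: (minor allele reads, total reads).
    counts = [
        [(0, 20), (10, 20), (20, 20)],
        [(0, 15), (0, 12), (1, 18)],
    ]
    f_grid = [k / 20 for k in range(21)]
    e_grid = [0.001, 0.005, 0.01, 0.05, 0.1]
    e_hat, f_hat, log_lik = joint_mle(counts, f_grid, e_grid)
    print("e_hat =", e_hat, "f_hat =", f_hat, "log-likelihood =", log_lik)
```

A grid search is used here only because it makes the argmax explicit. The error rate is shared across all sites while each $f_j$ is site specific, which is why the inner maximization can be done per site for every candidate `e`.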