ErrorSykMethod

The method is described in kim2011 and works by simultaneously estimating allele frequencies, genotype likelihoods, and error rates.

The likelihood of the sequencing data $D$ for $n$ individuals at $M$ sites can be written in terms of the allele frequencies $f$ and the type-specific error rates $e$

${\begin{aligned}p(D|f,e)&=\prod _{i=1}^{n}\prod _{j=1}^{M}p(D_{j}^{i}|f_{j},e)\\&=\prod _{i=1}^{n}\prod _{j=1}^{M}\sum _{g=0}^{2}p(g|f_{j})p(D_{j}^{i}|g,e)\end{aligned}}$

by summing over the unknown genotypes $g$. The genotype likelihood $p(D_{j}^{i}|g,e)$ depends on the type-specific error rates (see kim2011 p.14 for details). The type-specific error rates are obtained alongside the allele frequencies by

${\hat {e}},{\hat {f}}={\underset {f,e}{\operatorname {argmax} }}\,p(D|f,e)$
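
The following is a minimal Python sketch of the likelihood and its joint maximization, intended only to illustrate the structure of the equations above. It is not the implementation described in kim2011. Several simplifications are assumptions of this sketch: the genotype prior $p(g|f_{j})$ is taken to follow Hardy-Weinberg proportions, a single symmetric error rate `e` stands in for the type-specific error rates, the data for each individual and site are reduced to a pair of read counts (major allele, minor allele), and the maximization is a plain grid search. All function names are hypothetical.

```python
import math

def genotype_prior(g, f):
    """p(g | f) under Hardy-Weinberg proportions (an assumption of this sketch)."""
    return ((1 - f) ** 2, 2 * f * (1 - f), f ** 2)[g]

def genotype_likelihood(counts, g, e):
    """p(D | g, e) for one individual at one site, up to a constant factor.

    counts = (a, b): number of reads showing the major and the minor allele.
    A single symmetric error rate e replaces the type-specific rates.
    """
    a, b = counts
    # Probability that a single read shows the minor allele given genotype g.
    p_minor = (g / 2) * (1 - e) + (1 - g / 2) * e
    return (1 - p_minor) ** a * p_minor ** b

def log_likelihood(data, f, e):
    """log p(D | f, e); data[i][j] holds the counts for individual i at site j."""
    total = 0.0
    for individual in data:
        for j, counts in enumerate(individual):
            # Sum over the unknown genotypes g = 0, 1, 2.
            site = sum(genotype_prior(g, f[j]) * genotype_likelihood(counts, g, e)
                       for g in range(3))
            total += math.log(site)
    return total

def fit(data, f_grid, e_grid):
    """Grid-search argmax over (f, e); for a fixed e the sites separate."""
    n_sites = len(data[0])
    best = None
    for e in e_grid:
        f_hat, ll = [], 0.0
        for j in range(n_sites):
            column = [[ind[j]] for ind in data]  # all individuals, site j only
            site_ll, site_f = max((log_likelihood(column, [f], e), f)
                                  for f in f_grid)
            f_hat.append(site_f)
            ll += site_ll
        if best is None or ll > best[0]:
            best = (ll, f_hat, e)
    return best  # (log-likelihood, list of f_j estimates, error-rate estimate)

# Toy data: 2 individuals, 2 sites, (major, minor) read counts.
data = [[(10, 0), (5, 5)],
        [(9, 1), (0, 10)]]
f_grid = [k / 20 for k in range(1, 20)]  # interior points avoid log(0)
e_grid = [0.001, 0.01, 0.05, 0.1]
ll, f_hat, e_hat = fit(data, f_grid, e_grid)
```

Because the error rate is shared across all sites while each $f_{j}$ only enters the terms for site $j$, the sketch maximizes each site's frequency separately for every candidate error rate and then picks the error rate with the highest total. A real implementation would use a proper optimizer and the full type-specific error model.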
