ANGSD: Analysis of next generation Sequencing Data

Latest tar.gz version is (0.938/0.939 on github), see Change_log for changes, and download it here.

Contamination: Difference between revisions


Revision as of 11:56, 27 June 2014

ANGSD can estimate contamination, but only for chromosomes that exist in a single copy (e.g. chrX in males). The method requires a list of HapMap sites along with their frequencies, and we also recommend discarding regions with low mappability.

We have included mappability and HapMap files for chrX; these are found in the RES subfolder of the ANGSD source package. So if you are working with humans and your sample is male, you can estimate the contamination with the following two commands.

  • First we generate a binary count file for chrX for a single BAM file (ANGSD C program)
  • Then we do a Fisher test to obtain a p-value, and a jackknife to get an estimate of contamination (R program)
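
The two statistical ideas in the second step can be illustrated with a minimal Python sketch. This is not the ANGSD C program or its R script: the function names, the per-site count layout, and the simple "excess mismatch" statistic are all assumptions made for illustration. The sketch assumes that reads carrying a non-consensus base are counted at known polymorphic sites and at nearby non-polymorphic sites, compares the two with a one-sided Fisher exact test, and then uses a delete-one jackknife over sites to attach a standard error to the statistic.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Tests whether row 1 has a higher share of column-1 counts than row 2.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, row1)
    # Sum hypergeometric probabilities for tables at least as extreme as observed.
    return sum(
        comb(col1, x) * comb(n - col1, row1 - x) / denom
        for x in range(a, min(row1, col1) + 1)
    )


def excess_mismatch(sites):
    """Hypothetical statistic: mismatch rate at SNP sites minus rate at flanking sites.

    Each site is (minor_snp, depth_snp, minor_flank, depth_flank).
    """
    minor_snp = sum(s[0] for s in sites)
    depth_snp = sum(s[1] for s in sites)
    minor_flank = sum(s[2] for s in sites)
    depth_flank = sum(s[3] for s in sites)
    return minor_snp / depth_snp - minor_flank / depth_flank


def jackknife(sites, estimator):
    """Delete-one jackknife: returns (bias-corrected estimate, standard error)."""
    n = len(sites)
    full = estimator(sites)
    leave_one_out = [estimator(sites[:i] + sites[i + 1:]) for i in range(n)]
    mean_loo = sum(leave_one_out) / n
    corrected = n * full - (n - 1) * mean_loo
    variance = (n - 1) / n * sum((x - mean_loo) ** 2 for x in leave_one_out)
    return corrected, variance ** 0.5


if __name__ == "__main__":
    # Made-up counts for five sites; real input would come from a count file.
    sites = [(3, 40, 1, 160), (2, 35, 0, 140), (4, 50, 2, 200),
             (1, 30, 1, 120), (3, 45, 1, 180)]
    minor_snp = sum(s[0] for s in sites)
    major_snp = sum(s[1] - s[0] for s in sites)
    minor_flank = sum(s[2] for s in sites)
    major_flank = sum(s[3] - s[2] for s in sites)
    p = fisher_exact_greater(minor_snp, major_snp, minor_flank, major_flank)
    estimate, se = jackknife(sites, excess_mismatch)
    print(f"Fisher p-value: {p:.3g}")
    print(f"excess mismatch rate: {estimate:.4f} (jackknife SE {se:.4f})")
```

The excess mismatch rate here is only a stand-in for a proper contamination estimate, which would also account for the allele frequencies at each site. The jackknife function accepts any estimator over the list of sites, so a different statistic can be swapped in without changing the resampling code.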


An example is found below: