ANGSD: Analysis of next generation Sequencing Data

The latest tar.gz version is (0.938/0.939 on GitHub); see Change_log for changes.

Latest version is 0.441, from September 25, 2012.





Revision as of 03:34, 25 September 2012



About

ANGSD is software for analyzing next-generation sequencing data. It can handle a number of different input types, from mapped reads to imputed genotype probabilities. All methods take genotype uncertainty into account instead of basing the analysis on called genotypes, which is especially useful for low- and medium-depth data. The software is written in C++ and can handle thousands of samples.
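
To illustrate what taking genotype uncertainty into account can mean in practice, the sketch below contrasts a called genotype with an expected allele count computed from genotype probabilities. It is a generic illustration and not ANGSD's implementation; the input format (one individual per line, with three probabilities for carrying 0, 1, or 2 copies of an allele) and both function names are hypothetical.

```shell
# Hypothetical helpers, not part of ANGSD.
# Input on stdin: one individual per line, three whitespace-separated
# probabilities for genotypes carrying 0, 1, and 2 copies of an allele.

# Hard call: print the index (0, 1, or 2) of the most probable genotype.
called_genotype() {
    awk '{
        best = 0
        if ($2 > $1) best = 1
        if ($3 > $(best + 1)) best = 2
        print best
    }'
}

# Soft estimate: print the expected number of allele copies,
# 0*P(0) + 1*P(1) + 2*P(2), which keeps the uncertainty.
expected_dosage() {
    awk '{ printf "%.2f\n", $2 + 2 * $3 }'
}
```

For an individual with probabilities 0.5, 0.4, and 0.1, `called_genotype` prints 0 while `expected_dosage` prints 0.60, so the hard call discards the substantial chance that the allele is present.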

Overview of input and intermediary data

The input and intermediary data structures of ANGSD.

<classdiagram type="dir:LR"> [sequence data]->[genotype;likelihoods] [genotype;likelihoods]->[genotype;probabilities] [sequence files|bam files;SOAP files{bg:orange}]->[sequence data] [glf files|glfv3;soapSNP{bg:orange}]->[genotype;likelihoods] [genotype prob|beagle output{bg:orange}]->[genotype;probabilities] </classdiagram>

Analysis from sequencing data

<classdiagram> // [input|bam files;SOAP files{bg:orange}]->[sequence data]

[sequence data]->[output|summary stats;phat estimates;error estimates{bg:blue}]
</classdiagram>

Analysis from genotype likelihoods

<classdiagram> //[input data|glf files{bg:orange}]->[genotype;likelihoods] [genotype;likelihoods]->[output|glf files;beagle files;MAF estimates;MAF associations;SNP Calling;realSFS;error estimates;Inbreeding{bg:blue}]

</classdiagram>


Analysis from genotype probabilities

<classdiagram> //[input data|beagle output{bg:orange}]->[genotype;probabilities] [genotype;probabilities]->[output|genotype calling;MAF estimates;associations{bg:blue}]

</classdiagram>

Synopsis

./angsd [OPTIONS]

Example: estimating allele frequencies from genotype likelihoods, with BAM files as input and 10 threads:

./angsd -out outFileName -bam bam.filelist -GL 1 -doMaf 2 -doMajorMinor 1 -nThreads 10
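
The `bam.filelist` argument in the command above is taken here to be a plain text file with one BAM path per line. A minimal sketch for building such a list follows; the `make_bam_filelist` helper and the `./bams` directory are hypothetical names used for illustration and are not part of ANGSD.

```shell
# Hypothetical helper, not part of ANGSD.
# Assumes the list format is one BAM path per line.
#   $1: directory holding the .bam files
#   $2: list file to create, e.g. bam.filelist
make_bam_filelist() {
    for f in "$1"/*.bam; do
        # Skip the literal pattern when no .bam file matches the glob.
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done > "$2"
}

# Example usage with a hypothetical ./bams directory:
# make_bam_filelist ./bams bam.filelist
# ./angsd -out outFileName -bam bam.filelist -GL 1 -doMaf 2 -doMajorMinor 1 -nThreads 10
```

Writing the list from a loop rather than by hand keeps it in step with the directory contents, and an empty directory simply yields an empty list.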