2d SFS Estimation
ANGSD can estimate a 2d site frequency spectrum. This is an extension of the 1d site frequency spectrum method. Newer versions of ANGSD can estimate even higher dimensions (up to 4).
The procedure is best explained by a full example.
- Assume you have 12 BAM files for population 1, listed in the file pop1.list
- Assume you have 14 BAM files for population 2, listed in the file pop2.list
- Assume you have a FASTA file containing the ancestral state, anc.fa
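Before running anything, it can help to confirm how many individuals each list contains, since that fixes the size of the spectrum: n diploid individuals carry 2n chromosomes and therefore give 2n+1 derived allele count categories. The sketch below is only an illustration; `sfs_dim` is a hypothetical helper (not an ANGSD command), and it assumes diploid samples and one BAM path per non-empty line.

```shell
# sfs_dim: hypothetical helper, not part of ANGSD.
# Assumes diploid samples and one BAM path per non-empty line of the list.
sfs_dim() {
    # Count non-empty lines, i.e. the number of individuals in the list.
    n=$(grep -c . "$1" || true)
    # 2n chromosomes give 2n+1 possible derived allele counts (0..2n).
    echo $((2 * n + 1))
}
```

Calling `sfs_dim pop1.list` and `sfs_dim pop2.list` on lists with 12 and 14 entries should print 25 and 29, which are the two dimensions of the 2d SFS estimated in this example.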
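Let's start by finding the positions for which we have data in population 1 and population 2. With -doSaf 1, angsd writes the per-site sample allele frequency likelihoods for each population (pop1.saf.idx and pop2.saf.idx):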
# as always you can add -minMapQ 1 and -minQ 20 to only keep high quality data.
angsd -GL 1 -b pop1.list -anc anc.fa -r chr1: -P 10 -out pop1 -doSaf 1
angsd -GL 1 -b pop2.list -anc anc.fa -r chr1: -P 10 -out pop2 -doSaf 1
#sfs for pop1
realSFS pop1.saf.idx -P 24 >pop1.saf.sfs
#sfs for pop2
realSFS pop2.saf.idx -P 24 >pop2.saf.sfs
#2d sfs for pop1 and pop2
realSFS pop1.saf.idx pop2.saf.idx -P 24 >2dsfs.sfs
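The output is then written to the file 2dsfs.sfs as a flattened matrix of 25x29 values: 2*12+1 = 25 categories for population 1 and 2*14+1 = 29 for population 2 (assuming diploid individuals). Visualising it takes some work; some people use dadi, while we have been using heat maps in R.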
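For a quick look before plotting, the flattened values can be folded back into a 25-row, 29-column table. The following is a minimal sketch, not an ANGSD feature: `reshape_2dsfs` is a hypothetical helper, and it assumes the values are stored row by row, with the first population (pop1) indexing rows and the second (pop2) indexing columns. Check that assumption against your realSFS version before relying on it.

```shell
# reshape_2dsfs: hypothetical helper, not part of ANGSD.
# Usage: reshape_2dsfs FILE NROW NCOL
# Assumption: values are stored row by row (the second population's
# index varies fastest), one flattened spectrum per line.
reshape_2dsfs() {
    awk -v nrow="$2" -v ncol="$3" '
    # Refuse input whose length does not match the stated dimensions.
    NF != nrow * ncol {
        printf "expected %d values, found %d\n", nrow * ncol, NF > "/dev/stderr"
        exit 1
    }
    {
        # Print values separated by spaces, ending a row after every ncol values.
        for (i = 1; i <= NF; i++)
            printf "%s%s", $i, (i % ncol == 0 ? "\n" : " ")
    }' "$1"
}
```

For the example above this would be called as `reshape_2dsfs 2dsfs.sfs 25 29 > 2dsfs.matrix.txt`. The resulting whitespace-separated table can then be read into R or another tool to draw a heat map.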