ANGSD: Analysis of next generation Sequencing Data

Latest tar.gz version is (0.938/0.939 on github), see Change_log for changes, and download it here.

Filters


Revision as of 14:04, 19 June 2012

In most analyses you are interested in only a subset of sites rather than all of them. The following filter options are currently available.


-minMaf float
only work with sites with a MAF (minor allele frequency) above 'float'
-minKeepInd int
only work with sites with information from at least 'int' individuals
-minLRT float
only work with sites with an LRT (likelihood ratio test statistic) > 'float'
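
The semantics of the three options can be sketched in Python as a simple predicate on one site. This is only an illustration of the descriptions above, not ANGSD's implementation: the Site class and keep_site function are hypothetical names, and reading the pK-EM column as the LRT statistic is an assumption based on the example output on this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Site:
    """One row of a .mafs file (hypothetical helper type, not part of ANGSD)."""
    chromo: str
    position: int
    maf: float                   # the knownEM column
    n_ind: int                   # the nInd column
    lrt: Optional[float] = None  # the pK-EM column (assumed to hold the LRT)


def keep_site(site, min_maf=None, min_keep_ind=None, min_lrt=None):
    """Return True if the site passes every filter that is set."""
    # -minMaf: MAF must be above the cutoff
    if min_maf is not None and not site.maf > min_maf:
        return False
    # -minKeepInd: information from at least this many individuals
    if min_keep_ind is not None and site.n_ind < min_keep_ind:
        return False
    # -minLRT: LRT must be above the cutoff; a site without an LRT fails
    if min_lrt is not None and (site.lrt is None or not site.lrt > min_lrt):
        return False
    return True


# Two rows taken from the example output on this page
rare = Site("1", 13999919, 0.000008, 1)
common = Site("1", 14000019, 0.047247, 9)
print(keep_site(rare, min_maf=0.01), keep_site(common, min_maf=0.01))
```

With a 1% MAF cutoff the first site is dropped and the second is kept, which matches what the filtered runs below show.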


First we do a run with no filters

./angsd  -doMaf 2 -doMajorMinor 1 -out TSK -bam bam.filelist -GL 1 -r 1:
...
head TSK.mafs 
chromo	position	major	minor	knownEM	nInd
1	13999919	A	C	0.000008	1
1	13999920	G	A	0.000008	1
1	13999921	G	A	0.000008	1
1	13999922	C	A	0.000008	1
1	13999923	A	C	0.000008	1
1	13999924	G	A	0.000008	1
1	13999925	G	A	0.000008	1
1	13999926	A	C	0.000008	1
1	13999927	G	A	0.000008	1

Now we filter with a MAF cutoff of 1%

../angsd0.3/angsd -doMaf 2 -doMajorMinor 1 -out TSK -bam bam.filelist -GL 1 -r 1: -minMaf 0.01
head TSK.mafs 
chromo	position	major	minor	knownEM	nInd
1	13999950	T	G	0.495291	2
1	14000019	G	T	0.047247	9
1	14000056	C	T	0.055851	10
1	14000127	G	T	0.060760	10
1	14000170	C	T	0.052388	9
1	14000176	G	A	0.047928	10
1	14000202	G	A	0.279722	9
1	14000262	C	T	0.058555	9
1	14000322	A	G	0.040471	8

Similarly, if we only want sites with information from at least 5 samples

../angsd0.3/angsd -doMaf 2 -doMajorMinor 1 -out TSK -bam bam.filelist -GL 1 -r 1: -minKeepInd 5
head TSK.mafs 
chromo	position	major	minor	knownEM	nInd
1	13999971	T	A	0.000007	6
1	13999972	G	A	0.000007	6
1	13999973	C	A	0.000005	5
1	13999974	G	A	0.000006	6
1	13999975	C	A	0.000002	5
1	13999976	C	A	0.000004	7
1	13999977	A	C	0.000005	8
1	13999978	C	A	0.000005	8
1	13999979	T	A	0.000005	8

If we are interested in all sites that are variable at a p-value threshold of 10^(-6)

../angsd0.3/angsd -doMaf 2 -doMajorMinor 1 -out TSK -bam bam.filelist -GL 1 -r 1: -minLRT 24 -doSNP 1 
head TSK.mafs 
chromo	position	major	minor	knownEM	pK-EM	nInd
1	14000202	G	A	0.279722	42.623150	9
1	14000873	G	A	0.212120	79.118476	10
1	14001018	T	C	0.333736	89.040311	8
1	14001867	A	G	0.200232	47.195423	10
1	14002422	A	T	0.167692	43.196259	9
1	14003581	C	T	0.207404	58.593208	9
1	14004623	T	C	0.219838	102.856433	10
1	14007493	A	G	0.453217	28.398647	9
1	14007558	C	T	0.395670	80.236777	7
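
The link between -minLRT 24 and a p-value of about 10^(-6) can be checked with a short calculation. The sketch below assumes the LRT statistic follows a chi-square distribution with 1 degree of freedom under the null hypothesis of no variation; this page does not state that, so treat it as an assumption. The function name lrt_to_pvalue is hypothetical.

```python
import math


def lrt_to_pvalue(lrt):
    """Upper-tail p-value for an LRT statistic, assuming chi-square with 1 df.

    For X ~ chi-square(1), P(X > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(lrt / 2.0))


# The cutoff used in the example above
print(lrt_to_pvalue(24.0))
```

Under this assumption a cutoff of 24 gives a p-value just under 10^(-6), which is consistent with the command above.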


Deprecated options

These options should either be included (as is) or be discarded

-minDepth
-maxDepth