
Simple haploid output based on sampling or consensus. The latest GitHub version of angsd includes a small utility program in the misc folder that converts this output to plink format (tfam/tped).

Workflow: BAM files -> sequence data (random base or consensus base) -> *.haplo.gz (single base file)

Brief Overview

> ./angsd -doHaploCall
	-> angsd version: 0.910-45-g2b2b4f0-dirty (htslib: 1.2.1-192-ge7e2b3d) build(Jan  3 2016 14:45:41)
	-> Analysis helpbox/synopsis information:
	-> Command: 
./angsd -doHaploCall 	-> Sun Jan  3 15:18:15 2016
--------------
abcHaploCall.cpp:
	-doHaploCall	0
	(Sampling strategies)
	 0:	 no haploid calling 
	 1:	 (Sample single base)
	 2:	 (Concensus base)
	-doCounts	0	Must choose -doCount 1
Optional
	-minMinor	0	Minimum observed minor alleles
	-maxMis	-1	Maximum missing bases (per site)

This function outputs one base per individual at each site.

Options

-doHaploCall [int]

1: sample a random base. 2: use the most frequent base (consensus); ties are broken by picking a random base.
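The two strategies can be sketched in a few lines of Python. This is only an illustration of the idea, not angsd's actual implementation; the function name `call_base` and the representation of the reads as a string of bases are assumptions made for this example.

```python
import random
from collections import Counter

def call_base(bases, mode, rng=random):
    """Pick one base for one individual at one site (illustrative only).

    bases: observed bases covering the site, e.g. "AAG" (hypothetical input)
    mode:  1 = sample a random base, 2 = most frequent base (random on ties)
    Returns "N" when the individual has no data at the site.
    """
    if not bases:
        return "N"
    if mode == 1:
        # Every observed base is equally likely to be drawn.
        return rng.choice(bases)
    if mode == 2:
        counts = Counter(bases)
        top = max(counts.values())
        # All bases that share the highest count; pick one at random.
        tied = sorted(b for b, c in counts.items() if c == top)
        return rng.choice(tied)
    raise ValueError("mode must be 1 or 2")

print(call_base("AAG", 2))  # A
```

Under mode 1 a base seen in two of three reads is drawn with probability 2/3, whereas mode 2 always returns it.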

-doCounts 1

Required: -doCounts 1 counts the bases at each site after filters have been applied.

-minMinor [int]

Minimum number of observed minor alleles; only sites with at least minMinor sampled copies of the minor allele (across individuals) are printed.

-maxMis [int]

Maximum allowed number of missing alleles (across individuals). -maxMis 0 means that only sites without missing alleles are printed.
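The effect of the two optional filters on a single site can be sketched as follows. This is a hedged illustration rather than angsd's code: the function name `keep_site` is hypothetical, the minor allele count is assumed to be the count of the second most common sampled base, and a site is assumed to pass when that count is at least minMinor (which matches the example output on this page, where sites with a single minor allele are printed under -minMinor 1).

```python
from collections import Counter

def keep_site(calls, min_minor=0, max_mis=-1):
    """Decide whether a site would be printed (illustrative only).

    calls:     one sampled base per individual, "N" for missing
    min_minor: assumed to mean "at least this many minor alleles"
    max_mis:   maximum number of "N" calls; -1 disables the filter
    """
    n_missing = calls.count("N")
    if max_mis >= 0 and n_missing > max_mis:
        return False
    # Count the non-missing bases, most common first.
    counts = Counter(b for b in calls if b != "N").most_common()
    # Assumption: the minor allele is the second most common base.
    minor = counts[1][1] if len(counts) > 1 else 0
    if min_minor > 0 and minor < min_minor:
        return False
    return True

print(keep_site(list("TTCNCCC"), min_minor=1, max_mis=0))  # False
print(keep_site(list("TTCNCCC"), min_minor=1, max_mis=1))  # True
```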

Output

.haplo.gz

Output: Each line represents one site. The first three columns are the chromosome name (Column 1), the position (Column 2) and the major allele (Column 3). These are followed by one column per individual containing the sampled allele.
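A file in this layout can be read with the Python standard library. The sketch below assumes the file is tab-separated and starts with a header line, as in the example output on this page; the function name `read_haplo` and the file name in the usage comment are hypothetical.

```python
import gzip

def read_haplo(path):
    """Yield (chrom, pos, major, calls) for each site in a .haplo.gz file.

    Assumes a tab-separated file whose first line is a header.
    """
    with gzip.open(path, "rt") as fh:
        fh.readline()  # skip the header line (chr, pos, major, ind0, ...)
        for line in fh:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 4:
                continue  # skip blank or malformed lines
            chrom, pos, major = fields[:3]
            yield chrom, int(pos), major, fields[3:]

# Usage (hypothetical file name):
# for chrom, pos, major, calls in read_haplo("out.haplo.gz"):
#     print(chrom, pos, major, "".join(calls))
```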

Example

Sample one random base per individual at each site on chromosome 1, keeping only sites that pass -minMinor 1.

./angsd -bam bam.filelist -dohaplocall 1 -doCounts 1 -r 1: -minMinor 1

Output

chr	pos	major	ind0	ind1	ind2	ind3	ind4	ind5	ind6
1	14000170	C	T	T	C	N	C	C	C
1	14000202	A	A	N	G	A	N	N	G
1	14000457	G	G	G	G	G	G	N	A
1	14000459	G	G	G	G	G	A	N	N
1	14000774	G	T	G	G	G	G	G	T
1	14002083	C	G	N	C	C	C	C	C
1	14002351	A	A	C	C	A	C	N	A
1	14002950	A	T	A	A	A	T	N	T
1	14004832	G	G	G	A	G	G	A	G
1	14006543	G	T	G	G	G	G	G	G
1	14006631	A	C	N	A	N	A	N	A
1	14007068	G	T	T	T	G	G	G	N
1	14009284	A	A	C	C	C	N	A	N
1	14009775	G	G	G	G	G	C	G	C
1	14009787	T	T	T	G	T	G	T	T
1	14009791	A	G	G	A	G	A	G	A
1	14009794	A	A	A	A	N	N	A	A
1	14009800	A	G	A	A	G	N	G	A
1	14010748	A	G	N	A	G	A	A	A

The columns are:

chr

chromosome

pos

position

major

major allele (most common of the sampled alleles)

ind0

first individual; the individual columns (ind0, ind1, ...) follow the same order as the input files
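As a small worked example of using these columns, the following sketch counts the missing calls (N) per individual for the first three sites of the example output above. The helper name `missing_per_individual` is hypothetical, and the rows are split on whitespace for readability.

```python
# First three sites of the example output, copied from this page.
EXAMPLE = """\
chr pos major ind0 ind1 ind2 ind3 ind4 ind5 ind6
1 14000170 C T T C N C C C
1 14000202 A A N G A N N G
1 14000457 G G G G G G N A
"""

def missing_per_individual(text):
    """Count "N" calls per individual column (illustrative only)."""
    lines = text.strip().splitlines()
    names = lines[0].split()[3:]        # ind0, ind1, ...
    missing = dict.fromkeys(names, 0)
    for line in lines[1:]:
        calls = line.split()[3:]        # drop chr, pos, major
        for name, base in zip(names, calls):
            if base == "N":
                missing[name] += 1
    return missing

print(missing_per_individual(EXAMPLE))
# {'ind0': 0, 'ind1': 1, 'ind2': 0, 'ind3': 1, 'ind4': 1, 'ind5': 2, 'ind6': 0}
```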

Haploid calling
