RefFinder

Small, fast C program to extract bases from a FASTA file. Download it here: [http://popgen.dk/software/download/refFinder/refFinder.tar.gz]
 
The program can be used either as a standalone tool or through its API, which allows easy retrieval of reference bases from your own code.
 
=Install=
<pre>
wget http://popgen.dk/software/download/refFinder/refFinder.tar.gz
tar xf refFinder.tar.gz
cd refFinder/
make
cd ..
</pre>
 
=Stand alone=
==Example==
 
Generate a file with chromosome, position and reference base (chr pos ref) using samtools:
 
<pre>
samtools mpileup -b smallBam.filelist -f /space/genomes/refgenomes/hg19/merged/hg19NoChr.fa |cut -f1-3 >small.sam
</pre>
 
Use refFinder to look up the base for each position in '''small.sam''', then compare the result with the samtools output:
<pre>
cut -f1-2 ../angsd/test/small.sam |./refFinder /space/genomes/refgenomes/hg19/merged/hg19NoChr.fa full >tst
cmp tst ../angsd/test/small.sam
</pre>
 
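Possible options are:
;inputIsZero
:input positions are zero-indexed instead of one-indexed
;full
:print the chromosome and position along with the base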
 
These are flags given after the FASTA file name. Without any flags, only the base is printed:
 
<pre>
cut -f1-2 ../angsd/test/small.sam |./refFinder /space/genomes/refgenomes/hg19/merged/hg19NoChr.fa |head
a
g
c
t
a
c
t
c
g
g
</pre>
 
To print the chromosome and position as well, add '''full''':
 
<pre>
cut -f1-2 ../angsd/test/small.sam |./refFinder /space/genomes/refgenomes/hg19/merged/hg19NoChr.fa full |head
1 13999902   a
1 13999903   g
1 13999904   c
1 13999905   t
1 13999906   a
1 13999907   c
1 13999908   t
1 13999909   c
1 13999910   g
1 13999911   g
</pre>
 
If the input positions are zero-indexed rather than one-indexed, add '''inputIsZero''':
 
<pre>
cut -f1-2 ../angsd/test/small.sam |./refFinder /space/genomes/refgenomes/hg19/merged/hg19NoChr.fa full inputIsZero |head
1 13999902 g
1 13999903 c
1 13999904 t
1 13999905 a
1 13999906 c
1 13999907 t
1 13999908 c
1 13999909 g
1 13999910 g
1 13999911 g
</pre>
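
The effect of '''inputIsZero''' can be illustrated with a small self-contained sketch. This is not refFinder's actual code: the helper name baseAt, the toy sequence and the choice to return 'N' for out-of-range positions are assumptions made purely for illustration. With one-indexed input, position p refers to the p-th base of the sequence; with zero-indexed input, the same number refers to the next base along, which is why the '''inputIsZero''' output above is shifted by one base.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of refFinder): return the base at 'pos' in
// 'seq'. Positions are one-indexed unless inputIsZero is true.
// Returns 'N' when the position falls outside the sequence (an arbitrary
// choice for this sketch).
char baseAt(const std::string &seq, long pos, bool inputIsZero) {
  long idx = inputIsZero ? pos : pos - 1;  // convert to a zero-based offset
  if (idx < 0 || idx >= static_cast<long>(seq.size())) return 'N';
  return seq[idx];
}
```

For the toy sequence agctactcggg, baseAt(seq, 1, false) gives a, while baseAt(seq, 1, true) gives g, mirroring the one-base shift between the two outputs above.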
 
=API=
 
<pre>
 
#include "refFinder.h"
//load the reference
perFasta *pf = init("hg19.fa");
char refbase = getchar("chr20",130224101,pf);
//refbase now contains the reference base for chr20 at position 130,224,101
//free the reference when done
destroy(pf);
</pre>
 
Remember to link against refFinder.o and zlib (-lz):
 
<pre>
g++ sampleProg.cpp refFinder.o -lz
</pre>
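
For readers curious about what such a lookup involves, here is a minimal, self-contained sketch of an in-memory FASTA lookup. It is not refFinder's implementation and does not use its API: loadFasta, lookupBase and the Fasta type are hypothetical names, positions are assumed to be one-indexed, and returning 'N' for unknown sequence names or out-of-range positions is an arbitrary choice.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical stand-in for a loaded reference: sequence name -> bases.
typedef std::map<std::string, std::string> Fasta;

// Read FASTA records from a stream. The sequence name is the text after '>'
// up to the first space or tab; sequence lines are concatenated.
Fasta loadFasta(std::istream &in) {
  Fasta fa;
  std::string line, name;
  while (std::getline(in, line)) {
    if (!line.empty() && line[line.size() - 1] == '\r')
      line.erase(line.size() - 1);  // tolerate Windows line endings
    if (line.empty()) continue;
    if (line[0] == '>') {
      name = line.substr(1, line.find_first_of(" \t") - 1);
      fa[name];  // create an empty entry for this record
    } else if (!name.empty()) {
      fa[name] += line;  // append sequence data to the current record
    }
  }
  return fa;
}

// Return the base at a one-indexed position, or 'N' if the name is unknown
// or the position is out of range.
char lookupBase(const Fasta &fa, const std::string &chr, long pos) {
  Fasta::const_iterator it = fa.find(chr);
  if (it == fa.end() || pos < 1 || pos > static_cast<long>(it->second.size()))
    return 'N';
  return it->second[pos - 1];
}
```

In a real program the stream would typically be a std::ifstream opened on the FASTA file; a std::istringstream is convenient for trying the functions on a few lines of toy data.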
 
=Bugs=
# To do: check whether the reference file exists.

Latest revision as of 15:55, 20 March 2014
