Setting up Standalone BLAST Software in Linux
Installing and running the stand-alone BLAST software on Linux.
A local installation is useful when there is a large set of sequences to BLAST: it does not depend on Internet speed or server traffic, and it can be automated.
Stand-alone BLAST can be downloaded from the NCBI FTP site (the link is in the toolbar at the bottom of the NCBI main page: “FTP Site -> Blast -> executables -> Latest”).
Download the file in binary mode. Filenames have the form shown in the commands below, for example blast-2.2.18-ia32-linux.tar.gz:
jk@jk:~/Desktop$ gunzip blast-2.2.18-ia32-linux.tar.gz # uncompress
jk@jk:~/Desktop$ tar -xpf blast-2.2.18-ia32-linux.tar # extract
For more information on the options, see the manual pages (man tar, man gunzip).
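The two steps above can also be combined, because tar can decompress a gzip archive itself when given the -z flag. The following is a minimal sketch, not the only way to do it; the helper name unpack_blast and its optional destination argument are made up for illustration.

```shell
# Unpack a gzip-compressed tarball in one step.
# unpack_blast is an illustrative helper name, not a BLAST tool.
unpack_blast() {
    archive=$1          # e.g. blast-2.2.18-ia32-linux.tar.gz
    dest=${2:-.}        # directory to extract into (default: current directory)
    mkdir -p "$dest" || return 1
    # -x extract, -z gunzip on the fly, -p preserve permissions, -f archive file
    tar -xzpf "$archive" -C "$dest"
}
```

For example, unpack_blast blast-2.2.18-ia32-linux.tar.gz ~/Desktop would extract the archive into ~/Desktop, leaving the .tar.gz file in place.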
How to execute bl2seq (BLAST two sequences):
The input files to any BLAST program must always be in FASTA format.
For example:
>gi|229673|pdb|1ALC| Alpha-Lactalbumin
KQFTKCELSQNLYDIDGYGRIALPELICTMFHTSGYDTQAIVENDESTEYGLFQISNALWCKSSQSPQSR
NICDITCDKFLDDDITDDIMCAKKILDIKGIDYWIAHKALCTEKLEQWLCEKE
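Before running BLAST it can help to check that an input file really looks like FASTA, as in the example above. The sketch below is only a rough sanity check under simple assumptions: it verifies that the first non-blank line is a ">" header and that at least one sequence line is present. The helper name looks_like_fasta is made up for illustration and is not part of BLAST.

```shell
# looks_like_fasta is an illustrative helper, not a BLAST tool.
# It succeeds only if the first non-blank line starts with ">"
# and the file contains at least one non-header (sequence) line.
looks_like_fasta() {
    awk '
        NF == 0    { next }                                  # skip blank lines
        !seen      { if ($0 !~ /^>/) exit 1; seen = 1; next } # first line must be a header
        $0 !~ /^>/ { body = 1 }                              # found a sequence line
        END        { if (!seen || !body) exit 1 }
    ' "$1"
}
```

It could be used as: looks_like_fasta seq1.fasta && echo "looks OK" (seq1.fasta is a hypothetical filename). It does not validate the residue alphabet.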
Syntax:
jk@jk:~/Desktop/blast-2.2.18/bin$ ./bl2seq - # displays all options
jk@jk:~/Desktop/blast-2.2.18/bin$ ./bl2seq -p blastp -e 0.01 -i -j # blastp compares two protein sequences; give a FASTA file after -i and after -j
-i First sequence [File In]
-j Second sequence [File In]
-p Program name: blastp, blastn, blastx, tblastn or tblastx. For blastx the first sequence must be nucleotide; for tblastn the second sequence must be nucleotide.
-e E-value threshold (optional)
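Putting the options above together, a protein-against-protein comparison could be wrapped as follows. This is a minimal sketch: the wrapper name compare_proteins, the BLAST_BIN variable and the filenames in the usage note are assumptions made for illustration; only the -p, -e, -i and -j flags come from the option list above. The report is captured by redirecting standard output.

```shell
# Directory holding the BLAST executables; adjust to your installation.
BLAST_BIN=${BLAST_BIN:-$HOME/Desktop/blast-2.2.18/bin}

# compare_proteins is an illustrative wrapper, not a BLAST tool.
# Arguments: first FASTA file, second FASTA file, report file to write.
compare_proteins() {
    first=$1; second=$2; report=$3
    for f in "$first" "$second"; do
        [ -r "$f" ] || { echo "cannot read $f" >&2; return 1; }
    done
    # -p program, -e E-value cutoff, -i / -j the two input sequences
    "$BLAST_BIN/bl2seq" -p blastp -e 0.01 -i "$first" -j "$second" > "$report"
}
```

For example, compare_proteins seq1.fasta seq2.fasta result.txt (all three names hypothetical) would write the alignment report to result.txt.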
How to execute Blastall:
You can download a protein or nucleotide database from Swiss-Prot or NCBI. For example, to download human chromosome 22,
go to NCBI -> FTP site -> RefSeq -> H_sapiens -> H_sapiens -> chr22.
Note: the database file must also be in FASTA format and can contain many sequences, for example:
>gi|86438068|gb|AAI12638.1| HGD protein [Bos taurus]
MTELKYISGFGNECASEDPRCPGALPEGQNNPQVCPYNLYAEQLSGSAFTCPRSTNKRSWLYRILPSVSH
KPFEFIDQGHITHNWD
>gi|116283875|gb|AAH44758.1| Hgd protein [Mus musculus]
MSVLQRILAVQVPCPKDSWLYRILPSVSHKPFESIDQGHVTHNWDEVGPDPNQLRWKPFEIPKASEKKVD
FVSGLYTLCGAGDIKSNNGLAVHIFLCNSSMENRCFYNSDGDFLIVPQKGKLLIYTEFGKMSLQPNEICV
>gi|116283724|gb|AAH24369.1| Hgd protein [Mus musculus]
MSVLQRILAVQVPCPKDSWLYRILPSVSHKPFESIDQGHVTHNWDEVGPDPNQLRWKPFEIPKASEKKVD
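A quick way to inspect a multi-sequence FASTA file like the one above before formatting it is to count its header lines and list the identifiers. The helpers below are a small sketch; the names count_sequences and list_ids are made up for illustration, and the identifier is simply taken as the header text up to the first space.

```shell
# count_sequences and list_ids are illustrative helpers, not BLAST tools.

# Print the number of records (lines starting with ">") in a FASTA file.
count_sequences() {
    grep -c '^>' "$1"
}

# Print the identifier of each record: header text up to the first space, without ">".
list_ids() {
    awk '/^>/ { print substr($1, 2) }' "$1"
}
```

Applied to the three records shown above, count_sequences would report 3 and list_ids would print one gi|...| identifier per line.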
1. Formatdb:
jk@jk:~/Desktop/blast-2.2.18/bin$ ./formatdb - # displays all options
jk@jk:~/Desktop/blast-2.2.18/bin$ ./formatdb -i -o T -p T # give the FASTA database file after -i
-i Input file(s) for formatting (this parameter must be set) [File In]
-p Type of file: T = protein, F = nucleotide (default = T)
-o Parse options: T = parse SeqId and create indexes, F = do not parse SeqId (default = F)
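The formatdb call with the options above could be wrapped like this. It is a minimal sketch: the wrapper name make_protein_db and the BLAST_BIN variable are assumptions for illustration, while the -i, -p and -o flags are the ones described above. For a protein database, formatdb normally writes its index files next to the input file (files ending in .phr, .pin and .psq).

```shell
# Directory holding the BLAST executables; adjust to your installation.
BLAST_BIN=${BLAST_BIN:-$HOME/Desktop/blast-2.2.18/bin}

# make_protein_db is an illustrative wrapper, not a BLAST tool.
# Argument: a FASTA file containing the protein sequences of the database.
make_protein_db() {
    fasta=$1
    [ -r "$fasta" ] || { echo "cannot read $fasta" >&2; return 1; }
    # -i input FASTA, -p T protein database, -o T parse SeqIds and build indexes
    "$BLAST_BIN/formatdb" -i "$fasta" -p T -o T
}
```

For a nucleotide database such as a chromosome sequence, -p F would be used instead of -p T.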
2. Executing Blastall:
jk@jk:~/Desktop/blast-2.2.18/bin$ ./blastall -i -p blastp -d -o
-p Program Name [String] Input should be one of “blastp”, “blastn”, “blastx”, “tblastn”, or “tblastx”.
-d Database [String] (default = nr). The database specified must first be formatted with formatdb.
-i Query File [File In]
-o BLAST report Output File [File Out]
The output file will contain the BLAST output for all the input query sequences.
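Since a local BLAST can be automated, several query files can be searched in a loop, producing one report per query file. The sketch below assumes the queries are stored as *.fasta files in one directory and that the database has already been formatted; the wrapper name blast_all_queries, the BLAST_BIN variable and the report naming scheme are made up for illustration. Only the -p, -i, -d and -o flags listed above are used.

```shell
# Directory holding the BLAST executables; adjust to your installation.
BLAST_BIN=${BLAST_BIN:-$HOME/Desktop/blast-2.2.18/bin}

# blast_all_queries is an illustrative wrapper, not a BLAST tool.
# Arguments: directory of query FASTA files, formatted database, output directory.
blast_all_queries() {
    query_dir=$1; db=$2; out_dir=$3
    mkdir -p "$out_dir" || return 1
    for query in "$query_dir"/*.fasta; do
        [ -e "$query" ] || continue            # directory had no *.fasta files
        name=$(basename "$query" .fasta)
        # one report per query file; keep going if a single search fails
        "$BLAST_BIN/blastall" -p blastp -i "$query" -d "$db" \
            -o "$out_dir/$name.blastp.txt" \
            || echo "blastall failed for $query" >&2
    done
}
```

For example, blast_all_queries queries/ mydb.fasta reports/ (hypothetical paths) would leave one .blastp.txt report in reports/ for each query file.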