NCBI BLAST databases

The NCBI databases are downloaded every Sunday into a directory named for that date. Once the download has been verified, the file /datasets/bio/ncbi-db/.ncbirc is updated to point to the new copy. Because each download lives in its own directory, running jobs see a consistent database throughout the run.
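
To check which copy the configuration file currently points to, the BLASTDB entry can be read directly. This is a minimal sketch, assuming the file follows the usual .ncbirc layout with a BLASTDB= line under a [BLAST] section; current_blastdb is a hypothetical helper name, not a site-provided command.

```shell
# Print the BLASTDB value from an .ncbirc-style file.
# Assumption: the file contains a line of the form "BLASTDB=/some/path".
# current_blastdb is a hypothetical helper; the default path is the one named above.
current_blastdb() {
    sed -n 's/^[[:space:]]*BLASTDB[[:space:]]*=[[:space:]]*//p' \
        "${1:-/datasets/bio/ncbi-db/.ncbirc}" | head -n 1
}
```

Calling current_blastdb with no argument reads the site file; passing another path reads that file instead.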

Other tools that can use the NCBI databases but do not read this configuration file can use the output of blastdb_path to find the current copy, as in the following example:

module load blast-plus/2.14.1 diamond/2.1.10
NR=$(blastdb_path -db nr -dbtype prot)
diamond blastp --db "$NR" -q query.fasta -o matches.tsv
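
In a longer job it can help to resolve the path once at the start and stop early if nothing is found, so every later step uses the same copy. The following is a sketch only: resolve_blastdb is a hypothetical wrapper, and it assumes blastdb_path either exits with a non-zero status or prints nothing when the database cannot be located.

```shell
# Resolve a BLAST database path once and fail early if it cannot be found.
# $1 = database name (for example nr), $2 = database type (prot or nucl).
# resolve_blastdb is a hypothetical helper around the blastdb_path call shown above.
resolve_blastdb() {
    db_path=$(blastdb_path -db "$1" -dbtype "$2") || return 1
    # Treat empty output as "not found" as well.
    [ -n "$db_path" ] || return 1
    printf '%s\n' "$db_path"
}
```

A job script might then begin with NR=$(resolve_blastdb nr prot) || exit 1 and pass "$NR" to each subsequent command.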

Conda and other binary installations of BLAST may not include blastdb_path. In this case, the tools expect the NCBI environment variable to be set to the directory containing .ncbirc (2.14+). The modules set this for you, but if you are using a conda environment, you should set it explicitly:

export NCBI=/datasets/bio/ncbi-db
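
If the same environment is used on machines where the dataset may not be mounted, the export can be guarded by a check that the configuration file is readable. This is a minimal sketch; use_site_ncbirc is a hypothetical function name, and the default directory is the one given above.

```shell
# Export NCBI only when the directory actually contains a readable .ncbirc.
# use_site_ncbirc is a hypothetical helper; pass a directory to override the default.
use_site_ncbirc() {
    set -- "${1:-/datasets/bio/ncbi-db}"
    if [ -r "$1/.ncbirc" ]; then
        export NCBI="$1"
    else
        echo "warning: no readable .ncbirc in $1" >&2
        return 1
    fi
}
```

After activating the conda environment, running use_site_ncbirc before any BLAST command has the same effect as the export above, but reports a warning instead of silently pointing at a missing directory.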

If you need the exact search path, you can use the command:

blastdbcmd -show_blastdb_search_path

Path: /datasets/bio/ncbi-db/
URL: https://ftp.ncbi.nlm.nih.gov/blast/db/
Downloaded: weekly
Cite: https://support.nlm.nih.gov/knowledgebase/article/KA-03391/en-us
Variant: