How to make your own database from UniProt with taxonomy


New member
I have metagenomic shotgun sequencing data from a soil sample and a list of genes we are interested in. I want to BLAST only for these genes, to see whether they occur in the sample and then find out which taxon they belong to.
I want to use DIAMOND for that: build a database that contains only these genes, BLAST my short reads against it, and view the results in MEGAN.

So far I have tried downloading all available sequences for these genes from UniProt and making a database out of them with:
diamond makedb --db Uniprot-myGenes --in Uniprot-genes.fasta

When I then perform a search with
diamond blastx --db Uniprot-myGenes.dmnd --out diamond-Uniprot-test --outfmt 100
and then meganize the resulting DAA file and load it in MEGAN, it says that all the reads are not assigned.

When I try to make the database with
diamond makedb --db Uniprot-myGenes --in Uniprot-genes.fasta --taxonnodes taxdmp/nodes.dmp --taxonmap prot.accession2taxid.gz
it doesn't seem to add the taxonomic information to the database, and I get the same results.

Do I need a different mapping file for my genes? Or am I just completely on the wrong path here?
How can I make a specific database for genes of interest? So far I haven't really found any good suggestions on how to do that, and I kindly ask for your help.

Many thanks

Benjamin Buchfink

Staff member
The taxonomy mapping files from NCBI only work for NCBI databases. DIAMOND does not support the UniProt mapping files at this time.
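
As a side note, UniProt FASTA headers carry the NCBI taxon ID of each entry in an OX= field, so an accession-to-taxon table can be derived from the downloaded FASTA itself. Below is a minimal sketch, assuming standard UniProt headers of the form ">db|ACCESSION|NAME ... OX=taxid ...". The function name and the output file name are hypothetical, and whether DIAMOND or MEGAN will accept such a table as a mapping file is not confirmed here and would need to be checked.

```shell
# Print "accession<TAB>taxid" for every UniProt-style FASTA header in the
# file given as $1. Assumes headers such as:
#   >sp|P12345|NAME_ORG Protein name OS=Organism OX=9606 GN=abc PE=1 SV=1
# (uniprot_acc2taxid is a hypothetical helper name, not part of any tool.)
uniprot_acc2taxid() {
    awk '
        /^>/ {
            # The accession is the second "|"-separated part of the first token.
            split($1, id, "[|]")
            acc = id[2]
            taxid = ""
            # Look for the OX=<taxid> field among the remaining header tokens.
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^OX=[0-9]+$/) {
                    taxid = substr($i, 4)
                }
            }
            # Skip headers that lack either the accession or the taxon ID.
            if (acc != "" && taxid != "") {
                print acc "\t" taxid
            }
        }
    ' "$1"
}
```

It could be used as: uniprot_acc2taxid Uniprot-genes.fasta > Uniprot-genes.acc2taxid.tsv. Headers that do not follow the UniProt layout are silently skipped, so comparing the number of output lines with the number of sequences is a quick sanity check.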

If you want to view the results in MEGAN, however, taxonomy mapping needs to be done by MEGAN, not DIAMOND. I'm not sure whether MEGAN supports the UniProt taxonomy. You can get support at the MEGAN community here: