Hi,
I am trying to clean several de novo insect assemblies of common contamination sources: human, bacteria and the UniVec database. For this, I concatenated all the FASTA files into one file called 'cat.fna' and made a species-level mapping file called 'ids_cat'.
I then ran:
conterminator dna cat.fna ids_cat conterminator.results tmp_conterminator --threads 20 --blacklist 10239 --kingdoms '2,28384,9606,50557'
I changed the kingdoms option in order to look for contamination between bacteria, other sequences (28384, the taxid I used for the UniVec sequences), Homo sapiens and insects. I do not wish to look for contamination between the insect genomes themselves.
I do not need to ignore any taxa, so I specified only 10239 in the blacklist option in order to override the default blacklist (which contains 28384, which I need).
Running this command, I get the following error message: rescorediagonal step died.
Interestingly, it works if I specify only 2,28384,9606 or 2,50557 or even 9606,50557 for kingdoms. Do you have any idea why the combination I used does not work? 28384,50557 does not work either, but there I get a different error message: Extractframes died
Moreover, I do not understand why the output of the run with 2,50557 reports contamination between bacteria and human. Shouldn't it skip the comparison between these two taxa entirely in this configuration?
nohup_conterminator.txt
Thanks,
Héloïse
N.B. Just to let you know, it seems that conterminator cannot deal with some patterns of FASTA identifier. The UniVec sequences look like gnl|uv|X66730.1:1-2687-49. I had to change that to gnl uv|X66730.1:1-2687-49 and to write gnl 28384 in the mapping file. Otherwise I got the error: crosstaxonfilterorf step died.
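In case it helps anyone hitting the same identifier problem, the header renaming from the N.B. can be sketched in Python roughly as below. This is only an illustration of the workaround, not anything provided by conterminator: the function names `rewrite_univec_header` and `rewrite_fasta` are hypothetical, and it assumes every UniVec header starts with `>gnl|uv|`.

```python
def rewrite_univec_header(line: str) -> str:
    """Replace the first '|' of a '>gnl|uv|' FASTA header with a space.

    Hypothetical helper: assumes UniVec headers always start with '>gnl|uv|'.
    Any other line (other headers, sequence lines) is returned unchanged.
    """
    if line.startswith(">gnl|uv|"):
        # Only the first '|' is replaced, so the identifier becomes 'gnl'
        # and the rest of the header stays as a description.
        return line.replace("|", " ", 1)
    return line


def rewrite_fasta(lines):
    """Yield FASTA lines with UniVec headers rewritten."""
    for line in lines:
        yield rewrite_univec_header(line)


# Small in-memory example; a real run would iterate over the lines of cat.fna.
example = [">gnl|uv|X66730.1:1-2687-49\n", "ACGT\n", ">scaffold_1\n", "TTGA\n"]
print("".join(rewrite_fasta(example)), end="")
```

After this rewrite, all UniVec sequences share the identifier gnl, which is why a single mapping-file entry (gnl 28384) covers them.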