Hisat-genotype has been running for more than 24h without writing or doing anything... #81

Open
GKerdivel opened this issue Mar 13, 2024 · 5 comments

Comments

@GKerdivel

Hi,
I have been desperately trying to make hisat-genotype work for weeks, without success. The best I have managed is getting the software to start (a big success already), but it gets stuck at the very beginning...

Here is my command line:

hisatgenotype -x genotype_genome --base hla --locus-list A -1 myfastq_R1.fastq.gz -2 myfastq_R2.fastq.gz --assembly -p 40 --pp 40
Here is what happens in the terminal:

Files found: Omitted extracting reads from myfastq_R1.fastq.gz
 A

The terminal has been busy for 24h now, but nothing has happened since then...

Two files are created:

  • myfastq_R1_fastq_gz-hla-extracted-1_fq.bam.unsorted
  • 2024-03-12_hisat-genotype.log

Both are completely empty; nothing was written to them.

When I check the running processes, nothing related to hisat-genotype seems to appear, and no resources are being used...
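
One way to tell a slow run from a stalled one is to watch whether the output files grow at all. The following is only a minimal sketch: the helper names (`snapshot_sizes`, `grew`, `watch`) are made up for illustration and are not part of hisat-genotype, and the file names are the ones listed above.

```python
import os
import time

def snapshot_sizes(paths):
    """Return {path: size in bytes}; -1 marks a file that does not exist yet."""
    return {p: os.path.getsize(p) if os.path.exists(p) else -1 for p in paths}

def grew(before, after):
    """List the paths whose size increased between two snapshots."""
    return [p for p in after if after[p] > before.get(p, -1)]

def watch(paths, interval_s=60, rounds=5):
    """Poll the files a few times and report which ones changed each round."""
    previous = snapshot_sizes(paths)
    for _ in range(rounds):
        time.sleep(interval_s)
        current = snapshot_sizes(paths)
        print(time.strftime("%H:%M:%S"), "grew:", grew(previous, current) or "nothing")
        previous = current

# Example call (file names taken from this post):
# watch(["myfastq_R1_fastq_gz-hla-extracted-1_fq.bam.unsorted",
#        "2024-03-12_hisat-genotype.log"])
```

If every round reports "nothing" while no related process shows any CPU usage, the run is most likely stalled rather than slow.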

I would really appreciate some help making this work...

Thanks in advance

Gwenneg

@DarioMarzella

Hi, I am not one of the developers, but I am also trying to get this software running...
From what I understand, you might be requesting far too many cores/threads. You are requesting 40 threads (-p 40) and, for each of those, 40 cores/threads (--pp 40)? It's not really clear to me how parallelization works in this software, but you might actually be requesting 40*40 cores (1600 in total). Unless you are on a very powerful HPC system that gives you access to that many cores without you specifying it, I assume the software tries to spawn far too many processes, which causes it to idle forever. Try simply removing --pp 40 and see what happens. Also maybe try it with their test data from the "Typing and Assembly" section, which seems quite small and should allow a quick test. Also make sure you actually have access to 40 cores on the computer/cluster you are using.
Hope this helps!
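
To make the arithmetic above concrete, here is a rough sketch that compares the worst-case request with the cores the current process can actually use. It assumes that -p and --pp multiply, which is only the guess made in this comment and is not confirmed by the developers; the function names are hypothetical.

```python
import os

def usable_cores() -> int:
    """Cores this process may use; falls back to the machine-wide count."""
    if hasattr(os, "sched_getaffinity"):  # available on Linux only
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1

def worst_case_workers(p: int, pp: int) -> int:
    """Worker count IF -p and --pp multiply (an unconfirmed assumption)."""
    return p * pp

def oversubscribed(p: int, pp: int, cores: int) -> bool:
    """True when the worst-case request exceeds the given core count."""
    return worst_case_workers(p, pp) > cores

# Example call for the command line in this issue:
# print(oversubscribed(40, 40, usable_cores()))
```

On a cluster, `usable_cores()` reflects the CPU affinity of the job, which can be much smaller than the total number of cores on the node.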

@GKerdivel
Author

GKerdivel commented Mar 22, 2024

Thanks for your answer @DarioMarzella. I really wonder if anyone has ever seen this tool working ^^
I do have more than 40 cores. In fact, I tried with the defaults at first, but it got stuck as well, so I thought maybe it was just slow and tried to speed it up. I started with the -p option, but I saw posts in other issues mentioning that -p is not used in the first stages of the pipeline, which is where I get stuck, so I tried the --pp option as suggested in those posts...
I must say I have lost hope, but I will indeed try with their test data at least ^^ What errors/problems do you have?

@DarioMarzella

Please try using only one of -p or --pp, because if you use them together like this I think you are requesting 1600 cores, which I am not sure you have available.

I will maybe post my issue in a separate thread, also because I managed to find a solution. Simply put, the hisat2 folder was empty (as hisat2 is no longer in this repo), so I had to install it separately and then move it into the hisat2 folder in hisatgenotype, and it's working now (yes, it is actually working, quite impressive).
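
For anyone hitting the same problem, a quick check along these lines can show whether the hisat2 folder is empty and whether a `hisat2` executable is reachable. This is a sketch only: `hisat2_status` is a made-up helper, and the folder path in the example is an assumption about where your hisatgenotype checkout lives.

```python
import shutil
from pathlib import Path

def hisat2_status(hisat2_dir):
    """Report whether the given hisat2 folder has content and whether
    a `hisat2` executable can be found on PATH."""
    d = Path(hisat2_dir)
    return {
        "dir_exists": d.is_dir(),
        # A missing folder is reported as empty as well.
        "dir_empty": (not d.is_dir()) or not any(d.iterdir()),
        # Full path of the executable, or None if it is not on PATH.
        "on_path": shutil.which("hisat2"),
    }

# Example call (the path is hypothetical, adjust it to your install):
# print(hisat2_status("hisatgenotype/hisat2"))
```

If `dir_empty` is true and `on_path` is None, hisat2 is probably not installed anywhere the pipeline can find it.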

@GKerdivel
Author

Yes, indeed, creating a separate thread for your issue could be useful; it can help other people.
As I mentioned, I already tried with either -p, --pp, or neither, with the same result... I just launched it with the test files and it worked, though... I think maybe my fastq files are just too big or something... What kind of data did you use?

@DarioMarzella

I might actually be experiencing the same issue as you now. Previously I used some WES files, which were not too big, and it worked flawlessly.
Now I am trying to use some WGS files (roughly 180GB per read direction, so 360GB in total), and indeed it looks like Hisat-genotype stalls forever, barely using one core although I provided 64 threads, and seemingly doing nothing until the walltime is reached.
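
To test whether input size is really the trigger, one option is to run on a small slice of the same files first. Below is a minimal sketch that copies the first N records of a gzipped FASTQ file. It assumes plain 4-line records (no wrapped sequences) and that R1 and R2 list their reads in the same order, so taking the same N from both keeps the pairs in sync. `head_fastq` and the output file names are hypothetical.

```python
import gzip
from itertools import islice

def head_fastq(src, dst, n_reads):
    """Copy the first n_reads records (4 lines each) from one gzipped
    FASTQ file to another; returns the number of complete records copied."""
    lines = 0
    with gzip.open(src, "rt") as fin, gzip.open(dst, "wt") as fout:
        # Stream line by line so that large inputs never sit in memory.
        for line in islice(fin, 4 * n_reads):
            fout.write(line)
            lines += 1
    return lines // 4

# Example call, using the same n for both mates:
# head_fastq("myfastq_R1.fastq.gz", "small_R1.fastq.gz", 1_000_000)
# head_fastq("myfastq_R2.fastq.gz", "small_R2.fastq.gz", 1_000_000)
```

The first reads of a WGS file may contain few or no HLA reads, so this only checks whether the pipeline gets past the stage where it stalls; it is not meant to produce a usable typing result.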
