profile hapestry memory usage and reduce it #55
This diagram roughly summarizes the data structures in use. Targets for optimization in order of (intuitive) priority are:
However, when running on Terra on the entirety of chr1 with 47 samples, a much less dramatic difference in RAM usage is observed:
Binary: 18.89 GB
Also, with the additional RAM checkpoint I added, it appears that GraphAligner and other postprocessing take a significant amount of RAM.
Preprocessing windows (loading reads, VcfRecords):
Overall peak (after performing alignments and storing the CSVs, no solver):
This indicates approximately:
More profiling is needed.
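The per-stage RAM checkpoint idea described above can be sketched roughly as follows. This is only an illustration in Python, not Hapestry's actual checkpoint code: `RamCheckpoints`, `peak_rss_gb`, and the stage names in the usage note are hypothetical. It records the process-wide peak resident set size after each stage, so the difference between consecutive checkpoints shows how much each stage raised the peak.

```python
import resource
import sys


def peak_rss_gb():
    """Peak resident set size of this process so far, in GB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    bytes_used = peak if sys.platform == "darwin" else peak * 1024
    return bytes_used / 1e9


class RamCheckpoints:
    """Hypothetical helper: record the process peak after each named stage."""

    def __init__(self, read_peak=peak_rss_gb):
        self.read_peak = read_peak  # injectable so the logic can be tested
        self.records = []           # list of (stage_name, peak_gb)

    def mark(self, stage):
        """Call once a stage has finished."""
        self.records.append((stage, self.read_peak()))

    def growth(self):
        """How much each stage raised the peak, in call order.

        The first entry also includes whatever was allocated before it.
        """
        out = []
        previous = 0.0
        for stage, peak in self.records:
            out.append((stage, peak - previous))
            previous = peak
        return out
```

In use, one would call `mark("preprocessing")` after loading reads and records, `mark("alignment")` after the aligner and CSV writing, and then inspect `growth()`. A stage that reports zero growth did not exceed the earlier peak, which does not mean it used no memory.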
Unfortunately, when running a full task on Terra, valgrind slows the program down to the point where it will likely never finish. Should try a much faster alternative, or some other profiler.
After working on the following optimizations, the run time and memory of the solver have been reduced:
Summary stats from each of the 8 chunks of chr1 on HPRC 47:
Pre optimization
Post optimization
*These RAM comparisons may not be accurate because I switched from measuring with VmPeak to VmHWM.
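For context on why the two measurements are not comparable: in `/proc/<pid>/status` on Linux, `VmPeak` is the peak virtual memory size, while `VmHWM` is the peak resident set size (the "high water mark"). A process can reserve far more virtual memory than it ever touches, so `VmPeak` is typically the larger number. A minimal sketch of reading both fields, assuming a Linux `/proc` filesystem; the function names are illustrative and not taken from Hapestry:

```python
def parse_status_kb(status_text, fields=("VmPeak", "VmHWM")):
    """Extract selected kB-valued fields from /proc/<pid>/status text."""
    values = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in fields:
            # Lines look like "VmHWM:     51200 kB"; take the number.
            values[key] = int(rest.split()[0])
    return values


def read_self_status(path="/proc/self/status"):
    """Read both peaks for the current process (Linux only)."""
    with open(path) as f:
        return parse_status_kb(f.read())
```

Logging both values side by side for the pre- and post-optimization runs would make the comparison consistent regardless of which one is reported.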
Hapestry has poor CPU utilization and excessive RAM usage when run on small chunks of the 1074 AoU VCFs.
It appears that the majority of the RAM usage actually comes from the alignment+optimization step, so more work is needed to reduce the input size of the problem to the solver. More fine-grained profiling is also needed to fully understand the source of the allocations.