
Foldseek Release 10-941cd33

@martin-steinegger martin-steinegger released this 19 Jan 13:29
· 1 commit to master since this release

Foldseek Release 10

Foldseek Release 10 introduces GPU support for monomer and multimer search, improved profile search and ProstT5 integration, new databases, and several performance improvements and bug fixes.

Major Features

  • GPU Support for Search and Multimers
    Both easy-search and easy-multimersearch now support GPU-accelerated searches. Use --gpu 1 to enable GPU mode and CUDA_VISIBLE_DEVICES to control which GPUs, and how many, are used (#391, #411). GPU-enabled binaries require glibc >= 2.17, NVIDIA driver >= 525.60.13, and a Turing or newer GPU. Compared with a 128-core CPU using the k-mer prefilter, searches are 4x faster on a single 4090 GPU and up to 37x faster on eight GPUs. For more details, see our preprint.
  • Improved Structural Profile Search
    Alignment results can now be converted into position-specific scoring profiles with result2profile, enabling the creation of structural protein family representations. (#411)
  • Enhanced ProstT5 Integration
    Added multi-GPU and Apple Metal support, improved handling of large input sequences through splitting, and switched the backend to llama.cpp for better compatibility and performance. (#391)
  • New Databases
    Introduced BFVD as a new virus-specific Foldseek database. (#344)
    For more details about the database, check out the BFVD paper.
  • Improved Multimer Search Workflows
    Optimized multimer workflows for improved speed and reliability, with contributions by @Woosub-Kim. For more details, see our Foldseek-Multimer preprint.
  • Clustering Multimers (First Version)
    Introduced experimental multimer clustering (easy-multimercluster) by @sooyoung-cha and @rachelse, supporting clustering by interface LDDT, chain TM-score, and complex coverage. See filtercomplex for more details.
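
The GPU workflow described above could look roughly like the following sketch. The file names (query.pdb, targetDB, result.m8, tmp) are placeholders, and the step that builds a padded database with makepaddedseqdb before searching is an assumption about the expected setup; check the Foldseek documentation for your version. The run helper is hypothetical: it prints each command and executes it only if foldseek is installed.

```shell
#!/bin/sh
# Placeholder names; replace with your own data.
QUERY="query.pdb"
TARGET_DB="targetDB"
PADDED_DB="${TARGET_DB}_pad"
RESULT="result.m8"
TMP_DIR="tmp"

# Hypothetical helper: print the command, run it only if foldseek is on PATH.
run() {
    echo "+ $*"
    if command -v foldseek >/dev/null 2>&1; then
        "$@"
    fi
}

gpu_search() {
    # Assumption: GPU search uses a padded copy of the target database.
    run foldseek makepaddedseqdb "$TARGET_DB" "$PADDED_DB"
    # Expose only GPUs 0 and 1 to the process, then search with GPU mode on.
    run env CUDA_VISIBLE_DEVICES=0,1 \
        foldseek easy-search "$QUERY" "$PADDED_DB" "$RESULT" "$TMP_DIR" --gpu 1
}

gpu_search
```

Listing a single device ID in CUDA_VISIBLE_DEVICES restricts the search to one GPU; listing more IDs makes more GPUs visible to the same command.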

Breaking Changes

  • Results may differ because masking of letters repeated six or more times (--mask-n-repeat) is now enabled by default. Disable this option to reproduce previous results.
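
To reproduce results from earlier releases, repeat masking can be switched off on the command line. The sketch below assumes that a value of 0 disables the option, as with the underlying MMseqs2 parameter; confirm this with foldseek easy-search -h before relying on it. File names are placeholders, and the run helper is hypothetical (it prints the command and executes it only if foldseek is installed).

```shell
#!/bin/sh
# Placeholder names; replace with your own data.
QUERY="query.pdb"
TARGET_DB="targetDB"
RESULT="result.m8"
TMP_DIR="tmp"

# Hypothetical helper: print the command, run it only if foldseek is on PATH.
run() {
    echo "+ $*"
    if command -v foldseek >/dev/null 2>&1; then
        "$@"
    fi
}

search_without_repeat_masking() {
    # Assumption: --mask-n-repeat 0 turns repeat masking off.
    run foldseek easy-search "$QUERY" "$TARGET_DB" "$RESULT" "$TMP_DIR" --mask-n-repeat 0
}

search_without_repeat_masking
```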

Other Features

  • Improved Compatibility with MMseqs2 Modules: createsubdb, makepaddedseqdb, and result2profile now work seamlessly with Foldseek databases.
  • Taxonomy Reports in easy-search: Added options to generate taxonomy reports directly within easy-search. (#389)
  • Residue Mapping Rework: Residue mapping has been reworked to combine most gemmi amino acids with previous Foldseek amino acids. (#387)

Bug Fixes

  • Fix order-dependent --format-output issue affecting qtmaln, ttmaln, lddt, u, and t (b40729c)
  • Fix clustering of structures without Cα information (0d8d966)

Full Changelog

View the full changelog: 9-427df8a...10-941cd33.