
Add pagination support to list_models() API for systematic model discovery #2741

Open

darwich6 opened this issue Jan 8, 2025 · 2 comments
Labels: enhancement (New feature or request)
darwich6 commented Jan 8, 2025

Is your feature request related to a problem? Please describe.
The current list_models() API only supports a limit parameter without true pagination support. This makes it impossible to systematically discover models beyond the initial limit. For example, when fetching most downloaded models:

```python
models = hf.list_models(
    filter="text-generation",
    sort="downloads",
    direction=-1,
    limit=100,
)
```

This always returns the same top 100 models, unless a new model overtakes one of them in downloads. There is no way to request models 101-200 short of raising the limit to 200 and re-fetching all 200 records. This is problematic for services that need to:

  • Discover new models systematically
  • Process models in smaller batches
  • Index or monitor the full model ecosystem
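The over-fetch-and-slice workaround mentioned above can be sketched as follows (a toy illustration only: `fetch_batch` is a hypothetical helper, and a plain list stands in for the Hub API; note that every earlier batch is re-downloaded on each call):

```python
from itertools import islice


def fetch_batch(all_models, batch_index, batch_size=100):
    """Simulate the over-fetch workaround: pull (batch_index + 1) * batch_size
    records, then keep only the last batch_size of them. Against the real API
    this means re-fetching every earlier batch on each call."""
    start = batch_index * batch_size
    # In practice this would be hf.list_models(..., limit=start + batch_size)
    overfetched = islice(all_models, start + batch_size)
    return list(overfetched)[start:]


# Toy stand-in for the Hub's sorted model listing
models = [f"model-{i}" for i in range(250)]
print(fetch_batch(iter(models), 1, 100)[0])  # model-100
```

The cost grows linearly with how deep into the listing you go, which is exactly what real pagination avoids.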

Describe the solution you'd like

Add proper pagination support to the API by either:

  1. Adding an offset parameter:

```python
models = hf.list_models(
    filter="text-generation",
    sort="downloads",
    limit=100,
    offset=100,  # get the next 100 models
)
```

  2. Or exposing the internal cursor-based pagination that's already used by paginate():

```python
response = hf.list_models(
    filter="text-generation",
    limit=100,
    cursor="next_page_token",  # from the previous response
)
next_cursor = response.next_cursor
```
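For comparison, here is how a cursor-based client loop typically looks. This is a pure sketch: `fetch_page` and `iter_all_models` are hypothetical names, and the paginated endpoint is simulated in-process rather than calling the Hub:

```python
from typing import Iterator, List, Optional, Tuple


def fetch_page(cursor: Optional[str], page_size: int) -> Tuple[List[str], Optional[str]]:
    """Stand-in for one paginated API call: returns a page of items plus the
    cursor for the next page (None when the listing is exhausted)."""
    data = [f"model-{i}" for i in range(230)]  # simulated server-side listing
    start = int(cursor) if cursor else 0
    page = data[start:start + page_size]
    next_cursor = str(start + page_size) if start + page_size < len(data) else None
    return page, next_cursor


def iter_all_models(page_size: int = 100) -> Iterator[str]:
    """Walk every page by threading the cursor through successive calls."""
    cursor: Optional[str] = None
    while True:
        page, cursor = fetch_page(cursor, page_size)
        yield from page
        if cursor is None:
            break
```

Each call transfers only one page, and resuming from any point needs nothing but the last cursor, which is what makes this shape preferable to an offset for large listings.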

Describe alternatives you've considered

Current workarounds we've tried:

  1. Fetching very large batches (1000+ models) and filtering locally
  2. Using different sort criteria to try to get different models
  3. Using the search parameter with different queries

None of these provide a reliable way to systematically discover all models and ensure we are getting different models with each call.

Additional context
Looking at the source code, the API already uses internal pagination via paginate():

```python
items = paginate(path, params=params, headers=headers)
if limit is not None:
    items = islice(items, limit)
```
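The Hub's paginated responses advertise the next page via an HTTP Link header, which `requests` exposes as `response.links["next"]["url"]`. The cursor can be recovered from that URL with the standard library alone (sketch; `cursor_from_next_url` is a hypothetical helper, not part of huggingface_hub):

```python
from urllib.parse import parse_qs, urlparse


def cursor_from_next_url(next_url):
    """Extract the opaque cursor query parameter from a rel="next" URL,
    e.g. the one found in response.links["next"]["url"]."""
    query = parse_qs(urlparse(next_url).query)
    return query.get("cursor", [None])[0]


url = "https://huggingface.co/api/models?filter=text-generation&limit=100&cursor=abc123"
print(cursor_from_next_url(url))  # abc123
```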

Exposing this functionality would align with common API practices and enable better tooling around the Hub's model ecosystem.

hanouticelina (Contributor)
Hi @darwich6,
sorry for the late answer! thanks for this feature request. cursor-based pagination is already implemented server-side in the /api/models endpoint, but we haven't exposed it yet in HfApi.list_models(). we'll prioritize adding this feature.

in the meantime, here's a small script that leverages the cursor-based pagination:

```python
from typing import Literal, Optional
from urllib.parse import parse_qs, urlparse

from huggingface_hub.utils import get_session, hf_raise_for_status


class ModelIterator:
    def __init__(self, items, next_cursor: Optional[str] = None):
        self.items = items
        self.next_cursor = next_cursor

    def __iter__(self):
        yield from self.items


def list_models_with_cursor(
    *,
    filter: Optional[str] = None,
    limit: int = 100,
    direction: Optional[Literal[-1]] = None,
    cursor: Optional[str] = None,
) -> ModelIterator:
    """
    List models with cursor-based pagination.
    """
    url = "https://huggingface.co/api/models"
    params = {
        "filter": filter,
        "limit": limit,
        "direction": direction,
        "cursor": cursor,
    }

    response = get_session().get(url, params=params)
    hf_raise_for_status(response)
    next_url = response.links.get("next", {}).get("url")
    next_cursor = None
    if next_url:
        next_cursor = parse_qs(urlparse(next_url).query)["cursor"][0]

    return ModelIterator(response.json(), next_cursor)


# Fetch the first page, then use its cursor to fetch the next one
response = list_models_with_cursor(filter="text-generation", limit=100)
if response.next_cursor:
    next_response = list_models_with_cursor(filter="text-generation", limit=100, cursor=response.next_cursor)
```

@hanouticelina hanouticelina added the enhancement New feature or request label Jan 10, 2025
darwich6 (Author)
Thank you for your prioritization and response! Looking forward to the feature!
