Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zarr v3 does not accept stores of type FSMap #2706

Open
malmans2 opened this issue Jan 14, 2025 · 9 comments
Open

zarr v3 does not accept stores of type FSMap #2706

malmans2 opened this issue Jan 14, 2025 · 9 comments
Labels
documentation Improvements to the documentation help wanted Issue could use help from someone with familiarity on the topic

Comments

@malmans2
Copy link
Member

Zarr version

v3.0.0

Numcodecs version

v0.14.1

Python Version

3.11.11

Operating System

Mac

Installation

using pip into conda environment

Description

With Zarr v2, I was able to pass a store of type fsspec.mapping.FSMap. However, with Zarr v3, I get the following error:

TypeError: Unsupported type for store_like: 'FSMap'

Is this a bug, a breaking change, or was passing mappers like FSMap not officially supported even in Zarr v2?
The snippet below only works with Zarr v2.

Steps to reproduce

import tempfile

import fsspec
import zarr

fs = fsspec.filesystem("file")
with tempfile.TemporaryDirectory() as tmpdir:
    mapper = fs.get_mapper(tmpdir)
    z = zarr.open(store=mapper, shape=(100, 100), chunks=(10, 10), dtype="f4")

Additional output

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[7], line 9
      7 with tempfile.TemporaryDirectory() as tmpdir:
      8     mapper = fs.get_mapper(tmpdir)
----> 9     z = zarr.open(store=mapper, shape=(100, 100), chunks=(10, 10), dtype="f4")

File ~/miniforge3/envs/zarr/lib/python3.11/site-packages/zarr/_compat.py:43, in _deprecate_positional_args.<locals>._inner_deprecate_positional_args.<locals>.inner_f(*args, **kwargs)
     41 extra_args = len(args) - len(all_args)
     42 if extra_args <= 0:
---> 43     return f(*args, **kwargs)
     45 # extra_args > 0
     46 args_msg = [
     47     f"{name}={arg}"
     48     for name, arg in zip(kwonly_args[:extra_args], args[-extra_args:], strict=False)
     49 ]

File ~/miniforge3/envs/zarr/lib/python3.11/site-packages/zarr/api/synchronous.py:190, in open(store, mode, zarr_version, zarr_format, path, storage_options, **kwargs)
    152 @_deprecate_positional_args
    153 def open(
    154     store: StoreLike | None = None,
   (...)
    161     **kwargs: Any,  # TODO: type kwargs as valid args to async_api.open
    162 ) -> Array | Group:
    163     """Open a group or array using file-mode-like semantics.
    164
    165     Parameters
   (...)
    188         Return type depends on what exists in the given store.
    189     """
--> 190     obj = sync(
    191         async_api.open(
    192             store=store,
    193             mode=mode,
    194             zarr_version=zarr_version,
    195             zarr_format=zarr_format,
    196             path=path,
    197             storage_options=storage_options,
    198             **kwargs,
    199         )
    200     )
    201     if isinstance(obj, AsyncArray):
    202         return Array(obj)

File ~/miniforge3/envs/zarr/lib/python3.11/site-packages/zarr/core/sync.py:142, in sync(coro, loop, timeout)
    139 return_result = next(iter(finished)).result()
    141 if isinstance(return_result, BaseException):
--> 142     raise return_result
    143 else:
    144     return return_result

File ~/miniforge3/envs/zarr/lib/python3.11/site-packages/zarr/core/sync.py:98, in _runner(coro)
     93 """
     94 Await a coroutine and return the result of running it. If awaiting the coroutine raises an
     95 exception, the exception will be returned.
     96 """
     97 try:
---> 98     return await coro
     99 except Exception as ex:
    100     return ex

File ~/miniforge3/envs/zarr/lib/python3.11/site-packages/zarr/api/asynchronous.py:309, in open(store, mode, zarr_version, zarr_format, path, storage_options, **kwargs)
    280 """Convenience function to open a group or array using file-mode-like semantics.
    281
    282 Parameters
   (...)
    305     Return type depends on what exists in the given store.
    306 """
    307 zarr_format = _handle_zarr_version_or_format(zarr_version=zarr_version, zarr_format=zarr_format)
--> 309 store_path = await make_store_path(store, mode=mode, path=path, storage_options=storage_options)
    311 # TODO: the mode check below seems wrong!
    312 if "shape" not in kwargs and mode in {"a", "r", "r+", "w"}:

File ~/miniforge3/envs/zarr/lib/python3.11/site-packages/zarr/storage/_common.py:316, in make_store_path(store_like, path, mode, storage_options)
    314     else:
    315         msg = f"Unsupported type for store_like: '{type(store_like).__name__}'"  # type: ignore[unreachable]
--> 316         raise TypeError(msg)
    318     result = await StorePath.open(store, path=path_normalized, mode=mode)
    320 if storage_options and not used_storage_options:

TypeError: Unsupported type for store_like: 'FSMap'
@malmans2 malmans2 added the bug Potential issues with the zarr-python library label Jan 14, 2025
@d-v-b
Copy link
Contributor

d-v-b commented Jan 14, 2025

accepting instances of FSMap was part of the v2 design (see this PR), so I'd say you are seeing a regression / bug.

I will let the fsspec experts weigh in on whether there's any good reason not to accept FSMap instances here -- this may very well be a case of "we didn't implement this yet", in which case we should implement it :)

@jhamman
Copy link
Member

jhamman commented Jan 17, 2025

This was an intentional breaking change. The Fsspec's FSMap interface was used as a drop in for Zarr's previous store interface (i.e. MutableMapping).

If you have a fsspec FileSystem, you can create the store directly though. Something like this should work:

import fsspec
import zarr

fs = fsspec.filesystem("s3", asynchronous=True)
store = zarr.storage.FsspecStore(fs, path="bucket/foo/bar")
z = zarr.open(store=mapper, shape=(100, 100), chunks=(10, 10), dtype="f4")

Note that the file implementation in fsspec does not support async so it cannot be currently used in zarr (see #2533 for a wip pr).

@jhamman jhamman removed the bug Potential issues with the zarr-python library label Jan 17, 2025
@dstansby dstansby added the documentation Improvements to the documentation label Jan 17, 2025
@dstansby
Copy link
Contributor

I'll label this as docs, since it would be good to put the above comment in our migration guide.

@rabernat
Copy link
Contributor

rabernat commented Jan 17, 2025

This is going to bite a lot of people. 😬

It is relatively easy to convert an FSMap object to the type of store we need. We could consider adding some convenience layer for this.

@jhamman
Copy link
Member

jhamman commented Jan 17, 2025

It is relatively easy to convert an FSMap object to the type of store we need. We could consider adding some convenience layer for this.

@martindurant will know best but remember that we're using Fsspec's async interface which is not active in most (any?) FSMap implementations. There was concern a while back about loop-in-loop issues if using filesystems with asynchronous=false.

@martindurant
Copy link
Member

Rather than wrapping, passing the filesystem and path arguments directly seems the right thing to do, which was already suggested above.

I'm not sure what might happen if a (sync) FSMap was used at the (hidden) dict within a (async) MemoryStore. It would probably work and for local files not make a performance difference?

@jhamman
Copy link
Member

jhamman commented Jan 23, 2025

👍 - very open to us finding a way to accept FSMap objects as store arguments - we'll need to convert them to an FSspecStore but it sounds like we can do that without much risk.

@jhamman jhamman added the help wanted Issue could use help from someone with familiarity on the topic label Jan 23, 2025
@martindurant
Copy link
Member

we'll need to convert them to an FSspecStore

This will generally require wrapping sync FSs or setting asynchronous=True for async-able FSs.

@christine-e-smit
Copy link

For the record, we ran into this issue when we tried to upgrade to 3.0.1 yesterday to deal with a python dependency build issue. So +1 on this causing users some pain. We're now planning to go ahead and take the plunge earlier than planned to transition to zarr>=3 using your handy, dandy guide: https://zarr.readthedocs.io/en/latest/user-guide/v3_migration.html.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements to the documentation help wanted Issue could use help from someone with familiarity on the topic
Projects
None yet
Development

No branches or pull requests

7 participants