
Another zarr 3 "Unsupported type for store_like ..." issue #2748 (Closed)

ktyle opened this issue Jan 22, 2025 · 3 comments


ktyle commented Jan 22, 2025

Hi, similar to #2706, I'm wondering how I might be able to migrate an existing workflow from v2 to v3.

In short, this is how I would open a particular pair of Zarr files from an AWS bucket in v2:

import s3fs
import xarray as xr

url1 = 's3://hrrrzarr/sfc/20250121/20250121_18z_anl.zarr/2m_above_ground/TMP/2m_above_ground'
url2 = 's3://hrrrzarr/sfc/20250121/20250121_18z_anl.zarr/2m_above_ground/TMP'

# Map each S3 prefix to a MutableMapping and open both groups with xarray
fs = s3fs.S3FileSystem(anon=True)
file1 = s3fs.S3Map(url1, s3=fs)
file2 = s3fs.S3Map(url2, s3=fs)

ds = xr.open_mfdataset([file1, file2], engine='zarr')

Following @jhamman's advice in the above-referenced issue, I tried:

import fsspec
import xarray as xr
import zarr

# don't include leading s3://
url1 = 'hrrrzarr/sfc/20250121/20250121_18z_anl.zarr/2m_above_ground/TMP/2m_above_ground'
url2 = 'hrrrzarr/sfc/20250121/20250121_18z_anl.zarr/2m_above_ground/TMP'

fs = fsspec.filesystem("s3", asynchronous=True)
store1 = zarr.storage.FsspecStore(fs, path=url1)
store2 = zarr.storage.FsspecStore(fs, path=url2)

z1 = zarr.open(store=store1)
z2 = zarr.open(store=store2)

ds = xr.open_mfdataset([z1, z2], engine='zarr')

This fails with:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[15], line 1
----> 1 ds = xr.open_mfdataset([z1, z2],engine='zarr')

File /nfs/knight/mamba_aug23/envs/HRRR-AWS-cookbook-dev/lib/python3.12/site-packages/xarray/backends/api.py:1611, in open_mfdataset(paths, chunks, concat_dim, compat, preprocess, engine, data_vars, coords, combine, parallel, join, attrs_file, combine_attrs, **kwargs)
   1608     open_ = open_dataset
   1609     getattr_ = getattr
-> 1611 datasets = [open_(p, **open_kwargs) for p in paths1d]
   1612 closers = [getattr_(ds, "_close") for ds in datasets]
   1613 if preprocess is not None:

File /nfs/knight/mamba_aug23/envs/HRRR-AWS-cookbook-dev/lib/python3.12/site-packages/xarray/backends/api.py:679, in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, inline_array, chunked_array_type, from_array_kwargs, backend_kwargs, **kwargs)
    667 decoders = _resolve_decoders_kwargs(
    668     decode_cf,
    669     open_backend_dataset_parameters=backend.open_dataset_parameters,
   (...)
    675     decode_coords=decode_coords,
    676 )
    678 overwrite_encoded_chunks = kwargs.pop("overwrite_encoded_chunks", None)
--> 679 backend_ds = backend.open_dataset(
    680     filename_or_obj,
    681     drop_variables=drop_variables,
    682     **decoders,
    683     **kwargs,
    684 )
    685 ds = _dataset_from_backend_dataset(
    686     backend_ds,
    687     filename_or_obj,
   (...)
    697     **kwargs,
    698 )
    699 return ds

File /nfs/knight/mamba_aug23/envs/HRRR-AWS-cookbook-dev/lib/python3.12/site-packages/xarray/backends/zarr.py:1564, in ZarrBackendEntrypoint.open_dataset(self, filename_or_obj, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, use_cftime, decode_timedelta, group, mode, synchronizer, consolidated, chunk_store, storage_options, zarr_version, zarr_format, store, engine, use_zarr_fill_value_as_mask, cache_members)
   1562 filename_or_obj = _normalize_path(filename_or_obj)
   1563 if not store:
-> 1564     store = ZarrStore.open_group(
   1565         filename_or_obj,
   1566         group=group,
   1567         mode=mode,
   1568         synchronizer=synchronizer,
   1569         consolidated=consolidated,
   1570         consolidate_on_close=False,
   1571         chunk_store=chunk_store,
   1572         storage_options=storage_options,
   1573         zarr_version=zarr_version,
   1574         use_zarr_fill_value_as_mask=None,
   1575         zarr_format=zarr_format,
   1576         cache_members=cache_members,
   1577     )
   1579 store_entrypoint = StoreBackendEntrypoint()
   1580 with close_on_error(store):

File /nfs/knight/mamba_aug23/envs/HRRR-AWS-cookbook-dev/lib/python3.12/site-packages/xarray/backends/zarr.py:703, in ZarrStore.open_group(cls, store, mode, synchronizer, group, consolidated, consolidate_on_close, chunk_store, storage_options, append_dim, write_region, safe_chunks, zarr_version, zarr_format, use_zarr_fill_value_as_mask, write_empty, cache_members)
    678 @classmethod
    679 def open_group(
    680     cls,
   (...)
    696     cache_members: bool = True,
    697 ):
    698     (
    699         zarr_group,
    700         consolidate_on_close,
    701         close_store_on_close,
    702         use_zarr_fill_value_as_mask,
--> 703     ) = _get_open_params(
    704         store=store,
    705         mode=mode,
    706         synchronizer=synchronizer,
    707         group=group,
    708         consolidated=consolidated,
    709         consolidate_on_close=consolidate_on_close,
    710         chunk_store=chunk_store,
    711         storage_options=storage_options,
    712         zarr_version=zarr_version,
    713         use_zarr_fill_value_as_mask=use_zarr_fill_value_as_mask,
    714         zarr_format=zarr_format,
    715     )
    717     return cls(
    718         zarr_group,
    719         mode,
   (...)
    727         cache_members,
    728     )

File /nfs/knight/mamba_aug23/envs/HRRR-AWS-cookbook-dev/lib/python3.12/site-packages/xarray/backends/zarr.py:1761, in _get_open_params(store, mode, synchronizer, group, consolidated, consolidate_on_close, chunk_store, storage_options, zarr_version, use_zarr_fill_value_as_mask, zarr_format)
   1759 if consolidated is None:
   1760     try:
-> 1761         zarr_group = zarr.open_consolidated(store, **open_kwargs)
   1762     except (ValueError, KeyError):
   1763         # ValueError in zarr-python 3.x, KeyError in 2.x.
   1764         try:

File /nfs/knight/mamba_aug23/envs/HRRR-AWS-cookbook-dev/lib/python3.12/site-packages/zarr/api/synchronous.py:212, in open_consolidated(use_consolidated, *args, **kwargs)
    207 def open_consolidated(*args: Any, use_consolidated: Literal[True] = True, **kwargs: Any) -> Group:
    208     """
    209     Alias for :func:`open_group` with ``use_consolidated=True``.
    210     """
    211     return Group(
--> 212         sync(async_api.open_consolidated(*args, use_consolidated=use_consolidated, **kwargs))
    213     )

File /nfs/knight/mamba_aug23/envs/HRRR-AWS-cookbook-dev/lib/python3.12/site-packages/zarr/core/sync.py:142, in sync(coro, loop, timeout)
    139 return_result = next(iter(finished)).result()
    141 if isinstance(return_result, BaseException):
--> 142     raise return_result
    143 else:
    144     return return_result

File /nfs/knight/mamba_aug23/envs/HRRR-AWS-cookbook-dev/lib/python3.12/site-packages/zarr/core/sync.py:98, in _runner(coro)
     93 """
     94 Await a coroutine and return the result of running it. If awaiting the coroutine raises an
     95 exception, the exception will be returned.
     96 """
     97 try:
---> 98     return await coro
     99 except Exception as ex:
    100     return ex

File /nfs/knight/mamba_aug23/envs/HRRR-AWS-cookbook-dev/lib/python3.12/site-packages/zarr/api/asynchronous.py:346, in open_consolidated(use_consolidated, *args, **kwargs)
    341 if use_consolidated is not True:
    342     raise TypeError(
    343         "'use_consolidated' must be 'True' in 'open_consolidated'. Use 'open' with "
    344         "'use_consolidated=False' to bypass consolidated metadata."
    345     )
--> 346 return await open_group(*args, use_consolidated=use_consolidated, **kwargs)

File /nfs/knight/mamba_aug23/envs/HRRR-AWS-cookbook-dev/lib/python3.12/site-packages/zarr/api/asynchronous.py:800, in open_group(store, mode, cache_attrs, synchronizer, path, chunk_store, storage_options, zarr_version, zarr_format, meta_array, attributes, use_consolidated)
    797 if chunk_store is not None:
    798     warnings.warn("chunk_store is not yet implemented", RuntimeWarning, stacklevel=2)
--> 800 store_path = await make_store_path(store, mode=mode, storage_options=storage_options, path=path)
    802 if attributes is None:
    803     attributes = {}

File /nfs/knight/mamba_aug23/envs/HRRR-AWS-cookbook-dev/lib/python3.12/site-packages/zarr/storage/_common.py:316, in make_store_path(store_like, path, mode, storage_options)
    314     else:
    315         msg = f"Unsupported type for store_like: '{type(store_like).__name__}'"  # type: ignore[unreachable]
--> 316         raise TypeError(msg)
    318     result = await StorePath.open(store, path=path_normalized, mode=mode)
    320 if storage_options and not used_storage_options:

TypeError: Unsupported type for store_like: 'Group'

Is there a way to do this in Zarr 3?

jhamman (Member) commented Jan 22, 2025

@ktyle - if you remove the zarr.open lines from your example, things work:

In [2]: import xarray as xr

In [3]: import fsspec

In [4]: import zarr

In [5]: url1 = 'hrrrzarr/sfc/20250121/20250121_18z_anl.zarr/2m_above_ground/TMP/2m_above_ground'
   ...: url2 = 'hrrrzarr/sfc/20250121/20250121_18z_anl.zarr/2m_above_ground/TMP'
   ...:
   ...: fs = fsspec.filesystem("s3", asynchronous=True)
   ...: store1 = zarr.storage.FsspecStore(fs, path=url1)
   ...: store2 = zarr.storage.FsspecStore(fs, path=url2)
   ...:
   ...: ds = xr.open_mfdataset([store1, store2],engine='zarr')

In [6]: ds
Out[6]:
<xarray.Dataset> Size: 8MB
Dimensions:                  (projection_y_coordinate: 1059,
                              projection_x_coordinate: 1799)
Coordinates:
  * projection_x_coordinate  (projection_x_coordinate) float64 14kB -2.698e+0...
  * projection_y_coordinate  (projection_y_coordinate) float64 8kB -1.587e+06...
Data variables:
    TMP                      (projection_y_coordinate, projection_x_coordinate) float32 8MB dask.array<chunksize=(150, 150), meta=np.ndarray>
    forecast_period          timedelta64[ns] 8B ...
    forecast_reference_time  datetime64[ns] 8B ...
    height                   float64 8B ...
    pressure                 float64 8B ...
    time                     datetime64[ns] 8B ...

I don't think Xarray supports passing a zarr.Group into open_mfdataset, so that explains the error you were getting above.
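
For completeness, here is a self-contained sketch of the same pattern (this block is an addition, not from the original comments), assuming anonymous S3 access as in the v2 example. It keeps the FsspecStore objects around, so each group can still be opened with zarr.open when needed, and uses xr.merge in place of open_mfdataset's combine step, which for these two groups should give an equivalent result, though that is worth verifying against your data:

import fsspec
import xarray as xr
import zarr

url1 = 'hrrrzarr/sfc/20250121/20250121_18z_anl.zarr/2m_above_ground/TMP/2m_above_ground'
url2 = 'hrrrzarr/sfc/20250121/20250121_18z_anl.zarr/2m_above_ground/TMP'

# anon=True is an assumption here, mirroring the anonymous access in the v2 example
fs = fsspec.filesystem("s3", asynchronous=True, anon=True)
store1 = zarr.storage.FsspecStore(fs, path=url1)
store2 = zarr.storage.FsspecStore(fs, path=url2)

# The xarray zarr backend accepts stores, not zarr.Group objects, so pass the
# stores to xarray; open a group separately only if you need direct zarr access.
z1 = zarr.open(store=store1)  # optional

ds = xr.merge([xr.open_dataset(s, engine='zarr') for s in (store1, store2)])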

ktyle (Author) commented Jan 22, 2025

@jhamman perfect! That did the trick. 🥇

ktyle closed this as completed Jan 22, 2025
rabernat (Contributor) commented:

> I don't think Xarray supports passing a zarr.Group into open_mfdataset

It probably should though! There are times when that would be very convenient.
