# 8944: When opening a zipped Dataset stored under Zarr on a s3 bucket, `botocore.exceptions.NoCredentialsError: Unable to locate credentials`

| field | value |
|---|---|
| id | 2243268327 |
| node_id | I_kwDOAMm_X86FtY7n |
| user | 119882363 |
| state | closed (completed) |
| locked | 0 |
| comments | 2 |
| created_at | 2024-04-15T10:13:58Z |
| updated_at | 2024-04-15T19:51:44Z |
| closed_at | 2024-04-15T19:51:43Z |
| author_association | NONE |
| repo | pydata/xarray (13221727) |
| type | issue |
### What happened?

A zipped Zarr store is available on an s3 bucket that requires authentication. When opening it with `xr.open_dataset` and the matching `storage_options` (see the example below), the call fails with:

`botocore.exceptions.NoCredentialsError: Unable to locate credentials`

### What did you expect to happen?

I expected the dataset to be openable.

### Minimal Complete Verifiable Example

It is difficult for me to provide a self-contained MCVE, as it requires a remote file on an s3 bucket requiring authentication. To reproduce fully, one must have access to a zipped Zarr store on an s3 bucket requiring authentication.

```Python
import xarray as xr

credentials_key = "key"
credentials_secret = "secret"
credentials_endpoint_url = "endpoint_url"
credentials_region_name = "region"

storage_options = dict(
    key=credentials_key,
    secret=credentials_secret,
    client_kwargs=dict(
        endpoint_url=credentials_endpoint_url,
        region_name=credentials_region_name,
    ),
)

zip_s3_zarr_path = "zip::s3://path/to/my/dataset.zarr.zip"

xds = xr.open_dataset(
    zip_s3_zarr_path,
    backend_kwargs={"storage_options": storage_options},
    engine="zarr",
    group="/",
    consolidated=True,
)
```

### MVCE confirmation
### Relevant log output
```Python
---------------------------------------------------------------------------
NoCredentialsError Traceback (most recent call last)
Cell In[4], line 1
----> 1 xds = xr.open_dataset(
2 zip_s3_zarr_path,
3 backend_kwargs={"storage_options": storage_options},
4 engine="zarr",
5 group="/",
6 consolidated=True,
7 )
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/xarray/backends/api.py:573, in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, inline_array, chunked_array_type, from_array_kwargs, backend_kwargs, **kwargs)
561 decoders = _resolve_decoders_kwargs(
562 decode_cf,
563 open_backend_dataset_parameters=backend.open_dataset_parameters,
(...)
569 decode_coords=decode_coords,
570 )
572 overwrite_encoded_chunks = kwargs.pop("overwrite_encoded_chunks", None)
--> 573 backend_ds = backend.open_dataset(
574 filename_or_obj,
575 drop_variables=drop_variables,
576 **decoders,
577 **kwargs,
578 )
579 ds = _dataset_from_backend_dataset(
580 backend_ds,
581 filename_or_obj,
(...)
591 **kwargs,
592 )
593 return ds
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/xarray/backends/zarr.py:967, in ZarrBackendEntrypoint.open_dataset(self, filename_or_obj, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, use_cftime, decode_timedelta, group, mode, synchronizer, consolidated, chunk_store, storage_options, stacklevel, zarr_version)
946 def open_dataset( # type: ignore[override] # allow LSP violation, not supporting **kwargs
947 self,
948 filename_or_obj: str | os.PathLike[Any] | BufferedIOBase | AbstractDataStore,
(...)
964 zarr_version=None,
965 ) -> Dataset:
966 filename_or_obj = _normalize_path(filename_or_obj)
--> 967 store = ZarrStore.open_group(
968 filename_or_obj,
969 group=group,
970 mode=mode,
971 synchronizer=synchronizer,
972 consolidated=consolidated,
973 consolidate_on_close=False,
974 chunk_store=chunk_store,
975 storage_options=storage_options,
976 stacklevel=stacklevel + 1,
977 zarr_version=zarr_version,
978 )
980 store_entrypoint = StoreBackendEntrypoint()
981 with close_on_error(store):
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/xarray/backends/zarr.py:454, in ZarrStore.open_group(cls, store, mode, synchronizer, group, consolidated, consolidate_on_close, chunk_store, storage_options, append_dim, write_region, safe_chunks, stacklevel, zarr_version, write_empty)
451 raise FileNotFoundError(f"No such file or directory: '{store}'")
452 elif consolidated:
453 # TODO: an option to pass the metadata_key keyword
--> 454 zarr_group = zarr.open_consolidated(store, **open_kwargs)
455 else:
456 zarr_group = zarr.open_group(store, **open_kwargs)
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/zarr/convenience.py:1334, in open_consolidated(store, metadata_key, mode, **kwargs)
1332 # normalize parameters
1333 zarr_version = kwargs.get("zarr_version")
-> 1334 store = normalize_store_arg(
1335 store, storage_options=kwargs.get("storage_options"), mode=mode, zarr_version=zarr_version
1336 )
1337 if mode not in {"r", "r+"}:
1338 raise ValueError("invalid mode, expected either 'r' or 'r+'; found {!r}".format(mode))
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/zarr/storage.py:197, in normalize_store_arg(store, storage_options, mode, zarr_version)
195 else:
196 raise ValueError("zarr_version must be either 2 or 3")
--> 197 return normalize_store(store, storage_options, mode)
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/zarr/storage.py:167, in _normalize_store_arg_v2(store, storage_options, mode)
165 if isinstance(store, str):
166 if "://" in store or "::" in store:
--> 167 return FSStore(store, mode=mode, **(storage_options or {}))
168 elif storage_options:
169 raise ValueError("storage_options passed with non-fsspec path")
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/zarr/storage.py:1377, in FSStore.__init__(self, url, normalize_keys, key_separator, mode, exceptions, dimension_separator, fs, check, create, missing_exceptions, **storage_options)
1375 if protocol in (None, "file") and not storage_options.get("auto_mkdir"):
1376 storage_options["auto_mkdir"] = True
-> 1377 self.map = fsspec.get_mapper(url, **{**mapper_options, **storage_options})
1378 self.fs = self.map.fs # for direct operations
1379 self.path = self.fs._strip_protocol(url)
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/fsspec/mapping.py:245, in get_mapper(url, check, create, missing_exceptions, alternate_root, **kwargs)
214 """Create key-value interface for given URL and options
215
216 The URL will be of the form "protocol://location" and point to the root
(...)
242 ``FSMap`` instance, the dict-like key-value store.
243 """
244 # Removing protocol here - could defer to each open() on the backend
--> 245 fs, urlpath = url_to_fs(url, **kwargs)
246 root = alternate_root if alternate_root is not None else urlpath
247 return FSMap(root, fs, check, create, missing_exceptions=missing_exceptions)
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/fsspec/core.py:388, in url_to_fs(url, **kwargs)
386 inkwargs["fo"] = urls
387 urlpath, protocol, _ = chain[0]
--> 388 fs = filesystem(protocol, **inkwargs)
389 return fs, urlpath
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/fsspec/registry.py:290, in filesystem(protocol, **storage_options)
283 warnings.warn(
284 "The 'arrow_hdfs' protocol has been deprecated and will be "
285 "removed in the future. Specify it as 'hdfs'.",
286 DeprecationWarning,
287 )
289 cls = get_filesystem_class(protocol)
--> 290 return cls(**storage_options)
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/fsspec/spec.py:79, in _Cached.__call__(cls, *args, **kwargs)
77 return cls._cache[token]
78 else:
---> 79 obj = super().__call__(*args, **kwargs)
80 # Setting _fs_token here causes some static linters to complain.
81 obj._fs_token_ = token
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/fsspec/implementations/zip.py:56, in ZipFileSystem.__init__(self, fo, mode, target_protocol, target_options, compression, allowZip64, compresslevel, **kwargs)
52 fo = fsspec.open(
53 fo, mode=mode + "b", protocol=target_protocol, **(target_options or {}), # **kwargs
54 )
55 self.of = fo
---> 56 self.fo = fo.__enter__() # the whole instance is a context
57 self.zip = zipfile.ZipFile(
58 self.fo,
59 mode=mode,
(...)
62 compresslevel=compresslevel,
63 )
64 self.dir_cache = None
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/fsspec/core.py:100, in OpenFile.__enter__(self)
97 def __enter__(self):
98 mode = self.mode.replace("t", "").replace("b", "") + "b"
--> 100 f = self.fs.open(self.path, mode=mode)
102 self.fobjects = [f]
104 if self.compression is not None:
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/fsspec/spec.py:1307, in AbstractFileSystem.open(self, path, mode, block_size, cache_options, compression, **kwargs)
1305 else:
1306 ac = kwargs.pop("autocommit", not self._intrans)
-> 1307 f = self._open(
1308 path,
1309 mode=mode,
1310 block_size=block_size,
1311 autocommit=ac,
1312 cache_options=cache_options,
1313 **kwargs,
1314 )
1315 if compression is not None:
1316 from fsspec.compression import compr
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/s3fs/core.py:671, in S3FileSystem._open(self, path, mode, block_size, acl, version_id, fill_cache, cache_type, autocommit, size, requester_pays, cache_options, **kwargs)
668 if cache_type is None:
669 cache_type = self.default_cache_type
--> 671 return S3File(
672 self,
673 path,
674 mode,
675 block_size=block_size,
676 acl=acl,
677 version_id=version_id,
678 fill_cache=fill_cache,
679 s3_additional_kwargs=kw,
680 cache_type=cache_type,
681 autocommit=autocommit,
682 requester_pays=requester_pays,
683 cache_options=cache_options,
684 size=size,
685 )
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/s3fs/core.py:2099, in S3File.__init__(self, s3, path, mode, block_size, acl, version_id, fill_cache, s3_additional_kwargs, autocommit, cache_type, requester_pays, cache_options, size)
2097 self.details = s3.info(path)
2098 self.version_id = self.details.get("VersionId")
-> 2099 super().__init__(
2100 s3,
2101 path,
2102 mode,
2103 block_size,
2104 autocommit=autocommit,
2105 cache_type=cache_type,
2106 cache_options=cache_options,
2107 size=size,
2108 )
2109 self.s3 = self.fs # compatibility
2111 # when not using autocommit we want to have transactional state to manage
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/fsspec/spec.py:1663, in AbstractBufferedFile.__init__(self, fs, path, mode, block_size, autocommit, cache_type, cache_options, size, **kwargs)
1661 self.size = size
1662 else:
-> 1663 self.size = self.details["size"]
1664 self.cache = caches[cache_type](
1665 self.blocksize, self._fetch_range, self.size, **cache_options
1666 )
1667 else:
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/fsspec/spec.py:1676, in AbstractBufferedFile.details(self)
1673 @property
1674 def details(self):
1675 if self._details is None:
-> 1676 self._details = self.fs.info(self.path)
1677 return self._details
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/fsspec/asyn.py:118, in sync_wrapper.<locals>.wrapper(*args, **kwargs)
115 @functools.wraps(func)
116 def wrapper(*args, **kwargs):
117 self = obj or args[0]
--> 118 return sync(self.loop, func, *args, **kwargs)
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/fsspec/asyn.py:103, in sync(loop, func, timeout, *args, **kwargs)
101 raise FSTimeoutError from return_result
102 elif isinstance(return_result, BaseException):
--> 103 raise return_result
104 else:
105 return return_result
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/fsspec/asyn.py:56, in _runner(event, coro, result, timeout)
54 coro = asyncio.wait_for(coro, timeout=timeout)
55 try:
---> 56 result[0] = await coro
57 except Exception as ex:
58 result[0] = ex
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/s3fs/core.py:1302, in S3FileSystem._info(self, path, bucket, key, refresh, version_id)
1300 if key:
1301 try:
-> 1302 out = await self._call_s3(
1303 "head_object",
1304 self.kwargs,
1305 Bucket=bucket,
1306 Key=key,
1307 **version_id_kw(version_id),
1308 **self.req_kw,
1309 )
1310 return {
1311 "ETag": out.get("ETag", ""),
1312 "LastModified": out["LastModified"],
(...)
1318 "ContentType": out.get("ContentType"),
1319 }
1320 except FileNotFoundError:
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/s3fs/core.py:348, in S3FileSystem._call_s3(self, method, *akwarglist, **kwargs)
346 logger.debug("CALL: %s - %s - %s", method.__name__, akwarglist, kw2)
347 additional_kwargs = self._get_s3_method_kwargs(method, *akwarglist, **kwargs)
--> 348 return await _error_wrapper(
349 method, kwargs=additional_kwargs, retries=self.retries
350 )
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/s3fs/core.py:140, in _error_wrapper(func, args, kwargs, retries)
138 err = e
139 err = translate_boto_error(err)
--> 140 raise err
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/s3fs/core.py:113, in _error_wrapper(func, args, kwargs, retries)
111 for i in range(retries):
112 try:
--> 113 return await func(*args, **kwargs)
114 except S3_RETRYABLE_ERRORS as e:
115 err = e
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/aiobotocore/client.py:366, in AioBaseClient._make_api_call(self, operation_name, api_params)
362 maybe_compress_request(
363 self.meta.config, request_dict, operation_model
364 )
365 apply_request_checksum(request_dict)
--> 366 http, parsed_response = await self._make_request(
367 operation_model, request_dict, request_context
368 )
370 await self.meta.events.emit(
371 'after-call.{service_id}.{operation_name}'.format(
372 service_id=service_id, operation_name=operation_name
(...)
377 context=request_context,
378 )
380 if http.status_code >= 300:
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/aiobotocore/client.py:391, in AioBaseClient._make_request(self, operation_model, request_dict, request_context)
387 async def _make_request(
388 self, operation_model, request_dict, request_context
389 ):
390 try:
--> 391 return await self._endpoint.make_request(
392 operation_model, request_dict
393 )
394 except Exception as e:
395 await self.meta.events.emit(
396 'after-call-error.{service_id}.{operation_name}'.format(
397 service_id=self._service_model.service_id.hyphenize(),
(...)
401 context=request_context,
402 )
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/aiobotocore/endpoint.py:96, in AioEndpoint._send_request(self, request_dict, operation_model)
94 context = request_dict['context']
95 self._update_retries_context(context, attempts)
---> 96 request = await self.create_request(request_dict, operation_model)
97 success_response, exception = await self._get_response(
98 request, operation_model, context
99 )
100 while await self._needs_retry(
101 attempts,
102 operation_model,
(...)
105 exception,
106 ):
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/aiobotocore/endpoint.py:84, in AioEndpoint.create_request(self, params, operation_model)
80 service_id = operation_model.service_model.service_id.hyphenize()
81 event_name = 'request-created.{service_id}.{op_name}'.format(
82 service_id=service_id, op_name=operation_model.name
83 )
---> 84 await self._event_emitter.emit(
85 event_name,
86 request=request,
87 operation_name=operation_model.name,
88 )
89 prepared_request = self.prepare_request(request)
90 return prepared_request
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/aiobotocore/hooks.py:66, in AioHierarchicalEmitter._emit(self, event_name, kwargs, stop_on_response)
63 logger.debug('Event %s: calling handler %s', event_name, handler)
65 # Await the handler if its a coroutine.
---> 66 response = await resolve_awaitable(handler(**kwargs))
67 responses.append((handler, response))
68 if stop_on_response and response is not None:
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/aiobotocore/_helpers.py:15, in resolve_awaitable(obj)
13 async def resolve_awaitable(obj):
14 if inspect.isawaitable(obj):
---> 15 return await obj
17 return obj
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/aiobotocore/signers.py:24, in AioRequestSigner.handler(self, operation_name, request, **kwargs)
19 async def handler(self, operation_name=None, request=None, **kwargs):
20 # This is typically hooked up to the "request-created" event
21 # from a client's event emitter. When a new request is created
22 # this method is invoked to sign the request.
23 # Don't call this method directly.
---> 24 return await self.sign(operation_name, request)
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/aiobotocore/signers.py:82, in AioRequestSigner.sign(self, operation_name, request, region_name, signing_type, expires_in, signing_name)
79 else:
80 raise e
---> 82 auth.add_auth(request)
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/botocore/auth.py:418, in SigV4Auth.add_auth(self, request)
416 def add_auth(self, request):
417 if self.credentials is None:
--> 418 raise NoCredentialsError()
419 datetime_now = datetime.datetime.utcnow()
420 request.context['timestamp'] = datetime_now.strftime(SIGV4_TIMESTAMP)
NoCredentialsError: Unable to locate credentials
```
### Anything else we need to know?
#### Summary
While debugging, I found the fix, which needs to be made in the `fsspec` library.
I still wanted to open this issue in the xarray repo, since the bug hit me while using xarray and other xarray users might run into the same error; the issue here can serve as a bridge for future users.
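For reference, here is a minimal fsspec-only sketch of the failing code path. The bucket and credentials are placeholders, so this is illustrative rather than runnable against a real store:

```python
import fsspec

# Placeholder bucket and credentials. With a chained "zip::s3://" URL,
# flat storage options end up in ZipFileSystem's **kwargs; in fsspec
# 2023.10.0 they are dropped before the inner s3 file is opened, so the
# s3 layer falls back to default (here: missing) credentials.
mapper = fsspec.get_mapper(
    "zip::s3://path/to/my/dataset.zarr.zip",
    key="key",
    secret="secret",
)
# ZipFileSystem opens the underlying file eagerly (see the traceback),
# so the call above already raises NoCredentialsError without the fix.
```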
#### Details
Bug in `fsspec` 2023.10.0: `ZipFileSystem.__init__` forgets to forward its `**kwargs` to the `fsspec.open` call that opens the underlying file (here, on s3).
Current:
```python
# **kwargs (e.g. the s3 credentials) are silently dropped here
fo = fsspec.open(
    fo, mode=mode + "b", protocol=target_protocol, **(target_options or {})
)
```
Bugfix (passing the kwargs):
```python
# forwarding **kwargs lets the caller's storage options reach the inner open
fo = fsspec.open(
    fo, mode=mode + "b", protocol=target_protocol, **(target_options or {}), **kwargs
)
```
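Until the fix lands, a possible workaround on the xarray side (untested beyond my debugging setup, and relying on fsspec's convention of routing chained-URL options by protocol name) is to nest the s3 options under an `"s3"` key, so they reach the inner filesystem without going through `ZipFileSystem`'s dropped `**kwargs`:

```python
import xarray as xr

# Possible workaround (assumes fsspec routes options nested under a
# protocol name -- here "s3" -- to that layer of a chained URL, which
# sidesteps the dropped **kwargs in ZipFileSystem.__init__):
storage_options = {
    "s3": dict(
        key="key",
        secret="secret",
        client_kwargs=dict(
            endpoint_url="endpoint_url",
            region_name="region",
        ),
    )
}

xds = xr.open_dataset(
    "zip::s3://path/to/my/dataset.zarr.zip",
    backend_kwargs={"storage_options": storage_options},
    engine="zarr",
    consolidated=True,
)
```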
Note: the missing kwargs forwarding is still present on the latest main branch at the time of writing: https://github.com/fsspec/filesystem_spec/blob/37c1bc63b9c5a5b2b9a0d5161e89b4233f888b29/fsspec/implementations/zip.py#L56

I tested the fix in my local environment by editing fsspec itself; the zipped Zarr store on the s3 bucket can then be opened successfully.

### Environment

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.11.6 (main, Jan 10 2024, 20:45:04) [GCC 9.4.0]
python-bits: 64
OS: Linux
OS-release: 5.15.0-102-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.9.3-development
xarray: 2023.10.1
pandas: 2.1.4
numpy: 1.26.2
scipy: 1.11.3
netCDF4: 1.6.5
pydap: None
h5netcdf: None
h5py: 3.10.0
Nio: None
zarr: 2.16.1
cftime: 1.6.3
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: None
dask: 2023.11.0
distributed: 2023.11.0
matplotlib: 3.7.1
cartopy: 0.22.0
seaborn: None
numbagg: None
fsspec: 2023.10.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 68.2.2
pip: 23.2.1
conda: None
pytest: 7.4.3
mypy: 1.7.0
IPython: 8.20.0
sphinx: 6.2.1
```
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8944/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |