issues: 576337745

id: 576337745
node_id: MDU6SXNzdWU1NzYzMzc3NDU=
number: 3831
title: Errors using to_zarr for an s3 store
user: 15351025
state: closed
locked: 0
comments: 15
created_at: 2020-03-05T15:30:40Z
updated_at: 2024-04-28T19:59:02Z
closed_at: 2024-04-28T19:59:02Z
author_association: NONE
(empty: assignee, milestone, active_lock_reason, draft, pull_request, performed_via_github_app)

Hello, I have been trying to write Zarr files from xarray directly into an S3 store, but I keep getting errors about missing arrays. The structure of the Zarr archive appears to be created in my S3 bucket: I can see the .zarray and .zattrs files, but the chunk files (0.0.0, 0.0.1, etc.) are missing. I have been able to write the same arrays directly to my local disk, so I don't think the issue is with the dataset itself.
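One way to pin down which arrays are affected is to scan the store's keys and report arrays that have metadata but no chunks. The following is only a minimal sketch: it assumes the Zarr v2 key layout (`<array>/.zarray` for metadata, `<array>/0.0.0`-style keys for chunks), and the helper name `arrays_missing_chunks` and the demo keys are made up for illustration. It works on any mapping, so a plain dict stands in for the S3 mapping here.

```python
from collections.abc import Mapping


def arrays_missing_chunks(store: Mapping) -> list:
    """Return array paths that have a .zarray key but no chunk keys.

    Assumes the Zarr v2 key layout: '<array>/.zarray' for metadata and
    '<array>/<i>.<j>...' (digits separated by dots) for chunks.
    """
    arrays = set()
    with_chunks = set()
    for key in store:
        parent, _, name = key.rpartition("/")
        if name == ".zarray":
            arrays.add(parent)
        elif name and all(part.isdigit() for part in name.split(".")):
            with_chunks.add(parent)
    return sorted(arrays - with_chunks)


# Toy store standing in for the S3 mapping (keys and values are made up).
demo_store = {
    ".zgroup": b"{}",
    "temperature/.zarray": b"{}",
    "temperature/.zattrs": b"{}",
    "temperature/0.0.0": b"\x00",
    "pressure/.zarray": b"{}",
    "pressure/.zattrs": b"{}",
}
print(arrays_missing_chunks(demo_store))
```

An array listed by this helper is only a candidate for investigation; the check says nothing about why its chunks are absent.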

MCVE Code Sample

```python
import s3fs

# `ds` is an existing xarray.Dataset
s3 = s3fs.S3FileSystem(anon=False)
store = s3fs.S3Map(root='s3://my-bucket/data.zarr', s3=s3, check=False)

ds.to_zarr(store=store, consolidated=True, mode='w')
```
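Since writing the same dataset to local disk works, one possible workaround (not a fix for the underlying error) is to write locally first and then copy the resulting files into the S3 mapping. The sketch below assumes the target store behaves like a `MutableMapping` with `/`-separated keys; the helper name `copy_tree_to_store` is hypothetical and not part of xarray, zarr, or s3fs.

```python
import os
from collections.abc import MutableMapping


def copy_tree_to_store(local_root: str, store: MutableMapping) -> int:
    """Copy every file under local_root into store, keyed by relative path.

    Returns the number of files copied.
    """
    copied = 0
    for dirpath, _dirnames, filenames in os.walk(local_root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            # Store keys use '/' separators regardless of the local OS.
            key = os.path.relpath(full_path, local_root).replace(os.sep, "/")
            with open(full_path, "rb") as handle:
                store[key] = handle.read()
            copied += 1
    return copied
```

Under these assumptions, the usage would be to call `ds.to_zarr("data.zarr", consolidated=True, mode="w")` for a local write and then `copy_tree_to_store("data.zarr", store)` with the `store` from the sample above. Each file is read fully into memory, so this suits modest chunk sizes only.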

Output

The array named in the error changes from run to run; it is not always the same variable that is reported as missing.

logs

--------------------------------------------------------------------------
NoSuchKey                                 Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/s3fs/core.py in _fetch_range(client, bucket, key, version_id, start, end, max_attempts, req_kw)
   1196                 Range='bytes=%i-%i' % (start, end - 1),
-> 1197                 **kwargs)
   1198             return resp['Body'].read()

~/.local/lib/python3.7/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
    315             # The "self" in this scope is referring to the BaseClient.
--> 316             return self._make_api_call(operation_name, kwargs)
    317

~/.local/lib/python3.7/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
    625             error_class = self.exceptions.from_code(error_code)
--> 626             raise error_class(parsed_response, operation_name)
    627         else:

NoSuchKey: An error occurred (NoSuchKey) when calling the GetObject operation: The specified key does not exist.

During handling of the above exception, another exception occurred:

FileNotFoundError                         Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/fsspec/mapping.py in __getitem__(self, key, default)
     75         try:
---> 76             result = self.fs.cat(key)
     77         except:  # noqa: E722

/opt/conda/lib/python3.7/site-packages/fsspec/spec.py in cat(self, path)
    545         """ Get the content of a file """
--> 546         return self.open(path, "rb").read()
    547

/opt/conda/lib/python3.7/site-packages/fsspec/spec.py in read(self, length)
   1129             return b""
-> 1130         out = self.cache._fetch(self.loc, self.loc + length)
   1131         self.loc += len(out)

/opt/conda/lib/python3.7/site-packages/fsspec/caching.py in _fetch(self, start, end)
    338             # First read, or extending both before and after
--> 339             self.cache = self.fetcher(start, bend)
    340             self.start = start

/opt/conda/lib/python3.7/site-packages/s3fs/core.py in _fetch_range(self, start, end)
   1059     def _fetch_range(self, start, end):
-> 1060         return _fetch_range(self.fs.s3, self.bucket, self.key, self.version_id, start, end, req_kw=self.req_kw)
   1061

/opt/conda/lib/python3.7/site-packages/s3fs/core.py in _fetch_range(client, bucket, key, version_id, start, end, max_attempts, req_kw)
   1212                 return b''
-> 1213             raise translate_boto_error(e)
   1214         except Exception as e:

FileNotFoundError: The specified key does not exist.

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/zarr/core.py in _load_metadata_nosync(self)
    149             mkey = self._key_prefix + array_meta_key
--> 150             meta_bytes = self._store[mkey]
    151         except KeyError:

/opt/conda/lib/python3.7/site-packages/fsspec/mapping.py in __getitem__(self, key, default)
     79                 return default
---> 80             raise KeyError(key)
     81         return result

KeyError: 'my-bucket/data.zarr/lv_HTGL7_l1/.zarray'

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-7-c21938cc83d3> in <module>
      7 ds.to_zarr(store=s3_store_dest,
      8            consolidated=True,
----> 9            mode='w')

/opt/conda/lib/python3.7/site-packages/xarray/core/dataset.py in to_zarr(self, store, mode, synchronizer, group, encoding, compute, consolidated, append_dim)
   1623             compute=compute,
   1624             consolidated=consolidated,
-> 1625             append_dim=append_dim,
   1626         )
   1627

/opt/conda/lib/python3.7/site-packages/xarray/backends/api.py in to_zarr(dataset, store, mode, synchronizer, group, encoding, compute, consolidated, append_dim)
   1341     writer = ArrayWriter()
   1342     # TODO: figure out how to properly handle unlimited_dims
-> 1343     dump_to_store(dataset, zstore, writer, encoding=encoding)
   1344     writes = writer.sync(compute=compute)
   1345

/opt/conda/lib/python3.7/site-packages/xarray/backends/api.py in dump_to_store(dataset, store, writer, encoder, encoding, unlimited_dims)
   1133         variables, attrs = encoder(variables, attrs)
   1134
-> 1135     store.store(variables, attrs, check_encoding, writer, unlimited_dims=unlimited_dims)
   1136
   1137

/opt/conda/lib/python3.7/site-packages/xarray/backends/zarr.py in store(self, variables, attributes, check_encoding_set, writer, unlimited_dims)
    385         self.set_dimensions(variables_encoded, unlimited_dims=unlimited_dims)
    386         self.set_variables(
--> 387             variables_encoded, check_encoding_set, writer, unlimited_dims=unlimited_dims
    388         )
    389

/opt/conda/lib/python3.7/site-packages/xarray/backends/zarr.py in set_variables(self, variables, check_encoding_set, writer, unlimited_dims)
    444                     dtype = str
    445                 zarr_array = self.ds.create(
--> 446                     name, shape=shape, dtype=dtype, fill_value=fill_value, **encoding
    447                 )
    448                 zarr_array.attrs.put(encoded_attrs)

/opt/conda/lib/python3.7/site-packages/zarr/hierarchy.py in create(self, name, **kwargs)
    877         """Create an array. Keyword arguments as per
    878         :func:`zarr.creation.create`."""
--> 879         return self._write_op(self._create_nosync, name, **kwargs)
    880
    881     def _create_nosync(self, name, **kwargs):

/opt/conda/lib/python3.7/site-packages/zarr/hierarchy.py in _write_op(self, f, *args, **kwargs)
    656
    657         with lock:
--> 658             return f(*args, **kwargs)
    659
    660     def create_group(self, name, overwrite=False):

/opt/conda/lib/python3.7/site-packages/zarr/hierarchy.py in _create_nosync(self, name, **kwargs)
    884         kwargs.setdefault('cache_attrs', self.attrs.cache)
    885         return create(store=self._store, path=path, chunk_store=self._chunk_store,
--> 886                       **kwargs)
    887
    888     def empty(self, name, **kwargs):

/opt/conda/lib/python3.7/site-packages/zarr/creation.py in create(shape, chunks, dtype, compressor, fill_value, order, store, synchronizer, overwrite, path, chunk_store, filters, cache_metadata, cache_attrs, read_only, object_codec, **kwargs)
    123     # instantiate array
    124     z = Array(store, path=path, chunk_store=chunk_store, synchronizer=synchronizer,
--> 125               cache_metadata=cache_metadata, cache_attrs=cache_attrs, read_only=read_only)
    126
    127     return z

/opt/conda/lib/python3.7/site-packages/zarr/core.py in __init__(self, store, path, read_only, chunk_store, synchronizer, cache_metadata, cache_attrs)
    122
    123         # initialize metadata
--> 124         self._load_metadata()
    125
    126         # initialize attributes

/opt/conda/lib/python3.7/site-packages/zarr/core.py in _load_metadata(self)
    139         """(Re)load metadata from store."""
    140         if self._synchronizer is None:
--> 141             self._load_metadata_nosync()
    142         else:
    143             mkey = self._key_prefix + array_meta_key

/opt/conda/lib/python3.7/site-packages/zarr/core.py in _load_metadata_nosync(self)
    150             meta_bytes = self._store[mkey]
    151         except KeyError:
--> 152             err_array_not_found(self._path)
    153         else:
    154

/opt/conda/lib/python3.7/site-packages/zarr/errors.py in err_array_not_found(path)
     19
     20 def err_array_not_found(path):
---> 21     raise ValueError('array not found at path %r' % path)
     22
     23

ValueError: array not found at path 'lv_HTGL7_l1'
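The traceback chains four exceptions: NoSuchKey, FileNotFoundError, KeyError, and finally ValueError. The toy model below illustrates how such a chain can arise when each layer catches the error beneath it and raises its own. It is not the actual fsspec or zarr code; `FlakyMapping`, `load_array_metadata`, and `reader_with_backend_error` are hypothetical names, and the backend failure is simulated.

```python
class FlakyMapping:
    """Toy mapping that, like the __getitem__ frame in the traceback,
    turns any read failure into KeyError."""

    def __init__(self, reader):
        self._reader = reader

    def __getitem__(self, key):
        try:
            return self._reader(key)
        except Exception:
            # The original cause is only kept as implicit exception context.
            raise KeyError(key)


def load_array_metadata(store, path):
    """Toy metadata loader: a missing key becomes 'array not found'."""
    try:
        return store[path + "/.zarray"]
    except KeyError:
        raise ValueError("array not found at path %r" % path)


def reader_with_backend_error(key):
    # Simulated backend failure (an assumption for this sketch).
    raise FileNotFoundError("The specified key does not exist.")


try:
    load_array_metadata(FlakyMapping(reader_with_backend_error), "lv_HTGL7_l1")
except ValueError as err:
    # Walk the implicit exception chain from the outermost error inward.
    chain = []
    exc = err
    while exc is not None:
        chain.append(type(exc).__name__)
        exc = exc.__context__
    print(chain)
```

In this model the final ValueError says only that the array was not found, so the innermost exception in the chain is the one that describes what the backend actually reported.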

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.6 | packaged by conda-forge | (default, Jan 7 2020, 22:33:48) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 4.14.165-133.209.amzn2.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: None
libnetcdf: None

xarray: 0.15.0
pandas: 1.0.1
numpy: 1.18.1
scipy: 1.4.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: 1.5.5
zarr: 2.4.0
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.1.3
cfgrib: None
iris: None
bottleneck: None
dask: 2.11.0
distributed: 2.11.0
matplotlib: 3.1.3
cartopy: None
seaborn: 0.10.0
numbagg: None
setuptools: 45.2.0.post20200209
pip: 20.0.2
conda: 4.7.12
pytest: None
IPython: 7.12.0
sphinx: None

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3831/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
repo: 13221727
type: issue
