issues: 733201109
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
733201109 | MDU6SXNzdWU3MzMyMDExMDk= | 4556 | quick overview example not working with `to_zarr` function with gcs store | 8398696 | closed | 0 | 4 | 2020-10-30T13:54:43Z | 2021-04-19T03:18:50Z | 2021-04-19T03:18:50Z | NONE | Hello,

Consider the following code:

```py
import os

import xarray as xr
import numpy as np
import zarr
import gcsfs

from .helpers import project, credentials, bucketname  # project specific


def make_store(key):
    if key == "memory":
        return zarr.MemoryStore()
    if key == "disc":
        return zarr.DirectoryStore("example.zarr")
    if key == "gcs":
        gcs = gcsfs.GCSFileSystem(project=project(), token=credentials())
        root = os.path.join(bucketname, "xarray-testing")
        return gcsfs.GCSMap(root, gcs=gcs, check=False)


data = xr.DataArray(np.random.randn(2, 3), dims=("x", "y"), coords={"x": [10, 20]})
ds = xr.Dataset({"foo": data, "bar": ("x", [1, 2]), "baz": np.pi})
ds.to_zarr(make_store("gcs"), consolidated=True, mode="w")
```

The example dataset is from the quick overview example. The above code works fine for both
```py
ipdb> p data
array(3.14159265)
```
I have also implemented a custom zarr store (details are in this zarr issue), which gives more insight into the problem: ```py
~/.venv/valkyrie/lib/python3.8/site-packages/zarr/core.py in set_basic_selection(self, selection, value, fields)

~/.venv/valkyrie/lib/python3.8/site-packages/zarr/core.py in _set_basic_selection_zd(self, selection, value, fields)

~gcsstore.py in setitem(self, key, value)

~/.venv/valkyrie/lib/python3.8/site-packages/google/cloud/storage/blob.py in upload_from_string(self, data, content_type, client, predefined_acl, if_generation_match, if_generation_not_match, if_metageneration_match, if_metageneration_not_match, timeout, checksum)
   2437             "md5", "crc32c" and None. The default is None.
   2438         """
-> 2439         data = _to_bytes(data, encoding="utf-8")
   2440         string_buffer = BytesIO(data)
   2441         self.upload_from_file(

~/.venv/valkyrie/lib/python3.8/site-packages/google/cloud/_helpers.py in _to_bytes(value, encoding)
    368         return result
    369     else:
--> 370         raise TypeError("%r could not be converted to bytes" % (value,))
    371
    372

TypeError: array(3.14159265) could not be converted to bytes
```

It seems to me that zarr is not converting the data into its serialized representation (via its codec library) and is instead passing the NumPy array directly to the MutableMapping store. This raises an exception because the Google libraries don't know how to convert the passed value (np.pi) into bytes.

```py
ipdb> u
ipdb> p key
'baz/0'
ipdb> p value
array(3.14159265)
```

Please let me know if you think I should raise this issue in the zarr project rather than here.

Versions of xarray and zarr:
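
Returning to the `TypeError` above: one possible store-side workaround is to coerce whatever value arrives in `__setitem__` to `bytes` before handing it to the upload call. The following is only a minimal sketch under stated assumptions, not a fix confirmed by either project. It assumes the value exposes the Python buffer protocol (as NumPy arrays do). `BytesCoercingStore` is a hypothetical name, and an `array.array` stands in for the 0-d NumPy value so that the snippet runs with the standard library alone.

```python
import array
import math
from collections.abc import MutableMapping


class BytesCoercingStore(MutableMapping):
    """Hypothetical wrapper that coerces buffer-like values to bytes on write."""

    def __init__(self, inner):
        # Any MutableMapping, e.g. a custom GCS-backed store (assumption).
        self.inner = inner

    def __setitem__(self, key, value):
        if not isinstance(value, bytes):
            # memoryview accepts any object exposing the buffer protocol,
            # such as bytearray, array.array, or a NumPy array.
            value = memoryview(value).tobytes()
        self.inner[key] = value

    def __getitem__(self, key):
        return self.inner[key]

    def __delitem__(self, key):
        del self.inner[key]

    def __iter__(self):
        return iter(self.inner)

    def __len__(self):
        return len(self.inner)


# Stand-in for the single float64 value written under the key 'baz/0'.
chunk = array.array("d", [math.pi])

store = BytesCoercingStore({})
store["baz/0"] = chunk
print(type(store["baz/0"]).__name__, len(store["baz/0"]))
```

Wrapping the store keeps the coercion in one place and leaves the underlying store untouched. Whether zarr should itself hand `bytes` to the store remains the open question raised above.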
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/4556/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | 13221727 | issue |