issues: 1188946146
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1188946146 | I_kwDOAMm_X85G3eDi | 6432 | Improve UX/documentation for loading data in cloud storage | 3309802 | open | 0 | 0 | 2022-03-31T22:39:39Z | 2022-04-04T15:47:04Z | NONE | What is your issue? I recently tried to use xarray to open some netCDF files stored in a bucket, and was surprised how hard it was to figure out the right incantation to make this work. The fact that passing an fsspec URL (like However, h5netcdf does work if you pass an fsspec file-like object (not sure if other engines support this as well?). But to add to the confusion, you can't pass the ```python
KeyError Traceback (most recent call last)
...
FileNotFoundError: [Errno 2] Unable to open file (unable to open file: name = 's3://noaa-nwm-retrospective-2-1-pds/model_output/1979/197902010100.CHRTOUT_DOMAIN1.comp', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)
AttributeError Traceback (most recent call last)
...
File ~/miniconda3/envs/xarray-buckets/lib/python3.10/site-packages/xarray/backends/common.py:23, in _normalize_path(path)
     21 def _normalize_path(path):
     22     if isinstance(path, os.PathLike):
---> 23         path = os.fspath(path)
     25     if isinstance(path, str) and not is_remote_uri(path):
     26         path = os.path.abspath(os.path.expanduser(path))

File ~/miniconda3/envs/xarray-buckets/lib/python3.10/site-packages/fsspec/core.py:98, in OpenFile.__fspath__(self)
     96 def __fspath__(self):
     97     # may raise if cannot be resolved to local file
---> 98     return self.open().__fspath__()

AttributeError: 'S3File' object has no attribute '__fspath__'
```
Some things that might be nice:
1. Explicit documentation on working with data in cloud storage, perhaps broken down by file type/engine (xref https://github.com/pydata/xarray/issues/2712). It might be nice to have a table/quick reference of which engines support reading from cloud storage, and how to pass in the URL (string? fsspec file object?)
2. Informative error linking to these docs when opening fails and

As more and more data becomes available on cloud storage, newcomers to xarray will increasingly want to use it with remote data. Since xarray already supports this in some cases, this is great! With a few tweaks to docs and error messages, I think we could turn an experience that took me multiple hours of debugging and reading the source into an easy 30-second experience for new users. cc @martindurant @phobson |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6432/reactions", "total_count": 7, "+1": 7, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
13221727 | issue |
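
For reference, the pattern the issue reports as working (handing h5netcdf an fsspec file-like object rather than a URL string or an `OpenFile`) might look roughly like the sketch below. This is only an illustration under stated assumptions: `looks_remote` and `open_remote_netcdf` are hypothetical helper names, not part of xarray or fsspec, and it assumes `fsspec`, `xarray`, `h5netcdf`, and the filesystem package for the URL scheme (for example `s3fs` for `s3://`) are installed.

```python
import re

# Hypothetical helper (not part of xarray): matches strings that begin with a
# URL scheme such as "s3://" or "gs://".
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*://")


def looks_remote(path):
    """Return True if `path` is a str that starts with a URL scheme."""
    return isinstance(path, str) and _SCHEME.match(path) is not None


def open_remote_netcdf(url, **storage_options):
    """Open a remote netCDF4/HDF5 file by passing h5netcdf a file-like object.

    `storage_options` are forwarded to fsspec unchanged; which options are
    valid depends on the filesystem implementation behind the URL scheme.
    """
    if not looks_remote(url):
        raise ValueError(
            f"expected a URL with a scheme such as 's3://', got {url!r}"
        )

    # Deferred imports so the check above works without these packages.
    import fsspec
    import xarray as xr

    # fsspec.open() returns an OpenFile. Passing that directly is what raised
    # the AttributeError in the traceback above; calling .open() on it yields
    # the underlying file-like object instead.
    file_obj = fsspec.open(url, mode="rb", **storage_options).open()
    return xr.open_dataset(file_obj, engine="h5netcdf")
```

A call such as `open_remote_netcdf("s3://noaa-nwm-retrospective-2-1-pds/model_output/1979/197902010100.CHRTOUT_DOMAIN1.comp", anon=True)` would then be the intended usage, where `anon=True` is an assumed `s3fs` option for anonymous access to a public bucket. The file object is deliberately left open, since xarray reads data lazily and would fail on later access if the handle were closed.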