issues: 745801652

id: 745801652
node_id: MDU6SXNzdWU3NDU4MDE2NTI=
number: 4591
title: Serialization issue with distributed, h5netcdf, and fsspec (ImplicitToExplicitIndexingAdapter)
user: 1197350
state: closed
locked: 0
comments: 12
created_at: 2020-11-18T16:18:42Z
updated_at: 2021-06-30T17:53:54Z
closed_at: 2020-11-19T15:54:38Z
author_association: MEMBER

This was originally reported by @jkingslake at https://github.com/pangeo-data/pangeo-datastore/issues/116.

What happened:

I tried to open a netCDF file over HTTP using fsspec and the h5netcdf engine, then compute on the data using dask.distributed. It appears that our `ImplicitToExplicitIndexingAdapter` is not (or is no longer?) serializable.

What you expected to happen:

Things would work. Indeed, I could swear this used to work with previous versions.

Minimal Complete Verifiable Example:

```python
import xarray as xr
import fsspec
from dask.distributed import Client

# example needs to use distributed to reproduce the bug
client = Client()

url = 'https://storage.googleapis.com/ldeo-glaciology/bedmachine/BedMachineAntarctica_2019-11-05_v01.nc'
with fsspec.open(url, mode='rb') as openfile:
    dsc = xr.open_dataset(openfile, chunks=3000)
    dsc.surface.mean().compute()
```

raises the following error:

    Traceback (most recent call last):
      File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/protocol/core.py", line 50, in dumps
        data = {
      File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/protocol/core.py", line 51, in <dictcomp>
        key: serialize(
      File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/protocol/serialize.py", line 277, in serialize
        raise TypeError(msg, str(x)[:10000])
    TypeError: ('Could not serialize object of type ImplicitToExplicitIndexingAdapter.', 'ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.h5netcdf_.H5NetCDFArrayWrapper object at 0x7ff8e3988540>, key=BasicIndexer((slice(None, None, None), slice(None, None, None))))))')
    distributed.comm.utils - ERROR - ('Could not serialize object of type ImplicitToExplicitIndexingAdapter.', 'ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.h5netcdf_.H5NetCDFArrayWrapper object at 0x7ff8e3988540>, key=BasicIndexer((slice(None, None, None), slice(None, None, None))))))')
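For readers unfamiliar with this class of error, here is a minimal, standard-library-only sketch of one general way a wrapper object can become unserializable: it holds a reference to an open file handle. This illustrates the failure mode only. It is not a diagnosis of what happens inside xarray, and `make_wrapper` and `is_picklable` are hypothetical helper names, not part of xarray, dask, or fsspec.

```python
import pickle
import tempfile
from types import SimpleNamespace


def make_wrapper(source):
    """Hypothetical stand-in for a lazy array adapter: it only keeps a
    reference to the object it would read from (not xarray's real class)."""
    return SimpleNamespace(array=source)


def is_picklable(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
    except Exception:
        return False
    return True


# A wrapper around plain in-memory bytes serializes without trouble.
in_memory = make_wrapper(b"\x00\x01\x02")

# A wrapper around an open file handle does not: pickle refuses to
# serialize the underlying buffered file object.
with tempfile.TemporaryFile(mode="w+b") as handle:
    file_backed = make_wrapper(handle)
    results = {
        "in_memory": is_picklable(in_memory),
        "file_backed": is_picklable(file_backed),
    }

print(results)
```

A check like `is_picklable` can help narrow down which layer of a nested wrapper is the one that refuses to serialize, by applying it to each inner object in turn.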

Anything else we need to know?:

One can work around this by using the netcdf4 library's new and undocumented ability to open files over http.

    url = 'https://storage.googleapis.com/ldeo-glaciology/bedmachine/BedMachineAntarctica_2019-11-05_v01.nc#mode=bytes'
    ds = xr.open_dataset(url, engine='netcdf4', chunks=3000)
    ds

However, the fsspec + h5netcdf path should work!

Environment:

Output of `xr.show_versions()`:

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.6 | packaged by conda-forge | (default, Oct 7 2020, 19:08:05) [GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 4.19.112+
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: C.UTF-8
LANG: C.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4
xarray: 0.16.1
pandas: 1.1.3
numpy: 1.19.2
scipy: 1.5.2
netCDF4: 1.5.4
pydap: installed
h5netcdf: 0.8.1
h5py: 2.10.0
Nio: None
zarr: 2.4.0
cftime: 1.2.1
nc_time_axis: 1.2.0
PseudoNetCDF: None
rasterio: 1.1.7
cfgrib: 0.9.8.4
iris: None
bottleneck: 1.3.2
dask: 2.30.0
distributed: 2.30.0
matplotlib: 3.3.2
cartopy: 0.18.0
seaborn: None
numbagg: None
pint: 0.16.1
setuptools: 49.6.0.post20201009
pip: 20.2.4
conda: None
pytest: 6.1.1
IPython: 7.18.1
sphinx: 3.2.1
```

Also fsspec 0.8.4.

cc @martindurant for fsspec integration.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4591/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
repo: 13221727
type: issue
