issues


11 rows where state = "closed" and user = 11750960 (apatlpo), sorted by updated_at descending

#7541 standard deviation over one dimension of a chunked DataArray leads to NaN
issue · pydata/xarray · apatlpo · closed as completed · 2 comments · created 2023-02-17 · closed 2023-02-17 · CONTRIBUTOR

What happened?

When computing the standard deviation over one dimension of a chunked DataArray, one may get NaNs.

What did you expect to happen?

We should not have any NaNs.

Minimal Complete Verifiable Example

```python
import numpy as np
import xarray as xr

x = np.random.randn(10, 10) + 1j * np.random.randn(10, 10)
da = xr.DataArray(x).chunk(dict(dim_0=3))

da.std("dim_0").compute()  # NaN
da.compute().std("dim_0")  # no NaNs
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS ------------------ commit: None python: 3.10.8 (main, Nov 24 2022, 08:09:04) [Clang 14.0.6 ] python-bits: 64 OS: Darwin OS-release: 22.3.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: fr_FR.UTF-8 LOCALE: ('fr_FR', 'UTF-8') libhdf5: 1.12.2 libnetcdf: 4.8.1 xarray: 2022.12.0 pandas: 1.5.2 numpy: 1.23.5 scipy: 1.9.3 netCDF4: 1.6.2 pydap: None h5netcdf: None h5py: 3.7.0 Nio: None zarr: 2.13.3 cftime: 1.5.1.1 nc_time_axis: None PseudoNetCDF: None rasterio: 1.3.4 cfgrib: None iris: None bottleneck: 1.3.5 dask: 2022.02.1 distributed: 2022.2.1 matplotlib: 3.6.2 cartopy: 0.21.1 seaborn: 0.12.1 numbagg: None fsspec: 2022.11.0 cupy: None pint: None sparse: None flox: None numpy_groupies: None setuptools: 65.5.0 pip: 22.3.1 conda: None pytest: None mypy: None IPython: 8.7.0 sphinx: None
#1889 call to colorbar not thread safe
issue · pydata/xarray · apatlpo · closed as completed · 12 comments · created 2018-02-06 · closed 2022-04-27 · CONTRIBUTOR

The following call in `xarray/xarray/plot/plot.py` does not seem to be thread safe: `cbar = plt.colorbar(primitive, **cbar_kwargs)`. It leads to systematic crashes when run with distributed, with a cryptic error message (`ValueError: Unknown element o`). I have to create colorbars outside the xarray plot call to prevent crashes.

A call of the following type may fix the problem: `cbar = fig.colorbar(primitive, **cbar_kwargs)`. But `fig` does not seem to be available directly in `plot.py`. Maybe: `cbar = ax.get_figure().colorbar(primitive, **cbar_kwargs)`
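A minimal sketch of the figure-scoped call, assuming an existing `Axes` and mappable (the data here is illustrative):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, safer off the main thread
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
primitive = ax.pcolormesh(np.random.rand(10, 10))

# Scope the colorbar to this figure rather than pyplot's global
# "current figure", which is shared mutable state across threads.
cbar = ax.get_figure().colorbar(primitive, ax=ax)
fig.savefig("out.png")
```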

cheers

#4284 overwriting netcdf file fails at read time
issue · pydata/xarray · apatlpo · closed as completed · 1 comment · created 2020-07-29 · closed 2020-08-01 · CONTRIBUTOR

I generate a dataset once:

```python
ds = xr.DataArray(np.arange(10), name='x').to_dataset()
ds.to_netcdf('test.nc', mode='w')
```

Now I overwrite with a new netcdf file and load:

```python
ds = xr.DataArray(np.arange(20), name='x').to_dataset()
ds.to_netcdf('test.nc', mode='w')
ds_out = xr.open_dataset('test.nc')
print(ds_out)
```

outputs:

```
<xarray.Dataset>
Dimensions:  (dim_0: 10)
Dimensions without coordinates: dim_0
Data variables:
    x        (dim_0) int64 ...
```

I would have expected to get the new dataset.

If I use netCDF4 directly, the file seems to have been properly overwritten:

```python
import netCDF4 as nc
d = nc.Dataset('test.nc')
d
```

outputs:

```
<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF4 data model, file format HDF5):
    dimensions(sizes): dim_0(20)
    variables(dimensions): int64 x(dim_0)
    groups:
```
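A sketch of one mitigation, assuming the stale read comes from xarray's per-process file-handle cache rather than from the file on disk: release handles explicitly (e.g. with a context manager) before re-reading the overwritten path.

```python
import numpy as np
import xarray as xr

xr.DataArray(np.arange(10), name='x').to_dataset().to_netcdf('test.nc', mode='w')

# Open with a context manager so the handle is released on exit.
with xr.open_dataset('test.nc') as ds_out:
    ds_out.load()

xr.DataArray(np.arange(20), name='x').to_dataset().to_netcdf('test.nc', mode='w')

with xr.open_dataset('test.nc') as ds_out:
    print(ds_out.dims)  # expect dim_0: 20
```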

Environment:

Output of `xr.show_versions()` INSTALLED VERSIONS ------------------ commit: None python: 3.7.6 | packaged by conda-forge | (default, Mar 23 2020, 23:03:20) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: 3.12.53-60.30-default machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.10.5 libnetcdf: 4.7.4 xarray: 0.15.1 pandas: 1.0.3 numpy: 1.18.1 scipy: 1.4.1 netCDF4: 1.5.3 pydap: None h5netcdf: None h5py: None Nio: None zarr: 2.4.0 cftime: 1.1.1.2 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2.13.0 distributed: 2.13.0 matplotlib: 3.3.0 cartopy: 0.17.0 seaborn: 0.10.0 numbagg: None setuptools: 46.1.3.post20200325 pip: 20.0.2 conda: None pytest: None IPython: 7.13.0 sphinx: None
#4048 improve to_zarr doc about chunking
pull request (pydata/xarray/pulls/4048) · apatlpo · closed · 9 comments · created 2020-05-08 · closed 2020-05-20 · CONTRIBUTOR
  • [X] follows #4046
  • [X] Passes `isort -rc . && black . && mypy . && flake8`

I'm not sure the last point is really necessary for this PR, is it?
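For context, a minimal sketch of the behaviour the doc change concerns (file name illustrative, not from this PR): `to_zarr` derives the on-disk zarr chunks from the dask chunks, so rechunking before writing is the simplest way to control them.

```python
import numpy as np
import xarray as xr

ds = xr.Dataset({"v": (("x",), np.arange(100))})

# One zarr chunk per dask chunk: rechunk first to choose on-disk chunking.
ds.chunk({"x": 25}).to_zarr("example.zarr", mode="w")
```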

#3944 implement a more threadsafe call to colorbar
pull request (pydata/xarray/pulls/3944) · apatlpo · closed · 7 comments · created 2020-04-07 · closed 2020-04-09 · CONTRIBUTOR
  • [ ] Xref #1889
  • [ ] Tests added
  • [ ] Passes `isort -rc . && black . && mypy . && flake8`
  • [ ] Fully documented, including whats-new.rst for all changes and api.rst for new API

If you think this is relevant, I'll go ahead and start working on the items above, even though I'm not sure new tests are needed.

#3932 Element wise dataArray generation
issue · pydata/xarray · apatlpo · closed as completed · 6 comments · created 2020-04-04 · closed 2020-04-07 · CONTRIBUTOR

I'm in a situation where I want to generate a two-dimensional DataArray from a method that takes each of the two dimensions as input parameters. I have two methods to do this, but neither looks particularly elegant to me, and I wondered whether somebody has better ideas.

  • Method 1: dask delayed

```python
import dask
import dask.array as da
import numpy as np
import xarray as xr

x = np.arange(10)
y = np.arange(20)

# each experiment outputs Nstats statistical diagnostics
Nstats = 5
some_exp = lambda x, y: np.ones((Nstats,))
some_exp_delayed = dask.delayed(some_exp, pure=True)

lazy_data = [some_exp_delayed(_x, _y) for _x in x for _y in y]
sample = lazy_data[0].compute()
arrays = [da.from_delayed(lazy_value, dtype=sample.dtype, shape=sample.shape)
          for lazy_value in lazy_data]

stack = da.stack(arrays, axis=0).reshape((len(x), len(y), sample.size))
ds = xr.DataArray(stack, dims=['x', 'y', 'stats'])
```

I tend to prefer this option because it imposes fewer requirements on the shape of the `some_exp` output. That said, it still seems like too many lines of code to achieve such a result.

  • Method 2: apply_ufunc

```python
import numpy as np
import xarray as xr

x = np.arange(10)
y = np.arange(20)

ds = xr.Dataset(coords={'x': x, 'y': y})
ds['_y'] = (0 * ds.x + ds.y)  # breaks apply_ufunc otherwise
ds = ds.chunk({'x': 1, 'y': 1})

# let's say each experiment outputs 5 statistical diagnostics
Nstats = 5
some_exp = lambda x, y: np.ones((1, 1, Nstats))

out = xr.apply_ufunc(some_exp, ds.x, ds._y,
                     dask='parallelized',
                     output_dtypes=[float],
                     output_sizes={'stats': Nstats},
                     output_core_dims=[['stats']])
```

I don't understand why I have to use the dummy variable `ds._y` in this case, and having to rely on `apply_ufunc` seems like overkill.
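A possibly simpler sketch (not from this thread): `apply_ufunc` with `vectorize=True` loops a scalar function over the broadcast elements, which avoids both the dummy variable and the manual chunking.

```python
import numpy as np
import xarray as xr

Nstats = 5

def some_exp(x, y):
    # placeholder experiment returning Nstats diagnostics per (x, y) pair
    return np.ones(Nstats)

x = xr.DataArray(np.arange(10), dims="x")
y = xr.DataArray(np.arange(20), dims="y")

out = xr.apply_ufunc(
    some_exp, x, y,
    vectorize=True,              # loop some_exp over broadcast (x, y) pairs
    output_core_dims=[["stats"]],
)
print(out.shape)  # (10, 20, 5)
```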

#3933 plot.line breaks depending on coordinate shape
issue · pydata/xarray · apatlpo · closed as completed · 2 comments · created 2020-04-04 · closed 2020-04-04 · CONTRIBUTOR

`plot.line` breaks depending on coordinate shape; see the code below:

```python
import numpy as np
import xarray as xr

x = np.arange(10)
y = np.arange(20)

ds = xr.Dataset(coords={'x': x, 'y': y})

ds = ds.assign_coords(z=ds.y + ds.x)  # goes through
ds = ds.assign_coords(z=ds.x + ds.y)  # breaks
ds['v'] = (ds.x + ds.y)
ds['v'].plot.line(y='z', hue='x')
```

This breaks with the following error:

```
...
~/.miniconda3/envs/equinox/lib/python3.7/site-packages/matplotlib/axes/_base.py in _plot_args(self, tup, kwargs)
    340
    341         if x.shape[0] != y.shape[0]:
--> 342             raise ValueError(f"x and y must have same first dimension, but "
    343                              f"have shapes {x.shape} and {y.shape}")
    344         if x.ndim > 2 or y.ndim > 2:

ValueError: x and y must have same first dimension, but have shapes (20, 10) and (10, 20)
```

I would have expected that dimension order would not matter.
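A workaround sketch, assuming (from the transposed shapes in the ValueError) that the 2-D coordinate's dimension order is what trips matplotlib: transpose `z` before plotting.

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(coords={'x': np.arange(10), 'y': np.arange(20)})

# ds.x + ds.y broadcasts to dims ('x', 'y'); transposing the coordinate
# to ('y', 'x') matches the order of the case that goes through.
ds = ds.assign_coords(z=(ds.x + ds.y).transpose('y', 'x'))
ds['v'] = ds.x + ds.y
ds['v'].plot.line(y='z', hue='x')
```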

Versions

Output of `xr.show_versions()` INSTALLED VERSIONS ------------------ commit: None python: 3.7.6 | packaged by conda-forge | (default, Mar 23 2020, 23:03:20) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: 3.12.53-60.30-default machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.10.5 libnetcdf: 4.7.4 xarray: 0.15.1 pandas: 1.0.3 numpy: 1.18.1 scipy: 1.4.1 netCDF4: 1.5.3 pydap: None h5netcdf: None h5py: None Nio: None zarr: 2.4.0 cftime: 1.1.1.2 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2.13.0 distributed: 2.13.0 matplotlib: 3.2.1 cartopy: 0.17.0 seaborn: 0.10.0 numbagg: None setuptools: 46.1.3.post20200325 pip: 20.0.2 conda: None pytest: None IPython: 7.13.0 sphinx: None
#2278 can't store zarr after open_zarr and isel
issue · pydata/xarray · apatlpo · closed as completed · 10 comments · created 2018-07-11 · closed 2018-08-14 · CONTRIBUTOR

Code Sample, a copy-pastable example if possible

This works fine:

```python
nx, ny, nt = 32, 32, 64
ds = xr.Dataset({}, coords={'x': np.arange(nx), 'y': np.arange(ny), 't': np.arange(nt)})
ds = ds.assign(v=ds.t * np.cos(np.pi/180./100 * ds.x) * np.cos(np.pi/180./50 * ds.y))
ds = ds.chunk({'t': 1, 'x': nx/2, 'y': ny/2})

ds.isel(t=0).to_zarr('data_t0.zarr', mode='w')
```

But if I store, reload and select, I cannot store:

```python
ds.to_zarr('data.zarr', mode='w')
ds = xr.open_zarr('data.zarr')
ds.isel(t=0).to_zarr('data_t0.zarr', mode='w')
```

Error message ends with:

```
~/.miniconda3/envs/equinox/lib/python3.6/site-packages/xarray/backends/zarr.py in _extract_zarr_variable_encoding(variable, raise_on_invalid)
    181
    182     chunks = _determine_zarr_chunks(encoding.get('chunks'), variable.chunks,
--> 183                                     variable.ndim)
    184     encoding['chunks'] = chunks
    185     return encoding

~/.miniconda3/envs/equinox/lib/python3.6/site-packages/xarray/backends/zarr.py in _determine_zarr_chunks(enc_chunks, var_chunks, ndim)
    112         raise ValueError("zarr chunks tuple %r must have same length as "
    113                          "variable.ndim %g" %
--> 114                          (enc_chunks_tuple, ndim))
    115
    116     for x in enc_chunks_tuple:

ValueError: zarr chunks tuple (1, 16, 16) must have same length as variable.ndim 2
```
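A workaround sketch, assuming the error comes from the stale 3-D `chunks` encoding that `open_zarr` attaches to each variable: drop that encoding on the selected subset so `to_zarr` re-derives chunks for the 2-D slice.

```python
import xarray as xr

ds = xr.open_zarr('data.zarr')
sel = ds.isel(t=0)

# Remove the inherited 'chunks' encoding (still shaped for 3-D data)
# from every variable before writing the 2-D selection.
for var in sel.variables.values():
    var.encoding.pop('chunks', None)

sel.to_zarr('data_t0.zarr', mode='w')
```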

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.6.5.final.0 python-bits: 64 OS: Linux OS-release: 3.12.53-60.30-default machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 xarray: 0.10.7 pandas: 0.23.1 numpy: 1.14.2 scipy: 1.1.0 netCDF4: 1.4.0 h5netcdf: 0.6.1 h5py: 2.8.0 Nio: None zarr: 2.2.0 bottleneck: 1.2.1 cyordereddict: None dask: 0.18.1 distributed: 1.22.0 matplotlib: 2.2.2 cartopy: 0.16.0 seaborn: None setuptools: 39.2.0 pip: 10.0.1 conda: None pytest: None IPython: 6.4.0 sphinx: None
#896 mfdataset fails at chunking after opening
issue · pydata/xarray · apatlpo · closed as completed · 5 comments · created 2016-07-12 · closed 2019-01-27 · CONTRIBUTOR

Hi all,

We are trying to specify chunks after opening an mfdataset, but it does not work, while the same approach works fine with a plain dataset. Is this behavior expected? Are we doing anything wrong?

```python
# modules
import sys, os
import xarray as xr

chunks = (1727, 2711)
xr_chunks = {'x': chunks[-1], 'y': chunks[-2], 'time_counter': 1, 'deptht': 1}

# parameters
natl60_path = '/home7/pharos/othr/NATL60/'
filename = natl60_path + 'NATL60-MJM155-S/5d/2008/NATL60-MJM155_y2008m01d09.5d_gridT.nc'
filenames = natl60_path + 'NATL60-MJM155-S/5d/2008/NATL60-MJM155_y2008m01d0*gridT.nc'

# dataset: open, chunk, plot
ds = xr.open_dataset(filename, chunks=None)
ds = ds.chunk(xr_chunks)
print 'With dataset:'
print ds['votemper'].isel(time_counter=0, deptht=0).values

# mfdataset: open, plot
ds = xr.open_mfdataset(filenames, chunks=None, lock=False)
print 'With mfdataset no chunks:'
print ds['votemper'].isel(time_counter=0, deptht=0).values

# mfdataset: chunk, plot
print 'With mfdataset with chunks:'
ds = ds.chunk(xr_chunks)
print ds['votemper'].isel(time_counter=0, deptht=0)
print ds['votemper'].isel(time_counter=0, deptht=0).values
```

The output is:

```
With dataset:
[[ nan  nan  nan ...,  nan  nan  nan]
 [ nan  nan  nan ...,  nan  nan  nan]
 [ nan  nan  nan ...,  nan  nan  nan]
 ...,
 [ nan  nan  nan ...,  nan  nan  nan]
 [ nan  nan  nan ...,  nan  nan  nan]
 [ nan  nan  nan ...,  nan  nan  nan]]
With mfdataset no chunks:
[[ nan  nan  nan ...,  nan  nan  nan]
 [ nan  nan  nan ...,  nan  nan  nan]
 [ nan  nan  nan ...,  nan  nan  nan]
 ...,
 [ nan  nan  nan ...,  nan  nan  nan]
 [ nan  nan  nan ...,  nan  nan  nan]
 [ nan  nan  nan ...,  nan  nan  nan]]
With mfdataset with chunks:
<xarray.DataArray 'votemper' (y: 3454, x: 5422)>
dask.array<getitem..., shape=(3454, 5422), dtype=float64, chunksize=(1727, 2711)>
Coordinates:
    nav_lat        (y, x) float32 26.5648 26.5648 26.5648 26.5648 26.5648 ...
    nav_lon        (y, x) float32 -81.4512 -81.4346 -81.4179 -81.4012 ...
    deptht         float32 0.480455
  * x              (x) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 ...
  * y              (y) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 ...
    time_counter   datetime64[ns] 2008-01-02T12:00:00
    time_centered  datetime64[ns] 2008-01-02T12:00:00
Attributes:
    long_name: temperature
    units: degC
    online_operation: average
    interval_operation: 40s
    interval_write: 5d
```

The code hangs for a while and then spits:

```
Traceback (most recent call last):
  File "/home/slyne/aponte/natl60/python/natl60_dimup/overview/aurelien/plot_snapshot_2d_v4_break.py", line 44, in <module>
    print ds['votemper'].isel(time_counter=0,deptht=0).values
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/xarray/core/dataarray.py", line 364, in values
    return self.variable.values
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/xarray/core/variable.py", line 288, in values
    return _as_array_or_item(self._data_cached())
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/xarray/core/variable.py", line 254, in _data_cached
    self._data = np.asarray(self._data)
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/numpy/core/numeric.py", line 460, in asarray
    return array(a, dtype, copy=False, order=order)
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/array/core.py", line 867, in __array__
    x = self.compute()
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/base.py", line 37, in compute
    return compute(self, **kwargs)[0]
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/base.py", line 110, in compute
    results = get(dsk, keys, **kwargs)
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/threaded.py", line 57, in get
    **kwargs)
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/async.py", line 481, in get_async
    raise(remote_exception(res, tb))
dask.async.MemoryError:

Traceback
---------
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/async.py", line 264, in execute_task
    result = _execute_task(task, data)
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/async.py", line 245, in _execute_task
    args2 = [_execute_task(a, cache) for a in args]
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/async.py", line 245, in _execute_task
    args2 = [_execute_task(a, cache) for a in args]
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/async.py", line 242, in _execute_task
    return [_execute_task(a, cache) for a in arg]
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/async.py", line 242, in _execute_task
    return [_execute_task(a, cache) for a in arg]
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/async.py", line 242, in _execute_task
    return [_execute_task(a, cache) for a in arg]
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/async.py", line 242, in _execute_task
    return [_execute_task(a, cache) for a in arg]
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/async.py", line 245, in _execute_task
    args2 = [_execute_task(a, cache) for a in args]
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/async.py", line 246, in _execute_task
    return func(*args2)
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/array/core.py", line 50, in getarray
    c = np.asarray(c)
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/numpy/core/numeric.py", line 460, in asarray
    return array(a, dtype, copy=False, order=order)
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/xarray/core/indexing.py", line 312, in __array__
    return np.asarray(array[self.key], dtype=None)
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/xarray/conventions.py", line 359, in __getitem__
    self.scale_factor, self.add_offset, self._dtype)
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/xarray/conventions.py", line 57, in mask_and_scale
    values = np.array(array, dtype=dtype, copy=True)
```
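A hedged sketch of one thing to try (not from this thread): pass the chunk specification to `open_mfdataset` directly, so each file opens with small dask chunks instead of being rechunked after the fact.

```python
import xarray as xr

natl60_path = '/home7/pharos/othr/NATL60/'
filenames = natl60_path + 'NATL60-MJM155-S/5d/2008/NATL60-MJM155_y2008m01d0*gridT.nc'
xr_chunks = {'x': 2711, 'y': 1727, 'time_counter': 1, 'deptht': 1}

# chunks= applies per file at open time, so no whole-file chunks
# need to be materialized and rechunked afterwards.
ds = xr.open_mfdataset(filenames, chunks=xr_chunks, lock=False)
print(ds['votemper'].isel(time_counter=0, deptht=0).values)
```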

Cheers

Aurelien

#2504 isel slows down computation significantly after open_dataset
issue · pydata/xarray · apatlpo · closed as completed · 3 comments · created 2018-10-24 · closed 2018-10-25 · CONTRIBUTOR

`isel` significantly slows down a simple mean calculation:

```python
ds = xr.open_dataset(grid_dir_nc + 'Depth.nc', chunks={'face': 1})
print(ds)
%time print(ds.Depth.mean().values)
```

leads to:

```
<xarray.Dataset>
Dimensions:  (face: 13, i: 4320, j: 4320)
Coordinates:
  * i        (i) int64 0 1 2 3 4 5 6 7 ... 4313 4314 4315 4316 4317 4318 4319
  * j        (j) int64 0 1 2 3 4 5 6 7 ... 4313 4314 4315 4316 4317 4318 4319
  * face     (face) int64 0 1 2 3 4 5 6 7 8 9 10 11 12
Data variables:
    Depth    (face, j, i) float32 dask.array<shape=(13, 4320, 4320), chunksize=(1, 4320, 4320)>
1935.0237
CPU times: user 241 ms, sys: 16.9 ms, total: 258 ms
Wall time: 1.05 s
```

whereas

```python
ds = xr.open_dataset(grid_dir_nc + 'Depth.nc', chunks={'face': 1})
ds = ds.isel(i=slice(None, None, 4), j=slice(None, None, 4))
%time print(ds.Depth.mean().values)
```

leads to:

```
1935.0199
CPU times: user 9.43 s, sys: 819 ms, total: 10.3 s
Wall time: 2min 57s
```

Is this expected behavior?
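A mitigation sketch, assuming the slowdown comes from strided reads against the netCDF/HDF5 file (`grid_dir_nc` as in the example above): read each face contiguously, then subsample in memory.

```python
import numpy as np
import xarray as xr

ds = xr.open_dataset(grid_dir_nc + 'Depth.nc')

# .values reads a whole face contiguously; the [::4, ::4] stride then
# happens in memory instead of as many small reads against the file.
faces = [ds.Depth.isel(face=k).values[::4, ::4] for k in range(ds.dims['face'])]
print(np.stack(faces).mean())
```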

Output of xr.show_versions()

I am using the latest xarray version (`pip install https://github.com/pydata/xarray/archive/master.zip`).

INSTALLED VERSIONS ------------------ commit: None python: 3.6.6.final.0 python-bits: 64 OS: Linux OS-release: 3.10.0-862.2.3.el7.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 xarray: 0+unknown pandas: 0.23.4 numpy: 1.15.3 scipy: 1.1.0 netCDF4: 1.4.1 h5netcdf: None h5py: None Nio: None zarr: 2.2.0 cftime: 1.0.1 PseudonetCDF: None rasterio: None iris: None bottleneck: None cyordereddict: None dask: 0.19.2 distributed: 1.23.2 matplotlib: 2.2.2 cartopy: 0.16.0 seaborn: None setuptools: 40.2.0 pip: 10.0.1 conda: None pytest: None IPython: 6.5.0 sphinx: None
#2132 to_netcdf - RuntimeError: NetCDF: HDF error
issue · pydata/xarray · apatlpo · closed as completed · 3 comments · created 2018-05-15 · closed 2018-05-16 · CONTRIBUTOR

I am trying to store data to a netcdf file and run into issues.

Data is created according to:

```python
import numpy as np
import xarray as xr

i = np.arange(4320)
j = np.arange(4320)
face = np.arange(13)
v = xr.DataArray(np.random.randn(face.size, j.size, i.size),
                 coords={'i': i, 'j': j, 'face': face},
                 dims=['face', 'j', 'i'])
```

The following works:

```python
file_out = 'rand.nc'
v.to_netcdf(file_out)
```

There is a minor warning:

```
/home1/datahome/aponte/.miniconda3/envs/equinox/lib/python3.6/site-packages/distributed/utils.py:128: RuntimeWarning: Couldn't detect a suitable IP address for reaching '8.8.8.8', defaulting to '127.0.0.1': [Errno 101] Network is unreachable
  % (host, default, e), RuntimeWarning)
```

But this does not work:

```python
file_out = '/home1/datawork/aponte/mit_tmp/rand.nc'
v.to_netcdf(file_out)
```

with the following error message:

```
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
~/.miniconda3/envs/equinox/lib/python3.6/site-packages/xarray/backends/api.py in to_netcdf(dataset, path_or_file, mode, format, group, engine, writer, encoding, unlimited_dims)
    656         dataset.dump_to_store(store, sync=sync, encoding=encoding,
--> 657                               unlimited_dims=unlimited_dims)
    658         if path_or_file is None:

~/.miniconda3/envs/equinox/lib/python3.6/site-packages/xarray/core/dataset.py in dump_to_store(self, store, encoder, sync, encoding, unlimited_dims)
   1073         store.store(variables, attrs, check_encoding,
-> 1074                     unlimited_dims=unlimited_dims)
   1075         if sync:

~/.miniconda3/envs/equinox/lib/python3.6/site-packages/xarray/backends/common.py in store(self, variables, attributes, check_encoding_set, unlimited_dims)
    362         self.set_variables(variables, check_encoding_set,
--> 363                            unlimited_dims=unlimited_dims)
    364

~/.miniconda3/envs/equinox/lib/python3.6/site-packages/xarray/backends/netCDF4_.py in set_variables(self, *args, **kwargs)
    353         with self.ensure_open(autoclose=False):
--> 354             super(NetCDF4DataStore, self).set_variables(*args, **kwargs)
    355

~/.miniconda3/envs/equinox/lib/python3.6/site-packages/xarray/backends/common.py in set_variables(self, variables, check_encoding_set, unlimited_dims)
    401
--> 402             self.writer.add(source, target)
    403

~/.miniconda3/envs/equinox/lib/python3.6/site-packages/xarray/backends/common.py in add(self, source, target)
    264         else:
--> 265             target[...] = source
    266

~/.miniconda3/envs/equinox/lib/python3.6/site-packages/xarray/backends/netCDF4_.py in __setitem__(self, key, value)
     46             data = self.get_array()
---> 47             data[key] = value
     48

netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Variable.__setitem__()
netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Variable._put()
netCDF4/_netCDF4.pyx in netCDF4._netCDF4._ensure_nc_success()

RuntimeError: NetCDF: HDF error

During handling of the above exception, another exception occurred:

RuntimeError                              Traceback (most recent call last)
<ipython-input-4-9da9ecadc6a6> in <module>()
      2 if os.path.isfile(file_out):
      3     os.remove(file_out)
----> 4 v.to_netcdf(file_out)

~/.miniconda3/envs/equinox/lib/python3.6/site-packages/xarray/core/dataarray.py in to_netcdf(self, *args, **kwargs)
   1515             dataset = self.to_dataset()
   1516
-> 1517         return dataset.to_netcdf(*args, **kwargs)
   1518
   1519     def to_dict(self):

~/.miniconda3/envs/equinox/lib/python3.6/site-packages/xarray/core/dataset.py in to_netcdf(self, path, mode, format, group, engine, encoding, unlimited_dims)
   1135         return to_netcdf(self, path, mode, format=format, group=group,
   1136                          engine=engine, encoding=encoding,
-> 1137                          unlimited_dims=unlimited_dims)
   1138
   1139     def to_zarr(self, store=None, mode='w-', synchronizer=None, group=None,

~/.miniconda3/envs/equinox/lib/python3.6/site-packages/xarray/backends/api.py in to_netcdf(dataset, path_or_file, mode, format, group, engine, writer, encoding, unlimited_dims)
    660     finally:
    661         if sync and isinstance(path_or_file, basestring):
--> 662             store.close()
    663
    664     if not sync:

~/.miniconda3/envs/equinox/lib/python3.6/site-packages/xarray/backends/netCDF4_.py in close(self)
    419             ds = find_root(self.ds)
    420             if ds._isopen:
--> 421                 ds.close()
    422             self._isopen = False

netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Dataset.close()
netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Dataset._close()
netCDF4/_netCDF4.pyx in netCDF4._netCDF4._ensure_nc_success()

RuntimeError: NetCDF: HDF error
```

The following may be of some use (the `.` directory being where the notebook sits):

```
(equinox) aponte@datarmor1:~/mit_equinox/sandbox> stat -f -L -c %T /home1/datawork/aponte/mit_tmp/
gpfs
(equinox) aponte@datarmor1:~/mit_equinox/sandbox> stat -f -L -c %T .
nfs
```
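Two hedged things to try, assuming the GPFS mount (rather than the data) is the variable here; neither is confirmed by this thread as the fix.

```python
import os

# 1. HDF5 file locking is a known source of trouble on parallel file
#    systems; this must be set before netCDF4/h5py are imported.
os.environ["HDF5_USE_FILE_LOCKING"] = "FALSE"

import numpy as np
import xarray as xr

v = xr.DataArray(np.random.randn(13, 4320, 4320),
                 dims=["face", "j", "i"])

# 2. Or sidestep HDF5 entirely with a classic netCDF-3 container.
v.to_netcdf("/home1/datawork/aponte/mit_tmp/rand.nc", format="NETCDF3_64BIT")
```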

Output of xr.show_versions()

/home1/datahome/aponte/.miniconda3/envs/equinox/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`. from ._conv import register_converters as _register_converters INSTALLED VERSIONS ------------------ commit: None python: 3.6.5.final.0 python-bits: 64 OS: Linux OS-release: 3.12.53-60.30-default machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 xarray: 0.10.3 pandas: 0.22.0 numpy: 1.14.2 scipy: 1.0.1 netCDF4: 1.3.1 h5netcdf: 0.5.1 h5py: 2.7.1 Nio: None zarr: None bottleneck: 1.2.1 cyordereddict: None dask: 0.17.2 distributed: 1.21.6 matplotlib: 2.2.2 cartopy: 0.16.0 seaborn: None setuptools: 39.0.1 pip: 9.0.3 conda: None pytest: None IPython: 6.3.1 sphinx: None

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
Powered by Datasette · Queries took 22.615ms · About: xarray-datasette