issues


11 rows where state = "closed" and user = 11750960 (apatlpo), sorted by updated_at descending

#7541 standard deviation over one dimension of a chunked DataArray leads to NaN
issue · pydata/xarray · apatlpo · closed as completed · 2 comments · created 2023-02-17 · closed 2023-02-17 · CONTRIBUTOR

What happened?

When computing the standard deviation over one dimension of a chunked DataArray, one may get NaNs.

What did you expect to happen?

We should not have any NaNs.

Minimal Complete Verifiable Example

```python
import numpy as np
import xarray as xr

x = np.random.randn(10, 10) + 1j * np.random.randn(10, 10)
da = xr.DataArray(x).chunk(dict(dim_0=3))

da.std("dim_0").compute()  # NaN
da.compute().std("dim_0")  # no NaNs
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS ------------------ commit: None python: 3.10.8 (main, Nov 24 2022, 08:09:04) [Clang 14.0.6 ] python-bits: 64 OS: Darwin OS-release: 22.3.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: fr_FR.UTF-8 LOCALE: ('fr_FR', 'UTF-8') libhdf5: 1.12.2 libnetcdf: 4.8.1 xarray: 2022.12.0 pandas: 1.5.2 numpy: 1.23.5 scipy: 1.9.3 netCDF4: 1.6.2 pydap: None h5netcdf: None h5py: 3.7.0 Nio: None zarr: 2.13.3 cftime: 1.5.1.1 nc_time_axis: None PseudoNetCDF: None rasterio: 1.3.4 cfgrib: None iris: None bottleneck: 1.3.5 dask: 2022.02.1 distributed: 2022.2.1 matplotlib: 3.6.2 cartopy: 0.21.1 seaborn: 0.12.1 numbagg: None fsspec: 2022.11.0 cupy: None pint: None sparse: None flox: None numpy_groupies: None setuptools: 65.5.0 pip: 22.3.1 conda: None pytest: None mypy: None IPython: 8.7.0 sphinx: None
#1889 call to colorbar not thread safe
issue · pydata/xarray · apatlpo · closed as completed · 12 comments · created 2018-02-06 · closed 2022-04-27 · CONTRIBUTOR

The following call in `xarray/xarray/plot/plot.py` does not seem to be thread safe: `cbar = plt.colorbar(primitive, **cbar_kwargs)`. It leads to systematic crashes when run with distributed, with a cryptic error message (`ValueError: Unknown element o`). I have to create colorbars outside the xarray plot call to prevent crashes.

A call of the following type may fix the problem: `cbar = fig.colorbar(primitive, **cbar_kwargs)`. But `fig` does not seem to be available directly in `plot.py`. Maybe: `cbar = ax.get_figure().colorbar(primitive, **cbar_kwargs)`
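A minimal sketch of the figure-scoped call, assuming an existing `Axes` and mappable (the data here is illustrative):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, safer off the main thread
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
primitive = ax.pcolormesh(np.random.rand(10, 10))

# Scope the colorbar to this figure rather than pyplot's global
# "current figure", which is shared mutable state across threads.
cbar = ax.get_figure().colorbar(primitive, ax=ax)
fig.savefig("out.png")
```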

cheers

#4284 overwriting netcdf file fails at read time
issue · pydata/xarray · apatlpo · closed as completed · 1 comment · created 2020-07-29 · closed 2020-08-01 · CONTRIBUTOR

I generate a dataset once:

```python
ds = xr.DataArray(np.arange(10), name='x').to_dataset()
ds.to_netcdf('test.nc', mode='w')
```

Now I overwrite with a new netcdf file and load:

```python
ds = xr.DataArray(np.arange(20), name='x').to_dataset()
ds.to_netcdf('test.nc', mode='w')
ds_out = xr.open_dataset('test.nc')
print(ds_out)
```

outputs:

```
<xarray.Dataset>
Dimensions:  (dim_0: 10)
Dimensions without coordinates: dim_0
Data variables:
    x        (dim_0) int64 ...
```

I would have expected to get the new dataset.

If I use netCDF4 directly, the file seems to have been properly overwritten:

```python
import netCDF4 as nc
d = nc.Dataset('test.nc')
d
```

outputs:

```
<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF4 data model, file format HDF5):
    dimensions(sizes): dim_0(20)
    variables(dimensions): int64 x(dim_0)
    groups:
```
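A sketch of one mitigation, assuming the stale read comes from xarray's per-process file-handle cache rather than from the file on disk: release handles explicitly (e.g. with a context manager) before re-reading the overwritten path.

```python
import numpy as np
import xarray as xr

xr.DataArray(np.arange(10), name='x').to_dataset().to_netcdf('test.nc', mode='w')

# Open with a context manager so the handle is released on exit.
with xr.open_dataset('test.nc') as ds_out:
    ds_out.load()

xr.DataArray(np.arange(20), name='x').to_dataset().to_netcdf('test.nc', mode='w')

with xr.open_dataset('test.nc') as ds_out:
    print(ds_out.dims)  # expect dim_0: 20
```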

Environment:

Output of `xr.show_versions()` INSTALLED VERSIONS ------------------ commit: None python: 3.7.6 | packaged by conda-forge | (default, Mar 23 2020, 23:03:20) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: 3.12.53-60.30-default machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.10.5 libnetcdf: 4.7.4 xarray: 0.15.1 pandas: 1.0.3 numpy: 1.18.1 scipy: 1.4.1 netCDF4: 1.5.3 pydap: None h5netcdf: None h5py: None Nio: None zarr: 2.4.0 cftime: 1.1.1.2 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2.13.0 distributed: 2.13.0 matplotlib: 3.3.0 cartopy: 0.17.0 seaborn: 0.10.0 numbagg: None setuptools: 46.1.3.post20200325 pip: 20.0.2 conda: None pytest: None IPython: 7.13.0 sphinx: None
#4048 improve to_zarr doc about chunking
pull request (pydata/xarray/pulls/4048) · apatlpo · closed · 9 comments · created 2020-05-08 · closed 2020-05-20 · CONTRIBUTOR
  • [X] follows #4046
  • [X] Passes `isort -rc . && black . && mypy . && flake8`

I'm not sure the last point is really necessary for this PR, is it?
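For context, a minimal sketch of the behaviour the doc change concerns (file name illustrative, not from this PR): `to_zarr` derives the on-disk zarr chunks from the dask chunks, so rechunking before writing is the simplest way to control them.

```python
import numpy as np
import xarray as xr

ds = xr.Dataset({"v": (("x",), np.arange(100))})

# One zarr chunk per dask chunk: rechunk first to choose on-disk chunking.
ds.chunk({"x": 25}).to_zarr("example.zarr", mode="w")
```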

#3944 implement a more threadsafe call to colorbar
pull request (pydata/xarray/pulls/3944) · apatlpo · closed · 7 comments · created 2020-04-07 · closed 2020-04-09 · CONTRIBUTOR
  • [ ] Xref #1889
  • [ ] Tests added
  • [ ] Passes `isort -rc . && black . && mypy . && flake8`
  • [ ] Fully documented, including whats-new.rst for all changes and api.rst for new API

If you think this is relevant, I'll go ahead and start working on the items above, even though I'm not sure new tests are needed.

#3932 Element wise dataArray generation
issue · pydata/xarray · apatlpo · closed as completed · 6 comments · created 2020-04-04 · closed 2020-04-07 · CONTRIBUTOR

I'm in a situation where I want to generate a two-dimensional DataArray from a method that takes each of the two dimensions as input parameters. I have two methods to do this, but neither looks particularly elegant to me, and I wondered whether somebody has better ideas.

  • Method 1: dask delayed

```python
import dask
import dask.array as da
import numpy as np
import xarray as xr

x = np.arange(10)
y = np.arange(20)

# each experiment outputs Nstats statistical diagnostics
Nstats = 5
some_exp = lambda x, y: np.ones((Nstats,))
some_exp_delayed = dask.delayed(some_exp, pure=True)

lazy_data = [some_exp_delayed(_x, _y) for _x in x for _y in y]
sample = lazy_data[0].compute()
arrays = [da.from_delayed(lazy_value, dtype=sample.dtype, shape=sample.shape)
          for lazy_value in lazy_data]

stack = da.stack(arrays, axis=0).reshape((len(x), len(y), sample.size))
ds = xr.DataArray(stack, dims=['x', 'y', 'stats'])
```

I tend to prefer this option because it imposes fewer requirements on the shape of the `some_exp` output. That said, it still seems like too many lines of code to achieve such a result.

  • Method 2: apply_ufunc

```python
import numpy as np
import xarray as xr

x = np.arange(10)
y = np.arange(20)

ds = xr.Dataset(coords={'x': x, 'y': y})
ds['_y'] = (0 * ds.x + ds.y)  # breaks apply_ufunc otherwise
ds = ds.chunk({'x': 1, 'y': 1})

# let's say each experiment outputs 5 statistical diagnostics
Nstats = 5
some_exp = lambda x, y: np.ones((1, 1, Nstats))

out = xr.apply_ufunc(some_exp, ds.x, ds._y,
                     dask='parallelized',
                     output_dtypes=[float],
                     output_sizes={'stats': Nstats},
                     output_core_dims=[['stats']])
```

I don't understand why I have to use the dummy variable `ds._y` in this case, and having to rely on `apply_ufunc` seems like overkill.
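A possibly simpler sketch (not from this thread): `apply_ufunc` with `vectorize=True` loops a scalar function over the broadcast elements, which avoids both the dummy variable and the manual chunking.

```python
import numpy as np
import xarray as xr

Nstats = 5

def some_exp(x, y):
    # placeholder experiment returning Nstats diagnostics per (x, y) pair
    return np.ones(Nstats)

x = xr.DataArray(np.arange(10), dims="x")
y = xr.DataArray(np.arange(20), dims="y")

out = xr.apply_ufunc(
    some_exp, x, y,
    vectorize=True,              # loop some_exp over broadcast (x, y) pairs
    output_core_dims=[["stats"]],
)
print(out.shape)  # (10, 20, 5)
```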

#3933 plot.line breaks depending on coordinate shape
issue · pydata/xarray · apatlpo · closed as completed · 2 comments · created 2020-04-04 · closed 2020-04-04 · CONTRIBUTOR

`plot.line` breaks depending on coordinate shape; see the code below:

```python
import numpy as np
import xarray as xr

x = np.arange(10)
y = np.arange(20)

ds = xr.Dataset(coords={'x': x, 'y': y})

ds = ds.assign_coords(z=ds.y + ds.x)  # goes through
ds = ds.assign_coords(z=ds.x + ds.y)  # breaks
ds['v'] = (ds.x + ds.y)
ds['v'].plot.line(y='z', hue='x')
```

This breaks with the following error:

```
...
~/.miniconda3/envs/equinox/lib/python3.7/site-packages/matplotlib/axes/_base.py in _plot_args(self, tup, kwargs)
    340
    341         if x.shape[0] != y.shape[0]:
--> 342             raise ValueError(f"x and y must have same first dimension, but "
    343                              f"have shapes {x.shape} and {y.shape}")
    344         if x.ndim > 2 or y.ndim > 2:

ValueError: x and y must have same first dimension, but have shapes (20, 10) and (10, 20)
```

I would have expected that dimension order would not matter.
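A workaround sketch, assuming (from the transposed shapes in the ValueError) that the 2-D coordinate's dimension order is what trips matplotlib: transpose `z` before plotting.

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(coords={'x': np.arange(10), 'y': np.arange(20)})

# ds.x + ds.y broadcasts to dims ('x', 'y'); transposing the coordinate
# to ('y', 'x') matches the order of the case that goes through.
ds = ds.assign_coords(z=(ds.x + ds.y).transpose('y', 'x'))
ds['v'] = ds.x + ds.y
ds['v'].plot.line(y='z', hue='x')
```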

Versions

Output of `xr.show_versions()` INSTALLED VERSIONS ------------------ commit: None python: 3.7.6 | packaged by conda-forge | (default, Mar 23 2020, 23:03:20) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: 3.12.53-60.30-default machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.10.5 libnetcdf: 4.7.4 xarray: 0.15.1 pandas: 1.0.3 numpy: 1.18.1 scipy: 1.4.1 netCDF4: 1.5.3 pydap: None h5netcdf: None h5py: None Nio: None zarr: 2.4.0 cftime: 1.1.1.2 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2.13.0 distributed: 2.13.0 matplotlib: 3.2.1 cartopy: 0.17.0 seaborn: 0.10.0 numbagg: None setuptools: 46.1.3.post20200325 pip: 20.0.2 conda: None pytest: None IPython: 7.13.0 sphinx: None
#2278 can't store zarr after open_zarr and isel
issue · pydata/xarray · apatlpo · closed as completed · 10 comments · created 2018-07-11 · closed 2018-08-14 · CONTRIBUTOR

Code Sample, a copy-pastable example if possible

This works fine:

```python
nx, ny, nt = 32, 32, 64
ds = xr.Dataset({}, coords={'x': np.arange(nx), 'y': np.arange(ny), 't': np.arange(nt)})
ds = ds.assign(v=ds.t * np.cos(np.pi/180./100 * ds.x) * np.cos(np.pi/180./50 * ds.y))
ds = ds.chunk({'t': 1, 'x': nx/2, 'y': ny/2})

ds.isel(t=0).to_zarr('data_t0.zarr', mode='w')
```

But if I store, reload and select, I cannot store:

```python
ds.to_zarr('data.zarr', mode='w')
ds = xr.open_zarr('data.zarr')
ds.isel(t=0).to_zarr('data_t0.zarr', mode='w')
```

Error message ends with:

```
~/.miniconda3/envs/equinox/lib/python3.6/site-packages/xarray/backends/zarr.py in _extract_zarr_variable_encoding(variable, raise_on_invalid)
    181
    182     chunks = _determine_zarr_chunks(encoding.get('chunks'), variable.chunks,
--> 183                                     variable.ndim)
    184     encoding['chunks'] = chunks
    185     return encoding

~/.miniconda3/envs/equinox/lib/python3.6/site-packages/xarray/backends/zarr.py in _determine_zarr_chunks(enc_chunks, var_chunks, ndim)
    112         raise ValueError("zarr chunks tuple %r must have same length as "
    113                          "variable.ndim %g" %
--> 114                          (enc_chunks_tuple, ndim))
    115
    116     for x in enc_chunks_tuple:

ValueError: zarr chunks tuple (1, 16, 16) must have same length as variable.ndim 2
```
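A workaround sketch, assuming the error comes from the stale 3-D `chunks` encoding that `open_zarr` attaches to each variable: drop that encoding on the selected subset so `to_zarr` re-derives chunks for the 2-D slice.

```python
import xarray as xr

ds = xr.open_zarr('data.zarr')
sel = ds.isel(t=0)

# Remove the inherited 'chunks' encoding (still shaped for 3-D data)
# from every variable before writing the 2-D selection.
for var in sel.variables.values():
    var.encoding.pop('chunks', None)

sel.to_zarr('data_t0.zarr', mode='w')
```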

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.6.5.final.0 python-bits: 64 OS: Linux OS-release: 3.12.53-60.30-default machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 xarray: 0.10.7 pandas: 0.23.1 numpy: 1.14.2 scipy: 1.1.0 netCDF4: 1.4.0 h5netcdf: 0.6.1 h5py: 2.8.0 Nio: None zarr: 2.2.0 bottleneck: 1.2.1 cyordereddict: None dask: 0.18.1 distributed: 1.22.0 matplotlib: 2.2.2 cartopy: 0.16.0 seaborn: None setuptools: 39.2.0 pip: 10.0.1 conda: None pytest: None IPython: 6.4.0 sphinx: None
#896 mfdataset fails at chunking after opening
issue · pydata/xarray · apatlpo · closed as completed · 5 comments · created 2016-07-12 · closed 2019-01-27 · CONTRIBUTOR

Hi all,

We are trying to specify chunks after opening an mfdataset, but it does not work, while the same approach works fine with a plain dataset. Is this behavior expected? Are we doing anything wrong?

```python
# modules
import sys, os
import xarray as xr

chunks = (1727, 2711)
xr_chunks = {'x': chunks[-1], 'y': chunks[-2], 'time_counter': 1, 'deptht': 1}

# parameters
natl60_path = '/home7/pharos/othr/NATL60/'
filename = natl60_path + 'NATL60-MJM155-S/5d/2008/NATL60-MJM155_y2008m01d09.5d_gridT.nc'
filenames = natl60_path + 'NATL60-MJM155-S/5d/2008/NATL60-MJM155_y2008m01d0*gridT.nc'

# dataset: open, chunk, plot
ds = xr.open_dataset(filename, chunks=None)
ds = ds.chunk(xr_chunks)
print 'With dataset:'
print ds['votemper'].isel(time_counter=0, deptht=0).values

# mfdataset: open, plot
ds = xr.open_mfdataset(filenames, chunks=None, lock=False)
print 'With mfdataset no chunks:'
print ds['votemper'].isel(time_counter=0, deptht=0).values

# mfdataset: chunk, plot
print 'With mfdataset with chunks:'
ds = ds.chunk(xr_chunks)
print ds['votemper'].isel(time_counter=0, deptht=0)
print ds['votemper'].isel(time_counter=0, deptht=0).values
```

The output is:

```
With dataset:
[[ nan  nan  nan ...,  nan  nan  nan]
 [ nan  nan  nan ...,  nan  nan  nan]
 [ nan  nan  nan ...,  nan  nan  nan]
 ...,
 [ nan  nan  nan ...,  nan  nan  nan]
 [ nan  nan  nan ...,  nan  nan  nan]
 [ nan  nan  nan ...,  nan  nan  nan]]
With mfdataset no chunks:
[[ nan  nan  nan ...,  nan  nan  nan]
 [ nan  nan  nan ...,  nan  nan  nan]
 [ nan  nan  nan ...,  nan  nan  nan]
 ...,
 [ nan  nan  nan ...,  nan  nan  nan]
 [ nan  nan  nan ...,  nan  nan  nan]
 [ nan  nan  nan ...,  nan  nan  nan]]
With mfdataset with chunks:
<xarray.DataArray 'votemper' (y: 3454, x: 5422)>
dask.array<getitem..., shape=(3454, 5422), dtype=float64, chunksize=(1727, 2711)>
Coordinates:
    nav_lat        (y, x) float32 26.5648 26.5648 26.5648 26.5648 26.5648 ...
    nav_lon        (y, x) float32 -81.4512 -81.4346 -81.4179 -81.4012 ...
    deptht         float32 0.480455
  * x              (x) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 ...
  * y              (y) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 ...
    time_counter   datetime64[ns] 2008-01-02T12:00:00
    time_centered  datetime64[ns] 2008-01-02T12:00:00
Attributes:
    long_name: temperature
    units: degC
    online_operation: average
    interval_operation: 40s
    interval_write: 5d
```

The code hangs for a while and then spits:

```
Traceback (most recent call last):
  File "/home/slyne/aponte/natl60/python/natl60_dimup/overview/aurelien/plot_snapshot_2d_v4_break.py", line 44, in <module>
    print ds['votemper'].isel(time_counter=0,deptht=0).values
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/xarray/core/dataarray.py", line 364, in values
    return self.variable.values
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/xarray/core/variable.py", line 288, in values
    return _as_array_or_item(self._data_cached())
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/xarray/core/variable.py", line 254, in _data_cached
    self._data = np.asarray(self._data)
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/numpy/core/numeric.py", line 460, in asarray
    return array(a, dtype, copy=False, order=order)
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/array/core.py", line 867, in __array__
    x = self.compute()
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/base.py", line 37, in compute
    return compute(self, **kwargs)[0]
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/base.py", line 110, in compute
    results = get(dsk, keys, **kwargs)
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/threaded.py", line 57, in get
    **kwargs)
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/async.py", line 481, in get_async
    raise(remote_exception(res, tb))
dask.async.MemoryError:

Traceback
---------
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/async.py", line 264, in execute_task
    result = _execute_task(task, data)
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/async.py", line 245, in _execute_task
    args2 = [_execute_task(a, cache) for a in args]
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/async.py", line 245, in _execute_task
    args2 = [_execute_task(a, cache) for a in args]
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/async.py", line 242, in _execute_task
    return [_execute_task(a, cache) for a in arg]
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/async.py", line 242, in _execute_task
    return [_execute_task(a, cache) for a in arg]
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/async.py", line 242, in _execute_task
    return [_execute_task(a, cache) for a in arg]
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/async.py", line 242, in _execute_task
    return [_execute_task(a, cache) for a in arg]
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/async.py", line 245, in _execute_task
    args2 = [_execute_task(a, cache) for a in args]
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/async.py", line 246, in _execute_task
    return func(*args2)
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/dask/array/core.py", line 50, in getarray
    c = np.asarray(c)
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/numpy/core/numeric.py", line 460, in asarray
    return array(a, dtype, copy=False, order=order)
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/xarray/core/indexing.py", line 312, in __array__
    return np.asarray(array[self.key], dtype=None)
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/xarray/conventions.py", line 359, in __getitem__
    self.scale_factor, self.add_offset, self._dtype)
  File "/home1/homedir5/perso/aponte/miniconda2/envs/natl60/lib/python2.7/site-packages/xarray/conventions.py", line 57, in mask_and_scale
    values = np.array(array, dtype=dtype, copy=True)
```
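A hedged sketch of one thing to try (not from this thread): pass the chunk specification to `open_mfdataset` directly, so each file opens with small dask chunks instead of being rechunked after the fact.

```python
import xarray as xr

natl60_path = '/home7/pharos/othr/NATL60/'
filenames = natl60_path + 'NATL60-MJM155-S/5d/2008/NATL60-MJM155_y2008m01d0*gridT.nc'
xr_chunks = {'x': 2711, 'y': 1727, 'time_counter': 1, 'deptht': 1}

# chunks= applies per file at open time, so no whole-file chunks
# need to be materialized and rechunked afterwards.
ds = xr.open_mfdataset(filenames, chunks=xr_chunks, lock=False)
print(ds['votemper'].isel(time_counter=0, deptht=0).values)
```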

Cheers

Aurelien

#2504 isel slows down computation significantly after open_dataset
issue · pydata/xarray · apatlpo · closed as completed · 3 comments · created 2018-10-24 · closed 2018-10-25 · CONTRIBUTOR

`isel` significantly slows down a simple mean calculation:

```python
ds = xr.open_dataset(grid_dir_nc + 'Depth.nc', chunks={'face': 1})
print(ds)
%time print(ds.Depth.mean().values)
```

leads to:

```
<xarray.Dataset>
Dimensions:  (face: 13, i: 4320, j: 4320)
Coordinates:
  * i        (i) int64 0 1 2 3 4 5 6 7 ... 4313 4314 4315 4316 4317 4318 4319
  * j        (j) int64 0 1 2 3 4 5 6 7 ... 4313 4314 4315 4316 4317 4318 4319
  * face     (face) int64 0 1 2 3 4 5 6 7 8 9 10 11 12
Data variables:
    Depth    (face, j, i) float32 dask.array<shape=(13, 4320, 4320), chunksize=(1, 4320, 4320)>
1935.0237
CPU times: user 241 ms, sys: 16.9 ms, total: 258 ms
Wall time: 1.05 s
```

whereas

```python
ds = xr.open_dataset(grid_dir_nc + 'Depth.nc', chunks={'face': 1})
ds = ds.isel(i=slice(None, None, 4), j=slice(None, None, 4))
%time print(ds.Depth.mean().values)
```

leads to:

```
1935.0199
CPU times: user 9.43 s, sys: 819 ms, total: 10.3 s
Wall time: 2min 57s
```

Is this expected behavior?
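A mitigation sketch, assuming the slowdown comes from strided reads against the netCDF/HDF5 file (`grid_dir_nc` as in the example above): read each face contiguously, then subsample in memory.

```python
import numpy as np
import xarray as xr

ds = xr.open_dataset(grid_dir_nc + 'Depth.nc')

# .values reads a whole face contiguously; the [::4, ::4] stride then
# happens in memory instead of as many small reads against the file.
faces = [ds.Depth.isel(face=k).values[::4, ::4] for k in range(ds.dims['face'])]
print(np.stack(faces).mean())
```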

Output of xr.show_versions()

I am using the latest xarray version (`pip install https://github.com/pydata/xarray/archive/master.zip`).

INSTALLED VERSIONS ------------------ commit: None python: 3.6.6.final.0 python-bits: 64 OS: Linux OS-release: 3.10.0-862.2.3.el7.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 xarray: 0+unknown pandas: 0.23.4 numpy: 1.15.3 scipy: 1.1.0 netCDF4: 1.4.1 h5netcdf: None h5py: None Nio: None zarr: 2.2.0 cftime: 1.0.1 PseudonetCDF: None rasterio: None iris: None bottleneck: None cyordereddict: None dask: 0.19.2 distributed: 1.23.2 matplotlib: 2.2.2 cartopy: 0.16.0 seaborn: None setuptools: 40.2.0 pip: 10.0.1 conda: None pytest: None IPython: 6.5.0 sphinx: None
#2132 to_netcdf - RuntimeError: NetCDF: HDF error
issue · pydata/xarray · apatlpo · closed as completed · 3 comments · created 2018-05-15 · closed 2018-05-16 · CONTRIBUTOR

I am trying to store data to a netcdf file and run into issues.

Data is created according to:

```python
import numpy as np
import xarray as xr

i = np.arange(4320)
j = np.arange(4320)
face = np.arange(13)
v = xr.DataArray(np.random.randn(face.size, j.size, i.size),
                 coords={'i': i, 'j': j, 'face': face},
                 dims=['face', 'j', 'i'])
```

The following works:

```python
file_out = 'rand.nc'
v.to_netcdf(file_out)
```

There is a minor warning:

```
/home1/datahome/aponte/.miniconda3/envs/equinox/lib/python3.6/site-packages/distributed/utils.py:128: RuntimeWarning: Couldn't detect a suitable IP address for reaching '8.8.8.8', defaulting to '127.0.0.1': [Errno 101] Network is unreachable
  % (host, default, e), RuntimeWarning)
```

But this does not work:

```python
file_out = '/home1/datawork/aponte/mit_tmp/rand.nc'
v.to_netcdf(file_out)
```

with the following error message:

```
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
~/.miniconda3/envs/equinox/lib/python3.6/site-packages/xarray/backends/api.py in to_netcdf(dataset, path_or_file, mode, format, group, engine, writer, encoding, unlimited_dims)
    656         dataset.dump_to_store(store, sync=sync, encoding=encoding,
--> 657                               unlimited_dims=unlimited_dims)
    658         if path_or_file is None:

~/.miniconda3/envs/equinox/lib/python3.6/site-packages/xarray/core/dataset.py in dump_to_store(self, store, encoder, sync, encoding, unlimited_dims)
   1073         store.store(variables, attrs, check_encoding,
-> 1074                     unlimited_dims=unlimited_dims)
   1075         if sync:

~/.miniconda3/envs/equinox/lib/python3.6/site-packages/xarray/backends/common.py in store(self, variables, attributes, check_encoding_set, unlimited_dims)
    362         self.set_variables(variables, check_encoding_set,
--> 363                            unlimited_dims=unlimited_dims)
    364

~/.miniconda3/envs/equinox/lib/python3.6/site-packages/xarray/backends/netCDF4_.py in set_variables(self, *args, **kwargs)
    353         with self.ensure_open(autoclose=False):
--> 354             super(NetCDF4DataStore, self).set_variables(*args, **kwargs)
    355

~/.miniconda3/envs/equinox/lib/python3.6/site-packages/xarray/backends/common.py in set_variables(self, variables, check_encoding_set, unlimited_dims)
    401
--> 402             self.writer.add(source, target)
    403

~/.miniconda3/envs/equinox/lib/python3.6/site-packages/xarray/backends/common.py in add(self, source, target)
    264         else:
--> 265             target[...] = source
    266

~/.miniconda3/envs/equinox/lib/python3.6/site-packages/xarray/backends/netCDF4_.py in __setitem__(self, key, value)
     46             data = self.get_array()
---> 47             data[key] = value
     48

netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Variable.__setitem__()
netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Variable._put()
netCDF4/_netCDF4.pyx in netCDF4._netCDF4._ensure_nc_success()

RuntimeError: NetCDF: HDF error

During handling of the above exception, another exception occurred:

RuntimeError                              Traceback (most recent call last)
<ipython-input-4-9da9ecadc6a6> in <module>()
      2 if os.path.isfile(file_out):
      3     os.remove(file_out)
----> 4 v.to_netcdf(file_out)

~/.miniconda3/envs/equinox/lib/python3.6/site-packages/xarray/core/dataarray.py in to_netcdf(self, *args, **kwargs)
   1515             dataset = self.to_dataset()
   1516
-> 1517         return dataset.to_netcdf(*args, **kwargs)
   1518
   1519     def to_dict(self):

~/.miniconda3/envs/equinox/lib/python3.6/site-packages/xarray/core/dataset.py in to_netcdf(self, path, mode, format, group, engine, encoding, unlimited_dims)
   1135         return to_netcdf(self, path, mode, format=format, group=group,
   1136                          engine=engine, encoding=encoding,
-> 1137                          unlimited_dims=unlimited_dims)
   1138
   1139     def to_zarr(self, store=None, mode='w-', synchronizer=None, group=None,

~/.miniconda3/envs/equinox/lib/python3.6/site-packages/xarray/backends/api.py in to_netcdf(dataset, path_or_file, mode, format, group, engine, writer, encoding, unlimited_dims)
    660     finally:
    661         if sync and isinstance(path_or_file, basestring):
--> 662             store.close()
    663
    664     if not sync:

~/.miniconda3/envs/equinox/lib/python3.6/site-packages/xarray/backends/netCDF4_.py in close(self)
    419             ds = find_root(self.ds)
    420             if ds._isopen:
--> 421                 ds.close()
    422             self._isopen = False

netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Dataset.close()
netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Dataset._close()
netCDF4/_netCDF4.pyx in netCDF4._netCDF4._ensure_nc_success()

RuntimeError: NetCDF: HDF error
```

The following may be of some use (the `.` directory being where the notebook sits):

```
(equinox) aponte@datarmor1:~/mit_equinox/sandbox> stat -f -L -c %T /home1/datawork/aponte/mit_tmp/
gpfs
(equinox) aponte@datarmor1:~/mit_equinox/sandbox> stat -f -L -c %T .
nfs
```
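Two hedged things to try, assuming the GPFS mount (rather than the data) is the variable here; neither is confirmed by this thread as the fix.

```python
import os

# 1. HDF5 file locking is a known source of trouble on parallel file
#    systems; this must be set before netCDF4/h5py are imported.
os.environ["HDF5_USE_FILE_LOCKING"] = "FALSE"

import numpy as np
import xarray as xr

v = xr.DataArray(np.random.randn(13, 4320, 4320),
                 dims=["face", "j", "i"])

# 2. Or sidestep HDF5 entirely with a classic netCDF-3 container.
v.to_netcdf("/home1/datawork/aponte/mit_tmp/rand.nc", format="NETCDF3_64BIT")
```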

Output of xr.show_versions()

/home1/datahome/aponte/.miniconda3/envs/equinox/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`. from ._conv import register_converters as _register_converters INSTALLED VERSIONS ------------------ commit: None python: 3.6.5.final.0 python-bits: 64 OS: Linux OS-release: 3.12.53-60.30-default machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 xarray: 0.10.3 pandas: 0.22.0 numpy: 1.14.2 scipy: 1.0.1 netCDF4: 1.3.1 h5netcdf: 0.5.1 h5py: 2.7.1 Nio: None zarr: None bottleneck: 1.2.1 cyordereddict: None dask: 0.17.2 distributed: 1.21.6 matplotlib: 2.2.2 cartopy: 0.16.0 seaborn: None setuptools: 39.0.1 pip: 9.0.3 conda: None pytest: None IPython: 6.3.1 sphinx: None

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
Powered by Datasette · Queries took 22.615ms · About: xarray-datasette