issues

10 rows where state = "closed" and user = 167802 sorted by updated_at descending

#7948: Implement preferred_chunks for netcdf 4 backends (pull request by mraspaud, closed) · 10 comments · opened 2023-06-28 · closed 2023-09-11

According to the `open_dataset` documentation, using `chunks="auto"` or `chunks={}` should yield datasets whose variables are chunked according to the preferred chunks of the backend. However, neither the netcdf4 nor the h5netcdf backend seems to implement the `preferred_chunks` encoding attribute needed for this to work.

This PR adds this attribute to the encoding when data is read. As a result, `chunks="auto"` in `open_dataset` returns variables whose chunk sizes are multiples of the chunks in the nc file, and `chunks={}` returns variables with the exact nc chunk sizes.

  • [x] Closes #1440
  • [x] Tests added
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [ ] New functions/methods are listed in api.rst
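A minimal sketch of the chunk-selection behaviour described above. The helper name `choose_chunk` and the target size of 250 are illustrative inventions, not xarray or dask API:

```python
def choose_chunk(preferred, requested, target=250):
    """Illustrate how an on-disk preferred chunk maps to an output chunk.

    preferred -- chunk size stored in the netCDF file
    requested -- {} for exact file chunks, "auto" for dask-sized chunks
    target    -- stand-in for dask's automatic target chunk size (assumption)
    """
    if requested == {}:
        return preferred                               # exact file chunks
    # "auto": a multiple of the file chunk close to the target size
    return max(preferred, (target // preferred) * preferred)

print(choose_chunk(100, {}))       # -> 100 (exact file chunk)
print(choose_chunk(100, "auto"))   # -> 200 (a multiple of 100)
```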

#1743: Assigning data to vector-indexed data doesn't seem to work (issue by mraspaud, closed/completed) · 4 comments · opened 2017-11-28 · closed 2017-12-09

Code Sample

```python
import xarray as xr
import numpy as np
import dask.array as da

arr = np.arange(25).reshape((5, 5))

l_indices = xr.DataArray(np.array(((0, 1), (2, 3))), dims=['lines', 'cols'])
c_indices = xr.DataArray(np.array(((1, 3), (0, 2))), dims=['lines', 'cols'])

xarr = xr.DataArray(da.from_array(arr, chunks=10), dims=['y', 'x'])

print(xarr[l_indices, c_indices])

xarr[l_indices, c_indices] = 2
```

Problem description

This crashes on the last line with `IndexError: Unlabeled multi-dimensional array cannot be used for indexing: [[0 1] [2 3]]`. I expect to be able to do assignment this way, but it doesn't work.

Expected Output

The modified array, with 2s in the indicated positions.
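For comparison, the equivalent fancy-index assignment in plain NumPy (the behaviour the report expects xarray to mirror) works directly:

```python
import numpy as np

arr = np.arange(25).reshape((5, 5))
l_indices = np.array([[0, 1], [2, 3]])
c_indices = np.array([[1, 3], [0, 2]])

# NumPy assigns to the four positions (0, 1), (1, 3), (2, 0) and (3, 2)
arr[l_indices, c_indices] = 2
print(arr[0, 1], arr[1, 3], arr[2, 0], arr[3, 2])  # -> 2 2 2 2
```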

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 2.7.5.final.0 python-bits: 64 OS: Linux OS-release: 3.10.0-693.2.2.el7.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_GB.UTF-8 LOCALE: None.None xarray: 0.10.0 pandas: 0.21.0 numpy: 1.13.3 scipy: 0.18.1 netCDF4: 1.1.8 h5netcdf: 0.4.2 Nio: None bottleneck: None cyordereddict: None dask: 0.15.4 matplotlib: 1.2.0 cartopy: None seaborn: None setuptools: 36.2.1 pip: 9.0.1 conda: None pytest: 3.1.3 IPython: 5.1.0 sphinx: 1.3.6

#3433: Attributes are dropped after `clip` even if `keep_attrs` is True (issue by mraspaud, closed/completed) · 5 comments · opened 2019-10-22 · closed 2020-10-14

MCVE Code Sample

```python
import xarray as xr
import numpy as np

arr = xr.DataArray(np.ones((5, 5)), attrs={'units': 'K'})
xr.set_options(keep_attrs=True)
arr
```

```
<xarray.DataArray (dim_0: 5, dim_1: 5)>
array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])
Dimensions without coordinates: dim_0, dim_1
Attributes:
    units:    K
```

```python
arr.clip(0, 1)
```

```
<xarray.DataArray (dim_0: 5, dim_1: 5)>
array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])
Dimensions without coordinates: dim_0, dim_1
```

Expected Output

I would expect the attributes to be kept.

Problem Description

`keep_attrs` set to `True` does not seem to be respected by the `DataArray.clip` method.
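A manual workaround (a sketch, not the eventual fix) is to copy the attributes over after clipping:

```python
import numpy as np
import xarray as xr

arr = xr.DataArray(np.ones((5, 5)), attrs={'units': 'K'})

clipped = arr.clip(0, 1)
clipped.attrs.update(arr.attrs)   # re-attach the attributes by hand
print(clipped.attrs)              # -> {'units': 'K'}
```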

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.7.3 | packaged by conda-forge | (default, Jul 1 2019, 21:52:21) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: 3.10.0-1062.1.1.el7.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_GB.UTF-8 LOCALE: en_GB.UTF-8 libhdf5: 1.10.5 libnetcdf: 4.6.2 xarray: 0.14.0 pandas: 0.25.1 numpy: 1.17.0 scipy: 1.3.0 netCDF4: 1.5.1.2 pydap: None h5netcdf: 0.7.4 h5py: 2.10.0 Nio: None zarr: 2.3.2 cftime: 1.0.3.4 nc_time_axis: None PseudoNetCDF: None rasterio: 1.0.28 cfgrib: None iris: None bottleneck: None dask: 2.6.0 distributed: 2.6.0 matplotlib: 3.1.1 cartopy: 0.17.0 seaborn: None numbagg: None setuptools: 41.4.0 pip: 19.3 conda: None pytest: 5.0.1 IPython: 7.8.0 sphinx: 2.2.0

#3746: dataarray arithmetics restore removed coordinates in xarray 0.15 (issue by mraspaud, closed/completed) · 5 comments · opened 2020-02-04 · closed 2020-03-21

MCVE Code Sample

```python
import xarray as xr
import numpy as np

arr2 = xr.DataArray(np.ones((2, 2)), dims=['y', 'x'])
arr1 = xr.DataArray(np.ones((2, 2)), dims=['y', 'x'],
                    coords={'y': [0, 1], 'x': [0, 1]})

del arr1.coords['y']
del arr1.coords['x']

# shows arr1 without coordinates
arr1

# shows coordinates in xarray 0.15
arr1 * arr2
```

Expected Output

```
<xarray.DataArray (y: 2, x: 2)>
array([[1., 1.],
       [1., 1.]])
Dimensions without coordinates: y, x
```

Problem Description

In xarray 0.15, the coordinates are restored when doing the multiplication:

```
<xarray.DataArray (y: 2, x: 2)>
array([[1., 1.],
       [1., 1.]])
Coordinates:
  * y        (y) int64 0 1
  * x        (x) int64 0 1
```
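A possible workaround (a sketch, assuming `drop_vars`, available since xarray 0.14) is to drop the unwanted coordinates from the result explicitly:

```python
import numpy as np
import xarray as xr

arr2 = xr.DataArray(np.ones((2, 2)), dims=['y', 'x'])
arr1 = xr.DataArray(np.ones((2, 2)), dims=['y', 'x'],
                    coords={'y': [0, 1], 'x': [0, 1]})
del arr1.coords['y']
del arr1.coords['x']

# Explicitly drop any restored coordinates; errors='ignore' makes this a
# no-op on versions where they are not restored.
result = (arr1 * arr2).drop_vars(['y', 'x'], errors='ignore')
print(result.coords)   # no y/x coordinates
```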

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.8.1 | packaged by conda-forge | (default, Jan 29 2020, 14:55:04) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: 4.18.0-147.0.3.el8_1.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_GB.UTF-8 LOCALE: en_GB.UTF-8 libhdf5: 1.10.5 libnetcdf: 4.7.3 xarray: 0.15.0 pandas: 1.0.0 numpy: 1.18.1 scipy: 1.4.1 netCDF4: 1.5.3 pydap: None h5netcdf: 0.7.4 h5py: 2.10.0 Nio: None zarr: 2.3.2 cftime: 1.0.4.2 nc_time_axis: None PseudoNetCDF: None rasterio: 1.1.2 cfgrib: None iris: None bottleneck: 1.3.1 dask: 2.10.1 distributed: 2.10.0 matplotlib: 3.1.3 cartopy: 0.17.0 seaborn: None numbagg: None setuptools: 45.1.0.post20200119 pip: 20.0.2 conda: None pytest: 5.3.5 IPython: 7.12.0 sphinx: 2.3.1

#3317: Can't create weakrefs on DataArrays since xarray 0.13.0 (issue by mraspaud, assigned to crusaderky, closed/completed) · 8 comments · opened 2019-09-18 · closed 2019-09-18

MCVE Code Sample

```python
import xarray as xr
from weakref import ref

arr = xr.DataArray([1, 2, 3])
ref(arr)
```

Expected Output

I expect the weak reference to be created, as in previous versions.

Problem Description

The above code raises the following exception: `TypeError: cannot create weak reference to 'DataArray' object`
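This matches how `__slots__` interacts with weak references in Python generally: xarray 0.13.0 introduced `__slots__` on its core classes (per its release notes), and a class using `__slots__` only supports weak references if `'__weakref__'` is listed as a slot. A minimal illustration of that Python behaviour (the class names are invented; attributing the bug to a missing `'__weakref__'` slot is an assumption):

```python
from weakref import ref

class NoWeakrefSlots:
    __slots__ = ('data',)              # no '__weakref__' entry

class WeakrefSlots:
    __slots__ = ('data', '__weakref__')

try:
    ref(NoWeakrefSlots())
except TypeError as err:
    print(err)                         # cannot create weak reference ...

wr = ref(WeakrefSlots())               # works
```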

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.7.3 | packaged by conda-forge | (default, Jul 1 2019, 21:52:21) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: 3.10.0-1062.1.1.el7.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_GB.UTF-8 LOCALE: en_GB.UTF-8 libhdf5: 1.10.4 libnetcdf: 4.6.2 xarray: 0.13.0 pandas: 0.25.1 numpy: 1.17.0 scipy: 1.3.0 netCDF4: 1.5.1.2 pydap: None h5netcdf: 0.7.4 h5py: 2.9.0 Nio: None zarr: None cftime: 1.0.3.4 nc_time_axis: None PseudoNetCDF: None rasterio: 1.0.22 cfgrib: None iris: None bottleneck: None dask: 2.3.0 distributed: 2.4.0 matplotlib: 3.1.1 cartopy: 0.17.0 seaborn: None numbagg: None setuptools: 41.2.0 pip: 19.2.3 conda: None pytest: 5.0.1 IPython: 7.8.0 sphinx: None

#2591: Fix h5netcdf saving scalars with filters or chunks (pull request by mraspaud, closed) · 8 comments · opened 2018-12-05 · closed 2018-12-11
  • [x] Closes #2563
  • [x] Tests added (for all bug fixes or enhancements)
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API (remove if this change should not be visible to users, e.g., if it is an internal clean-up, or if this is part of a larger project that will be documented later)

#2563: Scalars from netcdf dataset can't be written with h5netcdf (issue by mraspaud, closed/completed) · 1 comment · opened 2018-11-22 · closed 2018-12-11

Code Sample, a copy-pastable example if possible


```python
import xarray as xr
from netCDF4 import Dataset

def write_netcdf(filename, zlib, least_significant_digit, data, dtype='f4',
                 shuffle=False, contiguous=False, chunksizes=None,
                 complevel=6, fletcher32=False):
    file = Dataset(filename, 'w')
    file.createDimension('n', 1)
    foo = file.createVariable('data', dtype, ('n',), zlib=zlib,
                              least_significant_digit=least_significant_digit,
                              shuffle=shuffle, contiguous=contiguous,
                              complevel=complevel, fletcher32=fletcher32,
                              chunksizes=chunksizes)
    foo[:] = data
    file.close()

write_netcdf("mydatafile.nc", True, None, 0.0, shuffle=True, chunksizes=(1,))

data = xr.open_dataset('mydatafile.nc')
arr = data['data']

arr[0].to_netcdf('mytestfile.nc', mode='w', engine='h5netcdf')
```

Problem description

Since xarray 0.10.4, the above example crashes with `TypeError: Scalar datasets don't support chunk/filter options` (it worked before, hence reporting the error here and not in e.g. h5netcdf).

The problem is that it is no longer possible to squeeze an array that comes from a netCDF file that was compressed or filtered.
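One possible workaround (a sketch, not the fix adopted in the PR above) is to strip the chunk and filter keys from the variable's `.encoding` before writing the scalar; the example below only demonstrates the encoding manipulation in memory, without touching any file:

```python
import numpy as np
import xarray as xr

# Stand-in for a variable read from a compressed netCDF file; the encoding
# keys are set by hand here for illustration.
arr = xr.DataArray(np.array([0.0]), dims=['n'])
arr.encoding.update({'zlib': True, 'shuffle': True, 'chunksizes': (1,)})

scalar = arr[0]
for key in ('chunksizes', 'zlib', 'complevel', 'shuffle', 'fletcher32'):
    scalar.encoding.pop(key, None)
# scalar.to_netcdf(..., engine='h5netcdf') should no longer hit the
# scalar chunk/filter restriction
```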

Expected Output

The creation of the trimmed netCDF file succeeds.

Output of xr.show_versions()

>>> xr.show_versions() INSTALLED VERSIONS ------------------ commit: None python: 3.6.6.final.0 python-bits: 64 OS: Linux OS-release: 3.10.0-957.el7.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_GB.UTF-8 LOCALE: en_GB.UTF-8 xarray: 0.11.0 pandas: 0.23.4 numpy: 1.15.4 scipy: 1.1.0 netCDF4: 1.3.1 h5netcdf: 0.6.2 h5py: 2.8.0 Nio: None zarr: None cftime: None PseudonetCDF: None rasterio: 1.0.2 iris: None bottleneck: 1.2.1 cyordereddict: None dask: 0.20.2 distributed: None matplotlib: 3.0.0 cartopy: 0.16.0 seaborn: None setuptools: 40.5.0 pip: 9.0.3 conda: None pytest: None IPython: 6.2.1 sphinx: 1.8.1

#1560: DataArray.unstack taking unreasonable amounts of memory (issue by mraspaud, closed/completed) · 11 comments · opened 2017-09-07 · closed 2018-08-15

Hi,

While trying to support DataArrays in pyresample, I stumbled upon what seems to be a bug: unstacking a dimension takes an unreasonable amount of memory. For example:

```python
from xarray import DataArray
import numpy as np

arr = DataArray(np.empty([1, 8996, 9223])).stack(flat_dim=['dim_1', 'dim_2'])
print(arr)
arr.unstack('flat_dim')
```

This peaks at about 8 GB of memory (as seen in top), while the array itself shouldn't take more than roughly 635 MB. I know my measuring method is not very accurate, but should it be this way?
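The ~635 MB figure can be sanity-checked from the array shape (float64, 8 bytes per element):

```python
# Dense size of a 1 x 8996 x 9223 float64 array
nbytes = 1 * 8996 * 9223 * 8
print(nbytes)                  # 663760864 bytes
print(round(nbytes / 2**20))   # -> 633 (MiB), roughly the quoted 635 MB
```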

As a side note, the unstacking also takes a very long time. What is going on under the hood?

Martin

Reactions: 👍 1

#1906: Coordinate attributes as DataArray type doesn't export to netcdf (issue by mraspaud, closed/completed) · 5 comments · opened 2018-02-13 · closed 2018-02-26

Code Sample, a copy-pastable example if possible

```python
import numpy as np
import xarray as xr

arr = xr.DataArray([[1, 2, 3]], dims=['time', 'x'])
arr['time'] = np.array([1])
time_bnds = xr.DataArray([0, 1], dims='time_bounds')
arr['time'].attrs['bounds'] = time_bnds

dataset = xr.Dataset({'arr': arr, 'time_bnds': time_bnds})

dataset.to_netcdf('time_bnd.nc')
```

Problem description

This code produces a `TypeError`:

```
Traceback (most recent call last):
  File "test_time_bounds.py", line 12, in <module>
    dataset.to_netcdf('time_bnd.nc')
  File "/home/a001673/.local/lib/python2.7/site-packages/xarray/core/dataset.py", line 1132, in to_netcdf
    unlimited_dims=unlimited_dims)
  File "/home/a001673/.local/lib/python2.7/site-packages/xarray/backends/api.py", line 598, in to_netcdf
    _validate_attrs(dataset)
  File "/home/a001673/.local/lib/python2.7/site-packages/xarray/backends/api.py", line 121, in _validate_attrs
    check_attr(k, v)
  File "/home/a001673/.local/lib/python2.7/site-packages/xarray/backends/api.py", line 112, in check_attr
    'files'.format(value))
TypeError: Invalid value for attr: <xarray.DataArray (time_bounds: 2)>
array([0, 1])
Dimensions without coordinates: time_bounds must be a number string, ndarray or a list/tuple of numbers/strings for serialization to netCDF files
```

This is a problem for me because we need to provide attributes to the coordinate variables and save them to netCDF in order to be CF compliant. There are workarounds (such as saving `time_bnds` as a regular variable and putting its name as an attribute of the `time` variable), but the code above seems to be the most intuitive way to do it.
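The name-as-attribute workaround mentioned above can be sketched like this (attribute validation accepts plain strings, so storing the bounds variable's name passes):

```python
import numpy as np
import xarray as xr

arr = xr.DataArray([[1, 2, 3]], dims=['time', 'x'])
arr['time'] = np.array([1])
time_bnds = xr.DataArray([0, 1], dims='time_bounds')

# Store the *name* of the bounds variable as the attribute (a string),
# and ship the variable itself as a regular dataset variable.
arr['time'].attrs['bounds'] = 'time_bnds'
dataset = xr.Dataset({'arr': arr, 'time_bnds': time_bnds})
# dataset.to_netcdf('time_bnd.nc') now passes attribute validation
```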

Expected output

I would expect output like this (`ncdump -h`):

```
netcdf time_bnd {
dimensions:
        time = 1 ;
        time_bounds = 2 ;
        x = 3 ;
variables:
        int64 time(time) ;
                time:bounds = "time_bnds" ;
        int64 time_bnds(time_bounds) ;
        int64 arr(time, x) ;
}
```

Output of xr.show_versions()

In [2]: xr.show_versions() INSTALLED VERSIONS ------------------ commit: None python: 2.7.5.final.0 python-bits: 64 OS: Linux OS-release: 3.10.0-693.11.6.el7.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_GB.UTF-8 LOCALE: None.None xarray: 0.10.0 pandas: 0.21.0 numpy: 1.13.3 scipy: 0.18.1 netCDF4: 1.1.8 h5netcdf: 0.4.2 Nio: None bottleneck: 1.2.1 cyordereddict: None dask: 0.16.1 matplotlib: 2.1.0 cartopy: None seaborn: None setuptools: 38.4.0 pip: 9.0.1 conda: None pytest: 3.1.3 IPython: 5.5.0 sphinx: 1.6.6

#1842: DataArray read from netcdf with unexpected type (issue by mraspaud, closed/completed) · 1 comment · opened 2018-01-19 · closed 2018-01-23

Code Sample, a copy-pastable example if possible

```python
import numpy as np
import h5netcdf

filename = "mask_and_scale_float32.nc"

with h5netcdf.File(filename, 'w') as f:
    f.dimensions = {'x': 5}
    v = f.create_variable('hello', ('x',), dtype=np.uint16)
    v[:] = np.ones(5, dtype=np.uint16)
    v[0] = np.uint16(65535)
    v.attrs['_FillValue'] = np.uint16(65535)
    v.attrs['scale_factor'] = np.float32(2)
    v.attrs['add_offset'] = np.float32(0.5)

import xarray as xr

v = xr.open_dataset(filename, mask_and_scale=True)['hello']
print(v.dtype)
```

Problem description

Since `scale_factor` and `add_offset` are float32, I would expect the loaded result to be a float32 array. However, we get a float64 array instead. For a very large dataset, a float32 array is better for faster computations.
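A simple user-side workaround is an explicit downcast after loading (a sketch; `v` below is a plain NumPy stand-in for the decoded variable, not the xarray object from the example above):

```python
import numpy as np

# Stand-in for the float64 array the decoder produced
v = np.zeros(5, dtype=np.float64)

# Cast back down to the compact dtype explicitly
v32 = v.astype(np.float32)
print(v32.dtype)   # -> float32
```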

Expected Output

float32

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 2.7.5.final.0 python-bits: 64 OS: Linux OS-release: 3.10.0-693.11.6.el7.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_GB.UTF-8 LOCALE: None.None xarray: 0.10.0 pandas: 0.21.0 numpy: 1.13.3 scipy: 0.18.1 netCDF4: 1.1.8 h5netcdf: 0.4.2 Nio: None bottleneck: None cyordereddict: None dask: 0.16.0+37.g1fef002 matplotlib: 2.1.0 cartopy: None seaborn: None setuptools: 38.2.4 pip: 9.0.1 conda: None pytest: 3.1.3 IPython: 5.5.0 sphinx: 1.3.6
