issues

2 rows where state = "closed", type = "issue" and user = 1053153 sorted by updated_at descending

id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
621451930 MDU6SXNzdWU2MjE0NTE5MzA= 4084 write/read to zarr subtly changes array with non-dim coord chrisroat 1053153 closed 0     1 2020-05-20T04:34:42Z 2022-01-27T21:46:58Z 2022-01-27T21:46:58Z CONTRIBUTOR      

With an array containing a non-dimension coordinate, I can do a "where + squeeze" selection and write the array to zarr without problems. However, if I write the array out and read it back in before doing the selection, the subsequent write fails.

The problem may stem from the non-dimension coordinate being read back as a dask array (which I can see by printing the coordinate).

The problem goes away if the condition on this line is altered to check for empty tuples in addition to None: `if (var_chunks is None or not len(var_chunks)) and (enc_chunks is None or not len(enc_chunks)):`

However, I'm not sure if the subtle change in the array will cause other issues, so I don't know if the above modification is a band-aid or a real solution.
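To make the difference concrete, here is a minimal standalone sketch of the two conditions. The function names `original_check` and `proposed_check` are hypothetical, and the form of the existing condition (a plain `None` comparison) is an assumption inferred from the description above, not a copy of the xarray source. It also assumes that the squeezed, dask-backed coordinate reports its chunks as an empty tuple.

```python
def original_check(var_chunks, enc_chunks):
    # Assumed form of the existing condition: only None means "no chunk info".
    return var_chunks is None and enc_chunks is None


def proposed_check(var_chunks, enc_chunks):
    # The condition quoted above: an empty tuple is treated like None.
    # Assumes each argument is None or a sized sequence; an integer
    # enc_chunks would make len() raise a TypeError here.
    return (var_chunks is None or not len(var_chunks)) and (
        enc_chunks is None or not len(enc_chunks)
    )


# Hypothetical inputs: () stands in for the chunks of a zero-dimensional
# chunked variable, ((2,),) for a one-dimensional variable with one chunk.
cases = [(None, None), ((), None), ((), ()), (((2,),), None)]
results = [(original_check(v, e), proposed_check(v, e)) for v, e in cases]

for (v, e), (old, new) in zip(cases, results):
    print(f"var_chunks={v!r} enc_chunks={e!r} original={old} proposed={new}")
```

The two checks disagree only when one of the arguments is an empty tuple, which is the case the modification targets. Whether treating `()` as "no chunks" is safe elsewhere in the write path is the open question raised above.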

MCVE Code Sample

```python
import xarray as xr
import numpy as np
import dask.array as da


def create():
    image = da.zeros((2,2))
    return xr.DataArray(image, dims=['y', 'x'], coords={'x': [0, 1], 'y': [0, 1], 'xname': ('x', ['apple', 'banana'])})


def select_and_write(arr, fname):
    arr = arr.where(arr.coords['xname'] == 'apple', drop=True)
    arr = arr.squeeze('x')
    arr.to_dataset(name='foo').to_zarr(fname, mode='w')


def ok():
    print('ok')
    arr = create()
    select_and_write(arr, '/tmp/ok.zarr')


def error():
    print('error')
    arr = create()
    arr.to_dataset(name='foo').to_zarr('/tmp/error_intermediate.zarr', mode='w')
    arr2 = xr.open_zarr('/tmp/error_intermediate.zarr')['foo']
    select_and_write(arr2, '/tmp/error.zarr')


ok()
error()
```

Expected Output

No stacktrace.

Problem Description

Stacktrace

```
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-0db8ba1ef466> in <module>
     27 
     28 ok()
---> 29 error()

<ipython-input-1-0db8ba1ef466> in error()
     24     arr.to_dataset(name='foo').to_zarr('/tmp/error_intermediate.zarr', mode='w')
     25     arr2 = xr.open_zarr('/tmp/error_intermediate.zarr')['foo']
---> 26     select_and_write(arr2, '/tmp/error.zarr')
     27 
     28 ok()

<ipython-input-1-0db8ba1ef466> in select_and_write(arr, fname)
     11     arr = arr.where(arr.coords['xname'] == 'apple', drop=True)
     12     arr = arr.squeeze('x')
---> 13     arr.to_dataset(name='foo').to_zarr(fname, mode='w')
     14 
     15 

~/.local/share/virtualenvs/starmap2-kOR7I2hi/lib/python3.7/site-packages/xarray/core/dataset.py in to_zarr(self, store, mode, synchronizer, group, encoding, compute, consolidated, append_dim)
   1632             compute=compute,
   1633             consolidated=consolidated,
-> 1634             append_dim=append_dim,
   1635         )
   1636 

~/.local/share/virtualenvs/starmap2-kOR7I2hi/lib/python3.7/site-packages/xarray/backends/api.py in to_zarr(dataset, store, mode, synchronizer, group, encoding, compute, consolidated, append_dim)
   1341     writer = ArrayWriter()
   1342     # TODO: figure out how to properly handle unlimited_dims
-> 1343     dump_to_store(dataset, zstore, writer, encoding=encoding)
   1344     writes = writer.sync(compute=compute)
   1345 

~/.local/share/virtualenvs/starmap2-kOR7I2hi/lib/python3.7/site-packages/xarray/backends/api.py in dump_to_store(dataset, store, writer, encoder, encoding, unlimited_dims)
   1133         variables, attrs = encoder(variables, attrs)
   1134 
-> 1135     store.store(variables, attrs, check_encoding, writer, unlimited_dims=unlimited_dims)
   1136 
   1137 

~/.local/share/virtualenvs/starmap2-kOR7I2hi/lib/python3.7/site-packages/xarray/backends/zarr.py in store(self, variables, attributes, check_encoding_set, writer, unlimited_dims)
    385         self.set_dimensions(variables_encoded, unlimited_dims=unlimited_dims)
    386         self.set_variables(
--> 387             variables_encoded, check_encoding_set, writer, unlimited_dims=unlimited_dims
    388         )
    389 

~/.local/share/virtualenvs/starmap2-kOR7I2hi/lib/python3.7/site-packages/xarray/backends/zarr.py in set_variables(self, variables, check_encoding_set, writer, unlimited_dims)
    434             else:
    435                 # new variable
--> 436                 encoding = _extract_zarr_variable_encoding(v, raise_on_invalid=check)
    437                 encoded_attrs = {}
    438                 # the magic for storing the hidden dimension data

~/.local/share/virtualenvs/starmap2-kOR7I2hi/lib/python3.7/site-packages/xarray/backends/zarr.py in _extract_zarr_variable_encoding(variable, raise_on_invalid)
    188 
    189     chunks = _determine_zarr_chunks(
--> 190         encoding.get("chunks"), variable.chunks, variable.ndim
    191     )
    192     encoding["chunks"] = chunks

~/.local/share/virtualenvs/starmap2-kOR7I2hi/lib/python3.7/site-packages/xarray/backends/zarr.py in _determine_zarr_chunks(enc_chunks, var_chunks, ndim)
    108     if len(enc_chunks_tuple) != ndim:
    109         # throw away encoding chunks, start over
--> 110         return _determine_zarr_chunks(None, var_chunks, ndim)
    111 
    112     for x in enc_chunks_tuple:

~/.local/share/virtualenvs/starmap2-kOR7I2hi/lib/python3.7/site-packages/xarray/backends/zarr.py in _determine_zarr_chunks(enc_chunks, var_chunks, ndim)
    104         enc_chunks_tuple = ndim * (enc_chunks,)
    105     else:
--> 106         enc_chunks_tuple = tuple(enc_chunks)
    107 
    108     if len(enc_chunks_tuple) != ndim:

TypeError: 'NoneType' object is not iterable
```

Versions

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.5 (default, Nov 7 2019, 10:50:52) [GCC 8.3.0]
python-bits: 64
OS: Linux
OS-release: 5.3.0-51-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: None

xarray: 0.15.1
pandas: 1.0.3
numpy: 1.18.4
scipy: 1.4.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: 2.4.0
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.15.0
distributed: 2.15.2
matplotlib: 3.2.1
cartopy: None
seaborn: 0.10.1
numbagg: None
setuptools: 46.1.3
pip: 20.0.2
conda: None
pytest: 5.4.1
IPython: 7.14.0
sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4084/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
490476815 MDU6SXNzdWU0OTA0NzY4MTU= 3287 GroupBy of stacked dim with strings renames underlying dims chrisroat 1053153 closed 0     7 2019-09-06T18:59:47Z 2020-03-31T16:10:10Z 2020-03-31T16:10:10Z CONTRIBUTOR      

Names for dimensions are lost (renamed) when they are stacked and grouped, if one of the dimensions has string coordinates.

```python
import numpy as np
import xarray as xr

data = np.zeros((2,1,1))
dims = ['c', 'y', 'x']

d1 = xr.DataArray(data, dims=dims)
g1 = d1.stack(f=['c', 'x']).groupby('f').first()
print('Expected dim names:')
print(g1.coords)
print()

d2 = xr.DataArray(data, dims=dims, coords={'c': ['R', 'G']})
g2 = d2.stack(f=['c', 'x']).groupby('f').first()
print('Unexpected dim names:')
print(g2.coords)
```

Output

In the second part of the output below, 'f_level_0' and 'f_level_1' are expected to be 'c' and 'x', respectively.

```
Expected dim names:
Coordinates:
  * f        (f) MultiIndex
  - c        (f) int64 0 1
  - x        (f) int64 0 0

Unexpected dim names:
Coordinates:
  * f          (f) MultiIndex
  - f_level_0  (f) object 'G' 'R'
  - f_level_1  (f) int64 0 0
```

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.4 (default, Jul 9 2019, 18:13:23) [Clang 10.0.1 (clang-1001.0.46.4)]
python-bits: 64
OS: Darwin
OS-release: 18.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.2
libnetcdf: 4.6.3

xarray: 0.12.3
pandas: 0.25.1
numpy: 1.17.1
scipy: 1.3.1
netCDF4: 1.5.2
pydap: None
h5netcdf: None
h5py: 2.9.0
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.1.1
cartopy: None
seaborn: None
numbagg: None
setuptools: 41.2.0
pip: 19.2.3
conda: None
pytest: None
IPython: 7.8.0
sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3287/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
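
The filter described at the top of this page (state = "closed", type = "issue", user = 1053153, sorted by updated_at descending) can be reproduced against this schema with a query along the following lines. This is only a sketch: it uses an in-memory SQLite database with a reduced subset of the columns above and the two rows shown on this page, and the SQL that Datasette actually generates may differ.

```python
import sqlite3

# In-memory database with a reduced, illustrative subset of the issues columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE [issues] (
       [id] INTEGER PRIMARY KEY,
       [number] INTEGER,
       [title] TEXT,
       [user] INTEGER,
       [state] TEXT,
       [updated_at] TEXT,
       [type] TEXT
    )
    """
)

# The two rows listed on this page, restricted to the columns above.
rows = [
    (490476815, 3287, "GroupBy of stacked dim with strings renames underlying dims",
     1053153, "closed", "2020-03-31T16:10:10Z", "issue"),
    (621451930, 4084, "write/read to zarr subtly changes array with non-dim coord",
     1053153, "closed", "2022-01-27T21:46:58Z", "issue"),
]
conn.executemany("INSERT INTO issues VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# ISO 8601 timestamps stored as TEXT sort chronologically as strings.
query = """
    SELECT number, title, updated_at
    FROM issues
    WHERE state = :state AND type = :type AND [user] = :user
    ORDER BY updated_at DESC
"""
result = conn.execute(
    query, {"state": "closed", "type": "issue", "user": 1053153}
).fetchall()

for number, title, updated_at in result:
    print(number, updated_at, title)
```

Named parameters keep the filter values separate from the SQL text, so the same query can be reused for other users or states.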