pydata/xarray · issue #4607 (open)

set_index(..., append=True) act as with append=False with 'Dimensions without coordinates'

Opened by ghiggi · created 2020-11-24T17:59:49Z · updated 2020-11-24T19:37:04Z · 0 comments · author association: NONE

What happened:

I ran into this strange behaviour while trying to recreate a stacked (MultiIndex) coordinate using set_index(..., append=True).

Since it is not possible to save a Dataset containing stacked / MultiIndex coordinates to netCDF or Zarr, I call reset_index(<stacked_coordinate>) before writing to disk. When reading such data back, I need set_index(..., append=True) to recreate the stacked coordinate.
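The round-trip described above can be sketched with a plain `stack()`-created MultiIndex (a minimal sketch; the names `z`, `x`, `y`, and `var` are illustrative, not from the report):

```python
import numpy as np
import pandas as pd
import xarray as xr

# A small Dataset with two ordinary dimension coordinates.
ds = xr.Dataset(
    {"var": (("x", "y"), np.arange(6).reshape(2, 3))},
    coords={"x": [10, 20], "y": [1, 2, 3]},
)

# Stack x and y into a MultiIndex dimension z.
stacked = ds.stack(z=("x", "y"))

# Drop the MultiIndex before writing to netCDF/Zarr:
# x and y survive as plain coordinates along z.
flat = stacked.reset_index("z")

# After reading back, rebuild the MultiIndex from its level coordinates.
restored = flat.set_index(z=["x", "y"])
assert isinstance(restored.indexes["z"], pd.MultiIndex)
```

This full-rebuild form (listing every level in one `set_index` call) works; the issue below is about the `append=True` variant of the same round trip.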

What you expected to happen:

I would expect set_index(..., append=True) to recreate the MultiIndex stacked coordinate. However, this does not happen when the dimension coordinate passed to set_index() is a 'dimension without coordinates'. In that situation, set_index(..., append=True) behaves like set_index(..., append=False).
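For comparison, the expected semantics of `append=True` match pandas' `DataFrame.set_index`, where appending to an existing index always yields a MultiIndex (a minimal sketch; the frame and column names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "v": [5, 6]})

# append=False replaces the index; append=True stacks onto it,
# producing a two-level MultiIndex (a, b).
df2 = df.set_index("a").set_index("b", append=True)
assert isinstance(df2.index, pd.MultiIndex)
assert list(df2.index.names) == ["a", "b"]
```

The report is that xarray silently falls back to the replace behaviour instead of the append behaviour when the target dimension has no coordinate.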

Minimal Complete Verifiable Example:

```python
import xarray as xr
import numpy as np

# Create Datasets
arr1 = np.random.rand(4, 5)
arr2 = np.random.rand(4, 5)
da1 = xr.DataArray(arr1, dims=['nodes', 'time'],
                   coords={"time": [1, 2, 3, 4, 5], "nodes": [1, 2, 3, 4]},
                   name='var1')
da2 = xr.DataArray(arr2, dims=['nodes', 'time'],
                   coords={"time": [1, 2, 3, 4, 5], "nodes": [1, 2, 3, 4]},
                   name='var2')
ds_unstacked = xr.Dataset({'var1': da1, 'var2': da2})
print(ds_unstacked)

# Stack variables across a new dimension
da_stacked = ds_unstacked.to_stacked_array(new_dim="variables",
                                           variable_dim='variable',
                                           sample_dims=['nodes', 'time'],
                                           name="Stacked_Variables")
ds_stacked = da_stacked.to_dataset()

# Look at the stacked MultiIndex coordinate 'variables'
print(ds_stacked)
print(da_stacked.variables.indexes)

# Remove the MultiIndex (to save the Dataset to netCDF/Zarr, ...)
ds_stacked_disk = ds_stacked.reset_index('variables')
print(ds_stacked_disk)

# Try to recreate the MultiIndex
print(ds_stacked_disk.set_index(variables=['variable'], append=False))  # GOOD! Replaces the 'variable' coordinate with 'variables'
print(ds_stacked_disk.set_index(variables=['variable'], append=True))   # BUG! Does not create the expected MultiIndex!

# Current workaround to obtain a MultiIndex stacked coordinate
tmp_ds = ds_stacked_disk.assign_coords(variables=np.arange(0, 2))
ds_stacked1 = tmp_ds.set_index(variables=['variable'], append=True)
print(ds_stacked1)  # But with a level 0 named 'variables_level_0'

# Unstack back
# If the BUG is solved, no need to specify the level argument
ds_stacked1['Stacked_Variables'].to_unstacked_dataset(dim='variables', level='variable')
```

Environment:

Output of `xr.show_versions()`:

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.5 | packaged by conda-forge | (default, Sep 24 2020, 16:55:52) [GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-48-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4

xarray: 0.16.1
pandas: 1.1.2
numpy: 1.19.1
scipy: 1.5.2
netCDF4: 1.5.4
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: 2.5.0
cftime: 1.2.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: 0.9.8.4
iris: None
bottleneck: 1.3.2
dask: 2.27.0
distributed: 2.27.0
matplotlib: 3.3.2
cartopy: 0.18.0
seaborn: None
numbagg: None
pint: None
setuptools: 49.6.0.post20200917
pip: 20.2.3
conda: None
pytest: None
IPython: 7.18.1
sphinx: 3.2.1
```
