issues: 372006204

id: 372006204
node_id: MDU6SXNzdWUzNzIwMDYyMDQ=
number: 2496
title: Incorrect conversion from sliced pd.MultiIndex
user: 1882397
state: closed
locked: 0
comments: 2
created_at: 2018-10-19T15:25:38Z
updated_at: 2019-02-19T09:42:52Z
closed_at: 2019-02-19T09:42:51Z
author_association: NONE

If we take a pandas DataFrame with a MultiIndex, slice it to remove some entries from the index, and then convert it, the resulting DataArray still contains the removed items in its coordinates (although the corresponding values are NaN).

```python
import numpy as np
import pandas as pd
import xarray as xr

# We create an example dataframe
idx = pd.MultiIndex.from_product([list('abc'), list('xyz')])
df = pd.DataFrame(data={'col': np.random.randn(len(idx))}, index=idx)
df.columns.name = 'cols'
df.index.names = ['idx1', 'idx2']
df2 = df.loc[['a', 'b']]
```

```python
# df2 does not contain `c` in the first level
>>> df2
cols            col
idx1 idx2
a    x    -0.844476
     y    -0.845998
     z     1.965143
b    x    -0.159293
     y     0.188163
     z    -1.076204
```

It still shows up in the converted xarray though:

```python
>>> xr.DataArray(df2).unstack('dim_0')
<xarray.DataArray (cols: 1, idx1: 3, idx2: 3)>
array([[[-0.844476, -0.845998,  1.965143],
        [-0.159293,  0.188163, -1.076204],
        [      nan,       nan,       nan]]])
Coordinates:
  * cols     (cols) object 'col'
  * idx1     (idx1) object 'a' 'b' 'c'
  * idx2     (idx2) object 'x' 'y' 'z'
```

If the original dataframe is very sparse, this can lead to enormous and unnecessary memory usage, because the unstacked array is padded with NaN for every label that was sliced away.
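A possible workaround, sketched here with pandas only, is to drop the stale level values with `MultiIndex.remove_unused_levels()` before converting. The helper `dense_cells` and the names `df3`, `cells_before` and `cells_after` are illustrative and not part of the report. The sketch assumes the padding comes from the level values that the sliced index still stores, and it uses deterministic data instead of `np.random.randn`.

```python
import pandas as pd

# Same index layout as the report, with deterministic values
idx = pd.MultiIndex.from_product([list('abc'), list('xyz')],
                                 names=['idx1', 'idx2'])
df = pd.DataFrame({'col': [float(i) for i in range(len(idx))]}, index=idx)
df2 = df.loc[['a', 'b']]


def dense_cells(index):
    """Number of cells in a dense array spanning every stored level value."""
    cells = 1
    for level in index.levels:
        cells *= len(level)
    return cells


# Drop level values that no remaining row refers to
df3 = df2.copy()
df3.index = df2.index.remove_unused_levels()

rows = len(df3)                        # rows actually present
cells_before = dense_cells(df2.index)  # may still count the unused 'c'
cells_after = dense_cells(df3.index)   # counts only 'a' and 'b'
first_level = list(df3.index.levels[0])
```

Passing `df3` instead of `df2` to `xr.DataArray(...).unstack('dim_0')` should then yield an `idx1` coordinate holding only the labels that are still in use.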

#### Output of ``xr.show_versions()``

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Darwin
OS-release: 17.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_GB.UTF-8
LANG: None
LOCALE: en_GB.UTF-8

xarray: 0.10.9
pandas: 0.23.4
numpy: 1.15.2
scipy: 1.1.0
netCDF4: 1.4.1
h5netcdf: 0.6.2
h5py: 2.8.0
Nio: None
zarr: None
cftime: 1.0.0b1
PseudonetCDF: None
rasterio: None
iris: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.19.2
distributed: 1.23.2
matplotlib: 3.0.0
cartopy: None
seaborn: 0.9.0
setuptools: 40.4.3
pip: 18.0
conda: 4.5.11
pytest: 3.8.1
IPython: 7.0.1
sphinx: 1.8.1
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2496/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
repo: 13221727
type: issue
