issues

1 row where user = 22961670 sorted by updated_at descending

id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
2228373305 I_kwDOAMm_X86E0kc5 8915 Weird behavior of DataSet.where(... , drop=True) johannespletzer 22961670 closed 0     4 2024-04-05T16:03:05Z 2024-04-08T09:32:48Z 2024-04-08T09:32:48Z NONE      

What happened?

I work with an aircraft emission dataset that is freely available online (the Zenodo file downloaded in the example below).

During my calculations I eventually convert the Dataset to a DataFrame, and I wanted to avoid unnecessary rows in that DataFrame. My code then returned unexpected results. I eventually narrowed this down to a Dataset.where(..., drop=True) call I had added along the way, which introduces differences in the data. Here are two examples:

Example 1: Along some dimensions data points vanished if drop=True

Example 2: For other dimensions (these?) data points appeared elsewhere if drop=True
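For context, the workflow described above (mask, drop, convert to a DataFrame) can be sketched on a tiny synthetic Dataset. This is only an illustration under assumptions: the variable name `H2O` and the `lat`/`lon` dimensions mirror the real file, but the values and grid are made up. It shows that `drop=True` only trims coordinate labels where every value is masked, so the resulting DataFrame can still contain NaN rows.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for the emission file: a 3 x 2 grid, mostly zeros.
ds = xr.Dataset(
    {"H2O": (("lat", "lon"), np.array([[0.0, 1.5], [0.0, 0.0], [2.5, 0.0]]))},
    coords={"lat": [-10.0, 0.0, 10.0], "lon": [0.0, 5.0]},
)

# drop=True removes lat=0.0 (all zeros there), but keeps both lon labels,
# because each lon still has at least one non-zero value.
masked = ds.where(ds.H2O != 0, drop=True)
df_masked = masked.to_dataframe()

# The remaining masked cells are NaN rows; they have to be removed explicitly.
df_nonzero = df_masked.dropna(subset=["H2O"])

# Alternative without where(): convert first, then filter the rows.
df_full = ds.to_dataframe()
df_filtered = df_full[df_full["H2O"] != 0]

print(len(df_full), len(df_masked), len(df_nonzero), len(df_filtered))
```

In this sketch both routes end with the same two non-zero rows, so filtering after the conversion may be a simpler way to avoid unnecessary rows.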

What did you expect to happen?

I expect my calculations to return the same results, regardless of whether drop=True is set.
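This expectation can be checked without the download or any plotting. The sketch below uses a small synthetic array (the name `H2O`, the grid, and the values are assumptions, not data from the real file) and compares the two sums on the coordinate labels they share, since `drop=True` returns a shorter `lat` axis.

```python
import numpy as np
import xarray as xr

# Hypothetical (time, lat, lon) array with only two non-zero cells.
data = np.zeros((2, 4, 5))
data[0, 1, 2] = 1.0
data[1, 2, 3] = 2.0
da = xr.DataArray(
    data,
    dims=("time", "lat", "lon"),
    coords={
        "time": [0, 1],
        "lat": [-30.0, 0.0, 30.0, 60.0],
        "lon": [0.0, 10.0, 20.0, 30.0, 40.0],
    },
    name="H2O",
)

# Same reduction as in the report, once with and once without drop.
kept = da.where(da != 0, drop=False).sum(("lon", "time"))
dropped = da.where(da != 0, drop=True).sum(("lon", "time"))

# drop=True trims lat from 4 labels to the 2 that hold non-zero data.
print(kept.sizes["lat"], dropped.sizes["lat"])

# Compare on the shared labels, and compare the grand totals.
same_on_shared = np.allclose(kept.sel(lat=dropped.lat.values).values, dropped.values)
same_total = float(kept.sum()) == float(dropped.sum())
print(same_on_shared, same_total)
```

If the two results agree on the shared labels, as they do in this sketch, a visible difference between the plots would more likely come from the different coordinate ranges handed to the plotting call than from the summed values themselves. That is a guess from synthetic data, not a confirmed diagnosis for the real file.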

Minimal Complete Verifiable Example

```Python
!wget "https://zenodo.org/records/10818082/files/Emission_Inventory_H2O_Optimized_v0.1_MR3_Fleet_BRU-MYA_2075.nc"

import matplotlib.pyplot as plt
import xarray as xr

nc_file = xr.open_dataset('Emission_Inventory_H2O_Optimized_v0.1_MR3_Fleet_BRU-MYA_2075.nc')

# Sum over lon and time, plotted against lat
fig, axs = plt.subplots(1,2,figsize=(10,4))

nc_file.H2O.where(nc_file.H2O!=0, drop=True).sum(('lon','time')).plot.contour(x='lat',ax=axs[0])
axs[0].set_xlim(-50,90)
axs[0].set_title('With drop=True')

nc_file.H2O.where(nc_file.H2O!=0, drop=False).sum(('lon','time')).plot.contour(x='lat',ax=axs[1])
axs[1].set_xlim(-50,90)
axs[1].set_title('With drop=False')

plt.tight_layout()
plt.show()

# Sum over lat and time, plotted against lon
fig, axs = plt.subplots(1,2,figsize=(10,4))

nc_file.H2O.where(nc_file.H2O!=0, drop=True).sum(('lat','time')).plot.contour(x='lon',ax=axs[0])
axs[0].set_title('With drop=True')

nc_file.H2O.where(nc_file.H2O!=0, drop=False).sum(('lat','time')).plot.contour(x='lon',ax=axs[1])
axs[1].set_title('With drop=False')

plt.tight_layout()
plt.show()
```

MCVE confirmation

  • [ ] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [ ] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
  • [ ] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

No response

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS
------------------
commit: None
python: 3.10.9 | packaged by Anaconda, Inc. | (main, Mar 1 2023, 18:18:15) [MSC v.1916 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 165 Stepping 2, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: ('en_US', 'ISO8859-1')
libhdf5: 1.14.0
libnetcdf: 4.9.2

xarray: 2022.11.0
pandas: 1.5.3
numpy: 1.23.5
scipy: 1.13.0
netCDF4: 1.6.5
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.3
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.5
dask: None
distributed: None
matplotlib: 3.7.0
cartopy: 0.21.1
seaborn: 0.12.2
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 65.6.3
pip: 22.3.1
conda: None
pytest: None
IPython: 8.10.0
sphinx: None

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8915/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);