issues

4 rows where repo = 13221727 and user = 4806678 sorted by updated_at descending

type 2

  • issue 3
  • pull 1

state 2

  • closed 2
  • open 2

repo 1

  • xarray 4
id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1006413760 I_kwDOAMm_X847_KfA 5815 Inconsistent dropping of `DatetimeIndex.freq` attribute from `DataArray`s index. hrzn 4806678 closed 0     7 2021-09-24T12:28:27Z 2023-08-23T13:41:17Z 2023-08-23T13:41:17Z CONTRIBUTOR      

What happened:

Indexing sometimes behaves inconsistently on DataArrays indexed with a DatetimeIndex. There are cases (see code below) where copies of the array obtained with DataArray.copy() lose the index's freq attribute when calling isel(), whereas copies obtained using DataArray.sortby() do not.

What you expected to happen:

I would expect the freq not to be dropped from the axis when indexing the DataArray. Or at least, if it is dropped, it should always be dropped, and not depend on how the DataArray has been obtained (i.e., it should not depend on whether it's a copy() of another array).

Minimal Complete Verifiable Example:

```python
import xarray as xr
import pandas as pd
import numpy as np
from pandas.tseries.frequencies import to_offset

freq = to_offset('MS')
time_index = pd.DatetimeIndex([pd.Timestamp('20130101') + i * freq for i in range(10)])

xa_orig = xr.DataArray(np.random.randn(10), dims=('time'), coords={'time': time_index})

xa_copy = xa_orig.copy()

xa_sorted = xa_orig.sortby('time')

# Set freq on all DataArrays:
xa_orig.get_index('time').freq = freq
xa_copy.get_index('time').freq = freq
xa_sorted.get_index('time').freq = freq

# Print to confirm the freq is set correctly:
print(xa_orig.get_index('time').freq)
print(xa_copy.get_index('time').freq)
print(xa_sorted.get_index('time').freq)
```

Output (OK - as expected):

```
<MonthBegin>
<MonthBegin>
<MonthBegin>
```

Now, try some indexing using isel() on slices:

```python
print(xa_orig.isel({'time': slice(0, 5, None)}).get_index('time').freq)
print(xa_copy.isel({'time': slice(0, 5, None)}).get_index('time').freq)  # freq is dropped!
print(xa_sorted.isel({'time': slice(0, 5, None)}).get_index('time').freq)
```

Output (not as expected: freq is dropped inconsistently, only on the copied DataArray):

```
<MonthBegin>
None
<MonthBegin>
```

Now, try the same using isel() on lists:

```python
print(xa_orig.isel({'time': [0, 1, 2, 3, 4]}).get_index('time').freq)
print(xa_copy.isel({'time': [0, 1, 2, 3, 4]}).get_index('time').freq)
print(xa_sorted.isel({'time': [0, 1, 2, 3, 4]}).get_index('time').freq)
```

Output (not as expected: freq always dropped):

```
None
None
None
```
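One possible workaround, while indexing treats freq inconsistently, is to re-infer the frequency from the index values after the fact. The sketch below uses only pandas. `with_inferred_freq` is a hypothetical helper name, not an xarray or pandas function, and the index rebuilt from raw values merely stands in for an index that has lost its freq.

```python
import pandas as pd


def with_inferred_freq(index: pd.DatetimeIndex) -> pd.DatetimeIndex:
    """Return `index` with its freq re-inferred from the values when it is missing.

    Hypothetical helper for illustration only.
    """
    # pd.infer_freq needs at least three timestamps; leave short indexes alone.
    if index.freq is not None or len(index) < 3:
        return index
    # infer_freq returns None for irregular spacing, which keeps freq unset.
    return pd.DatetimeIndex(index, freq=pd.infer_freq(index))


full = pd.date_range("2013-01-01", periods=10, freq="MS")
# Rebuilding from raw values discards the freq metadata (stand-in for the bug).
lost = pd.DatetimeIndex(full.values[:5], name="time")
restored = with_inferred_freq(lost)
print(lost.freq, restored.freq)
```

The restored index could then be handed back to the array, for example through assign_coords(). Whether xarray retains the freq afterwards is not established by this issue and may depend on the version.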

Anything else we need to know?:

Environment:

Output of xr.show_versions():

INSTALLED VERSIONS
------------------
commit: None
python: 3.8.5 (default, Sep 4 2020, 02:22:02) [Clang 10.0.0 ]
python-bits: 64
OS: Darwin
OS-release: 20.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: (None, 'UTF-8')
libhdf5: None
libnetcdf: None
xarray: 0.19.0
pandas: 1.2.3
numpy: 1.19.5
scipy: 1.6.2
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.4.1
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 49.6.0.post20200814
pip: 21.1.3
conda: None
pytest: None
IPython: 7.22.0
sphinx: 3.2.1
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5815/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1128304139 I_kwDOAMm_X85DQI4L 6256 Indexing a RangeIndexed' DataArray with a RangeIndex returns a deprecated Int64Index hrzn 4806678 open 0     2 2022-02-09T09:55:06Z 2022-02-21T21:28:57Z   CONTRIBUTOR      

What happened?

First, apologies if this is not actually a bug; I'm not sure what the intended behaviour should be. But I find this counter-intuitive.

When indexing a DataArray that is indexed using a RangeIndex, the resulting index is an Int64Index:

```python
my_da.get_index('time')
# RangeIndex(start=0, stop=100, step=1, name='time')

a = my_da.sel({'time': pd.RangeIndex(0,2)})
a.get_index('time')
# Int64Index([0, 1], dtype='int64', name='time')
```

Setting the index to the desired RangeIndex using assign_coords() then works. But I find it a bit problematic that sel() returns an Int64Index even when used with a RangeIndex, especially since Int64Index was recently deprecated in pandas 1.4.

What did you expect to happen?

I would have expected the resulting DataArray to be indexed with the same RangeIndex used in sel().

Minimal Complete Verifiable Example

```python
import xarray as xr
import numpy as np
import pandas as pd

my_da = xr.DataArray(np.random.rand(100,), dims=('time'), coords={'time': pd.RangeIndex(0, 100)})

print(my_da.get_index('time'))
a = my_da.sel({'time': pd.RangeIndex(0,2)})
print(a.get_index('time'))
```

Relevant log output

RangeIndex(start=0, stop=100, step=1, name='time')
Int64Index([0, 1], dtype='int64', name='time')
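The assign_coords() workaround mentioned above needs a RangeIndex to assign. A minimal sketch of how one might rebuild it from the integer index that sel() returns, using pandas and NumPy only. `as_range_index` is a hypothetical helper, not part of xarray or pandas, and the integer index below stands in for the one shown in the log output.

```python
import numpy as np
import pandas as pd


def as_range_index(index: pd.Index) -> pd.Index:
    """Return an equivalent RangeIndex when `index` holds evenly spaced integers.

    Hypothetical helper; the input is returned unchanged when no RangeIndex
    can represent it.
    """
    if isinstance(index, pd.RangeIndex) or len(index) == 0:
        return index
    if not pd.api.types.is_integer_dtype(index.dtype):
        return index
    values = index.to_numpy()
    if len(values) == 1:
        start = int(values[0])
        return pd.RangeIndex(start, start + 1, name=index.name)
    steps = np.diff(values)
    step = int(steps[0])
    # A RangeIndex needs one constant, non-zero step.
    if step == 0 or not (steps == step).all():
        return index
    return pd.RangeIndex(int(values[0]), int(values[-1]) + step, step, name=index.name)


selected = pd.Index([0, 1], name="time")  # stands in for the index returned by sel()
print(as_range_index(selected))
```

Returning the input untouched for irregular or non-integer indexes keeps the helper safe to apply unconditionally.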

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS

commit: None
python: 3.8.5 (default, Sep 4 2020, 02:22:02) [Clang 10.0.0 ]
python-bits: 64
OS: Darwin
OS-release: 20.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: (None, 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 0.20.2
pandas: 1.4.0
numpy: 1.22.1
scipy: 1.7.3
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.5.1
cartopy: None
seaborn: None
numbagg: None
fsspec: 2021.11.1
cupy: None
pint: None
sparse: None
setuptools: 59.5.0
pip: 21.3.1
conda: None
pytest: 6.2.5
IPython: 8.0.1
sphinx: 4.3.2

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6256/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1007004661 PR_kwDOAMm_X84sQ72v 5819 Add Darts to ecosystem file hrzn 4806678 closed 0     3 2021-09-25T07:16:37Z 2021-09-27T15:55:51Z 2021-09-25T16:45:49Z CONTRIBUTOR   0 pydata/xarray/pulls/5819

Darts recently started using DataArray as the core data structure underpinning its TimeSeries class, which represents all time series in the library. It's working great for us, so I also want to take the chance to thank the xarray community here!

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5819/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
919798222 MDU6SXNzdWU5MTk3OTgyMjI= 5466 `DataArray.sortby()` discards `freq` component of `DatetimeIndex` hrzn 4806678 open 0     0 2021-06-13T13:22:04Z 2021-06-13T13:22:20Z   CONTRIBUTOR      

What happened:

Calling sortby() on a DataArray indexed with a pd.DatetimeIndex discards the underlying freq component of the pd.DatetimeIndex.

What you expected to happen:

The freq component should probably be kept.

Minimal Complete Verifiable Example:

```python
import xarray as xr
import numpy as np
import pandas as pd

ar = xr.DataArray(np.random.randn(4, 2), dims=('time', 'component'), coords={'time': pd.date_range('20130101', '20130104'), 'component': ['a', 'b']})

print(ar.get_index('time'))

ar_sorted = ar.sortby('time')

print(ar_sorted.get_index('time'))
```

Output for the above snippet (note how the freq becomes None):

DatetimeIndex(['2013-01-01', '2013-01-02', '2013-01-03', '2013-01-04'], dtype='datetime64[ns]', name='time', freq='D')
DatetimeIndex(['2013-01-01', '2013-01-02', '2013-01-03', '2013-01-04'], dtype='datetime64[ns]', name='time', freq=None)
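The two indexes in this output hold the same timestamps and differ only in the freq metadata, which makes the loss easy to miss. A small pandas-only sketch of a check for that situation follows. `freq_was_dropped` is a hypothetical name, and the second index is rebuilt from raw values to imitate the sorted result rather than produced by sortby() itself.

```python
import pandas as pd


def freq_was_dropped(before: pd.DatetimeIndex, after: pd.DatetimeIndex) -> bool:
    """True when both indexes hold the same timestamps but only `before` has a freq.

    Hypothetical helper for illustration only.
    """
    return bool(before.equals(after)) and before.freq is not None and after.freq is None


before = pd.date_range("20130101", "20130104")  # daily freq, as in the issue
# Rebuilding from raw values keeps the timestamps but not the freq metadata.
after = pd.DatetimeIndex(before.values, name=before.name)
print(freq_was_dropped(before, after))
```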

Anything else we need to know?:

Environment:

Output of xr.show_versions():

INSTALLED VERSIONS
------------------
commit: None
python: 3.8.10 (default, May 19 2021, 11:01:55) [Clang 10.0.0 ]
python-bits: 64
OS: Darwin
OS-release: 20.3.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: (None, 'UTF-8')
libhdf5: 1.10.6
libnetcdf: 4.7.4
xarray: 0.18.2
pandas: 1.2.3
numpy: 1.19.5
scipy: 1.6.2
netCDF4: 1.5.6
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.5.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2021.05.1
distributed: 2021.05.1
matplotlib: 3.4.1
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 52.0.0.post20210125
pip: 21.1.1
conda: None
pytest: None
IPython: 7.22.0
sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5466/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
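
The row filter stated at the top of this page (repo = 13221727 and user = 4806678, sorted by updated_at descending) can be reproduced against this schema. A minimal sketch using Python's built-in sqlite3 module, assuming an in-memory database, a reduced set of columns, and only the four rows listed on this page. The function names are illustrative.

```python
import sqlite3

# Reduced copy of the [issues] schema: only the columns the query touches.
SCHEMA = """
CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [number] INTEGER,
   [user] INTEGER,
   [state] TEXT,
   [updated_at] TEXT,
   [repo] INTEGER,
   [type] TEXT
);
CREATE INDEX [idx_issues_repo] ON [issues] ([repo]);
CREATE INDEX [idx_issues_user] ON [issues] ([user]);
"""

# (id, number, user, state, updated_at, repo, type), copied from the rows above.
ROWS = [
    (1006413760, 5815, 4806678, "closed", "2023-08-23T13:41:17Z", 13221727, "issue"),
    (1128304139, 6256, 4806678, "open", "2022-02-21T21:28:57Z", 13221727, "issue"),
    (1007004661, 5819, 4806678, "closed", "2021-09-27T15:55:51Z", 13221727, "pull"),
    (919798222, 5466, 4806678, "open", "2021-06-13T13:22:20Z", 13221727, "issue"),
]


def build_database() -> sqlite3.Connection:
    """Create an in-memory database holding the four rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO [issues] VALUES (?, ?, ?, ?, ?, ?, ?)", ROWS)
    return conn


def issues_for(conn: sqlite3.Connection, repo: int, user: int) -> list:
    """Return (number, state, type) tuples, most recently updated first."""
    # ISO 8601 timestamps in one timezone sort correctly as plain text.
    query = (
        "SELECT [number], [state], [type] FROM [issues] "
        "WHERE [repo] = ? AND [user] = ? ORDER BY [updated_at] DESC"
    )
    return conn.execute(query, (repo, user)).fetchall()


conn = build_database()
for row in issues_for(conn, repo=13221727, user=4806678):
    print(row)
```

Passing the repo and user ids as bound parameters rather than formatting them into the SQL string keeps the query reusable for other filters.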