issues

13 rows where comments = 1, state = "open" and user = 2448579 sorted by updated_at descending

type 2

  • issue 9
  • pull 4

state 1

  • open 13

repo 1

  • xarray 13
id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
2278499376 PR_kwDOAMm_X85uhFke 8997 Zarr: Optimize `region="auto"` detection dcherian 2448579 open 0     1 2024-05-03T22:13:18Z 2024-05-04T21:47:39Z   MEMBER   0 pydata/xarray/pulls/8997
  1. This moves the region detection code into ZarrStore so we only open the store once.
  2. Instead of opening the store as a dataset, construct a pd.Index directly to "auto"-infer the region.

The diff is large mostly because a bunch of code moved from backends/api.py to backends/zarr.py
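
The idea in item 2 can be sketched without xarray or pandas. In the sketch below a plain dict stands in for the `pd.Index` lookup, and `infer_region` is a hypothetical helper name, not a function from the PR; the real code would also have to deal with duplicate coordinate values and multiple dimensions.

```python
from typing import Hashable, Sequence


def infer_region(existing: Sequence[Hashable], new: Sequence[Hashable]) -> slice:
    """Hypothetical helper: locate `new` as one contiguous block inside `existing`."""
    if not new:
        raise ValueError("no values to write")
    # Value -> position lookup, standing in for an index built from the stored coordinate.
    position = {value: i for i, value in enumerate(existing)}
    try:
        found = [position[value] for value in new]
    except KeyError as err:
        raise KeyError(f"{err.args[0]!r} is not in the existing coordinate") from None
    # The positions must be consecutive and in order to be expressible as a slice.
    if found != list(range(found[0], found[0] + len(found))):
        raise ValueError("values do not form a contiguous region")
    return slice(found[0], found[-1] + 1)


existing_time = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
region = infer_region(existing_time, [30, 40, 50])
print(region)  # slice(3, 6, None)
```

Only the coordinate values are needed for the lookup, which is why opening the whole store as a dataset is unnecessary for this step.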

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8997/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2248614324 I_kwDOAMm_X86GByG0 8952 `isel(multi_index_level_name = MultiIndex.level)` corrupts the MultiIndex dcherian 2448579 open 0     1 2024-04-17T15:41:39Z 2024-04-18T13:14:46Z   MEMBER      

What happened?

From https://github.com/pydata/xarray/discussions/8951

If d is a MultiIndex-ed dataset with levels (x, y, z), and m is a dataset with a single coord x, then m.isel(x=d.x) builds a dataset with a MultiIndex with levels (y, z). This seems like it should work.

cc @benbovy

What did you expect to happen?

No response

Minimal Complete Verifiable Example

```Python
import pandas as pd, xarray as xr, numpy as np

xr.set_options(use_flox=True)

test = pd.DataFrame()
test["x"] = np.arange(100) % 10
test["y"] = np.arange(100)
test["z"] = np.arange(100)
test["v"] = np.arange(100)

d = xr.Dataset.from_dataframe(test)
d = d.set_index(index = ["x", "y", "z"])
print(d)

m = d.groupby("x").mean()
print(m)

print(d.xindexes)
print(m.isel(x=d.x).xindexes)

xr.align(d, m.isel(x=d.x))

res = d.groupby("x") - m

print(res)
```

<xarray.Dataset>
Dimensions:  (index: 100)
Coordinates:
  * index    (index) object MultiIndex
  * x        (index) int64 0 1 2 3 4 5 6 7 8 9 0 1 2 ... 8 9 0 1 2 3 4 5 6 7 8 9
  * y        (index) int64 0 1 2 3 4 5 6 7 8 9 ... 90 91 92 93 94 95 96 97 98 99
  * z        (index) int64 0 1 2 3 4 5 6 7 8 9 ... 90 91 92 93 94 95 96 97 98 99
Data variables:
    v        (index) int64 0 1 2 3 4 5 6 7 8 9 ... 90 91 92 93 94 95 96 97 98 99
<xarray.Dataset>
Dimensions:  (x: 10)
Coordinates:
  * x        (x) int64 0 1 2 3 4 5 6 7 8 9
Data variables:
    v        (x) float64 45.0 46.0 47.0 48.0 49.0 50.0 51.0 52.0 53.0 54.0
Indexes:
  ┌ index    PandasMultiIndex
  │ x
  │ y
  └ z
Indexes:
  ┌ index    PandasMultiIndex
  │ y
  └ z
ValueError...

MVCE confirmation

  • [x] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [x] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [x] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [x] New issue — a search of GitHub Issues suggests this is not a duplicate.
  • [x] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

No response

Anything else we need to know?

No response

Environment

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8952/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
2215762637 PR_kwDOAMm_X85rMHpN 8893 Avoid extra read from disk when creating Pandas Index. dcherian 2448579 open 0     1 2024-03-29T17:44:52Z 2024-04-08T18:55:09Z   MEMBER   0 pydata/xarray/pulls/8893
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8893/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2224297504 PR_kwDOAMm_X85rpGUH 8906 Add invariant check for IndexVariable.name dcherian 2448579 open 0     1 2024-04-04T02:13:33Z 2024-04-05T07:12:54Z   MEMBER   1 pydata/xarray/pulls/8906

@benbovy this seems to be the root cause of #8646, the variable name in Dataset._variables does not match IndexVariable.name.

A good number of tests seem to fail though, so I'm not sure if this is a good check.

  • [ ] Closes #xxxx
  • [ ] Tests added
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [ ] New functions/methods are listed in api.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8906/reactions",
    "total_count": 2,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 2,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2213636579 I_kwDOAMm_X86D8Wnj 8887 resetting multiindex may be buggy dcherian 2448579 open 0     1 2024-03-28T16:23:38Z 2024-03-29T07:59:22Z   MEMBER      

What happened?

Resetting a MultiIndex dim coordinate preserves the MultiIndex levels as IndexVariables. We should either reset the indexes for the MultiIndex level variables, or warn, asking users to do so.

This seems to be the root cause exposed by https://github.com/pydata/xarray/pull/8809

cc @benbovy

What did you expect to happen?

No response

Minimal Complete Verifiable Example

```Python
import numpy as np
import xarray as xr

# ND DataArray that gets stacked along a multiindex
da = xr.DataArray(np.ones((3, 3)), coords={"dim1": [1, 2, 3], "dim2": [4, 5, 6]})
da = da.stack(feature=["dim1", "dim2"])

# Extract just the stacked coordinates for saving in a dataset
ds = xr.Dataset(data_vars={"feature": da.feature})
xr.testing.assertions._assert_internal_invariants(ds.reset_index(["feature", "dim1", "dim2"]), check_default_indexes=False)  # succeeds
xr.testing.assertions._assert_internal_invariants(ds.reset_index(["feature"]), check_default_indexes=False)  # fails, but no warning either
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8887/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
2064480451 I_kwDOAMm_X857DXjD 8582 Adopt SPEC 0 instead of NEP-29 dcherian 2448579 open 0     1 2024-01-03T18:36:24Z 2024-01-03T20:12:05Z   MEMBER      

What is your issue?

https://docs.xarray.dev/en/stable/getting-started-guide/installing.html#minimum-dependency-versions says that we follow NEP-29, and I think our min versions script also does that.

I propose we follow https://scientific-python.org/specs/spec-0000/

In practice, I think this means we mostly drop Python versions earlier.
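
A rough sketch of why the Python window shrinks, assuming the commonly quoted support windows: 42 months after release under NEP-29 and 36 months under SPEC 0. The helper names are hypothetical, and NEP-29's additional "at least the two latest minor versions" clause is ignored here. Python 3.9 (released 2020-10-05) is checked as of the day this issue was opened.

```python
import calendar
from datetime import date

# Assumed support windows for Python itself, in months after a release.
NEP29_MONTHS = 42
SPEC0_MONTHS = 36


def add_months(d: date, months: int) -> date:
    """Return the date `months` calendar months after `d`, clamping the day if needed."""
    total = d.year * 12 + (d.month - 1) + months
    year, month_index = divmod(total, 12)
    last_day = calendar.monthrange(year, month_index + 1)[1]
    return date(year, month_index + 1, min(d.day, last_day))


def is_supported(released: date, today: date, window_months: int) -> bool:
    """True while `today` is still inside the support window of a release."""
    return today < add_months(released, window_months)


py39_released = date(2020, 10, 5)
issue_opened = date(2024, 1, 3)

print(is_supported(py39_released, issue_opened, NEP29_MONTHS))  # True
print(is_supported(py39_released, issue_opened, SPEC0_MONTHS))  # False
```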

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8582/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1943543755 I_kwDOAMm_X85z2B_L 8310 pydata/xarray as monorepo for Xarray and NamedArray dcherian 2448579 open 0     1 2023-10-14T20:34:51Z 2023-10-14T21:29:11Z   MEMBER      

What is your issue?

As we work through refactoring for NamedArray, it's pretty clear that Xarray will depend pretty closely on many files in namedarray/: at least the various utils.py, pycompat.py, *ops.py, formatting.py, and formatting_html.py. This promises to be quite painful if we break NamedArray out into its own repo (particularly around typing, e.g. https://github.com/pydata/xarray/pull/8309).

I propose we use pydata/xarray as a monorepo that serves two packages: NamedArray and Xarray.

  • We can move as much as is needed to have NamedArray be independent of Xarray, but Xarray will depend quite closely on many utility functions in NamedArray.
  • We can release both at the same time, similar to dask and distributed.
  • We can re-evaluate if and when NamedArray grows its own community.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8310/reactions",
    "total_count": 4,
    "+1": 4,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
923355397 MDExOlB1bGxSZXF1ZXN0NjcyMTI5NzY4 5480 Implement weighted groupby dcherian 2448579 open 0     1 2021-06-17T02:57:17Z 2023-07-27T18:09:55Z   MEMBER   1 pydata/xarray/pulls/5480
  • xref #3937
  • [ ] Tests added
  • [ ] Passes pre-commit run --all-files
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [ ] New functions/methods are listed in api.rst

Initial proof-of-concept. Suggestions to improve this are very welcome.

Here's some convenient testing code:

``` python
import xarray as xr

ds = xr.tutorial.open_dataset('rasm').load()
month_length = ds.time.dt.days_in_month
weights = month_length.groupby('time.season') / month_length.groupby('time.season').sum()

actual = ds.weighted(month_length).groupby("time.season").mean()
expected = (ds * weights).groupby('time.season').sum(skipna=False)
xr.testing.assert_allclose(actual, expected)
```

I've added info to the repr:

    >>> ds.weighted(month_length).groupby("time.season")
    WeightedDatasetGroupBy, grouped over 'season'
    4 groups with labels 'DJF', 'JJA', 'MAM', 'SON'.
    weighted along dimensions: time by 'days_in_month'
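
The equivalence that the testing code relies on can be illustrated in plain Python. This is only a sketch: `weighted_group_mean` is a hypothetical helper, the monthly values are made up, and the month lengths assume a non-leap year. It shows that a per-group weighted mean, sum(w * x) / sum(w), matches the sum of values multiplied by weights normalized within each group.

```python
import math
from collections import defaultdict


def weighted_group_mean(values, weights, labels):
    """Hypothetical helper: per-group sum(w * x) / sum(w)."""
    numerator = defaultdict(float)
    denominator = defaultdict(float)
    for x, w, label in zip(values, weights, labels):
        numerator[label] += w * x
        denominator[label] += w
    return {label: numerator[label] / denominator[label] for label in numerator}


# Made-up monthly values for January to June, weighted by days in each month.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
days = [31, 28, 31, 30, 31, 30]
seasons = ["DJF", "DJF", "MAM", "MAM", "MAM", "JJA"]

actual = weighted_group_mean(values, days, seasons)

# Same result via weights normalized within each group, then a plain grouped sum.
totals = defaultdict(float)
for w, label in zip(days, seasons):
    totals[label] += w
expected = defaultdict(float)
for x, w, label in zip(values, days, seasons):
    expected[label] += x * (w / totals[label])

assert all(math.isclose(actual[label], expected[label]) for label in actual)
```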

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5480/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1822982776 I_kwDOAMm_X85sqIJ4 8023 Possible autoray integration dcherian 2448579 open 0     1 2023-07-26T18:57:59Z 2023-07-26T19:26:05Z   MEMBER      

I'm opening this issue for discussion really.

I stumbled on autoray (GitHub) by @jcmgray which provides an abstract interface to a number of array types.

What struck me was the very general lazy compute system. This opens up the possibility of lazy-but-not-dask computation.

Related: https://github.com/pydata/xarray/issues/2298 https://github.com/pydata/xarray/issues/1725 https://github.com/pydata/xarray/issues/5081

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8023/reactions",
    "total_count": 2,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 2
}
    xarray 13221727 issue
1119647191 I_kwDOAMm_X85CvHXX 6220 [FEATURE]: Use fast path when grouping by unique monotonic decreasing variable dcherian 2448579 open 0     1 2022-01-31T16:24:29Z 2023-01-09T16:48:58Z   MEMBER      

Is your feature request related to a problem?

See https://github.com/pydata/xarray/pull/6213/files#r795716713

We check whether the by variable for groupby is unique and monotonically increasing. But the fast path would also apply to unique and monotonically decreasing variables.

Describe the solution you'd like

Update the condition to is_monotonic_increasing or is_monotonic_decreasing and add a test.
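
A minimal sketch of the proposed condition, written in plain Python so it runs without pandas. `is_strictly_monotonic` is a hypothetical name; in xarray the check would presumably use the index's own `is_unique`, `is_monotonic_increasing` and `is_monotonic_decreasing` attributes rather than a helper like this one.

```python
from typing import Sequence


def is_strictly_monotonic(values: Sequence[float]) -> bool:
    """Hypothetical helper: True if values are strictly increasing or strictly decreasing.

    Strict monotonicity implies uniqueness, so this covers both halves of
    the "unique and monotonic" condition in one pass.
    """
    pairs = list(zip(values, values[1:]))
    increasing = all(a < b for a, b in pairs)
    decreasing = all(a > b for a, b in pairs)
    return increasing or decreasing


print(is_strictly_monotonic([1, 2, 3]))  # True
print(is_strictly_monotonic([3, 2, 1]))  # True
print(is_strictly_monotonic([1, 3, 2]))  # False
print(is_strictly_monotonic([1, 1, 2]))  # False
```

The second call is the case this issue is about: a decreasing variable that a check for increasing order alone would send down the slow path.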

Describe alternatives you've considered

No response

Additional context

No response

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6220/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1194945072 I_kwDOAMm_X85HOWow 6447 allow merging datasets where a variable might be a coordinate variable only in a subset of datasets dcherian 2448579 open 0     1 2022-04-06T17:53:51Z 2022-11-16T03:46:56Z   MEMBER      

Is your feature request related to a problem?

Here are two datasets, in one a is a data_var, in the other a is a coordinate variable. The following fails

``` python
import xarray as xr

ds1 = xr.Dataset({"a": ('x', [1, 2, 3])})
ds2 = ds1.set_coords("a")
ds2.update(ds1)
```

with

```
    649 ambiguous_coords = coord_names.intersection(noncoord_names)
    650 if ambiguous_coords:
--> 651     raise MergeError(
    652         "unable to determine if these variables should be "
    653         f"coordinates or not in the merged result: {ambiguous_coords}"
    654     )
    656 attrs = merge_attrs(
    657     [var.attrs for var in coerced if isinstance(var, (Dataset, DataArray))],
    658     combine_attrs,
    659 )
    661 return _MergeResult(variables, coord_names, dims, out_indexes, attrs)

MergeError: unable to determine if these variables should be coordinates or not in the merged result: {'a'}
```

Describe the solution you'd like

I think we should replace this error with a warning and arbitrarily choose to either convert a to a coordinate variable or a data variable.
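
One way the proposed behaviour could look, sketched on plain sets of names rather than on xarray's merge internals. `resolve_ambiguous` and its `prefer` parameter are hypothetical; which side wins is exactly the arbitrary choice mentioned above.

```python
import warnings


def resolve_ambiguous(coord_names: set, noncoord_names: set, prefer: str = "coords"):
    """Hypothetical helper: warn instead of raising when a name is both a coord and a data var."""
    if prefer not in ("coords", "data_vars"):
        raise ValueError("prefer must be 'coords' or 'data_vars'")
    ambiguous = coord_names & noncoord_names
    if ambiguous:
        warnings.warn(
            f"variables {sorted(ambiguous)} are coordinates in only some inputs; "
            f"treating them as {prefer} in the merged result",
            UserWarning,
            stacklevel=2,
        )
    # Return new sets so the caller's inputs are left untouched.
    if prefer == "coords":
        return set(coord_names), noncoord_names - ambiguous
    return coord_names - ambiguous, set(noncoord_names)


# "a" is a coordinate in one dataset and a data variable in the other.
coords, data_vars = resolve_ambiguous({"a"}, {"a"})
print(coords, data_vars)  # {'a'} set()
```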

Describe alternatives you've considered

No response

Additional context

No response

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6447/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
514716299 MDU6SXNzdWU1MTQ3MTYyOTk= 3468 failure when roundtripping empty dataset to pandas dcherian 2448579 open 0     1 2019-10-30T14:28:31Z 2021-11-13T14:54:09Z   MEMBER      

see https://github.com/pydata/xarray/pull/3285

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3468/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
520079199 MDU6SXNzdWU1MjAwNzkxOTk= 3497 how should xarray handle pandas attrs dcherian 2448579 open 0     1 2019-11-08T15:32:36Z 2021-07-04T03:31:02Z   MEMBER      

Continuing discussion from #3491.

Pandas has added attrs to their objects. We should decide on what to do with them in the DataArray constructor. Many tests fail if we don't handle this case explicitly.

@dcherian:

Not sure what we want to do about these attributes in the long term. One option would be to pop the name attribute, assign to DataArray.name and keep the rest as DataArray.attrs? But what if name clashes with the provided name?

@max-sixty:

Agree! I think we could prioritize the supplied name above that in attrs. Another option would be raising an error if both were supplied.
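
Both options from the exchange above can be sketched on a plain dict standing in for the pandas attrs. `split_name_and_attrs` and its `on_conflict` parameter are hypothetical names for illustration, not part of the DataArray constructor.

```python
def split_name_and_attrs(pandas_attrs: dict, name=None, on_conflict: str = "prefer_supplied"):
    """Hypothetical helper: pop "name" from attrs and reconcile it with a supplied name."""
    attrs = dict(pandas_attrs)  # copy, so the caller's dict is not modified
    attrs_name = attrs.pop("name", None)
    clash = name is not None and attrs_name is not None and name != attrs_name
    if clash and on_conflict == "raise":
        raise ValueError(f"name {name!r} conflicts with attrs['name'] {attrs_name!r}")
    # The supplied name wins; fall back to the one found in attrs.
    final_name = name if name is not None else attrs_name
    return final_name, attrs


print(split_name_and_attrs({"name": "temp", "units": "K"}))
# ('temp', {'units': 'K'})
print(split_name_and_attrs({"name": "temp", "units": "K"}, name="T"))
# ('T', {'units': 'K'})
```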

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3497/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
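
The filter described at the top of this page (comments = 1, state = "open", user = 2448579, sorted by updated_at descending) can be sketched against a trimmed copy of this schema using Python's built-in sqlite3 module. This is an assumption about what the page's query looks like, not the SQL Datasette actually ran. The first two rows use values from the table above; the third row is hypothetical and only there to show the filter excluding something.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Trimmed version of the [issues] table: only the columns the filter touches.
conn.execute(
    """
    CREATE TABLE [issues] (
       [id] INTEGER PRIMARY KEY,
       [number] INTEGER,
       [user] INTEGER,
       [state] TEXT,
       [comments] INTEGER,
       [updated_at] TEXT
    )
    """
)
rows = [
    (2278499376, 8997, 2448579, "open", 1, "2024-05-04T21:47:39Z"),
    (2248614324, 8952, 2448579, "open", 1, "2024-04-18T13:14:46Z"),
    (1, 1, 2448579, "closed", 1, "2020-01-01T00:00:00Z"),  # hypothetical row
]
conn.executemany("INSERT INTO [issues] VALUES (?, ?, ?, ?, ?, ?)", rows)

query = """
SELECT [number] FROM [issues]
WHERE [comments] = :comments AND [state] = :state AND [user] = :user
ORDER BY [updated_at] DESC
"""
params = {"comments": 1, "state": "open", "user": 2448579}
numbers = [number for (number,) in conn.execute(query, params)]
print(numbers)  # [8997, 8952]
```

Sorting the TEXT column works here because ISO 8601 timestamps in the same timezone order correctly as strings.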