
issues


15 rows where comments = 3 and user = 43316012 sorted by updated_at descending




type 2

  • pull 9
  • issue 6

state 2

  • closed 12
  • open 3

repo 1

  • xarray 15
id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
2047459696 PR_kwDOAMm_X85iTmr2 8559 Support non-str Hashables in DataArray headtr1ck 43316012 closed 0     3 2023-12-18T21:09:13Z 2024-01-14T20:38:59Z 2024-01-14T20:38:59Z COLLABORATOR   0 pydata/xarray/pulls/8559
  • [x] Closes #8546
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst

Probably we should add a whole bunch of tests for this. For now only testing the constructor.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8559/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2038622503 I_kwDOAMm_X855gukn 8548 Shaping the future of Backends headtr1ck 43316012 open 0     3 2023-12-12T22:08:50Z 2023-12-15T17:14:59Z   COLLABORATOR      

What is your issue?

Backends in xarray are used to read and write files (or in general objects) and transform them into useful xarray Datasets.

This issue will collect ideas on how to continuously improve them.

Current state

Along the reading and writing process there are many implicit and explicit configuration possibilities: many backend-specific options and many encoder- and decoder-specific options. Most of them are currently difficult or even impossible to discover.

There is the infamous open_dataset method which can do everything, but there are also some specialized methods like open_zarr or to_netcdf.

The only really formalized way to extend xarray capabilities is via the BackendEntrypoint, which currently only covers reading files. This has proven to work, and things are going so well that people are discussing getting rid of the specialized reading methods (#7495). A major critique in this thread is, again, the discoverability of configuration options.
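For reference, the current extension point looks roughly like this. A minimal sketch assuming xarray's documented `BackendEntrypoint` base class; the class name `ToyBackendEntrypoint` and the `.toy` suffix are invented for illustration:

```python
import xarray as xr
from xarray.backends import BackendEntrypoint


class ToyBackendEntrypoint(BackendEntrypoint):
    """Minimal read-only backend: returns a fixed Dataset for any ``.toy`` path."""

    description = "Toy backend for illustration"
    url = "https://example.com/toy-backend"

    def open_dataset(self, filename_or_obj, *, drop_variables=None):
        # A real backend would open `filename_or_obj` here.
        return xr.Dataset({"a": ("x", [1, 2, 3])})

    def guess_can_open(self, filename_or_obj):
        return str(filename_or_obj).endswith(".toy")
```

In a real plugin, the class is registered under the `xarray.backends` entrypoint group in the package metadata rather than used directly.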

Problems

To name a few:

  • Discoverability of configuration options is poor
  • No distinction between backend and encoding options
  • New options are simply added as another keyword argument to open_dataset
  • No writing support for backends

What already improved

  • Adding URL and description attributes to the backends (#7000, #7200)
  • Add static typing
  • Allow creating instances of backends with their respective options (#8520)

The future

After listing all the problems, let's see how we can improve the situation and make backends an all-round solution for reading and writing all kinds of files.

What happens behind the scenes

In general the reading and writing of Datasets in xarray is a three-step process.

    Dataset < chunking < decoding < opening_in_store < file   [done by backend.open_dataset]
    Dataset > validating > encoding > storing_in_store > file

Probably one could consider combining chunking and decoding, as well as validating and encoding, into a single logical step each in the pipeline. This view should help decide how to set up a future architecture for backends.

You can see that there is a common middle object in this process: an in-memory representation of the file on disk, sitting between en-/decoding and the abstract store. This is actually an xarray.Dataset and is internally called a "backend dataset".

write_dataset method

A quite natural extension of backends would be to implement a write_dataset method (name pending). This would allow backends to fulfill the complete right side of the pipeline.
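To make the idea concrete, here is a deliberately tiny sketch; `write_dataset` is the proposed (not yet existing) method, and the dict-backed store is purely illustrative:

```python
class InMemoryBackend:
    """Toy backend that keeps 'files' in a dict, mirroring the proposed
    read/write symmetry: open_dataset for the left side of the pipeline,
    write_dataset (name pending) for the right side."""

    def __init__(self):
        self._store = {}

    def open_dataset(self, name):
        return self._store[name]

    def write_dataset(self, dataset, name):
        # A real backend would encode and store the dataset here.
        self._store[name] = dataset
```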

Transformer class

For lack of a common word for a class that handles "encoding" and "decoding", I will call it a transformer here.

The process of en- and decoding is currently hardcoded in the respective open_dataset and to_netcdf methods. One could imagine introducing a common class that handles both.

This class could handle the already-implemented CF or netCDF encoding conventions, but it would also allow users to define their own storage conventions (why not a custom transformer that adds indexes based on variable attributes?). The possibilities are endless, and an interface that fulfills all the requirements still has to be found.
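As a sketch of the idea (all names here are hypothetical, not an existing xarray API), a transformer would simply pair a decode step with its inverse encode step; a CF-style scale/offset transformation on plain numbers stands in for real Dataset handling:

```python
class Transformer:
    """Hypothetical base class pairing decoding with its inverse encoding."""

    def decode(self, raw):
        raise NotImplementedError

    def encode(self, decoded):
        raise NotImplementedError


class ScaleOffsetTransformer(Transformer):
    """CF-style scale_factor/add_offset, applied to plain numbers for brevity."""

    def __init__(self, scale_factor, add_offset):
        self.scale_factor = scale_factor
        self.add_offset = add_offset

    def decode(self, raw):
        # raw on-disk value -> physical value
        return raw * self.scale_factor + self.add_offset

    def encode(self, decoded):
        # physical value -> raw on-disk value (exact inverse of decode)
        return (decoded - self.add_offset) / self.scale_factor
```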

This would homogenize the reading and writing process to

    Dataset <> Transformer <> Backend <> file

As a bonus, this would increase the discoverability of the decoding options (then transformer arguments).

The new interface could then be:

```python
backend = Netcdf4BackendEntrypoint(group="data")
decoder = CFTransformer(cftime=True)

ds = xr.open_dataset("file.nc", engine=backend, decoder=decoder)
```

while of course still allowing all options to be passed simply as kwargs (since this is still the easiest way of telling beginners how to open files).

The final improvement here would be to add additional entrypoints for these transformers ;)

Disclaimer

Now, this issue is just a bunch of random ideas that require quite some refinement, and some might even turn out to be nonsense. So let's have an exciting discussion about these things :) If you have something to add to the above points, I will include your ideas as well. This is meant as a collection of ideas on how to improve our backends :)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8548/reactions",
    "total_count": 5,
    "+1": 5,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1928972239 PR_kwDOAMm_X85cC_Wb 8276 Give NamedArray Generic dimension type headtr1ck 43316012 open 0     3 2023-10-05T20:02:56Z 2023-10-16T13:41:45Z   COLLABORATOR   1 pydata/xarray/pulls/8276
  • [x] Towards #8199
  • [ ] Tests added
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [ ] New functions/methods are listed in api.rst

This aims at making the dimension type a generic parameter. I thought I would start with NamedArray when testing this out because it is much less interconnected.
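The core idea can be sketched in a few lines (the class below is illustrative, not xarray's actual NamedArray):

```python
from typing import Generic, Hashable, Tuple, TypeVar

# Dimension names are currently plain Hashables; making them a type
# parameter lets a checker track the concrete dims type at call sites.
DimsT = TypeVar("DimsT", bound=Tuple[Hashable, ...])


class GenericDimsArray(Generic[DimsT]):
    def __init__(self, dims: DimsT, data) -> None:
        self.dims = dims
        self.data = data


# Here a type checker can infer the dims as a tuple of strings:
arr = GenericDimsArray(("x", "y"), [[1, 2], [3, 4]])
```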

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8276/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1915876808 I_kwDOAMm_X85yMfXI 8236 DataArray with multiple (Pandas)Indexes on the same dimension is impossible to align headtr1ck 43316012 closed 0     3 2023-09-27T15:52:05Z 2023-10-02T06:53:27Z 2023-10-01T07:19:09Z COLLABORATOR      

What happened?

I have a DataArray with a single dimension and multiple (Pandas)Indexes assigned to various coordinates for efficient indexing using sel.

Edit: the problem is even worse than originally described below: such a DataArray breaks all alignment and it's basically unusable...


When I try to add an additional coordinate without any index (I simply use the tuple[dimension, values] way) I get a ValueError about aligning with conflicting indexes.

If the original DataArray only has a single (Pandas)Index everything works as expected.

What did you expect to happen?

I expected that I can simply assign new coordinates without an index.

Minimal Complete Verifiable Example

```python
import xarray as xr

da = xr.DataArray(
    [1, 2, 3],
    dims="t",
    coords={"a": ("t", [3, 4, 5]), "b": ("t", [5, 6, 7])},
)

# set one index
da2 = da.set_xindex("a")

# set second index (same dimension, maybe that's the problem?)
da3 = da2.set_xindex("b")

# this works
da2.coords["c"] = ("t", [2, 3, 4])

# this does not
da3.coords["c"] = ("t", [2, 3, 4])
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

    ValueError: cannot re-index or align objects with conflicting indexes found for the following dimensions:
    't' (2 conflicting indexes)

    Conflicting indexes may occur when
    - they relate to different sets of coordinate and/or dimension names
    - they don't have the same type
    - they may be used to reindex data along common dimensions

Anything else we need to know?

No response

Environment

    INSTALLED VERSIONS
    ------------------
    commit: None
    python: 3.9.10 (main, Mar 21 2022, 13:08:11) [GCC 4.8.5 20150623 (Red Hat 4.8.5-44)]
    python-bits: 64
    OS: Linux
    OS-release: 3.10.0-1160.66.1.el7.x86_64
    machine: x86_64
    processor: x86_64
    byteorder: little
    LC_ALL: None
    LANG: en_US.UTF-8
    LOCALE: ('en_US', 'UTF-8')
    libhdf5: 1.12.2
    libnetcdf: 4.9.0

    xarray: 2022.12.0
    pandas: 2.0.2
    numpy: 1.24.3
    scipy: 1.10.0
    netCDF4: 1.6.2
    pydap: None
    h5netcdf: None
    h5py: None
    Nio: None
    zarr: None
    cftime: 1.6.2
    nc_time_axis: None
    PseudoNetCDF: None
    rasterio: None
    cfgrib: None
    iris: None
    bottleneck: None
    dask: None
    distributed: None
    matplotlib: 3.6.3
    cartopy: None
    seaborn: None
    numbagg: None
    fsspec: None
    cupy: None
    pint: None
    sparse: None
    flox: None
    numpy_groupies: None
    setuptools: 58.1.0
    pip: 21.2.4
    conda: None
    pytest: 7.3.2
    mypy: 1.0.0
    IPython: 8.8.0
    sphinx: None

I have not yet tried this with a newer version of xarray....

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8236/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  not_planned xarray 13221727 issue
1401066481 I_kwDOAMm_X85TgpPx 7141 Coverage shows reduced value since mypy flag was added headtr1ck 43316012 closed 0     3 2022-10-07T12:01:15Z 2023-08-30T18:47:35Z 2023-08-30T18:47:35Z COLLABORATOR      

What is your issue?

Coverage was reduced from ~94% to ~68% after merging #7126; see https://app.codecov.io/gh/pydata/xarray or our badge.

I think this is because the unit-test coverage report never included the tests directory, while the mypy report does, and codecov combines both reports to come up with its number.

Adding the flag to the badge also does not seem to help?

Not sure how, or even if, this can be solved; maybe we need to ask codecov?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7141/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1648748263 I_kwDOAMm_X85iRebn 7703 Readthedocs build failing headtr1ck 43316012 closed 0     3 2023-03-31T06:20:53Z 2023-03-31T15:45:10Z 2023-03-31T15:45:10Z COLLABORATOR      

What is your issue?

It seems that the Read the Docs build has been failing since some upstream update. pydata-sphinx-theme seems to be incompatible with sphinx-book-theme.

Maybe we have to pin to a specific or a maximum version for now.
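A pin would be a one-line change in the docs requirements along these lines (the bound shown is purely illustrative, not a known-good version):

```
# doc/requirements.txt — illustrative only
sphinx-book-theme
pydata-sphinx-theme<0.13  # hypothetical upper bound until compatibility is restored
```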

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7703/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1615980379 PR_kwDOAMm_X85Lm7SK 7600 Enable blacks `skip_magic_trailing_comma` options headtr1ck 43316012 closed 0     3 2023-03-08T21:36:46Z 2023-03-09T20:41:21Z 2023-03-09T20:40:25Z COLLABORATOR   0 pydata/xarray/pulls/7600

This little config change will make black remove trailing commas when they are not necessary to fit something into a single line.

It is purely a design choice, but personally I like the cleanup it does when function signatures simplify (although this happens rarely, with more and more type hints being added).

I can understand that some people prefer the manual control over what is multiline and what is not. Feel free to vote on it :)

For me it adds cheap LOCs so it looks like I am working hard, haha.
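For reference, the option is a single line in black's configuration (assuming black is configured via pyproject.toml, as is common):

```toml
[tool.black]
skip-magic-trailing-comma = true
```

With this set, black collapses multi-line constructs back onto one line when they fit, even if they end in a trailing comma.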

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7600/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1603831809 I_kwDOAMm_X85fmIgB 7572 `test_open_nczarr` failing headtr1ck 43316012 closed 0     3 2023-02-28T21:20:22Z 2023-03-02T16:49:25Z 2023-03-02T16:49:25Z COLLABORATOR      

What is your issue?

In the latest CI runs it seems that test_backends.py::TestNCZarr::test_open_nczarr is failing with

KeyError: 'Zarr object is missing the attribute _ARRAY_DIMENSIONS and the NCZarr metadata, which are required for xarray to determine variable dimensions.'

I don't see an obvious reason for this, especially since the zarr version has not changed compared to some runs that were successful (2.13.6).

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7572/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1424707135 PR_kwDOAMm_X85Bnixp 7228 Raise TypeError if plotting empty data headtr1ck 43316012 closed 0     3 2022-10-26T21:19:30Z 2022-11-10T23:00:42Z 2022-10-28T16:44:31Z COLLABORATOR   0 pydata/xarray/pulls/7228
  • [x] Closes #7156
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [x] ~New functions/methods are listed in api.rst~
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7228/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1419882372 PR_kwDOAMm_X85BXXw0 7200 Backends descriptions headtr1ck 43316012 closed 0     3 2022-10-23T18:23:32Z 2022-10-26T19:45:15Z 2022-10-26T16:01:04Z COLLABORATOR   0 pydata/xarray/pulls/7200
  • [x] Closes #7049
  • [ ] Tests added
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [x] New functions/methods are listed in api.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7200/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1385143758 PR_kwDOAMm_X84_j6Bn 7080 Fix `utils.get_axis` with kwargs headtr1ck 43316012 closed 0     3 2022-09-25T19:50:15Z 2022-09-28T18:02:18Z 2022-09-28T17:11:16Z COLLABORATOR   0 pydata/xarray/pulls/7080
  • [x] Closes #7078
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [x] ~~New functions/methods are listed in api.rst~~
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7080/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1368690120 PR_kwDOAMm_X84-uNM2 7017 Add Ellipsis typehints headtr1ck 43316012 closed 0     3 2022-09-10T17:53:26Z 2022-09-12T15:40:08Z 2022-09-11T13:40:07Z COLLABORATOR   0 pydata/xarray/pulls/7017

This PR adds an Ellipsis typehint to some functions.

Interestingly, mypy did not complain about the tests before; I assume that is because "..." is Hashable, or something like that?

I don't know what to do with reductions, since they also support ellipsis, but it is basically the same as using None. Therefore, I assume it is not necessary to expose this feature.

Did I miss any functions where ellipsis is supported? It is hard to look for "..."... xD
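A small self-contained illustration of the typing situation (the helper function and its names are invented for the example): `...` really is Hashable, which would let it slip past a plain `Hashable` annotation unnoticed, and since Python 3.10 it can be annotated precisely via `types.EllipsisType`:

```python
import sys
from collections.abc import Hashable
from typing import Union

if sys.version_info >= (3, 10):
    from types import EllipsisType
else:
    # The runtime type exists on older versions too; it just wasn't exported.
    EllipsisType = type(...)


def describe_dims(dim: Union[Hashable, "EllipsisType", None]) -> str:
    """Illustrative helper mimicking `dim` arguments that accept `...`.

    Note that EllipsisType is itself Hashable, which is exactly why mypy
    did not complain when `...` was passed to a Hashable-typed parameter.
    """
    if dim is ...:
        return "all dimensions"
    if dim is None:
        return "default dimensions"
    return f"dimension {dim!r}"
```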

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7017/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1361262641 PR_kwDOAMm_X84-VY9P 6986 Remove some warnings in tests headtr1ck 43316012 closed 0     3 2022-09-04T21:58:57Z 2022-09-05T16:06:35Z 2022-09-05T10:52:45Z COLLABORATOR   0 pydata/xarray/pulls/6986

This PR tries to get rid of several warnings in the tests.

I could not get rid of `RuntimeWarning: All-NaN slice encountered` for tests with dask. Does anyone know why that is? pytest.mark.filterwarnings does not seem to capture them...
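For comparison, the same filter expressed imperatively does suppress a warning raised directly in the test process (the functions below are invented stand-ins); if the warning is instead emitted from a dask worker process, the filter state may simply not be active there:

```python
import warnings


def noisy_reduction():
    """Stand-in for a dask-backed nan-reduction that warns."""
    warnings.warn("All-NaN slice encountered", RuntimeWarning)
    return 0.0


def quiet_reduction():
    # Imperative equivalent of
    # @pytest.mark.filterwarnings("ignore:All-NaN slice encountered:RuntimeWarning")
    with warnings.catch_warnings():
        warnings.filterwarnings(
            "ignore", message="All-NaN slice encountered", category=RuntimeWarning
        )
        return noisy_reduction()
```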

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6986/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1268697316 PR_kwDOAMm_X845hhHE 6690 Fix Dataset.where with drop=True and mixed dims headtr1ck 43316012 closed 0     3 2022-06-12T20:47:05Z 2022-06-13T18:06:44Z 2022-06-12T22:06:51Z COLLABORATOR   0 pydata/xarray/pulls/6690
  • [x] Closes #6227
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6690/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1221885425 I_kwDOAMm_X85I1H3x 6549 Improved Dataset broadcasting headtr1ck 43316012 open 0     3 2022-04-30T17:51:37Z 2022-05-01T14:37:43Z   COLLABORATOR      

Is your feature request related to a problem?

I am a bit puzzled about how xarray broadcasts Datasets. It seems to always add all dimensions to all variables. Is this what you want in general?

See this example:

```python
import xarray as xr

da = xr.DataArray([[1, 2, 3]], dims=("x", "y"))
# <xarray.DataArray (x: 1, y: 3)>
# array([[1, 2, 3]])

ds = xr.Dataset({"a": ("x", [1]), "b": ("z", [2, 3])})
# <xarray.Dataset>
# Dimensions:  (x: 1, z: 2)
# Dimensions without coordinates: x, z
# Data variables:
#     a        (x) int32 1
#     b        (z) int32 2 3

ds.broadcast_like(da)
# returns:
# <xarray.Dataset>
# Dimensions:  (x: 1, y: 3, z: 2)
# Dimensions without coordinates: x, y, z
# Data variables:
#     a        (x, y, z) int32 1 1 1 1 1 1
#     b        (x, y, z) int32 2 3 2 3 2 3

# I think it should return:
# <xarray.Dataset>
# Dimensions:  (x: 1, y: 3, z: 2)
# Dimensions without coordinates: x, y, z
# Data variables:
#     a        (x, y) int32 1 1 1  <- notice here: without the "z" dim
#     b        (x, y, z) int32 2 3 2 3 2 3
```

Describe the solution you'd like

I would like broadcasting to behave the same way as, e.g., a simple addition. In the example above, da + ds produces the dimensions that I want.

Describe alternatives you've considered

ds + xr.zeros_like(da) works, but seems more like a "dirty hack".

Additional context

Maybe one can add an option to broadcasting that controls this behavior?
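A possible workaround with today's public API is to broadcast per variable, so each variable gains only the template's dimensions and does not pick up dimensions from its sibling variables; a sketch using Dataset.map with DataArray.broadcast_like:

```python
import xarray as xr

da = xr.DataArray([[1, 2, 3]], dims=("x", "y"))
ds = xr.Dataset({"a": ("x", [1]), "b": ("z", [2, 3])})

# Broadcast each variable against `da` individually: "a" gains "y",
# but does not pick up "z" from its sibling variable "b".
result = ds.map(lambda v: v.broadcast_like(da))
```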

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6549/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue


CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
Powered by Datasette · Queries took 23.973ms · About: xarray-datasette