issues
30 rows where state = "open" and user = 4160723 sorted by updated_at descending
This data as json, CSV (advanced)
Suggested facets: comments, draft, created_at (date), updated_at (date)
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at ▲ | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1389295853 | I_kwDOAMm_X85Szvjt | 7099 | Pass arbitrary options to sel() | benbovy 4160723 | open | 0 | 4 | 2022-09-28T12:44:52Z | 2024-04-30T00:44:18Z | MEMBER | Is your feature request related to a problem? Currently … It would also be useful for custom indexes to expose their own selection options, e.g.,
From #3223, it would be nice if we could also pass distinct option values per index. What would be a good API for that? Describe the solution you'd like: Some ideas: A. Allow passing a tuple
B. Expose an
Option A does not look very readable. Option B is slightly better, although the nested dictionary is not great. Any other ideas? Some sort of context manager? Describe alternatives you've considered: The API proposed in #3223 would look great if … Additional context: No response |
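A plain-Python sketch of how option B's nested-dict routing could work, purely illustrative: the `sel` function and `options` mapping below are hypothetical stand-ins, not xarray API.

```python
def sel(indexers, options=None):
    """Dispatch per-index options: `options` maps a coordinate name to a
    dict of keyword arguments meant only for the index of that coordinate."""
    options = options or {}
    results = {}
    for coord, label in indexers.items():
        per_index_opts = options.get(coord, {})
        # a real index would consume per_index_opts (e.g. method="nearest") here
        results[coord] = (label, per_index_opts)
    return results


out = sel({"x": 0.5}, options={"x": {"method": "nearest"}})
```

Indexes that do not appear in `options` simply receive an empty options dict, so the flat kwargs of the current API stay untouched.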
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7099/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
2227413822 | PR_kwDOAMm_X85rz7ZX | 8911 | Refactor swap dims | benbovy 4160723 | open | 0 | 5 | 2024-04-05T08:45:49Z | 2024-04-17T16:46:34Z | MEMBER | 1 | pydata/xarray/pulls/8911 |
I've tried here re-implementing |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8911/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
2215059449 | PR_kwDOAMm_X85rJr7c | 8888 | to_base_variable: coerce multiindex data to numpy array | benbovy 4160723 | open | 0 | 3 | 2024-03-29T10:10:42Z | 2024-03-29T15:54:19Z | MEMBER | 0 | pydata/xarray/pulls/8888 |
@slevang this should also make your test case added in #8809 work. I haven't added it here; instead I added a basic check that should be enough. I don't really understand why the serialization backends (zarr?) do not seem to work with the |
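As a point of reference for the coercion named in this PR's title (a sketch, not the PR's code): pandas can always be asked for a materialized object array of tuples, which is the plain representation a serialization backend can consume instead of a lazy multi-index adapter.

```python
import numpy as np
import pandas as pd

midx = pd.MultiIndex.from_product([["a", "b"], [0, 1]], names=("one", "two"))

# coercing the multi-index yields a plain object-dtype numpy array of tuples
arr = np.asarray(midx)
```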
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8888/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
1861543091 | I_kwDOAMm_X85u9OSz | 8097 | Documentation rendering issues (dark mode) | benbovy 4160723 | open | 0 | 2 | 2023-08-22T14:06:03Z | 2024-02-13T02:31:10Z | MEMBER | What is your issue? There are a couple of rendering issues on Xarray's documentation landing page, especially in dark mode.
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8097/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
1839199929 | PR_kwDOAMm_X85XUl4W | 8051 | Allow setting (or skipping) new indexes in open_dataset | benbovy 4160723 | open | 0 | 9 | 2023-08-07T10:53:46Z | 2024-02-03T19:12:48Z | MEMBER | 0 | pydata/xarray/pulls/8051 |
This PR introduces a new boolean parameter … Currently it works with the Zarr backend:

```python
import numpy as np
import xarray as xr

# example dataset (real dataset may be much larger)
arr = np.random.random(size=1_000_000)
xr.Dataset({"x": arr}).to_zarr("dataset.zarr")

xr.open_dataset("dataset.zarr", set_indexes=False, engine="zarr")
# <xarray.Dataset>
# Dimensions:  (x: 1000000)
# Coordinates:
#     x        (x) float64 ...
# Data variables:
#     *empty*

xr.open_zarr("dataset.zarr", set_indexes=False)
# <xarray.Dataset>
# Dimensions:  (x: 1000000)
# Coordinates:
#     x        (x) float64 ...
# Data variables:
#     *empty*
```

I'll add it to the other Xarray backends as well, but I'd like to get your thoughts about the API first.
Currently 1 and 2 are implemented in this PR, although as I write this comment I think that I would prefer 3. I guess this depends on whether we prefer |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8051/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
667864088 | MDU6SXNzdWU2Njc4NjQwODg= | 4285 | Awkward array backend? | benbovy 4160723 | open | 0 | 38 | 2020-07-29T13:53:45Z | 2023-12-30T18:47:48Z | MEMBER | Just curious if anyone here has thoughts on this. For more context: Awkward is like numpy but for arrays of very arbitrary (dynamic) structure. I don't know much yet about that library (I've just seen this SciPy 2020 presentation), but now I could imagine using xarray for dealing with labelled collections of geometrical / geospatial objects like polylines or polygons. At this stage, any integration between xarray and awkward arrays would be something highly experimental, but I think this might be an interesting case for flexible arrays (and possibly flexible indexes) mentioned in the roadmap. There is some discussion here: https://github.com/scikit-hep/awkward-1.0/issues/27. Does anyone see any other potential use case? cc @pydata/xarray |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/4285/reactions", "total_count": 6, "+1": 6, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
1989356758 | I_kwDOAMm_X852kyzW | 8447 | Improve discoverability of backend engine options | benbovy 4160723 | open | 0 | 5 | 2023-11-12T11:14:56Z | 2023-12-12T20:30:28Z | MEMBER | Is your feature request related to a problem? Backend engine options are not easily discoverable and we need to know them, or figure them out, before passing them as kwargs to … Describe the solution you'd like: The solution is similar to the one proposed in #8002 for setting a new index. The API could look like this:

```python
import xarray as xr

ds = xr.open_dataset(
    file_or_obj,
    engine=xr.backends.engine("myengine").with_options(
        option1=True,
        option2=100,
    ),
)
```

where … We would need to extend the API for …

```python
class BackendEntrypoint:
    _open_dataset_options: dict[str, Any]
```

Such that

```python
class MyEngineBackendEntryPoint(BackendEntrypoint):
    open_dataset_parameters = ("option1", "option2")
```

Pros:

Cons:

Describe alternatives you've considered: A … Additional context: cc @jsignell https://github.com/stac-utils/pystac/issues/846#issuecomment-1405758442 |
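The `with_options` idea above can be sketched in plain Python (stand-in classes with hypothetical validation logic; not actual xarray code), with the declared `open_dataset_parameters` used to reject unknown options eagerly:

```python
from typing import Any


class BackendEntrypoint:
    # option names a backend declares as valid for open_dataset()
    open_dataset_parameters: tuple = ()

    @classmethod
    def with_options(cls, **options: Any) -> "BackendEntrypoint":
        # validate against the declared parameters at call time,
        # instead of failing later inside open_dataset()
        unknown = set(options) - set(cls.open_dataset_parameters)
        if unknown:
            raise TypeError(f"unknown options: {sorted(unknown)}")
        entrypoint = cls()
        entrypoint._open_dataset_options = options
        return entrypoint


class MyEngineBackendEntrypoint(BackendEntrypoint):
    open_dataset_parameters = ("option1", "option2")


engine = MyEngineBackendEntrypoint.with_options(option1=True, option2=100)
```

Because `with_options` returns a configured entrypoint instance, the object passed to `engine=` carries its own options, which is what makes them discoverable from the entrypoint class itself.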
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8447/reactions", "total_count": 4, "+1": 4, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
1148021907 | I_kwDOAMm_X85EbWyT | 6293 | Explicit indexes: next steps | benbovy 4160723 | open | 0 | 3 | 2022-02-23T12:19:38Z | 2023-12-01T09:34:28Z | MEMBER | #5692 is ~~not merged yet~~ now merged ~~but~~ and we can ~~already~~ start thinking about the next steps. I’m opening this issue to list and track the remaining tasks. @pydata/xarray, do not hesitate to add a comment below if you think about something that is missing here. Continue the refactoring of the internals: Although in #5692 everything seems to work with the current pandas index wrappers for dimension coordinates, not all of Xarray's internals have been refactored yet to fully support (or at least be compatible with) custom indexes. Here is a list of
I ended up following a common pattern in #5692 when adding explicit / flexible index support for various features (it is quite generic, though, the actual procedure may vary from one case to another and many steps may be skipped):
Relax all constraints related to “dimension (index) coordinates” in Xarray
Indexes repr
Public API for assigning and (re)setting indexes: There is no public API yet for creating and/or assigning existing indexes to Dataset and DataArray objects.
We still need to figure out how best we can (1) assign existing indexes (possibly with their coordinates) and (2) pass index build options. Other public API for index-based operations: To fully leverage the power and flexibility of custom indexes, we might want to update some parts of Xarray’s public API in order to allow passing arbitrary options per index. For example:
Also:
Documentation
Index types and helper classes built in Xarray
3rd party indexes
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6293/reactions", "total_count": 12, "+1": 6, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 6, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
1879109770 | PR_kwDOAMm_X85ZbILy | 8140 | Deprecate passing pd.MultiIndex implicitly | benbovy 4160723 | open | 0 | 23 | 2023-09-03T14:01:18Z | 2023-11-15T20:15:00Z | MEMBER | 0 | pydata/xarray/pulls/8140 |
This PR should normally raise a warning each time indexed coordinates are created implicitly from a … I updated the tests to create coordinates explicitly using … I also refactored some parts where a
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8140/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
1865494976 | PR_kwDOAMm_X85Ytlq0 | 8111 | Alignment: allow flexible index coordinate order | benbovy 4160723 | open | 0 | 3 | 2023-08-24T16:18:49Z | 2023-09-28T15:58:38Z | MEMBER | 0 | pydata/xarray/pulls/8111 |
This PR relaxes some of the rules used in alignment for finding the indexes to compare or join together. Those indexes must still be of the same type and must relate to the same set of coordinates (and dimensions), but the order of coordinates is now ignored. It is up to the index to implement the equal / join logic if it needs to care about that order. Regarding …

```python
midx = pd.MultiIndex.from_product([["a", "b"], [0, 1]], names=("one", "two"))
midx2 = pd.MultiIndex.from_product([["a", "b"], [0, 1]], names=("two", "one"))

midx.equals(midx2)  # True
```

However, in Xarray the names of the multi-index levels (and their order) matter since each level has its own xarray coordinate. In this PR, … |
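The pandas behavior described above can be verified directly (runnable with pandas alone): `equals` compares values and ignores level names, while the names, which map to separate xarray coordinates, do differ.

```python
import pandas as pd

midx = pd.MultiIndex.from_product([["a", "b"], [0, 1]], names=("one", "two"))
midx2 = pd.MultiIndex.from_product([["a", "b"], [0, 1]], names=("two", "one"))

values_equal = midx.equals(midx2)                    # pandas ignores level names
names_equal = list(midx.names) == list(midx2.names)  # the names do differ
```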
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8111/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
1869879398 | PR_kwDOAMm_X85Y8P4c | 8118 | Add Coordinates `set_xindex()` and `drop_indexes()` methods | benbovy 4160723 | open | 0 | 0 | 2023-08-28T14:28:24Z | 2023-09-19T01:53:18Z | MEMBER | 0 | pydata/xarray/pulls/8118 |
I don't think that we need to copy most API from Dataset / DataArray to …

```python
import dask.array as da
import numpy as np
import xarray as xr

coords = (
    xr.Coordinates(
        coords={"x": da.arange(100_000_000), "y": np.arange(100)},
        indexes={},
    )
    .set_xindex("x", DaskIndex)
    .set_xindex("y", xr.indexes.PandasIndex)
)

ds = xr.Dataset(coords=coords)
# <xarray.Dataset>
# Dimensions:  (x: 100000000, y: 100)
# Coordinates:
#   * x        (x) int64 dask.array<chunksize=(16777216,), meta=np.ndarray>
#   * y        (y) int64 0 1 2 3 4 5 6 7 8 9 10 ... 90 91 92 93 94 95 96 97 98 99
# Data variables:
#     *empty*
# Indexes:
#     x        DaskIndex
```
 |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8118/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
1890893841 | I_kwDOAMm_X85wtMAR | 8171 | Fancy reprs | benbovy 4160723 | open | 0 | 10 | 2023-09-11T16:46:43Z | 2023-09-15T21:07:52Z | MEMBER | What is your issue? In Xarray we already have the plain-text and html reprs, which is great. Recently I've tried anywidget and I think it has the potential to overcome some of the limitations of the current repr and possibly go well beyond it. The main advantages of anywidget:
I don't think we should replace the current html repr (it is still useful to have a basic, pure HTML/CSS version), but having a new widget could improve some aspects like not including the whole CSS each time an object repr is displayed, removing some HTML/CSS hacks... and actually has much more potential since we would have the whole javascript ecosystem at our fingertips (quick plots, etc.). Also bi-directional communication with Python is possible. I'm opening this issue to brainstorm about what would be nice to have in widget-based Xarray reprs:
cc @pydata/xarray |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8171/reactions", "total_count": 5, "+1": 3, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 2, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
1889195671 | I_kwDOAMm_X85wmtaX | 8166 | Dataset.from_dataframe: deprecate expanding the multi-index | benbovy 4160723 | open | 0 | 3 | 2023-09-10T15:54:31Z | 2023-09-11T06:20:50Z | MEMBER | What is your issue? Let's continue the discussion here about changing the behavior of Dataset.from_dataframe (see https://github.com/pydata/xarray/pull/8140#issuecomment-1712485626).
If we no longer unstack the multi-index in …

```python
ds = xr.Dataset(
    {"foo": (("x", "y"), [[1, 2], [3, 4]])},
    coords={"x": ["a", "b"], "y": [1, 2]},
)
df = ds.to_dataframe()
ds2 = xr.Dataset.from_dataframe(df, dim="z")

ds2.identical(ds)  # False
ds2.unstack("z").identical(ds)  # True
```

cc @max-sixty @dcherian |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8166/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
1889751633 | PR_kwDOAMm_X85Z-5v1 | 8170 | Dataset.from_dataframe: optionally keep multi-index unexpanded | benbovy 4160723 | open | 0 | 0 | 2023-09-11T06:20:17Z | 2023-09-11T06:20:17Z | MEMBER | 1 | pydata/xarray/pulls/8170 |
I added both the
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8170/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
1880184915 | PR_kwDOAMm_X85ZespA | 8143 | Deprecate the multi-index dimension coordinate | benbovy 4160723 | open | 0 | 0 | 2023-09-04T12:32:36Z | 2023-09-04T12:32:48Z | MEMBER | 0 | pydata/xarray/pulls/8143 |
This PR adds a …

```python
import xarray as xr

ds = xr.Dataset(coords={"x": ["a", "b"], "y": [1, 2]})

ds.stack(z=["x", "y"])
# <xarray.Dataset>
# Dimensions:  (z: 4)
# Coordinates:
#   * z        (z) object MultiIndex
#   * x        (z) <U1 'a' 'a' 'b' 'b'
#   * y        (z) int64 1 2 1 2
# Data variables:
#     *empty*

with xr.set_options(future_no_mindex_dim_coord=True):
    ds.stack(z=["x", "y"])
# <xarray.Dataset>
# Dimensions:  (z: 4)
# Coordinates:
#   * x        (z) <U1 'a' 'a' 'b' 'b'
#   * y        (z) int64 1 2 1 2
# Dimensions without coordinates: z
# Data variables:
#     *empty*
```

There are a few other things that we'll need to adapt or deprecate:
I started updating the tests, although this will be much easier once #8140 is merged. This is something that we could also easily split into multiple PRs. It is probably OK if some features are (temporarily) breaking badly when setting |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8143/reactions", "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 1, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
1874412700 | PR_kwDOAMm_X85ZLe24 | 8124 | More flexible index variables | benbovy 4160723 | open | 0 | 0 | 2023-08-30T21:45:12Z | 2023-08-31T16:02:20Z | MEMBER | 1 | pydata/xarray/pulls/8124 |
The goal of this PR is to provide a more general solution to indexed coordinate variables, i.e., support arbitrary dimensions and/or duck arrays for those variables while at the same time preventing them from being updated in a way that would invalidate their index. This would solve problems like the one mentioned here: https://github.com/pydata/xarray/issues/1650#issuecomment-1697237429 @shoyer I've tried to implement what you suggested in https://github.com/pydata/xarray/pull/4979#discussion_r589798510. It would indeed be nice if we could eventually get rid of … So the approach implemented in this PR is to keep using … The latter solution (wrapper) doesn't always work nicely, though. For example, several methods of … More generally, which operations should we allow / forbid / skip for an indexed coordinate variable?
(Note: we could add cc @andersy005 (some changes made here may conflict with what you are refactoring in #8075). |
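One way to "forbid" in-place updates that would invalidate an index is a thin write-guard around the coordinate data. A minimal illustrative sketch (the class name and error message are made up; this is not the wrapper discussed in the PR):

```python
import numpy as np


class IndexedArrayGuard:
    """Read-only view over index-backed coordinate data (illustrative only)."""

    def __init__(self, data):
        self._data = np.asarray(data)

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        raise ValueError(
            "cannot modify an indexed coordinate in place; "
            "build a new index from new coordinate values instead"
        )

    def __array__(self, dtype=None, copy=None):
        # hand out a copy so callers cannot mutate the guarded buffer
        return np.array(self._data, dtype=dtype)


guard = IndexedArrayGuard([1, 2, 3])
try:
    guard[0] = 99
except ValueError:
    write_blocked = True
```

Reads and coercion to numpy still work; only writes through the guard are rejected, which is the invariant an index needs.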
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8124/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
1875631817 | PR_kwDOAMm_X85ZPnjq | 8128 | Add Index.load() and Index.chunk() methods | benbovy 4160723 | open | 0 | 0 | 2023-08-31T14:16:27Z | 2023-08-31T15:49:06Z | MEMBER | 1 | pydata/xarray/pulls/8128 |
As mentioned in #8124, it gives more control to custom Xarray indexes on what best to do when the Dataset / DataArray
For a DaskIndex, we might want to return a PandasIndex (or another non-lazy index) from |
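A pure-Python sketch of the `load()` / `chunk()` hook pattern described above; `DaskIndexStub` and `PandasIndexStub` are made-up stand-ins, not xarray classes:

```python
class Index:
    """Base hooks: by default load() and chunk() return the index unchanged."""

    def load(self):
        return self

    def chunk(self, chunks):
        return self


class PandasIndexStub(Index):
    """Stand-in for an eager, in-memory index."""

    def __init__(self, values):
        self.values = list(values)


class DaskIndexStub(Index):
    """Stand-in for a lazy index; load() materializes a non-lazy index."""

    def __init__(self, compute):
        self._compute = compute  # callable producing the index values

    def load(self):
        return PandasIndexStub(self._compute())


idx = DaskIndexStub(lambda: range(3))
loaded = idx.load()
```

The point of the hook is exactly this swap: a lazy index can decide to come back from `load()` as a different, eager index type.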
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8128/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
1412901282 | PR_kwDOAMm_X85A_96j | 7182 | add MultiPandasIndex helper class | benbovy 4160723 | open | 0 | 2 | 2022-10-18T09:42:58Z | 2023-08-23T16:30:28Z | MEMBER | 1 | pydata/xarray/pulls/7182 |
This PR adds a … Early prototype in this notebook. TODO / TO FIX:
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7182/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
1364388790 | I_kwDOAMm_X85RUuu2 | 7002 | Custom indexes and coordinate (re)ordering | benbovy 4160723 | open | 0 | 2 | 2022-09-07T09:44:12Z | 2023-08-23T14:35:32Z | MEMBER | What is your issue? (From https://github.com/pydata/xarray/issues/5647#issuecomment-946546464.) The current alignment logic (as refactored in #5692) requires that two compatible indexes (i.e., of the same type) must relate to one or more coordinates with matching names and in a matching order. For some multi-coordinate indexes like … Possible options:
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7002/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
1812008663 | I_kwDOAMm_X85sAQ7X | 8002 | Improve discoverability of index build options | benbovy 4160723 | open | 0 | 2 | 2023-07-19T13:54:09Z | 2023-07-19T17:48:51Z | MEMBER | Is your feature request related to a problem? Currently … Describe the solution you'd like: What about something like this?

```python
ds.set_xindex("x", MyCustomIndex.with_options(foo=1, bar=True))

# or
ds.set_xindex("x", *MyCustomIndex.with_options(foo=1, bar=True))
```

This would require adding a …

```python
# xarray.core.indexes

class Index:
    @classmethod
    def with_options(cls) -> tuple[type[Self], dict[str, Any]]:
        return cls, {}
```

```python
# third-party code

from xarray.indexes import Index

class MyCustomIndex(Index):
    ...
```

Thoughts? Describe alternatives you've considered: Build options are also likely defined in the Index constructor, e.g.,

```python
# third-party code

from xarray.indexes import Index

class MyCustomIndex(Index):
    ...
```

However, the Index constructor is not public API (it is only used internally and indirectly in Xarray when setting a new index from existing coordinates). Any other idea? Additional context: No response |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8002/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
1151751524 | I_kwDOAMm_X85EplVk | 6308 | xr.doctor(): diagnostics on a Dataset / DataArray ? | benbovy 4160723 | open | 0 | 4 | 2022-02-26T12:10:07Z | 2022-11-07T15:28:35Z | MEMBER | Is your feature request related to a problem? Recently I've been reading through various issue reports here and there (GH issues and discussions, forums, etc.) and I'm wondering if it wouldn't be useful to have some function in Xarray that inspects a Dataset or DataArray and reports a bunch of diagnostics, so that the community could better help troubleshoot performance or other issues faced by users. It's not always obvious where to look (e.g., number of chunks of a dask array, number of tasks of a dask graph, etc.) to diagnose issues, sometimes even for experienced users. Describe the solution you'd like: A …
Describe alternatives you've considered: None. Additional context: No response |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6308/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
1364798843 | PR_kwDOAMm_X84-hLRI | 7004 | Rework PandasMultiIndex.sel internals | benbovy 4160723 | open | 0 | 2 | 2022-09-07T14:57:29Z | 2022-09-22T20:38:41Z | MEMBER | 0 | pydata/xarray/pulls/7004 |
This PR hopefully improves how the labels provided for multi-index level coordinates are handled in … More specifically, slices are handled in a cleaner way and it is now allowed to provide array-like labels.
This yields a predictable behavior: as soon as one of the provided labels is a slice or array-like, the multi-index and all its level coordinates are kept in the result. Some cases are illustrated below (I compare this PR with an older release due to the errors reported in #6838):

```python
import xarray as xr
import pandas as pd

midx = pd.MultiIndex.from_product([list("abc"), range(4)], names=("one", "two"))
ds = xr.Dataset(coords={"x": midx})
# <xarray.Dataset>
# Dimensions:  (x: 12)
# Coordinates:
#   * x        (x) object MultiIndex
#   * one      (x) object 'a' 'a' 'a' 'a' 'b' 'b' 'b' 'b' 'c' 'c' 'c' 'c'
#   * two      (x) int64 0 1 2 3 0 1 2 3 0 1 2 3
# Data variables:
#     *empty*
```

```python
ds.sel(one="a", two=0)

# this PR
# <xarray.Dataset>
# Dimensions:  ()
# Coordinates:
#     x        object ('a', 0)
#     one      <U1 'a'
#     two      int64 0
# Data variables:
#     *empty*

# v2022.3.0
# <xarray.Dataset>
# Dimensions:  ()
# Coordinates:
#     x        object ('a', 0)
# Data variables:
#     *empty*
```

```python
ds.sel(one="a")

# this PR
# <xarray.Dataset>
# Dimensions:  (two: 4)
# Coordinates:
#   * two      (two) int64 0 1 2 3
#     one      <U1 'a'
# Data variables:
#     *empty*

# v2022.3.0
# <xarray.Dataset>
# Dimensions:  (two: 4)
# Coordinates:
#   * two      (two) int64 0 1 2 3
# Data variables:
#     *empty*
```

```python
ds.sel(one=slice("a", "b"))

# this PR
# <xarray.Dataset>
# Dimensions:  (x: 8)
# Coordinates:
#   * x        (x) object MultiIndex
#   * one      (x) object 'a' 'a' 'a' 'a' 'b' 'b' 'b' 'b'
#   * two      (x) int64 0 1 2 3 0 1 2 3
# Data variables:
#     *empty*

# v2022.3.0
# <xarray.Dataset>
# Dimensions:  (two: 8)
# Coordinates:
#   * two      (two) int64 0 1 2 3 0 1 2 3
# Data variables:
#     *empty*
```

```python
ds.sel(one="a", two=slice(1, 1))

# this PR
# <xarray.Dataset>
# Dimensions:  (x: 1)
# Coordinates:
#   * x        (x) object MultiIndex
#   * one      (x) object 'a'
#   * two      (x) int64 1
# Data variables:
#     *empty*

# v2022.3.0
# <xarray.Dataset>
# Dimensions:  (x: 1)
# Coordinates:
#   * x        (x) MultiIndex
#   - one      (x) object 'a'
#   - two      (x) int64 1
# Data variables:
#     *empty*
```

```python
ds.sel(one=["b", "c"], two=[0, 2])

# this PR
# <xarray.Dataset>
# Dimensions:  (x: 4)
# Coordinates:
#   * x        (x) object MultiIndex
#   * one      (x) object 'b' 'b' 'c' 'c'
#   * two      (x) int64 0 2 0 2
# Data variables:
#     *empty*

# v2022.3.0
# ValueError: Vectorized selection is not available along coordinate 'one' (multi-index level)
```
 |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7004/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
1325016510 | I_kwDOAMm_X85O-iW- | 6860 | Align with join='override' may update index coordinate metadata | benbovy 4160723 | open | 0 | 0 | 2022-08-01T21:45:13Z | 2022-08-01T21:49:41Z | MEMBER | What happened? It seems that … cf. @keewis' original https://github.com/pydata/xarray/pull/6857#discussion_r934425142. What did you expect to happen? Index coordinate metadata unaffected by alignment (i.e., metadata is passed through object -> aligned object for each object), like for align with other join methods. Minimal Complete Verifiable Example:

```python
import xarray as xr

ds1 = xr.Dataset(coords={"x": ("x", [1, 2, 3], {"foo": 1})})
ds2 = xr.Dataset(coords={"x": ("x", [1, 2, 3], {"bar": 2})})

aligned1, aligned2 = xr.align(ds1, ds2, join="override")

aligned1.x.attrs
# v2022.03.0 -> {'foo': 1}
# v2022.06.0 -> {'foo': 1, 'bar': 2}
# PR #6857   -> {'foo': 1}
# expected   -> {'foo': 1}

aligned2.x.attrs
# v2022.03.0 -> {}
# v2022.06.0 -> {'foo': 1, 'bar': 2}
# PR #6857   -> {'foo': 1, 'bar': 2}
# expected   -> {'bar': 2}

aligned11, aligned22 = xr.align(ds1, ds2, join="inner")

aligned11.x.attrs  # {'foo': 1}
aligned22.x.attrs  # {'bar': 2}
```

MVCE confirmation
Relevant log output: No response

Anything else we need to know?: No response

Environment:
INSTALLED VERSIONS
------------------
commit: None
python: 3.9.6 | packaged by conda-forge | (default, Jul 11 2021, 03:36:15)
[Clang 11.1.0 ]
python-bits: 64
OS: Darwin
OS-release: 20.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: (None, 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1
xarray: 0.21.2.dev137+g30023a484
pandas: 1.4.0
numpy: 1.22.2
scipy: 1.7.1
netCDF4: 1.5.8
pydap: installed
h5netcdf: 0.11.0
h5py: 3.4.0
Nio: None
zarr: 2.6.1
cftime: 1.5.2
nc_time_axis: 1.2.0
PseudoNetCDF: installed
rasterio: 1.2.10
cfgrib: 0.9.8.5
iris: 3.0.4
bottleneck: 1.3.2
dask: 2022.01.1
distributed: 2022.01.1
matplotlib: 3.4.3
cartopy: 0.20.1
seaborn: 0.11.1
numbagg: 0.2.1
fsspec: 0.8.5
cupy: None
pint: 0.16.1
sparse: 0.13.0
flox: None
numpy_groupies: None
setuptools: 57.4.0
pip: 20.2.4
conda: None
pytest: 6.2.5
IPython: 7.27.0
sphinx: 3.3.1
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6860/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
1005623261 | I_kwDOAMm_X8478Jfd | 5812 | Check explicit indexes when comparing two xarray objects | benbovy 4160723 | open | 0 | 2 | 2021-09-23T16:19:32Z | 2021-09-24T15:59:02Z | MEMBER | Is your feature request related to a problem? Please describe.
With the explicit index refactor, two Dataset or DataArray objects … Describe the solution you'd like:
I'd suggest that … One drawback is when we want to check either the attributes or the indexes but not both. Should we add options like suggested in #5733 then? |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5812/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
1006335177 | I_kwDOAMm_X847-3TJ | 5814 | Confusing assertion message when comparing datasets with differing coordinates | benbovy 4160723 | open | 0 | 1 | 2021-09-24T10:50:11Z | 2021-09-24T15:17:00Z | MEMBER | What happened:
When two datasets … What you expected to happen: An output assertion error message that shows only the differing coordinates. Minimal Complete Verifiable Example:

```python
Differing coordinates:
L * x        (x) int64 0 1
R * x        (x) int64 2 3
Differing data variables:
L   var      (x) float64 10.0 11.0
R   var      (x) float64 10.0 11.0
```

I would rather expect:

```python
Differing coordinates:
L * x        (x) int64 0 1
R * x        (x) int64 2 3
```

Anything else we need to know?: Environment: Output of <tt>xr.show_versions()</tt> INSTALLED VERSIONS ------------------ commit: None python: 3.9.6 | packaged by conda-forge | (default, Jul 11 2021, 03:36:15) [Clang 11.1.0 ] python-bits: 64 OS: Darwin OS-release: 20.3.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: None LOCALE: (None, 'UTF-8') libhdf5: 1.10.6 libnetcdf: 4.7.4 xarray: 0.19.1.dev72+ga8d84c703.d20210901 pandas: 1.3.2 numpy: 1.21.2 scipy: 1.7.1 netCDF4: 1.5.6 pydap: installed h5netcdf: 0.8.1 h5py: 3.3.0 Nio: None zarr: 2.6.1 cftime: 1.5.0 nc_time_axis: 1.2.0 PseudoNetCDF: installed rasterio: 1.2.1 cfgrib: 0.9.8.5 iris: 3.0.4 bottleneck: 1.3.2 dask: 2021.01.1 distributed: 2021.01.1 matplotlib: 3.4.3 cartopy: 0.18.0 seaborn: 0.11.1 numbagg: None fsspec: 0.8.5 cupy: None pint: 0.16.1 sparse: 0.11.2 setuptools: 57.4.0 pip: 20.2.4 conda: None pytest: 6.2.5 IPython: 7.27.0 sphinx: 3.3.1 |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5814/reactions", "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
977149831 | MDU6SXNzdWU5NzcxNDk4MzE= | 5732 | Coordinates implicitly created when passing a DataArray as coord to Dataset constructor | benbovy 4160723 | open | 0 | 3 | 2021-08-23T15:20:37Z | 2021-08-24T14:18:09Z | MEMBER | I stumbled on this while working on #5692. Is this intended behavior or an unwanted side effect? What happened: Creating a new Dataset by passing a DataArray object as a coordinate also adds the DataArray's coordinates to the dataset:
What you expected to happen: The behavior above seems a bit counter-intuitive to me. I would rather expect no additional coordinates auto-magically added to the dataset, i.e. only one …
Environment: Output of <tt>xr.show_versions()</tt>INSTALLED VERSIONS ------------------ commit: None python: 3.8.6 | packaged by conda-forge | (default, Nov 27 2020, 19:17:44) [Clang 11.0.0 ] python-bits: 64 OS: Darwin OS-release: 20.3.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: None LOCALE: (None, 'UTF-8') libhdf5: 1.10.6 libnetcdf: 4.7.4 xarray: 0.19.0 pandas: 1.1.5 numpy: 1.21.1 scipy: 1.7.0 netCDF4: 1.5.5.1 pydap: None h5netcdf: None h5py: None Nio: None zarr: None cftime: 1.3.1 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2021.07.2 distributed: 2021.07.2 matplotlib: 3.3.3 cartopy: 0.19.0.post1 seaborn: None numbagg: None pint: None setuptools: 49.6.0.post20201009 pip: 20.3.1 conda: None pytest: 6.1.2 IPython: 7.25.0 sphinx: 3.3.1 |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5732/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
902009258 | MDU6SXNzdWU5MDIwMDkyNTg= | 5376 | Multi-scale datasets and custom indexes | benbovy 4160723 | open | 0 | 6 | 2021-05-26T08:38:00Z | 2021-06-02T08:07:38Z | MEMBER | I've been wondering if:
I'm thinking of an API that would look like this:

```python
# lazily load a big n-d image (full resolution) as a xarray.Dataset
xyz_dataset = ...

# set a new index for the x/y/z coordinates
(
```
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5376/reactions", "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 1 } |
xarray 13221727 | issue | ||||||||
869721207 | MDU6SXNzdWU4Njk3MjEyMDc= | 5226 | Attributes encoding compatibility between backends | benbovy 4160723 | open | 0 | 1 | 2021-04-28T09:11:19Z | 2021-04-28T15:42:42Z | MEMBER | What happened: Let's create a Zarr dataset with some "less common" dtype and fill value, open it with Xarray and save the dataset as NetCDF:

```python
import xarray as xr
import zarr

g = zarr.group()
g.create('arr', shape=3, fill_value='z', dtype='<U1')
g['arr'].attrs['_ARRAY_DIMENSIONS'] = ('dim_1')

# -- without masking fill values
ds = xr.open_zarr(g.store, mask_and_scale=False)
ds.arr.attrs  # returns {'_FillValue': 'z'}

# error: netCDF4 does not yet support setting a fill value for variable-length strings
ds.to_netcdf('test.nc')

# -- with masking fill values
ds2 = xr.open_zarr(g.store, mask_and_scale=True)
ds2.arr.encoding  # returns a dict that includes the item '_FillValue': 'z'

# same error as above
ds2.to_netcdf('out2.nc')
```

What you expected to happen: Seamless conversion (read/write) from one backend to another. Is there anything we could do to improve the case shown above, and maybe other cases like the one described in #5223? Environment: Output of <tt>xr.show_versions()</tt> INSTALLED VERSIONS ------------------ commit: None libhdf5: None libnetcdf: None xarray: 0.17.0 pandas: 1.0.3 numpy: 1.18.1 scipy: 1.3.1 netCDF4: None pydap: None h5netcdf: None h5py: None Nio: None zarr: 2.8.1 cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2.11.0 distributed: 2.14.0 matplotlib: 3.1.1 cartopy: None seaborn: None numbagg: None pint: None setuptools: 46.1.3.post20200325 pip: 19.2.3 conda: None pytest: 5.4.1 IPython: 7.13.0 sphinx: None |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5226/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
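A possible interim workaround for the roundtrip problem above is to drop the string `_FillValue` from a variable's attributes before writing to NetCDF. A minimal sketch — the helper name is made up here and this is not an xarray API:

```python
def strip_str_fill_value(attrs, dtype_kind):
    """Return a copy of `attrs` without _FillValue for unicode ('U') dtypes.

    netCDF4 cannot encode a fill value for variable-length strings, so the
    attribute set when reading from the Zarr store has to be removed before
    calling to_netcdf().
    """
    if dtype_kind == "U" and "_FillValue" in attrs:
        return {k: v for k, v in attrs.items() if k != "_FillValue"}
    return dict(attrs)

# With xarray this could be applied per variable before ds.to_netcdf(...):
#     for var in ds.variables.values():
#         var.attrs = strip_str_fill_value(var.attrs, var.dtype.kind)
#         var.encoding.pop("_FillValue", None)
```

This only papers over the incompatibility on the write side; a proper fix would likely live in the backend encoding logic.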
733077617 | MDU6SXNzdWU3MzMwNzc2MTc= | 4555 | Vectorized indexing (isel) of chunked data with 1D indices gives weird chunks | benbovy 4160723 | open | 0 | 1 | 2020-10-30T10:55:33Z | 2021-03-02T17:36:48Z | MEMBER | What happened: Applying What you expected to happen: More consistent chunk sizes. Minimal Complete Verifiable Example: Let's create a chunked DataArray ```python In [1]: import numpy as np In [2]: import xarray as xr In [3]: da = xr.DataArray(np.random.rand(100), dims='points').chunk(50) In [4]: da Out[4]: <xarray.DataArray (points: 100)> dask.array<xarray-<this-array>, shape=(100,), dtype=float64, chunksize=(50,), chunktype=numpy.ndarray> Dimensions without coordinates: points ``` Selecting random indices results in a lot of small chunks ```python In [5]: indices = xr.Variable('nodes', np.random.choice(np.arange(100, dtype='int'), size=10)) In [6]: da_sel = da.isel(points=indices) In [7]: da_sel.chunks Out[7]: ((1, 1, 3, 1, 1, 3),) ``` What I would expect
This works fine with 2+ dimensional indexers, e.g., ```python In [9]: indices_2d = xr.Variable(('x', 'y'), np.random.choice(np.arange(100), size=(10, 10))) In [10]: da_sel_2d = da.isel(points=indices_2d) In [11]: da_sel_2d.chunks Out[11]: ((10,), (10,)) ``` Anything else we need to know?: I suspect the issue is here: In the example above I think we still want vectorized indexing (i.e., call Environment: Output of <tt>xr.show_versions()</tt>INSTALLED VERSIONS ------------------ commit: None python: 3.8.3 | packaged by conda-forge | (default, Jun 1 2020, 17:21:09) [Clang 9.0.1 ] python-bits: 64 OS: Darwin OS-release: 18.7.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: None.UTF-8 libhdf5: None libnetcdf: None xarray: 0.16.1 pandas: 1.1.3 numpy: 1.19.1 scipy: 1.5.2 netCDF4: None pydap: None h5netcdf: None h5py: None Nio: None zarr: None cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2.19.0 distributed: 2.25.0 matplotlib: 3.3.1 cartopy: None seaborn: None numbagg: None pint: None setuptools: 47.3.1.post20200616 pip: 20.1.1 conda: None pytest: 5.4.3 IPython: 7.16.1 sphinx: 3.2.1 |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/4555/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
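The fragmented chunks reported above are consistent with an orthogonal (outer) indexing code path that groups the 1-d indexer by the source chunk each index falls into, producing one output chunk per contiguous run. A minimal sketch of that grouping — the function is illustrative, not dask's actual implementation:

```python
import numpy as np

def outer_index_chunks(indices, chunk_size):
    """Output chunk sizes produced by splitting a 1-d integer indexer
    into contiguous runs that hit the same source chunk."""
    owners = np.asarray(indices) // chunk_size  # source chunk of each index
    # boundaries wherever the owning chunk changes, plus both ends
    edges = np.flatnonzero(np.r_[True, owners[1:] != owners[:-1], True])
    return tuple(np.diff(edges).tolist())

# e.g. five indices into a length-100 array chunked as (50, 50):
# outer_index_chunks([5, 60, 10, 20, 30], 50) -> (1, 1, 3)
```

With the truly vectorized path the result would instead be a single chunk of the indexer's length, matching the `((10,),)` behavior seen with 2-d indexers.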
187873247 | MDU6SXNzdWUxODc4NzMyNDc= | 1094 | Supporting out-of-core computation/indexing for very large indexes | benbovy 4160723 | open | 0 | 5 | 2016-11-08T00:56:56Z | 2021-01-26T20:09:12Z | MEMBER | (Follow-up of discussion here https://github.com/pydata/xarray/pull/1024#issuecomment-258524115). xarray + dask.array successfully enable out-of-core computation for very large variables that don't fit in memory. One current limitation is that the indexes of a However, this may be problematic in some specific cases where we have to deal with very large indexes. As an example, big unstructured meshes often have coordinates (x, y, z) arranged as 1-d arrays whose length equals the number of nodes, which can be very large!! (See, e.g., ugrid conventions). It would be very nice if xarray could also help for these use cases. Therefore I'm wondering if (and how) out-of-core support can be extended to indexes and indexing. I've briefly looked at the documentation on My knowledge of dask is very limited, though. So I've no doubt that this suggestion is very simplistic and not very efficient, or that there are better approaches. I'm also certainly missing other issues not directly related to indexing. Any thoughts? cc @shoyer @mrocklin
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1094/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue |
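The per-chunk lookup idea floated in the issue — index wrappers that materialize one block at a time instead of building a full in-memory index — could look roughly like this. The function name and block interface are made up for illustration:

```python
import numpy as np

def chunked_get_loc(blocks, label):
    """Find the global position of `label`, scanning one block at a time.

    Each element of `blocks` stands in for a lazily-loaded piece of a
    large 1-d coordinate; only the block currently being searched is
    materialized, so peak memory stays at one block.
    """
    offset = 0
    for block in blocks:
        arr = np.asarray(block)  # materialize this block only
        hits = np.flatnonzero(arr == label)
        if hits.size:
            return offset + int(hits[0])
        offset += arr.size
    raise KeyError(label)
```

A linear scan like this trades the O(1)/O(log n) lookup of an in-memory hash or sorted index for bounded memory; per-chunk min/max statistics could skip most blocks in practice.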
CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo] ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone] ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee] ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user] ON [issues] ([user]);
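The schema above can be exercised with the stdlib sqlite3 module. This sketch (with a trimmed column set, and rows taken from the listing above) reproduces the page's filter — state = "open" and user = 4160723 sorted by updated_at descending:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# trimmed-down version of the issues table for illustration
conn.execute(
    "CREATE TABLE issues (id INTEGER PRIMARY KEY, number INTEGER, "
    "title TEXT, user INTEGER, state TEXT, updated_at TEXT)"
)
rows = [
    (1389295853, 7099, "Pass arbitrary options to sel()", 4160723, "open", "2024-04-30T00:44:18Z"),
    (2227413822, 8911, "Refactor swap dims", 4160723, "open", "2024-04-17T16:46:34Z"),
    (869721207, 5226, "Attributes encoding compatibility between backends", 4160723, "open", "2021-04-28T15:42:42Z"),
]
conn.executemany("INSERT INTO issues VALUES (?, ?, ?, ?, ?, ?)", rows)

# ISO-8601 timestamps sort correctly as plain strings, so ORDER BY works
open_issues = conn.execute(
    "SELECT number, title FROM issues "
    "WHERE state = 'open' AND user = 4160723 "
    "ORDER BY updated_at DESC"
).fetchall()
# open_issues[0] -> (7099, "Pass arbitrary options to sel()")
```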