issues
104 rows where user = 4160723 sorted by updated_at descending
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at ▲ | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1389295853 | I_kwDOAMm_X85Szvjt | 7099 | Pass arbitrary options to sel() | benbovy 4160723 | open | 0 | 4 | 2022-09-28T12:44:52Z | 2024-04-30T00:44:18Z | MEMBER | Is your feature request related to a problem? Currently It would also be useful for custom indexes to expose their own selection options, e.g.,
From #3223, it would be nice if we could also pass distinct option values per index. What would be a good API for that? Describe the solution you'd like: Some ideas: A. Allow passing a tuple
B. Expose an
Option A does not look very readable. Option B is slightly better, although the nested dictionary is not great. Any other ideas? Some sort of context manager? Some Describe alternatives you've considered: The API proposed in #3223 would look great if Additional context: No response |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7099/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
2227413822 | PR_kwDOAMm_X85rz7ZX | 8911 | Refactor swap dims | benbovy 4160723 | open | 0 | 5 | 2024-04-05T08:45:49Z | 2024-04-17T16:46:34Z | MEMBER | 1 | pydata/xarray/pulls/8911 |
I've tried here re-implementing |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8911/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
2215059449 | PR_kwDOAMm_X85rJr7c | 8888 | to_base_variable: coerce multiindex data to numpy array | benbovy 4160723 | open | 0 | 3 | 2024-03-29T10:10:42Z | 2024-03-29T15:54:19Z | MEMBER | 0 | pydata/xarray/pulls/8888 |
@slevang this should also make your test case added in #8809 work. I haven't added it here; instead I added a basic check that should be enough. I don't really understand why the serialization backends (zarr?) do not seem to work with the |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8888/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
2101987013 | PR_kwDOAMm_X85lJbZW | 8672 | Fix multiindex level serialization after reset_index | benbovy 4160723 | closed | 0 | 6 | 2024-01-26T10:40:42Z | 2024-02-23T01:22:17Z | 2024-01-31T17:42:29Z | MEMBER | 0 | pydata/xarray/pulls/8672 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8672/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
915057433 | MDU6SXNzdWU5MTUwNTc0MzM= | 5452 | [community] Flexible indexes meeting | benbovy 4160723 | closed | 0 | 7 | 2021-06-08T13:32:16Z | 2024-02-15T01:39:08Z | 2024-02-15T01:39:08Z | MEMBER | In addition to the bi-weekly community developers meeting, we plan to have 30min meetings on a weekly basis -- every Tue 8:30-9:00 PDT (17:30-18:00 CEST) -- to discuss the flexible indexes refactor. Anyone from @pydata/xarray feel free to join! The first meeting is in a couple of hours. Zoom link (subject to change). |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5452/reactions", "total_count": 5, "+1": 5, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
1861543091 | I_kwDOAMm_X85u9OSz | 8097 | Documentation rendering issues (dark mode) | benbovy 4160723 | open | 0 | 2 | 2023-08-22T14:06:03Z | 2024-02-13T02:31:10Z | MEMBER | What is your issue? There are a couple of rendering issues in Xarray's documentation landing page, especially with the dark mode.
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8097/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
213004586 | MDU6SXNzdWUyMTMwMDQ1ODY= | 1303 | `xarray.core.variable.as_variable()` part of the public API? | benbovy 4160723 | closed | 0 | 5 | 2017-03-09T11:07:52Z | 2024-02-06T17:57:21Z | 2017-06-02T17:55:12Z | MEMBER | Is it safe to use I have a specific use case where this would be very useful. I'm working on a package that heavily uses and extends xarray for landscape evolution modeling, and inside a custom class for model parameters I want to be able to create Although I know that ```python import xarray as xr class Parameter(object):
``` I don't think it is a viable option to copy A workaround using only public API would be something like: ```python class Parameter(object):
``` but it feels a bit hacky. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1303/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
1864056633 | PR_kwDOAMm_X85YovK- | 8107 | Better default behavior of the Coordinates constructor | benbovy 4160723 | closed | 0 | 2 | 2023-08-23T21:42:51Z | 2024-02-04T18:32:42Z | 2023-08-31T07:35:47Z | MEMBER | 0 | pydata/xarray/pulls/8107 |
After working more on
This PR introduces a breaking change since |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8107/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1839199929 | PR_kwDOAMm_X85XUl4W | 8051 | Allow setting (or skipping) new indexes in open_dataset | benbovy 4160723 | open | 0 | 9 | 2023-08-07T10:53:46Z | 2024-02-03T19:12:48Z | MEMBER | 0 | pydata/xarray/pulls/8051 |
This PR introduces a new boolean parameter Currently works with the Zarr backend:
```python
import numpy as np
import xarray as xr

# example dataset (real dataset may be much larger)
arr = np.random.random(size=1_000_000)
xr.Dataset({"x": arr}).to_zarr("dataset.zarr")

xr.open_dataset("dataset.zarr", set_indexes=False, engine="zarr")
# <xarray.Dataset>
# Dimensions:  (x: 1000000)
# Coordinates:
#     x        (x) float64 ...
# Data variables:
#     *empty*

xr.open_zarr("dataset.zarr", set_indexes=False)
# <xarray.Dataset>
# Dimensions:  (x: 1000000)
# Coordinates:
#     x        (x) float64 ...
# Data variables:
#     *empty*
```
I'll add it to the other Xarray backends as well, but I'd like to get your thoughts about the API first.
Currently 1 and 2 are implemented in this PR, although as I write this comment I think that I would prefer 3. I guess this depends on whether we prefer |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8051/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
667864088 | MDU6SXNzdWU2Njc4NjQwODg= | 4285 | Awkward array backend? | benbovy 4160723 | open | 0 | 38 | 2020-07-29T13:53:45Z | 2023-12-30T18:47:48Z | MEMBER | Just curious if anyone here has thoughts on this. For more context: Awkward is like numpy but for arrays of very arbitrary (dynamic) structure. I don't know much yet about that library (I've just seen this SciPy 2020 presentation), but now I could imagine using xarray for dealing with labelled collections of geometrical / geospatial objects like polylines or polygons. At this stage, any integration between xarray and awkward arrays would be something highly experimental, but I think this might be an interesting case for flexible arrays (and possibly flexible indexes) mentioned in the roadmap. There is some discussion here: https://github.com/scikit-hep/awkward-1.0/issues/27. Does anyone see any other potential use case? cc @pydata/xarray |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/4285/reactions", "total_count": 6, "+1": 6, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
1989356758 | I_kwDOAMm_X852kyzW | 8447 | Improve discoverability of backend engine options | benbovy 4160723 | open | 0 | 5 | 2023-11-12T11:14:56Z | 2023-12-12T20:30:28Z | MEMBER | Is your feature request related to a problem? Backend engine options are not easily discoverable and we need to know them or figure them out before passing them as kwargs to Describe the solution you'd like: The solution is similar to the one proposed in #8002 for setting a new index. The API could look like this:
```python
import xarray as xr

ds = xr.open_dataset(
    file_or_obj,
    engine=xr.backends.engine("myengine").with_options(
        option1=True,
        option2=100,
    ),
)
```
where We would need to extend the API for
```python
class BackendEntrypoint:
    _open_dataset_options: dict[str, Any]
```
Such that
```python
class MyEngineBackendEntryPoint(BackendEntrypoint):
    open_dataset_parameters = ("option1", "option2")
```
Pros:
Cons:
Describe alternatives you've considered: A Additional context: cc @jsignell https://github.com/stac-utils/pystac/issues/846#issuecomment-1405758442 |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8447/reactions", "total_count": 4, "+1": 4, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
1148021907 | I_kwDOAMm_X85EbWyT | 6293 | Explicit indexes: next steps | benbovy 4160723 | open | 0 | 3 | 2022-02-23T12:19:38Z | 2023-12-01T09:34:28Z | MEMBER | #5692 is ~~not merged yet~~ now merged ~~but~~ and we can ~~already~~ start thinking about the next steps. I’m opening this issue to list and track the remaining tasks. @pydata/xarray, do not hesitate to add a comment below if you think of something that is missing here. Continue the refactoring of the internals: Although in #5692 everything seems to work with the current pandas index wrappers for dimension coordinates, not all of Xarray's internals have been refactored yet to fully support (or at least be compatible with) custom indexes. Here is a list of
I ended up following a common pattern in #5692 when adding explicit / flexible index support for various features (it is quite generic, though, the actual procedure may vary from one case to another and many steps may be skipped):
Relax all constraints related to “dimension (index) coordinates” in Xarray
Indexes repr
Public API for assigning and (re)setting indexes: There is no public API yet for creating and/or assigning existing indexes to Dataset and DataArray objects.
We still need to figure out how best we can (1) assign existing indexes (possibly with their coordinates) and (2) pass index build options. Other public API for index-based operations: To fully leverage the power and flexibility of custom indexes, we might want to update some parts of Xarray’s public API in order to allow passing arbitrary options per index. For example:
Also:
Documentation
Index types and helper classes built in Xarray
3rd party indexes
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6293/reactions", "total_count": 12, "+1": 6, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 6, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
1879109770 | PR_kwDOAMm_X85ZbILy | 8140 | Deprecate passing pd.MultiIndex implicitly | benbovy 4160723 | open | 0 | 23 | 2023-09-03T14:01:18Z | 2023-11-15T20:15:00Z | MEMBER | 0 | pydata/xarray/pulls/8140 |
This PR should normally raise a warning each time when indexed coordinates are created implicitly from a I updated the tests to create coordinates explicitly using I also refactored some parts where a
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8140/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
1865494976 | PR_kwDOAMm_X85Ytlq0 | 8111 | Alignment: allow flexible index coordinate order | benbovy 4160723 | open | 0 | 3 | 2023-08-24T16:18:49Z | 2023-09-28T15:58:38Z | MEMBER | 0 | pydata/xarray/pulls/8111 |
This PR relaxes some of the rules used in alignment for finding the indexes to compare or join together. Those indexes must still be of the same type and must relate to the same set of coordinates (and dimensions), but the order of coordinates is now ignored. It is up to the index to implement the equal / join logic if it needs to care about that order. Regarding
```python
import pandas as pd

# same level values, but level names given in a different order
midx = pd.MultiIndex.from_product([["a", "b"], [0, 1]], names=("one", "two"))
midx2 = pd.MultiIndex.from_product([["a", "b"], [0, 1]], names=("two", "one"))

midx.equals(midx2)  # True
```
However, in Xarray the names of the multi-index levels (and their order) matter since each level has its own xarray coordinate. In this PR, |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8111/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
1869879398 | PR_kwDOAMm_X85Y8P4c | 8118 | Add Coordinates `set_xindex()` and `drop_indexes()` methods | benbovy 4160723 | open | 0 | 0 | 2023-08-28T14:28:24Z | 2023-09-19T01:53:18Z | MEMBER | 0 | pydata/xarray/pulls/8118 |
I don't think that we need to copy most API from Dataset / DataArray to
```python
import dask.array as da
import numpy as np
import xarray as xr

# DaskIndex is a custom index class assumed to be defined elsewhere
coords = (
    xr.Coordinates(
        coords={"x": da.arange(100_000_000), "y": np.arange(100)},
        indexes={},
    )
    .set_xindex("x", DaskIndex)
    .set_xindex("y", xr.indexes.PandasIndex)
)

ds = xr.Dataset(coords=coords)
# <xarray.Dataset>
# Dimensions:  (x: 100000000, y: 100)
# Coordinates:
#   * x        (x) int64 dask.array<chunksize=(16777216,), meta=np.ndarray>
#   * y        (y) int64 0 1 2 3 4 5 6 7 8 9 10 ... 90 91 92 93 94 95 96 97 98 99
# Data variables:
#     *empty*
# Indexes:
#     x        DaskIndex
```
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8118/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
1890893841 | I_kwDOAMm_X85wtMAR | 8171 | Fancy reprs | benbovy 4160723 | open | 0 | 10 | 2023-09-11T16:46:43Z | 2023-09-15T21:07:52Z | MEMBER | What is your issue? In Xarray we already have the plain-text and html reprs, which is great. Recently, I've tried anywidget and I think that it has potential to overcome some of the limitations of the current repr and possibly go well beyond it. The main advantages of anywidget:
I don't think we should replace the current html repr (it is still useful to have a basic, pure HTML/CSS version), but having a new widget could improve some aspects like not including the whole CSS each time an object repr is displayed, removing some HTML/CSS hacks... and actually has much more potential since we would have the whole javascript ecosystem at our fingertips (quick plots, etc.). Also bi-directional communication with Python is possible. I'm opening this issue to brainstorm about what would be nice to have in widget-based Xarray reprs:
cc @pydata/xarray |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8171/reactions", "total_count": 5, "+1": 3, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 2, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
1889195671 | I_kwDOAMm_X85wmtaX | 8166 | Dataset.from_dataframe: deprecate expanding the multi-index | benbovy 4160723 | open | 0 | 3 | 2023-09-10T15:54:31Z | 2023-09-11T06:20:50Z | MEMBER | What is your issue? Let's continue here the discussion about changing the behavior of Dataset.from_dataframe (see https://github.com/pydata/xarray/pull/8140#issuecomment-1712485626).
If we no longer unstack the multi-index in
```python
import xarray as xr

ds = xr.Dataset(
    {"foo": (("x", "y"), [[1, 2], [3, 4]])},
    coords={"x": ["a", "b"], "y": [1, 2]},
)
df = ds.to_dataframe()

# `dim` is the option discussed in this issue
ds2 = xr.Dataset.from_dataframe(df, dim="z")

ds2.identical(ds)  # False
ds2.unstack("z").identical(ds)  # True
```
cc @max-sixty @dcherian |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8166/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
1889751633 | PR_kwDOAMm_X85Z-5v1 | 8170 | Dataset.from_dataframe: optionally keep multi-index unexpanded | benbovy 4160723 | open | 0 | 0 | 2023-09-11T06:20:17Z | 2023-09-11T06:20:17Z | MEMBER | 1 | pydata/xarray/pulls/8170 |
I added both the
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8170/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
1879864306 | PR_kwDOAMm_X85ZdmTF | 8142 | Dirty workaround for mypy 1.5 error | benbovy 4160723 | closed | 0 | 8 | 2023-09-04T09:21:18Z | 2023-09-07T16:04:55Z | 2023-09-07T08:21:12Z | MEMBER | 0 | pydata/xarray/pulls/8142 | I wanted to fix the following error with mypy 1.5:
Which looks similar to https://github.com/python/mypy/issues/9319. It is weird that here it worked with mypy versions < 1.5, though. I don't know if there is a better fix, but I thought that redefining |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8142/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1879652439 | PR_kwDOAMm_X85Zc4ub | 8141 | Fix doctests: pandas 2.1 MultiIndex repr with nan | benbovy 4160723 | closed | 0 | 0 | 2023-09-04T07:08:55Z | 2023-09-05T08:35:37Z | 2023-09-05T08:35:36Z | MEMBER | 0 | pydata/xarray/pulls/8141 | { "url": "https://api.github.com/repos/pydata/xarray/issues/8141/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
1880184915 | PR_kwDOAMm_X85ZespA | 8143 | Deprecate the multi-index dimension coordinate | benbovy 4160723 | open | 0 | 0 | 2023-09-04T12:32:36Z | 2023-09-04T12:32:48Z | MEMBER | 0 | pydata/xarray/pulls/8143 |
This PR adds a
```python
import xarray as xr

ds = xr.Dataset(coords={"x": ["a", "b"], "y": [1, 2]})

ds.stack(z=["x", "y"])
# <xarray.Dataset>
# Dimensions:  (z: 4)
# Coordinates:
#   * z        (z) object MultiIndex
#   * x        (z) <U1 'a' 'a' 'b' 'b'
#   * y        (z) int64 1 2 1 2
# Data variables:
#     *empty*

with xr.set_options(future_no_mindex_dim_coord=True):
    ds.stack(z=["x", "y"])
# <xarray.Dataset>
# Dimensions:  (z: 4)
# Coordinates:
#   * x        (z) <U1 'a' 'a' 'b' 'b'
#   * y        (z) int64 1 2 1 2
# Dimensions without coordinates: z
# Data variables:
#     *empty*
```
There are a few other things that we'll need to adapt or deprecate:
I started updating the tests, although this will be much easier once #8140 is merged. This is something that we could also easily split into multiple PRs. It is probably OK if some features are (temporarily) breaking badly when setting |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8143/reactions", "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 1, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
1862912829 | PR_kwDOAMm_X85Yk15B | 8102 | Add `Coordinates.assign()` method | benbovy 4160723 | closed | 0 | 0 | 2023-08-23T09:15:51Z | 2023-09-01T13:28:16Z | 2023-09-01T13:28:16Z | MEMBER | 0 | pydata/xarray/pulls/8102 |
This is consistent with the Dataset and DataArray This allows writing:
```python
import pandas as pd
import xarray as xr

midx = pd.MultiIndex.from_arrays([["a", "a", "b", "b"], [0, 1, 0, 1]])
midx_coords = xr.Coordinates.from_pandas_multiindex(midx, "x")

ds = xr.Dataset(coords=midx_coords.assign(y=[1, 2]))
```
which is quite common (at least in the tests) and a bit nicer than
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8102/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1874412700 | PR_kwDOAMm_X85ZLe24 | 8124 | More flexible index variables | benbovy 4160723 | open | 0 | 0 | 2023-08-30T21:45:12Z | 2023-08-31T16:02:20Z | MEMBER | 1 | pydata/xarray/pulls/8124 |
The goal of this PR is to provide a more general solution to indexed coordinate variables, i.e., support arbitrary dimensions and/or duck arrays for those variables while at the same time prevent them from being updated in a way that would invalidate their index. This would solve problems like the one mentioned here: https://github.com/pydata/xarray/issues/1650#issuecomment-1697237429 @shoyer I've tried to implement what you have suggested in https://github.com/pydata/xarray/pull/4979#discussion_r589798510. It would be nice indeed if eventually we could get rid of So the approach implemented in this PR is to keep using The latter solution (wrapper) doesn't always work nicely, though. For example, several methods of More generally, which operations should we allow / forbid / skip for an indexed coordinate variable?
(Note: we could add cc @andersy005 (some changes made here may conflict with what you are refactoring in #8075). |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8124/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
1875631817 | PR_kwDOAMm_X85ZPnjq | 8128 | Add Index.load() and Index.chunk() methods | benbovy 4160723 | open | 0 | 0 | 2023-08-31T14:16:27Z | 2023-08-31T15:49:06Z | MEMBER | 1 | pydata/xarray/pulls/8128 |
As mentioned in #8124, it gives more control to custom Xarray indexes on what best to do when the Dataset / DataArray
For a DaskIndex, we might want to return a PandasIndex (or another non-lazy index) from |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8128/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
180638999 | MDExOlB1bGxSZXF1ZXN0ODc3MTUzMDM= | 1028 | Add `set_index`, `reset_index` and `reorder_levels` methods | benbovy 4160723 | closed | 0 | 8 | 2016-10-03T13:22:24Z | 2023-08-30T09:28:26Z | 2016-12-27T17:03:00Z | MEMBER | 0 | pydata/xarray/pulls/1028 | Another item in #719. I added tests and updated the docs, so this is ready for review. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1028/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1864650372 | PR_kwDOAMm_X85YqtUk | 8109 | Better error message when trying to set an index from a scalar coordinate | benbovy 4160723 | closed | 0 | 0 | 2023-08-24T08:18:13Z | 2023-08-30T09:27:27Z | 2023-08-30T07:13:15Z | MEMBER | 0 | pydata/xarray/pulls/8109 |
The message suggests using |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8109/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
966983801 | MDExOlB1bGxSZXF1ZXN0NzA5MTg3NDY2 | 5692 | Explicit indexes | benbovy 4160723 | closed | 0 | 46 | 2021-08-11T15:57:41Z | 2023-08-30T09:26:37Z | 2022-03-17T17:11:44Z | MEMBER | 0 | pydata/xarray/pulls/5692 |
Follow-up on #5636 (work in progress), supersedes #2195. This is likely to be going big, sorry in advance! It'll be safer to make a release before merging this PR. Current progress:
TODO:
In next PRs:
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5692/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
953235338 | MDExOlB1bGxSZXF1ZXN0Njk3MzA3NDc3 | 5636 | Refactor index vs. coordinate variable(s) | benbovy 4160723 | closed | 0 | 4 | 2021-07-26T19:54:25Z | 2023-08-30T09:21:55Z | 2021-08-09T07:56:56Z | MEMBER | 0 | pydata/xarray/pulls/5636 |
This implements option 3 (sort of) described in https://github.com/pydata/xarray/issues/5553#issue-933551030:
This is very much work in progress, I need to update (or revert) all related parts of Xarray's internals, update tests, etc. At this stage any comment on the approach described above is welcome. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5636/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1485037066 | PR_kwDOAMm_X85Ez9Gj | 7368 | Expose "Coordinates" as part of Xarray's public API | benbovy 4160723 | closed | 0 | 31 | 2022-12-08T16:59:29Z | 2023-08-30T09:11:57Z | 2023-07-21T20:40:03Z | MEMBER | 0 | pydata/xarray/pulls/7368 |
This is a rework of #7214. It follows the suggestions made in https://github.com/pydata/xarray/pull/7214#issuecomment-1295283938, https://github.com/pydata/xarray/pull/7214#issuecomment-1297046405 and https://github.com/pydata/xarray/pull/7214#issuecomment-1293774799:
EDIT: EDIT2: it ended up as a pretty big refactor with the promotion of Some examples of usage:
```python
import pandas as pd
import xarray as xr

midx = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=("one", "two"))

coords = xr.Coordinates.from_pandas_multiindex(midx, "x")
# Coordinates:
#   * x        (x) object MultiIndex
#   * one      (x) object 'a' 'a' 'b' 'b'
#   * two      (x) int64 1 2 1 2

ds = xr.Dataset(coords=coords)
# <xarray.Dataset>
# Dimensions:  (x: 4)
# Coordinates:
#   * x        (x) object MultiIndex
#   * one      (x) object 'a' 'a' 'b' 'b'
#   * two      (x) int64 1 2 1 2
# Data variables:
#     *empty*

ds_to_be_deprecated = xr.Dataset(coords={"x": midx})
ds_to_be_deprecated.identical(ds)
# True

da = xr.DataArray([1, 2, 3, 4], dims="x", coords=ds.coords)
# <xarray.DataArray (x: 4)>
# array([1, 2, 3, 4])
# Coordinates:
#   * x        (x) object MultiIndex
#   * one      (x) object 'a' 'a' 'b' 'b'
#   * two      (x) int64 1 2 1 2
```
TODO:
@shoyer, @dcherian, anyone -- what do you think about the approach proposed here? I'd like to check that with you before going further with tests, docs, etc. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7368/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1422543378 | PR_kwDOAMm_X85BgRaG | 7214 | Pass indexes directly to the DataArray and Dataset constructors | benbovy 4160723 | closed | 0 | 17 | 2022-10-25T14:16:44Z | 2023-08-30T09:11:56Z | 2023-07-18T11:52:11Z | MEMBER | 1 | pydata/xarray/pulls/7214 |
From https://github.com/pydata/xarray/issues/6392#issuecomment-1290454937: I'm thinking of only accepting one or more instances of Indexes as indexes argument in the Dataset and DataArray constructors. The only exception is when
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7214/reactions", "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 1, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1863646946 | PR_kwDOAMm_X85YnWau | 8104 | Fix merge with compat=minimal (coord names) | benbovy 4160723 | closed | 0 | 0 | 2023-08-23T16:20:48Z | 2023-08-30T09:11:18Z | 2023-08-30T07:57:35Z | MEMBER | 0 | pydata/xarray/pulls/8104 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8104/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1358841264 | PR_kwDOAMm_X84-NgIX | 6975 | Add documentation on custom indexes | benbovy 4160723 | closed | 0 | 9 | 2022-09-01T13:20:00Z | 2023-08-30T09:10:34Z | 2023-07-17T23:23:22Z | MEMBER | 0 | pydata/xarray/pulls/6975 | This PR documents the API of the @pydata/xarray your feedback would be very much appreciated! I've been into this for quite some time, so there may be things that seem obvious to me but that you can still find very confusing or non-intuitive. It would then deserve some extra or better explanation. More specifically, I'm open to any suggestion on how to better illustrate this with clear and succinct examples. There are other parts of the documentation that still need to be updated regarding the indexes refactor (e.g., "dimension" coordinates, |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6975/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1859437888 | PR_kwDOAMm_X85YY-II | 8094 | Refactor update coordinates to better handle multi-coordinate indexes | benbovy 4160723 | closed | 0 | 4 | 2023-08-21T13:57:38Z | 2023-08-30T09:06:28Z | 2023-08-29T14:23:29Z | MEMBER | 0 | pydata/xarray/pulls/8094 |
This refactor should better handle multi-coordinate indexes when updating (or assigning) new coordinates. It also fixes, better isolates and better warns a bunch of deprecated pandas multi-index special cases (i.e., directly passing |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8094/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1498386428 | PR_kwDOAMm_X85FiyaY | 7382 | Some alignment optimizations | benbovy 4160723 | closed | 0 | 4 | 2022-12-15T12:54:56Z | 2023-08-30T09:05:24Z | 2023-01-05T21:25:55Z | MEMBER | 0 | pydata/xarray/pulls/7382 |
May fix some performance regressions, e.g., see https://github.com/pydata/xarray/issues/7376#issuecomment-1352989233. @ravwojdyla with this PR |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7382/reactions", "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 1, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1362148668 | PR_kwDOAMm_X84-YVgW | 6992 | Review (re)set_index | benbovy 4160723 | closed | 0 | 1 | 2022-09-05T15:07:43Z | 2023-08-30T09:05:10Z | 2022-09-27T10:35:38Z | MEMBER | 0 | pydata/xarray/pulls/6992 |
Restore behavior prior to the explicit indexes refactor (i.e., refactored but without breaking changes). TODO:
For |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6992/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1412901282 | PR_kwDOAMm_X85A_96j | 7182 | add MultiPandasIndex helper class | benbovy 4160723 | open | 0 | 2 | 2022-10-18T09:42:58Z | 2023-08-23T16:30:28Z | MEMBER | 1 | pydata/xarray/pulls/7182 |
This PR adds a Early prototype in this notebook TODO / TO FIX:
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7182/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
1364388790 | I_kwDOAMm_X85RUuu2 | 7002 | Custom indexes and coordinate (re)ordering | benbovy 4160723 | open | 0 | 2 | 2022-09-07T09:44:12Z | 2023-08-23T14:35:32Z | MEMBER | What is your issue? (From https://github.com/pydata/xarray/issues/5647#issuecomment-946546464). The current alignment logic (as refactored in #5692) requires that two compatible indexes (i.e., of the same type) must relate to one or more coordinates with matching names but also in a matching order. For some multi-coordinate indexes like Possible options:
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7002/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
979316661 | MDU6SXNzdWU5NzkzMTY2NjE= | 5738 | Flexible indexes: how to handle possible dimension vs. coordinate name conflicts? | benbovy 4160723 | closed | 0 | 4 | 2021-08-25T15:31:39Z | 2023-08-23T13:28:41Z | 2023-08-23T13:28:40Z | MEMBER | Another thing that I've noticed while working on #5692. Currently it is not possible to have a Dataset with the same name used for both a dimension and a multi-index level. I guess the reason is to prevent some errors like unmatched dimension sizes when eventually the multi-index is dropped with renamed dimension(s) according to the level names (e.g., with I'm wondering how we should handle this in the context of flexible / custom indexes: A. Keep this current behavior as a special case for (pandas) multi-indexes. This would avoid breaking changes but how to support custom indexes that could eventually be used like pandas multi-indexes in B. Introduce some tag in C. Do not allow any dimension name matching the name of a coordinate attached to a multi-coordinate index. This seems silly? D. Eventually revert #2353 and let users take care of potential conflicts. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5738/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
1175329407 | I_kwDOAMm_X85GDhp_ | 6392 | Pass indexes to the Dataset and DataArray constructors | benbovy 4160723 | closed | 0 | 6 | 2022-03-21T12:41:51Z | 2023-07-21T20:40:05Z | 2023-07-21T20:40:04Z | MEMBER | Is your feature request related to a problem?This is part of #6293 (explicit indexes next steps). Describe the solution you'd likeA pros:
cons:
An example with a pandas multi-index

Currently a pandas multi-index may be passed directly as one (dimension) coordinate; it is then "unpacked" into one dimension (tuple values) coordinate and one or more level coordinates. I would suggest deprecating this behavior in favor of a more explicit (although more verbose) way to pass an existing pandas multi-index:

```python
import pandas as pd
import xarray as xr

pd_idx = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=("foo", "bar"))
idx = xr.PandasMultiIndex(pd_idx, "x")

indexes = {"x": idx, "foo": idx, "bar": idx}
coords = idx.create_variables()

ds = xr.Dataset(coords=coords, indexes=indexes)
```

The cases below should raise an error:

```python
ds = xr.Dataset(indexes=indexes)
# ValueError: missing coordinate(s) for index(es): 'x', 'foo', 'bar'

ds = xr.Dataset(
    coords=coords,
    indexes={"x": idx, "foo": idx},
)
# ValueError: missing index(es) for coordinate(s): 'bar'

ds = xr.Dataset(
    coords={"x": coords["x"], "foo": [0, 1, 2, 3], "bar": coords["bar"]},
    indexes=indexes,
)
# ValueError: conflict between coordinate(s) and index(es): 'foo'

ds = xr.Dataset(
    coords=coords,
    indexes={"x": idx, "foo": idx, "bar": xr.PandasIndex([0, 1, 2], "y")},
)
# ValueError: conflict between coordinate(s) and index(es): 'bar'
```

Should we raise an error or simply ignore the index in the case below?

```python
ds = xr.Dataset(coords=coords)
# ValueError: missing index(es) for coordinate(s): 'x', 'foo', 'bar'
# or
# create unindexed coordinates 'foo' and 'bar' and a 'x' coordinate with a single pandas index
```

Should we silently reorder the coordinates and/or indexes when the levels are not passed in the right order? It seems odd to require that mapping elements be passed in a given order.

```python
ds = xr.Dataset(coords=coords, indexes={"bar": idx, "x": idx, "foo": idx})

list(ds.xindexes.keys())
# ["x", "foo", "bar"]
```

How to generalize to any (custom) index?

With the case of multi-index, it is pretty easy to check whether the coordinates and indexes are consistent because we ensure consistent However, this may not be easy for other indexes. Some Xarray custom indexes (like a KD-Tree index) likely won't return anything from How could we solve this?
I think I prefer the second option. Describe alternatives you've consideredAlso allow passing index types (and build options) via
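
The name-level consistency checks listed earlier in this issue (missing coordinates for an index, missing indexes for a coordinate, an index attached to a coordinate it does not cover) could be sketched in plain Python as below. This is only an illustration under stated assumptions: `StubIndex` and `check_coords_and_indexes` are hypothetical names, not Xarray API, and the sketch compares names only; it does not compare coordinate values against the variables an index would create.

```python
from __future__ import annotations

from typing import Any, Mapping


class StubIndex:
    """Hypothetical stand-in for an index object (not an Xarray class)."""

    def __init__(self, *coord_names: str) -> None:
        # Names of the coordinates this index is built from.
        self.coord_names = coord_names


def check_coords_and_indexes(
    coords: Mapping[str, Any], indexes: Mapping[str, StubIndex]
) -> None:
    """Raise ValueError if the names in ``coords`` and ``indexes`` disagree."""

    def fmt(names: list[str]) -> str:
        return ", ".join(repr(name) for name in names)

    # 1. Every name that has an index must also be passed as a coordinate.
    missing_coords = [name for name in indexes if name not in coords]
    if missing_coords:
        raise ValueError(
            f"missing coordinate(s) for index(es): {fmt(missing_coords)}"
        )

    # 2. A name must not be mapped to an index that does not cover it.
    conflicts = [
        name for name, idx in indexes.items() if name not in idx.coord_names
    ]
    if conflicts:
        raise ValueError(
            f"conflict between coordinate(s) and index(es): {fmt(conflicts)}"
        )

    # 3. Every coordinate covered by an index must appear in the mapping.
    missing_indexes: list[str] = []
    for idx in indexes.values():
        for name in idx.coord_names:
            if name not in indexes and name not in missing_indexes:
                missing_indexes.append(name)
    if missing_indexes:
        raise ValueError(
            f"missing index(es) for coordinate(s): {fmt(missing_indexes)}"
        )


# Usage: a consistent mapping passes silently.
idx = StubIndex("x", "foo", "bar")
coords = {"x": None, "foo": None, "bar": None}
check_coords_and_indexes(coords, {"x": idx, "foo": idx, "bar": idx})
```

Because the checks only look at names, the order in which the mapping elements are passed does not matter here, which is one possible answer to the reordering question raised above.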
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6392/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
1812008663 | I_kwDOAMm_X85sAQ7X | 8002 | Improve discoverability of index build options | benbovy 4160723 | open | 0 | 2 | 2023-07-19T13:54:09Z | 2023-07-19T17:48:51Z | MEMBER | Is your feature request related to a problem?

Currently

Describe the solution you'd like

What about something like this?

```python
ds.set_xindex("x", MyCustomIndex.with_options(foo=1, bar=True))

# or

ds.set_xindex("x", *MyCustomIndex.with_options(foo=1, bar=True))
```

This would require adding a

```python
# xarray.core.indexes

class Index:
    @classmethod
    def with_options(cls) -> tuple[type[Self], dict[str, Any]]:
        return cls, {}
```

```python
# third-party code

from xarray.indexes import Index

class MyCustomIndex(Index):
```

Thoughts?

Describe alternatives you've considered

Build options are also likely defined in the Index constructor, e.g.,

```python
# third-party code

from xarray.indexes import Index

class MyCustomIndex(Index):
``` However, the Index constructor is not public API (only used internally and indirectly in Xarray when setting a new index from existing coordinates). Any other idea? Additional contextNo response |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8002/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
1307195361 | PR_kwDOAMm_X847hz6o | 6800 | (scipy 2022 branch) Add an "options" argument to Index.from_variables() | benbovy 4160723 | closed | 0 | 1 | 2022-07-17T20:01:00Z | 2022-12-08T09:38:50Z | 2022-09-02T13:54:46Z | MEMBER | 0 | pydata/xarray/pulls/6800 | It allows passing options to the constructor of a custom The An alternative way would be to pass options via coordinate metadata, like the This PR also adds type annotations to |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6800/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1357296406 | PR_kwDOAMm_X84-IR52 | 6971 | Add set_xindex and drop_indexes methods | benbovy 4160723 | closed | 0 | 7 | 2022-08-31T12:54:35Z | 2022-12-08T09:38:13Z | 2022-09-28T07:25:15Z | MEMBER | 0 | pydata/xarray/pulls/6971 |
This PR adds Dataset and DataArray Some comments and open questions:
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6971/reactions", "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1363524666 | PR_kwDOAMm_X84-c82D | 6999 | Raise UserWarning when rename creates a new dimension coord | benbovy 4160723 | closed | 0 | 2 | 2022-09-06T16:16:17Z | 2022-12-08T09:38:13Z | 2022-09-27T09:33:40Z | MEMBER | 0 | pydata/xarray/pulls/6999 |
Current implemented "fix": raise a Alternatively, we could:
I don't have strong opinions on this; I'm happy to implement another alternative. The downside of reverting the breaking change now is that it will unfortunately introduce a breaking change in the next release, while workarounds are pretty straightforward. (*) from https://github.com/pydata/xarray/issues/6607#issuecomment-1126587818, doing |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6999/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1364493817 | PR_kwDOAMm_X84-gJCw | 7003 | Misc. fixes for Indexes with pd.Index objects | benbovy 4160723 | closed | 0 | 0 | 2022-09-07T11:05:02Z | 2022-12-08T09:36:51Z | 2022-09-23T07:30:38Z | MEMBER | 0 | pydata/xarray/pulls/7003 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7003/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1390999159 | PR_kwDOAMm_X84_3QjW | 7105 | Fix to_index(): return multiindex level as single index | benbovy 4160723 | closed | 0 | 4 | 2022-09-29T14:44:22Z | 2022-12-08T09:36:51Z | 2022-10-12T14:12:48Z | MEMBER | 0 | pydata/xarray/pulls/7105 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7105/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1193611401 | PR_kwDOAMm_X841rm9D | 6443 | Fix concat with scalar coordinate (wrong index type) | benbovy 4160723 | closed | 0 | 1 | 2022-04-05T19:16:30Z | 2022-12-08T09:36:50Z | 2022-04-06T01:19:48Z | MEMBER | 0 | pydata/xarray/pulls/6443 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6443/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1389632629 | PR_kwDOAMm_X84_ywy1 | 7101 | Fix Dataset.assign_coords overwriting multi-index | benbovy 4160723 | closed | 0 | 0 | 2022-09-28T16:21:48Z | 2022-12-08T09:36:50Z | 2022-09-28T18:02:16Z | MEMBER | 0 | pydata/xarray/pulls/7101 |
@dcherian the |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7101/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1324225268 | PR_kwDOAMm_X848a7mk | 6857 | Fix aligned index variable metadata side effect | benbovy 4160723 | closed | 0 | 0 | 2022-08-01T10:57:16Z | 2022-12-08T09:36:49Z | 2022-08-31T07:16:14Z | MEMBER | 0 | pydata/xarray/pulls/6857 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6857/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1472483025 | PR_kwDOAMm_X85EHyv7 | 7347 | Fix assign_coords resetting all dimension coords to default index | benbovy 4160723 | closed | 0 | 3 | 2022-12-02T08:19:01Z | 2022-12-08T09:36:49Z | 2022-12-02T16:32:40Z | MEMBER | 0 | pydata/xarray/pulls/7347 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7347/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1472470718 | I_kwDOAMm_X85XxB6- | 7346 | assign_coords reset all dimension coords to default (pandas) index | benbovy 4160723 | closed | 0 | 0 | 2022-12-02T08:07:55Z | 2022-12-02T16:32:41Z | 2022-12-02T16:32:41Z | MEMBER | What happened?See https://github.com/martinfleis/xvec/issues/13#issue-1472023524 What did you expect to happen?
Minimal Complete Verifiable ExampleSee https://github.com/martinfleis/xvec/issues/13#issue-1472023524 MVCE confirmation
Relevant log outputNo response Anything else we need to know?No response Environment
Xarray version 2022.11.0
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7346/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
1151751524 | I_kwDOAMm_X85EplVk | 6308 | xr.doctor(): diagnostics on a Dataset / DataArray ? | benbovy 4160723 | open | 0 | 4 | 2022-02-26T12:10:07Z | 2022-11-07T15:28:35Z | MEMBER | Is your feature request related to a problem?Recently I've been reading through various issue reports here and there (GH issues and discussions, forums, etc.) and I'm wondering if it wouldn't be useful to have some function in Xarray that inspects a Dataset or DataArray and reports a bunch of diagnostics, so that the community could better help troubleshooting performance or other issues faced by users. It's not always obvious where to look (e.g., number of chunks of a dask array, number of tasks of a dask graph, etc.) to diagnose issues, sometimes even for experienced users. Describe the solution you'd likeA
Describe alternatives you've consideredNone Additional contextNo response |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6308/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
1322198907 | I_kwDOAMm_X85Ozyd7 | 6849 | Public API for setting new indexes: add a set_xindex method? | benbovy 4160723 | closed | 0 | 5 | 2022-07-29T12:38:34Z | 2022-09-28T07:25:16Z | 2022-09-28T07:25:16Z | MEMBER | What is your issue?xref https://github.com/pydata/xarray/pull/6795#discussion_r932665544 and #6293 (Public API section). The
Thoughts @pydata/xarray? |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6849/reactions", "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
1361896826 | I_kwDOAMm_X85RLOV6 | 6989 | reset multi-index to single index (level): coordinate not renamed | benbovy 4160723 | closed | 0 | benbovy 4160723 | 0 | 2022-09-05T12:45:22Z | 2022-09-27T10:35:39Z | 2022-09-27T10:35:39Z | MEMBER | What happened?

Resetting a multi-index to a single level (i.e., a single index) does not rename the remaining level coordinate to the dimension name.

What did you expect to happen?

While it is certainly more consistent not to rename the level coordinate here (since an index can be assigned to a non-dimension coordinate now), it breaks from the old behavior. I think it's better not to introduce any breaking change. As discussed elsewhere, we might eventually want to deprecate

Minimal Complete Verifiable Example

```Python
import pandas as pd
import xarray as xr

midx = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=("foo", "bar"))
ds = xr.Dataset(coords={"x": midx})
# <xarray.Dataset>
# Dimensions:  (x: 4)
# Coordinates:
#   * x        (x) object MultiIndex
#   * foo      (x) object 'a' 'a' 'b' 'b'
#   * bar      (x) int64 1 2 1 2
# Data variables:
#     *empty*

rds = ds.reset_index("foo")

# v2022.03.0
#
# <xarray.Dataset>
# Dimensions:  (x: 4)
# Coordinates:
#   * x        (x) int64 1 2 1 2
#     foo      (x) object 'a' 'a' 'b' 'b'
# Data variables:
#     *empty*

# v2022.06.0
#
# <xarray.Dataset>
# Dimensions:  (x: 4)
# Coordinates:
#     foo      (x) object 'a' 'a' 'b' 'b'
#   * bar      (x) int64 1 2 1 2
# Dimensions without coordinates: x
# Data variables:
#     *empty*
```

MVCE confirmation
Relevant log outputNo response Anything else we need to know?No response Environment |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6989/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | |||||
1361626450 | I_kwDOAMm_X85RKMVS | 6987 | Indexes.get_unique() TypeError with pandas indexes | benbovy 4160723 | closed | 0 | benbovy 4160723 | 0 | 2022-09-05T09:02:50Z | 2022-09-23T07:30:39Z | 2022-09-23T07:30:39Z | MEMBER | @benbovy I also just tested the Taking the above dataset ```python
TypeError: unhashable type: 'MultiIndex' ``` However, for
[<xarray.core.indexes.PandasMultiIndex at 0x7f105bf1df20>] ``` Originally posted by @lukasbindreiter in https://github.com/pydata/xarray/issues/6752#issuecomment-1236717180 |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6987/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | |||||
1364798843 | PR_kwDOAMm_X84-hLRI | 7004 | Rework PandasMultiIndex.sel internals | benbovy 4160723 | open | 0 | 2 | 2022-09-07T14:57:29Z | 2022-09-22T20:38:41Z | MEMBER | 0 | pydata/xarray/pulls/7004 |
This PR hopefully improves how the labels provided for multi-index level coordinates are handled in More specifically, slices are handled in a cleaner way and it is now allowed to provide array-like labels.
This yields a predictable behavior: as soon as one of the provided labels is a slice or array-like, the multi-index and all its level coordinates are kept in the result. Some cases are illustrated below (I compare this PR with an older release due to the errors reported in #6838):

```python
import xarray as xr
import pandas as pd

midx = pd.MultiIndex.from_product([list("abc"), range(4)], names=("one", "two"))
ds = xr.Dataset(coords={"x": midx})
# <xarray.Dataset>
# Dimensions:  (x: 12)
# Coordinates:
#   * x        (x) object MultiIndex
#   * one      (x) object 'a' 'a' 'a' 'a' 'b' 'b' 'b' 'b' 'c' 'c' 'c' 'c'
#   * two      (x) int64 0 1 2 3 0 1 2 3 0 1 2 3
# Data variables:
#     *empty*
```

```python
ds.sel(one="a", two=0)

# this PR
#
# <xarray.Dataset>
# Dimensions:  ()
# Coordinates:
#     x        object ('a', 0)
#     one      <U1 'a'
#     two      int64 0
# Data variables:
#     *empty*

# v2022.3.0
#
# <xarray.Dataset>
# Dimensions:  ()
# Coordinates:
#     x        object ('a', 0)
# Data variables:
#     *empty*
```

```python
ds.sel(one="a")

# this PR:
#
# <xarray.Dataset>
# Dimensions:  (two: 4)
# Coordinates:
#   * two      (two) int64 0 1 2 3
#     one      <U1 'a'
# Data variables:
#     *empty*

# v2022.3.0
#
# <xarray.Dataset>
# Dimensions:  (two: 4)
# Coordinates:
#   * two      (two) int64 0 1 2 3
# Data variables:
#     *empty*
```

```python
ds.sel(one=slice("a", "b"))

# this PR
#
# <xarray.Dataset>
# Dimensions:  (x: 8)
# Coordinates:
#   * x        (x) object MultiIndex
#   * one      (x) object 'a' 'a' 'a' 'a' 'b' 'b' 'b' 'b'
#   * two      (x) int64 0 1 2 3 0 1 2 3
# Data variables:
#     *empty*

# v2022.3.0
#
# <xarray.Dataset>
# Dimensions:  (two: 8)
# Coordinates:
#   * two      (two) int64 0 1 2 3 0 1 2 3
# Data variables:
#     *empty*
```

```python
ds.sel(one="a", two=slice(1, 1))

# this PR
#
# <xarray.Dataset>
# Dimensions:  (x: 1)
# Coordinates:
#   * x        (x) object MultiIndex
#   * one      (x) object 'a'
#   * two      (x) int64 1
# Data variables:
#     *empty*

# v2022.3.0
#
# <xarray.Dataset>
# Dimensions:  (x: 1)
# Coordinates:
#   * x        (x) MultiIndex
#   - one      (x) object 'a'
#   - two      (x) int64 1
# Data variables:
#     *empty*
```

```python
ds.sel(one=["b", "c"], two=[0, 2])

# this PR
#
# <xarray.Dataset>
# Dimensions:  (x: 4)
# Coordinates:
#   * x        (x) object MultiIndex
#   * one      (x) object 'b' 'b' 'c' 'c'
#   * two      (x) int64 0 2 0 2
# Data variables:
#     *empty*

# v2022.3.0
#
# ValueError: Vectorized selection is not available along coordinate 'one' (multi-index level)
``` |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7004/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
302077805 | MDU6SXNzdWUzMDIwNzc4MDU= | 1961 | Extend xarray with custom "coordinate wrappers" | benbovy 4160723 | closed | 0 | 10 | 2018-03-04T11:26:15Z | 2022-09-19T08:47:45Z | 2022-09-19T08:47:44Z | MEMBER | Recent and ongoing developments in xarray turn DataArray and Dataset more and more into data wrappers that are extensible at (almost) every level:
Regarding the latter, I’m thinking about the idea of extending xarray at an even more abstract level, i.e., the possibility of adding / registering "coordinate wrappers" to EDIT: "coordinate agents" may not be quite right here, I changed that to "coordinate wrappers") Indexes are a specific case of coordinate wrappers that serve the purpose of indexing. This is built in xarray. While indexing is enough in 80% of cases, I see a couple of use cases where other coordinate wrappers (built outside of xarray) would be nice to have:
In those examples we usually rely on coordinate attributes and/or classes that encapsulate xarray objects to implement the specific features that we need. While it works, it has limitations and I think it can be improved. Custom coordinate wrappers would be a way of extending xarray that is very consistent with other current (or considered) extension mechanisms. This is still a very vague idea and I’m sure that there are lots of details that can be discussed (serialization, etc.). But before going further, I’d like to know your thoughts @pydata/xarray. Do you think it is a silly idea? Do you have in mind other use cases where custom coordinate wrappers would be useful? |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1961/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
955936490 | MDU6SXNzdWU5NTU5MzY0OTA= | 5647 | Flexible indexes: review the implementation of alignment and merge | benbovy 4160723 | closed | 0 | 12 | 2021-07-29T15:03:23Z | 2022-09-07T09:47:13Z | 2022-09-07T09:47:13Z | MEMBER | The current implementation of the
This currently works well since a pd.Index can be directly treated as a 1-d array, but this won't always be the case with custom indexes. I'm opening this issue to gather ideas on how best to handle alignment in a more flexible way (I haven't thought much about this problem yet). |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5647/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
1325016510 | I_kwDOAMm_X85O-iW- | 6860 | Align with join='override' may update index coordinate metadata | benbovy 4160723 | open | 0 | 0 | 2022-08-01T21:45:13Z | 2022-08-01T21:49:41Z | MEMBER | What happened?

It seems that cf. @keewis' original https://github.com/pydata/xarray/pull/6857#discussion_r934425142.

What did you expect to happen?

Index coordinate metadata unaffected by alignment (i.e., metadata is passed through object -> aligned object for each object), like for align with other join methods.

Minimal Complete Verifiable Example

```Python
import xarray as xr

ds1 = xr.Dataset(coords={"x": ("x", [1, 2, 3], {"foo": 1})})
ds2 = xr.Dataset(coords={"x": ("x", [1, 2, 3], {"bar": 2})})

aligned1, aligned2 = xr.align(ds1, ds2, join="override")

aligned1.x.attrs
# v2022.03.0 -> {'foo': 1}
# v2022.06.0 -> {'foo': 1, 'bar': 2}
# PR #6857 -> {'foo': 1}
# expected -> {'foo': 1}

aligned2.x.attrs
# v2022.03.0 -> {}
# v2022.06.0 -> {'foo': 1, 'bar': 2}
# PR #6857 -> {'foo': 1, 'bar': 2}
# expected -> {'bar': 2}

aligned11, aligned22 = xr.align(ds1, ds2, join="inner")

aligned11.x.attrs
# {'foo': 1}

aligned22.x.attrs
# {'bar': 2}
```

MVCE confirmation
Relevant log outputNo response Anything else we need to know?No response Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.9.6 | packaged by conda-forge | (default, Jul 11 2021, 03:36:15)
[Clang 11.1.0 ]
python-bits: 64
OS: Darwin
OS-release: 20.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: (None, 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1
xarray: 0.21.2.dev137+g30023a484
pandas: 1.4.0
numpy: 1.22.2
scipy: 1.7.1
netCDF4: 1.5.8
pydap: installed
h5netcdf: 0.11.0
h5py: 3.4.0
Nio: None
zarr: 2.6.1
cftime: 1.5.2
nc_time_axis: 1.2.0
PseudoNetCDF: installed
rasterio: 1.2.10
cfgrib: 0.9.8.5
iris: 3.0.4
bottleneck: 1.3.2
dask: 2022.01.1
distributed: 2022.01.1
matplotlib: 3.4.3
cartopy: 0.20.1
seaborn: 0.11.1
numbagg: 0.2.1
fsspec: 0.8.5
cupy: None
pint: 0.16.1
sparse: 0.13.0
flox: None
numpy_groupies: None
setuptools: 57.4.0
pip: 20.2.4
conda: None
pytest: 6.2.5
IPython: 7.27.0
sphinx: 3.3.1
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6860/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
1322190255 | I_kwDOAMm_X85OzwWv | 6848 | Update API | benbovy 4160723 | closed | 0 | 0 | 2022-07-29T12:30:08Z | 2022-07-29T12:30:23Z | 2022-07-29T12:30:23Z | MEMBER | { "url": "https://api.github.com/repos/pydata/xarray/issues/6848/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | |||||||
1176745736 | PR_kwDOAMm_X840z4zt | 6400 | Speed-up multi-index html repr + add display_values_threshold option | benbovy 4160723 | closed | 0 | 3 | 2022-03-22T12:57:37Z | 2022-03-29T07:10:22Z | 2022-03-29T07:05:32Z | MEMBER | 0 | pydata/xarray/pulls/6400 | This adds This optimized

```python
import xarray as xr

ds = xr.tutorial.load_dataset("air_temperature")
da = ds["air"].stack(z=[...])

da.shape
# (3869000,)

%timeit -n 1 -r 1 da._repr_html_()
# 9.96 ms !
```
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6400/reactions", "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1174675456 | PR_kwDOAMm_X840tJ9A | 6388 | isel: convert IndexVariable to Variable if index is dropped | benbovy 4160723 | closed | 0 | 1 | 2022-03-20T20:29:58Z | 2022-03-29T07:10:08Z | 2022-03-21T04:47:48Z | MEMBER | 0 | pydata/xarray/pulls/6388 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6388/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
616432851 | MDExOlB1bGxSZXF1ZXN0NDE2NTQ0MzE4 | 4053 | Fix html repr in untrusted notebooks (plain text fallback) | benbovy 4160723 | closed | 0 | 5 | 2020-05-12T07:38:22Z | 2022-03-29T07:10:07Z | 2020-05-20T17:06:40Z | MEMBER | 0 | pydata/xarray/pulls/4053 |
This is not very elegant (actually plain text repr is already included in the notebook as I don't really know if this can be properly tested (I only added a basic test). Steps to test this fix:
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/4053/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
849315490 | MDExOlB1bGxSZXF1ZXN0NjA4MTEwNjI0 | 5102 | Flexible indexes: add Index base class and xindexes properties | benbovy 4160723 | closed | 0 | 10 | 2021-04-02T16:18:07Z | 2022-03-29T07:10:07Z | 2021-05-11T08:21:26Z | MEMBER | 0 | pydata/xarray/pulls/5102 | This PR clears up the path for flexible indexes:
~~The latter is a breaking change, although I'm not sure if the This is still work in progress, there are many broken tests that are not fixed yet. (EDIT: all tests should be fixed now). There are a lot of dirty fixes to avoid circular dependencies and in the many places where we still need direct access to the |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5102/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
893415955 | MDExOlB1bGxSZXF1ZXN0NjQ1OTMzODI3 | 5322 | Internal refactor of label-based data selection | benbovy 4160723 | closed | 0 | 1 | 2021-05-17T14:52:49Z | 2022-03-29T07:10:07Z | 2021-06-08T09:35:54Z | MEMBER | 0 | pydata/xarray/pulls/5322 | Xarray label-based data selection now relies on a newly added
For a simple Moving the label->positional indexer conversion logic into Working towards a more flexible/generic system, we still need to figure out how to:
This could be done in follow-up PRs. Side note: I initially tried to return from Happy to hear your thoughts @pydata/xarray. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5322/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
819062172 | MDExOlB1bGxSZXF1ZXN0NTgyMjI0MTQ4 | 4979 | Flexible indexes refactoring notes | benbovy 4160723 | closed | 0 | 22 | 2021-03-01T16:57:32Z | 2022-03-29T07:09:31Z | 2021-03-17T16:47:29Z | MEMBER | 0 | pydata/xarray/pulls/4979 | As a preliminary step before I take on the refactoring and implementation of flexible indexes in Xarray for the next few months, I reviewed the status of https://github.com/pydata/xarray/projects/1 and started compiling partially implemented or planned changes, thoughts, etc. into a single document that may serve as a basis for further discussion and implementation work. It's still very much work in progress (I will update it regularly in the forthcoming days) and it is very open to discussion (we can use this PR for that)! I'm not sure if Xarray's root folder is a good place for this document, though. We could move this into a new repository in I'm looking forward to getting started on this and to getting your thoughts/feedback! |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/4979/reactions", "total_count": 13, "+1": 3, "-1": 0, "laugh": 0, "hooray": 7, "confused": 0, "heart": 3, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
903899735 | MDExOlB1bGxSZXF1ZXN0NjU1MTA5NDg0 | 5385 | Cast PandasIndex to pd.(Multi)Index | benbovy 4160723 | closed | 0 | 0 | 2021-05-27T15:15:41Z | 2022-03-29T07:09:31Z | 2021-05-28T08:28:11Z | MEMBER | 0 | pydata/xarray/pulls/5385 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5385/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1174687047 | PR_kwDOAMm_X840tLrz | 6389 | Re-index: fix missing variable metadata | benbovy 4160723 | closed | 0 | 2 | 2022-03-20T21:11:38Z | 2022-03-29T07:09:31Z | 2022-03-21T07:53:05Z | MEMBER | 0 | pydata/xarray/pulls/6389 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6389/reactions", "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1174610081 | PR_kwDOAMm_X840s_xU | 6385 | Fix concat with scalar coordinate | benbovy 4160723 | closed | 0 | 0 | 2022-03-20T16:46:48Z | 2022-03-29T07:09:30Z | 2022-03-21T04:49:23Z | MEMBER | 0 | pydata/xarray/pulls/6385 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6385/reactions", "total_count": 3, "+1": 3, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1174615799 | PR_kwDOAMm_X840tAtL | 6386 | Fix Dataset groupby returning a DataArray | benbovy 4160723 | closed | 0 | 0 | 2022-03-20T17:06:13Z | 2022-03-29T07:09:30Z | 2022-03-20T18:55:27Z | MEMBER | 0 | pydata/xarray/pulls/6386 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6386/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1175490214 | PR_kwDOAMm_X840vt1_ | 6394 | Fix DataArray groupby returning a Dataset | benbovy 4160723 | closed | 0 | 0 | 2022-03-21T14:43:21Z | 2022-03-29T07:09:30Z | 2022-03-21T15:26:20Z | MEMBER | 0 | pydata/xarray/pulls/6394 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6394/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1174622308 | PR_kwDOAMm_X840tBvD | 6387 | Fix concat with variable or dataarray as dim (propagate attrs) | benbovy 4160723 | closed | 0 | 1 | 2022-03-20T17:27:41Z | 2022-03-29T07:09:29Z | 2022-03-20T18:53:46Z | MEMBER | 0 | pydata/xarray/pulls/6387 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6387/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1183360119 | PR_kwDOAMm_X841JuRv | 6418 | Fix concat with scalar coordinate (dtype) | benbovy 4160723 | closed | 0 | 0 | 2022-03-28T12:22:50Z | 2022-03-29T07:06:46Z | 2022-03-28T16:05:01Z | MEMBER | 0 | pydata/xarray/pulls/6418 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6418/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
968796847 | MDU6SXNzdWU5Njg3OTY4NDc= | 5697 | Coerce the labels passed to Index.query to array-like objects | benbovy 4160723 | closed | 0 | 3 | 2021-08-12T13:09:40Z | 2022-03-17T17:11:43Z | 2022-03-17T17:11:43Z | MEMBER | When looking at #5691 I noticed that the labels are sometimes coerced to arrays (i.e., #3153) but not always. Later in Shouldn't we therefore make things easier and ensure that the labels given to |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5697/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
968990058 | MDU6SXNzdWU5Njg5OTAwNTg= | 5700 | Selection with multi-index and float32 values | benbovy 4160723 | closed | 0 | 0 | 2021-08-12T14:55:11Z | 2022-03-17T17:11:43Z | 2022-03-17T17:11:43Z | MEMBER | I guess it's rather an edge case, but an issue similar to the one fixed in #3153 may occur with multi-indexes: ```python
```python
```python
(xarray version: 0.18.2 as there's a regression introduced in 0.19.0 #5691) |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5700/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
955605233 | MDU6SXNzdWU5NTU2MDUyMzM= | 5645 | Flexible indexes: handle renaming coordinate variables | benbovy 4160723 | closed | 0 | 0 | 2021-07-29T08:42:00Z | 2022-03-17T17:11:42Z | 2022-03-17T17:11:42Z | MEMBER | We should have some API in This is currently implemented here where the underlying This logic should be moved into Other, custom indexes might also have internal attributes to update, so we might need a formal API for that. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5645/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
1005623261 | I_kwDOAMm_X8478Jfd | 5812 | Check explicit indexes when comparing two xarray objects | benbovy 4160723 | open | 0 | 2 | 2021-09-23T16:19:32Z | 2021-09-24T15:59:02Z | MEMBER | Is your feature request related to a problem? Please describe.
With the explicit index refactor, two Dataset or DataArray objects Describe the solution you'd like
I'd suggest that One drawback is when we want to check either the attributes or the indexes but not both. Should we then add options as suggested in #5733? |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5812/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
1006335177 | I_kwDOAMm_X847-3TJ | 5814 | Confusing assertion message when comparing datasets with differing coordinates | benbovy 4160723 | open | 0 | 1 | 2021-09-24T10:50:11Z | 2021-09-24T15:17:00Z | MEMBER | What happened:
When two datasets What you expected to happen: An output assertion error message that shows only the differing coordinates. Minimal Complete Verifiable Example: ```python
Differing coordinates: L * x (x) int64 0 1 R * x (x) int64 2 3 Differing data variables: L var (x) float64 10.0 11.0 R var (x) float64 10.0 11.0 ``` I would rather expect: ```python
Differing coordinates: L * x (x) int64 0 1 R * x (x) int64 2 3 ``` Anything else we need to know?: Environment: Output of <tt>xr.show_versions()</tt>INSTALLED VERSIONS ------------------ commit: None python: 3.9.6 | packaged by conda-forge | (default, Jul 11 2021, 03:36:15) [Clang 11.1.0 ] python-bits: 64 OS: Darwin OS-release: 20.3.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: None LOCALE: (None, 'UTF-8') libhdf5: 1.10.6 libnetcdf: 4.7.4 xarray: 0.19.1.dev72+ga8d84c703.d20210901 pandas: 1.3.2 numpy: 1.21.2 scipy: 1.7.1 netCDF4: 1.5.6 pydap: installed h5netcdf: 0.8.1 h5py: 3.3.0 Nio: None zarr: 2.6.1 cftime: 1.5.0 nc_time_axis: 1.2.0 PseudoNetCDF: installed rasterio: 1.2.1 cfgrib: 0.9.8.5 iris: 3.0.4 bottleneck: 1.3.2 dask: 2021.01.1 distributed: 2021.01.1 matplotlib: 3.4.3 cartopy: 0.18.0 seaborn: 0.11.1 numbagg: None fsspec: 0.8.5 cupy: None pint: 0.16.1 sparse: 0.11.2 setuptools: 57.4.0 pip: 20.2.4 conda: None pytest: 6.2.5 IPython: 7.27.0 sphinx: 3.3.1 |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5814/reactions", "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
985162305 | MDU6SXNzdWU5ODUxNjIzMDU= | 5755 | Mypy errors with the last version of _typed_ops.pyi | benbovy 4160723 | closed | 0 | 5 | 2021-09-01T13:34:52Z | 2021-09-13T10:53:16Z | 2021-09-13T00:04:54Z | MEMBER | What happened: Since #5569 I get a lot of mypy errors from
I also tried @max-sixty @Illviljan Any idea on what's happening? What you expected to happen: No mypy error in all cases. Anything else we need to know?:
Environment: mypy 0.910 python 3.9.6 (also tested with 3.8) |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5755/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
977149831 | MDU6SXNzdWU5NzcxNDk4MzE= | 5732 | Coordinates implicitly created when passing a DataArray as coord to Dataset constructor | benbovy 4160723 | open | 0 | 3 | 2021-08-23T15:20:37Z | 2021-08-24T14:18:09Z | MEMBER | I stumbled on this while working on #5692. Is this intended behavior or an unwanted side effect? What happened: Creating a new Dataset by passing a DataArray object as a coordinate also adds the DataArray's coordinates to the dataset: ```python
What you expected to happen: The behavior above seems a bit counter-intuitive to me. I would rather expect no additional coordinates auto-magically added to the dataset, i.e. only one ```python
Environment: Output of <tt>xr.show_versions()</tt>INSTALLED VERSIONS ------------------ commit: None python: 3.8.6 | packaged by conda-forge | (default, Nov 27 2020, 19:17:44) [Clang 11.0.0 ] python-bits: 64 OS: Darwin OS-release: 20.3.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: None LOCALE: (None, 'UTF-8') libhdf5: 1.10.6 libnetcdf: 4.7.4 xarray: 0.19.0 pandas: 1.1.5 numpy: 1.21.1 scipy: 1.7.0 netCDF4: 1.5.5.1 pydap: None h5netcdf: None h5py: None Nio: None zarr: None cftime: 1.3.1 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2021.07.2 distributed: 2021.07.2 matplotlib: 3.3.3 cartopy: 0.19.0.post1 seaborn: None numbagg: None pint: None setuptools: 49.6.0.post20201009 pip: 20.3.1 conda: None pytest: 6.1.2 IPython: 7.25.0 sphinx: 3.3.1 |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5732/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
933551030 | MDU6SXNzdWU5MzM1NTEwMzA= | 5553 | Flexible indexes: how best to implement the new data model? | benbovy 4160723 | closed | 0 | 2 | 2021-06-30T10:38:13Z | 2021-08-09T07:56:56Z | 2021-08-09T07:56:56Z | MEMBER | Yesterday, during the flexible indexes weekly meeting, we discussed with @shoyer and @jhamman what would be the best approach to implement the new data model described here. In this issue I summarize the implementation of the current data model as well as some suggestions for the new data model along with their pros / cons (I might still be missing important ones!). Unfortunately, I don't think there's an easy or ideal solution, so @pydata/xarray any feedback would be very welcome!
Current data model implementation
Currently any (pandas) index is wrapped into an
Proposed alternatives
Option 1: independent (coordinate) variables and indexes
Indexes and coordinates are loosely coupled, i.e., a Pros:
Cons:
Option 2: indexes hold coordinate variables
This is the opposite approach of the current one. Here, a Pros:
Cons:
Option 3: intermediate solution
When an index is set (or unset), it returns a new set of coordinate variables to replace the existing ones. Pros:
Cons:
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5553/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
187859705 | MDU6SXNzdWUxODc4NTk3MDU= | 1092 | Dataset groups | benbovy 4160723 | closed | 0 | 20 | 2016-11-07T23:28:36Z | 2021-07-02T19:56:50Z | 2021-07-02T19:56:49Z | MEMBER | EDIT: see https://github.com/pydata/xarray/issues/4118 for ongoing discussion. This has probably been suggested already, but similarly to netCDF4 groups it would be nice if we could access Currently xarray allows loading a specific netCDF4 group into a I'm thinking about an implementation of
Questions:
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1092/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
902009258 | MDU6SXNzdWU5MDIwMDkyNTg= | 5376 | Multi-scale datasets and custom indexes | benbovy 4160723 | open | 0 | 6 | 2021-05-26T08:38:00Z | 2021-06-02T08:07:38Z | MEMBER | I've been wondering if:
I'm thinking of an API that would look like this: ```python lazily load a big n-d image (full resolution) as a xarray.Datasetxyz_dataset = ... set a new index for the x/y/z coordinates(
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5376/reactions", "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 1 } |
xarray 13221727 | issue | ||||||||
869721207 | MDU6SXNzdWU4Njk3MjEyMDc= | 5226 | Attributes encoding compatibility between backends | benbovy 4160723 | open | 0 | 1 | 2021-04-28T09:11:19Z | 2021-04-28T15:42:42Z | MEMBER | What happened: Let's create a Zarr dataset with some "less common" dtype and fill value, open it with Xarray and save the dataset as NetCDF:
```python
import xarray as xr
import zarr

g = zarr.group()
g.create('arr', shape=3, fill_value='z', dtype='<U1')
g['arr'].attrs['_ARRAY_DIMENSIONS'] = ('dim_1')

# -- without masking fill values
ds = xr.open_zarr(g.store, mask_and_scale=False)
ds.arr.attrs  # returns {'_FillValue': 'z'}
# error: netCDF4 does not yet support setting a fill value for variable-length strings
ds.to_netcdf('test.nc')

# -- with masking fill values
ds2 = xr.open_zarr(g.store, mask_and_scale=True)
# returns a dict that includes item '_FillValue': 'z'
ds2.arr.encoding
# same error as above
ds2.to_netcdf('out2.nc')
```
What you expected to happen: Seamless conversion (read/write) from one backend to another. Is there anything we could do to improve the case shown here above, and maybe other cases like the one described in #5223? Environment: Output of xr.show_versions(): INSTALLED VERSIONS ------------------ commit: None libhdf5: None libnetcdf: None xarray: 0.17.0 pandas: 1.0.3 numpy: 1.18.1 scipy: 1.3.1 netCDF4: None pydap: None h5netcdf: None h5py: None Nio: None zarr: 2.8.1 cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2.11.0 distributed: 2.14.0 matplotlib: 3.1.1 cartopy: None seaborn: None numbagg: None pint: None setuptools: 46.1.3.post20200325 pip: 19.2.3 conda: None pytest: 5.4.1 IPython: 7.13.0 sphinx: None |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5226/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
733077617 | MDU6SXNzdWU3MzMwNzc2MTc= | 4555 | Vectorized indexing (isel) of chunked data with 1D indices gives weird chunks | benbovy 4160723 | open | 0 | 1 | 2020-10-30T10:55:33Z | 2021-03-02T17:36:48Z | MEMBER | What happened: Applying What you expected to happen: More consistent chunk sizes. Minimal Complete Verifiable Example: Let's create a chunked DataArray
```python
In [1]: import numpy as np

In [2]: import xarray as xr

In [3]: da = xr.DataArray(np.random.rand(100), dims='points').chunk(50)

In [4]: da
Out[4]:
<xarray.DataArray (points: 100)>
dask.array<xarray-<this-array>, shape=(100,), dtype=float64, chunksize=(50,), chunktype=numpy.ndarray>
Dimensions without coordinates: points
```
Selecting random indices results in a lot of small chunks
```python
In [5]: indices = xr.Variable('nodes', np.random.choice(np.arange(100, dtype='int'), size=10))

In [6]: da_sel = da.isel(points=indices)

In [7]: da_sel.chunks
Out[7]: ((1, 1, 3, 1, 1, 3),)
```
What I would expect
This works fine with 2+ dimensional indexers, e.g.,
```python
In [9]: indices_2d = xr.Variable(('x', 'y'), np.random.choice(np.arange(100), size=(10, 10)))

In [10]: da_sel_2d = da.isel(points=indices_2d)

In [11]: da_sel_2d.chunks
Out[11]: ((10,), (10,))
```
Anything else we need to know?: I suspect the issue is here: In the example above I think we still want vectorized indexing (i.e., call Environment: Output of xr.show_versions(): INSTALLED VERSIONS ------------------ commit: None python: 3.8.3 | packaged by conda-forge | (default, Jun 1 2020, 17:21:09) [Clang 9.0.1 ] python-bits: 64 OS: Darwin OS-release: 18.7.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: None.UTF-8 libhdf5: None libnetcdf: None xarray: 0.16.1 pandas: 1.1.3 numpy: 1.19.1 scipy: 1.5.2 netCDF4: None pydap: None h5netcdf: None h5py: None Nio: None zarr: None cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2.19.0 distributed: 2.25.0 matplotlib: 3.3.1 cartopy: None seaborn: None numbagg: None pint: None setuptools: 47.3.1.post20200616 pip: 20.1.1 conda: None pytest: 5.4.3 IPython: 7.16.1 sphinx: 3.2.1 |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/4555/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
187873247 | MDU6SXNzdWUxODc4NzMyNDc= | 1094 | Supporting out-of-core computation/indexing for very large indexes | benbovy 4160723 | open | 0 | 5 | 2016-11-08T00:56:56Z | 2021-01-26T20:09:12Z | MEMBER | (Follow-up of discussion here https://github.com/pydata/xarray/pull/1024#issuecomment-258524115). xarray + dask.array successfully enable out-of-core computation for very large variables that don't fit in memory. One current limitation is that the indexes of a However, this may be problematic in some specific cases where we have to deal with very large indexes. As an example, big unstructured meshes often have coordinates (x, y, z) arranged as 1-d arrays whose length equals the number of nodes, which can be very large!! (See, e.g., ugrid conventions). It would be very nice if xarray could also help with these use cases. Therefore I'm wondering if (and how) out-of-core support can be extended to indexes and indexing. I've briefly looked at the documentation on My knowledge of dask is very limited, though. So I've no doubt that this suggestion is very simplistic and not very efficient, or that there are better approaches. I'm also certainly missing other issues not directly related to indexing. Any thoughts? cc @shoyer @mrocklin |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1094/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
512564243 | MDExOlB1bGxSZXF1ZXN0MzMyNTUyNTA3 | 3448 | Add license for the icons used in the html repr | benbovy 4160723 | closed | 0 | 1 | 2019-10-25T14:57:20Z | 2019-10-25T15:48:52Z | 2019-10-25T15:40:46Z | MEMBER | 0 | pydata/xarray/pulls/3448 | { "url": "https://api.github.com/repos/pydata/xarray/issues/3448/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
249584098 | MDExOlB1bGxSZXF1ZXN0MTM1Mjk4ODY3 | 1507 | Detailed report for testing.assert_equal and testing.assert_identical | benbovy 4160723 | closed | 0 | 18 | 2017-08-11T09:38:23Z | 2019-10-25T15:07:39Z | 2019-01-18T09:16:31Z | MEMBER | 0 | pydata/xarray/pulls/1507 |
~~In addition to ~~This may not be the most elegant solution, but it is helpful when datasets only differ by their attributes attached to coordinates or data variables (not shown in repr). I'm open to any suggestion.~~ The report shows the differences for dimensions, data values ( There are currently not many tests for Not sure if it's worth a what's new entry (EDIT: added one). |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1507/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
274619743 | MDExOlB1bGxSZXF1ZXN0MTUzMTE4MjQ3 | 1723 | Fix unexpected behavior of .set_index() since pandas 0.21.0 | benbovy 4160723 | closed | 0 | 0 | 2017-11-16T18:37:20Z | 2019-10-25T15:07:18Z | 2017-11-17T00:54:51Z | MEMBER | 0 | pydata/xarray/pulls/1723 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1723/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
287844110 | MDExOlB1bGxSZXF1ZXN0MTYyNDI2NzU2 | 1820 | WIP: html repr | benbovy 4160723 | closed | 0 | 40 | 2018-01-11T16:33:07Z | 2019-10-25T15:06:58Z | 2019-10-24T16:48:46Z | MEMBER | 0 | pydata/xarray/pulls/1820 |
This is work in progress, although the basic functionality is there. You can see a preview here: http://nbviewer.jupyter.org/gist/benbovy/3009f342fb283bd0288125a1f7883ef2 TODO:
Nice to have (keep this for later):
Other thoughts (old)
A big challenge here is to provide both robust and flexible styling (CSS):
- I have tested the current styling in jupyterlab (0.30.6, light theme), notebook (5.2.2) and nbviewer: despite some slight differences it looks quite good!
- However, the current CSS code is a bit fragile (I had to add a lot of `!important`). It could probably be cleaned up and optimized a bit (unfortunately my CSS skills are limited).
- Also, with jupyterlab's dark theme it looks ugly. We probably need to use jupyterlab CSS variables so that our CSS scheme is compatible with the theme machinery, but at the same time we need to support other front-ends. So we probably need to maintain different stylings (i.e., multiple CSS files, one of them picked up depending on the front-end), though I don't know if it's easy to automatically detect the front-end (choosing a default style is difficult too).
- The notebook rendering on GitHub seems to disable style tags (no style is applied to the output, see https://gist.github.com/benbovy/3009f342fb283bd0288125a1f7883ef2). Output is not readable at all in this case, so it might be useful to allow turning off rich output as an option. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1820/reactions", "total_count": 3, "+1": 3, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
264747372 | MDU6SXNzdWUyNjQ3NDczNzI= | 1627 | html repr of xarray object (for the notebook) | benbovy 4160723 | closed | 0 | 39 | 2017-10-11T21:49:20Z | 2019-10-24T16:56:15Z | 2019-10-24T16:48:47Z | MEMBER | Edit: preview for
I started to think a bit more deeply about what a richer, HTML-based representation of xarray objects could look like, e.g., in Jupyter notebooks. Here are some ideas for Some notes:
- The html repr looks pretty similar to the plain-text repr. I think it's better if they don't differ too much from each other.
- For the sake of consistency, I've stolen some style from These are still, of course, preliminary thoughts. Any feedback/suggestion is welcome, even opinions about whether an html repr is really needed or not! |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1627/reactions", "total_count": 11, "+1": 7, "-1": 0, "laugh": 0, "hooray": 4, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
234658224 | MDU6SXNzdWUyMzQ2NTgyMjQ= | 1447 | Package naming "conventions" for xarray extensions | benbovy 4160723 | closed | 0 | 5 | 2017-06-08T21:14:24Z | 2019-06-28T22:58:33Z | 2019-06-28T21:58:33Z | MEMBER | I'm wondering what would be a good name for a package that primarily aims at providing an xarray extension (in the form of a I'm currently thinking about using a prefix like the For example, for an xarray extension for signal processing we would have: package full name: ```python
The main advantage is that we directly have an idea of what the package is about. It may also be good for the overall visibility of both xarray and its 3rd-party extensions. The downside is that there are three name variations: one for getting and installing the package, another one for importing the package and yet another one for using the accessor. This may be annoying, especially for new users who are not accustomed to this kind of naming convention. Conversely, choosing a different, unrelated name like salem or pangaea has the advantage of using the same name everywhere and perhaps providing multiple accessors in the same package, but given that the number of xarray extensions is likely to grow in the near future (see, e.g., the pangeo-data project) it would become difficult to have a clear view of the whole xarray package ecosystem. Any thoughts? |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1447/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
180676935 | MDU6SXNzdWUxODA2NzY5MzU= | 1030 | Concatenate multiple variables into one variable with a multi-index (categories) | benbovy 4160723 | closed | 0 | 3 | 2016-10-03T15:54:23Z | 2019-02-25T07:25:40Z | 2019-02-25T07:25:40Z | MEMBER | I often have to deal with datasets in this form (multiple variables of different sizes, each representing different categories, on the same physical dimension but using different names as they have different labels),
where it would be more convenient to have the data re-arranged into the following form (concatenate the variables into a single variable with a multi-index with the labels of both the categories and the physical coordinate):
The latter would allow using xarray's nice features like Currently, the best way that I've found to transform the data is something like:
``` python
data = np.concatenate([ds.data_band1, ds.data_band2, ds.data_band3])
wn = np.concatenate([ds.wn_band1, ds.wn_band2, ds.wn_band3])
band = np.concatenate([np.repeat(1, 4), np.repeat(2, 6), np.repeat(3, 8)])
midx = pd.MultiIndex.from_arrays([band, wn], names=('band', 'wn'))
ds2 = xr.Dataset({'data': ('spectrum', data)}, coords={'spectrum': midx})
```
Maybe I'm missing a better way to do this? If not, it would be nice to have a convenience method for this, unless this use case is too rare to be worth it. I'm also not sure at all what a good API for such a method would be. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1030/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
349078381 | MDExOlB1bGxSZXF1ZXN0MjA3Mjc3NDg2 | 2357 | DOC: move xarray related projects to top-level TOC section | benbovy 4160723 | closed | 0 | 1 | 2018-08-09T10:57:47Z | 2018-08-11T13:41:24Z | 2018-08-10T20:13:08Z | MEMBER | 0 | pydata/xarray/pulls/2357 | Make xarray-related projects more discoverable, as suggested on the xarray mailing list. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/2357/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
300588788 | MDExOlB1bGxSZXF1ZXN0MTcxNjMxNTQ1 | 1946 | DOC: add main sections to toc | benbovy 4160723 | closed | 0 | 0 | 2018-02-27T11:13:17Z | 2018-02-27T21:16:18Z | 2018-02-27T19:04:24Z | MEMBER | 0 | pydata/xarray/pulls/1946 | Not a big change, but adds a little more clarity IMO. I'm open to any suggestion for better section names and/or organization. I also left "What's new" at the top, but I'm not sure if "Getting started" is the right section. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1946/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
275033174 | MDU6SXNzdWUyNzUwMzMxNzQ= | 1727 | IPython auto-completion triggers data loading | benbovy 4160723 | closed | 0 | 11 | 2017-11-18T00:14:00Z | 2017-11-18T07:09:41Z | 2017-11-18T07:09:40Z | MEMBER | I create a big netcdf file like this:
```python
In [1]: import xarray as xr

In [2]: import numpy as np

In [3]: ds = xr.Dataset({'myvar': np.arange(100000000, dtype='float64')})

In [4]: ds.to_netcdf('test.nc')
```
Then, when I open the file in an IPython console and use auto-completion, it triggers loading the data.
```python
In [1]: import xarray as xr

In [2]: ds = xr.open_dataset('test.nc')

In [3]: ds.my  # <TAB> autocompletion with any character -> triggers loading
```
I don't have that issue using the python console. Auto-completion for dictionary access in IPython (#1632) works fine too. Output of
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1727/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
274591962 | MDU6SXNzdWUyNzQ1OTE5NjI= | 1722 | Change in behavior of .set_index() from pandas 0.20.3 to 0.21.0 | benbovy 4160723 | closed | 0 | 1 | 2017-11-16T17:05:20Z | 2017-11-17T00:54:51Z | 2017-11-17T00:54:51Z | MEMBER | I use xarray 0.9.6 for both examples below. With pandas 0.20.3,
```python
In [1]: import xarray as xr

In [2]: import pandas as pd

In [3]: pd.__version__
Out[3]: '0.20.3'

In [4]: ds = xr.Dataset({'grid__x': ('x', [1, 2, 3])})

In [5]: ds.set_index(x='grid__x')
Out[5]:
<xarray.Dataset>
Dimensions:  (x: 3)
Coordinates:
  * x        (x) int64 1 2 3
Data variables:
    *empty*
```
With pandas 0.21.0, it creates a
```python
In [1]: import xarray as xr

In [2]: import pandas as pd

In [3]: pd.__version__
Out[3]: '0.21.0'

In [4]: ds = xr.Dataset({'grid__x': ('x', [1, 2, 3])})

In [5]: ds.set_index(x='grid__x')
Out[5]:
<xarray.Dataset>
Dimensions:  (x: 3)
Coordinates:
  * x        (x) MultiIndex
  - grid__x  (x) int64 1 2 3
Data variables:
    *empty*
```
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1722/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
230631480 | MDExOlB1bGxSZXF1ZXN0MTIxOTQyNjMx | 1422 | xarray.core.variable.as_variable part of the public API | benbovy 4160723 | closed | 0 | 6 | 2017-05-23T08:44:08Z | 2017-06-10T18:33:34Z | 2017-06-02T17:55:12Z | MEMBER | 0 | pydata/xarray/pulls/1422 |
Make I changed the docstrings to follow the numpydoc format more closely. I also removed the |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1422/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
134359597 | MDU6SXNzdWUxMzQzNTk1OTc= | 767 | MultiIndex and data selection | benbovy 4160723 | closed | 0 | 9 | 2016-02-17T18:24:00Z | 2016-09-14T14:28:29Z | 2016-09-14T14:28:29Z | MEMBER | [Edited for more clarity] First of all, I find the MultiIndex very useful and I'm looking forward to seeing the TODOs in #719 implemented in the next releases, especially the first three in the list! Apart from these issues, I think that some other aspects may be improved, notably regarding data selection. Or maybe I've not correctly understood how to deal with multi-index and data selection... To illustrate this, I use some fake spectral data with two discontinuous bands of different length / resolution:
```
In [1]: import pandas as pd

In [2]: import xarray as xr

In [3]: band = np.array(['foo', 'foo', 'bar', 'bar', 'bar'])

In [4]: wavenumber = np.array([4050.2, 4050.3, 4100.1, 4100.3, 4100.5])

In [5]: spectrum = np.array([1.7e-4, 1.4e-4, 1.2e-4, 1.0e-4, 8.5e-5])

In [6]: s = pd.Series(spectrum, index=[band, wavenumber])

In [7]: s.index.names = ('band', 'wavenumber')

In [8]: da = xr.DataArray(s, dims='band_wavenumber')

In [9]: da
Out[9]:
<xarray.DataArray (band_wavenumber: 5)>
array([ 1.70000000e-04, 1.40000000e-04, 1.20000000e-04, 1.00000000e-04, 8.50000000e-05])
Coordinates:
  * band_wavenumber (band_wavenumber) object ('foo', 4050.2) ...
```
I extract the band 'bar' using
```
In [10]: da_bar = da.sel(band_wavenumber='bar')

In [11]: da_bar
Out[11]:
<xarray.DataArray (band_wavenumber: 3)>
array([ 1.20000000e-04, 1.00000000e-04, 8.50000000e-05])
Coordinates:
  * band_wavenumber (band_wavenumber) object ('bar', 4100.1) ...
```
It selects the data the way I want, although using the dimension name is confusing in this case.
It would be nice if we can also use the Furthermore, Extracting the band 'bar' from the pandas
```
In [12]: s_bar = s.loc['bar']

In [13]: s_bar
Out[13]:
wavenumber
4100.1    0.000120
4100.3    0.000100
4100.5    0.000085
dtype: float64
```
The problem is also that the unstacked
```
In [13]: da.unstack('band_wavenumber')
Out[13]:
<xarray.DataArray (band: 2, wavenumber: 5)>
array([[ nan, nan, 1.20000000e-04, 1.00000000e-04, 8.50000000e-05],
       [ 1.70000000e-04, 1.40000000e-04, nan, nan, nan]])
Coordinates:
  * band (band) object 'bar' 'foo'
  * wavenumber (wavenumber) float64 4.05e+03 4.05e+03 4.1e+03 4.1e+03 4.1e+03

In [14]: da_bar.unstack('band_wavenumber')
Out[14]:
<xarray.DataArray (band: 2, wavenumber: 5)>
array([[ nan, nan, 1.20000000e-04, 1.00000000e-04, 8.50000000e-05],
       [ nan, nan, nan, nan, nan]])
Coordinates:
  * band (band) object 'bar' 'foo'
  * wavenumber (wavenumber) float64 4.05e+03 4.05e+03 4.1e+03 4.1e+03 4.1e+03
```
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/767/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
169588316 | MDExOlB1bGxSZXF1ZXN0ODAyMjk0OTM= | 947 | Multi-index levels as coordinates | benbovy 4160723 | closed | 0 | 17 | 2016-08-05T11:34:49Z | 2016-09-14T03:35:04Z | 2016-09-14T03:34:51Z | MEMBER | 0 | pydata/xarray/pulls/947 | Implements 2, 4 and 5 in #719. Demo: ``` In [1]: import numpy as np In [2]: import pandas as pd In [3]: import xarray as xr In [4]: index = pd.MultiIndex.from_product((list('ab'), range(2)), ...: names= ('level_1', 'level_2')) In [5]: da = xr.DataArray(np.random.rand(4, 4), coords={'x': index}, ...: dims=('x', 'y'), name='test') In [6]: da Out[6]: <xarray.DataArray 'test' (x: 4, y: 4)> array([[ 0.15036153, 0.68974802, 0.40082234, 0.94451318], [ 0.26732938, 0.49598123, 0.8679231 , 0.6149102 ], [ 0.3313594 , 0.93857424, 0.73023367, 0.44069622], [ 0.81304837, 0.81244159, 0.37274953, 0.86405196]]) Coordinates: * level_1 (x) object 'a' 'a' 'b' 'b' * level_2 (x) int64 0 1 0 1 * y (y) int64 0 1 2 3 In [7]: da['level_1'] Out[7]: <xarray.DataArray 'level_1' (x: 4)> array(['a', 'a', 'b', 'b'], dtype=object) Coordinates: * level_1 (x) object 'a' 'a' 'b' 'b' * level_2 (x) int64 0 1 0 1 In [8]: da.sel(x='a', level_2=1) Out[8]: <xarray.DataArray 'test' (y: 4)> array([ 0.26732938, 0.49598123, 0.8679231 , 0.6149102 ]) Coordinates: x object ('a', 1) * y (y) int64 0 1 2 3 In [9]: da.sel(level_2=1) Out[9]: <xarray.DataArray 'test' (level_1: 2, y: 4)> array([[ 0.26732938, 0.49598123, 0.8679231 , 0.6149102 ], [ 0.81304837, 0.81244159, 0.37274953, 0.86405196]]) Coordinates: * level_1 (level_1) object 'a' 'b' * y (y) int64 0 1 2 3 ``` Some notes about the implementation:
- I slightly modified Remaining issues:
- ``` In [6]: [name for name in da.coords] Out[6]: ['x', 'y'] In [7]: da.coords.keys()
Out[7]:
KeysView(Coordinates:
* level_1 (x) object 'a' 'a' 'b' 'b'
* level_2 (x) int64 0 1 0 1
* y (y) int64 0 1 2 3)
Of course still needs proper tests and docs... |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/947/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
159768214 | MDExOlB1bGxSZXF1ZXN0NzM0NjU0MTA= | 879 | Multi-index repr | benbovy 4160723 | closed | 0 | 2 | 2016-06-11T10:58:13Z | 2016-08-31T21:40:59Z | 2016-08-31T21:40:59Z | MEMBER | 0 | pydata/xarray/pulls/879 | Another item of #719. An example: ``` python
To be consistent with the displayed coordinates and/or data variables, it displays the actual used level values. Using the It still needs testing. Maybe it would be nice to align the level values. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/879/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull |
CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo] ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone] ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee] ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user] ON [issues] ([user]);
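The listing on this page (rows where user = 4160723, sorted by updated_at descending) can be reproduced against a SQLite table with this schema. The following is only a minimal sketch: the helper names `demo_connection` and `issues_by_user` are hypothetical, the table is a reduced copy of the schema above (a handful of columns, no foreign keys), and the third sample row is made up purely to show the filter at work.

```python
import sqlite3


def demo_connection():
    """Build an in-memory database with a reduced, illustrative `issues` table."""
    conn = sqlite3.connect(":memory:")
    # Subset of the columns from the schema above (assumption: enough for the demo).
    conn.execute(
        "CREATE TABLE issues ("
        " id INTEGER PRIMARY KEY, number INTEGER, title TEXT,"
        " user INTEGER, state TEXT, updated_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO issues VALUES (?, ?, ?, ?, ?, ?)",
        [
            # Two rows copied from the table on this page.
            (968796847, 5697,
             "Coerce the labels passed to Index.query to array-like objects",
             4160723, "closed", "2022-03-17T17:11:43Z"),
            (1183360119, 6418,
             "Fix concat with scalar coordinate (dtype)",
             4160723, "closed", "2022-03-29T07:06:46Z"),
            # Hypothetical row for another user, present only to exercise the filter.
            (1, 1, "placeholder issue", 999, "open", "2023-01-01T00:00:00Z"),
        ],
    )
    return conn


def issues_by_user(conn, user_id):
    """Return (number, title, updated_at) for one user, most recently updated first."""
    cur = conn.execute(
        "SELECT number, title, updated_at FROM issues"
        " WHERE user = ? ORDER BY updated_at DESC",
        (user_id,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    for number, title, updated_at in issues_by_user(demo_connection(), 4160723):
        print(number, updated_at, title)
```

Sorting on the TEXT column works here because the timestamps are ISO 8601 strings in UTC, which order chronologically when compared lexicographically.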