github: issues: 3 rows where "created_at" is on date 2023-07-19 and user = 2448579 sorted by updated

3 rows where "created_at" is on date 2023-07-19 and user = 2448579 sorted by updated_at descending

id	node_id	number	title	user	state	comments	created_at	updated_at	closed_at	author_association	draft	pull_request	body	reactions	state_reason	repo	type
1812301185	I_kwDOAMm_X85sBYWB	8005	Design for IntervalIndex	dcherian 2448579	open	5	2023-07-19T16:30:50Z	2023-09-09T06:30:20Z		MEMBER			Is your feature request related to a problem? We should add a wrapper for `pandas.IntervalIndex`; this would solve a long-standing problem around propagating "bounds" variables (CF conventions, https://github.com/pydata/xarray/issues/1475). The CF design CF "encoding" for intervals is to use bounds variables. There is an attribute `"bounds"` on the dimension coordinate that refers to a second variable (at least 2D). Example: `x` has an attribute `bounds` that refers to `x_bounds`. ```python import numpy as np import xarray as xr left = np.arange(0.5, 3.6, 1) right = np.arange(1.5, 4.6, 1) bounds = np.stack([left, right]) ds = xr.Dataset( {"data": ("x", [1, 2, 3, 4])}, coords={"x": ("x", [1, 2, 3, 4], {"bounds": "x_bounds"}), "x_bounds": (("bnds", "x"), bounds)}, ) ds ``` A fundamental problem with our current data model is that we lose `x_bounds` when we extract `ds.data` because there is a dimension `bnds` that is not shared with `ds.data`. Very important metadata is now lost! We would also like to use the "bounds" to enable interval-based indexing. `ds.sel(x=1.1)` should give you the value from the appropriate interval. Pandas IntervalIndex All the indexing is easy to implement by wrapping pandas.IntervalIndex, but there is one limitation. `pd.IntervalIndex` saves two pieces of information for each interval (left bound, right bound). CF saves three: left bound, right bound (see `x_bounds`) and a "central" value (see `x`). This should be OK to work around in our wrapper. Fundamental Question To me, a core question is whether `x_bounds` needs to be preserved after creating an `IntervalIndex`. 1. If so, we need a better rule around coordinate variable propagation. In this case, the IntervalIndex would be associated with `x` and `x_bounds`.
So the rule could be > "propagate all variables necessary to propagate an index associated with any of the dimensions on the extracted variable." So when extracting `ds.data` we propagate all variables necessary to propagate indexes associated with `ds.data.dims`, that is, `x`, which would say "propagate `x`, `x_bounds`, and the IntervalIndex". 2. Alternatively, we could choose to drop `x_bounds` entirely. I interpret this approach as "decoding" the bounds variable to an interval index object. When saving to disk, we would encode the interval index in two variables. (See below) Describe the solution you'd like I've prototyped (2) (approach 1 in this notebook) following @benbovy's suggestion ```python import numpy as np import pandas as pd import xarray as xr from xarray.indexes import PandasIndex class XarrayIntervalIndex(PandasIndex): def __init__(self, index, dim, coord_dtype): assert isinstance(index, pd.IntervalIndex) # for PandasIndex self.index = index self.dim = dim self.coord_dtype = coord_dtype @classmethod def from_variables(cls, variables, options): assert len(variables) == 1 (dim,) = tuple(variables) bounds = options["bounds"] assert isinstance(bounds, (xr.DataArray, xr.Variable)) (axis,) = bounds.get_axis_num(set(bounds.dims) - {dim}) left, right = np.split(bounds.data, 2, axis=axis) index = pd.IntervalIndex.from_arrays(left.squeeze(), right.squeeze()) coord_dtype = bounds.dtype return cls(index, dim, coord_dtype) def create_variables(self, variables): from xarray.core.indexing import PandasIndexingAdapter newvars = {self.dim: xr.Variable(self.dim, PandasIndexingAdapter(self.index))} return newvars def __repr__(self): string = f"Xarray{self.index!r}" return string def to_pandas_index(self): return self.index @property def mid(self): return PandasIndex(self.index.mid, self.dim, self.coord_dtype) @property def left(self): return PandasIndex(self.index.left, self.dim, self.coord_dtype) @property def right(self): return PandasIndex(self.index.right, self.dim, self.coord_dtype) ```
```python ds1 = ( ds.drop_indexes("x") .set_xindex("x", XarrayIntervalIndex, bounds=ds.x_bounds) .drop_vars("x_bounds") ) ds1 ``` ```python ds1.sel(x=1.1) ``` Describe alternatives you've considered I've tried some approaches in this notebook	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8005/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1812504689	I_kwDOAMm_X85sCKBx	8006	Fix documentation about datetime_unit of xarray.DataArray.differentiate	dcherian 2448579	closed	0	2023-07-19T18:31:10Z	2023-09-01T09:37:15Z	2023-09-01T09:37:15Z	MEMBER			Should say that `Y` and `M` cannot be supported with `datetime64` Discussed in https://github.com/pydata/xarray/discussions/8000 <sup>Originally posted by jesieleo July 19, 2023</sup> I have a piece of data that looks like this ``` <xarray.Dataset> Dimensions: (time: 612, LEV: 15, latitude: 20, longitude: 357) Coordinates: * time (time) datetime64[ns] 1960-01-15 1960-02-15 ... 2010-12-15 * LEV (LEV) float64 5.01 15.07 25.28 35.76 ... 149.0 171.4 197.8 229.5 * latitude (latitude) float64 -4.75 -4.25 -3.75 -3.25 ... 3.75 4.25 4.75 * longitude (longitude) float64 114.2 114.8 115.2 115.8 ... 291.2 291.8 292.2 Data variables: u (time, LEV, latitude, longitude) float32 ... Attributes: (12/30) cdm_data_type: Grid Conventions: COARDS, CF-1.6, ACDD-1.3 creator_email: chepurin@umd.edu creator_name: APDRC creator_type: institution creator_url: https://www.atmos.umd.edu/~ocean/ ... ... standard_name_vocabulary: CF Standard Name Table v29 summary: Simple Ocean Data Assimilation (SODA) soda po... 
time_coverage_end: 2010-12-15T00:00:00Z time_coverage_start: 1983-01-15T00:00:00Z title: SODA soda pop2.2.4 [TIME][LEV][LAT][LON] Westernmost_Easting: 118.25 ``` When I try to use xarray.DataArray.differentiate, `data.u.differentiate('time',datetime_unit='M')` raises ``` Traceback (most recent call last): File "<stdin>", line 1, in <module> File "D:\Anaconda3\lib\site-packages\xarray\core\dataarray.py", line 3609, in differentiate ds = self._to_temp_dataset().differentiate(coord, edge_order, datetime_unit) File "D:\Anaconda3\lib\site-packages\xarray\core\dataset.py", line 6372, in differentiate coord_var = coord_var._to_numeric(datetime_unit=datetime_unit) File "D:\Anaconda3\lib\site-packages\xarray\core\variable.py", line 2428, in _to_numeric numeric_array = duck_array_ops.datetime_to_numeric( File "D:\Anaconda3\lib\site-packages\xarray\core\duck_array_ops.py", line 466, in datetime_to_numeric array = array / np.timedelta64(1, datetime_unit) TypeError: Cannot get a common metadata divisor for Numpy datatime metadata [ns] and [M] because they have incompatible nonlinear base time units. ``` Could you please tell me whether this is a bug?	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8006/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1812646094	PR_kwDOAMm_X85V7g7q	8007	Update copyright year in README	dcherian 2448579	closed	0	2023-07-19T20:00:50Z	2023-07-20T21:13:27Z	2023-07-20T21:13:26Z	MEMBER	0	pydata/xarray/pulls/8007		{ "url": "https://api.github.com/repos/pydata/xarray/issues/8007/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	pull
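
The body of issue 8005 describes interval-based lookup: `ds.sel(x=1.1)` should return the value from the interval that contains 1.1. The sketch below shows only the underlying pandas behaviour, using the bounds from the issue's own example. It is not the xarray wrapper proposed in the issue, and the variable names `pos`, `value`, and `mid` are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Bounds taken from the example in issue 8005.
left = np.arange(0.5, 3.6, 1)   # 0.5, 1.5, 2.5, 3.5
right = np.arange(1.5, 4.6, 1)  # 1.5, 2.5, 3.5, 4.5

# pandas stores only the two bounds per interval (closed on the right by default).
index = pd.IntervalIndex.from_arrays(left, right)

data = np.array([1, 2, 3, 4])

# Find the position of the interval containing 1.1, then read the data there.
pos = index.get_loc(1.1)
value = data[pos]

# The "central" values that CF stores in `x` can be recomputed as midpoints here,
# because in this example `x` happens to sit at the centre of each interval.
mid = index.mid
```

In general a CF coordinate need not sit at the midpoint of its bounds, which is why the issue treats the third value as something the wrapper has to carry separately.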

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
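
The filter in the page title ("created_at" is on date 2023-07-19, user = 2448579, sorted by updated_at descending) can be approximated against this schema with plain SQL. The following is a minimal sketch using Python's standard `sqlite3` module. It recreates only a subset of the columns in an in-memory database, loads the three rows shown above, and assumes that matching on `date(created_at)` is equivalent to the page's date filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Subset of the [issues] schema; the remaining columns are omitted for brevity.
conn.execute(
    """
    CREATE TABLE [issues] (
       [id] INTEGER PRIMARY KEY,
       [number] INTEGER,
       [title] TEXT,
       [user] INTEGER,
       [state] TEXT,
       [created_at] TEXT,
       [updated_at] TEXT
    )
    """
)
conn.execute("CREATE INDEX [idx_issues_user] ON [issues] ([user])")

# The three rows listed on this page (metadata columns only).
rows = [
    (1812301185, 8005, "Design for IntervalIndex", 2448579, "open",
     "2023-07-19T16:30:50Z", "2023-09-09T06:30:20Z"),
    (1812504689, 8006,
     "Fix documentation about datetime_unit of xarray.DataArray.differentiate",
     2448579, "closed", "2023-07-19T18:31:10Z", "2023-09-01T09:37:15Z"),
    (1812646094, 8007, "Update copyright year in README", 2448579, "closed",
     "2023-07-19T20:00:50Z", "2023-07-20T21:13:27Z"),
]
conn.executemany("INSERT INTO [issues] VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# date() extracts the calendar day from the ISO 8601 timestamp text.
query = """
    SELECT [number], [title], [updated_at]
    FROM [issues]
    WHERE date([created_at]) = :day AND [user] = :user
    ORDER BY [updated_at] DESC
"""
result = conn.execute(query, {"day": "2023-07-19", "user": 2448579}).fetchall()
numbers = [row[0] for row in result]
```

Because the timestamps are stored as ISO 8601 text, ordering by `updated_at` as a string gives chronological order without any conversion.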
