pull_requests
9 rows where user = 20629530
id | node_id | number | state | locked | title | user | body | created_at | updated_at | closed_at | merged_at | merge_commit_sha | assignee | milestone | draft | head | base | author_association | auto_merge | repo | url | merged_by |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
369184294 | MDExOlB1bGxSZXF1ZXN0MzY5MTg0Mjk0 | 3733 | closed | 0 | Implementation of polyfit and polyval | aulemahal 20629530 | - [x] Closes #3349 - [x] Tests added - [x] Passes `isort -rc . && black . && mypy . && flake8` - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API Following discussions in #3349, I suggest here an implementation of `polyfit` and `polyval` for xarray. However, this is still a work in progress: a lot of testing is missing and all docstrings are missing. But, mainly, I have questions on how to properly conduct this. My implementation mostly duplicates the code of `np.polyfit`, but makes use of `dask.array.linalg.lstsq` and `dask.array.apply_along_axis` for dask arrays. It uses the same method as `xscale.signal.fitting.polyfit`, but I add NaN-awareness in a 1-D manner. The numpy version is also slightly different from `np.polyfit` because of the NaN skipping, even though I wanted the function to replicate its behaviour. It returns a variable number of DataArrays, depending on the keyword arguments (coefficients, [residuals, matrix rank, singular values] / [covariance matrix]). This gives a medium-length function with a lot of code duplicated from `numpy.polyfit`. I thought of simply using `xr.apply_ufunc`, but that forbids chunking along the fitted dimension and makes it difficult to return the ancillary results (residuals, rank, covariance matrix...). Questions: 1) Are the functions where they should be? 2) Should xarray's implementation really replicate the behaviour of numpy's? A lot of extra code could be removed if we decided to only compute and return the residuals and the coefficients. All the other variables are a few lines of code away for a user who really wants them, and they don't need the power of xarray and dask anyway. | 2020-01-30T16:58:51Z | 2020-03-26T00:22:17Z | 2020-03-25T17:17:45Z | 2020-03-25T17:17:45Z | ec215daecec642db94102dc24156448f8440f52d | | | 0 | 7eeba59ff487d5bc51809da4ae824e7283b5b2aa | 009aa66620b3437cf0de675013fa7d1ff231963c | CONTRIBUTOR | | xarray 13221727 | https://github.com/pydata/xarray/pull/3733 | |
413713886 | MDExOlB1bGxSZXF1ZXN0NDEzNzEzODg2 | 4033 | closed | 0 | xr.infer_freq | aulemahal 20629530 | - [x] Tests added - [x] Passes `isort -rc . && black . && mypy . && flake8` - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API This PR adds an `xr.infer_freq` function that mirrors pandas' `infer_freq` but works on `CFTimeIndex` objects. I tried to subclass pandas' `_FrequencyInferer` and to override as little as possible. Two things are problematic right now, and I would like feedback on how to implement them if this PR gets the devs' approval. 1) `pd.DatetimeIndex.asi8` returns integers representing _nanoseconds_ since 1970-01-01, while `xr.CFTimeIndex.asi8` returns _microseconds_. In order not to break the API, I patched the `_CFTimeFrequencyInferer` to store 1000x the values. Not sure if this is the best approach, but it works. 2) As of now, `xr.infer_freq` will fail on weekly indexes. This is because pandas uses `datetime.weekday()` at some point, but cftime objects do not implement that (they use `dayofwk` instead). I'm not sure what to do: cftime could implement it to completely mirror Python's datetime, or pandas could use `dayofwk` since it's available on `Timestamp` objects. Another option, cleaner but longer, would be to reimplement `_FrequencyInferer` from scratch. I may have time for this, because I really think an `xr.infer_freq` method would be useful. | 2020-05-05T19:39:05Z | 2020-05-30T18:11:36Z | 2020-05-30T18:08:27Z | 2020-05-30T18:08:27Z | fd9e620a84389170138cc014ee5a0213718beb78 | | | 0 | 9a553edae8b2b4f52e5044d89b0f0354d51b003c | d1f7cb8fd95d588d3f7a7e90916c25747b90ad5a | CONTRIBUTOR | | xarray 13221727 | https://github.com/pydata/xarray/pull/4033 | |
424048387 | MDExOlB1bGxSZXF1ZXN0NDI0MDQ4Mzg3 | 4099 | closed | 0 | Allow non-unique and non-monotonic coordinates in get_clean_interp_index and polyfit | aulemahal 20629530 | - [ ] Closes #xxxx - [x] Tests added - [x] Passes `isort -rc . && black . && mypy . && flake8` - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API PR #3733 added `da.polyfit` and `xr.polyval`, which use `xr.core.missing.get_clean_interp_index` to get the fitting coordinate. However, that method is stricter than what polyfit needs: as in `numpy.polyfit`, non-unique and non-monotonic indexes are acceptable. This PR adds a `strict` keyword argument to `get_clean_interp_index` so the uniqueness and monotonicity tests can be skipped. `ds.polyfit` and `xr.polyval` were modified to use that keyword. I only added tests for `get_clean_interp_index`; I could add more for `polyfit` if requested. | 2020-05-27T18:48:58Z | 2020-06-05T15:46:00Z | 2020-06-05T15:46:00Z | 2020-06-05T15:46:00Z | 09df5ca4036d84620373fa4bccd11d1f1d4bec28 | | | 0 | fedfbf5ccdf52cac82ac0c072ae8882d630a2f51 | e5cc19cd8f8a69e0743f230f5bf51b7778a0ff96 | CONTRIBUTOR | | xarray 13221727 | https://github.com/pydata/xarray/pull/4099 | |
431889644 | MDExOlB1bGxSZXF1ZXN0NDMxODg5NjQ0 | 4135 | closed | 0 | Correct dask handling for 1D idxmax/min on ND data | aulemahal 20629530 | - [x] Closes #4123 - [x] Tests added - [x] Passes `isort -rc . && black . && mypy . && flake8` - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API Based on comments in dask/dask#3096, I fixed the dask indexing error that occurred when `idxmax/idxmin` were called on ND data (where N > 2). The added tests are very simplistic; I believe the 1D and 2D tests already cover most cases, and I just wanted to check that it was indeed working on ND data, assuming that non-dask data was already treated properly. I believe this doesn't conflict with #3936. | 2020-06-09T15:36:09Z | 2020-06-25T16:09:59Z | 2020-06-25T03:59:52Z | 2020-06-25T03:59:51Z | f4638afe009fde5f53de1a1b80cc71f62593c463 | | | 0 | 76e82e90948aae14f170c595dc2ee61fdf1770cf | fb5fe79a2881055065cc2c0ed3f49f5448afdf32 | CONTRIBUTOR | | xarray 13221727 | https://github.com/pydata/xarray/pull/4135 | |
443610926 | MDExOlB1bGxSZXF1ZXN0NDQzNjEwOTI2 | 4193 | closed | 0 | Fix polyfit fail on deficient rank | aulemahal 20629530 | - [x] Closes #4190 - [x] Tests added - [x] Passes `isort -rc . && black . && mypy . && flake8` - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - [ ] New functions/methods are listed in `api.rst` Fixes #4190. In cases where the input matrix had a deficient rank (matrix rank != order) because of the number of NaN values, polyfit would fail, simply because numpy's lstsq returned an empty array for the residuals (instead of a size 1 array). This fixes the problem by catching that case and returning `np.nan` instead. The other point in the issue was that `RankWarning` is also not raised in that case. That was because `da.polyfit` was computing the rank from the coordinate (Vandermonde) matrix instead of the masked data. Thus, if a given line has too many NaN values, its deficient rank was not detected. I added a test and a warning at all places where a rank is computed (5 different lines). Also, to match `np.polyfit`'s behaviour of not warning when `full=True`, I changed the warning filters using a context manager, ignoring the `RankWarning` in that case. Overall, it feels a bit ugly because of the duplicated code, and it will print the warning for every line of an array that has a deficient rank, which can be a lot... | 2020-07-02T16:00:21Z | 2020-08-20T14:20:43Z | 2020-08-20T08:34:45Z | 2020-08-20T08:34:45Z | efabe74b1ce8f0666b93658ebb48104aa37b26ac | | | 0 | 04be2e0fa1f96762798761f08aca7c37d7d8c67d | 26547d19d477cc77461c09b3aadd55f7eb8b4dbf | CONTRIBUTOR | | xarray 13221727 | https://github.com/pydata/xarray/pull/4193 | |
625530046 | MDExOlB1bGxSZXF1ZXN0NjI1NTMwMDQ2 | 5233 | closed | 0 | Calendar utilities | aulemahal 20629530 | - [x] Closes #5155 - [x] Tests added - [x] Passes `pre-commit run --all-files` - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - [x] New functions/methods are listed in `api.rst` So: - Added `coding.cftime_offsets.date_range` and `coding.cftime_offsets.date_range_like`. The first simply switches between `pd.date_range` and `xarray.cftime_range` according to the arguments. The second infers start, end and freq from an existing datetime array and returns a similar range in another calendar. - Added `coding/calendar_ops.py` with `convert_calendar` and `interp_calendar`. Didn't know where to put them, so there they are. - Added `DataArray.dt.calendar`. When the datetime objects are backed by numpy, it always returns `"proleptic_gregorian"`. I'm not sure where to expose these functions. Should the range generators be accessible directly, like `xr.date_range`? `convert_calendar` and `interp_calendar` could be implemented as methods of `DataArray` and `Dataset`; should I do that? | 2021-04-28T20:01:33Z | 2021-12-30T22:54:49Z | 2021-12-30T22:54:11Z | 2021-12-30T22:54:11Z | b14e2d8400da5c036f1ebb5486939f7f587b9f27 | | | 0 | 5aa747079ce32c51645ca245b1423cbacaf0cb1b | 2694046c748a51125de6d460073635f1d789958e | CONTRIBUTOR | | xarray 13221727 | https://github.com/pydata/xarray/pull/5233 | |
657205536 | MDExOlB1bGxSZXF1ZXN0NjU3MjA1NTM2 | 5402 | open | 0 | `dt.to_pytimedelta` to allow arithmetic with cftime objects | aulemahal 20629530 | - [ ] Closes #xxxx - [x] Tests added - [x] Passes `pre-commit run --all-files` - [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - [ ] New functions/methods are listed in `api.rst` When playing with cftime objects, a problem I encountered many times is that I can subtract two arrays but not add the result back to another. Subtracting two cftime datetime arrays results in an array of `np.timedelta64`, and when trying to add it back to another cftime array, we get a `UFuncTypeError` because the two arrays have incompatible dtypes: '<m8[ns]' and 'O'. Example: ```python import xarray as xr da = xr.DataArray(xr.cftime_range('1900-01-01', freq='D', periods=10), dims=('time',)) # An array of timedelta64[ns] dt = da - da[0] da[-1] + dt # Fails ``` However, if the two arrays were of 'O' dtype, the subtraction would be handled by `cftime`, which supports `datetime.timedelta` objects. The solution here adds a `to_pytimedelta` method to the `TimedeltaAccessor`, mirroring the name of the similar function on `pd.Series.dt`. It uses a monkeypatching workaround to prevent xarray from casting the array back into numpy objects. The user still has to check whether the data is backed by cftime or numpy to adapt the operation (calling `dt.to_pytimedelta` or not), but custom workarounds were always overly complicated for such a simple problem, so this helps. Also, this doesn't work with dask arrays, because loading a dask array triggers the variable constructor and thus recasts the array of `datetime.timedelta` to `numpy.timedelta64`. I realize I maybe should have opened an issue first, but I had this idea and it all rushed along. | 2021-05-28T22:48:50Z | 2022-06-09T14:50:16Z | | | 0060277e4ecf1b05a198aeff9051d86f814b0096 | | | 0 | 71d567789573b47e059dbaebabcbda9c3493d0c5 | d1e4164f3961d7bbb3eb79037e96cae14f7182f8 | CONTRIBUTOR | | xarray 13221727 | https://github.com/pydata/xarray/pull/5402 | |
729993114 | MDExOlB1bGxSZXF1ZXN0NzI5OTkzMTE0 | 5781 | open | 0 | Add encodings to save_mfdataset | aulemahal 20629530 | - [ ] Closes #xxxx - [x] Tests added - [x] Passes `pre-commit run --all-files` - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - [ ] New functions/methods are listed in `api.rst` This simply adds an `encodings` argument to `save_mfdataset`. As with the other arguments, it expects a list of dictionaries, with the encoding information to pass to `to_netcdf` for each dataset. I added a minimal test, simply to check that the argument is taken into account. | 2021-09-08T21:24:13Z | 2022-10-06T21:44:18Z | | | d86b32087d7108dc866e34569653033973160827 | | | 0 | 23acbb84683f3dab9f593ee63a0323433b2b3638 | d1e4164f3961d7bbb3eb79037e96cae14f7182f8 | CONTRIBUTOR | | xarray 13221727 | https://github.com/pydata/xarray/pull/5781 | |
1673012286 | PR_kwDOAMm_X85juCQ- | 8603 | closed | 0 | Convert 360_day calendars by choosing random dates to drop or add | aulemahal 20629530 | - [x] Tests added - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` Small PR to add a new "method" to convert to and from 360_day calendars. The current two methods (chosen with the `align_on` keyword) always remove or add the same day-of-year for all years of the same length. This new option randomly chooses the days to drop or add, one for each fifth of the year (72-day period). It emulates the method of the LOCA datasets (see [web page](https://loca.ucsd.edu/loca-calendar/) and [article](https://journals.ametsoc.org/view/journals/hydr/15/6/jhm-d-14-0082_1.xml)). February 29th is always removed/added when the source/target is a leap year. I copied the implementation from xclim (which I wrote), [see code here](https://github.com/Ouranosinc/xclim/blob/fb29b8a8e400c7d8aaf4e1233a6b37a300126257/xclim/core/calendar.py#L112-L134). | 2024-01-10T19:13:31Z | 2024-04-16T14:53:42Z | 2024-04-16T14:53:42Z | 2024-04-16T14:53:42Z | 239309f881ba0d7e02280147bc443e6e286e6a63 | | | 0 | b581e1f700382207c3bd0fd03860f44f33b29b79 | b004af5174a4b0e32519df792a4f625d5548a9f0 | CONTRIBUTOR | | xarray 13221727 | https://github.com/pydata/xarray/pull/8603 | |
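A few hedged usage sketches for the APIs described in the rows above follow. First, the polyfit/polyval workflow from #3733: a minimal sketch assuming a released xarray where `DataArray.polyfit` and `xr.polyval` are public API (the exact signatures may have evolved since the PR).

```python
# Minimal sketch of the polyfit/polyval workflow from #3733,
# assuming DataArray.polyfit and xr.polyval are available as public API.
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.random.rand(10, 3),
    dims=("x", "y"),
    coords={"x": np.arange(10)},
)

# Fit a degree-1 polynomial along "x"; returns a Dataset of coefficients
# (plus residuals, rank and singular values when full=True).
fit = da.polyfit(dim="x", deg=1, full=True)

# Evaluate the fitted polynomial back on the original coordinate.
trend = xr.polyval(da["x"], fit.polyfit_coefficients)
```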
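Next, a short sketch of `xr.infer_freq` from #4033, assuming it is exposed at the top level as described in the PR.

```python
# Sketch of xr.infer_freq on a CFTimeIndex (see #4033); it mirrors
# pd.infer_freq but also accepts cftime-backed indexes and coordinates.
import xarray as xr

times = xr.cftime_range("2000-01-01", periods=6, freq="MS", calendar="noleap")
print(xr.infer_freq(times))  # expected: "MS"
```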
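For #4135, a sketch of the case the fix targets: `idxmax`/`idxmin` along one dimension of dask-backed data with more than two dimensions (only the standard public `idxmax` API is assumed).

```python
# Sketch of idxmax on N-D (N > 2) dask-backed data, the case fixed in #4135.
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.random.rand(4, 5, 6),
    dims=("time", "y", "x"),
    coords={"time": np.arange(4)},
).chunk({"y": 2})

# Label of the maximum along "time", computed lazily on the dask array.
imax = da.idxmax(dim="time")
print(imax.compute())
```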
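For the calendar utilities in #5233 and the 360_day "random" alignment in #8603, a sketch assuming the API as it was eventually exposed (`dt.calendar`, `convert_calendar` as a method, and a top-level `xr.date_range_like`); names may differ in older versions.

```python
# Sketch of the calendar utilities from #5233 and align_on="random" from #8603,
# assuming a recent xarray that exposes them as shown.
import xarray as xr

time = xr.cftime_range("2000-01-01", periods=720, freq="D", calendar="360_day")
da = xr.DataArray(range(720), dims=("time",), coords={"time": time})

print(da.time.dt.calendar)  # "360_day"

# Converting from a 360_day calendar requires align_on: "year" drops the same
# days-of-year every year, "random" (from #8603) picks the dropped days at
# random within each 72-day period, LOCA-style.
std = da.convert_calendar("standard", align_on="random")

# Build a date range matching an existing one, but in another calendar.
noleap_time = xr.date_range_like(da.time, calendar="noleap")
```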
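Finally, #5781 is still open, so the `encodings` keyword below is only the API proposed in that PR, not a released feature; everything else in the sketch is standard xarray.

```python
# Hypothetical usage of the `encodings` argument proposed in #5781
# (not part of a released xarray at the time of this export).
import numpy as np
import pandas as pd
import xarray as xr

time = pd.date_range("2000-01-01", periods=730, freq="D")
ds = xr.Dataset({"air": ("time", np.random.rand(time.size))}, coords={"time": time})

# One output file and one encoding dict per yearly dataset.
years, datasets = zip(*ds.groupby("time.year"))
paths = [f"air_{y}.nc" for y in years]
encodings = [{"air": {"zlib": True, "complevel": 4}}] * len(datasets)

xr.save_mfdataset(datasets, paths, encodings=encodings)
```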
CREATE TABLE [pull_requests] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [state] TEXT,
   [locked] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [body] TEXT,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [merged_at] TEXT,
   [merge_commit_sha] TEXT,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [draft] INTEGER,
   [head] TEXT,
   [base] TEXT,
   [author_association] TEXT,
   [auto_merge] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [url] TEXT,
   [merged_by] INTEGER REFERENCES [users]([id])
);
CREATE INDEX [idx_pull_requests_merged_by] ON [pull_requests] ([merged_by]);
CREATE INDEX [idx_pull_requests_repo] ON [pull_requests] ([repo]);
CREATE INDEX [idx_pull_requests_milestone] ON [pull_requests] ([milestone]);
CREATE INDEX [idx_pull_requests_assignee] ON [pull_requests] ([assignee]);
CREATE INDEX [idx_pull_requests_user] ON [pull_requests] ([user]);