pull_requests

9 rows where user = 20629530

Columns: id, node_id, number, state, locked, title, user, body, created_at, updated_at, closed_at, merged_at, merge_commit_sha, assignee, milestone, draft, head, base, author_association, auto_merge, repo, url, merged_by
id: 369184294 · node_id: MDExOlB1bGxSZXF1ZXN0MzY5MTg0Mjk0 · number: 3733 · state: closed · locked: 0 · draft: 0
title: Implementation of polyfit and polyval
user: aulemahal (20629530) · author_association: CONTRIBUTOR · repo: xarray (13221727)
created_at: 2020-01-30T16:58:51Z · updated_at: 2020-03-26T00:22:17Z · closed_at: 2020-03-25T17:17:45Z · merged_at: 2020-03-25T17:17:45Z
merge_commit_sha: ec215daecec642db94102dc24156448f8440f52d · head: 7eeba59ff487d5bc51809da4ae824e7283b5b2aa · base: 009aa66620b3437cf0de675013fa7d1ff231963c
url: https://github.com/pydata/xarray/pull/3733
body:
- [x] Closes #3349
- [x] Tests added
- [x] Passes `isort -rc . && black . && mypy . && flake8`
- [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API

Following discussions in #3349, I suggest here an implementation of `polyfit` and `polyval` for xarray. However, this is still work in progress: a lot of testing is missing and all docstrings are missing. But, mainly, I have questions on how to properly conduct this. My implementation mostly duplicates the code of `np.polyfit`, but makes use of `dask.array.linalg.lstsq` and `dask.array.apply_along_axis` for dask arrays. It is the same method as in `xscale.signal.fitting.polyfit`, but I add NaN-awareness in a 1-D manner. The numpy version is also slightly different from `np.polyfit` because of the NaN skipping, but I wanted the function to replicate its behaviour. It returns a variable number of DataArrays, depending on the keyword arguments (coefficients, [residuals, matrix rank, singular values] / [covariance matrix]), which gives a medium-length function with a lot of code duplicated from `numpy.polyfit`. I thought of simply using `xr.apply_ufunc`, but that forbids chunking along the fitted dimension and makes it difficult to return the ancillary results (residuals, rank, covariance matrix...). Questions: 1) Are the functions where they should go? 2) Should xarray's implementation really replicate the behaviour of numpy's? A lot of extra code could be removed if we only wanted to compute and return the residuals and the coefficients. All the other variables are a few lines of code away for a user who really wants them, and they don't need the power of xarray and dask anyway.
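
As a reference for this record, a minimal sketch of the `polyfit`/`polyval` API as it was eventually merged into xarray; the data values are made up for illustration.

```python
import numpy as np
import xarray as xr

# Fit a degree-1 polynomial along "time" and evaluate it back on the coordinate.
# full=True also returns the ancillary results discussed above (residuals, rank, ...).
da = xr.DataArray(
    2.0 * np.arange(10.0) + 1.0,
    dims=("time",),
    coords={"time": np.arange(10.0)},
)
fit = da.polyfit(dim="time", deg=1, full=True)
trend = xr.polyval(da["time"], fit.polyfit_coefficients)
```
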
id: 413713886 · node_id: MDExOlB1bGxSZXF1ZXN0NDEzNzEzODg2 · number: 4033 · state: closed · locked: 0 · draft: 0
title: xr.infer_freq
user: aulemahal (20629530) · author_association: CONTRIBUTOR · repo: xarray (13221727)
created_at: 2020-05-05T19:39:05Z · updated_at: 2020-05-30T18:11:36Z · closed_at: 2020-05-30T18:08:27Z · merged_at: 2020-05-30T18:08:27Z
merge_commit_sha: fd9e620a84389170138cc014ee5a0213718beb78 · head: 9a553edae8b2b4f52e5044d89b0f0354d51b003c · base: d1f7cb8fd95d588d3f7a7e90916c25747b90ad5a
url: https://github.com/pydata/xarray/pull/4033
body:
- [x] Tests added
- [x] Passes `isort -rc . && black . && mypy . && flake8`
- [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API

This PR adds an `xr.infer_freq` function that mirrors pandas' `infer_freq` but works on `CFTimeIndex` objects. I tried to subclass pandas' `_FrequencyInferer` and override as little as possible. Two things are problematic right now, and I would like feedback on how to implement them if this PR gets the devs' approval. 1) `pd.DatetimeIndex.asi8` returns integers representing _nanoseconds_ since 1970-01-01, while `xr.CFTimeIndex.asi8` returns _microseconds_. In order not to break the API, I patched `_CFTimeFrequencyInferer` to store 1000x the values. Not sure if this is the best approach, but it works. 2) As of now, `xr.infer_freq` will fail on weekly indexes. This is because pandas uses `datetime.weekday()` at some point, but cftime objects do not implement that (they use `dayofwk` instead). I'm not sure what to do here: cftime could implement it to completely mirror Python's datetime, or pandas could use `dayofwk` since it's available on `Timestamp` objects. Another option, cleaner but longer, would be to reimplement `_FrequencyInferer` from scratch. I may have time for this, because I really think an `xr.infer_freq` method would be useful.
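
For context, a minimal sketch of the function as it exists in released xarray, inferring the frequency of a cftime index the way `pd.infer_freq` does for a `DatetimeIndex`:

```python
import xarray as xr

# Build a monthly CFTimeIndex in a non-standard calendar and infer its frequency.
times = xr.cftime_range("2000-01-01", periods=12, freq="MS", calendar="noleap")
print(xr.infer_freq(times))  # "MS"
```
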
id: 424048387 · node_id: MDExOlB1bGxSZXF1ZXN0NDI0MDQ4Mzg3 · number: 4099 · state: closed · locked: 0 · draft: 0
title: Allow non-unique and non-monotonic coordinates in get_clean_interp_index and polyfit
user: aulemahal (20629530) · author_association: CONTRIBUTOR · repo: xarray (13221727)
created_at: 2020-05-27T18:48:58Z · updated_at: 2020-06-05T15:46:00Z · closed_at: 2020-06-05T15:46:00Z · merged_at: 2020-06-05T15:46:00Z
merge_commit_sha: 09df5ca4036d84620373fa4bccd11d1f1d4bec28 · head: fedfbf5ccdf52cac82ac0c072ae8882d630a2f51 · base: e5cc19cd8f8a69e0743f230f5bf51b7778a0ff96
url: https://github.com/pydata/xarray/pull/4099
body:
- [ ] Closes #xxxx
- [x] Tests added
- [x] Passes `isort -rc . && black . && mypy . && flake8`
- [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API

Pull request #3733 added `da.polyfit` and `xr.polyval`, which use `xr.core.missing.get_clean_interp_index` to get the fitting coordinate. However, that method is stricter than what polyfit needs: as in `numpy.polyfit`, non-unique and non-monotonic indexes are acceptable. This PR adds a `strict` keyword argument to `get_clean_interp_index` so we can skip the uniqueness and monotonicity tests. `ds.polyfit` and `xr.polyval` were modified to use that keyword. I only added tests for `get_clean_interp_index`; I could add more for `polyfit` if requested.
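
A minimal sketch of the user-facing behaviour this change enables: fitting against a coordinate that is neither sorted nor unique (the values below are made up for illustration).

```python
import numpy as np
import xarray as xr

# The "x" coordinate is unsorted and contains a duplicate, which numpy.polyfit
# accepts and which polyfit accepts as well after this change.
x = np.array([0.0, 3.0, 1.0, 1.0, 2.0])
da = xr.DataArray(2.0 * x + 1.0, dims=("x",), coords={"x": x})
fit = da.polyfit(dim="x", deg=1)
print(fit.polyfit_coefficients.values)  # approximately [2., 1.]
```
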
id: 431889644 · node_id: MDExOlB1bGxSZXF1ZXN0NDMxODg5NjQ0 · number: 4135 · state: closed · locked: 0 · draft: 0
title: Correct dask handling for 1D idxmax/min on ND data
user: aulemahal (20629530) · author_association: CONTRIBUTOR · repo: xarray (13221727)
created_at: 2020-06-09T15:36:09Z · updated_at: 2020-06-25T16:09:59Z · closed_at: 2020-06-25T03:59:52Z · merged_at: 2020-06-25T03:59:51Z
merge_commit_sha: f4638afe009fde5f53de1a1b80cc71f62593c463 · head: 76e82e90948aae14f170c595dc2ee61fdf1770cf · base: fb5fe79a2881055065cc2c0ed3f49f5448afdf32
url: https://github.com/pydata/xarray/pull/4135
body:
- [x] Closes #4123
- [x] Tests added
- [x] Passes `isort -rc . && black . && mypy . && flake8`
- [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API

Based on comments on dask/dask#3096, I fixed the dask indexing error that occurred when `idxmax`/`idxmin` were called on ND data (where N > 2). The added tests are very simplistic; I believe the 1D and 2D tests already cover most cases. I just wanted to check that it was indeed working on ND data, assuming that non-dask data was already treated properly. I believe this doesn't conflict with #3936.
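
A minimal sketch of the case this record refers to: a 1-D `idxmax` reduction over one dimension of a dask-backed 3-D array (shapes and chunk sizes are arbitrary, and dask is assumed to be installed).

```python
import numpy as np
import xarray as xr

# Reduce along "time" on a chunked 3-D array; the result holds the "time"
# coordinate value of the maximum at each (y, x) location.
da = xr.DataArray(
    np.random.rand(4, 5, 6),
    dims=("time", "y", "x"),
    coords={"time": np.arange(4)},
).chunk({"y": 2})
idx = da.idxmax(dim="time")
print(idx.compute().shape)  # (5, 6)
```
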
id: 443610926 · node_id: MDExOlB1bGxSZXF1ZXN0NDQzNjEwOTI2 · number: 4193 · state: closed · locked: 0 · draft: 0
title: Fix polyfit fail on deficient rank
user: aulemahal (20629530) · author_association: CONTRIBUTOR · repo: xarray (13221727)
created_at: 2020-07-02T16:00:21Z · updated_at: 2020-08-20T14:20:43Z · closed_at: 2020-08-20T08:34:45Z · merged_at: 2020-08-20T08:34:45Z
merge_commit_sha: efabe74b1ce8f0666b93658ebb48104aa37b26ac · head: 04be2e0fa1f96762798761f08aca7c37d7d8c67d · base: 26547d19d477cc77461c09b3aadd55f7eb8b4dbf
url: https://github.com/pydata/xarray/pull/4193
body:
- [x] Closes #4190
- [x] Tests added
- [x] Passes `isort -rc . && black . && mypy . && flake8`
- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
- [ ] New functions/methods are listed in `api.rst`

Fixes #4190. In cases where the input matrix had a deficient rank (matrix rank != order) because of the number of NaN values, polyfit would fail, simply because numpy's lstsq returned an empty array for the residuals (instead of a size-1 array). This fixes the problem by catching that case and returning `np.nan` instead. The other point in the issue was that `RankWarning` is also not raised in that case. That was because `da.polyfit` computed the rank from the coordinate (Vandermonde) matrix instead of from the masked data, so if a given line had too many NaN values, its deficient rank was not detected. I added a test and a warning at all places where a rank is computed (five different lines). Also, to match `np.polyfit`'s behaviour of not warning when `full=True`, I changed the warning filters using a context manager, ignoring the `RankWarning` in that case. Overall, it feels a bit ugly because of the duplicated code, and it will print the warning for every line of an array that has a deficient rank, which can be a lot...
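
A hedged sketch of the failure mode described here: one line of the array has so many NaNs that its fit is rank-deficient. The values are illustrative, and the result names follow the API as merged.

```python
import numpy as np
import xarray as xr

# The second line has only one valid point for a degree-1 fit, so its rank is
# deficient. After this fix, polyfit no longer raises on such input; the
# residual for that line comes back as NaN (and full=True suppresses the
# RankWarning, matching np.polyfit).
da = xr.DataArray(
    [[0.0, 1.0, 2.0, 3.0], [np.nan, np.nan, np.nan, 1.0]],
    dims=("y", "x"),
    coords={"x": [0.0, 1.0, 2.0, 3.0]},
)
fit = da.polyfit(dim="x", deg=1, skipna=True, full=True)
print(fit.polyfit_residuals.values)
```
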
id: 625530046 · node_id: MDExOlB1bGxSZXF1ZXN0NjI1NTMwMDQ2 · number: 5233 · state: closed · locked: 0 · draft: 0
title: Calendar utilities
user: aulemahal (20629530) · author_association: CONTRIBUTOR · repo: xarray (13221727)
created_at: 2021-04-28T20:01:33Z · updated_at: 2021-12-30T22:54:49Z · closed_at: 2021-12-30T22:54:11Z · merged_at: 2021-12-30T22:54:11Z
merge_commit_sha: b14e2d8400da5c036f1ebb5486939f7f587b9f27 · head: 5aa747079ce32c51645ca245b1423cbacaf0cb1b · base: 2694046c748a51125de6d460073635f1d789958e
url: https://github.com/pydata/xarray/pull/5233
body:
- [x] Closes #5155
- [x] Tests added
- [x] Passes `pre-commit run --all-files`
- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
- [x] New functions/methods are listed in `api.rst`

So:
- Added `coding.cftime_offsets.date_range` and `coding.cftime_offsets.date_range_like`. The first simply switches between `pd.date_range` and `xarray.cftime_range` according to the arguments. The second infers start, end and freq from an existing datetime array and returns a similar range in another calendar.
- Added `coding/calendar_ops.py` with `convert_calendar` and `interp_calendar`. Didn't know where to put them, so there they are.
- Added `DataArray.dt.calendar`. When the datetime objects are backed by numpy, it always returns `"proleptic_gregorian"`.

I'm not sure where to expose these functions. Should the range generators be accessible directly, like `xr.date_range`? `convert_calendar` and `interp_calendar` could be implemented as methods of `DataArray` and `Dataset`; should I do that?
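
A minimal sketch of these utilities through the public API they eventually got (a calendar-aware `xr.date_range`, the `dt.calendar` accessor and the `convert_calendar` method); the data values are made up.

```python
import numpy as np
import xarray as xr

# Build a year of daily data in the "noleap" calendar, inspect its calendar,
# and convert it to the standard calendar.
times = xr.date_range("2000-01-01", periods=365, freq="D", calendar="noleap")
da = xr.DataArray(np.arange(365), dims=("time",), coords={"time": times})
print(da.time.dt.calendar)  # "noleap"
converted = da.convert_calendar("standard")
```
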
id: 657205536 · node_id: MDExOlB1bGxSZXF1ZXN0NjU3MjA1NTM2 · number: 5402 · state: open · locked: 0 · draft: 0
title: `dt.to_pytimedelta` to allow arithmetic with cftime objects
user: aulemahal (20629530) · author_association: CONTRIBUTOR · repo: xarray (13221727)
created_at: 2021-05-28T22:48:50Z · updated_at: 2022-06-09T14:50:16Z
merge_commit_sha: 0060277e4ecf1b05a198aeff9051d86f814b0096 · head: 71d567789573b47e059dbaebabcbda9c3493d0c5 · base: d1e4164f3961d7bbb3eb79037e96cae14f7182f8
url: https://github.com/pydata/xarray/pull/5402
body:
- [ ] Closes #xxxx
- [x] Tests added
- [x] Passes `pre-commit run --all-files`
- [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
- [ ] New functions/methods are listed in `api.rst`

When playing with cftime objects, a problem I encountered many times is that I can't subtract two arrays and then add the difference back to a third one. Subtracting two cftime datetime arrays results in an array of `np.timedelta64`, and when trying to add it back to another cftime array, we get a `UFuncTypeError` because the two arrays have incompatible dtypes: '<m8[ns]' and 'O'. Example:

```python
import xarray as xr

da = xr.DataArray(xr.cftime_range('1900-01-01', freq='D', periods=10), dims=('time',))

# An array of timedelta64[ns]
dt = da - da[0]

da[-1] + dt  # Fails
```

However, if the two arrays were both of 'O' dtype, the operation would be handled by `cftime`, which supports `datetime.timedelta` objects. The solution here adds a `to_pytimedelta` method to the `TimedeltaAccessor`, mirroring the name of the similar function on `pd.Series.dt`. It uses a monkeypatching workaround to prevent xarray from casting the array back into numpy objects. The user still has to check whether the data is cftime or numpy to adapt the operation (calling `dt.to_pytimedelta` or not), but custom workarounds were always overly complicated for such a simple problem, so this helps. Also, this doesn't work with dask arrays, because loading a dask array triggers the variable constructor and thus recasts the array of `datetime.timedelta` back to `numpy.timedelta64`. I realize I maybe should have opened an issue first, but I had this idea and it all rushed along.
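
Since this PR is still open, `dt.to_pytimedelta` is not part of released xarray. Below is a hedged sketch of the kind of manual workaround the proposal aims to replace: converting the differences and doing the arithmetic on the raw values, outside of xarray's wrapping.

```python
import pandas as pd
import xarray as xr

da = xr.DataArray(xr.cftime_range("1900-01-01", freq="D", periods=10), dims=("time",))
dt = da - da[0]  # timedelta64[ns]

# Convert the differences to datetime.timedelta objects by hand, then add a
# cftime scalar at the numpy level, where cftime handles the arithmetic.
py_deltas = pd.to_timedelta(dt.values).to_pytimedelta()
shifted = xr.DataArray(py_deltas + da[-1].item(), dims=("time",))
```
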
id: 729993114 · node_id: MDExOlB1bGxSZXF1ZXN0NzI5OTkzMTE0 · number: 5781 · state: open · locked: 0 · draft: 0
title: Add encodings to save_mfdataset
user: aulemahal (20629530) · author_association: CONTRIBUTOR · repo: xarray (13221727)
created_at: 2021-09-08T21:24:13Z · updated_at: 2022-10-06T21:44:18Z
merge_commit_sha: d86b32087d7108dc866e34569653033973160827 · head: 23acbb84683f3dab9f593ee63a0323433b2b3638 · base: d1e4164f3961d7bbb3eb79037e96cae14f7182f8
url: https://github.com/pydata/xarray/pull/5781
body:
- [ ] Closes #xxxx
- [x] Tests added
- [x] Passes `pre-commit run --all-files`
- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
- [ ] New functions/methods are listed in `api.rst`

Simply adds an `encodings` argument to `save_mfdataset`. As with the other arguments, it expects a list of dictionaries with encoding information to be passed to `to_netcdf` for each dataset. I added a minimal test, simply to check that the argument is taken into account.
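
A hedged sketch of the proposed call: the `encodings` keyword is the argument this PR adds and is not part of released `xr.save_mfdataset`; the datasets, file names and encoding options below are illustrative.

```python
import numpy as np
import xarray as xr

# One encoding dict per dataset, forwarded to the corresponding to_netcdf call.
datasets = [xr.Dataset({"t": ("x", np.arange(5.0) + i)}) for i in range(2)]
paths = ["part_0.nc", "part_1.nc"]
encodings = [{"t": {"dtype": "float32", "zlib": True}}] * 2
xr.save_mfdataset(datasets, paths, encodings=encodings)  # proposed API, not yet released
```
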
id: 1673012286 · node_id: PR_kwDOAMm_X85juCQ- · number: 8603 · state: closed · locked: 0 · draft: 0
title: Convert 360_day calendars by choosing random dates to drop or add
user: aulemahal (20629530) · author_association: CONTRIBUTOR · repo: xarray (13221727)
created_at: 2024-01-10T19:13:31Z · updated_at: 2024-04-16T14:53:42Z · closed_at: 2024-04-16T14:53:42Z · merged_at: 2024-04-16T14:53:42Z
merge_commit_sha: 239309f881ba0d7e02280147bc443e6e286e6a63 · head: b581e1f700382207c3bd0fd03860f44f33b29b79 · base: b004af5174a4b0e32519df792a4f625d5548a9f0
url: https://github.com/pydata/xarray/pull/8603
body:
- [x] Tests added
- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`

Small PR to add a new "method" for converting to and from 360_day calendars. The current two methods (chosen with the `align_on` keyword) always remove or add the same days-of-year for all years of the same length. This new option randomly chooses the days, one in each fifth of the year (72-day periods). It emulates the method of the LOCA datasets (see the [web page](https://loca.ucsd.edu/loca-calendar/) and [article](https://journals.ametsoc.org/view/journals/hydr/15/6/jhm-d-14-0082_1.xml)). February 29th is always removed/added when the source/target is a leap year. I copied the implementation from xclim (which I wrote), [see code here](https://github.com/Ouranosinc/xclim/blob/fb29b8a8e400c7d8aaf4e1233a6b37a300126257/xclim/core/calendar.py#L112-L134).
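
A minimal sketch of the new option as merged, `align_on="random"` on `convert_calendar`; the data values are made up.

```python
import numpy as np
import xarray as xr

# Convert a 360_day time series to the "noleap" calendar; with align_on="random"
# the five days added/dropped per year are chosen at random (one in each 72-day
# period) instead of always falling on the same days-of-year.
times = xr.date_range("2000-01-01", periods=360, freq="D", calendar="360_day")
da = xr.DataArray(np.arange(360), dims=("time",), coords={"time": times})
converted = da.convert_calendar("noleap", align_on="random")
```
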

CREATE TABLE [pull_requests] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [state] TEXT,
   [locked] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [body] TEXT,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [merged_at] TEXT,
   [merge_commit_sha] TEXT,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [draft] INTEGER,
   [head] TEXT,
   [base] TEXT,
   [author_association] TEXT,
   [auto_merge] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [url] TEXT,
   [merged_by] INTEGER REFERENCES [users]([id])
);
CREATE INDEX [idx_pull_requests_merged_by]
    ON [pull_requests] ([merged_by]);
CREATE INDEX [idx_pull_requests_repo]
    ON [pull_requests] ([repo]);
CREATE INDEX [idx_pull_requests_milestone]
    ON [pull_requests] ([milestone]);
CREATE INDEX [idx_pull_requests_assignee]
    ON [pull_requests] ([assignee]);
CREATE INDEX [idx_pull_requests_user]
    ON [pull_requests] ([user]);
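
For reference, a hedged sketch of reproducing the filter shown above ("rows where user = 20629530") directly against the underlying SQLite database; the file name `github.db` is an assumption.

```python
import sqlite3

# Query the pull_requests table for one user's PRs, mirroring the page's filter.
con = sqlite3.connect("github.db")
rows = con.execute(
    "SELECT number, state, title, url FROM pull_requests WHERE user = ? ORDER BY id",
    (20629530,),
).fetchall()
for number, state, title, url in rows:
    print(number, state, title, url)
con.close()
```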