issue_comments


15 rows where user = 8881170 sorted by updated_at descending


issue 8

  • apply_ufunc(dask='parallelized') with multiple outputs 4
  • Add map_blocks example to docs 3
  • Bottleneck and dask objects ignore `min_periods` on `rolling` 2
  • Add docstring example for xr.open_mfdataset 2
  • `where` function mis-broadcasts and alters data type on dataset 1
  • xr.DataArray.values fails with latest versions of netcdf4 1
  • Xarray operations produce read-only array 1
  • Add template xarray object kwarg to map_blocks 1

user 1

  • bradyrx · 15

author_association 1

  • CONTRIBUTOR 15
Columns: id · html_url · issue_url · node_id · user · created_at · updated_at ▲ · author_association · body · reactions · performed_via_github_app · issue
1194615618 https://github.com/pydata/xarray/pull/6825#issuecomment-1194615618 https://api.github.com/repos/pydata/xarray/issues/6825 IC_kwDOAMm_X85HNGNC bradyrx 8881170 2022-07-25T20:52:55Z 2022-07-25T20:52:55Z CONTRIBUTOR

Thanks @dcherian!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add docstring example for xr.open_mfdataset 1317320059
1194581162 https://github.com/pydata/xarray/pull/6825#issuecomment-1194581162 https://api.github.com/repos/pydata/xarray/issues/6825 IC_kwDOAMm_X85HM9yq bradyrx 8881170 2022-07-25T20:22:28Z 2022-07-25T20:22:28Z CONTRIBUTOR

Is there some #noqa equivalent to avoid testing the docstring example here? Or should I be pointing to a test dataset to open?
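The standard way to do this is doctest's `+SKIP` directive; a minimal sketch (the docstring content here is illustrative, not the actual PR text):

```python
def open_mfdataset(paths, **kwargs):
    """Open multiple files as a single dataset.

    Examples
    --------
    >>> import xarray as xr
    >>> ds = xr.open_mfdataset("my/files/*.nc")  # doctest: +SKIP
    """
    ...
```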

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add docstring example for xr.open_mfdataset 1317320059
791465015 https://github.com/pydata/xarray/issues/4922#issuecomment-791465015 https://api.github.com/repos/pydata/xarray/issues/4922 MDEyOklzc3VlQ29tbWVudDc5MTQ2NTAxNQ== bradyrx 8881170 2021-03-05T14:47:46Z 2021-03-05T14:47:46Z CONTRIBUTOR

> I feel like this should not work, i.e. rolling window length (6) < size along axis (3). So the bottleneck error seems right.

This is normally the case, but with `min_periods=1` it should just return the given value so long as there's at least one observation (as in case 2, where the boundary values come through unchanged and the middle value is smoothed).

Thanks for the pointer on #4977!
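For reference, the expected `min_periods=1` semantics can be seen with pandas, where a window longer than the series still yields a value wherever at least one observation is present; a minimal sketch:

```python
import pandas as pd

# Window of 6 on a length-3 series: no error, and every position has
# at least one observation, so every position gets a smoothed value.
s = pd.Series([1.0, 2.0, 3.0])
print(s.rolling(window=6, min_periods=1).mean())
# 0    1.0
# 1    1.5
# 2    2.0
# dtype: float64
```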

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Bottleneck and dask objects ignore `min_periods` on `rolling` 811321550
790986252 https://github.com/pydata/xarray/issues/4922#issuecomment-790986252 https://api.github.com/repos/pydata/xarray/issues/4922 MDEyOklzc3VlQ29tbWVudDc5MDk4NjI1Mg== bradyrx 8881170 2021-03-04T22:21:37Z 2021-03-04T22:32:01Z CONTRIBUTOR

@dcherian, to add to the complexity here, it's even weirder than originally reported. See my test cases below. This might alter how this bug is approached.

```python
import xarray as xr

def _rolling(ds):
    return ds.rolling(time=6, center=False, min_periods=1).mean()

# Length-3 array to test that min_periods kicks in, despite asking
# for 6 time steps of smoothing.
ds = xr.DataArray([1, 2, 3], dims='time')
ds['time'] = xr.cftime_range(start='2021-01-01', freq='D', periods=3)
```

1. With bottleneck installed, min_periods is ignored as a kwarg with in-memory arrays.

(bottleneck installed)

```python
# Just apply rolling to the base array.
ds.rolling(time=6, center=False, min_periods=1).mean()
# ValueError: Moving window (=6) must between 1 and 3, inclusive

# Group into single-day climatology groups and apply.
ds.groupby('time.dayofyear').map(_rolling)
# ValueError: Moving window (=6) must between 1 and 1, inclusive
```

2. With bottleneck uninstalled, min_periods works with in-memory arrays.

(bottleneck uninstalled)

```python
# Just apply rolling to the base array.
ds.rolling(time=6, center=False, min_periods=1).mean()
# <xarray.DataArray (time: 3)>
# array([1. , 1.5, 2. ])
# Coordinates:
#   * time     (time) object 2021-01-01 00:00:00 ... 2021-01-03 00:00:00

# Group into single-day climatology groups and apply.
ds.groupby('time.dayofyear').map(_rolling)
# <xarray.DataArray (time: 3)>
# array([1., 2., 3.])
# Coordinates:
#   * time     (time) object 2021-01-01 00:00:00 ... 2021-01-03 00:00:00
```

3. Regardless of bottleneck, dask objects ignore min_periods when going through a groupby object.

This specifically seems like an issue with `.map()`.

(independent of bottleneck installation)

```python
# Just apply rolling to the base array.
ds.chunk().rolling(time=6, center=False, min_periods=1).mean().compute()
# <xarray.DataArray (time: 3)>
# array([1. , 1.5, 2. ])
# Coordinates:
#   * time     (time) object 2021-01-01 00:00:00 ... 2021-01-03 00:00:00

# Group into single-day climatology groups and apply.
ds.chunk().groupby('time.dayofyear').map(_rolling)
# ValueError: For window size 6, every chunk should be larger than 3, but the
# smallest chunk size is 1. Rechunk your array with a larger chunk size or a
# chunk size that more evenly divides the shape of your array.
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Bottleneck and dask objects ignore `min_periods` on `rolling` 811321550
655142333 https://github.com/pydata/xarray/issues/3813#issuecomment-655142333 https://api.github.com/repos/pydata/xarray/issues/3813 MDEyOklzc3VlQ29tbWVudDY1NTE0MjMzMw== bradyrx 8881170 2020-07-07T21:22:30Z 2020-07-07T21:22:30Z CONTRIBUTOR

FYI, this is also seen with `xr.apply_ufunc`, but only when `vectorize=True`. It seems like the ndarrays' writeable flag is turned off when `vectorize=True`. This is also solved by `.copy()`, which is good practice anyway to avoid mutating the original ndarrays. Perhaps a `copy=bool` kwarg could be added to `apply_ufunc` to create copies of the input ndarrays? I'd be happy to lead that PR if it makes sense.

Example:

```python
import numpy as np
import xarray as xr

def match_nans(a, b):
    """Pairwise matching of nans between two time series."""
    # Try with and without the `.copy()` commands.
    # a = a.copy()
    # b = b.copy()
    if np.isnan(a).any() or np.isnan(b).any():
        idx = np.logical_or(np.isnan(a), np.isnan(b))
        a[idx], b[idx] = np.nan, np.nan
    return a, b

A = xr.DataArray(np.random.rand(10, 5), dims=['time', 'space'])
B = xr.DataArray(np.random.rand(10, 5), dims=['time', 'space'])
A[0, 1] = np.nan
B[5, 0] = np.nan

xr.apply_ufunc(
    match_nans,
    A,
    B,
    input_core_dims=[['time'], ['time']],
    output_core_dims=[['time'], ['time']],
    # Try with and without vectorize.
    vectorize=True,
)
```
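A minimal sketch of the copying idea as a user-side decorator (not an existing xarray API; `with_copies` is a hypothetical name):

```python
import functools

import numpy as np

def with_copies(func):
    """Copy every ndarray argument before calling func, so the wrapped
    ufunc never writes into read-only (or shared) input buffers."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        args = [np.array(a) if isinstance(a, np.ndarray) else a for a in args]
        return func(*args, **kwargs)
    return wrapper

# e.g. xr.apply_ufunc(with_copies(match_nans), A, B, ..., vectorize=True)
```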

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Xarray operations produce read-only array 573031381
628135082 https://github.com/pydata/xarray/issues/1815#issuecomment-628135082 https://api.github.com/repos/pydata/xarray/issues/1815 MDEyOklzc3VlQ29tbWVudDYyODEzNTA4Mg== bradyrx 8881170 2020-05-13T17:27:06Z 2020-05-13T17:27:06Z CONTRIBUTOR

> **So would you be re-doing the same computation by running `.compute()` separately on these objects?**
>
> Yes, but you can do `dask.compute(xarray_obj1, xarray_obj2, ...)` or combine those objects appropriately into a Dataset and then call compute on that.

Good call. I figured there was a workaround.
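For reference, a minimal sketch of that workaround; the objects here are illustrative, and `dask.compute` evaluates them in a single pass over their shared task graph:

```python
import dask
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(10.0), dims='x').chunk(5)
obj1 = da.mean()  # lazy
obj2 = da.std()   # lazy, built on the same chunks
mean, std = dask.compute(obj1, obj2)  # shared work is done once
```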

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc(dask='parallelized') with multiple outputs 287223508
628070696 https://github.com/pydata/xarray/issues/1815#issuecomment-628070696 https://api.github.com/repos/pydata/xarray/issues/1815 MDEyOklzc3VlQ29tbWVudDYyODA3MDY5Ng== bradyrx 8881170 2020-05-13T15:33:56Z 2020-05-13T15:33:56Z CONTRIBUTOR

One issue I see is that this would return multiple dask objects, correct? So to get the results from them, you'd have to run .compute() on each separately. I think it's a valid assumption to expect that the multiple output objects would share a lot of the same computational pipeline. So would you be re-doing the same computation by running .compute() separately on these objects?

The earlier-mentioned code snippets provide a nice path forward, since you can just run compute on one object and then split its `result` (or however you name it) dimension into multiple individual objects (see the sketch below). Thoughts?
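A minimal sketch of that pack-then-split pattern, assuming a toy function that stacks two reductions along a new `result` dimension (in recent xarray versions `output_sizes` lives inside `dask_gufunc_kwargs`):

```python
import numpy as np
import xarray as xr

def min_and_max(x):
    # Pack two "outputs" along a new trailing axis.
    return np.stack([x.min(axis=-1), x.max(axis=-1)], axis=-1)

da = xr.DataArray(np.random.rand(4, 10), dims=['space', 'time']).chunk({'space': 2})
packed = xr.apply_ufunc(
    min_and_max,
    da,
    input_core_dims=[['time']],
    output_core_dims=[['result']],
    dask='parallelized',
    output_dtypes=[float],
    dask_gufunc_kwargs={'output_sizes': {'result': 2}},
)
packed = packed.compute()  # one computation for both outputs
lo, hi = packed.isel(result=0), packed.isel(result=1)
```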

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc(dask='parallelized') with multiple outputs 287223508
624158963 https://github.com/pydata/xarray/pull/3816#issuecomment-624158963 https://api.github.com/repos/pydata/xarray/issues/3816 MDEyOklzc3VlQ29tbWVudDYyNDE1ODk2Mw== bradyrx 8881170 2020-05-05T16:28:26Z 2020-05-05T16:28:26Z CONTRIBUTOR

I missed this originally @dcherian, but thanks for the great work here. The docs changes are a great help.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add template xarray object kwarg to map_blocks 573768194
614244205 https://github.com/pydata/xarray/issues/1815#issuecomment-614244205 https://api.github.com/repos/pydata/xarray/issues/1815 MDEyOklzc3VlQ29tbWVudDYxNDI0NDIwNQ== bradyrx 8881170 2020-04-15T19:45:50Z 2020-04-15T19:45:50Z CONTRIBUTOR

I think ideally it would be nice to return multiple DataArrays or a Dataset of variables. But I'm really happy with this solution. I'm using it on a 600GB dataset of particle trajectories and was able to write a ufunc to go through and return each particle's x, y, z location when it met a certain condition.

I think having something simple like the stackoverflow snippet I posted would be great for the docs as an apply_ufunc example. I'd be happy to lead this if folks think it's a good idea.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc(dask='parallelized') with multiple outputs 287223508
614216243 https://github.com/pydata/xarray/issues/1815#issuecomment-614216243 https://api.github.com/repos/pydata/xarray/issues/1815 MDEyOklzc3VlQ29tbWVudDYxNDIxNjI0Mw== bradyrx 8881170 2020-04-15T18:49:51Z 2020-04-15T18:49:51Z CONTRIBUTOR

This looks essentially the same as @stefraynaud's answer, but I came across this stackoverflow response: https://stackoverflow.com/questions/52094320/with-xarray-how-to-parallelize-1d-operations-on-a-multidimensional-dataset.

@andersy005, I imagine you're far past this now, and this might have been related to discussions with Genevieve and me anyway.

```python
import numpy as np
import xarray as xr
from scipy import stats

def new_linregress(x, y):
    # Wrapper around scipy.stats.linregress to use in apply_ufunc.
    slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)
    return np.array([slope, intercept, r_value, p_value, std_err])

# Returns a new DataArray with a "parameter" dimension of length 5.
result = xr.apply_ufunc(
    new_linregress,
    ds[x],
    ds[y],
    input_core_dims=[['year'], ['year']],
    output_core_dims=[['parameter']],
    vectorize=True,
    dask='parallelized',
    output_dtypes=['float64'],
    output_sizes={'parameter': 5},
)
```

{
    "total_count": 3,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 3,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc(dask='parallelized') with multiple outputs 287223508
573107748 https://github.com/pydata/xarray/pull/3667#issuecomment-573107748 https://api.github.com/repos/pydata/xarray/issues/3667 MDEyOklzc3VlQ29tbWVudDU3MzEwNzc0OA== bradyrx 8881170 2020-01-10T16:32:47Z 2020-01-10T16:32:47Z CONTRIBUTOR

Thanks @dcherian -- done in https://github.com/pydata/xarray/pull/3682.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add map_blocks example to docs 546451185
572688941 https://github.com/pydata/xarray/pull/3667#issuecomment-572688941 https://api.github.com/repos/pydata/xarray/issues/3667 MDEyOklzc3VlQ29tbWVudDU3MjY4ODk0MQ== bradyrx 8881170 2020-01-09T18:23:14Z 2020-01-09T18:23:14Z CONTRIBUTOR

Oops, forgot to add to whats-new, but this is a pretty minor addition.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add map_blocks example to docs 546451185
572137657 https://github.com/pydata/xarray/pull/3667#issuecomment-572137657 https://api.github.com/repos/pydata/xarray/issues/3667 MDEyOklzc3VlQ29tbWVudDU3MjEzNzY1Nw== bradyrx 8881170 2020-01-08T16:04:54Z 2020-01-08T16:04:54Z CONTRIBUTOR

What's going on here? I use Travis on my repos, so I'm not familiar with the Azure setup. I only modified a docstring, so I'm not sure why it would break the testing suite, unless it's testing my code snippet in the docs?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add map_blocks example to docs 546451185
561261583 https://github.com/pydata/xarray/issues/3580#issuecomment-561261583 https://api.github.com/repos/pydata/xarray/issues/3580 MDEyOklzc3VlQ29tbWVudDU2MTI2MTU4Mw== bradyrx 8881170 2019-12-03T17:02:39Z 2019-12-03T17:02:39Z CONTRIBUTOR

I can't seem to replicate this issue for some reason. I have the same versions of xarray, numpy, and netCDF4 installed.

```python-traceback
IndexError: The indexing operation you are attempting to perform is not valid on netCDF4.Variable object. Try loading your data into memory first by calling .load().
```

This implies that it's having issues slicing numpy-style with a dask array. I bet if you load it into memory and slice that way it'll work. But at ~22GB you might not be able to do that.

The preferred way to slice in xarray is to use `.sel()` and `.isel()` to leverage the label-aware nature of xarray. So you should have no problem doing this operation explicitly with the following:

`fullda['sst'].isel(M=0, S=0, X=0, Y=0)`. You of course don't need to slice the `L` dimension since you are taking the full thing, but the equivalent notation there is `fullda['sst'].isel(L=slice(0, None))`. A short sketch follows below.
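A minimal sketch of that suggestion, reusing `fullda` and the dimension names from the report; the file path is hypothetical:

```python
import xarray as xr

# Hypothetical path standing in for the ~22GB dataset from the report.
fullda = xr.open_dataset('path/to/forecasts.nc')

# Label-aware positional slicing down to one series along L, then load
# only that slice into memory instead of the whole array.
series = fullda['sst'].isel(M=0, S=0, X=0, Y=0).load()
print(series.values)
```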

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xr.DataArray.values fails with latest versions of netcdf4 529644880
494059784 https://github.com/pydata/xarray/issues/2969#issuecomment-494059784 https://api.github.com/repos/pydata/xarray/issues/2969 MDEyOklzc3VlQ29tbWVudDQ5NDA1OTc4NA== bradyrx 8881170 2019-05-20T16:30:02Z 2019-05-20T16:30:02Z CONTRIBUTOR

Thanks for the feedback and link to the other issue. I wasn't sure what to search to find other issues on this. The coordinate transformation seems like the most straightforward approach.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  `where` function mis-broadcasts and alters data type on dataset 445175953

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);