
issue_comments


8 rows where issue = 702646191 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
712066302 https://github.com/pydata/xarray/issues/4428#issuecomment-712066302 https://api.github.com/repos/pydata/xarray/issues/4428 MDEyOklzc3VlQ29tbWVudDcxMjA2NjMwMg== TomAugspurger 1312546 2020-10-19T11:08:13Z 2020-10-19T11:43:46Z MEMBER

Sorry, my comment in https://github.com/pydata/xarray/issues/4428#issuecomment-711034128 was incorrect in a couple of ways:

  1. We still do the splitting, even when slicing with an out-of-order indexer. I'm checking whether that's appropriate.
  2. I'm also looking into a logic bug when computing the number of chunks. I don't think we properly handle non-uniform chunking on the other axes.
{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Behaviour change in xarray.Dataset.sortby/sel between dask==2.25.0 and dask==2.26.0 702646191
711034128 https://github.com/pydata/xarray/issues/4428#issuecomment-711034128 https://api.github.com/repos/pydata/xarray/issues/4428 MDEyOklzc3VlQ29tbWVudDcxMTAzNDEyOA== TomAugspurger 1312546 2020-10-17T15:54:48Z 2020-10-17T15:54:48Z MEMBER

I assume that the indices [np.argsort(da.x.data)] are not going to be monotonically increasing, and that induces a different slicing pattern. The docs at https://docs.dask.org/en/latest/array-slicing.html#efficiency describe the case where the indices are sorted, but don't discuss the non-sorted case (yet).
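For concreteness, a small plain-NumPy illustration (using the x coordinate from the reproducer elsewhere in this thread) of why the argsort indexer is not monotonically increasing:

```python
import numpy as np

# The unsorted x coordinate from the reproducer in this thread
x = np.array([3, 4, 5, 6, 7, 9, 8, 0, 2, 1])

# sortby effectively indexes with argsort(x)
sorter = np.argsort(x)
print(sorter)                        # [7 9 8 0 1 2 3 4 6 5]
print(np.all(np.diff(sorter) > 0))  # False: not monotonically increasing
```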

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Behaviour change in xarray.Dataset.sortby/sel between dask==2.25.0 and dask==2.26.0 702646191
710683863 https://github.com/pydata/xarray/issues/4428#issuecomment-710683863 https://api.github.com/repos/pydata/xarray/issues/4428 MDEyOklzc3VlQ29tbWVudDcxMDY4Mzg2Mw== dcherian 2448579 2020-10-16T22:40:50Z 2020-10-16T22:40:50Z MEMBER

@TomAugspurger @jbusecke is seeing some funny behaviour in https://github.com/jbusecke/cmip6_preprocessing/issues/58

Here's a reproducer:

```python
import dask
import dask.array
import numpy as np
import xarray as xr

dask.config.set(
    **{
        "array.slicing.split_large_chunks": True,
        "array.chunk-size": "24 MiB",
    }
)

da = xr.DataArray(
    dask.array.random.random((10, 1000, 2000), chunks=(-1, -1, 200)),
    dims=["x", "y", "time"],
    coords={"x": [3, 4, 5, 6, 7, 9, 8, 0, 2, 1]},
)
da
```

Which is basically

```python
da.data[np.argsort(da.x.data), ...]
```

I don't understand why it's rechunking when we are indexing with a list along a dimension with a single chunk...

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Behaviour change in xarray.Dataset.sortby/sel between dask==2.25.0 and dask==2.26.0 702646191
709539887 https://github.com/pydata/xarray/issues/4428#issuecomment-709539887 https://api.github.com/repos/pydata/xarray/issues/4428 MDEyOklzc3VlQ29tbWVudDcwOTUzOTg4Nw== TomAugspurger 1312546 2020-10-15T19:20:53Z 2020-10-15T19:20:53Z MEMBER

Closing the loop here: with https://github.com/dask/dask/pull/6665 the behaviour of dask==2.25.0 should be restored (possibly with a warning about creating large chunks).

So this can probably be closed, though there may be parts of xarray that should be updated to avoid creating large chunks, or we could rely on the user to do that through the dask config system.
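As a hedged sketch of that config knob (assuming the `array.slicing.split_large_chunks` key referenced in the PR above; array shapes follow the reproducer elsewhere in this thread):

```python
import dask
import dask.array as darr
import numpy as np

x = darr.random.random((10000, 16, 4), chunks=(10000, 16, 4))
idx = np.repeat(np.arange(16), 64)  # upsample y from 16 to 1024

# Opt out of the automatic splitting to keep the dask==2.25.0 chunking
with dask.config.set(**{"array.slicing.split_large_chunks": False}):
    y = x[:, idx, :]

print(y.chunks[1])  # a single chunk of 1024 along the indexed axis
```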

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Behaviour change in xarray.Dataset.sortby/sel between dask==2.25.0 and dask==2.26.0 702646191
696475388 https://github.com/pydata/xarray/issues/4428#issuecomment-696475388 https://api.github.com/repos/pydata/xarray/issues/4428 MDEyOklzc3VlQ29tbWVudDY5NjQ3NTM4OA== ccarouge 8587080 2020-09-22T02:19:03Z 2020-09-22T02:19:03Z NONE

Hi. This change of behaviour broke an interpolation for me. The interpolation function does a sortby along the interpolated dimension, but you can't interpolate along a chunked dimension. I would argue the interpolation function needs to rechunk back to the original chunking after the sortby, or stop people from interpolating a dask array without assume_sorted=True.
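A minimal sketch of the "rechunk after the sortby" idea on toy data (the shapes and chunk sizes here are made up for illustration):

```python
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.arange(10.0),
    dims="x",
    coords={"x": [3, 1, 2, 0, 4, 6, 5, 9, 8, 7]},
).chunk({"x": 5})

# Sort, then restore the pre-sort chunk layout so downstream operations
# (e.g. interpolation) see the chunking the user originally constructed.
out = da.sortby("x").chunk({"x": 5})
print(out.chunks)  # ((5, 5),)
```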

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Behaviour change in xarray.Dataset.sortby/sel between dask==2.25.0 and dask==2.26.0 702646191
693552440 https://github.com/pydata/xarray/issues/4428#issuecomment-693552440 https://api.github.com/repos/pydata/xarray/issues/4428 MDEyOklzc3VlQ29tbWVudDY5MzU1MjQ0MA== JSKenyon 6582745 2020-09-16T17:31:54Z 2020-09-16T17:31:54Z NONE

Thanks! I will definitely give that a go when I am back at my work PC. My personal take is that this level of automated rechunking is dangerous. I have constructed the chunking in my code with great care and for a reason. Having it changed "invisibly" by operations which didn't have this behaviour previously seems problematic to me.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Behaviour change in xarray.Dataset.sortby/sel between dask==2.25.0 and dask==2.26.0 702646191
693475844 https://github.com/pydata/xarray/issues/4428#issuecomment-693475844 https://api.github.com/repos/pydata/xarray/issues/4428 MDEyOklzc3VlQ29tbWVudDY5MzQ3NTg0NA== dcherian 2448579 2020-09-16T15:17:44Z 2020-09-16T15:17:44Z MEMBER

This looks like a consequence of https://github.com/dask/dask/pull/6514. That change helps with cases like https://github.com/pydata/xarray/issues/4112

sortby is basically an isel indexing operation, so dask is automatically rechunking to make chunks with size < the default. You could fix this by setting an appropriate value for array.chunk-size, either temporarily or permanently:

```python
with dask.config.set({"array.chunk-size": "256MiB"}):  # or an appropriate value
    ...
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Behaviour change in xarray.Dataset.sortby/sel between dask==2.25.0 and dask==2.26.0 702646191
693385409 https://github.com/pydata/xarray/issues/4428#issuecomment-693385409 https://api.github.com/repos/pydata/xarray/issues/4428 MDEyOklzc3VlQ29tbWVudDY5MzM4NTQwOQ== JSKenyon 6582745 2020-09-16T12:54:39Z 2020-09-16T12:54:39Z NONE

Finally managed to reproduce. Here it is:

```python
import xarray
import dask.array as da
import numpy as np

if __name__ == "__main__":

    data = da.random.random([10000, 16, 4], chunks=(10000, 16, 4))

    dtype = np.float32

    xds = xarray.Dataset(
        data_vars={"DATA1": (("x", "y", "z"), data.astype(dtype))})

    upsample_factor = 1024 // xds.dims["y"]

    # Create a selection which will upsample the y axis.
    selection = np.repeat(np.arange(xds.dims["y"]), upsample_factor)

    print("xarray.Dataset prior to resampling:\n", xds)

    xds = xds.sel({"y": selection})

    print("xarray.Dataset post resampling:\n", xds)
```

With dask==2.25.0 this gives:

```
xarray.Dataset prior to resampling:
<xarray.Dataset>
Dimensions:  (x: 10000, y: 16, z: 4)
Dimensions without coordinates: x, y, z
Data variables:
    DATA1    (x, y, z) float32 dask.array<chunksize=(10000, 16, 4), meta=np.ndarray>
xarray.Dataset post resampling:
<xarray.Dataset>
Dimensions:  (x: 10000, y: 1024, z: 4)
Dimensions without coordinates: x, y, z
Data variables:
    DATA1    (x, y, z) float32 dask.array<chunksize=(10000, 1024, 4), meta=np.ndarray>
```

With dask==2.26.0 this gives:

```
xarray.Dataset prior to resampling:
<xarray.Dataset>
Dimensions:  (x: 10000, y: 16, z: 4)
Dimensions without coordinates: x, y, z
Data variables:
    DATA1    (x, y, z) float32 dask.array<chunksize=(10000, 16, 4), meta=np.ndarray>
xarray.Dataset post resampling:
<xarray.Dataset>
Dimensions:  (x: 10000, y: 1024, z: 4)
Dimensions without coordinates: x, y, z
Data variables:
    DATA1    (x, y, z) float32 dask.array<chunksize=(10000, 512, 4), meta=np.ndarray>
```

And finally, the most distressing part: changing the dtype changes the chunking! With dtype = np.complex64, dask==2.26.0 gives:

```
xarray.Dataset prior to resampling:
<xarray.Dataset>
Dimensions:  (x: 10000, y: 16, z: 4)
Dimensions without coordinates: x, y, z
Data variables:
    DATA1    (x, y, z) complex64 dask.array<chunksize=(10000, 16, 4), meta=np.ndarray>
xarray.Dataset post resampling:
<xarray.Dataset>
Dimensions:  (x: 10000, y: 1024, z: 4)
Dimensions without coordinates: x, y, z
Data variables:
    DATA1    (x, y, z) complex64 dask.array<chunksize=(10000, 342, 4), meta=np.ndarray>
```
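The dtype dependence follows from dask's bytes-based chunk-size limit (128 MiB by default): the output chunk is split into the smallest number of pieces that fit under the byte limit, and a complex64 element is twice the size of a float32 one. A back-of-the-envelope sketch of that arithmetic, not dask's actual implementation (`split_len` is a hypothetical helper):

```python
import math

LIMIT = 128 * 2**20  # dask's default "array.chunk-size" of 128 MiB, in bytes

def split_len(axis_len, other_elems, itemsize, limit=LIMIT):
    """Rough length of the split axis after capping chunks at `limit` bytes."""
    chunk_bytes = axis_len * other_elems * itemsize
    pieces = math.ceil(chunk_bytes / limit)  # pieces needed to fit the limit
    return math.ceil(axis_len / pieces)

# (10000, 1024, 4) result, split along y (len 1024); other axes: 10000 * 4 elements
print(split_len(1024, 10000 * 4, 4))  # float32 (4 bytes)  -> 512
print(split_len(1024, 10000 * 4, 8))  # complex64 (8 bytes) -> 342
```

These reproduce the observed chunk sizes: 156 MiB of float32 needs 2 pieces (1024 / 2 = 512), while 312 MiB of complex64 needs 3 (ceil(1024 / 3) = 342).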

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Behaviour change in xarray.Dataset.sortby/sel between dask==2.25.0 and dask==2.26.0 702646191

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 19.143ms · About: xarray-datasette