Comments on https://github.com/pydata/xarray/issues/4428 (most recent first):

---

**user 1312546 (MEMBER)**, 2020-10-19 ([comment](https://github.com/pydata/xarray/issues/4428#issuecomment-712066302), 2 👍):

Sorry, my comment in https://github.com/pydata/xarray/issues/4428#issuecomment-711034128 was incorrect in a couple of ways:

1. We still do the splitting, even when slicing with an out-of-order indexer. I'm checking whether that's appropriate.
2. I'm looking into a logic bug when computing the number of chunks. I don't think we properly handle non-uniform chunking on the other axes.

---

**user 1312546 (MEMBER)**, 2020-10-17 ([comment](https://github.com/pydata/xarray/issues/4428#issuecomment-711034128)):

I assume that the indices `[np.argsort(da.x.data)]` are not going to be monotonically increasing. That induces a different slicing pattern. The docs at https://docs.dask.org/en/latest/array-slicing.html#efficiency describe the case where the indices are sorted, but don't discuss the non-sorted case (yet).

---

**user 2448579 (MEMBER)**, 2020-10-16 ([comment](https://github.com/pydata/xarray/issues/4428#issuecomment-710683863), 1 👍):

@TomAugspurger, @jbusecke is seeing some funny behaviour in https://github.com/jbusecke/cmip6_preprocessing/issues/58. Here's a reproducer:

```python
import dask
import numpy as np
import xarray as xr

dask.config.set(
    **{
        "array.slicing.split_large_chunks": True,
        "array.chunk-size": "24 MiB",
    }
)

da = xr.DataArray(
    dask.array.random.random((10, 1000, 2000), chunks=(-1, -1, 200)),
    dims=["x", "y", "time"],
    coords={"x": [3, 4, 5, 6, 7, 9, 8, 0, 2, 1]},
)
da
```

![image](https://user-images.githubusercontent.com/2448579/96319766-d15a4b00-0fcd-11eb-9f9d-0f7116933367.png)
![image](https://user-images.githubusercontent.com/2448579/96319786-e0d99400-0fcd-11eb-9eaf-074e92ffc941.png)

Which is basically:

```python
da.data[np.argsort(da.x.data), ...]
```

![image](https://user-images.githubusercontent.com/2448579/96319876-141c2300-0fce-11eb-92ec-935645c6dffc.png)

I don't understand why it's rechunking when we are indexing with a list along a dimension with a single chunk...

---

**user 1312546 (MEMBER)**, 2020-10-15 ([comment](https://github.com/pydata/xarray/issues/4428#issuecomment-709539887)):

Closing the loop here: with https://github.com/dask/dask/pull/6665 the behavior of Dask 2.25.0 should be restored (possibly with a warning about creating large chunks). So this can probably be closed, though there *may* be parts of xarray that should be updated to avoid creating large chunks, or we could rely on the user to do that through the dask config system.

---

**user 8587080**, 2020-09-22 ([comment](https://github.com/pydata/xarray/issues/4428#issuecomment-696475388)):

Hi. This change of behaviour broke an interpolation for me. The interpolation function does a `sortby` along the interpolated dimension, but then you can't interpolate along a chunked dimension. I would argue the interpolation function needs to rechunk back to the original chunking after the `sortby`, or stop people from interpolating a dask array without `assume_sorted=True`.

---

**user 6582745**, 2020-09-16 ([comment](https://github.com/pydata/xarray/issues/4428#issuecomment-693552440)):

Thanks! I will definitely give that a go when I am back at my work PC. My personal take is that this level of automated rechunking is dangerous. I have constructed the chunking in my code with great care and for a reason. Having it changed "invisibly" by operations which didn't previously have this behaviour seems problematic to me.

---

**user 2448579 (MEMBER)**, 2020-09-16 ([comment](https://github.com/pydata/xarray/issues/4428#issuecomment-693475844)):

This looks like a consequence of https://github.com/dask/dask/pull/6514. That change helps with cases like https://github.com/pydata/xarray/issues/4112. `sortby` is basically an `isel` indexing operation, so dask is automatically rechunking to keep chunks below the default size. You could fix this by setting an appropriate value for `array.chunk-size`, either temporarily or permanently:

```python
with dask.config.set({"array.chunk-size": "256MiB"}):  # or an appropriate value
    ...
```

---

**user 6582745**, 2020-09-16 ([comment](https://github.com/pydata/xarray/issues/4428#issuecomment-693385409)):

Finally managed to reproduce. Here it is:

```python
import xarray
import dask.array as da
import numpy as np

if __name__ == "__main__":
    data = da.random.random([10000, 16, 4], chunks=(10000, 16, 4))
    dtype = np.float32

    xds = xarray.Dataset(
        data_vars={"DATA1": (("x", "y", "z"), data.astype(dtype))}
    )

    # Create a selection which will upsample the y axis.
    upsample_factor = 1024 // xds.dims["y"]
    selection = np.repeat(np.arange(xds.dims["y"]), upsample_factor)

    print("xarray.Dataset prior to resampling:\n", xds)
    xds = xds.sel({"y": selection})
    print("xarray.Dataset post resampling:\n", xds)
```

With `dask==2.25.0` this gives:

```
xarray.Dataset prior to resampling:
 Dimensions:  (x: 10000, y: 16, z: 4)
Dimensions without coordinates: x, y, z
Data variables:
    DATA1    (x, y, z) float32 dask.array
xarray.Dataset post resampling:
 Dimensions:  (x: 10000, y: 1024, z: 4)
Dimensions without coordinates: x, y, z
Data variables:
    DATA1    (x, y, z) float32 dask.array
```

With `dask==2.26.0` this gives:

```
xarray.Dataset prior to resampling:
 Dimensions:  (x: 10000, y: 16, z: 4)
Dimensions without coordinates: x, y, z
Data variables:
    DATA1    (x, y, z) float32 dask.array
xarray.Dataset post resampling:
 Dimensions:  (x: 10000, y: 1024, z: 4)
Dimensions without coordinates: x, y, z
Data variables:
    DATA1    (x, y, z) float32 dask.array
```

And finally, the most distressing part: changing the dtype changes the chunking! With `dtype = np.complex64`, `dask==2.26.0` gives:

```
xarray.Dataset prior to resampling:
 Dimensions:  (x: 10000, y: 16, z: 4)
Dimensions without coordinates: x, y, z
Data variables:
    DATA1    (x, y, z) complex64 dask.array
xarray.Dataset post resampling:
 Dimensions:  (x: 10000, y: 1024, z: 4)
Dimensions without coordinates: x, y, z
Data variables:
    DATA1    (x, y, z) complex64 dask.array
```
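The mechanism discussed throughout the thread — dask splitting the output chunks of a fancy-indexing operation based on the `array.slicing.split_large_chunks` and `array.chunk-size` config options — can be observed directly on a plain dask array, without xarray. A minimal sketch (the array shape, indexer, and chunk-size values here are illustrative, not taken from the issue; the exact resulting chunking varies by dask version):

```python
import dask
import dask.array as da
import numpy as np

# A single chunk along the axis we index, mirroring the reproducers above.
x = da.random.random((16, 1000), chunks=(16, 1000))

# An out-of-order (non-monotonic) integer indexer.
idx = np.arange(16)[::-1]

# With splitting disabled, the single input chunk is preserved.
with dask.config.set({"array.slicing.split_large_chunks": False}):
    kept = x[idx]

# With splitting enabled and a deliberately tiny target chunk size,
# dask splits the result into smaller chunks along the indexed axis.
with dask.config.set(
    {"array.slicing.split_large_chunks": True, "array.chunk-size": "16 kiB"}
):
    split = x[idx]

print("kept: ", kept.chunks[0])   # a single chunk of 16 rows
print("split:", split.chunks[0])  # several smaller chunks
```

This mirrors the workaround suggested in the thread: raising `array.chunk-size` (or setting `array.slicing.split_large_chunks` to `False`) restores the pre-2.26.0 behaviour of keeping the indexed axis in one chunk.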