id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
887711474,MDU6SXNzdWU4ODc3MTE0NzQ=,5290,Inconclusive error messages using to_zarr with regions,5802846,closed,0,,,4,2021-05-11T15:54:39Z,2023-11-05T06:28:39Z,2023-11-05T06:28:39Z,CONTRIBUTOR,,,,"
**What happened**:
The idea is to use an xarray dataset (stored as a dummy zarr file), which is subsequently filled with the `region` argument, as explained in the documentation. Ideally, almost nothing is stored to disk upfront.
It seems the current implementation is only designed to either store coordinates for the whole dataset and write them to disk or to write without coordinates. I failed to understand this from the documentation and tried to create a dataset without coordinates and fill it with a dataset subset with coordinates. It gave some inconclusive errors depending on the actual code example (see below).
`ValueError: parameter 'value': expected array with shape (0,), got (10,)` or `ValueError: conflicting sizes for dimension 'x': length 10 on 'x' and length 30 on 'foo'`
It might also be a bug, and it should in fact be possible to write a dataset with coordinates into a dummy dataset without coordinates. In that case, there seems to be an issue with how the variables are handled while storing the region.
... or I might just have done it wrong... and I'm looking forward to suggestions.
**What you expected to happen**:
Either an error message telling me that I should use coordinates when creating the dummy dataset or, if this is a bug and it should be possible, it should just work.
**Minimal Complete Verifiable Example**:
```python
import dask.array
import xarray as xr
import numpy as np
error = 1 # choose between 0 (no error), 1, 2, 3
dummies = dask.array.zeros(30, chunks=10)
# chunks in coords are not taken into account while saving!?
coord_x = dask.array.zeros(30, chunks=10) # or coord_x = np.zeros((30,))
if error == 0:
    ds = xr.Dataset({""foo"": (""x"", dummies)}, coords={""x"": coord_x})
else:
    ds = xr.Dataset({""foo"": (""x"", dummies)})
print(ds)
path = ""./tmp/test.zarr""
ds.to_zarr(path, mode='w', compute=False, consolidated=True)
# create a new dataset to be input into a region
ds = xr.Dataset({""foo"": ('x', np.arange(10))},coords={""x"":np.arange(10)})
if error == 1:
    ds.to_zarr(path, region={""x"": slice(10, 20)})
    # ValueError: parameter 'value': expected array with shape (0,), got (10,)
elif error == 2:
    ds.to_zarr(path, region={""x"": slice(0, 10)})
    ds.to_zarr(path, region={""x"": slice(10, 20)})
    # ValueError: conflicting sizes for dimension 'x': length 10 on 'x' and length 30 on 'foo'
elif error == 3:
    ds.to_zarr(path, region={""x"": slice(0, 10)})
    ds = xr.Dataset({""foo"": ('x', np.arange(10))}, coords={""x"": np.arange(10)})
    ds.to_zarr(path, region={""x"": slice(10, 20)})
    # ValueError: parameter 'value': expected array with shape (0,), got (10,)
else:
    ds.to_zarr(path, region={""x"": slice(10, 20)})
ds = xr.open_zarr(path)
print('reopen',ds['x'])
```
**Anything else we need to know?**:
**Environment**:
Output of xr.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.6 | packaged by conda-forge | (default, Oct 7 2020, 19:08:05)
[GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 4.19.0-16-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: en_US.UTF-8
libhdf5: None
libnetcdf: None
xarray: 0.18.0
pandas: 1.2.3
numpy: 1.19.2
scipy: 1.6.2
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.8.1
cftime: 1.4.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2021.04.0
distributed: None
matplotlib: 3.4.1
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 49.6.0.post20210108
pip: 21.0.1
conda: None
pytest: None
IPython: None
sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5290/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
535686852,MDExOlB1bGxSZXF1ZXN0MzUxMzU0OTI4,3607,Strided rolling,5802846,open,0,,,3,2019-12-10T12:05:54Z,2023-03-09T20:37:56Z,,CONTRIBUTOR,,0,pydata/xarray/pulls/3607,"This PR adds a `stride` parameter to the `rolling` function of DataArray and Dataset. It extends the stride functionality already available in `core.rolling.DataArrayRolling.construct` and `core.rolling.DatasetRolling.construct` to the other methods of `DataArrayRolling` and `DatasetRolling`.
Note: it makes the arguments of `DataArrayRolling` and `DatasetRolling` inconsistent with the respective `rolling` arguments of pandas Series and DataFrame, which do not support stride.
Moreover, it does not solve the issue addressed in [this pandas issue](https://github.com/pandas-dev/pandas/issues/15354) (Efficient stride computation).
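For context, the existing API can already express a strided rolling reduction in two roundabout ways. The following is a minimal sketch using only `rolling`, `construct` and `isel`; the array, window size and stride are made-up illustration values, and it does not show the interface proposed in this PR.

```python
import numpy as np
import xarray as xr

# Illustrative data only; window and stride are arbitrary choices.
da = xr.DataArray(np.arange(10, dtype=float), dims='x', coords={'x': np.arange(10)})
window, stride = 3, 2

# Option A: compute the full rolling mean, then keep every stride-th result.
full_then_stride = da.rolling(x=window).mean().isel(x=slice(None, None, stride))

# Option B: build the strided windows with construct, then reduce over the window dim.
# skipna=False keeps NaN for the incomplete leading windows, matching option A.
via_construct = (
    da.rolling(x=window)
    .construct('window', stride=stride)
    .mean('window', skipna=False)
)
```

Both options still evaluate windows on the full array first, which is the inefficiency mentioned above.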
- [x] Tests added
- [x] Passes `black . && mypy . && flake8`
- [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3607/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
976207971,MDU6SXNzdWU5NzYyMDc5NzE=,5727,Setting item with loc and boolean mask fails,5802846,closed,0,,,3,2021-08-21T19:41:56Z,2022-03-17T17:11:43Z,2022-03-17T17:11:43Z,CONTRIBUTOR,,,,"**What happened**:
When setting a DataArray with `loc`, [xarray converts bool masks to the coord data type](https://github.com/pydata/xarray/blob/48a9dbe7d8dc2361bc985dd9fb1193a26135b310/xarray/core/indexing.py#L78). Therefore `loc` does not use a boolean mask but tries to match the indexes of the coord.
**Minimal Complete Verifiable Example**:
```python
import numpy as np
import xarray as xr
x = np.arange(10).astype(np.float64)
fx = np.arange(10).astype(np.float64)
da = xr.DataArray(fx,dims=['x'],coords={'x':x})
mask = np.zeros((10,))
mask[1::2] = 1
mask = mask.astype(bool)
da.loc[{'x':~mask}] = np.arange(5)+10
```
```
Traceback (most recent call last):
File ""<stdin>"", line 1, in <module>
File ""/home/matthmey/miniconda3/envs/analytics/lib/python3.7/site-packages/xarray/core/dataarray.py"", line 214, in __setitem__
self.data_array[pos_indexers] = value
File ""/home/matthmey/miniconda3/envs/analytics/lib/python3.7/site-packages/xarray/core/dataarray.py"", line 767, in __setitem__
self.variable[key] = value
File ""/home/matthmey/miniconda3/envs/analytics/lib/python3.7/site-packages/xarray/core/variable.py"", line 854, in __setitem__
indexable[index_tuple] = value
File ""/home/matthmey/miniconda3/envs/analytics/lib/python3.7/site-packages/xarray/core/indexing.py"", line 1164, in __setitem__
array[key] = value
ValueError: shape mismatch: value array of shape (5,) could not be broadcast to indexing result of shape (10,)
```
**Anything else we need to know?**:
Could be fixed by replacing [the line](https://github.com/pydata/xarray/blob/48a9dbe7d8dc2361bc985dd9fb1193a26135b310/xarray/core/indexing.py#L78), but maybe this is not the cleanest solution. I tried fixing it with the following, which works for the above code but fails for other cases.
```
if label.dtype != np.bool_:
    label = maybe_cast_to_coords_dtype(label, coord.dtype)
```
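A possible workaround in the meantime is to index by position instead of by label, so the mask is never cast to the coordinate dtype. This is a minimal sketch, assuming that `DataArray.__setitem__` with a dict key treats a boolean array positionally; I have not checked it against every xarray version.

```python
import numpy as np
import xarray as xr

x = np.arange(10).astype(np.float64)
da = xr.DataArray(x.copy(), dims=['x'], coords={'x': x})

mask = np.zeros(10, dtype=bool)
mask[1::2] = True

# Positional indexing (no .loc): the boolean mask selects the five
# even positions, so the five new values fit without broadcasting.
da[{'x': ~mask}] = np.arange(5) + 10
```

This only helps when the mask lines up with the positions along `x`; it does not fix the label-based lookup in `loc` itself.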
**Environment**:
Output of xr.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.0 (default, Oct 9 2018, 10:31:47)
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-80-lowlatency
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None
xarray: 0.19.0
pandas: 1.0.1
numpy: 1.18.1
scipy: 1.4.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.5.0
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.14.0
distributed: None
matplotlib: 3.1.3
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 46.1.3.post20200330
pip: 20.0.2
conda: None
pytest: None
IPython: 7.13.0
sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5727/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
535703663,MDU6SXNzdWU1MzU3MDM2NjM=,3608,Feature Request: Efficient rolling with strides,5802846,open,0,,,8,2019-12-10T12:38:59Z,2021-07-28T11:58:28Z,,CONTRIBUTOR,,,,"Xarray faces the same issue in its current `rolling` implementation (`DataArrayRolling` and `DatasetRolling`) as described in [this pandas issue](https://github.com/pandas-dev/pandas/issues/15354). Namely, the `stride` parameter of the `construct` methods is applied after the rolling is computed. Technically, we compute more than we need to, because part of the result is thrown away by the striding.
In PR #3607 the issue is solved for the `...Rolling`'s `__iter__` function but not for the `construct`, `reduce` and `_bottleneck_reduce` methods.
Since the way Xarray's rolling is implemented relies on numpy, we could introduce a sliding window function as described [here](https://github.com/numpy/numpy/issues/7753#).
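To illustrate the idea on plain numpy: the sketch below uses `numpy.lib.stride_tricks.sliding_window_view` (available in NumPy 1.20 and later) and selects the strided windows before reducing. The helper name `strided_rolling_mean` is hypothetical, and it only returns complete windows, without the NaN padding that xarray's rolling adds.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def strided_rolling_mean(values, window, stride):
    # Hypothetical helper: mean over every stride-th complete window.
    # The view has shape (n - window + 1, window) and copies no data.
    windows = sliding_window_view(values, window)
    # Slicing the view picks the strided windows before any reduction,
    # so only the windows that are kept get reduced.
    return windows[::stride].mean(axis=-1)


values = np.arange(10, dtype=float)
result = strided_rolling_mean(values, window=3, stride=2)
```

Here the windows start at positions 0, 2, 4 and 6, so only four means are computed instead of eight.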
Any opinions?
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3608/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
536214141,MDExOlB1bGxSZXF1ZXN0MzUxNzg0Nzk5,3610,Fix zarr append with groups,5802846,closed,0,,,12,2019-12-11T08:24:44Z,2020-03-02T12:19:17Z,2020-03-02T12:19:17Z,CONTRIBUTOR,,0,pydata/xarray/pulls/3610,"Fixes the issue that `xarray.core.dataset.Dataset.to_zarr` produced errors when using `append_dim` and `group` simultaneously. The issue was that during appending, the zarr store was opened without the group information.
- [x] Closes #3170
- [x] Tests added
- [x] Passes `black . && mypy . && flake8`
- [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3610/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull