id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
490037439,MDU6SXNzdWU0OTAwMzc0Mzk=,3282,Silent value assignment failure in open_zarr Dataset due to hidden mode='r',6475152,open,0,,,0,2019-09-05T22:23:46Z,2020-03-29T10:33:20Z,,NONE,,,,"Hello Xarray devs,

Thanks for your work on this fantastic package. I'm a new user, and the subtleties of different data stores are unfamiliar to me. I got tripped up by the fact that Zarr stores are (silently) read-only, and I think it would be helpful if this were more prominent in the docstring or [zarr section](https://xarray.pydata.org/en/stable/io.html#zarr) of the docs.

When I try to assign values to parts of a local Zarr-backed Dataset, I get a silent failure:

```python
In [142]: ds = xr.open_zarr('tmp.zarr', chunks=None)

In [143]: selector = dict(time='2014-06-06T01:00:00', azimuth=0, frequency=0.0)

In [144]: ds['counts'].loc[selector].values
Out[144]: array(4294967295, dtype=uint32)

# try to assign a value here, like the example in the docs:
# In [55]: ds['empty'].loc[dict(lon=260, lat=30)] = 100
In [145]: ds['counts'].loc[selector].values = 0

# just get the same value back
In [146]: ds['counts'].loc[selector].values
Out[146]: array(4294967295, dtype=uint32)
```

The answer seems to be buried in the `open_zarr` source code:

```python
...
    # Zarr supports a wide range of access modes, but for now xarray either
    # reads or writes from a store, never both. For open_zarr, we only read
    mode = 'r'
    zarr_store = ZarrStore.open_group(store, mode=mode, synchronizer=synchronizer, group=group, consolidated=consolidated)
...
```

#### Expected Output

Assignment that follows the examples [in the documentation](https://xarray.pydata.org/en/stable/indexing.html#assigning-values-with-indexing).

1. I think that mentioning `mode='r'` in the `open_zarr` docstring would be the most helpful.
2. A description of this and the reasoning behind why zarr datasets are read-only would be helpful in the [zarr section](https://xarray.pydata.org/en/stable/io.html#zarr) of the docs.
3. Optionally, a note in the [indexing and assignment](https://xarray.pydata.org/en/stable/indexing.html#assigning-values-with-indexing) that not all store backends support assignment would also be helpful.

I'm happy to make a PR on 1 & 3, but I'm not familiar with the reasoning behind why stores are never mixed-mode.

Thanks again!

#### Output of ``xr.show_versions()``
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.3 | packaged by conda-forge | (default, Mar 27 2019, 15:43:19) [Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: None
libnetcdf: None
xarray: 0.12.3
pandas: 0.24.2
numpy: 1.16.3
scipy: 1.3.0
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.3.1
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 1.2.2
distributed: 1.28.1
matplotlib: 3.1.0
cartopy: None
seaborn: None
numbagg: None
setuptools: 41.0.1
pip: 19.1
conda: None
pytest: None
IPython: 7.5.0
sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3282/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
496460488,MDU6SXNzdWU0OTY0NjA0ODg=,3326,quantile with Dask arrays,6475152,closed,0,,,0,2019-09-20T17:14:59Z,2019-11-25T15:57:49Z,2019-11-25T15:57:49Z,NONE,,,,"Currently the `quantile` method [raises an exception](https://github.com/pydata/xarray/blob/master/xarray/core/variable.py#L1637) when it encounters a Dask array.

```python
if isinstance(self.data, dask_array_type):
    raise TypeError(
        ""quantile does not work for arrays stored as dask ""
        ""arrays. Load the data via .compute() or .load() ""
        ""prior to calling this method.""
    )
```

I think it's because taking a quantile needs to see all the data in the dimension it's quantile-ing, or blocked/approximate methods weren't on hand when the feature was added. Dask arrays where the dimension being quantile-ed was exactly one chunk in extent seem like a special case where no blocked algorithm is needed.

The problem with following the suggestion of the exception (loading the array into memory) is that ""wide and shallow"" arrays are too big to load into memory, yet each chunk is statistically independent if the quantile dimension is the ""shallow"" dimension.

I'm not necessarily proposing delegating to Dask's quantile (unless it's super easy), but wanted to explore this special case described above.

Related links:

* https://github.com/pydata/xarray/issues/2999
* https://stackoverflow.com/a/47103407/745557

Thank you!

EDIT: added stackoverflow link","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3326/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue