id,node_id,number,state,locked,title,user,body,created_at,updated_at,closed_at,merged_at,merge_commit_sha,assignee,milestone,draft,head,base,author_association,auto_merge,repo,url,merged_by
39752514,MDExOlB1bGxSZXF1ZXN0Mzk3NTI1MTQ=,468,closed,0,Option for closing files with scipy backend,1197350,"This addresses issue #463, in which open_mfdataset failed when trying to open a list of files longer than my system's ulimit. I tried to find a solution in which the underlying netcdf file objects are kept closed by default and only reopened ""when needed"".

I ended up subclassing scipy.io.netcdf_file and overwriting the variable attribute with a property which first checks whether the file is open or closed and opens it if needed. That was the easy part. The hard part was figuring out when to close them. The problem is that a couple of different parts of the code (e.g. each individual variable and also the datastore object itself) keep references to the netcdf_file object. In the end I used the debugger to find out when during initialization the variables were actually being read and added some calls to close() in various different places. It is relatively easy to close the files up at the end of the initialization, but it was much harder to make sure that the whole array of files is never open at the same time. I also had to disable mmap when this option is active.

This solution is messy and, moreover, extremely slow. There is a factor of ~100 performance penalty during initialization for reopening and closing the files all the time (but only a factor of 10 for the actual calculation). I am sure this could be reduced if someone who understands the code better found some judicious points at which to call close() on the netcdf_file. The loss of mmap also sucks.

This option can be accessed with the close_files key word, which I added to api.

Timing for loading and doing a calculation with close_files=True:

``` python
count_open_files()
%time mfds =  xray.open_mfdataset(ddir + '/dt_global_allsat_msla_uv_2014101*.nc', engine='scipy', close_files=True)
count_open_files()
%time print float(mfds.variables['u'].mean())
count_open_files()
```

output:

```
3 open files
CPU times: user 11.1 s, sys: 17.5 s, total: 28.5 s
Wall time: 27.7 s
2 open files
0.0055650632367
CPU times: user 649 ms, sys: 974 ms, total: 1.62 s
Wall time: 633 ms
2 open files
```

Timing for loading and doing a calculation with close_files=False (default, should revert to old behavior):

``` python
count_open_files()
%time mfds =  xray.open_mfdataset(ddir + '/dt_global_allsat_msla_uv_2014101*.nc', engine='scipy', close_files=False)
count_open_files()
%time print float(mfds.variables['u'].mean())
count_open_files()
```

```
3 open files
CPU times: user 264 ms, sys: 85.3 ms, total: 349 ms
Wall time: 291 ms
22 open files
0.0055650632367
CPU times: user 174 ms, sys: 141 ms, total: 315 ms
Wall time: 56 ms
22 open files
```

This is not a very serious pull request, but I spent all day on it, so I thought I would share. Maybe you can see some obvious way to improve it...
",2015-07-11T21:24:24Z,2015-08-10T12:50:45Z,2015-08-09T00:04:12Z,,fe363c15d6c4f23d664d8729a54c9c2ce5a4e918,,,0,200aeb006781528cf6d4ca2f118d7f9257bd191b,200aeb006781528cf6d4ca2f118d7f9257bd191b,MEMBER,,13221727,https://github.com/pydata/xarray/pull/468,
41962443,MDExOlB1bGxSZXF1ZXN0NDE5NjI0NDM=,522,closed,0,Fix datetime decoding when time units are 'days since 0000-01-01 00:00:00',1197350,"This fixes #521 using the workaround described in Unidata/netcdf4-python#442.
",2015-08-08T23:26:07Z,2015-08-09T00:10:18Z,2015-08-09T00:06:49Z,,cd4a3c221516dbafbff9ccf7586913a2e1aeaefd,,,0,54f63df0b25a2f0df10885e390d7b8f05320f33e,200aeb006781528cf6d4ca2f118d7f9257bd191b,MEMBER,,13221727,https://github.com/pydata/xarray/pull/522,
41962908,MDExOlB1bGxSZXF1ZXN0NDE5NjI5MDg=,523,closed,0,Fix datetime decoding when time units are 'days since 0000-01-01 00:00:00',1197350,"This fixes #521 using the workaround described in Unidata/netcdf4-python#442.
",2015-08-09T00:12:00Z,2015-08-14T17:22:02Z,2015-08-14T17:22:02Z,,368c623812a58d1e09ec121d531e3e076391fcbd,,,0,653b8641787aa008c5c901f2967392b7894b207d,200aeb006781528cf6d4ca2f118d7f9257bd191b,MEMBER,,13221727,https://github.com/pydata/xarray/pull/523,
42016208,MDExOlB1bGxSZXF1ZXN0NDIwMTYyMDg=,524,closed,0,Option for closing files with scipy backend,1197350,"This is the same as #468, which was accidentally closed. I just copied and pasted my comment below

This addresses issue #463, in which open_mfdataset failed when trying to open a list of files longer than my system's ulimit. I tried to find a solution in which the underlying netcdf file objects are kept closed by default and only reopened ""when needed"".

I ended up subclassing scipy.io.netcdf_file and overwriting the variable attribute with a property which first checks whether the file is open or closed and opens it if needed. That was the easy part. The hard part was figuring out when to close them. The problem is that a couple of different parts of the code (e.g. each individual variable and also the datastore object itself) keep references to the netcdf_file object. In the end I used the debugger to find out when during initialization the variables were actually being read and added some calls to close() in various different places. It is relatively easy to close the files up at the end of the initialization, but it was much harder to make sure that the whole array of files is never open at the same time. I also had to disable mmap when this option is active.

This solution is messy and, moreover, extremely slow. There is a factor of ~100 performance penalty during initialization for reopening and closing the files all the time (but only a factor of 10 for the actual calculation). I am sure this could be reduced if someone who understands the code better found some judicious points at which to call close() on the netcdf_file. The loss of mmap also sucks.

This option can be accessed with the close_files key word, which I added to api.

Timing for loading and doing a calculation with close_files=True:

``` python
count_open_files()
%time mfds =  xray.open_mfdataset(ddir + '/dt_global_allsat_msla_uv_2014101*.nc', engine='scipy', close_files=True)
count_open_files()
%time print float(mfds.variables['u'].mean())
count_open_files()
```

output:

```
3 open files
CPU times: user 11.1 s, sys: 17.5 s, total: 28.5 s
Wall time: 27.7 s
2 open files
0.0055650632367
CPU times: user 649 ms, sys: 974 ms, total: 1.62 s
Wall time: 633 ms
2 open files
```

Timing for loading and doing a calculation with close_files=False (default, should revert to old behavior):

``` python
count_open_files()
%time mfds =  xray.open_mfdataset(ddir + '/dt_global_allsat_msla_uv_2014101*.nc', engine='scipy', close_files=False)
count_open_files()
%time print float(mfds.variables['u'].mean())
count_open_files()
```

```
3 open files
CPU times: user 264 ms, sys: 85.3 ms, total: 349 ms
Wall time: 291 ms
22 open files
0.0055650632367
CPU times: user 174 ms, sys: 141 ms, total: 315 ms
Wall time: 56 ms
22 open files
```

This is not a very serious pull request, but I spent all day on it, so I thought I would share. Maybe you can see some obvious way to improve it...
",2015-08-10T12:49:23Z,2016-06-24T17:45:07Z,2016-06-24T17:45:07Z,,,,,0,0145d62c12808c636d1a578fc82b9098f8b78d29,200aeb006781528cf6d4ca2f118d7f9257bd191b,MEMBER,,13221727,https://github.com/pydata/xarray/pull/524,
42731548,MDExOlB1bGxSZXF1ZXN0NDI3MzE1NDg=,538,closed,0,Fix contour color,1197350,"This fixes #537 by adding a check for the presence of the colors kwarg.
",2015-08-18T18:24:36Z,2015-09-01T17:48:12Z,2015-09-01T17:20:56Z,2015-09-01T17:20:55Z,ddf9177fad8cf94bc2dfe908892fa59589479056,,,0,ee4fab78e2e54042ea1adb1c7bc7c849f9ec0c76,d2e9bccf65241e0d962902605e91db96bc2c768d,MEMBER,,13221727,https://github.com/pydata/xarray/pull/538,
65407870,MDExOlB1bGxSZXF1ZXN0NjU0MDc4NzA=,818,closed,0,Multidimensional groupby,1197350,"Many datasets have a two dimensional coordinate variable (e.g. longitude) which is different from the logical grid coordinates (e.g. nx, ny). (See #605.) For plotting purposes, this is solved by #608. However, we still might want to split / apply / combine over such coordinates. That has not been possible, because groupby only supports creating groups on one-dimensional arrays.

This PR overcomes that issue by using `stack` to collapse multiple dimensions in the group variable. A minimal example of the new functionality is

``` python
>>> da = xr.DataArray([[0,1],[2,3]], 
                coords={'lon': (['ny','nx'], [[30,40],[40,50]] ),
                        'lat': (['ny','nx'], [[10,10],[20,20]] )},
                dims=['ny','nx'])
>>> da.groupby('lon').sum()
<xarray.DataArray (lon: 3)>
array([0, 3, 3])
Coordinates:
  * lon      (lon) int64 30 40 50
```

This feature could have broad applicability for many realistic datasets (particularly model output on irregular grids): for example, averaging non-rectangular grids zonally (i.e. in latitude), binning in temperature, etc.

If you think this is worth pursuing, I would love some feedback.

The PR is not complete. Some items to address are
- [x] Create a specialized grouper to allow coarser bins. By default, if no `grouper` is specified, the `GroupBy` object uses all unique values to define the groups. With a high resolution dataset, this could balloon to a huge number of groups. With the latitude example, we would like to be able to specify e.g. 1-degree bins. Usage would be `da.groupby('lon', bins=range(-90,90))`.
- [ ] Allow specification of which dims to stack. For example, stack in space but keep time dimension intact. (Currently it just stacks all the dimensions of the group variable.) 
- [x] A nice example for the docs.
",2016-04-06T04:14:37Z,2016-07-31T23:02:59Z,2016-07-08T01:50:38Z,2016-07-08T01:50:38Z,a0a3860a87815f1f580aa56b972c7e8d9359b6ce,,,0,dc50064728cceade436c65b958f1b06a60e2eec7,0d0ae9d3e766c3af9dd98383ab3b33dfea9494dc,MEMBER,,13221727,https://github.com/pydata/xarray/pull/818,
75682773,MDExOlB1bGxSZXF1ZXN0NzU2ODI3NzM=,892,closed,0,fix printing of unicode attributes,1197350,"fixes #834

I would welcome a suggestion of how to test this in a way that works with both python 2 and 3. This is somewhat outside my expertise.
",2016-06-29T16:47:27Z,2016-07-24T02:57:13Z,2016-07-24T02:57:13Z,,9ea1fbacfa935f884598775e40c2287e00c92ef7,,,0,be0acd453f6d9b8082e7b2ebe0a957ac7dbc5e3e,0d0ae9d3e766c3af9dd98383ab3b33dfea9494dc,MEMBER,,13221727,https://github.com/pydata/xarray/pull/892,
87647409,MDExOlB1bGxSZXF1ZXN0ODc2NDc0MDk=,1027,closed,0,Groupby bins empty groups,1197350,"This PR fixes a bug in `groupby_bins` in which empty bins were dropped from the grouped results. Now `groupby_bins` restores any empty bins automatically. To recover the old behavior, one could apply `dropna` after a groupby operation.

Fixes #1019  
",2016-10-02T21:31:32Z,2016-10-03T15:28:01Z,2016-10-03T15:22:15Z,2016-10-03T15:22:15Z,0e044ce807fa0ee15703c8b4088bf41ae8e99116,,,0,06517c3eb6cf4c4967c05e009803ad63a7103392,525e086097171aa8a904d54ea2ee8dc76f2a69ef,MEMBER,,13221727,https://github.com/pydata/xarray/pull/1027,
93171281,MDExOlB1bGxSZXF1ZXN0OTMxNzEyODE=,1104,closed,0,add optimization tips,1197350,This adds some dask optimization tips from the mailing list (closes #1103).,2016-11-10T15:26:25Z,2016-11-10T16:49:13Z,2016-11-10T16:49:06Z,2016-11-10T16:49:06Z,2cfd3882374831e5edfe5d040f7775c5bb5ecc7a,,,0,e489be806cfb7a8d2fc03493c410a1db2b80fe24,92095f759a4b61691bf494d46d8d3008d812c6f8,MEMBER,,13221727,https://github.com/pydata/xarray/pull/1104,
113554698,MDExOlB1bGxSZXF1ZXN0MTEzNTU0Njk4,1345,closed,0,new dask prefix,1197350," - [x] closes #1343
 - [ ] tests added / passed
 - [ ] passes ``git diff upstream/master | flake8 --diff``
 - [ ] whatsnew entry
",2017-03-31T00:56:24Z,2017-05-21T09:45:39Z,2017-05-16T19:11:13Z,,f16e61b31e8d30048d220c8247dc188c079080ce,,,0,9529f54a01f4c06e642c54cb2088bcc9867ffe38,f2a50158beba112e521548bea63fdf758b327235,MEMBER,,13221727,https://github.com/pydata/xarray/pull/1345,
118408475,MDExOlB1bGxSZXF1ZXN0MTE4NDA4NDc1,1390,closed,0,Fix groupby bins tests,1197350," - [x] closes #1386
 - [x] tests added / passed
 - [x] passes ``git diff upstream/master | flake8 --diff``
 - [x] whatsnew entry
",2017-05-01T17:46:41Z,2017-05-01T21:52:14Z,2017-05-01T21:52:14Z,2017-05-01T21:52:14Z,a9a12b0aca862d5ab19180594f616b8efab13308,,,0,a91626819ea0cf333b5aad768863a487fd3a3de7,8f6a68e3f821689203bce2bce52b412e9fe70b5c,MEMBER,,13221727,https://github.com/pydata/xarray/pull/1390,
120903269,MDExOlB1bGxSZXF1ZXN0MTIwOTAzMjY5,1411,closed,0,fixed dask prefix naming,1197350," - [x] Closes #1343
 - [x] Tests added / passed
 - [x] Passes ``git diff upstream/master | flake8 --diff``
 - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API

I am starting a new PR for this since the original one (#1345) was not branched of my own fork.

As the discussion there stood, @shoyer suggested that `dataset.chunk` should also be updated to match the latest conventions in dask naming. The relevant code is here

```python
        def maybe_chunk(name, var, chunks):
            chunks = selkeys(chunks, var.dims)
            if not chunks:
                chunks = None
            if var.ndim > 0:
                token2 = tokenize(name, token if token else var._data)
                name2 = '%s%s-%s' % (name_prefix, name, token2)
                return var.chunk(chunks, name=name2, lock=lock)
            else:
                return var

        variables = OrderedDict([(k, maybe_chunk(k, v, chunks))
                                 for k, v in self.variables.items()])
```

Currently, `chunk` has an optional keyword argument `name_prefix='xarray-'`. Do we want to keep this optional? 

IMO, the current naming logic in `chunk` is not a problem for dask and will not cause problems for the distributed bokeh dashboard (as `open_dataset` did).",2017-05-16T19:10:30Z,2017-05-22T20:39:01Z,2017-05-22T20:38:56Z,2017-05-22T20:38:56Z,d80248476ebe4a3845211c5d58e0af1effc73ea3,,,0,630726ad12b9e83094ddd14bd02e6d4d2a18d706,028454d9d8c6d7d2f8afd7d0133941f961dbe231,MEMBER,,13221727,https://github.com/pydata/xarray/pull/1411,
121142890,MDExOlB1bGxSZXF1ZXN0MTIxMTQyODkw,1413,closed,0,concat prealigned objects,1197350," - [x] Closes #1385
 - [ ] Tests added / passed
 - [ ] Passes ``git diff upstream/master | flake8 --diff``
 - [ ] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API

This is an initial PR to bypass index alignment and coordinate checking when concatenating datasets.",2017-05-17T20:16:00Z,2017-07-17T21:53:53Z,2017-07-17T21:53:40Z,,cfaf0dd40692dcfddf9acb1dc9af1a292f965ece,,,0,a0314bfac8d3308c3c14c674386b9da4cb7b3c8d,d5c7e0612e8243c0a716460da0b74315f719f2df,MEMBER,,13221727,https://github.com/pydata/xarray/pull/1413,
137819104,MDExOlB1bGxSZXF1ZXN0MTM3ODE5MTA0,1528,closed,0,WIP: Zarr backend,1197350," - [x] Closes #1223 
 - [x] Tests added / passed
 - [x] Passes ``git diff upstream/master | flake8 --diff``
 - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API

I think that a zarr backend could be the ideal storage format for xarray datasets, overcoming many of the frustrations associated with netcdf and enabling optimal performance on cloud platforms.

This is a very basic start to implementing a zarr backend (as proposed in #1223); however, I am taking a somewhat different approach. I store the whole dataset in a single zarr group. I encode the extra metadata needed by xarray (so far just dimension information) as attributes within the zarr group and child arrays. I hide these special attributes from the user by wrapping the attribute dictionaries in a ""`HiddenKeyDict`"", so that they can't be viewed or modified.

I have no tests yet (:flushed:), but the following code works.
```python
from xarray.backends.zarr import ZarrStore
import xarray as xr
import numpy as np

ds = xr.Dataset(
    {'foo': (('y', 'x'), np.ones((100, 200)), {'myattr1': 1, 'myattr2': 2}),
     'bar': (('x',), np.zeros(200))},
    {'y': (('y',), np.arange(100)),
     'x': (('x',), np.arange(200))},
    {'some_attr': 'copana'}
).chunk({'y': 50, 'x': 40})

zs = ZarrStore(store='zarr_test')
ds.dump_to_store(zs)
ds2 = xr.Dataset.load_store(zs)
assert ds2.equals(ds)
```


There is a very long way to go here, but I thought I would just get a PR started. Some questions that would help me move forward.

1. What is ""encoding"" at the variable level? (I have never understood this part of xarray.) How should encoding be handled with zarr?
1. Should we encode / decode CF for zarr stores?
1. Do we want to always automatically align dask chunks with the underlying zarr chunks?
1. What sort of public API should the zarr backend have? Should you be able to load zarr stores via `open_dataset`? Or do we need a new method? I think `.to_zarr()` would be quite useful.
1. zarr arrays are extensible along all axes. What does this imply for unlimited dimensions?
1. Is any autoclose logic needed? As far as I can tell, zarr objects don't need to be closed.

",2017-08-27T02:38:01Z,2018-02-13T21:35:03Z,2017-12-14T02:11:36Z,2017-12-14T02:11:36Z,8fe7eb0fbcb7aaa90d894bcf32dc1408735e5d9d,,,0,f5633cabd19189675b607379badc2c19b86c0b8e,89a1a9883c0c8409dad8dbcccf1ab73a3ea2cafc,MEMBER,,13221727,https://github.com/pydata/xarray/pull/1528,
162224186,MDExOlB1bGxSZXF1ZXN0MTYyMjI0MTg2,1817,closed,0,fix rasterio chunking with s3 datasets,1197350," - [x] Closes #1816 (remove if there is no corresponding issue, which should only be the case for minor changes)
 - [x] Tests added (for all bug fixes or enhancements)
 - [x] Tests passed (for all non-documentation changes)
 - [x] Passes ``git diff upstream/master **/*py | flake8 --diff`` (remove if you did not edit any Python files)
 - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API (remove if this change should not be visible to users, e.g., if it is an internal clean-up, or if this is part of a larger project that will be documented later)

This is a simple fix for token generation of non-filename targets for rasterio.

The problem is that I have no idea how to test it without actually hitting s3 (which requires boto and aws credentials).
",2018-01-10T20:37:45Z,2018-01-24T09:33:07Z,2018-01-23T16:33:28Z,2018-01-23T16:33:28Z,3cd2337d8035a324cb38d6793eaf33818066f25c,,,0,350b929dcb4e87dc365b14f215925b944e91922a,e31cf43e8d183c63474b2898a0776fda72abc82c,MEMBER,,13221727,https://github.com/pydata/xarray/pull/1817,
180650279,MDExOlB1bGxSZXF1ZXN0MTgwNjUwMjc5,2047,closed,0,Fix decode cf with dask,1197350," - [x] Closes #1372
 - [x] Tests added
 - [x] Tests passed
 - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API

This was a very simple fix for an issue that has vexed me for quite a while. Am I missing something obvious here?
",2018-04-10T15:56:20Z,2018-04-12T23:38:02Z,2018-04-12T23:38:02Z,2018-04-12T23:38:02Z,a9d1f3a36229636f0d519eb36a8d4a7c91f6e1cd,,,0,c8843003b98a3a26636ec9f88393590c633eb382,6402391cf206fd04c12d44773fecd9b42ea0c246,MEMBER,,13221727,https://github.com/pydata/xarray/pull/2047,
213736501,MDExOlB1bGxSZXF1ZXN0MjEzNzM2NTAx,2405,closed,0,WIP: don't create indexes on multidimensional dimensions,1197350," - [x] Closes #2368, Closes #2233
 - [ ] Tests added (for all bug fixes or enhancements)
 - [ ] Tests passed (for all non-documentation changes)
 - [ ] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API (remove if this change should not be visible to users, e.g., if it is an internal clean-up, or if this is part of a larger project that will be documented later)

This is just a start to the solution proposed in #2368. A surprisingly small number of tests broke in my local environment.",2018-09-06T20:13:11Z,2023-07-19T18:33:17Z,2023-07-19T18:33:17Z,,6129432d60690690916cbac40a9e91099ad1f114,,,0,40c8d36844fdee6f8c06ec5babfacb25f177e954,d1e4164f3961d7bbb3eb79037e96cae14f7182f8,MEMBER,,13221727,https://github.com/pydata/xarray/pull/2405,
217463158,MDExOlB1bGxSZXF1ZXN0MjE3NDYzMTU4,2430,closed,0,WIP: revise top-level package description,1197350,"I have often complained that xarray's top-level package description assumes that the user knows all about pandas. I think this alienates many new users.

This is a first draft at revising that top-level description. Feedback from the community very needed here.",2018-09-22T15:35:47Z,2019-01-07T01:04:19Z,2019-01-06T00:31:57Z,2019-01-06T00:31:57Z,a0bbea89d5ce1399a24ca6c27b446283588ca2b4,,,0,085a5ddce0022639db9a2d23b5a486bb2cff38b3,bb87a9441d22b390e069d0fde58f297a054fd98a,MEMBER,,13221727,https://github.com/pydata/xarray/pull/2430,
232190289,MDExOlB1bGxSZXF1ZXN0MjMyMTkwMjg5,2559,closed,0,Zarr consolidated,1197350,"This PR adds support for reading and writing of [consolidated metadata](https://zarr.readthedocs.io/en/latest/tutorial.html#consolidating-metadata) in zarr stores.

 - [x] Closes #2558 (remove if there is no corresponding issue, which should only be the case for minor changes)
 - [x] Tests added (for all bug fixes or enhancements)
 - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API (remove if this change should not be visible to users, e.g., if it is an internal clean-up, or if this is part of a larger project that will be documented later)",2018-11-20T04:39:41Z,2018-12-05T14:58:58Z,2018-12-04T23:51:00Z,2018-12-04T23:51:00Z,3ae93ac31ce122fc10b089f3b92b8c20e8b218c9,,,0,fe4af34732f104f1e5f2b18e25dec1c3b92d6809,483b8a0a89ea4be862488e51af8a1b3bc7f40356,MEMBER,,13221727,https://github.com/pydata/xarray/pull/2559,
242668810,MDExOlB1bGxSZXF1ZXN0MjQyNjY4ODEw,2659,closed,0,to_dict without data,1197350,"This PR provides the ability to export Datasets and DataArrays to dictionary _without_ the actual data. This could be useful for generating indices of dataset contents to expose to search indices or other automated data discovery tools

In the process of doing this, I refactored the core dictionary export function to live in the Variable class, since the same code was duplicated in several places.

 - [x] Closes #2656
 - [x] Tests added
 - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API
",2019-01-07T14:09:25Z,2019-02-12T21:21:13Z,2019-01-21T23:25:56Z,2019-01-21T23:25:56Z,a7d55b9bcd0cc19330b5784842d51af5309d07ee,,,0,4cf7bc8efe9fe6aae4c2487685c883b70aefa9dd,ede3e0101bae2f45c3f4634a1e1ecb8e2ccd0258,MEMBER,,13221727,https://github.com/pydata/xarray/pull/2659,
261202056,MDExOlB1bGxSZXF1ZXN0MjYxMjAyMDU2,2813,open,0,[WIP] added protect_dataset_variables_inplace to open_zarr,1197350,"This adds the same call to `_protect_dataset_variables_inplace` to `open_zarr` which we find in `open_dataset`. It wraps the arrays with `indexing.MemoryCachedArray`.

As far as I can tell, it *does not work*, in the sense that nothing is cached.

 - [ ] One possible way to close #2812 
 - [ ] Tests added
 - [ ] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API
",2019-03-14T14:50:15Z,2024-03-25T14:05:24Z,,,a8195022a5cc6c4c573bffa7df1588e6aa0a12b2,,,0,5ab07f8fa8f2a1e656b276e64f698f91aa07330d,d1e4164f3961d7bbb3eb79037e96cae14f7182f8,MEMBER,,13221727,https://github.com/pydata/xarray/pull/2813,
261207163,MDExOlB1bGxSZXF1ZXN0MjYxMjA3MTYz,2814,open,0,[WIP] Use zarr internal LRU caching,1197350,"Alternative way to close #2812. This uses zarr's own caching.

In contrast to #2813, this *does work*.

 - [ ] Closes #2812 
 - [ ] Tests added
 - [ ] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API
",2019-03-14T15:01:06Z,2024-03-25T14:00:50Z,,,5ac4430d03ce962eabf140771da75c0667e24e65,,,0,e92f9c1b55fb685c2fc80f13dd16de852b0550b6,d1e4164f3961d7bbb3eb79037e96cae14f7182f8,MEMBER,,13221727,https://github.com/pydata/xarray/pull/2814,
268927254,MDExOlB1bGxSZXF1ZXN0MjY4OTI3MjU0,2881,closed,0,decreased pytest verbosity,1197350,"This removes the `--verbose` flag from py.test in .travis.yml.

 - [x] Closes #2880 
",2019-04-09T21:12:50Z,2019-04-09T23:36:01Z,2019-04-09T23:34:22Z,2019-04-09T23:34:22Z,2c10d1443bea09e5ef53e5a7e35195a195e193a7,,,0,b085e311a822e5e77a9a2b6f3e132281bbd285ea,3435b03de218f54a55eb72dff597bb47b0f407cb,MEMBER,,13221727,https://github.com/pydata/xarray/pull/2881,
297270660,MDExOlB1bGxSZXF1ZXN0Mjk3MjcwNjYw,3105,closed,0,Switch doc examples to use nbsphinx,1197350,"This is the beginning of the docs refactor we have in mind for the sprint tomorrow.

We will merge things first to the scipy19-docs branch so we can make sure things build on RTD.

http://xarray.pydata.org/en/scipy19-docs",2019-07-13T02:28:34Z,2019-07-13T04:53:09Z,2019-07-13T04:52:52Z,2019-07-13T04:52:52Z,903495e5f3b7439e9ba9d63178129f43cab3082a,,,0,163a0a694e187ec3b66c757572e4fc50be7aa8e3,6586c26af6e55279efe646188b39ee1caf86db23,MEMBER,,13221727,https://github.com/pydata/xarray/pull/3105,
297282705,MDExOlB1bGxSZXF1ZXN0Mjk3MjgyNzA1,3106,closed,0,Replace sphinx_gallery with notebook,1197350,"Today @jhamman and I discussed how to refactor our somewhat fragmented ""examples"". We decided to basically copy the approach of the [dask-examples](https://github.com/dask/dask-examples) repo, but have it live here in the main xarray repo. Basically this approach is:
- all examples are notebooks
- examples are rendered during doc build by nbsphinx
- we will eventually have a binder that works with all of the same examples

This PR removes the dependency on sphinx_gallery and replaces the existing gallery with a standalone notebook called `visualization_gallery.ipynb`. However, not all of the links that worked in the gallery work here, since we are now using nbsphinx to render the notebooks (see https://github.com/spatialaudio/nbsphinx/issues/308).

Really important to get @dcherian's feedback on this, as he was the one who originally introduced the gallery. My view is that having everything as notebooks makes examples easier to maintain. But I'm curious to hear other views.",2019-07-13T05:35:34Z,2019-07-13T14:03:20Z,2019-07-13T14:03:19Z,2019-07-13T14:03:19Z,92cd3c4ef9fc61c62fe8c8d257dd223c73a80ac2,,,0,cd769c44954e2d03ea1aae607e56ad7e142542ea,903495e5f3b7439e9ba9d63178129f43cab3082a,MEMBER,,13221727,https://github.com/pydata/xarray/pull/3106,
297354511,MDExOlB1bGxSZXF1ZXN0Mjk3MzU0NTEx,3121,closed,0,Allow other tutorial filename extensions,1197350,"<!-- Feel free to remove check-list items aren't relevant to your change -->

 - [x] Closes #3118
 - [ ] Tests added
 - [ ] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API

Together with https://github.com/pydata/xarray-data/pull/15, this allows us to generalize out tutorial datasets to non netCDF files. But it is backwards compatible--if there is no file suffix, it will append `.nc`.",2019-07-13T23:27:44Z,2019-07-14T01:07:55Z,2019-07-14T01:07:51Z,2019-07-14T01:07:51Z,5df8a428d36a3909395777bc9bc36e2d56b7422c,,,0,49c11ade9d36e9c6295993e52deb660d01cfb846,92cd3c4ef9fc61c62fe8c8d257dd223c73a80ac2,MEMBER,,13221727,https://github.com/pydata/xarray/pull/3121,
297445477,MDExOlB1bGxSZXF1ZXN0Mjk3NDQ1NDc3,3131,open,0,WIP: tutorial on merging datasets,1197350,"<!-- Feel free to remove check-list items aren't relevant to your change -->

 - [x] Closes #1391 
 - [ ] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API

This is a start on a tutorial about merging / combining datasets. ",2019-07-15T01:28:25Z,2022-06-09T14:50:17Z,,,0d29796ce96aaa6e8f2fe7b81548b8fbe659c666,35968931,,0,211a2b30ba063dcb3cf5eb0965376ac5a348e7ac,d1e4164f3961d7bbb3eb79037e96cae14f7182f8,MEMBER,,13221727,https://github.com/pydata/xarray/pull/3131,
415292337,MDExOlB1bGxSZXF1ZXN0NDE1MjkyMzM3,4047,closed,0,Document Xarray zarr encoding conventions,1197350,"When we implemented the Zarr backend, we made some _ad hoc_ choices about how to encode NetCDF data in Zarr. At this stage, it would be useful to explicitly document this encoding. I decided to put it on the ""Xarray Internals"" page, but I'm open to moving if folks feel it fits better elsewhere.

cc @jeffdlb, @WardF, @DennisHeimbigner",2020-05-08T15:29:14Z,2020-05-22T21:59:09Z,2020-05-20T17:04:02Z,2020-05-20T17:04:02Z,261df2e56b2d554927887b8943f84514fc60369b,,,0,eb700172e72c2177feb8c837c24bc62110451227,69548df9826cde9df6cbdae9c033c9fb1e62d493,MEMBER,,13221727,https://github.com/pydata/xarray/pull/4047,
597608584,MDExOlB1bGxSZXF1ZXN0NTk3NjA4NTg0,5065,closed,0,Zarr chunking fixes,1197350,"<!-- Feel free to remove check-list items aren't relevant to your change -->

- [x] Closes #2300, closes #5056
- [x] Tests added
- [x] Passes `pre-commit run --all-files`
- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`

This PR contains two small, related updates to how Zarr chunks are handled.

1. We now delete the `encoding` attribute at the Variable level whenever `chunk` is called. The persistence of `chunk` encoding has been the source of lots of confusion (see #2300,  #4046, #4380, https://github.com/dcs4cop/xcube/issues/347)
2. Added a new option called `safe_chunks` in `to_zarr` which allows for bypassing the requirement of the many-to-one relationship between Zarr chunks and Dask chunks (see #5056).

Both these touch the internal logic for how chunks are handled, so I thought it was easiest to tackle them with a single PR.",2021-03-22T01:35:22Z,2021-04-26T16:37:43Z,2021-04-26T16:37:43Z,2021-04-26T16:37:42Z,dd7f742fd79126d8665740c5a461c265fdfdc0da,,,0,023920b8d11d1a77bedfa880fcaf68f95167729c,69950a46f9402a7c5ae6d86d766d7933738dc62b,MEMBER,,13221727,https://github.com/pydata/xarray/pull/5065,
1592876533,PR_kwDOAMm_X85e8V31,8428,closed,0,Add mode='a-': Do not overwrite coordinates when appending to Zarr with `append_dim`,1197350,"This implements the 1b option described in #8427.

- [x] Closes #8427
- [x] Tests added
- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
",2023-11-08T15:41:58Z,2023-12-01T04:21:57Z,2023-12-01T03:58:54Z,2023-12-01T03:58:54Z,c93b31a9175eed9e506eb1950bf843d7de715bb9,,,0,9dbb50acec94790dd24c1d84a5bb5ad4325b5ac9,1715ed3422c04853fda1827de7e3580c07de85cf,MEMBER,"{""enabled_by"": {""login"": ""andersy005"", ""id"": 13301940, ""node_id"": ""MDQ6VXNlcjEzMzAxOTQw"", ""avatar_url"": ""https://avatars.githubusercontent.com/u/13301940?v=4"", ""gravatar_id"": """", ""url"": ""https://api.github.com/users/andersy005"", ""html_url"": ""https://github.com/andersy005"", ""followers_url"": ""https://api.github.com/users/andersy005/followers"", ""following_url"": ""https://api.github.com/users/andersy005/following{/other_user}"", ""gists_url"": ""https://api.github.com/users/andersy005/gists{/gist_id}"", ""starred_url"": ""https://api.github.com/users/andersy005/starred{/owner}{/repo}"", ""subscriptions_url"": ""https://api.github.com/users/andersy005/subscriptions"", ""organizations_url"": ""https://api.github.com/users/andersy005/orgs"", ""repos_url"": ""https://api.github.com/users/andersy005/repos"", ""events_url"": ""https://api.github.com/users/andersy005/events{/privacy}"", ""received_events_url"": ""https://api.github.com/users/andersy005/received_events"", ""type"": ""User"", ""site_admin"": false}, ""merge_method"": ""squash"", ""commit_title"": ""Add mode='a-': Do not overwrite coordinates when appending to Zarr with `append_dim` (#8428)"", ""commit_message"": ""Co-authored-by: Deepak Cherian <deepak@cherian.net>\r\nCo-authored-by: Anderson Banihirwe <13301940+andersy005@users.noreply.github.com>\r\n""}",13221727,https://github.com/pydata/xarray/pull/8428,