html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/pull/1528#issuecomment-350375750,https://api.github.com/repos/pydata/xarray/issues/1528,350375750,MDEyOklzc3VlQ29tbWVudDM1MDM3NTc1MA==,703554,2017-12-08T21:24:45Z,2017-12-08T22:27:47Z,CONTRIBUTOR,"Just to confirm, if writes are aligned with chunk boundaries in the destination array then no locking is required.

Also, if you're going to be moving large datasets into cloud storage and doing distributed computing, it may be worth investigating compressors and compressor options, as a good compression ratio can make a big difference when network bandwidth is the limiting factor. I would suggest using the Blosc compressor with cname='zstd'. I would also suggest using shuffle; the Blosc codec in the latest numcodecs has an AUTOSHUFFLE option, so byte shuffle is used for arrays with an item size greater than 1 byte and bit shuffle for arrays with a 1-byte item size. I would also experiment with the compression level (clevel) to see how speed balances against compression ratio. E.g., Blosc(cname='zstd', clevel=5, shuffle=Blosc.AUTOSHUFFLE) may be a good starting point. The default compressor, Blosc(cname='lz4', ...), is more optimised for fast local storage: speed is very good but the compression ratio is moderate, which may not be best for distributed computing.
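For example, a rough sketch of what that configuration looks like (the path, shape and chunks here are just placeholders, and I haven't benchmarked this on your data):

```python
import numpy as np
import zarr
from numcodecs import Blosc

# zstd with automatic shuffle selection, per the suggestion above
compressor = Blosc(cname='zstd', clevel=5, shuffle=Blosc.AUTOSHUFFLE)
z = zarr.open_array('example.zarr', mode='w', shape=(1000, 1000),
                    chunks=(100, 100), dtype='f8', compressor=compressor)
z[:] = np.random.normal(size=(1000, 1000))
# compare raw vs stored size to see the compression ratio achieved
print(z.nbytes / z.nbytes_stored)
```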
","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-350379064,https://api.github.com/repos/pydata/xarray/issues/1528,350379064,MDEyOklzc3VlQ29tbWVudDM1MDM3OTA2NA==,703554,2017-12-08T21:40:40Z,2017-12-08T22:27:35Z,CONTRIBUTOR,"Some examples of compressor benchmarking that may be useful: http://alimanfoo.github.io/2016/09/21/genotype-compression-benchmark.html. The specific conclusions probably won't apply to your data, but some of the code and ideas may be useful. Since writing that article I added Zstd and LZ4 compressors in numcodecs, so those may also be worth trying in addition to Blosc with various configurations.

(Blosc breaks up each chunk into blocks, which enables multithreaded compression/decompression but can also reduce the compression ratio relative to the same compressor library used without Blosc. I.e., Blosc(cname='zstd', clevel=1) will behave differently from Zstd(level=1) even though the same underlying compression library (Zstandard) is being used.)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-348839453,https://api.github.com/repos/pydata/xarray/issues/1528,348839453,MDEyOklzc3VlQ29tbWVudDM0ODgzOTQ1Mw==,703554,2017-12-04T01:40:57Z,2017-12-04T01:40:57Z,CONTRIBUTOR,"I know you're not including string support in this PR, but for interest, there are a couple of changes coming into zarr via https://github.com/alimanfoo/zarr/pull/212 that may be relevant in the future. It should now be impossible to generate a segfault via a badly configured object array, and it is also now much harder to badly configure one. When creating an object array, an object codec should be provided via the ``object_codec`` parameter. There are now three codecs in numcodecs that can be used for variable-length text strings: MsgPack, Pickle and JSON (new).

[Examples notebook here](https://github.com/alimanfoo/zarr/blob/14ac8d9bf19633232f6522dfcd925f300722b82b/notebooks/object_arrays.ipynb). In that notebook I also ran some simple benchmarks and MsgPack comes out well, but JSON isn't too shabby either.
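As a minimal sketch of what creation looks like with the new parameter (assuming the API as it currently stands in that PR):

```python
import numpy as np
import zarr
from numcodecs import MsgPack

values = np.array(['ab', 'cdef', 'ghi'], dtype=object)
# the object codec is now given explicitly at array creation time
z = zarr.create(shape=values.shape, dtype=object, object_codec=MsgPack())
z[:] = values
print(z[:])
```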
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-347385269,https://api.github.com/repos/pydata/xarray/issues/1528,347385269,MDEyOklzc3VlQ29tbWVudDM0NzM4NTI2OQ==,703554,2017-11-28T01:36:29Z,2017-11-28T01:49:24Z,CONTRIBUTOR,"FWIW I think the best option at the moment is to make sure you add either a Pickle or MsgPack filter for any zarr array with an object dtype. BTW I was thinking that zarr should automatically add one of these filters any time someone creates an array with an object dtype, to avoid them hitting the pointer issue. If you have any thoughts on the best solution, drop them here: https://github.com/alimanfoo/zarr/issues/208","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-347381734,https://api.github.com/repos/pydata/xarray/issues/1528,347381734,MDEyOklzc3VlQ29tbWVudDM0NzM4MTczNA==,703554,2017-11-28T01:16:07Z,2017-11-28T01:16:07Z,CONTRIBUTOR,"When still in the original interpreter session, all the objects still exist in memory, so all the pointers stored in the array are still valid. Restart the session and the objects are gone and the pointers are invalid.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-347381500,https://api.github.com/repos/pydata/xarray/issues/1528,347381500,MDEyOklzc3VlQ29tbWVudDM0NzM4MTUwMA==,703554,2017-11-28T01:14:42Z,2017-11-28T01:14:42Z,CONTRIBUTOR,"Try exiting and restarting the interpreter, then running:

zgs = zarr.open_group(store='zarr_directory')
zgs.x[:]

On Tue, Nov 28, 2017 at 1:10 AM, Ryan Abernathey wrote:

> zarr needs a filter that can encode and pack the strings into a single buffer, except in the special case where the data are being stored in-memory
>
> @alimanfoo : the following also seems to work with directory store
>
> values = np.array([b'ab', b'cdef', np.nan], dtype=object)
> zgs = zarr.open_group(store='zarr_directory')
> zgs.create('x', shape=values.shape, dtype=values.dtype)
> zgs.x[:] = values
>
> This seems to contradict your statement above. What am I missing?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-347363503,https://api.github.com/repos/pydata/xarray/issues/1528,347363503,MDEyOklzc3VlQ29tbWVudDM0NzM2MzUwMw==,703554,2017-11-27T23:27:41Z,2017-11-27T23:27:41Z,CONTRIBUTOR,"For variable-length strings (or any array with an object dtype) zarr needs a filter that can encode and pack the strings into a single buffer, except in the special case where the data are being stored in memory (as in your first example). The filter has to be specified manually; some examples here: http://zarr.readthedocs.io/en/master/tutorial.html#string-arrays. There are two codecs currently in numcodecs that can do this: one is Pickle, the other is MsgPack. I haven't done any benchmarking of data size or encoding speed, but MsgPack may be preferable because it's more portable.

There was some discussion a while back about creating a codec that handles variable-length strings by encoding via UTF-8 then concatenating the encoded bytes and lengths or offsets, IIRC similar to Arrow, and maybe even creating a special ""text"" dtype that inserts this filter automatically so you don't have to add it manually. But there hasn't been a strong motivation so far.
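For example, a minimal sketch of adding the filter manually (the store path and values are placeholders):

```python
import numpy as np
import zarr
from numcodecs import MsgPack

values = np.array(['ab', 'cdef', 'ghi'], dtype=object)
store = zarr.DirectoryStore('strings.zarr')
root = zarr.group(store=store, overwrite=True)
# without an object filter the chunks would hold raw pointers;
# MsgPack encodes the strings into a single portable buffer
z = root.create_dataset('x', shape=values.shape, dtype=object,
                        filters=[MsgPack()])
z[:] = values
print(z[:])  # round-trips safely, even after restarting the interpreter
```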
On Mon, Nov 27, 2017 at 10:32 PM, Stephan Hoyer wrote:

> Overall, I find the conventions module to be a bit unwieldy. There is a lot of stuff in there, not all of which is related to CF conventions. It would be useful to separate the actual conventions from the encoding / decoding needed for different backends.
>
> Agreed!
>
> I wonder why zarr doesn't have a UTF-8 variable length string type (alimanfoo/zarr#206) -- that would feel like the obvious first choice for encoding this data.
>
> That said, xarray *should* be able to use fixed-length bytes just fine, doing UTF-8 encoding/decoding on the fly.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-345619509,https://api.github.com/repos/pydata/xarray/issues/1528,345619509,MDEyOklzc3VlQ29tbWVudDM0NTYxOTUwOQ==,703554,2017-11-20T08:07:44Z,2017-11-20T08:07:44Z,CONTRIBUTOR,"Fantastic!

On Monday, November 20, 2017, Matthew Rocklin wrote:

> That is, indeed, quite exciting. Also exciting is that I was able to look at and compute on your data easily.
>
> In [1]: import zarr
> In [2]: import gcsfs
> In [3]: fs = gcsfs.GCSFileSystem(project='pangeo-181919')
> In [4]: gcsmap = gcsfs.mapping.GCSMap('zarr_store_test', gcs=fs, check=True, create=False)
> In [5]: import xarray as xr
> In [6]: ds_gcs = xr.open_zarr(gcsmap, mode='r')
>
> In [7]: ds_gcs
> Out[7]:
> Dimensions:  (x: 200, y: 100)
> Coordinates:
>   * x        (x) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ...
>   * y        (y) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ...
> Data variables:
>     bar      (x) float64 dask.array
>     foo      (y, x) float32 dask.array
> Attributes:
>     array_atr:  [1, 2]
>     some_attr:  copana
>
> In [8]: ds_gcs.sum()
> Out[8]:
> Dimensions:  ()
> Data variables:
>     bar      float64 dask.array
>     foo      float32 dask.array
>
> In [9]: ds_gcs.sum().compute()
> Out[9]:
> Dimensions:  ()
> Data variables:
>     bar      float64 0.0
>     foo      float32 20000.0","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-345080945,https://api.github.com/repos/pydata/xarray/issues/1528,345080945,MDEyOklzc3VlQ29tbWVudDM0NTA4MDk0NQ==,703554,2017-11-16T22:18:04Z,2017-11-16T22:18:04Z,CONTRIBUTOR,"Re different zarr storage backends, the main options are a plain dict, DirectoryStore, ZipStore, and there's a [new DBMStore class just merged](https://github.com/alimanfoo/zarr/pull/186) which enables storage in any DBM-style database (e.g., Berkeley DB). ZipStore has some constraints because of how zip files work: you can't really replace an entry in a zip file, which means anything that writes the same array chunk more than once will generate warnings. Dask's S3Map should also work; I haven't tried it, and it's obviously not ideal for unit tests, but I'd be interested if you get any experience with it.
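To illustrate the ZipStore write-then-close workflow (file name and shapes are just examples):

```python
import zarr

store = zarr.ZipStore('example.zip', mode='w')
root = zarr.group(store=store)
z = root.zeros('foo', shape=(100, 100), chunks=(10, 10), dtype='f4')
z[:] = 42.0  # each chunk is written exactly once, so no duplicate-entry warnings
store.close()  # required to finalise the zip file after writing
```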
Re different combinations of zarr and dask chunks, it can be thread safe even if chunks are not aligned; just pass a synchronizer when instantiating the array or group. Zarr has a ThreadSynchronizer class which can be used for thread-based parallelism. If a synchronizer is provided, it is used to lock each chunk individually during write operations. [More info here](http://zarr.readthedocs.io/en/latest/tutorial.html#parallel-computing-and-synchronization).

Re fill values, zarr has a native concept of a fill value for each array, with the fill value stored as part of the array metadata. Array metadata are stored as JSON, and I recently [merged a fix](https://github.com/alimanfoo/zarr/pull/176) so that bytes fill values can be used (via base64 encoding). I believe the netcdf way is to store the fill value separately as the value of a ""_FillValue"" attribute? You could do this with zarr, but user attributes are also JSON, so you would need to do your own encoding/decoding. If possible, though, I'd suggest using the native zarr fill_value support, as it handles bytes fill value encoding and also checks to ensure fill values are valid wrt the array dtype.
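As a quick sketch of both points together (path, shape and fill value are placeholders):

```python
import zarr

# the synchronizer protects unaligned writes; fill_value is stored in the
# array metadata and returned for any chunk that hasn't been written yet
z = zarr.open_array('synced.zarr', mode='w', shape=(10000,),
                    chunks=(1000,), dtype='f8', fill_value=-9999.0,
                    synchronizer=zarr.ThreadSynchronizer())
# this write spans chunks 0 and 1; each chunk is locked individually,
# so concurrent threads writing overlapping regions stay safe
z[500:1500] = 1.0
```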
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-339897936,https://api.github.com/repos/pydata/xarray/issues/1528,339897936,MDEyOklzc3VlQ29tbWVudDMzOTg5NzkzNg==,703554,2017-10-27T07:42:34Z,2017-10-27T07:42:34Z,CONTRIBUTOR,"Suggest testing against GitHub master; there are a few other issues I'd like to work through before the next release.

On Thu, 26 Oct 2017 at 23:07, Ryan Abernathey wrote:

> Fantastic! Are you planning a release any time soon? If not we can set up to test against the github master.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-339800443,https://api.github.com/repos/pydata/xarray/issues/1528,339800443,MDEyOklzc3VlQ29tbWVudDMzOTgwMDQ0Mw==,703554,2017-10-26T21:04:17Z,2017-10-26T21:04:17Z,CONTRIBUTOR,"Just to say, support for 0d arrays, and for arrays with one or more zero-length dimensions, is in zarr master.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-335186616,https://api.github.com/repos/pydata/xarray/issues/1528,335186616,MDEyOklzc3VlQ29tbWVudDMzNTE4NjYxNg==,703554,2017-10-09T15:07:29Z,2017-10-09T17:23:21Z,CONTRIBUTOR,"I'm on paternity leave for the next 2 weeks, then I expect to be catching up for a couple of weeks after that. I may be able to merge straightforward PRs but will have limited bandwidth.","{""total_count"": 3, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 3, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-335030993,https://api.github.com/repos/pydata/xarray/issues/1528,335030993,MDEyOklzc3VlQ29tbWVudDMzNTAzMDk5Mw==,703554,2017-10-08T19:17:27Z,2017-10-08T23:37:47Z,CONTRIBUTOR,"FWIW I think some JSON encoders for attributes would ultimately be a useful addition to zarr, but I won't be able to put any effort into zarr in the next month, so workarounds in xarray sound like a good idea for now.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-325813339,https://api.github.com/repos/pydata/xarray/issues/1528,325813339,MDEyOklzc3VlQ29tbWVudDMyNTgxMzMzOQ==,703554,2017-08-29T21:43:48Z,2017-08-29T21:43:48Z,CONTRIBUTOR,"On Tuesday, August 29, 2017, Ryan Abernathey wrote:

> @alimanfoo : when do you anticipate the 2.2 zarr release to happen? Will the API change significantly? If so, I will wait for that to move forward here.

Zarr 2.2 will hopefully happen some time in the next 2 months, but it will be fully backwards-compatible, with no breaking API changes.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-325729013,https://api.github.com/repos/pydata/xarray/issues/1528,325729013,MDEyOklzc3VlQ29tbWVudDMyNTcyOTAxMw==,703554,2017-08-29T17:02:41Z,2017-08-29T17:02:41Z,CONTRIBUTOR,"FWIW all filter (codec) classes have been migrated from zarr to a separate package called numcodecs and will be imported from there in the next (2.2) zarr release. Here is [FixedScaleOffset](https://github.com/alimanfoo/numcodecs/blob/master/numcodecs/fixedscaleoffset.py). The implementation is basic numpy, so there is probably some room for optimization.
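As a quick sketch of how it can be used as a filter (the offset/scale numbers are arbitrary; scale=100 keeps 2 decimal places):

```python
import numpy as np
import zarr
from numcodecs import FixedScaleOffset

# store floats around 1000 as uint16: subtract offset, multiply by scale, round
filt = FixedScaleOffset(offset=1000, scale=100, dtype='f8', astype='u2')
data = np.random.uniform(low=1000, high=1010, size=100)
z = zarr.array(data, chunks=10, filters=[filt])
print(z[:5])  # values come back rounded to 2 decimal places
```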
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-325727280,https://api.github.com/repos/pydata/xarray/issues/1528,325727280,MDEyOklzc3VlQ29tbWVudDMyNTcyNzI4MA==,703554,2017-08-29T16:56:55Z,2017-08-29T16:56:55Z,CONTRIBUTOR,"Following this with interest. Regarding autoclose, just to confirm: zarr doesn't really have any notion of whether something is open or closed. When using the DirectoryStore storage class (the most common use case, I imagine), all files are automatically closed; nothing is kept open. There are some storage classes (e.g., ZipStore) that do require an explicit close call to finalise the file on disk if you have been writing data, but I think you can ignore this in xarray and leave it up to the user to manage themselves.

Out of interest, @shoyer do you still think there would be value in writing a wrapper for zarr analogous to h5netcdf? Or does this PR provide all the necessary functionality?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694