html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/pull/1528#issuecomment-364801395,https://api.github.com/repos/pydata/xarray/issues/1528,364801395,MDEyOklzc3VlQ29tbWVudDM2NDgwMTM5NQ==,306380,2018-02-11T23:40:18Z,2018-02-11T23:40:18Z,MEMBER,"Does the to_zarr method suffice?
http://xarray.pydata.org/en/latest/generated/xarray.Dataset.to_zarr.html#xarray.Dataset.to_zarr
On Sun, Feb 11, 2018 at 6:35 PM, Martin Durant wrote:
> Question: how would one *build* a zarr-xarray dataset?
>
> With zarr you can open an array that contains no data, and use set-slice
> notation to fill in the values (which is what dask's store essentially
> does).
>
> If I have some pre-known coordinates and bigger-than-memory data arrays,
> how would I go about getting the values into the zarr structure? If this
> can't be done directly with the xarray interface, is there a way to call
> zarr's open/create/zeros such that the corresponding array will appear as a
> variable when the same dataset is opened with xarray?
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-350343117,https://api.github.com/repos/pydata/xarray/issues/1528,350343117,MDEyOklzc3VlQ29tbWVudDM1MDM0MzExNw==,306380,2017-12-08T18:55:35Z,2017-12-08T18:55:35Z,MEMBER,"Not as far as I know.
On Fri, Dec 8, 2017 at 1:53 PM, Ryan Abernathey wrote:
> *@rabernat* commented on this pull request.
> ------------------------------
>
> In xarray/backends/common.py:
>
> > @@ -184,7 +185,7 @@ def sync(self):
>          import dask.array as da
>          import dask
>          if LooseVersion(dask.__version__) > LooseVersion('0.8.1'):
> -            da.store(self.sources, self.targets, lock=GLOBAL_LOCK)
> +            da.store(self.sources, self.targets, lock=self.lock)
>
> There is no reason that a task run on the distributed system will not show
> up on the dashboard. My first guess is that somehow you're using a local
> scheduler.
>
> I was not using a local scheduler. After digging further, I can see the
> tasks on the distributed dashboard using a regular zarr.DirectoryStore,
> but not when I pass a gcsfs.mapping.GCSMap to to_zarr. Is there any
> reason these two should behave differently?
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-349488598,https://api.github.com/repos/pydata/xarray/issues/1528,349488598,MDEyOklzc3VlQ29tbWVudDM0OTQ4ODU5OA==,306380,2017-12-06T00:30:21Z,2017-12-06T00:30:21Z,MEMBER,We tried this out on a cloud-deployed cluster on GCE and things worked pleasantly. Some conversation here: https://github.com/pangeo-data/pangeo/issues/19,"{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 1, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-347983854,https://api.github.com/repos/pydata/xarray/issues/1528,347983854,MDEyOklzc3VlQ29tbWVudDM0Nzk4Mzg1NA==,306380,2017-11-29T20:19:37Z,2017-11-29T20:19:37Z,MEMBER,"> FWIW I think the best option at the moment is to make sure you add either Pickle or MsgPack filter for any zarr array with an object dtype.
Is it possible to add one of these filters to xarray's default use of Zarr?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-347981682,https://api.github.com/repos/pydata/xarray/issues/1528,347981682,MDEyOklzc3VlQ29tbWVudDM0Nzk4MTY4Mg==,306380,2017-11-29T20:11:25Z,2017-11-29T20:11:25Z,MEMBER,FWIW my vote is for msgpack over pickle for both performance and cross-language reasons,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-345778844,https://api.github.com/repos/pydata/xarray/issues/1528,345778844,MDEyOklzc3VlQ29tbWVudDM0NTc3ODg0NA==,306380,2017-11-20T18:05:25Z,2017-11-20T18:05:25Z,MEMBER,"> This is, of course, by design :)
It's so nice when well-designed things come together and just work as planned :)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-345575240,https://api.github.com/repos/pydata/xarray/issues/1528,345575240,MDEyOklzc3VlQ29tbWVudDM0NTU3NTI0MA==,306380,2017-11-20T02:28:07Z,2017-11-20T02:28:07Z,MEMBER,"That is, indeed, quite exciting. Also exciting is that I was able to look at and compute on your data easily.
```python
In [1]: import zarr
In [2]: import gcsfs
In [3]: fs = gcsfs.GCSFileSystem(project='pangeo-181919')
In [4]: gcsmap = gcsfs.mapping.GCSMap('zarr_store_test', gcs=fs, check=True, create=False)
In [5]: import xarray as xr
In [6]: ds_gcs = xr.open_zarr(gcsmap, mode='r')
In [7]: ds_gcs
Out[7]:
Dimensions: (x: 200, y: 100)
Coordinates:
* x (x) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ...
* y (y) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ...
Data variables:
bar (x) float64 dask.array
foo (y, x) float32 dask.array
Attributes:
array_atr: [1, 2]
some_attr: copana
In [8]: ds_gcs.sum()
Out[8]:
Dimensions: ()
Data variables:
bar float64 dask.array
foo float32 dask.array
In [9]: ds_gcs.sum().compute()
Out[9]:
Dimensions: ()
Data variables:
bar float64 0.0
foo float32 20000.0
```","{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 1, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-345104713,https://api.github.com/repos/pydata/xarray/issues/1528,345104713,MDEyOklzc3VlQ29tbWVudDM0NTEwNDcxMw==,306380,2017-11-17T00:12:01Z,2017-11-17T00:12:01Z,MEMBER,Hooray for standard interfaces!,"{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 1, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-345101150,https://api.github.com/repos/pydata/xarray/issues/1528,345101150,MDEyOklzc3VlQ29tbWVudDM0NTEwMTE1MA==,306380,2017-11-16T23:52:48Z,2017-11-16T23:52:48Z,MEMBER,"The gcsfs library also provides a MutableMapping for Google Cloud Storage.
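As a rough illustration of why a MutableMapping is a sufficient interface here: a chunk store only needs get, set, delete, iterate, and len over string keys and bytes values. The class below is a toy in-memory stand-in with a made-up name (DictChunkStore) and made-up keys; it is not how gcsfs implements its mapping.

```python
from collections.abc import MutableMapping


class DictChunkStore(MutableMapping):
    '''Toy key/bytes store (hypothetical; not the gcsfs implementation).'''

    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        # Keep an immutable copy so later changes to the caller's buffer are not seen.
        self._data[key] = bytes(value)

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


store = DictChunkStore()
store['foo/.zarray'] = b'{}'            # array metadata under one key
store['foo/0.0'] = bytearray(b'chunk')  # one chunk of data under another
keys = sorted(store)
```

Anything that behaves like this, whether backed by a directory, a dict, or a cloud bucket, can be swapped in without changing the code that reads and writes chunks.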
The dask.distributed library now also provides a distributed lock for synchronization if necessary, though in practice we should just rechunk the dask.array before writing.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694