html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/pull/1528#issuecomment-364801395,https://api.github.com/repos/pydata/xarray/issues/1528,364801395,MDEyOklzc3VlQ29tbWVudDM2NDgwMTM5NQ==,306380,2018-02-11T23:40:18Z,2018-02-11T23:40:18Z,MEMBER,"Does the to_zarr method suffice: http://xarray.pydata.org/en/latest/generated/xarray.Dataset.to_zarr.html#xarray.Dataset.to_zarr ? On Sun, Feb 11, 2018 at 6:35 PM, Martin Durant wrote: > Question: how would one *build* a zarr-xarray dataset? > > With zarr you can open an array that contains no data, and use set-slice > notation to fill in the values (which is what dask's store essentially > does). > > If I have some pre-known coordinates and bigger-than-memory data arrays, > how would I go about getting the values into the zarr structure? If this > can't be done directly with the xarray interface, is there a way to call > zarr's open/create/zeros such that the corresponding array will appear as a > variable when the same dataset is opened with xarray?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-350343117,https://api.github.com/repos/pydata/xarray/issues/1528,350343117,MDEyOklzc3VlQ29tbWVudDM1MDM0MzExNw==,306380,2017-12-08T18:55:35Z,2017-12-08T18:55:35Z,MEMBER,"Not as far as I know. On Fri, Dec 8, 2017 at 1:53 PM, Ryan Abernathey wrote: > *@rabernat* commented on this pull request.
> ------------------------------ > > In xarray/backends/common.py > : > > > @@ -184,7 +185,7 @@ def sync(self): > import dask.array as da > import dask > if LooseVersion(dask.__version__) > LooseVersion('0.8.1'): > - da.store(self.sources, self.targets, lock=GLOBAL_LOCK) > + da.store(self.sources, self.targets, lock=self.lock) > > There is no reason that a task run on the distributed system will not show > up on the dashboard. My first guess is that somehow you're using a local > scheduler. > > I was not using a local scheduler. After digging further, I can see the > tasks on the distributed dashboard using a regular zarr.DirectoryStore, > but not when I pass a gcsfs.mapping.GCSMap to to_zarr. Is there any > reasons these two should behave differently?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-349488598,https://api.github.com/repos/pydata/xarray/issues/1528,349488598,MDEyOklzc3VlQ29tbWVudDM0OTQ4ODU5OA==,306380,2017-12-06T00:30:21Z,2017-12-06T00:30:21Z,MEMBER,We tried this out on a cloud-deployed cluster on GCE and things worked pleasantly. Some conversation here: https://github.com/pangeo-data/pangeo/issues/19,"{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 1, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-347983854,https://api.github.com/repos/pydata/xarray/issues/1528,347983854,MDEyOklzc3VlQ29tbWVudDM0Nzk4Mzg1NA==,306380,2017-11-29T20:19:37Z,2017-11-29T20:19:37Z,MEMBER,"> FWIW I think the best option at the moment is to make sure you add either Pickle or MsgPack filter for any zarr array with an object dtype.
Is it possible to add one of these filters to XArray's default use of Zarr?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-347981682,https://api.github.com/repos/pydata/xarray/issues/1528,347981682,MDEyOklzc3VlQ29tbWVudDM0Nzk4MTY4Mg==,306380,2017-11-29T20:11:25Z,2017-11-29T20:11:25Z,MEMBER,FWIW my vote is for msgpack over pickle for both performance and cross-language reasons,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-345778844,https://api.github.com/repos/pydata/xarray/issues/1528,345778844,MDEyOklzc3VlQ29tbWVudDM0NTc3ODg0NA==,306380,2017-11-20T18:05:25Z,2017-11-20T18:05:25Z,MEMBER,"> This is, of course, by design :)

It's so nice when well-designed things come together and just work as planned :)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-345575240,https://api.github.com/repos/pydata/xarray/issues/1528,345575240,MDEyOklzc3VlQ29tbWVudDM0NTU3NTI0MA==,306380,2017-11-20T02:28:07Z,2017-11-20T02:28:07Z,MEMBER,"That is, indeed, quite exciting. Also exciting is that I was able to look at and compute on your data easily.

```python
In [1]: import zarr
In [2]: import gcsfs
In [3]: fs = gcsfs.GCSFileSystem(project='pangeo-181919')
In [4]: gcsmap = gcsfs.mapping.GCSMap('zarr_store_test', gcs=fs, check=True, create=False)
In [5]: import xarray as xr
In [6]: ds_gcs = xr.open_zarr(gcsmap, mode='r')
In [7]: ds_gcs
Out[7]:
Dimensions: (x: 200, y: 100)
Coordinates:
  * x (x) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ...
  * y (y) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ...
Data variables:
    bar (x) float64 dask.array
    foo (y, x) float32 dask.array
Attributes:
    array_atr: [1, 2]
    some_attr: copana
In [8]: ds_gcs.sum()
Out[8]:
Dimensions: ()
Data variables:
    bar float64 dask.array
    foo float32 dask.array
In [9]: ds_gcs.sum().compute()
Out[9]:
Dimensions: ()
Data variables:
    bar float64 0.0
    foo float32 20000.0
```","{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 1, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-345104713,https://api.github.com/repos/pydata/xarray/issues/1528,345104713,MDEyOklzc3VlQ29tbWVudDM0NTEwNDcxMw==,306380,2017-11-17T00:12:01Z,2017-11-17T00:12:01Z,MEMBER,Hooray for standard interfaces!,"{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 1, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694
https://github.com/pydata/xarray/pull/1528#issuecomment-345101150,https://api.github.com/repos/pydata/xarray/issues/1528,345101150,MDEyOklzc3VlQ29tbWVudDM0NTEwMTE1MA==,306380,2017-11-16T23:52:48Z,2017-11-16T23:52:48Z,MEMBER,"The gcsfs library also provides a MutableMapping for Google Cloud Storage. The dask.distributed library now also provides a distributed lock for synchronization, if necessary, though in practice we should just rechunk the dask.array before writing.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253136694