github: issue_comments: 7 rows where author_association = "MEMBER" and issue = 546562676 sorted by updated

7 rows where author_association = "MEMBER" and issue = 546562676 sorted by updated_at descending

id	html_url	issue_url	node_id	user	created_at	updated_at	author_association	body	reactions	issue
573910792	https://github.com/pydata/xarray/issues/3668#issuecomment-573910792	https://api.github.com/repos/pydata/xarray/issues/3668	MDEyOklzc3VlQ29tbWVudDU3MzkxMDc5Mg==	rabernat 1197350	2020-01-13T22:50:41Z	2020-01-13T22:50:48Z	MEMBER	It would be wonderful if we could translate this complex xarray issue into a minimally simple zarr issue. Then the zarr devs can decide whether this use case is compatible with the zarr spec or not.	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset: support for multiple zarr datasets 546562676
573509747	https://github.com/pydata/xarray/issues/3668#issuecomment-573509747	https://api.github.com/repos/pydata/xarray/issues/3668	MDEyOklzc3VlQ29tbWVudDU3MzUwOTc0Nw==	jhamman 2443309	2020-01-13T05:06:45Z	2020-01-13T05:06:45Z	MEMBER	@dmedv and @rabernat - after thinking about this a bit more and reviewing the links in the last post, I'm pretty sure we're bumping into a bug in zarr's directory store pickle support. It would be nice to confirm this with some zarr-only tests, but I don't see why the store needs to reference the zgroup files when the object is unpickled.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset: support for multiple zarr datasets 546562676
573197896	https://github.com/pydata/xarray/issues/3668#issuecomment-573197896	https://api.github.com/repos/pydata/xarray/issues/3668	MDEyOklzc3VlQ29tbWVudDU3MzE5Nzg5Ng==	jhamman 2443309	2020-01-10T20:43:30Z	2020-01-10T20:43:30Z	MEMBER	Also, @dmedv, can you add the output of `xr.show_versions()` to your original post?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset: support for multiple zarr datasets 546562676
573196874	https://github.com/pydata/xarray/issues/3668#issuecomment-573196874	https://api.github.com/repos/pydata/xarray/issues/3668	MDEyOklzc3VlQ29tbWVudDU3MzE5Njg3NA==	jhamman 2443309	2020-01-10T20:40:14Z	2020-01-10T20:40:14Z	MEMBER	"The scenario you are describing--trying to open a file that is not accessible at all from the client--is certainly not something we ever considered when designing this. It is a miracle to me that it does work with netCDF." True. I think it's fair to say that the behavior you are enjoying (accessing data that the client cannot see) is the exception, not the rule. I expect there are many places in our backends that will not support this functionality at present. The motivation for implementing the `parallel` feature was simply to shard the file I/O time when opening large collections (>10k) of netcdf files. Ironically, this dask issue (https://github.com/dask/dask/issues/5769) also popped up and has some significant overlap here. In both of these cases, the desire is for the worker to open the file (or zarr dataset), construct the underlying dask arrays, and return the meta object. This requires the object to be fully pickle-able and for any references to be maintained. It is possible, as indicated by your traceback, that the zarr backend is trying to reference the `zgroup` file and it's not there. The logical place to start would be to look into why we can't pickle xarray datasets that come from zarr stores.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset: support for multiple zarr datasets 546562676
572369966	https://github.com/pydata/xarray/issues/3668#issuecomment-572369966	https://api.github.com/repos/pydata/xarray/issues/3668	MDEyOklzc3VlQ29tbWVudDU3MjM2OTk2Ng==	rabernat 1197350	2020-01-09T03:42:23Z	2020-01-09T03:42:23Z	MEMBER	Thanks for these detailed reports! The scenario you are describing--trying to open a file that is not accessible at all from the client--is certainly not something we ever considered when designing this. It is a miracle to me that it does work with netCDF. I think you are on track with the serialization diagnostics. I believe that @jhamman has the best understanding of this topic. He implemented the parallel mode in `open_mfdataset`. Perhaps he can give some suggestions. In the meantime, it seems worth asking the obvious question...how hard would it be to mount the NFS volume on the client? That would avoid having to go down this route.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset: support for multiple zarr datasets 546562676
572205386	https://github.com/pydata/xarray/issues/3668#issuecomment-572205386	https://api.github.com/repos/pydata/xarray/issues/3668	MDEyOklzc3VlQ29tbWVudDU3MjIwNTM4Ng==	rabernat 1197350	2020-01-08T18:51:06Z	2020-01-08T18:51:06Z	MEMBER	Hi @dmedv -- thanks a lot for raising this issue here! One clarification question: is there just a single zarr store you are trying to read? Or are you trying to combine multiple stores, like `open_mfdataset` does with multiple netcdf files? Some of the data is only available on the workers, not on the client. Can you provide more detail about how the zarr data is distributed across the different workers and client?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset: support for multiple zarr datasets 546562676
572196698	https://github.com/pydata/xarray/issues/3668#issuecomment-572196698	https://api.github.com/repos/pydata/xarray/issues/3668	MDEyOklzc3VlQ29tbWVudDU3MjE5NjY5OA==	dcherian 2448579	2020-01-08T18:28:57Z	2020-01-08T18:28:57Z	MEMBER	You can use the pseudocode here: https://xarray.pydata.org/en/stable/io.html#reading-multi-file-datasets and change `open_dataset` to `open_zarr` and then things should work (if I understand things correctly)	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset: support for multiple zarr datasets 546562676
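
The failure mode discussed in the comments above (an object pickled where the data is visible, then unpickled where it is not) can be illustrated with a toy sketch. `ToyDirectoryStore` and `demo` are hypothetical names invented for this illustration; the class is not zarr's directory store and does not reproduce its internals. It only assumes, as the thread speculates, that a store might look for its `.zgroup` file again when it is restored.

```python
import os
import pickle
import tempfile


class ToyDirectoryStore:
    """Hypothetical stand-in for a directory-backed store (not zarr's API)."""

    def __init__(self, path):
        self.path = path
        self._check_metadata()

    def _check_metadata(self):
        # Assumption for illustration: the store insists on seeing a
        # ".zgroup" marker file whenever it is constructed or restored.
        marker = os.path.join(self.path, ".zgroup")
        if not os.path.exists(marker):
            raise FileNotFoundError(marker)

    def __getstate__(self):
        # Only the path travels in the pickle payload.
        return {"path": self.path}

    def __setstate__(self, state):
        self.path = state["path"]
        # Touching the filesystem here is what breaks unpickling on a
        # machine that cannot see the path.
        self._check_metadata()


def demo():
    """Return (works_where_visible, works_where_hidden)."""
    with tempfile.TemporaryDirectory() as root:
        open(os.path.join(root, ".zgroup"), "w").close()
        payload = pickle.dumps(ToyDirectoryStore(root))
        # "Worker" side: the directory exists, so the round trip succeeds.
        visible_ok = pickle.loads(payload).path == root
    # "Client" side: the directory is gone, standing in for an unmounted volume.
    try:
        pickle.loads(payload)
        hidden_ok = True
    except FileNotFoundError:
        hidden_ok = False
    return visible_ok, hidden_ok


if __name__ == "__main__":
    print(demo())
```

A store whose `__setstate__` only recorded the path and deferred filesystem access until data was actually read would survive the second round trip, which is the behavior the thread hopes for.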

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
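
As a rough sketch, the filter and ordering described in the page title can be reproduced against this schema with Python's built-in `sqlite3` module. The table below is a trimmed copy of the schema (foreign-key clauses and unused columns omitted), the three MEMBER rows reuse ids and timestamps from the table above, and the fourth row (id `1`, `NONE`) is a made-up placeholder added only to show the filter excluding something.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE [issue_comments] (
        [id] INTEGER PRIMARY KEY,
        [updated_at] TEXT,
        [author_association] TEXT,
        [issue] INTEGER
    )
    """
)
conn.execute("CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue])")

conn.executemany(
    "INSERT INTO issue_comments (id, updated_at, author_association, issue) "
    "VALUES (?, ?, ?, ?)",
    [
        (573197896, "2020-01-10T20:43:30Z", "MEMBER", 546562676),
        (573910792, "2020-01-13T22:50:48Z", "MEMBER", 546562676),
        (573509747, "2020-01-13T05:06:45Z", "MEMBER", 546562676),
        # Hypothetical non-member row, present only to exercise the filter.
        (1, "2020-01-14T00:00:00Z", "NONE", 546562676),
    ],
)

# ISO 8601 timestamps stored as TEXT sort correctly as plain strings.
rows = conn.execute(
    "SELECT id, updated_at FROM issue_comments "
    "WHERE author_association = ? AND issue = ? "
    "ORDER BY updated_at DESC",
    ("MEMBER", 546562676),
).fetchall()

for comment_id, updated_at in rows:
    print(comment_id, updated_at)
```

The index on `[issue]` lets SQLite narrow the scan to a single issue before applying the `author_association` filter and the sort.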