issue_comments

12 rows where issue = 421029352 sorted by updated_at descending

user 3

  • tasansal 5
  • rabernat 4
  • dcherian 3

author_association 2

  • MEMBER 7
  • NONE 5

issue 1

  • expose zarr caching from xarray · 12 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1246120311 https://github.com/pydata/xarray/issues/2812#issuecomment-1246120311 https://api.github.com/repos/pydata/xarray/issues/2812 IC_kwDOAMm_X85KRkl3 dcherian 2448579 2022-09-14T01:33:03Z 2022-09-14T01:33:03Z MEMBER

docs.xarray.dev/en/stable/user-guide/io.html seems great to me.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  expose zarr caching from xarray 421029352
1246009657 https://github.com/pydata/xarray/issues/2812#issuecomment-1246009657 https://api.github.com/repos/pydata/xarray/issues/2812 IC_kwDOAMm_X85KRJk5 tasansal 13684161 2022-09-13T22:24:59Z 2022-09-13T22:24:59Z NONE

@dcherian, I will start a PR. Where do you think this belongs in the docs? Some places I can think of:

  • Examples section https://docs.xarray.dev/en/stable/generated/xarray.open_dataset.html
  • https://docs.xarray.dev/en/stable/user-guide/io.html
  • FAQ? https://docs.xarray.dev/en/stable/getting-started-guide/faq.html
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  expose zarr caching from xarray 421029352
1246007312 https://github.com/pydata/xarray/issues/2812#issuecomment-1246007312 https://api.github.com/repos/pydata/xarray/issues/2812 IC_kwDOAMm_X85KRJAQ tasansal 13684161 2022-09-13T22:20:57Z 2022-09-13T22:20:57Z NONE

I couldn't get open_zarr to open without Daskifying arrays. open_dataset(..., engine="zarr") does open without Daskifying when you haven't passed chunks.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  expose zarr caching from xarray 421029352
1246005938 https://github.com/pydata/xarray/issues/2812#issuecomment-1246005938 https://api.github.com/repos/pydata/xarray/issues/2812 IC_kwDOAMm_X85KRIqy rabernat 1197350 2022-09-13T22:18:31Z 2022-09-13T22:18:31Z MEMBER

Glad you got it working! So you're saying it does not work with open_zarr and does work with open_dataset(...engine='zarr')? Weird. We should deprecate open_zarr.

> However, the behavior in Dask is strange. I think it is making each worker have its own cache and blowing up memory if I ask for a large cache.

Yes, I think I experienced that as well. I think the entire cache is serialized and passed around between workers.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  expose zarr caching from xarray 421029352
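
The memory concern raised in the comment above can be illustrated with a toy sketch. It assumes, as the commenters only suspect, that a warm cache is serialized and copied to each worker; whether Dask or Zarr actually behave this way is not confirmed in this thread. A plain `dict` stands in for the cache, and `worker_a` and `worker_b` are hypothetical names for the copies a worker would receive.

```python
import pickle

# A plain dict stands in for a warm chunk cache.
# Assumption: this is NOT Zarr's LRUStoreCache, only a stand-in for any
# cache object whose contents travel with it when it is pickled.
cache = {}
empty_payload = len(pickle.dumps(cache))

for i in range(4):
    cache[f"chunk-{i}"] = bytes(1_000_000)  # pretend each chunk is ~1 MB

# Once the cache is warm, every serialization carries all cached bytes.
warm_payload = len(pickle.dumps(cache))

# Simulate shipping the cache to two workers: each gets an independent copy.
worker_a = pickle.loads(pickle.dumps(cache))
worker_b = pickle.loads(pickle.dumps(cache))

# A chunk cached on one worker is invisible to the other.
worker_a["chunk-extra"] = b"new"
seen_by_b = "chunk-extra" in worker_b

# Total memory held across the original and both copies.
total_cached_bytes = sum(
    len(value) for copy in (cache, worker_a, worker_b) for value in copy.values()
)
print(empty_payload, warm_payload, seen_by_b, total_cached_bytes)
```

If the suspicion is right, a cache sized for one process would be multiplied by the number of workers, and hits on one worker would not help the others.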
1246004791 https://github.com/pydata/xarray/issues/2812#issuecomment-1246004791 https://api.github.com/repos/pydata/xarray/issues/2812 IC_kwDOAMm_X85KRIY3 dcherian 2448579 2022-09-13T22:16:33Z 2022-09-13T22:16:33Z MEMBER

@tasansal a PR would be very welcome!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  expose zarr caching from xarray 421029352
1245989599 https://github.com/pydata/xarray/issues/2812#issuecomment-1245989599 https://api.github.com/repos/pydata/xarray/issues/2812 IC_kwDOAMm_X85KRErf tasansal 13684161 2022-09-13T21:52:45Z 2022-09-13T21:52:45Z NONE

@rabernat

Following up on my previous comment: yes, it does work with the Zarr backend! I agree with @dcherian, we should add this to the docs.

However, the behavior in Dask is strange. I think it is making each worker have its own cache and blowing up memory if I ask for a large cache.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  expose zarr caching from xarray 421029352
1245417352 https://github.com/pydata/xarray/issues/2812#issuecomment-1245417352 https://api.github.com/repos/pydata/xarray/issues/2812 IC_kwDOAMm_X85KO4-I tasansal 13684161 2022-09-13T13:30:08Z 2022-09-13T13:58:55Z NONE

@rabernat, yes, I have tried that like this:

```python
from zarr.storage import FSStore, LRUStoreCache
import xarray as xr

path = "gs://prefix/object.zarr"

store_nocache = FSStore(path)
store_cached = LRUStoreCache(store_nocache, max_size=2**30)

ds = xr.open_zarr(store_cached)
```

When I read the same data twice, it still downloads. Am I doing something wrong?

While I wait for a response, I will try it again and update if it works, but the last time I checked, it didn't.

Note to self: I also need to check it with Zarr backend and Dask backend.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  expose zarr caching from xarray 421029352
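
One way to check whether a repeated read still reaches the backing store, as described in the comment above, is to count reads on the store underneath the cache. The sketch below uses only the standard library. `CountingStore` and `TinyLRUCache` are hypothetical names invented for illustration, and `TinyLRUCache` is a toy stand-in, not Zarr's `LRUStoreCache`.

```python
from collections import OrderedDict
from collections.abc import MutableMapping


class CountingStore(MutableMapping):
    """Hypothetical wrapper that counts value reads reaching the wrapped store."""

    def __init__(self, inner):
        self.inner = inner
        self.reads = 0

    def __getitem__(self, key):
        self.reads += 1
        return self.inner[key]

    def __setitem__(self, key, value):
        self.inner[key] = value

    def __delitem__(self, key):
        del self.inner[key]

    def __contains__(self, key):
        # Delegate so membership tests are not counted as reads.
        return key in self.inner

    def __iter__(self):
        return iter(self.inner)

    def __len__(self):
        return len(self.inner)


class TinyLRUCache:
    """Toy read-through LRU cache bounded by total bytes (illustration only)."""

    def __init__(self, store, max_size):
        self.store = store
        self.max_size = max_size
        self.current_size = 0
        self._values = OrderedDict()

    def __getitem__(self, key):
        if key in self._values:
            self._values.move_to_end(key)  # mark as most recently used
            return self._values[key]
        value = self.store[key]  # cache miss: read from the backing store
        if len(value) <= self.max_size:
            self._values[key] = value
            self.current_size += len(value)
            while self.current_size > self.max_size:
                _, evicted = self._values.popitem(last=False)  # drop oldest
                self.current_size -= len(evicted)
        return value


backing = CountingStore({"x/0.0": b"a" * 100, "x/0.1": b"b" * 100})
cached = TinyLRUCache(backing, max_size=150)

cached["x/0.0"]
cached["x/0.0"]  # served from the cache
reads_after_repeat = backing.reads

cached["x/0.1"]  # 200 bytes would exceed max_size, so "x/0.0" is evicted
cached["x/0.0"]  # a miss again
reads_after_eviction = backing.reads
print(reads_after_repeat, reads_after_eviction)
```

If the read counter keeps growing on repeated reads of the same keys, either the cache is being bypassed or it is too small to hold the working set.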
1243935545 https://github.com/pydata/xarray/issues/2812#issuecomment-1243935545 https://api.github.com/repos/pydata/xarray/issues/2812 IC_kwDOAMm_X85KJPM5 dcherian 2448579 2022-09-12T15:46:57Z 2022-09-12T15:46:57Z MEMBER

> You just have to initialize the Store object outside of Xarray and then pass it to open_zarr or open_dataset(store, engine="zarr").

This would be good to document!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  expose zarr caching from xarray 421029352
1243823078 https://github.com/pydata/xarray/issues/2812#issuecomment-1243823078 https://api.github.com/repos/pydata/xarray/issues/2812 IC_kwDOAMm_X85KIzvm rabernat 1197350 2022-09-12T14:25:39Z 2022-09-12T14:25:39Z MEMBER

I have successfully used the Zarr LRU cache with Xarray. You just have to initialize the Store object outside of Xarray and then pass it to open_zarr or open_dataset(store, engine="zarr").

Have you tried that?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  expose zarr caching from xarray 421029352
1243814673 https://github.com/pydata/xarray/issues/2812#issuecomment-1243814673 https://api.github.com/repos/pydata/xarray/issues/2812 IC_kwDOAMm_X85KIxsR tasansal 13684161 2022-09-12T14:20:01Z 2022-09-12T14:20:01Z NONE

Hi @rabernat, I looked at your PRs, and they don't seem to have gotten much attention.

I tried using a store with LRUCache in open_zarr, but it appears to ignore the cache.

For our use cases in https://github.com/TGSAI/mdio-python, we usually want to use any form of LRUCache (it doesn't have to be Zarr's necessarily).

  • Do you know of a hack to make this work?
  • What can we do to help and start working on this?
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  expose zarr caching from xarray 421029352
472905515 https://github.com/pydata/xarray/issues/2812#issuecomment-472905515 https://api.github.com/repos/pydata/xarray/issues/2812 MDEyOklzc3VlQ29tbWVudDQ3MjkwNTUxNQ== rabernat 1197350 2019-03-14T15:02:22Z 2019-03-14T15:02:22Z MEMBER

I have created two PRs which attempt to provide zarr caching in different ways. I would welcome some advice on which one is a better approach.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  expose zarr caching from xarray 421029352
472871184 https://github.com/pydata/xarray/issues/2812#issuecomment-472871184 https://api.github.com/repos/pydata/xarray/issues/2812 MDEyOklzc3VlQ29tbWVudDQ3Mjg3MTE4NA== rabernat 1197350 2019-03-14T14:07:03Z 2019-03-14T14:07:03Z MEMBER

Or should we use xarray's own caching mechanism?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  expose zarr caching from xarray 421029352

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 390.74ms · About: xarray-datasette