issue_comments

4 rows where issue = 1588516592 sorted by updated_at descending

user 3

  • rabernat 2
  • dcherian 1
  • JMorado 1

author_association 2

  • MEMBER 3
  • NONE 1

issue 1

  • added 'storage_transformers' to valid_encodings · 4
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1460504638 https://github.com/pydata/xarray/pull/7540#issuecomment-1460504638 https://api.github.com/repos/pydata/xarray/issues/7540 IC_kwDOAMm_X85XDYg- dcherian 2448579 2023-03-08T17:00:14Z 2023-03-08T17:00:14Z MEMBER

Does it make sense to create a new backend in a new project to enable experimentation?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  added 'storage_transformers' to valid_encodings 1588516592
1460182260 https://github.com/pydata/xarray/pull/7540#issuecomment-1460182260 https://api.github.com/repos/pydata/xarray/issues/7540 IC_kwDOAMm_X85XCJz0 rabernat 1197350 2023-03-08T13:48:51Z 2023-03-08T13:49:21Z MEMBER

Regarding locks, I think we need to think hard about the best way to deal with this across the stack. There are a couple of different options:

  • Current status: just use a global lock on the entire array--super inefficient
  • A bit better: use per-variable locks
  • Even better: have locks at the shard level. This would allow concurrent writing of shards
  • Alternative which accomplishes the same thing: expose different virtual chunks when reading vs. writing. When writing, the writer library (e.g. Xarray or Dask) would see the shards as the chunks (with a lower layer of the stack handling breaking the shard down into chunks). When reading, the individual, smaller chunks would be accessible.
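
To make the last option concrete, here is a rough sketch of what "different virtual chunks when reading vs. writing" could look like for a small 2-D array. It is only an illustration under stated assumptions: `block_slices`, the array shape, the chunk shape and the 2×2 `chunks_per_shard` layout are hypothetical and are not part of the Xarray or Zarr API.

```python
from itertools import product
from math import ceil


def block_slices(shape, block_shape):
    """Yield one tuple of slices per block of `block_shape` tiling `shape`."""
    counts = [ceil(s / b) for s, b in zip(shape, block_shape)]
    for index in product(*(range(n) for n in counts)):
        yield tuple(
            slice(i * b, min((i + 1) * b, s))
            for i, b, s in zip(index, block_shape, shape)
        )


shape = (8, 8)             # hypothetical array shape
chunk_shape = (2, 2)       # granularity exposed to readers
chunks_per_shard = (2, 2)  # assumed sharding layout
shard_shape = tuple(c * n for c, n in zip(chunk_shape, chunks_per_shard))

# Readers see the small chunks; writers see whole shards as their "chunks".
read_blocks = list(block_slices(shape, chunk_shape))
write_blocks = list(block_slices(shape, shard_shape))
```

With this view each writer task owns a whole shard, so no two tasks touch the same shard file and no cross-task lock is required for correctness, while readers still address the smaller chunks.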

Note that there are still some deep inefficiencies in the way zarr-python writes shards (see https://github.com/zarr-developers/zarr-python/discussions/1338). I think we should be optimizing things at the Zarr level first, before implementing workarounds in Xarray.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  added 'storage_transformers' to valid_encodings 1588516592
1460175664 https://github.com/pydata/xarray/pull/7540#issuecomment-1460175664 https://api.github.com/repos/pydata/xarray/issues/7540 IC_kwDOAMm_X85XCIMw rabernat 1197350 2023-03-08T13:44:02Z 2023-03-08T13:44:02Z MEMBER

It's great to see this PR get started in Xarray! Thanks @JMorado!

From the perspective of a Zarr developer, the sharding feature is still highly experimental. The API may change significantly. While the sharding code is released in the sense that it is available deep in Zarr, it is not really considered part of the public API yet.

So perhaps it's a bit too early to be doing this?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  added 'storage_transformers' to valid_encodings 1588516592
1454850970 https://github.com/pydata/xarray/pull/7540#issuecomment-1454850970 https://api.github.com/repos/pydata/xarray/issues/7540 IC_kwDOAMm_X85Wt0Oa JMorado 7460993 2023-03-04T19:32:19Z 2023-03-04T19:32:19Z NONE

Hi @jhamman,

I have added a test and corrected a problem that I had previously missed. It turns out that a lock must be used to ensure the correct writing of a sharded zarr store. Everything seems to be working as expected now. Here are some comments:

  • Currently, a lock is passed to ArrayWriter even if only one variable is being sharded. While I believe that locks would not be needed to write non-sharded variables, I don't think it is possible to restrict their usage to specific variables, since the zarr store is written as a whole. Any thoughts on this?
  • Related to the previous point, I think the usage of locks could even be made shard-specific, so that different shard files could be written concurrently. I also believe this is not currently possible, but let me know if there is a solution.
  • For the test, importing ShardingStorageTransformer is currently a bit awkward, but it can be tidied up in the future when ShardingStorageTransformer becomes available from the top-level zarr namespace.
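
The shard-specific locking idea in the second bullet could be sketched roughly as follows. This is a minimal, standard-library-only illustration and not how xarray or zarr implement anything: `ShardLocks`, `shard_of`, `write_chunk`, the in-memory `store` dict and the 2×2 `chunks_per_shard` layout are all hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def shard_of(chunk_index, chunks_per_shard):
    """Map a chunk's grid index to the index of the shard that contains it."""
    return tuple(c // n for c, n in zip(chunk_index, chunks_per_shard))


class ShardLocks:
    """Hypothetical registry handing out one lock per shard, created lazily."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the registry itself
        self._locks = {}

    def get(self, shard_index):
        with self._guard:
            return self._locks.setdefault(shard_index, threading.Lock())


def write_chunk(store, locks, chunk_index, payload, chunks_per_shard):
    """Read-modify-write of one shard, serialized only against the same shard."""
    shard_index = shard_of(chunk_index, chunks_per_shard)
    with locks.get(shard_index):
        shard = dict(store.get(shard_index, {}))  # read the current shard
        shard[chunk_index] = payload              # modify one chunk
        store[shard_index] = shard                # write the shard back


store, locks = {}, ShardLocks()  # `store` stands in for a sharded zarr store
chunk_grid = [(i, j) for i in range(4) for j in range(4)]
with ThreadPoolExecutor(max_workers=8) as pool:
    futures = [
        pool.submit(write_chunk, store, locks, idx, sum(idx), (2, 2))
        for idx in chunk_grid
    ]
    for future in futures:
        future.result()  # re-raise any error from the worker threads
```

Writes that land in the same shard are serialized, while writes to different shards can proceed in parallel, which is the relaxation described above.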

Let me know what you think.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  added 'storage_transformers' to valid_encodings 1588516592

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
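
As a rough illustration of the query behind this page (rows where issue = 1588516592, sorted by updated_at descending), the sketch below loads the four comment rows shown above into an in-memory SQLite database. It assumes a trimmed version of the schema (only five columns, no foreign keys), so it is not the exact table definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Trimmed schema: an assumption for brevity, not the full table above.
conn.executescript(
    """
    CREATE TABLE [issue_comments] (
       [id] INTEGER PRIMARY KEY,
       [user] INTEGER,
       [updated_at] TEXT,
       [author_association] TEXT,
       [issue] INTEGER
    );
    CREATE INDEX [idx_issue_comments_issue]
        ON [issue_comments] ([issue]);
    """
)

# Values copied from the four rows listed on this page.
rows = [
    (1460504638, 2448579, "2023-03-08T17:00:14Z", "MEMBER", 1588516592),
    (1460182260, 1197350, "2023-03-08T13:49:21Z", "MEMBER", 1588516592),
    (1460175664, 1197350, "2023-03-08T13:44:02Z", "MEMBER", 1588516592),
    (1454850970, 7460993, "2023-03-04T19:32:19Z", "NONE", 1588516592),
]
conn.executemany("INSERT INTO issue_comments VALUES (?, ?, ?, ?, ?)", rows)

# ISO 8601 timestamps stored as TEXT sort correctly as plain strings.
result = conn.execute(
    "SELECT id, updated_at FROM issue_comments "
    "WHERE issue = ? ORDER BY updated_at DESC",
    (1588516592,),
).fetchall()
```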
Powered by Datasette · About: xarray-datasette