issue_comments

11 rows where issue = 617476316 sorted by updated_at descending


Columns: id, html_url, issue_url, node_id, user, created_at, updated_at ▲, author_association, body, reactions, performed_via_github_app, issue
628816425 https://github.com/pydata/xarray/issues/4055#issuecomment-628816425 https://api.github.com/repos/pydata/xarray/issues/4055 MDEyOklzc3VlQ29tbWVudDYyODgxNjQyNQ== shoyer 1217238 2020-05-14T18:37:40Z 2020-05-14T18:37:40Z MEMBER

If we think we can improve an error message by adding additional context, the right solution is to use raise Exception(...) from original_error: https://stackoverflow.com/a/16414892/809705

On the other hand, if xarray doesn't have anything more to add on top of the original error message, it is best not to add any wrapper at all. Users will just see the original error from dask.
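A minimal, self-contained sketch of the raise ... from idiom described above (rechunk and chunk_variable are illustrative names, not actual xarray or dask API):

```python
def rechunk(var):
    # stand-in for the dask call that rejects object dtype
    raise NotImplementedError("Can not use auto rechunking with object dtype.")

def chunk_variable(var):
    try:
        return rechunk(var)
    except NotImplementedError as original_error:
        # "raise ... from" chains the exceptions: the dask error is kept
        # as __cause__, so users still see the original traceback
        raise ValueError(
            "Automatic chunking fails for object arrays, e.g. cftime coordinates."
        ) from original_error
```

With chaining, the printed traceback shows both errors, joined by "The above exception was the direct cause of the following exception".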

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Automatic chunking of arrays ? 617476316
628797255 https://github.com/pydata/xarray/issues/4055#issuecomment-628797255 https://api.github.com/repos/pydata/xarray/issues/4055 MDEyOklzc3VlQ29tbWVudDYyODc5NzI1NQ== AndrewILWilliams 56925856 2020-05-14T18:01:45Z 2020-05-14T18:01:45Z CONTRIBUTOR

I also thought that; after the dask error message, it's pretty easy to look at the dataset and check which dimension is the problem.

In general though, is that the kind of layout you'd suggest for catching and re-raising errors? Using raise Exception()?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
628616379 https://github.com/pydata/xarray/issues/4055#issuecomment-628616379 https://api.github.com/repos/pydata/xarray/issues/4055 MDEyOklzc3VlQ29tbWVudDYyODYxNjM3OQ== AndrewILWilliams 56925856 2020-05-14T12:57:21Z 2020-05-14T17:50:31Z CONTRIBUTOR

Nice, that's neater! Would this work in the maybe_chunk() call? Sorry about the basic questions!

```python
def maybe_chunk(name, var, chunks):
    chunks = selkeys(chunks, var.dims)
    if not chunks:
        chunks = None
    if var.ndim > 0:
        # when rechunking by different amounts, make sure dask names change
        # by providing chunks as an input to tokenize.
        # subtle bugs result otherwise. see GH3350
        token2 = tokenize(name, token if token else var._data, chunks)
        name2 = f"{name_prefix}{name}-{token2}"
        try:
            return var.chunk(chunks, name=name2, lock=lock)
        except NotImplementedError as err:
            raise Exception(
                "Automatic chunking fails for object arrays. "
                "These include cftime DataArrays."
            )
    else:
        return var
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
628747933 https://github.com/pydata/xarray/issues/4055#issuecomment-628747933 https://api.github.com/repos/pydata/xarray/issues/4055 MDEyOklzc3VlQ29tbWVudDYyODc0NzkzMw== shoyer 1217238 2020-05-14T16:31:39Z 2020-05-14T16:31:39Z MEMBER

The error message from dask is already pretty descriptive: "NotImplementedError: Can not use auto rechunking with object dtype. We are unable to estimate the size in bytes of object data"

I don't think we have much to add on top of that?
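The size-estimation limitation comes from how object arrays are stored: a quick illustration with NumPy (assuming numpy is installed) shows that an object-dtype array only records pointers, so its itemsize says nothing about the payloads:

```python
import numpy as np

# object-dtype arrays hold references to arbitrary Python objects,
# so there is no fixed per-element payload size to estimate
a = np.array(["a long string", 12345, None], dtype=object)

print(a.dtype)     # object
print(a.itemsize)  # pointer width in bytes (platform-dependent), not payload size
```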

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
628582552 https://github.com/pydata/xarray/issues/4055#issuecomment-628582552 https://api.github.com/repos/pydata/xarray/issues/4055 MDEyOklzc3VlQ29tbWVudDYyODU4MjU1Mg== dcherian 2448579 2020-05-14T11:51:21Z 2020-05-14T11:51:21Z MEMBER

is_scalar(chunks) might be the appropriate condition. is_scalar is already imported from .utils in dataset.py

> This seems to work fine in a lot of cases, except automatic chunking isn't implemented for object dtypes at the moment, so it fails if you pass a cftime coordinate, for example.

Can we catch this error and re-raise specifying "automatic chunking fails for object arrays. These include cftime DataArrays" or something like that?
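xarray's is_scalar helper lives in xarray.core.utils; a simplified stand-in to show the idea (the real implementation covers more cases):

```python
from numbers import Number

def is_scalar(value):
    # simplified stand-in for xarray.core.utils.is_scalar:
    # a chunk spec is "scalar" if it applies uniformly to all dims,
    # e.g. a bare number, a string like "auto", or None
    return value is None or isinstance(value, (str, Number))
```

Under this sketch, is_scalar("auto") and is_scalar(5) are True, while a per-dimension mapping like {"time": 10} is not scalar.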

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
628513777 https://github.com/pydata/xarray/issues/4055#issuecomment-628513777 https://api.github.com/repos/pydata/xarray/issues/4055 MDEyOklzc3VlQ29tbWVudDYyODUxMzc3Nw== AndrewILWilliams 56925856 2020-05-14T09:26:24Z 2020-05-14T09:26:24Z CONTRIBUTOR

Also, the contributing docs have been super clear so far! Thanks! :)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
628513443 https://github.com/pydata/xarray/issues/4055#issuecomment-628513443 https://api.github.com/repos/pydata/xarray/issues/4055 MDEyOklzc3VlQ29tbWVudDYyODUxMzQ0Mw== AndrewILWilliams 56925856 2020-05-14T09:25:48Z 2020-05-14T09:25:48Z CONTRIBUTOR

Cheers! Just had a look. Is it as simple as changing this line to the following, @dcherian?

```python
if isinstance(chunks, Number) or chunks == "auto":
    chunks = dict.fromkeys(self.dims, chunks)
```

This seems to work fine in a lot of cases, except automatic chunking isn't implemented for object dtypes at the moment, so it fails if you pass a cftime coordinate, for example.

One option is to automatically apply self = xr.decode_cf(self) if the input dataset is cftime? Or we could just throw an error.
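The proposed condition expands a scalar chunk spec to every dimension of the dataset; a self-contained sketch (the dims tuple is illustrative, standing in for self.dims):

```python
from numbers import Number

dims = ("time", "lat", "lon")  # hypothetical dataset dimensions
chunks = "auto"

# proposed condition: treat the string "auto" like a bare number,
# expanding it to a per-dimension mapping
if isinstance(chunks, Number) or chunks == "auto":
    chunks = dict.fromkeys(dims, chunks)

print(chunks)  # {'time': 'auto', 'lat': 'auto', 'lon': 'auto'}
```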

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
628319690 https://github.com/pydata/xarray/issues/4055#issuecomment-628319690 https://api.github.com/repos/pydata/xarray/issues/4055 MDEyOklzc3VlQ29tbWVudDYyODMxOTY5MA== shoyer 1217238 2020-05-14T00:43:22Z 2020-05-14T00:43:22Z MEMBER

Agreed, this would be very welcome!

chunks='auto' isn't supported only because xarray support for dask predates it :)

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
628231949 https://github.com/pydata/xarray/issues/4055#issuecomment-628231949 https://api.github.com/repos/pydata/xarray/issues/4055 MDEyOklzc3VlQ29tbWVudDYyODIzMTk0OQ== dcherian 2448579 2020-05-13T20:35:49Z 2020-05-13T20:35:49Z MEMBER

Awesome! Please see https://xarray.pydata.org/en/stable/contributing.html for docs on contributing

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
628212516 https://github.com/pydata/xarray/issues/4055#issuecomment-628212516 https://api.github.com/repos/pydata/xarray/issues/4055 MDEyOklzc3VlQ29tbWVudDYyODIxMjUxNg== AndrewILWilliams 56925856 2020-05-13T19:56:34Z 2020-05-13T19:56:34Z CONTRIBUTOR

Oh ok, I didn't know about this. I'll take a look and read the contribution docs tomorrow! It'll be my first PR, so I may need a bit of hand-holding when it comes to tests. Willing to try though!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
628065564 https://github.com/pydata/xarray/issues/4055#issuecomment-628065564 https://api.github.com/repos/pydata/xarray/issues/4055 MDEyOklzc3VlQ29tbWVudDYyODA2NTU2NA== dcherian 2448579 2020-05-13T15:26:10Z 2020-05-13T15:26:49Z MEMBER

So da.chunk({dim_name: "auto"}) works but da.chunk("auto") does not. The latter is a relatively easy fix: we just need to update the condition here: https://github.com/pydata/xarray/blob/bd84186acbd84bd386134a5b60111596cee2d8ec/xarray/core/dataset.py#L1736-L1737

A PR would be very welcome if you have the time, @AndrewWilliams3142

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 13.761ms · About: xarray-datasette