issue_comments


7 rows where user = 460756 sorted by updated_at descending

id 965595392 · JackKelly (460756) · author_association NONE
created 2021-11-10T17:56:41Z · updated 2021-11-10T17:57:17Z
https://github.com/pydata/xarray/issues/3942#issuecomment-965595392 · issue_url: https://api.github.com/repos/pydata/xarray/issues/3942 · node_id: IC_kwDOAMm_X845jdEA

Cool, I agree that an error and a documentation change are likely to be sufficient :slightly_smiling_face: (and I'd be keen to write a PR to help out!)

But before we commit to that path, may I ask: why not have xarray default to encoding time as 'units': 'nanoseconds since 1970-01-01', for consistency with np.datetime64[ns]? Sorry if I've missed something obvious!

reactions: {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
issue: Time dtype encoding defaulting to `int64` when writing netcdf or zarr (595492608)
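The workaround this thread converges on — specifying the time encoding explicitly instead of relying on the automatically chosen units — can be sketched as below. The store path is hypothetical, and the `to_zarr` call is left commented since it needs a real Dataset; this is a sketch, not xarray's default behavior.

```python
# A sketch (hypothetical store path): pin the time encoding explicitly so it
# matches np.datetime64[ns], instead of relying on automatically chosen units.
encoding = {
    "time": {
        "units": "nanoseconds since 1970-01-01",  # consistent with datetime64[ns]
        "dtype": "int64",
    }
}
# Pass it on the *initial* write only -- appends reuse the encoding stored in the Zarr:
# ds.to_zarr("store.zarr", encoding=encoding)
```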
id 965562434 · JackKelly (460756) · author_association NONE
created 2021-11-10T17:17:29Z · updated 2021-11-10T17:49:22Z
https://github.com/pydata/xarray/issues/3942#issuecomment-965562434 · issue_url: https://api.github.com/repos/pydata/xarray/issues/3942 · node_id: IC_kwDOAMm_X845jVBC

I think I've bumped into a symptom of this issue (mine is described in #5969), and #3379 may be another symptom of it.

Perhaps I'm biased (because I work with time series that only span a few years), but I wonder if xarray should default to encoding time as 'units': 'nanoseconds since 1970-01-01' (to be consistent with np.datetime64[ns]) unless the time series includes dates before the year 1677 or after the year 2262 :slightly_smiling_face:? Would that work?

If that's no good, then let's definitely add a note to the documentation to say that it might be a good idea for users to manually specify the encoding for datetimes if they wish to append to Zarrs.

reactions: {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
issue: Time dtype encoding defaulting to `int64` when writing netcdf or zarr (595492608)
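The 1677/2262 bounds mentioned in the comment above follow directly from storing nanoseconds since 1970-01-01 in a signed 64-bit integer; a quick stdlib check of that arithmetic:

```python
# np.datetime64[ns] stores nanoseconds since 1970-01-01 in a signed int64,
# so the representable span on either side of 1970 is:
ns_range = 2**63 - 1          # largest representable nanosecond offset
seconds = ns_range / 1e9
years = seconds / (365.25 * 24 * 3600)
# roughly +/- 292 years around 1970, i.e. approximately 1677..2262
assert 291 < years < 293
```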
id 965543913 · JackKelly (460756) · author_association NONE
created 2021-11-10T16:56:58Z · updated 2021-11-10T16:56:58Z
https://github.com/pydata/xarray/issues/3379#issuecomment-965543913 · issue_url: https://api.github.com/repos/pydata/xarray/issues/3379 · node_id: IC_kwDOAMm_X845jQfp

I think the underlying issue might be #3942.

reactions: {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
issue: `ds.to_zarr(mode="a", append_dim="time")` not capturing any time steps under Hours (503583044)
id 938691112 · JackKelly (460756) · author_association NONE
created 2021-10-08T14:32:44Z · updated 2021-10-08T14:35:46Z
https://github.com/pydata/xarray/issues/1900#issuecomment-938691112 · issue_url: https://api.github.com/repos/pydata/xarray/issues/1900 · node_id: IC_kwDOAMm_X84380oo

OK, I think pandera isn't the way forward, because it appears very tightly coupled to Pandas (so, for example, I don't think it's possible to use pandera with n-dimensional arrays).

But Pydantic looks promising. Here's a very quick coding experiment showing one way to use pydantic with xarray... it validates a few things; but it's not super-useful as a human-readable specification for what's going on inside a DataArray or Dataset.

reactions: {"total_count": 2, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 1}
issue: Representing & checking Dataset schemas (295959111)
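The pydantic experiment itself isn't reproduced on this page. As a library-free sketch of the same idea — a declared schema checked against a Dataset's per-variable dims and dtypes — with hypothetical names and plain dicts standing in for an `xr.Dataset`:

```python
# Library-free sketch of the schema-checking idea (hypothetical names;
# plain dicts stand in for a Dataset's per-variable dims and dtypes).
SCHEMA = {
    "temperature": {"dims": ("time", "x", "y"), "dtype": "float32"},
}

def validate(dataset_vars: dict) -> list:
    """Return a list of schema violations (empty means valid)."""
    errors = []
    for name, spec in SCHEMA.items():
        if name not in dataset_vars:
            errors.append(f"missing variable: {name}")
            continue
        var = dataset_vars[name]
        if tuple(var["dims"]) != spec["dims"]:
            errors.append(f"{name}: dims {var['dims']} != {spec['dims']}")
        if var["dtype"] != spec["dtype"]:
            errors.append(f"{name}: dtype {var['dtype']} != {spec['dtype']}")
    return errors
```

A real implementation would read `ds[name].dims` and `ds[name].dtype` instead of dict lookups; the checking logic is the same.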
id 938397801 · JackKelly (460756) · author_association NONE
created 2021-10-08T07:04:51Z · updated 2021-10-08T07:04:51Z
https://github.com/pydata/xarray/issues/1900#issuecomment-938397801 · issue_url: https://api.github.com/repos/pydata/xarray/issues/1900 · node_id: IC_kwDOAMm_X8437tBp

I'm really interested in a machine-readable schema for xarray!

Pandera provides machine-readable schemas for Pandas and, as of version 0.7, pandera has decoupled its types from pandas types to make it more useful for things like xarray. I haven't tried pandera yet, but I plan to do some experiments soon.

reactions: {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
issue: Representing & checking Dataset schemas (295959111)
id 636641496 · JackKelly (460756) · author_association NONE
created 2020-06-01T06:37:08Z · updated 2020-06-01T06:37:08Z
https://github.com/pydata/xarray/issues/1075#issuecomment-636641496 · issue_url: https://api.github.com/repos/pydata/xarray/issues/1075 · node_id: MDEyOklzc3VlQ29tbWVudDYzNjY0MTQ5Ng==

FWIW, I've also tested @delgadom's technique, using netCDF4 and it also works well (and is useful in situations where we don't want to install h5netcdf). Thanks!

reactions: {"total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
issue: Support creating DataSet from streaming object (186895655)
id 635415386 · JackKelly (460756) · author_association NONE
created 2020-05-28T15:18:34Z · updated 2020-05-28T15:19:06Z
https://github.com/pydata/xarray/issues/1075#issuecomment-635415386 · issue_url: https://api.github.com/repos/pydata/xarray/issues/1075 · node_id: MDEyOklzc3VlQ29tbWVudDYzNTQxNTM4Ng==

Is this now implemented (and hence can this issue be closed)? It appears that this works well:

```python
import io

import boto3
import xarray as xr

boto_s3 = boto3.client('s3')
s3_object = boto_s3.get_object(Bucket=bucket, Key=key)
netcdf_bytes = s3_object['Body'].read()
netcdf_bytes_io = io.BytesIO(netcdf_bytes)
ds = xr.open_dataset(netcdf_bytes_io)
```

Is that the right approach to opening a NetCDF file on S3, using the latest xarray code?

reactions: {"total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
issue: Support creating DataSet from streaming object (186895655)

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
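The schema above can be exercised directly with Python's built-in sqlite3 module. The sketch below rebuilds the table locally and runs the query implied by the page header ("rows where user = 460756 sorted by updated_at descending"); the two sample rows are taken from this page, and the foreign-key REFERENCES clauses are omitted so the snippet is self-contained.

```python
import sqlite3

# Rebuild the table in memory (REFERENCES clauses dropped for self-containment).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE [issue_comments] (
   [html_url] TEXT, [issue_url] TEXT, [id] INTEGER PRIMARY KEY,
   [node_id] TEXT, [user] INTEGER, [created_at] TEXT, [updated_at] TEXT,
   [author_association] TEXT, [body] TEXT, [reactions] TEXT,
   [performed_via_github_app] TEXT, [issue] INTEGER
);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
""")

# Two sample rows from this page.
conn.executemany(
    "INSERT INTO issue_comments (id, user, updated_at) VALUES (?, ?, ?)",
    [(636641496, 460756, "2020-06-01T06:37:08Z"),
     (965595392, 460756, "2021-11-10T17:57:17Z")],
)

# ISO-8601 timestamps sort correctly as plain strings, so ORDER BY works directly:
rows = conn.execute(
    "SELECT id FROM issue_comments WHERE user = ? ORDER BY updated_at DESC",
    (460756,),
).fetchall()
```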
Powered by Datasette · About: xarray-datasette