issue_comments

4 rows where issue = 653442225 sorted by updated_at descending


Issue: `xr.save_mfdataset()` doesn't honor `compute=False` argument (pydata/xarray#4209)

Commenters: shoyer (2), dcherian (1), andersy005 (1), all MEMBER
dcherian (MEMBER) · 2020-07-09T23:43:17Z · https://github.com/pydata/xarray/issues/4209#issuecomment-656403217

Here's an alternative `map_blocks` solution:

```python
import xarray as xr

def write_block(ds, t0):
    if len(ds.time) > 0:
        fname = (ds.time[0] - t0).values.astype("timedelta64[h]").astype(int)
        ds.to_netcdf(f"temp/file-{fname:06d}.nc")
    # dummy return
    return ds.time

ds = xr.tutorial.open_dataset("air_temperature", chunks={"time": 100})
ds.map_blocks(write_block, kwargs=dict(t0=ds.time[0])).compute(scheduler="processes")
```

There are two workarounds here, though:

1. The user function always has to return something.
2. We can't provide `template=ds.time` because it has no chunk information, and `ds.time.chunk({"time": 100})` silently does nothing because it is an `IndexVariable`. So the user function still needs the `len(ds.time) > 0` workaround.

I think a cleaner API may be to have `dask.compute([write_block(block) for block in ds.to_delayed()])`, where `ds.to_delayed()` yields a bunch of tasks, each of which gives a Dataset wrapping one block of the underlying array.
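Schematically, the proposed pattern might look like the stdlib-only sketch below (no xarray or dask; this `to_delayed` and the block representation are invented for illustration, not an existing API): each task wraps exactly one block, and the user function runs once per task.

```python
from concurrent.futures import ThreadPoolExecutor

def to_delayed(values, chunk):
    # hypothetical: yield one zero-argument task per block of `values`
    for start in range(0, len(values), chunk):
        block = values[start:start + chunk]
        yield lambda b=block: b  # the task materializes exactly one block

def write_block(task, out):
    block = task()                       # this block's data only
    out.append((block[0], len(block)))   # stand-in for ds.to_netcdf(...)

results = []
with ThreadPoolExecutor() as pool:
    for task in to_delayed(list(range(250)), chunk=100):
        pool.submit(write_block, task, results)
# 250 values in chunks of 100 -> three blocks of sizes 100, 100, 50
```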

shoyer (MEMBER) · 2020-07-09T23:28:30Z · https://github.com/pydata/xarray/issues/4209#issuecomment-656399210

> > The way `compute=False` currently works may be a little confusing. It doesn't actually delay creating files, it just delays writing the array data.
>
> Interesting... I always assumed that all operations (including file creation) were delayed. So, this is a feature and not a bug then?

Well, there is certainly a case for file creation also being lazy -- it definitely is more intuitive! This was more of an oversight than an intentional omission. Metadata generally needs to be written from a single process anyway, so we never got around to doing it with Dask.

That said, there are also some legitimate use cases where it is nice to be able to eagerly write metadata only, without any array data. This is what we were proposing to do with `compute=False` in `to_zarr`: https://github.com/pydata/xarray/pull/4035
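The eager-metadata use case can be sketched in plain Python (the file layout and helpers here are made up for illustration; they are not xarray or zarr code): a single process lays out the store up front, then workers fill disjoint regions in parallel, analogous to an eager metadata write followed by deferred array writes.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

def init_store(path, nbytes):
    # the "metadata" step: one process creates the file at its final size
    with open(path, "wb") as f:
        f.truncate(nbytes)

def write_region(path, offset, chunk):
    # workers fill disjoint byte ranges; no coordination is needed
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(chunk)

path = os.path.join(tempfile.mkdtemp(), "store.bin")
init_store(path, 6)
with ThreadPoolExecutor() as pool:
    pool.submit(write_region, path, 0, b"abc")
    pool.submit(write_region, path, 3, b"def")
with open(path, "rb") as f:
    data = f.read()
```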

andersy005 (MEMBER) · 2020-07-09T23:14:15Z · https://github.com/pydata/xarray/issues/4209#issuecomment-656395121

> The way `compute=False` currently works may be a little confusing. It doesn't actually delay creating files, it just delays writing the array data.

Interesting... I always assumed that all operations (including file creation) were delayed. So, this is a feature and not a bug then?

shoyer (MEMBER) · 2020-07-09T22:50:16Z · https://github.com/pydata/xarray/issues/4209#issuecomment-656388313

The way `compute=False` currently works may be a little confusing. It doesn't actually delay creating files, it just delays writing the array data.
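These semantics can be illustrated with a stdlib sketch (this `save` function and its header are invented for illustration; they are not xarray's actual implementation): the file is created immediately either way, and `compute=False` only defers the data write, returned here as a pending callable.

```python
import os
import tempfile

def save(path, data, compute=True):
    # the file is created eagerly, regardless of `compute`
    with open(path, "wb") as f:
        f.write(b"HDR\n")

    def write_data():
        with open(path, "ab") as f:
            f.write(data)

    if compute:
        write_data()
        return None
    return write_data  # deferred: data lands only when this is called

path = os.path.join(tempfile.mkdtemp(), "out.nc")
delayed = save(path, b"payload", compute=False)
created_early = os.path.exists(path)  # the file already exists here
delayed()
with open(path, "rb") as f:
    contents = f.read()
```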


```sql
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
```
Powered by Datasette · Queries took 3920.544ms · About: xarray-datasette