issue_comments

7 rows where issue = 309227775 sorted by updated_at descending

user 4

  • rabernat 2
  • shoyer 2
  • leroygr 2
  • NickMortimer 1

author_association 2

  • MEMBER 4
  • NONE 3

issue 1

  • Enable Append/concat to existing zarr datastore · 7 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
442055522 https://github.com/pydata/xarray/issues/2022#issuecomment-442055522 https://api.github.com/repos/pydata/xarray/issues/2022 MDEyOklzc3VlQ29tbWVudDQ0MjA1NTUyMg== leroygr 1411854 2018-11-27T13:20:33Z 2018-11-27T13:20:33Z NONE

> > Would be indeed really nice to get this built-in into xarray, but that is just a matter of patience I guess :)
>
> Patience...or action. Anyone is welcome and encouraged to submit a pull request on this topic. Xarray is a volunteer effort.

Obviously. I'm just new to Zarr so a bit early to contribute to Xarray on that topic.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Enable Append/concat to existing zarr datastore 309227775
442052347 https://github.com/pydata/xarray/issues/2022#issuecomment-442052347 https://api.github.com/repos/pydata/xarray/issues/2022 MDEyOklzc3VlQ29tbWVudDQ0MjA1MjM0Nw== rabernat 1197350 2018-11-27T13:09:43Z 2018-11-27T13:09:43Z MEMBER

> Would be indeed really nice to get this built-in into xarray, but that is just a matter of patience I guess :)

Patience...or action. Anyone is welcome and encouraged to submit a pull request on this topic. Xarray is a volunteer effort.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Enable Append/concat to existing zarr datastore 309227775
441997049 https://github.com/pydata/xarray/issues/2022#issuecomment-441997049 https://api.github.com/repos/pydata/xarray/issues/2022 MDEyOklzc3VlQ29tbWVudDQ0MTk5NzA0OQ== leroygr 1411854 2018-11-27T09:55:10Z 2018-11-27T09:55:10Z NONE

> My use case for this is appending Argo float data to an existing zarr store. At the moment I have 800+ netCDF files that need transforming before they can be added or read by xarray in a *.nc type read. I read the first file, transform it, and write it to a zarr store using .to_zarr. Then I read the remaining files and append each variable to the store using zarr's append function.
>
> This is probably not a good way to go, but it is all I could figure out at the moment.

@NickMortimer would you have a snippet for appending xarray objects to an existing zarr dataset?

Would be indeed really nice to get this built-in into xarray, but that is just a matter of patience I guess :)

Thanks! Greg

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Enable Append/concat to existing zarr datastore 309227775
430792980 https://github.com/pydata/xarray/issues/2022#issuecomment-430792980 https://api.github.com/repos/pydata/xarray/issues/2022 MDEyOklzc3VlQ29tbWVudDQzMDc5Mjk4MA== shoyer 1217238 2018-10-17T21:17:11Z 2018-10-17T21:17:11Z MEMBER

> We are just adding or completely overwriting variables. This works currently (from the docs: "If mode=’a’, existing variables will be overwritten"). But I'm not sure what happens if there is a conflict between coordinates among the new and old variables.

I'm pretty sure the coordinates will just get overwritten, too, at least as long as the coordinate arrays have the same shape. If they have different shapes, you probably will get an error. We certainly don't do any checks for alignment currently.

> ds1 has some of the same variables as ds2, possibly with overlapping coordinates. In this case, we want to do some kind of append. If there is no overlap between coordinates, then it's straightforward: put the extra values from ds1 into file2.nc.

This is the only case I would try to solve in the initial implementation. It's probably 20% of the work (to add a keyword argument like extend='time') and covers 80% of the use-cases.

If we need alignment, I'm sure we could make that work in a follow-up. Certainly it would be less error prone to use.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Enable Append/concat to existing zarr datastore 309227775
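
The no-overlap case singled out in the comment above (extending along one dimension such as `time`) can be sketched in plain Python. This is only an illustration under stated assumptions: `extend_along` is a hypothetical helper, not an xarray function, and lists stand in for coordinate arrays. It rejects any overlap, leaving alignment to a follow-up as the comment suggests.

```python
def extend_along(existing_coords, new_coords):
    """Return the coordinate values after a pure append (no overlap allowed).

    `existing_coords` stands in for the coordinate already in the store,
    `new_coords` for the coordinate of the dataset being appended.
    """
    overlap = set(existing_coords) & set(new_coords)
    if overlap:
        # Overlap would need alignment logic, which this sketch does not do.
        raise ValueError(f"overlapping coordinate values: {sorted(overlap)}")
    return list(existing_coords) + list(new_coords)


# Appending two new time steps to a store that already holds three.
time = extend_along([0, 1, 2], [3, 4])
```

Raising on overlap keeps the initial behaviour easy to reason about; a later version could replace the error with an explicit policy.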
430780110 https://github.com/pydata/xarray/issues/2022#issuecomment-430780110 https://api.github.com/repos/pydata/xarray/issues/2022 MDEyOklzc3VlQ29tbWVudDQzMDc4MDExMA== rabernat 1197350 2018-10-17T20:37:28Z 2018-10-17T20:37:28Z MEMBER

We may have people interested in working on this soon.

I think we have some details to sort out regarding the api for appending. The most generic case looks something like this

```python
import xarray as xr

ds1 = xr.open_dataset('file1.nc')
# file2.nc already exists
ds1.to_netcdf('file2.nc', mode='a+')  # 'a+' is a hypothetical append mode under discussion
```

We need to figure out what should happen under different circumstances. Some cases are:

- We are just adding or completely overwriting variables. This works currently (from the docs: "If mode=’a’, existing variables will be overwritten"). But I'm not sure what happens if there is a conflict between coordinates among the new and old variables.
- ds1 has some of the same variables as ds2, possibly with overlapping coordinates. In this case, we want to do some kind of append. If there is no overlap between coordinates, then it's straightforward: put the extra values from ds1 into file2.nc. If there is overlap, then there are two options:
  - overwrite all of the overlapping portion with ds1, or
  - keep the existing values from ds2.
- With netCDF, there is an additional limitation that the underlying library will only let you extend along one dimension (the UNLIMITED one). Other backends like zarr will let you extend along many dimensions.

It seems like much of the logic for overlapping dimensions could be handled via align. The hard part will be figuring out how to tell the store to write to the appropriate regions of its arrays.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Enable Append/concat to existing zarr datastore 309227775
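
The two options for overlapping coordinates listed in the comment above (overwrite with the new data, or keep what is stored) can be sketched as a policy switch. This is a minimal illustration, not xarray behaviour: `merge_along_dim` and its `on_overlap` argument are hypothetical names, and dictionaries mapping coordinate value to data value stand in for a variable along one dimension.

```python
def merge_along_dim(existing, new, on_overlap="overwrite"):
    """Combine two {coordinate: value} mappings along one dimension.

    `existing` stands in for what is already in the target file,
    `new` for the dataset being appended.
    on_overlap="overwrite" takes values from `new` where coordinates collide;
    on_overlap="keep" retains the values from `existing`.
    """
    if on_overlap not in ("overwrite", "keep"):
        raise ValueError(f"unknown on_overlap policy: {on_overlap!r}")
    merged = dict(existing)
    for coord, value in new.items():
        if coord in merged and on_overlap == "keep":
            continue  # overlapping coordinate: keep the stored value
        merged[coord] = value  # new coordinate, or overwrite on overlap
    # Sort by coordinate so the dimension stays ordered after the merge.
    return dict(sorted(merged.items()))


existing = {0: "a0", 1: "a1", 2: "a2"}
new = {2: "b2", 3: "b3"}
overwritten = merge_along_dim(existing, new, on_overlap="overwrite")
kept = merge_along_dim(existing, new, on_overlap="keep")
```

Both policies agree on the non-overlapping coordinate `3`; they differ only at coordinate `2`, where `overwritten` holds `"b2"` and `kept` holds `"a2"`.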
402408267 https://github.com/pydata/xarray/issues/2022#issuecomment-402408267 https://api.github.com/repos/pydata/xarray/issues/2022 MDEyOklzc3VlQ29tbWVudDQwMjQwODI2Nw== NickMortimer 4338975 2018-07-04T08:41:47Z 2018-07-04T08:41:47Z NONE

My use case for this is appending Argo float data to an existing zarr store. At the moment I have 800+ netCDF files that need transforming before they can be added or read by xarray in a *.nc type read. I read the first file, transform it, and write it to a zarr store using .to_zarr. Then I read the remaining files and append each variable to the store using zarr's append function.

This is probably not a good way to go, but it is all I could figure out at the moment.

@shoyer I think it would be useful to have a straight append mode: to_zarr(....,mode='a+')

{
    "total_count": 5,
    "+1": 5,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Enable Append/concat to existing zarr datastore 309227775
377088567 https://github.com/pydata/xarray/issues/2022#issuecomment-377088567 https://api.github.com/repos/pydata/xarray/issues/2022 MDEyOklzc3VlQ29tbWVudDM3NzA4ODU2Nw== shoyer 1217238 2018-03-29T01:10:16Z 2018-03-29T01:10:16Z MEMBER

It would probably make sense to think about this alongside support for appending along an existing dimension in a netCDF file (https://github.com/pydata/xarray/issues/1672).

I can see a few potential ways to write the syntax. Probably supplying a range of indices along a dimension to write to would make the most sense, e.g., to_zarr(..., destination={'time': slice(1000, 2000)}) to indicate writing to positions 1000-2000 along the time dimension.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Enable Append/concat to existing zarr datastore 309227775
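
The `destination={'time': slice(1000, 2000)}` idea in the comment above amounts to writing a block of values into a fixed index range of an array that already exists in the store. A minimal sketch of that semantics follows, with a plain Python list standing in for one dimension of a stored array. `write_region` is a hypothetical helper, not an xarray or zarr function, and `destination` is only the keyword proposed in the comment.

```python
def write_region(target, values, destination):
    """Write `values` into `target` at the positions given by `destination`.

    `target` is a list standing in for a 1-D array already in the store;
    `destination` is a slice such as slice(1000, 2000).
    """
    # Resolve the slice against the current length of the target.
    start, stop, step = destination.indices(len(target))
    if step != 1:
        raise ValueError("only contiguous regions are supported in this sketch")
    if stop - start != len(values):
        raise ValueError(
            f"region holds {stop - start} positions but got {len(values)} values"
        )
    target[start:stop] = values
    return target


# A "store" with 10 time steps, all initially empty (None).
time_series = [None] * 10
write_region(time_series, [1.0, 2.0, 3.0], slice(4, 7))
```

Checking the region size against the number of values up front means a mismatched write fails before anything in the target is modified.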

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);