home / github

Menu
  • GraphQL API
  • Search all tables

issue_comments

Table actions
  • GraphQL API for issue_comments

14 rows where author_association = "MEMBER" and issue = 1519552711 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: reactions, created_at (date), updated_at (date)

user 6

  • TomNicholas 5
  • rabernat 3
  • benbovy 2
  • Illviljan 2
  • shoyer 1
  • jhamman 1

issue 1

  • Import datatree in xarray? · 14 ✖

author_association 1

  • MEMBER · 14 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1411536785 https://github.com/pydata/xarray/pull/7418#issuecomment-1411536785 https://api.github.com/repos/pydata/xarray/issues/7418 IC_kwDOAMm_X85UIleR Illviljan 14371165 2023-02-01T06:34:08Z 2023-02-01T06:34:08Z MEMBER

Hmm, I don't understand. Adding py.typed should be all that's needed, did that in flox and it worked great there: https://github.com/xarray-contrib/flox/pull/92

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Import datatree in xarray? 1519552711
1411454442 https://github.com/pydata/xarray/pull/7418#issuecomment-1411454442 https://api.github.com/repos/pydata/xarray/issues/7418 IC_kwDOAMm_X85UIRXq TomNicholas 35968931 2023-02-01T04:40:30Z 2023-02-01T04:40:30Z MEMBER

I think this PR is ready now, it just fails mypy (@Illviljan I added a py.typed file to datatree but the xarray mypy CI is still not happy with it).

Once this is merged we can push on with implementing open_datatree (https://github.com/pydata/xarray/pull/7437 @jthielen) and I can take a crack at fixing https://github.com/xarray-contrib/datatree/issues/146 in xarray in a follow-up PR.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Import datatree in xarray? 1519552711
1378079073 https://github.com/pydata/xarray/pull/7418#issuecomment-1378079073 https://api.github.com/repos/pydata/xarray/issues/7418 IC_kwDOAMm_X85SI9Fh rabernat 1197350 2023-01-11T00:34:03Z 2023-01-11T00:34:03Z MEMBER

we should carefully evaluate the datatree API to make sure we won't want to change it soon

I agree with this. We could use the PR process for this review.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Import datatree in xarray? 1519552711
1378074559 https://github.com/pydata/xarray/pull/7418#issuecomment-1378074559 https://api.github.com/repos/pydata/xarray/issues/7418 IC_kwDOAMm_X85SI7-_ shoyer 1217238 2023-01-11T00:27:47Z 2023-01-11T00:27:47Z MEMBER

I agree, datatree is an important data structure for Xarray. My preferred way to do this would be follow @rabernat's suggestion and to fork the code the existing repo into the Xarray main codebase.

My main concern is that we should carefully evaluate the datatree API to make sure we won't want to change it soon. Once we bring it into Xarray, there will be a higher expectation that the interface will remain stable.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Import datatree in xarray? 1519552711
1377905161 https://github.com/pydata/xarray/pull/7418#issuecomment-1377905161 https://api.github.com/repos/pydata/xarray/issues/7418 IC_kwDOAMm_X85SISoJ TomNicholas 35968931 2023-01-10T21:33:27Z 2023-01-10T21:33:27Z MEMBER

Integrating upstream into xarray might also help with people trying to open their nested data formats as a datatree objects, because then we can immediately begin integrating with xarray's backend engines.

See for example this datatree issue asking about opening grib files as a datatree. It would be nice to be able to do

open_datatree("data.grib", engine="cfgrib")

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Import datatree in xarray? 1519552711
1373101988 https://github.com/pydata/xarray/pull/7418#issuecomment-1373101988 https://api.github.com/repos/pydata/xarray/issues/7418 IC_kwDOAMm_X85R19-k TomNicholas 35968931 2023-01-06T03:35:36Z 2023-01-06T03:35:36Z MEMBER

Would it mean that if someone wants to later add any feature "x" or "y" into Xarray, they just need implementing the feature for Dataset (and possibly DataArray) and it will be guaranteed to work with Datatree?

Basically yes, it would immediately work with Datatree. Datatree currently implements most dataset methods by literally copying them and their docstrings, and they work by mapping the method over every node in the tree. We could integrate Datatree in such a way that the additional developer effort to get a method on dataset working on Datatree would be negligible (think adding a single element with the method name to an internal list, or copy-pasting a docstring).

I don't think it is ideal to have such non-synchronized state within Xarray itself.

This is an argument for waiting before integrating.

I'm just speaking generally from my experience of having struggled while doing some heavy refactoring in Xarray recently :)

I appreciate the input @benbovy! I think the main difference between this effort and your (heroic) indexes effort is that Datatree doesn't touch any existing API.

I guess my main concern is that integrating prematurely into Xarray might give a false sense of stability - I don't want to later realize I should redesign Datatree, and have people be annoyed because they thought it was as stable as the rest of xarray.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Import datatree in xarray? 1519552711
1372908509 https://github.com/pydata/xarray/pull/7418#issuecomment-1372908509 https://api.github.com/repos/pydata/xarray/issues/7418 IC_kwDOAMm_X85R1Ovd benbovy 4160723 2023-01-05T23:08:15Z 2023-01-05T23:08:15Z MEMBER

Again, there is likely more good reasons merging the Datatree code with Xarray than not doing it, but IMHO such decision should be made very carefully. You certainly do know better than me what positive vs. negative impacts it would have here! I'm just speaking generally from my experience of having struggled while doing some heavy refactoring in Xarray recently :)

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Import datatree in xarray? 1519552711
1372888139 https://github.com/pydata/xarray/pull/7418#issuecomment-1372888139 https://api.github.com/repos/pydata/xarray/issues/7418 IC_kwDOAMm_X85R1JxL benbovy 4160723 2023-01-05T22:46:05Z 2023-01-05T22:46:05Z MEMBER

I don't have strong opinions for or against including datatree in Xarray. It indeed makes sense if it is using many Xarray internals and if there are many existing or potential applications for it. Additional load (CI) is fine if datatree doesn't bring any extra dependency and won't do so in the near future (which seems to be the case).

Datatree should become a first-class Xarray object

Since Datatree sits above DataArray and Dataset, it should not interfere with any of our existing API.

Would it mean that if someone wants to later add any feature "x" or "y" into Xarray, they just need implementing the feature for Dataset (and possibly DataArray) and it will be guaranteed to work with Datatree? (I guess so but I'm not familiar enough with Datatree to know it for sure).

Otherwise, if there is any extra implementation effort required to make feature "x" or "y" work with Datatree, then I'm concerned about the additional burden or obstacle for future contributors and maintainers. Or we could say that this is OK to leave datatree support and wait for someone to take care of it later, but I don't think it is ideal to have such non-synchronized state within Xarray itself.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Import datatree in xarray? 1519552711
1372887496 https://github.com/pydata/xarray/pull/7418#issuecomment-1372887496 https://api.github.com/repos/pydata/xarray/issues/7418 IC_kwDOAMm_X85R1JnI jhamman 2443309 2023-01-05T22:45:17Z 2023-01-05T22:45:17Z MEMBER

I personally favor just copying the code into Xarray and archiving the old repo.

I also lean in this direction. At this point, I see little downside to making this change at this point. My suggestion to import xarray-datatree into xarray was meant low-lift compromise.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Import datatree in xarray? 1519552711
1372822656 https://github.com/pydata/xarray/pull/7418#issuecomment-1372822656 https://api.github.com/repos/pydata/xarray/issues/7418 IC_kwDOAMm_X85R05yA rabernat 1197350 2023-01-05T21:50:53Z 2023-01-05T21:50:53Z MEMBER

I personally favor just copying the code into Xarray and archiving the old repo.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Import datatree in xarray? 1519552711
1372806772 https://github.com/pydata/xarray/pull/7418#issuecomment-1372806772 https://api.github.com/repos/pydata/xarray/issues/7418 IC_kwDOAMm_X85R0150 TomNicholas 35968931 2023-01-05T21:36:25Z 2023-01-05T21:36:25Z MEMBER

Why? Are its dependencies different from Xarray?

No, datatree has no additional dependencies. I was just asking because if we went for the "import from second repository" plan we may want to test that the import works as part of our CI. Not a major issue though.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Import datatree in xarray? 1519552711
1372802153 https://github.com/pydata/xarray/pull/7418#issuecomment-1372802153 https://api.github.com/repos/pydata/xarray/issues/7418 IC_kwDOAMm_X85R00xp rabernat 1197350 2023-01-05T21:31:33Z 2023-01-05T21:31:33Z MEMBER
  • At what stage is atatree "ready" to moved in here? At what stage should it become encouraged public API?

My opinion is that Datatree should move into Xarray now, ideally in a way that does not disrupt any existing user code, and that Datatree should become a first-class Xarray object (together with DataArray, and Dataset). Since it's a new feature, we don't necessarily have to be super conservative here. I think it is more than good enough / stable enough in its current state.

  • What's a good way to slowly roll the feature out?

Since Datatree sits above DataArray and Dataset, it should not interfere with any of our existing API. As long as test coverage is good, documentation is solid, and the code style matches the rest of Xarray, I think we can just bring it in.

  • How do I decrease the bus factor on datatree's code? Can I get some code reviews during the merging process? 🙏

I think that it is inevitable that you Tom will be the main owner of the Datatree code at the beginning (as @shoyer was of all of Xarray when he first released it). Over time, if people use it, some fraction of users will become maintainers, starting with the existing dev team.

  • Should I make a new CI environment just for testing datatree stuff?

Why? Are its dependencies different from Xarray?

{
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Import datatree in xarray? 1519552711
1372360850 https://github.com/pydata/xarray/pull/7418#issuecomment-1372360850 https://api.github.com/repos/pydata/xarray/issues/7418 IC_kwDOAMm_X85RzJCS TomNicholas 35968931 2023-01-05T15:24:58Z 2023-01-05T15:24:58Z MEMBER

There is also at least one bug in datatree that cannot be fixed without a (small) change to xarray, and having datatree as an optional import means I could fix it here.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Import datatree in xarray? 1519552711
1371430929 https://github.com/pydata/xarray/pull/7418#issuecomment-1371430929 https://api.github.com/repos/pydata/xarray/issues/7418 IC_kwDOAMm_X85RvmAR Illviljan 14371165 2023-01-04T21:18:05Z 2023-01-04T21:18:05Z MEMBER

Add a py.typed file in datatree to fix the mypy error: https://github.com/pydata/xarray/blob/main/xarray/py.typed

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
  Import datatree in xarray? 1519552711

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 15.584ms · About: xarray-datasette