issue_comments: 1210175870

html_url: https://github.com/pydata/xarray/issues/4285#issuecomment-1210175870
issue_url: https://api.github.com/repos/pydata/xarray/issues/4285
id: 1210175870
node_id: IC_kwDOAMm_X85IIdF-
user: 35968931
created_at: 2022-08-10T05:25:17Z
updated_at: 2022-08-10T05:32:13Z
author_association: MEMBER
The remaining columns (body, reactions, performed_via_github_app, issue) follow below.

Since RaggedArray can't be used everywhere that an ak.Array can be used, it shouldn't be a subclass.

I see, makes sense.
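
To make the subclassing point concrete, here is a minimal sketch of the composition alternative. It does not use the real Awkward Array API: `GeneralArray` is a hypothetical stand-in for a general container such as ak.Array, and this `RaggedArray`, its `unwrap` method, and the `_is_ragged_numeric` helper are illustrative names only, not the prototypes from this thread.

```python
from numbers import Real


class GeneralArray:
    """Hypothetical stand-in for a general container (records, strings, ...)."""

    def __init__(self, data):
        self.data = data


def _is_ragged_numeric(node):
    """Return True if node is a number or a (possibly ragged) nested list of numbers."""
    if isinstance(node, list):
        return all(_is_ragged_numeric(child) for child in node)
    return isinstance(node, Real) and not isinstance(node, bool)


class RaggedArray:
    """Wraps a GeneralArray instead of subclassing it.

    Only the restricted subset (nested lists of numbers) is accepted, so code
    written for GeneralArray never receives a RaggedArray by accident.
    """

    def __init__(self, array):
        if not isinstance(array.data, list) or not _is_ragged_numeric(array.data):
            raise TypeError("RaggedArray only accepts nested lists of numbers")
        self._array = array

    def __len__(self):
        return len(self._array.data)

    def unwrap(self):
        """Hand back the general array for code that needs the full interface."""
        return self._array


ragged = RaggedArray(GeneralArray([[1, 2, 3], [], [4.5]]))
general = ragged.unwrap()  # explicit conversion, no implicit substitutability
```

Because the wrapper is not an instance of the general class, the docs never need to say that a function accepts the parent type but not the subtype.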

I hadn't been thinking that RaggedArray is something we'd put in the general Awkward Array library.

Oh, I was just thinking that if we're building a new class that is tightly coupled to awkward.Array, then it should live in awkward. (Ideally I'd also like someone else to maintain it! :sweat_smile:)

I was thinking of it only as a way to define "the subset of Awkward Arrays that xarray uses," which would live in xarray.

I don't think it's within the scope of xarray to offer a numpy-like array class in our main library: we don't do this in any other case!

Or it could be a third package, as awkward-pandas is to awkward and pandas.

However, we could definitely have a separate awkward-xarray package that lives in xarray-contrib and provides a RaggedArray class. (See pint-xarray for something sort of similar.) That seems fine; all it takes is some keen bean to take our prototypes here and turn them into something usable...

(Imagine reading the docs and it says, "You can apply this function to ak.Array, but not to ak.RaggedArray." Or "this is an ak.Array that happens to be ragged, but not an ak.RaggedArray.")

Yeah that wouldn't be ideal.

(Digression: From my perspective, part of the problem is that merely generalising numpy arrays to be ragged would have been useful for lots of people, but awkward.Array goes a lot further: it also generalises the type system, adds things like Records, and possibly adds xarray-like features. That puts awkward.Array in a somewhat ill-defined place within the wider scientific Python ecosystem: it's kind of a numpy-like duck array but can't be treated as one, it's also a more general type system, and it might even get features of higher-level data structures like xarray.)

  • some people are going to want the shape to specify the maximum of "var" dimensions (what you asked for): "virtually padding",
  • some people are going to want the shape to specify the minimum of "var" dimensions because that tells you what upper bounds are legal to slice: "virtually truncating",
  • and some people are going to want the string "var" or maybe None or maybe np.nan in place of "var" dimensions because no integer is correct. Then they would have to deal with the fact that this shape is not a tuple of integers.

That's very interesting. I'm not immediately sure which of those would be best for xarray wrapping. I think it's plausible that we could eventually support any of those options... (For (3), through the issues Deepak linked to: #5168, #2801.)
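
The three shape conventions listed above can be sketched in plain Python. This is only an illustration on nested lists, not the Awkward Array or xarray API; `lengths_per_depth`, `ragged_shape`, and the `mode` and `marker` parameters are hypothetical names chosen for this example.

```python
def lengths_per_depth(nested):
    """Collect the list lengths found at each nesting depth."""
    levels = []
    current = [nested]
    # Descend one level at a time while every item at this level is a list.
    while current and all(isinstance(item, list) for item in current):
        levels.append([len(item) for item in current])
        current = [child for item in current for child in item]
    return levels


def ragged_shape(nested, mode="pad", marker=None):
    """Report a shape for a possibly ragged nested list.

    mode="pad":      variable-length dimensions report their maximum length
    mode="truncate": variable-length dimensions report their minimum length
    mode="marker":   variable-length dimensions report `marker` (e.g. None or "var")
    """
    if mode not in ("pad", "truncate", "marker"):
        raise ValueError(f"unknown mode: {mode!r}")
    shape = []
    for lengths in lengths_per_depth(nested):
        if len(set(lengths)) == 1:
            shape.append(lengths[0])  # regular dimension: one integer is correct
        elif mode == "pad":
            shape.append(max(lengths))
        elif mode == "truncate":
            shape.append(min(lengths))
        else:
            shape.append(marker)
    return tuple(shape)


data = [[1, 2, 3], [], [4, 5]]
print(ragged_shape(data, mode="pad"))                   # (3, 3)
print(ragged_shape(data, mode="truncate"))              # (3, 0)
print(ragged_shape(data, mode="marker", marker="var"))  # (3, 'var')
```

Only the first two modes return a tuple of integers; the third is the case where downstream code has to cope with a non-integer entry in the shape.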

I fixed the code that I wrote in the comments above for posterity.

Thanks for fixing that, and for all the explanations!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  667864088