home / github / issue_comments

Menu
  • GraphQL API
  • Search all tables

issue_comments: 1209967070

This data as json

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/4285#issuecomment-1209967070 https://api.github.com/repos/pydata/xarray/issues/4285 1209967070 IC_kwDOAMm_X85IHqHe 35968931 2022-08-09T22:47:24Z 2022-08-10T05:50:40Z MEMBER

Thanks for the huge response there @jpivarski !

Ragged array is not a specialized subset of types within Awkward Array. There are ak.* functions that would take you out of this subset. However (thinking it through...) I don't think slices, ufuncs, or reducers would take you out of this subset.

This is an important point which I meant to ask about earlier. We need a RaggedArray class which always returns other RaggedArray instances (i.e. the set of ragged arrays is closed under the set of numpy-like methods / functions that xarray might call upon it).

To answer your question about monkey-patching, I think it would be best to make a wrapper. You don't want to give all ak.Array instances properties named shape and dtype, since those properties won't make sense for general types. This is exactly the reason we had to back off on making ak.Array inherit from pandas.api.extensions.ExtensionArray: Pandas wanted it to have methods with names and behaviors that would have been misleading for Awkward Arrays.

If you want a RaggedArray class that is more specific (i.e. defines more attributes) than awkward.Array, then surely the "correct" thing to do would be be to subclass though? I mean for eventual integration of RaggedArray within awkward's codebase.

Thus, it can act as a gatekeeper of what kinds of operations are allowed: ak.* won't recognize RaggedArray, which is good because some ak.* functions would take you out of this "ragged array" subset of types. You can add some non-ufunc NumPy functions with __array_function__, but only the ones that make sense for this subset of types.

That makes sense. And if you subclassed then I guess you would also need to change those ak.* functions to not accept RaggedArray, so maybe wrapping is better...

Thanks for the wrapping example! I think there is a bug with your .shape method though - if I put your code snippets in a file then they return the wrong results:

```python In [1]: from ragged import RaggedArray

In [2]: ra = RaggedArray([[1, 2, 3], [4, 5]])

In [3]: ra.ndim Out[3]: 1

In [4]: ra.shape Out[4]: [3] `` (I expected2and(2, 3)respectively). I think perhapscontext["shape"]` is being overwritten as it recurses through the data structure, when it should be being appended?

I would really like to try testing the RaggedArray class with our WIP public framework for testing duck array compatiblity (#6894). If we can get a very basic wrapper then I could make a PR to add RaggedArray to awkward, and import xarray's new tests to test it with.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  667864088
Powered by Datasette · Queries took 2.04ms · About: xarray-datasette