issue_comments

7 rows where issue = 482543307 and user = 19956442 sorted by updated_at descending

user 1

  • Duane321 · 7 ✖

issue 1

  • Use pytorch as backend for xarrays · 7 ✖

author_association 1

  • NONE 7
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
773489462 https://github.com/pydata/xarray/issues/3232#issuecomment-773489462 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDc3MzQ4OTQ2Mg== Duane321 19956442 2021-02-04T17:46:15Z 2021-02-04T17:46:15Z NONE

Thanks again, @keewis, that was indeed the case. It was due to my older PyTorch version (1.6.0).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
770128996 https://github.com/pydata/xarray/issues/3232#issuecomment-770128996 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDc3MDEyODk5Ng== Duane321 19956442 2021-01-30T01:14:03Z 2021-01-30T01:14:03Z NONE

Thank you very much, @keewis. Your code did what I was trying to do. Big help!

One thing I noticed with the missing features is the following:

This seems like a bit of a problem. Index-based selection is a primary reason to use xarray. If selection changes .data to a NumPy array, then autodiff-ing through selection does not seem possible. Is there another approach I'm not seeing?
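One possible workaround, sketched here under assumptions that are not from the thread: keep the labels outside the tensor, translate them to integer positions, and index the tensor with PyTorch itself so the autograd graph stays intact. The names `labels`, `position`, and `select` are hypothetical, and this bypasses xarray's own indexing entirely.

```python
import torch

# Hypothetical coordinate labels for the first dimension of a (3, 2) tensor.
labels = ["a1", "a2", "a3"]
position = {label: i for i, label in enumerate(labels)}

values = torch.rand(3, 2, requires_grad=True)


def select(tensor, wanted):
    """Label-based selection along dim 0, done entirely with torch indexing."""
    index = torch.tensor([position[w] for w in wanted])
    return tensor.index_select(0, index)


picked = select(values, ["a3", "a1"])  # still a torch.Tensor with a grad_fn
picked.sum().backward()

# Rows that were selected receive gradient 1; the unselected row stays at 0.
print(values.grad)
```

Because the selection never leaves PyTorch, the result is still differentiable. Whether xarray's indexing machinery can be made to take this path is a separate question.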

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
768529007 https://github.com/pydata/xarray/issues/3232#issuecomment-768529007 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDc2ODUyOTAwNw== Duane321 19956442 2021-01-27T19:39:32Z 2021-01-29T22:37:28Z NONE

I've made some mild progress, but it raises a few questions. I've defined this simple Tensor subclass which meets the duck array criteria:

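```
from typing import Tuple

import torch

# IMPLEMENTED_FUNCTIONS (defined elsewhere) maps NumPy functions to tensor equivalents.


class XArrayTensor(torch.Tensor):
    # __new__ and __init__ receive the same arguments, so their signatures match.
    def __new__(cls, data=None, requires_grad=False, dims=None):
        if data is None:
            data = torch.Tensor()
        return torch.Tensor._make_subclass(cls, data, requires_grad)

    def __init__(self, data=None, requires_grad=False, dims: Tuple[str, ...] = None):
        self.dims = dims

    def __array_function__(self, func, types, args, kwargs):
        # Defer unless the function is mapped and every argument type is a tensor.
        if func not in IMPLEMENTED_FUNCTIONS or not all(issubclass(t, torch.Tensor) for t in types):
            return NotImplemented
        return IMPLEMENTED_FUNCTIONS[func](*args, **kwargs)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # NumPy calls this as (ufunc, method, *inputs, **kwargs) and expects NotImplemented.
        if method != "__call__" or ufunc not in IMPLEMENTED_FUNCTIONS:
            return NotImplemented
        return IMPLEMENTED_FUNCTIONS[ufunc](*inputs, **kwargs)
```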

where IMPLEMENTED_FUNCTIONS holds a mapping from NumPy functions to API-compatible tensor operators (similar in style to this).
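For readers unfamiliar with the pattern, such a mapping is often filled with a small registration decorator. The following is a minimal sketch, not the code from this comment: `ToyArray`, `implements`, and `toy_mean` are hypothetical names, and the class wraps a NumPy array instead of a tensor so that the example runs without PyTorch.

```python
import numpy as np

IMPLEMENTED_FUNCTIONS = {}


def implements(numpy_function):
    """Register a replacement for ``numpy_function`` in the lookup table."""
    def decorator(func):
        IMPLEMENTED_FUNCTIONS[numpy_function] = func
        return func
    return decorator


class ToyArray:
    """NumPy-backed stand-in for the tensor subclass (hypothetical)."""

    def __init__(self, values):
        self.values = np.asarray(values, dtype=float)

    def __array_function__(self, func, types, args, kwargs):
        # Defer to NumPy's error path unless we have a registered replacement
        # and every participating type is one we know how to handle.
        if func not in IMPLEMENTED_FUNCTIONS:
            return NotImplemented
        if not all(issubclass(t, ToyArray) for t in types):
            return NotImplemented
        return IMPLEMENTED_FUNCTIONS[func](*args, **kwargs)


@implements(np.mean)
def toy_mean(array, axis=None, **kwargs):
    # Compute on the wrapped data and re-wrap so the custom type persists.
    return ToyArray(array.values.mean(axis=axis))


result = np.mean(ToyArray([1.0, 2.0, 3.0]))
print(type(result).__name__, float(result.values))
```

A NumPy function without a registered entry (for example `np.sum` here) raises `TypeError`, which makes gaps in the mapping easy to spot.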

I added a torch_array_type to pycompat.py, which allows DataArray's .data attribute to persist as an XArrayTensor:

```
xr_tsr = XArrayTensor(torch.rand(3, 2))

data_array = xr.DataArray(
    xr_tsr,
    coords=dict(a=["a1", "a2", "a3"], b=["b1", "b1"]),
    dims=["a", "b"],
    name="dummy",
    attrs={"grad": xr_tsr.grad},
)
print(type(data_array.data))  # --> yields 'xarray_tensor.XArrayTensor'
```

The issue I'm running into is when I run an operation like np.mean(data_array). The operation gets dispatched to functions within duck_array_ops.py, which are the things I'd like to override.

Also, I'd like to confirm something. If the API matching were complete, would the following be possible?

    some_sum = data_array.sum()
    some_sum.backward()
    data_array.grad  # --> provides the gradient

I'm starting to suspect not, because that would require data_array to be both a DataArray and a torch.Tensor object. It seems what I'm in fact enabling is that DataArray.data is a torch.Tensor.
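The distinction can be illustrated without xarray. In the sketch below, `LabeledTensor` is a hypothetical stand-in for a DataArray: a plain container that holds a tensor in `.data`. It is an assumption made for illustration, not how xarray is implemented.

```python
import torch


class LabeledTensor:
    """Hypothetical stand-in for a DataArray: a container that holds a tensor."""

    def __init__(self, data, dims):
        self.data = data
        self.dims = dims

    def sum(self):
        # The reduction runs on the tensor; the result is wrapped again.
        return LabeledTensor(self.data.sum(), dims=())


leaf = torch.ones(3, 2, requires_grad=True)
wrapped = LabeledTensor(leaf, dims=("a", "b"))

total = wrapped.sum()
total.data.backward()  # backward() is a tensor method, so go through .data

print(wrapped.data.grad)         # gradient lives on the wrapped tensor
print(hasattr(wrapped, "grad"))  # the container itself has no .grad
```

Under this reading, the gradient would be reached as `data_array.data.grad` rather than `data_array.grad`.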

{
    "total_count": 2,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 2,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
766466486 https://github.com/pydata/xarray/issues/3232#issuecomment-766466486 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDc2NjQ2NjQ4Ng== Duane321 19956442 2021-01-25T00:13:53Z 2021-01-25T00:14:11Z NONE

> Note that the main work in adding __array_function__ is not the dispatch mechanism, but mapping to 100% compatible APIs. That job should have gotten a lot easier now compared to 9 months ago. PyTorch now has a completely matching fft module, and a ~70% complete linalg module in master. And functions in the main namespace have gained dtype keywords, integer-to-float promotion, and other NumPy compat changes. So it should be feasible to write your custom subclass.

Glad to hear there's progress I can lean on. I'll come back with a minimal version that does the API matching for maybe 1-2 methods, just to get feedback on the overall structure. If it works, I can brute-force through a lot of the rest 🤞

> Looks like you need to patch that internally just a bit, probably adding pytorch to NON_NUMPY_SUPPORTED_ARRAY_TYPES.

Thank you. I hesitated to change xarray code, but not anymore.

> Note that I no longer expect that we'll be adding __array_function__ to torch.Tensor, and certainly not any time soon. My current expectation is that the "get the correct namespace from an array/tensor object directly" approach from https://numpy.org/neps/nep-0037-array-module.html#how-to-use-get-array-module and https://data-apis.github.io/array-api/latest/ will turn out to be a much better design long-term.

Does this mean I shouldn't fill out __array_function__ in my subclass? Or is this just a forward looking expectation?
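For context, the "get the namespace from the array" idea can be sketched as follows. This is a rough illustration only: `get_namespace` and `center` are hypothetical helpers, and falling back to NumPy when an object has no `__array_namespace__` method is an assumption made here so the example runs on older NumPy versions.

```python
import numpy as np


def get_namespace(x):
    """Return the array namespace for ``x``; fall back to NumPy (an assumption)."""
    method = getattr(x, "__array_namespace__", None)
    return method() if method is not None else np


def center(x):
    # Library code asks the array for its namespace instead of hard-coding np.
    xp = get_namespace(x)
    return x - xp.mean(x)


centered = center(np.asarray([1.0, 2.0, 3.0]))
print(centered)
```

With this design, the consuming library calls `xp.mean` on whatever namespace the array reports, so no `__array_function__` override is involved.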

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
766464095 https://github.com/pydata/xarray/issues/3232#issuecomment-766464095 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDc2NjQ2NDA5NQ== Duane321 19956442 2021-01-25T00:00:46Z 2021-01-25T00:00:46Z NONE

> While it would be fantastic to have GPU-enabled, auto-diff-able xarrays / DataArrays, an interesting development worth looking into is the named tensors in https://pytorch.org/docs/stable/named_tensor.html. This appears to be an attempt to bridge the gap, in that they are making PyTorch tensors increasingly DataArray-like. I would not be surprised if within the next few iterations they add indexes to the tensors, closing the gap even further.

I really hope so. I explored named tensors at first, but the lack of an index for each dimension was a non-starter. So, I'll keep an eye out.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
765738462 https://github.com/pydata/xarray/issues/3232#issuecomment-765738462 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDc2NTczODQ2Mg== Duane321 19956442 2021-01-22T23:16:49Z 2021-01-22T23:16:49Z NONE

> No one is working on __array_function__ at the moment. Implementing it has some backwards compat concerns as well, because people may be relying on np.somefunc(some_torch_tensor) to be coerced to ndarray. It's not a small project, but implementing a prototype with a few functions in the torch namespace that are not exactly matching the NumPy API would be a useful way to start pushing this forward.

@rgommers Do you expect this solution to work with a PyTorch Tensor custom subclass? Or is monkey patching necessary?
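The backwards-compatibility concern in the quote can be reproduced with plain NumPy. In this sketch, `Coerced` and `Dispatching` are hypothetical toy classes, not PyTorch types: the first only implements `__array__` and is silently converted to an ndarray, while the second adds `__array_function__` and keeps its own type.

```python
import numpy as np


class Coerced:
    """Only implements __array__, so NumPy functions convert it to ndarray."""

    def __init__(self, values):
        self.values = list(values)

    def __array__(self, dtype=None, copy=None):
        return np.array(self.values, dtype=dtype)


class Dispatching(Coerced):
    """Adds __array_function__, so NumPy hands the call over instead."""

    def __array_function__(self, func, types, args, kwargs):
        if func is np.concatenate:
            (parts,) = args  # assumes only the sequence is passed positionally
            merged = [v for part in parts for v in part.values]
            return Dispatching(merged)
        return NotImplemented


plain = np.concatenate([Coerced([1, 2]), Coerced([3])])
kept = np.concatenate([Dispatching([1, 2]), Dispatching([3])])

print(type(plain).__name__, type(kept).__name__)
```

Code that relied on the first behavior would see a different return type once the protocol is implemented, which is the compatibility risk being described.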

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
765710268 https://github.com/pydata/xarray/issues/3232#issuecomment-765710268 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDc2NTcxMDI2OA== Duane321 19956442 2021-01-22T22:04:20Z 2021-01-22T22:14:50Z NONE

I'd like to cast my vote in favor of getting this functionality in. It would be nice to autodiff through xarray operations.

From reading this and related threads, I'm trying to determine a game plan to make this happen. I'm not familiar with the xarray code, so any guidance would be much appreciated. This is what I'm thinking:

1) Create a custom subclass of PyTorch's Tensor which provides the methods and attributes required of a duck array. Since this isn't officially supported, it looks like I could run into issues getting this subclass to persist through tensor operations.
2) Implement the __array_function__ protocol for PyTorch, similar to how it is demoed here.
3) Pass this custom class into DataArray constructors and hope the .grad attribute works.

My first attempts at this haven't been successful. Whatever custom class I make and pass to the DataArray constructor gets converted to something xarray can handle with this line:

https://github.com/pydata/xarray/blob/bc35548d96caaec225be9a26afbbaa94069c9494/xarray/core/dataarray.py#L408

Any suggestions would be appreciated. I'm hoping to figure out the shortest path to a working prototype.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);