issue_comments

49 rows where issue = 482543307 sorted by updated_at descending

user 14

  • rgommers 7
  • Duane321 7
  • fjanoos 6
  • keewis 6
  • tomwhite 4
  • shoyer 4
  • jakirkham 3
  • hsharrison 3
  • jacobtomlinson 2
  • dcherian 2
  • hjalmarlucius 2
  • zaxtax 1
  • jhamman 1
  • andersy005 1

author_association 3

  • NONE 26
  • MEMBER 14
  • CONTRIBUTOR 9

issue 1

  • Use pytorch as backend for xarrays 49
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1190589331 https://github.com/pydata/xarray/issues/3232#issuecomment-1190589331 https://api.github.com/repos/pydata/xarray/issues/3232 IC_kwDOAMm_X85G9vOT jakirkham 3019665 2022-07-20T18:01:56Z 2022-07-20T18:01:56Z NONE

While it is true that using PyTorch Tensors directly would require the Array API to be implemented in PyTorch, one could use them indirectly by converting them zero-copy to CuPy arrays, which do have Array API support.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
1190382681 https://github.com/pydata/xarray/issues/3232#issuecomment-1190382681 https://api.github.com/repos/pydata/xarray/issues/3232 IC_kwDOAMm_X85G88xZ hsharrison 4441865 2022-07-20T14:48:15Z 2022-07-20T14:48:15Z CONTRIBUTOR

Makes sense, then I'll wait for https://github.com/pytorch/pytorch/issues/58743 to try it.

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
1190162973 https://github.com/pydata/xarray/issues/3232#issuecomment-1190162973 https://api.github.com/repos/pydata/xarray/issues/3232 IC_kwDOAMm_X85G8HId tomwhite 85085 2022-07-20T11:35:03Z 2022-07-20T11:35:03Z CONTRIBUTOR

I think it can't be tested with pytorch until they complete pytorch/pytorch#58743, right?

It needs __array_namespace__ to be defined to activate the new code path.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
1190068100 https://github.com/pydata/xarray/issues/3232#issuecomment-1190068100 https://api.github.com/repos/pydata/xarray/issues/3232 IC_kwDOAMm_X85G7v-E hsharrison 4441865 2022-07-20T09:50:59Z 2022-07-20T09:50:59Z CONTRIBUTOR

Nice that it's so simple. I think it can't be tested with pytorch until they complete https://github.com/pytorch/pytorch/issues/58743, right?

Or we should just try passing torch.tensor into xarray directly?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
1189941650 https://github.com/pydata/xarray/issues/3232#issuecomment-1189941650 https://api.github.com/repos/pydata/xarray/issues/3232 IC_kwDOAMm_X85G7RGS tomwhite 85085 2022-07-20T07:45:39Z 2022-07-20T07:45:39Z CONTRIBUTOR

Hi @hsharrison - thanks for offering to do some testing. Here's a little demo script that you could try, by switching numpy.array_api to pytorch: https://github.com/tomwhite/xarray/commit/929812a12818ffaa1187eb860c9b61e3fc03973c

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
1189938517 https://github.com/pydata/xarray/issues/3232#issuecomment-1189938517 https://api.github.com/repos/pydata/xarray/issues/3232 IC_kwDOAMm_X85G7QVV hsharrison 4441865 2022-07-20T07:42:05Z 2022-07-20T07:42:05Z CONTRIBUTOR

Glad to see progress on this!! 👏

Just curious though, seeing this comment in the PR:

Note: I haven't actually tested this with pytorch (which is the motivating example for https://github.com/pydata/xarray/issues/3232).

Are we sure this closes the issue? And, how can we try it out? Even lacking docs, a comment explaining how to set it up would be great, and I can do some testing on my end. I understand that it's an experimental feature.

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 1,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
1187007032 https://github.com/pydata/xarray/issues/3232#issuecomment-1187007032 https://api.github.com/repos/pydata/xarray/issues/3232 IC_kwDOAMm_X85GwEo4 tomwhite 85085 2022-07-18T10:04:29Z 2022-07-18T10:04:29Z CONTRIBUTOR

Opened #6804

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
1183301651 https://github.com/pydata/xarray/issues/3232#issuecomment-1183301651 https://api.github.com/repos/pydata/xarray/issues/3232 IC_kwDOAMm_X85Gh8AT dcherian 2448579 2022-07-13T14:31:55Z 2022-07-13T14:32:01Z MEMBER

I'd be happy to turn this into a PR with some tests.

Absolutely!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
1182978725 https://github.com/pydata/xarray/issues/3232#issuecomment-1182978725 https://api.github.com/repos/pydata/xarray/issues/3232 IC_kwDOAMm_X85GgtKl tomwhite 85085 2022-07-13T09:18:51Z 2022-07-13T09:18:51Z CONTRIBUTOR

I started having a look at making xarray work with the array API here: https://github.com/tomwhite/xarray/commit/c72a1c4a4c52152bdab83f60f35615de28e8be7f. Some basic operations work (preserving the underlying array): https://github.com/tomwhite/xarray/commit/929812a12818ffaa1187eb860c9b61e3fc03973c. If there's interest, I'd be happy to turn this into a PR with some tests.

{
    "total_count": 4,
    "+1": 4,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
1013174167 https://github.com/pydata/xarray/issues/3232#issuecomment-1013174167 https://api.github.com/repos/pydata/xarray/issues/3232 IC_kwDOAMm_X848Y8-X zaxtax 8529 2022-01-14T14:32:49Z 2022-01-14T14:32:49Z NONE

@keewis @shoyer now that numpy has merged __array_namespace__ support in https://github.com/numpy/numpy/pull/18585 and pytorch is in the process of adding __array_namespace__ support in https://github.com/pytorch/pytorch/issues/58743, is it worth exploring adding support through the __array_namespace__ API?

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
851581057 https://github.com/pydata/xarray/issues/3232#issuecomment-851581057 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDg1MTU4MTA1Nw== keewis 14808389 2021-05-31T16:12:35Z 2021-06-01T20:01:07Z MEMBER

changing the xarray internals is not too much work: we need to get xarray.core.utils.is_duck_array to return true if the object has either __array_namespace__ or __array_ufunc__ and __array_function__ (or all three) defined, and we'd need a short test demonstrating that objects that implement only __array_namespace__ survive unchanged when wrapped by a xarray object (i.e. something like isinstance(xr.DataArray(pytorch_object).mean().data, pytorch.Tensor)).

We might still be a bit too early with this, though: the PR which adds __array_namespace__ to numpy has not been merged into numpy:main yet.
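The detection rule described above can be sketched in plain Python. This is only an illustration: the helper name `looks_like_duck_array` and the two toy classes are made up here, and xarray's real `is_duck_array` is not reproduced (it may require further attributes such as `shape` and `dtype`).

```python
def looks_like_duck_array(obj):
    """Hypothetical helper: accept either protocol family described in the comment."""
    has_namespace = hasattr(obj, "__array_namespace__")
    has_numpy_protocols = hasattr(obj, "__array_ufunc__") and hasattr(
        obj, "__array_function__"
    )
    return has_namespace or has_numpy_protocols


class NamespaceOnly:
    """Toy type that implements only the array API entry point."""

    def __array_namespace__(self, api_version=None):
        return None  # a real array type would return its namespace module


class UfuncOnly:
    """Toy type that implements only one of the two NumPy protocols."""

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        return NotImplemented


namespace_ok = looks_like_duck_array(NamespaceOnly())
ufunc_only_ok = looks_like_duck_array(UfuncOnly())
plain_object_ok = looks_like_duck_array(object())
```

Under this rule an object that defines only `__array_namespace__` is accepted, while one that defines `__array_ufunc__` without `__array_function__` is not.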

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
851494928 https://github.com/pydata/xarray/issues/3232#issuecomment-851494928 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDg1MTQ5NDkyOA== hjalmarlucius 35001974 2021-05-31T13:32:29Z 2021-05-31T13:32:29Z NONE

Thanks for the prompt response. Would love to contribute but I have to climb the learning curve first.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
851426576 https://github.com/pydata/xarray/issues/3232#issuecomment-851426576 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDg1MTQyNjU3Ng== keewis 14808389 2021-05-31T11:32:05Z 2021-05-31T11:32:05Z MEMBER

I don't, unfortunately (there's the partial example in https://github.com/pydata/xarray/issues/3232#issuecomment-769789746, though).

This is nothing usable right now, but the pytorch maintainers are currently looking into providing support for __array_namespace__ (NEP47). Once there has been sufficient progress in both numpy and pytorch we don't have to change much in xarray (i.e. allowing __array_namespace__ instead of __array_ufunc__ / __array_function__ for duck arrays) to make this work without any wrapper code.

You (or anyone interested) might still want to maintain a "pytorch-xarray" convenience library to allow something like arr.torch.grad(dim="x").

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
851118675 https://github.com/pydata/xarray/issues/3232#issuecomment-851118675 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDg1MTExODY3NQ== hjalmarlucius 35001974 2021-05-31T02:09:07Z 2021-05-31T02:09:07Z NONE

@Duane321 or @keewis do you have the full code example for making this work? I'm a novice on numpy ufuncs and am trying to get gradients while keeping my xarray coords.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
786599239 https://github.com/pydata/xarray/issues/3232#issuecomment-786599239 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDc4NjU5OTIzOQ== keewis 14808389 2021-02-26T11:47:55Z 2021-02-26T11:48:09Z MEMBER

@Duane321: with xarray>=0.17.0 you should be able to remove the __getattribute__ trick.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
773489462 https://github.com/pydata/xarray/issues/3232#issuecomment-773489462 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDc3MzQ4OTQ2Mg== Duane321 19956442 2021-02-04T17:46:15Z 2021-02-04T17:46:15Z NONE

Thanks again @keewis, that was indeed the case. It was due to my older PyTorch version (1.6.0).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
771066618 https://github.com/pydata/xarray/issues/3232#issuecomment-771066618 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDc3MTA2NjYxOA== keewis 14808389 2021-02-01T18:34:00Z 2021-02-01T23:39:51Z MEMBER

I can't reproduce that:

```python
In [4]: da.loc["a1"]
Out[4]:
<xarray.DataArray (b: 2)>
tensor([0.4793, 0.7493], dtype=torch.float32)
Coordinates:
    a        <U2 'a1'
  * b        (b) <U2 'b1' 'b2'
```

with

```
numpy: 1.19.5
xarray: 0.16.2
pytorch: 1.7.1.post2
pandas: 1.2.1
```

maybe this is an environment issue?

Edit: the missing feature list includes loc (and sel) because it is currently not possible to have a duck array in a dimension coordinate, so this:

```python
xr.DataArray(
    [0, 1, 2],
    coords={"x": XArrayTensor(torch.Tensor([10, 12, 14]))},
    dims="x",
).loc[{"x": XArrayTensor(torch.Tensor([10, 14]))}]
```

does not work, but

```python
xr.DataArray(
    XArrayTensor(torch.Tensor([0, 1, 2])),
    coords={"x": [10, 12, 14]},
    dims="x",
).loc[{"x": [10, 14]}]
```

should work just fine.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
770128996 https://github.com/pydata/xarray/issues/3232#issuecomment-770128996 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDc3MDEyODk5Ng== Duane321 19956442 2021-01-30T01:14:03Z 2021-01-30T01:14:03Z NONE

Thank you very much @keewis - your code did what I was trying to do. big help!

One thing I noticed with the missing features is the following:

This seems like a bit of a problem. Index-based selection is a primary reason to use xarray. If that changes .data to a numpy array, then autodiff-ing through selection does not seem possible. Is there another approach I'm not seeing?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
768529007 https://github.com/pydata/xarray/issues/3232#issuecomment-768529007 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDc2ODUyOTAwNw== Duane321 19956442 2021-01-27T19:39:32Z 2021-01-29T22:37:28Z NONE

I've made some mild progress, but it raises a few questions. I've defined this simple Tensor subclass which meets the duck array criteria:

```
import torch
from typing import Tuple


class XArrayTensor(torch.Tensor):
    def __new__(cls, data=None, requires_grad=False):
        if data is None:
            data = torch.Tensor()
        return torch.Tensor._make_subclass(cls, data, requires_grad)

    def __init__(self, data=None, dims: Tuple[str] = None):
        self.dims = dims

    def __array_function__(self, func, types, args, kwargs):
        # Defer unless the function is mapped and every involved type is a Tensor.
        if func not in IMPLEMENTED_FUNCTIONS or not all(issubclass(t, torch.Tensor) for t in types):
            return NotImplemented
        return IMPLEMENTED_FUNCTIONS[func](*args, **kwargs)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # numpy invokes this hook as (ufunc, method, *inputs, **kwargs).
        if method != "__call__" or ufunc not in IMPLEMENTED_FUNCTIONS:
            return NotImplemented  # the sentinel value, not the NotImplementedError exception
        return IMPLEMENTED_FUNCTIONS[ufunc](*inputs, **kwargs)
```

where IMPLEMENTED_FUNCTIONS holds a mapping from numpy functions to API compatible tensor operators (similar in style to this)

I added a torch_array_type to pycompat.py, which allows DataArray's .data attribute to persist as an XArrayTensor:

```
xr_tsr = XArrayTensor(torch.rand(3, 2))

data_array = xr.DataArray(
    xr_tsr,
    coords=dict(a=["a1", "a2", "a3"], b=["b1", "b1"]),
    dims=["a", "b"],
    name="dummy",
    attrs={"grad": xr_tsr.grad},
)
print(type(data_array.data))  # --> yields 'xarray_tensor.XArrayTensor'
```

The issue I'm running into is when I run an operation like np.mean(data_array). The operation gets dispatched to functions within duck_array_ops.py, which are the things I'd like to override.

Also, I'd like to confirm something. If the API matching were complete, would the following be possible?

```
some_sum = data_array.sum()
some_sum.backward()
data_array.grad  # --> provides the gradient
```

I'm starting to suspect not because that would involve data_array being both DataArray and a Torch.Tensor object. It seems what I'm in fact enabling is that DataArray.data is a Torch.Tensor.

{
    "total_count": 2,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 2,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
769789746 https://github.com/pydata/xarray/issues/3232#issuecomment-769789746 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDc2OTc4OTc0Ng== keewis 14808389 2021-01-29T12:57:37Z 2021-01-29T15:22:01Z MEMBER

I added a torch_array_type to pycompat.py

torch.Tensor defines values, so the issue is this: https://github.com/pydata/xarray/blob/8cc34cb412ba89ebca12fc84f76a9e452628f1bc/xarray/core/variable.py#L221 @shoyer, any ideas?

For now, I guess we can remove it using __getattribute__. With that you will have to cast the data first if you want to access torch.Tensor.values:

```python
torch.Tensor(tensor).values()
```

Not sure if that's the best way, but that would look like this:

pytorch wrapper class

```python
import functools
from typing import Tuple

import numpy as np
import torch
import xarray as xr


def wrap_torch(f):
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        # TODO: use a dict comprehension if there are functions that rely on the order of the parameters
        if "axis" in kwargs:
            kwargs["dim"] = kwargs.pop("axis")  # torch calls that parameter 'dim' instead of 'axis'

        return f(*args, **kwargs)

    return wrapper


class DTypeWrapper:
    def __init__(self, dtype):
        self.dtype = dtype
        if dtype.is_complex:
            self.kind = "c"
        elif dtype.is_floating_point:
            self.kind = "f"
        else:
            # I don't know pytorch at all, so falling back to "i" might not be the best choice
            self.kind = "i"

    def __getattr__(self, name):
        return getattr(self.dtype, name)

    def __repr__(self):
        return repr(self.dtype)


IMPLEMENTED_FUNCTIONS = {
    np.mean: wrap_torch(torch.mean),
    np.nanmean: wrap_torch(torch.mean),  # not sure if pytorch has a separate nanmean function
}


class XArrayTensor(torch.Tensor):
    def __new__(cls, data=None, requires_grad=False):
        if data is None:
            data = torch.Tensor()
        return torch.Tensor._make_subclass(cls, data, requires_grad)

    def __init__(self, data=None, dims: Tuple[str] = None):
        self.dims = dims

    def __array_function__(self, func, types, args, kwargs):
        if func not in IMPLEMENTED_FUNCTIONS or any(not issubclass(t, torch.Tensor) for t in types):
            return NotImplemented
        return IMPLEMENTED_FUNCTIONS[func](*args, **kwargs)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # numpy invokes this hook as (ufunc, method, *inputs, **kwargs)
        if method != "__call__" or ufunc not in IMPLEMENTED_FUNCTIONS:
            return NotImplemented  # the sentinel value, not the NotImplementedError exception
        return IMPLEMENTED_FUNCTIONS[ufunc](*inputs, **kwargs)

    def __getattribute__(self, name):
        if name == "values":
            raise AttributeError(
                "'values' has been removed for compatibility with xarray."
                " To access it, use `torch.Tensor(tensor).values()`."
            )
        return object.__getattribute__(self, name)

    @property
    def shape(self):
        return tuple(super().shape)

    @property
    def dtype(self):
        return DTypeWrapper(super().dtype)


tensor = XArrayTensor(torch.rand(3, 2))
print(tensor)
print(tensor.shape)
print(tensor.dtype)
print(tensor.ndim)

da = xr.DataArray(tensor, coords={"a": ["a1", "a2", "a3"], "b": ["b1", "b2"]}, dims=["a", "b"])
print(da)
print(da.data)
print(da.mean(dim="a"))
```

with that, I can execute mean and get back a torch.Tensor wrapped by a DataArray without modifying the xarray code. For a list of features where duck arrays are not supported, yet, see Working with numpy-like arrays (that list should be pretty complete, but if you think there's something missing please open a new issue).

For np.mean(da): be aware that DataArray does not define __array_function__, yet (see #3917), and that with it you have to fall back to np.mean(da, axis=0) instead of da.mean(dim="a").

If the API matching were complete, would the following be possible?

no, it won't be because this is fragile: any new method of DataArray could shadow the methods of the wrapped object. Also, without tight integration xarray does not know what to do with the result, so you would always get the underlying data instead of a new DataArray.

Instead, we recommend extension packages (extending xarray), so with a hypothetical xarray-pytorch library you would write some_sum.torch.backward() instead of some_sum.backward(). That is a bit more work, but it also gives you a lot more control. For an example, see pint-xarray.
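The namespacing idea behind that recommendation can be shown with a toy sketch in plain Python. Everything here is hypothetical: `Labeled`, `BackendAccessor`, the `backend` property and `double` are invented for illustration and are not xarray or pint-xarray APIs. The point is only that backend-specific methods live under one attribute, so they cannot be shadowed by methods of the container, and that each method re-wraps its result so the labels are kept.

```python
class BackendAccessor:
    """Toy accessor: groups backend-specific operations under one attribute."""

    def __init__(self, container):
        self._container = container

    def double(self):
        # Operate on the wrapped data, then re-wrap so the labels survive.
        new_data = [2 * value for value in self._container.data]
        return Labeled(new_data, self._container.dims)


class Labeled:
    """Toy labeled container; a stand-in for illustration, not xarray's DataArray."""

    def __init__(self, data, dims):
        self.data = data
        self.dims = dims

    @property
    def backend(self):
        # A fresh accessor bound to this container on every access.
        return BackendAccessor(self)


original = Labeled([1, 2, 3], ("x",))
doubled = original.backend.double()
```

With this layout, adding a new method to `Labeled` later cannot change what `original.backend.double()` means.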

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
769656592 https://github.com/pydata/xarray/issues/3232#issuecomment-769656592 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDc2OTY1NjU5Mg== rgommers 98330 2021-01-29T08:26:23Z 2021-01-29T08:26:23Z NONE

I'm starting to suspect not because that would involve data_array being both DataArray and a Torch.Tensor object. It seems what I'm in fact enabling is that DataArray.data is a Torch.Tensor.

some_sum is still a DataArray, which doesn't have a backward method. You could use

```python
data_array = xr.DataArray(
    xr_tsr,
    coords=dict(a=["a1", "a2", "a3"], b=["b1", "b1"]),
    dims=["a", "b"],
    name="dummy",
    attrs={"grad": xr_tsr.grad, "backward": xr_tsr.backward},
)
```

and your example should work (I assume you meant .grad not .grid).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
766669784 https://github.com/pydata/xarray/issues/3232#issuecomment-766669784 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDc2NjY2OTc4NA== rgommers 98330 2021-01-25T09:12:51Z 2021-01-25T09:12:51Z NONE

Does this mean I shouldn't fill out __array_function__ in my subclass? Or is this just a forward looking expectation?

No, adding it should be perfectly fine. The dispatch mechanism itself isn't going anywhere, it's part of numpy and it works. Whether or not torch.Tensor itself has an __array_function__ method isn't too relevant for your subclass.
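For readers unfamiliar with the dispatch mechanism mentioned here, the following is a minimal sketch of how NumPy hands a call to an object that defines `__array_function__`. The class `TinyDuckArray`, the `HANDLED` registry and the `implements` decorator are illustrative names, not part of NumPy, PyTorch or xarray; it assumes a NumPy version in which the protocol is enabled by default (1.17 or later).

```python
import numpy as np

HANDLED = {}  # maps NumPy functions to our own implementations


def implements(np_func):
    """Register an implementation for a NumPy function (hypothetical helper)."""

    def decorator(func):
        HANDLED[np_func] = func
        return func

    return decorator


class TinyDuckArray:
    """Toy container that opts in to NumPy's __array_function__ protocol."""

    def __init__(self, items):
        self.items = list(items)

    def __array_function__(self, func, types, args, kwargs):
        # Defer (return NotImplemented) for anything we have not mapped.
        if func not in HANDLED or not all(issubclass(t, TinyDuckArray) for t in types):
            return NotImplemented
        return HANDLED[func](*args, **kwargs)


@implements(np.mean)
def _mean(arr):
    return sum(arr.items) / len(arr.items)


mean_result = np.mean(TinyDuckArray([1.0, 2.0, 3.0]))

# np.sum is not registered, so NumPy finds no implementation and raises TypeError.
try:
    np.sum(TinyDuckArray([1.0, 2.0]))
    unhandled_raised = False
except TypeError:
    unhandled_raised = True
```

As the surrounding comments note, the dispatch itself is the small part; the bulk of the work is filling the registry with functions whose signatures really match NumPy's.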

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
766470557 https://github.com/pydata/xarray/issues/3232#issuecomment-766470557 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDc2NjQ3MDU1Nw== keewis 14808389 2021-01-25T00:33:35Z 2021-01-25T00:33:35Z MEMBER

Looks like you need to patch that internally just a bit, probably adding pytorch to NON_NUMPY_SUPPORTED_ARRAY_TYPES.

defining __array_function__ (and the other properties listed in the docs) should be enough: https://github.com/pydata/xarray/blob/a0c71c1508f34345ad7eef244cdbbe224e031c1b/xarray/core/variable.py#L232-L235

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
766466486 https://github.com/pydata/xarray/issues/3232#issuecomment-766466486 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDc2NjQ2NjQ4Ng== Duane321 19956442 2021-01-25T00:13:53Z 2021-01-25T00:14:11Z NONE

Note that the main work in adding __array_function__ is not the dispatch mechanism, but mapping to 100% compatible APIs. That job should have gotten a lot easier now compared to 9 months ago. PyTorch now has a completely matching fft module, and a ~70% complete linalg module in master. And functions in the main namespace have gained dtype keywords, integer-to-float promotion, and other NumPy compat changes. So it should be feasible to write your custom subclass.

Glad to hear there's progress I can lean on. I'll come back with a minimum version that does the API matching for maybe 1-2 methods, just to get feedback on the overall structure. If it works, I can brute through a lot of the rest 🤞

Looks like you need to patch that internally just a bit, probably adding pytorch to NON_NUMPY_SUPPORTED_ARRAY_TYPES.

Thank you, I hesitate to change xarray code but not anymore.

Note that I do not expect anymore that we'll be adding __array_function__ to torch.Tensor, and certainly not any time soon. My current expectation is that the "get the correct namespace from an array/tensor object directly" from https://numpy.org/neps/nep-0037-array-module.html#how-to-use-get-array-module and https://data-apis.github.io/array-api/latest/ will turn out to be a much better design long-term.

Does this mean I shouldn't fill out __array_function__ in my subclass? Or is this just a forward looking expectation?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
766464095 https://github.com/pydata/xarray/issues/3232#issuecomment-766464095 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDc2NjQ2NDA5NQ== Duane321 19956442 2021-01-25T00:00:46Z 2021-01-25T00:00:46Z NONE

While it would be fantastic to have gpu-enabled auto-diff-able xarrays / DataArrays, an interesting development worth looking into is the named tensor feature in https://pytorch.org/docs/stable/named_tensor.html. This appears to be an attempt to bridge the gap, in that they are making pytorch tensors increasingly DataArray-like. I would not be surprised if within the next few iterations they add indexes to the tensors, closing the gap even further.

I really hope so. I explored named_tensors at first, but the lack of an index for each dimension was a non-starter. So, I'll keep an eye out.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
766090834 https://github.com/pydata/xarray/issues/3232#issuecomment-766090834 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDc2NjA5MDgzNA== fjanoos 923438 2021-01-23T14:50:04Z 2021-01-23T14:50:04Z NONE

@Duane321 While it would be fantastic to have gpu-enabled auto-diff-able xarrays / DataArrays, an interesting development worth looking into is the named tensor feature in https://pytorch.org/docs/stable/named_tensor.html. This appears to be an attempt to bridge the gap, in that they are making pytorch tensors increasingly DataArray-like. I would not be surprised if within the next few iterations they add indexes to the tensors, closing the gap even further.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
765906982 https://github.com/pydata/xarray/issues/3232#issuecomment-765906982 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDc2NTkwNjk4Mg== rgommers 98330 2021-01-23T11:12:59Z 2021-01-23T11:12:59Z NONE

Note that the main work in adding __array_function__ is not the dispatch mechanism, but mapping to 100% compatible APIs. That job should have gotten a lot easier now compared to 9 months ago. PyTorch now has a completely matching fft module, and a ~70% complete linalg module in master. And functions in the main namespace have gained dtype keywords, integer-to-float promotion, and other NumPy compat changes. So it should be feasible to write your custom subclass.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
765905229 https://github.com/pydata/xarray/issues/3232#issuecomment-765905229 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDc2NTkwNTIyOQ== rgommers 98330 2021-01-23T10:57:48Z 2021-01-23T11:09:52Z NONE

Create a custom subclass of PyTorch's Tensors which meets the duck array required methods and attributes. Since this isn't officially supported, looks like I could run into issues getting this subclass to persist through tensor operations.

If you use PyTorch 1.7.1 or later, then Tensor subclasses are much better preserved through pytorch functions and operations like slicing. So a custom subclass, adding the attributes and methods Xarray requires for a duck array should be feasible.

data = as_compatible_data(data)

Looks like you need to patch that internally just a bit, probably adding pytorch to NON_NUMPY_SUPPORTED_ARRAY_TYPES.

Note that I do not expect anymore that we'll be adding __array_function__ to torch.Tensor, and certainly not any time soon. My current expectation is that the "get the correct namespace from an array/tensor object directly" from https://numpy.org/neps/nep-0037-array-module.html#how-to-use-get-array-module and https://data-apis.github.io/array-api/latest/ will turn out to be a much better design long-term.
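A rough sketch of the "get the namespace from the array object" design, in plain Python. The `ToyArray` class, `toy_namespace` and `generic_mean` are invented for illustration; a real array library would return its own module of array API functions from `__array_namespace__`.

```python
import types


def _toy_mean(x):
    return sum(x.items) / len(x.items)


# Stand-in for a library's array API namespace (normally a module).
toy_namespace = types.SimpleNamespace(mean=_toy_mean)


class ToyArray:
    """Toy array type exposing the array API entry point."""

    def __init__(self, items):
        self.items = list(items)

    def __array_namespace__(self, api_version=None):
        return toy_namespace


def generic_mean(x):
    # Library-agnostic code asks the array which namespace to use,
    # instead of relying on NumPy functions being overridden.
    xp = x.__array_namespace__()
    return xp.mean(x)


namespace_result = generic_mean(ToyArray([2.0, 4.0]))
```

The consuming code never imports the array library; it only calls functions found on the namespace the array itself hands back.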

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
765738462 https://github.com/pydata/xarray/issues/3232#issuecomment-765738462 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDc2NTczODQ2Mg== Duane321 19956442 2021-01-22T23:16:49Z 2021-01-22T23:16:49Z NONE

No one is working on __array_function__ at the moment. Implementing it has some backwards compat concerns as well, because people may be relying on np.somefunc(some_torch_tensor) to be coerced to ndarray. It's not a small project, but implementing a prototype with a few functions in the torch namespace that are not exactly matching the NumPy API would be a useful way to start pushing this forward.

@rgommers Do you expect this solution to work with a PyTorch Tensor custom subclass? Or is monkey patching necessary?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
765710268 https://github.com/pydata/xarray/issues/3232#issuecomment-765710268 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDc2NTcxMDI2OA== Duane321 19956442 2021-01-22T22:04:20Z 2021-01-22T22:14:50Z NONE

I'd like to cast my vote in favor of getting this functionality in. It would be nice to autodiff through xarray operations.

From reading this and related threads, I'm trying to determine a gameplan to make this happen. I'm not familiar with xarray code, so any guidance would be much appreciated. This is what I'm thinking:

1) Create a custom subclass of PyTorch's Tensors which meets the duck array required methods and attributes. Since this isn't officially supported, looks like I could run into issues getting this subclass to persist through tensor operations.
2) Implement the __array_function__ protocol for PyTorch similar to how is demo-ed here.
3) Pass this custom class into data array constructors and hope the .grad attribute works.

My first attempts at this haven't been successful. Whatever custom class I make and pass to the DataArray constructor gets converted to something xarray can handle with this line:

https://github.com/pydata/xarray/blob/bc35548d96caaec225be9a26afbbaa94069c9494/xarray/core/dataarray.py#L408

Any suggestions would be appreciated. I'm hoping to figure out the shortest path to a working prototype.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
656627686 https://github.com/pydata/xarray/issues/3232#issuecomment-656627686 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDY1NjYyNzY4Ng== jacobtomlinson 1610850 2020-07-10T11:30:36Z 2020-07-10T11:30:36Z CONTRIBUTOR

@fjanoos I'm afraid I don't. In RAPIDS we support cupy as our GPU array implementation. So this request has come from the desire to make xarray compatible with the RAPIDS suite of tools.

We commonly see folks using cupy to switch straight over to a tool like pytorch using DLPack. https://docs-cupy.chainer.org/en/stable/reference/interoperability.html#dlpack
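For readers who have not used DLPack, here is a CPU-only sketch of the hand-off. It assumes NumPy 1.22 or newer, where `np.from_dlpack` is available, and uses NumPy on both sides as a stand-in; `share_via_dlpack` is a hypothetical helper name. To my knowledge the analogous entry points on the GPU side are `cupy.from_dlpack` and `torch.from_dlpack`, but those are not exercised here.

```python
import numpy as np


def share_via_dlpack(producer):
    """Import `producer` through the DLPack protocol without copying."""
    # Any object implementing __dlpack__ and __dlpack_device__ can be passed.
    return np.from_dlpack(producer)


producer = np.arange(6, dtype=np.float32)
consumer = share_via_dlpack(producer)

# Both names refer to the same underlying buffer.
same_buffer = np.shares_memory(producer, consumer)
```

The point of the protocol is that the consumer gets a view onto the producer's memory, so switching libraries does not cost a copy.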

But I don't really see #4212 as an effort to make cupy the GPU backend for xarray. I see it as adding support for another backend to xarray. The more the merrier!

{
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
656372249 https://github.com/pydata/xarray/issues/3232#issuecomment-656372249 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDY1NjM3MjI0OQ== fjanoos 923438 2020-07-09T22:01:25Z 2020-07-09T22:02:30Z NONE

@andersy005 I'm about to start working actively on cupy support in xarray. Would be great to get some of your input.

Cupy requests that instead of calling __array__ you instead call their .get method for explicit conversion to numpy. So we need to add a little compatibility code for this.

Do you have a sense of the overhead / effort of making jax vs cupy the GPU backend for xarray? One advantage of jax would be built-in auto-diff functionality that would enable xarray to be plugged directly into deep learning pipelines. The downside is that it is not as numpy-compatible as cupy. How much of a non-starter would this be?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
656178897 https://github.com/pydata/xarray/issues/3232#issuecomment-656178897 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDY1NjE3ODg5Nw== jacobtomlinson 1610850 2020-07-09T14:58:40Z 2020-07-09T14:58:40Z CONTRIBUTOR

@andersy005 I'm about to start working actively on cupy support in xarray. Would be great to get some of your input.

Cupy requests that instead of calling __array__ you instead call their .get method for explicit conversion to numpy. So we need to add a little compatibility code for this.
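A rough sketch of what such compatibility code could look like. This is not xarray's implementation: `to_numpy` is a hypothetical helper, and `ExplicitOnlyArray` is a hypothetical stand-in that mimics an array type which refuses implicit conversion and offers `.get()` instead.

```python
import numpy as np


class ExplicitOnlyArray:
    """Hypothetical stand-in for a GPU array that forbids implicit conversion."""

    def __init__(self, data):
        self._host_copy = np.asarray(data)

    def __array__(self, dtype=None, copy=None):
        raise TypeError("implicit conversion to NumPy is not allowed; call .get()")

    def get(self):
        # Explicit device-to-host transfer in a real GPU library.
        return self._host_copy


def to_numpy(data):
    """Convert to numpy.ndarray, preferring an explicit .get() when present."""
    # Assumption: a callable `.get` on a non-ndarray means "copy to host".
    if not isinstance(data, np.ndarray) and callable(getattr(data, "get", None)):
        return np.asarray(data.get())
    return np.asarray(data)
```

The duck-typed check on `.get` is deliberately naive (a `dict` would also match), so a real version would likely need a stricter test for the array type.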

{
    "total_count": 4,
    "+1": 4,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
655751621 https://github.com/pydata/xarray/issues/3232#issuecomment-655751621 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDY1NTc1MTYyMQ== andersy005 13301940 2020-07-08T20:54:15Z 2020-07-08T20:54:15Z MEMBER

@jacobtomlinson gave CuPy a go a few months back. I seem to remember that he ran into a few problems but it would be good to get those documented here.

I've been test driving xarray objects backed by CuPy arrays, and one issue I keep running into is that operations (such as plotting) that expect NumPy arrays fail due to xarray's implicit conversion to NumPy arrays via np.asarray(). CuPy decided not to allow implicit conversion to NumPy arrays (see https://github.com/cupy/cupy/pull/3421).

I am wondering whether there is a plan for dealing with this issue?

Here's a small, reproducible example:

```python

  <CUDA Device 0>

[24]: ds.isel(time=0, lev=0).tmin.plot() # Fails
```

Traceback

```python
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-21-69a72de2b9fd> in <module>
----> 1 ds.isel(time=0, lev=0).tmin.plot()

/glade/work/abanihi/softwares/miniconda3/envs/rapids/lib/python3.7/site-packages/xarray/plot/plot.py in __call__(self, **kwargs)
    444
    445     def __call__(self, **kwargs):
--> 446         return plot(self._da, **kwargs)
    447
    448     @functools.wraps(hist)

/glade/work/abanihi/softwares/miniconda3/envs/rapids/lib/python3.7/site-packages/xarray/plot/plot.py in plot(darray, row, col, col_wrap, ax, hue, rtol, subplot_kws, **kwargs)
    198     kwargs["ax"] = ax
    199
--> 200     return plotfunc(darray, **kwargs)
    201
    202

/glade/work/abanihi/softwares/miniconda3/envs/rapids/lib/python3.7/site-packages/xarray/plot/plot.py in newplotfunc(darray, x, y, figsize, size, aspect, ax, row, col, col_wrap, xincrease, yincrease, add_colorbar, add_labels, vmin, vmax, cmap, center, robust, extend, levels, infer_intervals, colors, subplot_kws, cbar_ax, cbar_kwargs, xscale, yscale, xticks, yticks, xlim, ylim, norm, **kwargs)
    684
    685         # Pass the data as a masked ndarray too
--> 686         zval = darray.to_masked_array(copy=False)
    687
    688         # Replace pd.Intervals if contained in xval or yval.

/glade/work/abanihi/softwares/miniconda3/envs/rapids/lib/python3.7/site-packages/xarray/core/dataarray.py in to_masked_array(self, copy)
   2325             Masked where invalid values (nan or inf) occur.
   2326         """
-> 2327         values = self.values  # only compute lazy arrays once
   2328         isnull = pd.isnull(values)
   2329         return np.ma.MaskedArray(data=values, mask=isnull, copy=copy)

/glade/work/abanihi/softwares/miniconda3/envs/rapids/lib/python3.7/site-packages/xarray/core/dataarray.py in values(self)
    556     def values(self) -> np.ndarray:
    557         """The array's data as a numpy.ndarray"""
--> 558         return self.variable.values
    559
    560     @values.setter

/glade/work/abanihi/softwares/miniconda3/envs/rapids/lib/python3.7/site-packages/xarray/core/variable.py in values(self)
    444     def values(self):
    445         """The variable's data as a numpy.ndarray"""
--> 446         return _as_array_or_item(self._data)
    447
    448     @values.setter

/glade/work/abanihi/softwares/miniconda3/envs/rapids/lib/python3.7/site-packages/xarray/core/variable.py in _as_array_or_item(data)
    247     TODO: remove this (replace with np.asarray) once these issues are fixed
    248     """
--> 249     data = np.asarray(data)
    250     if data.ndim == 0:
    251         if data.dtype.kind == "M":

/glade/work/abanihi/softwares/miniconda3/envs/rapids/lib/python3.7/site-packages/numpy/core/_asarray.py in asarray(a, dtype, order)
     83
     84     """
---> 85     return array(a, dtype, copy=False, order=order)
     86
     87

ValueError: object __array__ method not producing an array
```
{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
606354369 https://github.com/pydata/xarray/issues/3232#issuecomment-606354369 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDYwNjM1NDM2OQ== jakirkham 3019665 2020-03-31T02:07:47Z 2020-03-31T02:07:47Z NONE

Well here's a blogpost on using Dask + CuPy. Maybe start there and build up to using Xarray.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
606322579 https://github.com/pydata/xarray/issues/3232#issuecomment-606322579 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDYwNjMyMjU3OQ== fjanoos 923438 2020-03-31T00:24:06Z 2020-03-31T00:24:06Z NONE

If you have any pointers on how to go about this - I can give it a try.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
606262540 https://github.com/pydata/xarray/issues/3232#issuecomment-606262540 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDYwNjI2MjU0MA== jakirkham 3019665 2020-03-30T21:31:18Z 2020-03-30T21:31:18Z NONE

Yeah Jacob and I played with this a few months back. There were some issues, but my recollection is pretty hazy. If someone gives this another try, it would be interesting to hear how things go.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
606230158 https://github.com/pydata/xarray/issues/3232#issuecomment-606230158 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDYwNjIzMDE1OA== jhamman 2443309 2020-03-30T20:27:32Z 2020-03-30T20:27:32Z MEMBER

@jacobtomlinson gave CuPy a go a few months back. I seem to remember that he ran into a few problems but it would be good to get those documented here.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
606228143 https://github.com/pydata/xarray/issues/3232#issuecomment-606228143 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDYwNjIyODE0Mw== dcherian 2448579 2020-03-30T20:24:08Z 2020-03-30T20:24:08Z MEMBER

Just chiming in quickly. I think there's definitely interest in doing this through NEP-18.

It looks like CuPy has implemented __array_function__ (https://docs-cupy.chainer.org/en/stable/reference/interoperability.html) so many things may "just work". There was some work earlier on plugging in pydata/sparse, and there is some ongoing work to plug in pint. With both these efforts, a lot of xarray's code should be "backend-agnostic", but it's not perfect.

Have you tried creating DataArrays with cupy arrays yet? I would just try things and see what works vs what doesn't.

Practically, our approach so far has been to add a number of xfailed tests (test_sparse.py and test_units.py) and slowly start fixing them. So that's one way to proceed if you're up for it.
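As an illustration of that workflow, here is a small self-contained sketch. It uses the standard library's `unittest.expectedFailure` in the role that pytest's xfail marker plays in a real test suite, and `MinimalDuck` is a hypothetical array type that only overrides `np.sum`. None of this is taken from test_sparse.py or test_units.py.

```python
import io
import unittest

import numpy as np


class MinimalDuck:
    """Hypothetical duck array that overrides only np.sum."""

    def __init__(self, data):
        self._data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        if func is np.sum:
            return self._data.sum()
        return NotImplemented  # NumPy turns this into a TypeError


class TestMinimalDuck(unittest.TestCase):
    def test_sum_is_supported(self):
        self.assertEqual(np.sum(MinimalDuck([1, 2, 3])), 6)

    @unittest.expectedFailure  # known gap; drop the marker once np.mean works
    def test_mean_is_supported(self):
        self.assertEqual(np.mean(MinimalDuck([1, 2, 3])), 2.0)


def run_suite():
    """Run the tests quietly and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestMinimalDuck)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```

The list of expected failures then doubles as a to-do list: each marker that can be removed is one more operation the backend supports.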

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
606216839 https://github.com/pydata/xarray/issues/3232#issuecomment-606216839 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDYwNjIxNjgzOQ== fjanoos 923438 2020-03-30T20:05:24Z 2020-03-30T20:05:24Z NONE

This might be a good time to revive this thread and see if there is wider interest (and bandwidth) in having xarray use CuPy (https://cupy.chainer.org/) as a backend (along with numpy). It appears to be a plug-and-play replacement for numpy, so it might not have all the issues that were brought up regarding pytorch/jax?

Any thoughts? cc @mrocklin

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
524420000 https://github.com/pydata/xarray/issues/3232#issuecomment-524420000 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDUyNDQyMDAwMA== shoyer 1217238 2019-08-23T18:38:19Z 2019-08-23T18:38:19Z MEMBER

I have not thought too much about these yet. But I agree that they will probably require backend specific logic to do efficiently.

On Fri, Aug 23, 2019 at 12:13 PM firdaus janoos notifications@github.com wrote:

While it is pretty straightforward to implement a lot of standard xarray operations with a pytorch / Jax backend (since they just fallback on native functions) - it will be interesting to think about how to implement rolling operations / expanding / exponential window in a way that is both efficient and maintains differentiability.

Expanding and exponential window operations would be easy to do leveraging RNN semantics - but doing rolling using convolutions is going to be very inefficient.

Do you have any thoughts on this?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
524411995 https://github.com/pydata/xarray/issues/3232#issuecomment-524411995 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDUyNDQxMTk5NQ== fjanoos 923438 2019-08-23T18:13:35Z 2019-08-23T18:13:35Z NONE

While it is pretty straightforward to implement a lot of standard xarray operations with a pytorch / Jax backend (since they just fallback on native functions) - it will be interesting to think about how to implement rolling operations / expanding / exponential window in a way that is both efficient and maintains differentiability.

Expanding and exponential window operations would be easy to do leveraging RNN semantics - but doing rolling using convolutions is going to be very inefficient.
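To make the two families of operations concrete, here is a NumPy-only sketch. It assumes 1-D input and says nothing about differentiability or GPU efficiency, which is the open question; `rolling_sum` and `ewm_mean` are hypothetical helper names. The rolling sum uses a cumulative-sum difference rather than a convolution, and the exponential window is written as the recurrence that an RNN-style scan would compute.

```python
import numpy as np


def rolling_sum(x, window):
    """Trailing-window sum over a 1-D array via a cumulative sum."""
    x = np.asarray(x, dtype=float)
    # Prepend 0 so that c[i] holds the sum of the first i elements.
    c = np.cumsum(np.concatenate(([0.0], x)))
    return c[window:] - c[:-window]


def ewm_mean(x, alpha):
    """Exponentially weighted mean: y[t] = alpha*x[t] + (1 - alpha)*y[t-1]."""
    x = np.asarray(x, dtype=float)
    y = np.empty_like(x)
    y[0] = x[0]
    for t in range(1, len(x)):
        y[t] = alpha * x[t] + (1 - alpha) * y[t - 1]
    return y
```

The cumulative-sum trick only covers sums and means and can lose precision on long series, so it is one possible alternative to convolutions rather than a general answer for rolling reductions.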

Do you have any thoughts on this?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
524403160 https://github.com/pydata/xarray/issues/3232#issuecomment-524403160 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDUyNDQwMzE2MA== shoyer 1217238 2019-08-23T17:45:54Z 2019-08-23T17:45:54Z MEMBER

Within a jit compiled function, JAX's execution speed should be quite competitive on GPUs. It uses the XLA compiler, which was recently enabled by default in TensorFlow.

For data loading and deep learning algorithms, take a look at the examples in the notebooks directory in the JAX repo. The APIs for deep learning in JAX are still undergoing rapid development, so APIs are not quite as stable/usable as pytorch or keras yet, but they are quite capable. See jax.experimental.stax and tensor2tensor.trax for examples.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
524348393 https://github.com/pydata/xarray/issues/3232#issuecomment-524348393 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDUyNDM0ODM5Mw== fjanoos 923438 2019-08-23T15:00:02Z 2019-08-23T15:00:02Z NONE

I haven't used JAX - but was just browsing through its documentation and it looks super cool. Any ideas on how it compares with Pytorch in terms of:

a) Execution speed, esp. on GPU
b) Memory management on GPUs. Pytorch has the 'Dataloader/Dataset' paradigm which uses background multithreading to shuttle batches of data back and forth, along with a lot of tips and tricks on efficient memory usage.
c) Support for deep-learning optimization algorithms?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
523101805 https://github.com/pydata/xarray/issues/3232#issuecomment-523101805 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDUyMzEwMTgwNQ== rgommers 98330 2019-08-20T16:53:40Z 2019-08-20T16:53:40Z NONE

This is a definite downside of reusing NumPy's existing namespace.

We didn't discuss an alternative very explicitly I think, but at least we'll have wide adoption fast. Hopefully the pain is limited ....

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
522884516 https://github.com/pydata/xarray/issues/3232#issuecomment-522884516 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDUyMjg4NDUxNg== shoyer 1217238 2019-08-20T07:07:18Z 2019-08-20T07:07:18Z MEMBER

Implementing it has some backwards compat concerns as well, because people may be relying on np.somefunc(some_torch_tensor) to be coerced to ndarray.

Yes, this is a concern for JAX as well. This is a definite downside of reusing NumPy's existing namespace.

It turns out even xarray was relying on this behavior with dask in at least one edge case: https://github.com/pydata/xarray/issues/3215

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
522824647 https://github.com/pydata/xarray/issues/3232#issuecomment-522824647 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDUyMjgyNDY0Nw== rgommers 98330 2019-08-20T02:18:59Z 2019-08-20T02:18:59Z NONE

Personally, I think the most viable way to achieve seamless integration with deep learning libraries would be to support integration with JAX, which already implements NumPy's API almost exactly.

Less familiar with that, but pytorch does have experimental XLA support, so that's a start.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
522824210 https://github.com/pydata/xarray/issues/3232#issuecomment-522824210 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDUyMjgyNDIxMA== rgommers 98330 2019-08-20T02:16:32Z 2019-08-20T02:16:32Z NONE

I think there has been some discussion about this, but I don't know the current status (CC @rgommers).

The PyTorch team is definitely receptive to the idea of adding __array_function__ and __array_ufunc__, as well as expanding the API for better NumPy compatibility.

Also, they want a Tensor.__torch_function__ styled after __array_function__ so they can make their own API overridable.

The tracking issue for all of this is https://github.com/pytorch/pytorch/issues/22402

The biggest challenge for pytorch would be defining the translation layer that implements NumPy's API.

Agreed. No one is working on __array_function__ at the moment. Implementing it has some backwards compat concerns as well, because people may be relying on np.somefunc(some_torch_tensor) to be coerced to ndarray. It's not a small project, but implementing a prototype with a few functions in the torch namespace that are not exactly matching the NumPy API would be a useful way to start pushing this forward.
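The backwards compat concern can be shown with a small sketch. Both classes are hypothetical stand-ins, not torch types: `CoercedTensor` only exposes `__array__`, so NumPy functions coerce it to ndarray, while `DispatchingTensor` adds `__array_function__` and thereby changes what the very same calls return.

```python
import numpy as np


class CoercedTensor:
    """Hypothetical tensor exposing only __array__ (the coercion path)."""

    def __init__(self, data):
        self._data = list(data)

    def __array__(self, dtype=None, copy=None):
        return np.array(self._data, dtype=dtype)


class DispatchingTensor(CoercedTensor):
    """Same tensor after adding an __array_function__ override for np.sum."""

    def __array_function__(self, func, types, args, kwargs):
        if func is np.sum:
            return DispatchingTensor([sum(self._data)])
        return NotImplemented


# Before: NumPy coerces the argument and returns a NumPy scalar.
before = np.sum(CoercedTensor([1, 2, 3]))

# After: the same call returns the tensor type instead.
after = np.sum(DispatchingTensor([1, 2, 3]))
```

Functions without an override are affected too: `np.mean(CoercedTensor([1, 2, 3]))` works through coercion, whereas `np.mean(DispatchingTensor([1, 2, 3]))` raises `TypeError`, so code that relied on the old behaviour would break.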

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
522820303 https://github.com/pydata/xarray/issues/3232#issuecomment-522820303 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDUyMjgyMDMwMw== shoyer 1217238 2019-08-20T01:55:46Z 2019-08-20T01:55:46Z MEMBER

If pytorch implements overrides of NumPy's API via the __array_function__ protocol, then this could work with minimal effort. We are already using this to support sparse arrays (this isn't an official release yet, but functionality is working in the development version).

I think there has been some discussion about this, but I don't know the current status (CC @rgommers). The biggest challenge for pytorch would be defining the translation layer that implements NumPy's API.

Personally, I think the most viable way to achieve seamless integration with deep learning libraries would be to support integration with JAX, which already implements NumPy's API almost exactly. I have an experimental pull request adding __array_function__ to JAX, but it still needs a bit of work to finish it up, e.g., we probably want to hide this behind a flag at first.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);