issue_comments

13 rows where user = 923438 sorted by updated_at descending

issue 4

  • Use pytorch as backend for xarrays 6
  • How should xarray use/support sparse arrays? 3
  • Sparse arrays 2
  • merge_asof functionality 2

user 1

  • fjanoos · 13 ✖

author_association 1

  • NONE 13
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
766090834 https://github.com/pydata/xarray/issues/3232#issuecomment-766090834 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDc2NjA5MDgzNA== fjanoos 923438 2021-01-23T14:50:04Z 2021-01-23T14:50:04Z NONE

@Duane321 While it would be fantastic to have GPU-enabled, auto-differentiable xarrays / DataArrays, an interesting development worth looking into is the named tensor feature described in https://pytorch.org/docs/stable/named_tensor.html. This appears to be an attempt to bridge the gap from the other side, in that they are making PyTorch tensors increasingly DataArray-like. I would not be surprised if, within the next few iterations, they add indexes to the tensors, closing the gap even further.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
656372249 https://github.com/pydata/xarray/issues/3232#issuecomment-656372249 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDY1NjM3MjI0OQ== fjanoos 923438 2020-07-09T22:01:25Z 2020-07-09T22:02:30Z NONE

@andersy005 I'm about to start working actively on cupy support in xarray. Would be great to get some of your input.

Cupy requests that instead of calling __array__ you instead call their .get method for explicit conversion to numpy. So we need to add a little compatibility code for this.

Do you have a sense of the overhead / effort of using JAX versus CuPy as the GPU backend for xarray? One advantage of JAX would be its built-in auto-diff functionality, which would enable xarray to be plugged directly into deep learning pipelines. The downside is that it is not as numpy-compatible as CuPy. How much of a non-starter would this be?
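
As a rough illustration of the compatibility code mentioned above, the explicit conversion could be sketched like this. The helper name `to_host` and the `FakeDeviceArray` stand-in are hypothetical and are not part of xarray or CuPy; the only library behavior assumed is that CuPy arrays expose a `.get()` method that returns a numpy array.

```python
def to_host(array):
    """Return a host-side (CPU) copy of an array-like object.

    Hypothetical helper: prefers an explicit .get() transfer when the
    object provides one, and otherwise falls back to numpy.asarray.
    """
    get = getattr(array, "get", None)
    if callable(get):
        # CuPy-style arrays: explicit device-to-host copy.
        return get()
    # Imported lazily so the sketch loads even where numpy is absent.
    import numpy as np
    return np.asarray(array)


class FakeDeviceArray:
    """Stand-in for a GPU array, used only to exercise the helper."""

    def __init__(self, data):
        self._data = list(data)

    def get(self):
        # A real device array would copy from GPU memory here.
        return list(self._data)


host_copy = to_host(FakeDeviceArray([1, 2, 3]))
```

Checking for `.get` by name is deliberately loose: it only makes sense when the inputs are known to be arrays, since other objects such as dicts also define a `get` method.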

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
606322579 https://github.com/pydata/xarray/issues/3232#issuecomment-606322579 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDYwNjMyMjU3OQ== fjanoos 923438 2020-03-31T00:24:06Z 2020-03-31T00:24:06Z NONE

If you have any pointers on how to go about this - I can give it a try.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
606216839 https://github.com/pydata/xarray/issues/3232#issuecomment-606216839 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDYwNjIxNjgzOQ== fjanoos 923438 2020-03-30T20:05:24Z 2020-03-30T20:05:24Z NONE

This might be a good time to revive this thread and see if there is wider interest (and bandwidth) in having xarray use CuPy (https://cupy.chainer.org/) as a backend (along with numpy). It appears to be a plug-and-play replacement for numpy, so it might not have all the issues that were brought up regarding pytorch/jax?

Any thoughts? cc @mrocklin

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
526747770 https://github.com/pydata/xarray/issues/3213#issuecomment-526747770 https://api.github.com/repos/pydata/xarray/issues/3213 MDEyOklzc3VlQ29tbWVudDUyNjc0Nzc3MA== fjanoos 923438 2019-08-30T20:57:54Z 2019-08-30T20:57:54Z NONE

Thanks.

That solved that error but introduced another one.

Specifically - this is my dataframe

and this is the error that I get with sparse=True

My numpy version is definitely above 1.16

I also set this os.environ["NUMPY_EXPERIMENTAL_ARRAY_FUNCTION"]='1' just in case

Furthermore, I don't get this error when I don't set sparse=True ( I just get OOM errors but that's another matter) ...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  How should xarray use/support sparse arrays? 479942077
526733257 https://github.com/pydata/xarray/issues/3213#issuecomment-526733257 https://api.github.com/repos/pydata/xarray/issues/3213 MDEyOklzc3VlQ29tbWVudDUyNjczMzI1Nw== fjanoos 923438 2019-08-30T20:10:43Z 2019-08-30T20:10:43Z NONE

I cloned the master branch and installed it using 'python setup.py develop'.

When I try to use the sparse data loading functionality, as in oo = xa.Dataset.from_dataframe( my_df, sparse=True ), I get the following error:

```
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-9-fce0ca6bc4c2> in <module>
----> 1 oo = xa.Dataset.from_dataframe( poly_df.iloc[:10000], sparse=True )

/mnt/local/xarray/xarray/core/dataset.py in from_dataframe(cls, dataframe, sparse)
   4040
   4041         if sparse:
-> 4042             obj._set_sparse_data_from_dataframe(dataframe, dims, shape)
   4043         else:
   4044             obj._set_numpy_data_from_dataframe(dataframe, dims, shape)

/mnt/local/xarray/xarray/core/dataset.py in _set_sparse_data_from_dataframe(self, dataframe, dims, shape)
   3936         self, dataframe: pd.DataFrame, dims: tuple, shape: Tuple[int, ...]
   3937     ) -> None:
-> 3938         from sparse import COO
   3939
   3940         idx = dataframe.index

ModuleNotFoundError: No module named 'sparse'
```

Any suggestions on what I need to do?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  How should xarray use/support sparse arrays? 479942077
526710709 https://github.com/pydata/xarray/issues/3213#issuecomment-526710709 https://api.github.com/repos/pydata/xarray/issues/3213 MDEyOklzc3VlQ29tbWVudDUyNjcxMDcwOQ== fjanoos 923438 2019-08-30T18:53:44Z 2019-08-30T18:53:44Z NONE

Would it be possible for pd.{Series, DataFrame}.to_xarray() to automatically create a sparse DataArray, or to have a flag in to_xarray that controls this? I have a very sparse dataframe, and every time I try to convert it to xarray I blow out my memory. Keeping it sparse, but logically a DataArray, would be fantastic.

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  How should xarray use/support sparse arrays? 479942077
526356476 https://github.com/pydata/xarray/issues/1375#issuecomment-526356476 https://api.github.com/repos/pydata/xarray/issues/1375 MDEyOklzc3VlQ29tbWVudDUyNjM1NjQ3Ng== fjanoos 923438 2019-08-29T20:52:10Z 2019-08-29T20:52:10Z NONE

@shoyer Is there documentation for using sparse arrays? Could you point me to some example code?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Sparse arrays 221858543
524411995 https://github.com/pydata/xarray/issues/3232#issuecomment-524411995 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDUyNDQxMTk5NQ== fjanoos 923438 2019-08-23T18:13:35Z 2019-08-23T18:13:35Z NONE

While it is pretty straightforward to implement a lot of standard xarray operations with a PyTorch / JAX backend (since they just fall back on native functions), it will be interesting to think about how to implement rolling / expanding / exponential window operations in a way that is both efficient and maintains differentiability.

Expanding and exponential window operations would be easy to do leveraging RNN semantics - but doing rolling using convolutions is going to be very inefficient.

Do you have any thoughts on this?
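
To make the "RNN semantics" remark concrete, here is a minimal pure-Python sketch in which expanding and exponential windows are written as recurrences over a single carried state. The function names are hypothetical, and the exponential form assumed here is the simple recursive one, h_t = alpha * x_t + (1 - alpha) * h_{t-1}; this is an illustration of the idea, not how xarray or any backend implements it.

```python
def ewm_mean(values, alpha):
    """Exponentially weighted mean as a one-state recurrence.

    h_0 = x_0 and h_t = alpha * x_t + (1 - alpha) * h_{t-1}.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    out = []
    state = None
    for x in values:
        # The previous output is the only state carried forward.
        state = x if state is None else alpha * x + (1 - alpha) * state
        out.append(state)
    return out


def expanding_sum(values):
    """Expanding (cumulative) sum using the same carried-state pattern."""
    out = []
    state = 0
    for x in values:
        state = state + x
        out.append(state)
    return out


smoothed = ewm_mean([1.0, 2.0, 3.0], alpha=0.5)
totals = expanding_sum([1, 2, 3, 4])
```

Each step uses only additions and multiplications on a fixed-size state, which is why these windows map naturally onto a recurrent cell. A fixed-width rolling window also has to forget old values, so it does not fit this pattern as directly.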

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
524348393 https://github.com/pydata/xarray/issues/3232#issuecomment-524348393 https://api.github.com/repos/pydata/xarray/issues/3232 MDEyOklzc3VlQ29tbWVudDUyNDM0ODM5Mw== fjanoos 923438 2019-08-23T15:00:02Z 2019-08-23T15:00:02Z NONE

I haven't used JAX, but I was just browsing through its documentation and it looks super cool. Any ideas on how it compares with PyTorch in terms of:

a) Execution speed, especially on GPU?
b) Memory management on GPUs? PyTorch has the 'Dataloader/Dataset' paradigm, which uses background multithreading to shuttle batches of data back and forth, along with a lot of tips and tricks on efficient memory usage.
c) Support for deep-learning optimization algorithms?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use pytorch as backend for xarrays 482543307
521413970 https://github.com/pydata/xarray/issues/3218#issuecomment-521413970 https://api.github.com/repos/pydata/xarray/issues/3218 MDEyOklzc3VlQ29tbWVudDUyMTQxMzk3MA== fjanoos 923438 2019-08-14T20:52:06Z 2019-08-14T20:52:06Z NONE

That looks correct. Let me try it and get back to you.

On Wed, Aug 14, 2019, 16:44 Maximilian Roos notifications@github.com wrote:

How is merge_asof different from using reindex with method='pad'?

Yes this is right! Mea culpa. We can already use the pandas reindexing for the 1D case (which should cover your case, @fjanoos?)

@fjanoos can you confirm this is what you're looking for?

In [4]: da=xr.DataArray(list('abcdefghil'), dims=['x'],coords=dict(x=range(10)))

In [8]: da.reindex(x=[0,2.5,2.6,2.7,5,6.2], method='nearest')
Out[8]:
<xarray.DataArray (x: 6)>
array(['a', 'd', 'd', 'd', 'f', 'g'], dtype='<U1')
Coordinates:
  * x        (x) float64 0.0 2.5 2.6 2.7 5.0 6.2

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  merge_asof functionality 480786385
521404974 https://github.com/pydata/xarray/issues/3218#issuecomment-521404974 https://api.github.com/repos/pydata/xarray/issues/3218 MDEyOklzc3VlQ29tbWVudDUyMTQwNDk3NA== fjanoos 923438 2019-08-14T20:25:52Z 2019-08-14T20:25:52Z NONE

As of now, a simple workaround would be to do these tasks in pandas and switch back and forth.

A couple of years ago, before pandas had pd.merge_asof, I implemented a version of this logic in numba while working with numpy arrays. It was blazingly fast, and if there is interest I can try to dig it up. I would need some help making it work for xarray and getting it merged into the master branch.
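
For readers unfamiliar with the operation, the core of a backward as-of lookup can be sketched in a few lines of plain Python using binary search. This is not the numba implementation referred to above, and `asof_backward` is a hypothetical name; the sketch assumes the right-hand keys are already sorted in ascending order and that exact matches count as matches.

```python
from bisect import bisect_right


def asof_backward(left_keys, right_keys, right_values, missing=None):
    """For each left key, return the value at the last right key <= it.

    right_keys must be sorted ascending; left keys that precede every
    right key get `missing`.
    """
    out = []
    for key in left_keys:
        # bisect_right counts how many right keys are <= key.
        pos = bisect_right(right_keys, key)
        out.append(right_values[pos - 1] if pos else missing)
    return out


matched = asof_backward([0.5, 1.0, 2.5, 7.0], [1, 2, 5], ["a", "b", "c"])
```

Here `matched` is `[None, 'a', 'b', 'c']`: 0.5 precedes every right key, 1.0 matches the key 1 exactly, and 2.5 and 7.0 fall back to the nearest earlier keys 2 and 5. A compiled version would run the same search in a tight loop over numpy arrays.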

On Wed, Aug 14, 2019, 14:12 Maximilian Roos notifications@github.com wrote:

I think this would be good. It would need to be implemented outside of python (cython / numba / etc) given the performance requirements. I'm not sure whether we could borrow the pandas functionality and apply it to multi-dimensional arrays.

Assuming we'd need to write our own, xarray doesn't have any cython dependencies, so I think it would be best in a separate and optional package. These could go in numbagg. It's non-trivial work, so someone would have to have a strong need for it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  merge_asof functionality 480786385
513589352 https://github.com/pydata/xarray/issues/1375#issuecomment-513589352 https://api.github.com/repos/pydata/xarray/issues/1375 MDEyOklzc3VlQ29tbWVudDUxMzU4OTM1Mg== fjanoos 923438 2019-07-21T21:32:23Z 2019-07-21T21:32:23Z NONE

What is the status of this? Is there a branch with this functionality implemented? I would love to give it a spin!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Sparse arrays 221858543

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 15.442ms · About: xarray-datasette