github: issue_comments: 18 rows where issue = 653430454 sorted by updated

18 rows where issue = 653430454 sorted by updated_at descending

Search:

✖

descending

id	html_url	issue_url	node_id	user	created_at	updated_at ▲	author_association	body	reactions	issue
663148752	https://github.com/pydata/xarray/issues/4208#issuecomment-663148752	https://api.github.com/repos/pydata/xarray/issues/4208	MDEyOklzc3VlQ29tbWVudDY2MzE0ODc1Mg==	mrocklin 306380	2020-07-23T17:57:55Z	2020-07-23T17:57:55Z	MEMBER	Dask collections tokenize quickly. We just use the name I think.	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Support for duck Dask Arrays 653430454
663135877	https://github.com/pydata/xarray/issues/4208#issuecomment-663135877	https://api.github.com/repos/pydata/xarray/issues/4208	MDEyOklzc3VlQ29tbWVudDY2MzEzNTg3Nw==	dcherian 2448579	2020-07-23T17:31:18Z	2020-07-23T17:31:18Z	MEMBER	Re:rechunk, this should be part of the spec I guess. We need this for `DataArray.chunk()`. xarray does do some automatic rechunking in `variable.py`. But this comment: `# chunked data should come out with the same chunks; this makes # it feasible to combine shifted and unshifted data # TODO: remove this once dask.array automatically aligns chunks` suggest that we could delete that automatic rechunking today. This will probably be very fast because you're probably just returning the name of the underlying dask array as well as the unit of the pint array/quatity. ah yes, we can rely on the underlying array library to optimize this.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Support for duck Dask Arrays 653430454
663123118	https://github.com/pydata/xarray/issues/4208#issuecomment-663123118	https://api.github.com/repos/pydata/xarray/issues/4208	MDEyOklzc3VlQ29tbWVudDY2MzEyMzExOA==	mrocklin 306380	2020-07-23T17:05:30Z	2020-07-23T17:05:30Z	MEMBER	That's exactly what's been done in Pint (see hgrecco/pint#1129)! @dcherian's points go beyond just that and address what Pint hasn't covered yet through the standard collection interface. Ah, great. My bad. how do we ask a duck dask array to rechunk itself? pint seems to forward the .rechunk call but that isn't formalized anywhere AFAICT. I think that you would want to make a pint array rechunk method that called down to the dask array rechunk method. My guess is that this might come up in other situations as well. less important: should duck dask arrays cache their token somewhere? dask.array uses .name to do this and xarray uses that to check equality cheaply. We can use tokenize of course. But I'm wondering if it's worth asking duck dask arrays to cache their token as an optimization. I think that implementing the `dask.base.normalize_token` method should be fine. This will probably be very fast because you're probably just returning the name of the underlying dask array as well as the unit of the pint array/quatity. I don't think that caching would be necessary here. It's also possible that we could look at the `__dask_layers__` method to get this information. My memory is a bit fuzzy here though.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Support for duck Dask Arrays 653430454
663121190	https://github.com/pydata/xarray/issues/4208#issuecomment-663121190	https://api.github.com/repos/pydata/xarray/issues/4208	MDEyOklzc3VlQ29tbWVudDY2MzEyMTE5MA==	jthielen 3460034	2020-07-23T17:01:44Z	2020-07-23T17:03:20Z	CONTRIBUTOR	In Xarray we implemented the Dask collection spec. https://docs.dask.org/en/latest/custom-collections.html#the-dask-collection-interface We might want to do that with Pint as well, if they're going to contain Dask things. That way Dask operations like `dask.persist`, `dask.visualize`, and `dask.compute` will work normally. That's exactly what's been done in Pint (see https://github.com/hgrecco/pint/pull/1129)! @dcherian's points go beyond just that and address what Pint hasn't covered yet through the standard collection interface.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Support for duck Dask Arrays 653430454
663119539	https://github.com/pydata/xarray/issues/4208#issuecomment-663119539	https://api.github.com/repos/pydata/xarray/issues/4208	MDEyOklzc3VlQ29tbWVudDY2MzExOTUzOQ==	mrocklin 306380	2020-07-23T16:58:27Z	2020-07-23T16:58:27Z	MEMBER	My guess is that we could steal the xarray.DataArray implementations over to Pint without causing harm.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Support for duck Dask Arrays 653430454
663119334	https://github.com/pydata/xarray/issues/4208#issuecomment-663119334	https://api.github.com/repos/pydata/xarray/issues/4208	MDEyOklzc3VlQ29tbWVudDY2MzExOTMzNA==	mrocklin 306380	2020-07-23T16:58:06Z	2020-07-23T16:58:06Z	MEMBER	In Xarray we implemented the Dask collection spec. https://docs.dask.org/en/latest/custom-collections.html#the-dask-collection-interface We might want to do that with Pint as well, if they're going to contain Dask things. That way Dask operations like `dask.persist`, `dask.visualize`, and `dask.compute` will work normally.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Support for duck Dask Arrays 653430454
663117842	https://github.com/pydata/xarray/issues/4208#issuecomment-663117842	https://api.github.com/repos/pydata/xarray/issues/4208	MDEyOklzc3VlQ29tbWVudDY2MzExNzg0Mg==	dcherian 2448579	2020-07-23T16:55:11Z	2020-07-23T16:55:11Z	MEMBER	A couple of things came up in #4221 1. how do we ask a duck dask array to rechunk itself? pint seems to forward the `.rechunk` call but that isn't formalized anywhere AFAICT. 2. less important: should duck dask arrays cache their token somewhere? `dask.array` uses `.name` to do this and xarray uses that to check equality cheaply. We can use `tokenize` of course. But I'm wondering if it's worth asking duck dask arrays to cache their token as an optimization.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Support for duck Dask Arrays 653430454
657152588	https://github.com/pydata/xarray/issues/4208#issuecomment-657152588	https://api.github.com/repos/pydata/xarray/issues/4208	MDEyOklzc3VlQ29tbWVudDY1NzE1MjU4OA==	jthielen 3460034	2020-07-12T00:19:53Z	2020-07-12T00:19:53Z	CONTRIBUTOR	Does/should any of this also consider #4212 (CuPy)? Only indirectly, since this deals with duck Dask arrays (things like Pint that go between xarray and Dask) rather than Dask chunks, which CuPy would be. But, once this, #4212, https://github.com/hgrecco/pint/issues/964, and https://github.com/dask/dask/pull/6393 are all in place, then we can test if xarray( pint( dask( cupy ))) works automatically from it all or not.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Support for duck Dask Arrays 653430454
657135521	https://github.com/pydata/xarray/issues/4208#issuecomment-657135521	https://api.github.com/repos/pydata/xarray/issues/4208	MDEyOklzc3VlQ29tbWVudDY1NzEzNTUyMQ==	dopplershift 221526	2020-07-11T21:49:36Z	2020-07-11T21:49:54Z	CONTRIBUTOR	Does/should any of this also consider #4212 (CuPy)?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Support for duck Dask Arrays 653430454
656250148	https://github.com/pydata/xarray/issues/4208#issuecomment-656250148	https://api.github.com/repos/pydata/xarray/issues/4208	MDEyOklzc3VlQ29tbWVudDY1NjI1MDE0OA==	jthielen 3460034	2020-07-09T17:17:34Z	2020-07-10T17:47:16Z	CONTRIBUTOR	Based on @mrocklin's comment in https://github.com/dask/dask/issues/6385, the plan will be to check for duck Dask Arrays with `dask.base.is_dask_collection` along with xarray's previously used duck array check. This works properly with Pint Quantities as implemented in https://github.com/hgrecco/pint/pull/1129 (returning True if the Pint Quantity contains a Dask Array, and False if it does not).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Support for duck Dask Arrays 653430454
656358719	https://github.com/pydata/xarray/issues/4208#issuecomment-656358719	https://api.github.com/repos/pydata/xarray/issues/4208	MDEyOklzc3VlQ29tbWVudDY1NjM1ODcxOQ==	jthielen 3460034	2020-07-09T21:24:42Z	2020-07-09T21:26:37Z	CONTRIBUTOR	@rpmanser That sounds like a good plan to me at least, but it would be great if any of the xarray maintainers would be able to chime in. Also, thanks again for being willing to work on this while I try working on https://github.com/dask/dask/issues/4583. The hidden 4th step is of course testing--primarily that this doesn't break existing functionality, but also that it works for duck Dask Arrays other than Dask Arrays themselves (not sure if Pint Quantities in upcoming v0.15 or a mocked class would be better). Also, thank you @dcherian for pointing out those checks, you found them faster than I did!	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Support for duck Dask Arrays 653430454
656358078	https://github.com/pydata/xarray/issues/4208#issuecomment-656358078	https://api.github.com/repos/pydata/xarray/issues/4208	MDEyOklzc3VlQ29tbWVudDY1NjM1ODA3OA==	dcherian 2448579	2020-07-09T21:22:56Z	2020-07-09T21:22:56Z	MEMBER	We have https://github.com/pydata/xarray/blob/master/xarray/core/pycompat.py which defines `dask_array_type` and `sparse_array_type` and then use `isinstance(da, dask_array_type)` in a bunch of places (e.g. duck_array_ops). re duck array check: @keewis added this recently https://github.com/pydata/xarray/blob/f3ca63a4ac5c091a92085b477a0d34c08df88aa6/xarray/core/utils.py#L250-L253	{ "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Support for duck Dask Arrays 653430454
656349279	https://github.com/pydata/xarray/issues/4208#issuecomment-656349279	https://api.github.com/repos/pydata/xarray/issues/4208	MDEyOklzc3VlQ29tbWVudDY1NjM0OTI3OQ==	rpmanser 19578931	2020-07-09T21:01:07Z	2020-07-09T21:01:07Z	CONTRIBUTOR	I can go ahead with putting together a PR for this. Before I do so, I'd like to clarify what is expected. Implement the `is_duck_dask_array()` function In that implementation, use `dask.base.is_dask_collection()` and the existing duck array check(s) Replace `isinstance(x, dask.array.Array)` checks with the new `is_dask_duck_array()` function I searched for existing duck array checks in xarray and nothing immediately obvious to me is showing up. It looks like a check for `__array_function__` is inappropriate based on discussion in #3917. Could someone point out the proper duck array check?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Support for duck Dask Arrays 653430454
656119415	https://github.com/pydata/xarray/issues/4208#issuecomment-656119415	https://api.github.com/repos/pydata/xarray/issues/4208	MDEyOklzc3VlQ29tbWVudDY1NjExOTQxNQ==	jthielen 3460034	2020-07-09T13:13:43Z	2020-07-09T13:13:43Z	CONTRIBUTOR	I think there are already enough headaches with `__iter__` being always defined and confusing libraries such as pandas (hgrecco/pint#1128). I don't see why pint should be explicitly aware of dask (except in unit tests)? It should only deal with generic NEP18-compatible libraries (numpy, dask, sparse, cupy, etc.). Since Pint wraps Dask, in order to leverage Dask Array functionality on Pint Quantities, we need to have the Dask collection interface available. In a sense, Pint needs special handling for Dask like xarray Variables do since they both can be upcast types of Dask Array. Implicitly passing through attributes (how Pint handles special methods/attributes of downcast types in general) from the wrapped Dask Array is not sufficient, however, because the finalizers have to rewrap with Quantity (see https://github.com/hgrecco/pint/pull/1129/files#diff-d9924213798d0fc092b8cff13928d747R1947-R1950), hence the explicit awareness of Dask being needed in Pint. We should ask the dask team to formalize what defines a "dask-array-like", like they already did with dask collections, and implement their definition in xarray. Done! See https://github.com/dask/dask/issues/6385.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Support for duck Dask Arrays 653430454
656068407	https://github.com/pydata/xarray/issues/4208#issuecomment-656068407	https://api.github.com/repos/pydata/xarray/issues/4208	MDEyOklzc3VlQ29tbWVudDY1NjA2ODQwNw==	crusaderky 6213168	2020-07-09T11:18:15Z	2020-07-09T11:19:28Z	MEMBER	Is it acceptable for a Pint Quantity to always have the Dask collection interface defined (i.e., be a duck Dask array), even when its magnitude (what it wraps) is not a Dask Array? I think there are already enough headaches with `__iter__` being always defined and confusing libraries such as pandas (https://github.com/hgrecco/pint/issues/1128). I don't see why pint should be explicitly aware of dask (except in unit tests)? It should only deal with generic NEP18-compatible libraries (numpy, dask, sparse, cupy, etc.). How should xarray check for a duck Dask Array? We should ask the dask team to formalize what defines a "dask-array-like", like they already did with dask collections, and implement their definition in xarray. I'd personally make it "whatever defines a numpy-array-like AND has a chunks method AND the chunks method returns a tuple".	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Support for duck Dask Arrays 653430454
655820797	https://github.com/pydata/xarray/issues/4208#issuecomment-655820797	https://api.github.com/repos/pydata/xarray/issues/4208	MDEyOklzc3VlQ29tbWVudDY1NTgyMDc5Nw==	shoyer 1217238	2020-07-09T00:09:58Z	2020-07-09T00:09:58Z	MEMBER	It might also make sense to check for one or more of the special dask collection attributes (`__dask_graph__`, `__dask_keys__`, etc)	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Support for duck Dask Arrays 653430454
655817246	https://github.com/pydata/xarray/issues/4208#issuecomment-655817246	https://api.github.com/repos/pydata/xarray/issues/4208	MDEyOklzc3VlQ29tbWVudDY1NTgxNzI0Ng==	jthielen 3460034	2020-07-08T23:57:36Z	2020-07-08T23:59:28Z	CONTRIBUTOR	Maybe something like this would work? `def is_duck_dask_array(x): return getattr(x, 'chunks', None) is not None` `xarray.DataArray` would pass this test (`chunks` is either `None` for non-dask arrays or a tuple for dask arrays), so this would be consistent with what we already do. That would be a straightforward solution to both problems! A Pint Quantity containing a Dask Array passes along the chunks attribute from the Dask Array, and a Pint Quantity containing something else will raise an AttributeError. Unless there are other objections, I'll see what it will take to swap out the existing Dask checks for this in the xarray internals and hopefully get around to a PR (after I get some MetPy stuff done first).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Support for duck Dask Arrays 653430454
655810311	https://github.com/pydata/xarray/issues/4208#issuecomment-655810311	https://api.github.com/repos/pydata/xarray/issues/4208	MDEyOklzc3VlQ29tbWVudDY1NTgxMDMxMQ==	shoyer 1217238	2020-07-08T23:31:21Z	2020-07-08T23:31:21Z	MEMBER	Maybe something like this would work? `def is_duck_dask_array(x): return getattr(x, 'chunks', None) is not None` `xarray.DataArray` would pass this test (`chunks` is either `None` for non-dask arrays or a tuple for dask arrays), so this would be consistent with what we already do.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Support for duck Dask Arrays 653430454

Advanced export

JSON shape: default, array, newline-delimited, object

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);