issue_comments
13 rows where issue = 305757822 sorted by updated_at descending
id | html_url | issue_url | node_id | user | created_at | updated_at | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
842556731 | https://github.com/pydata/xarray/issues/1995#issuecomment-842556731 | https://api.github.com/repos/pydata/xarray/issues/1995 | MDEyOklzc3VlQ29tbWVudDg0MjU1NjczMQ== | TomNicholas 35968931 | 2021-05-17T18:59:18Z | 2021-05-17T18:59:18Z | MEMBER | Has this not been solved by the `allow_rechunk` argument? @crusaderky isn't this effectively what you were trying to achieve?

```python
import xarray as xr

def mulsum(a, b):
    acc = 0
    for i in range(a.size):
        acc += a[i] * b[i]
    return acc

a = xr.DataArray(data=[1, 2, 3], dims=['x']).chunk({"x": 1})
b = xr.DataArray(data=[4, 5, 6], dims=['x']).chunk({"x": 1})

c = xr.apply_ufunc(
    mulsum, a, b,
    input_core_dims=[['x'], ['x']],
    dask='parallelized',
    output_dtypes=[float],
    dask_gufunc_kwargs={'allow_rechunk': True})

print(c.compute())
```

I think this has only been possible since the implementation of `dask_gufunc_kwargs`. If this is actually doing what I think it's doing then we should document this possibility! |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
apply_ufunc support for chunks on input_core_dims 305757822 | |
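[Editor's sketch, not code from the thread: the same `allow_rechunk` behaviour expressed directly at the dask level with `dask.array.apply_gufunc`, which is the function that `apply_ufunc` forwards `dask_gufunc_kwargs` to.]

```python
import dask.array as da
import numpy as np

def mulsum(a, b):
    # plain-numpy kernel that assumes it sees the whole core dim at once
    acc = 0
    for i in range(a.size):
        acc += a[i] * b[i]
    return acc

x = da.from_array(np.array([1, 2, 3]), chunks=1)
y = da.from_array(np.array([4, 5, 6]), chunks=1)

# "(i),(i)->()" marks 'i' as a core dimension of both inputs;
# allow_rechunk=True lets dask merge it into a single chunk first.
z = da.apply_gufunc(mulsum, "(i),(i)->()", x, y,
                    allow_rechunk=True, output_dtypes=float)
print(z.compute())  # 32
```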
603493332 | https://github.com/pydata/xarray/issues/1995#issuecomment-603493332 | https://api.github.com/repos/pydata/xarray/issues/1995 | MDEyOklzc3VlQ29tbWVudDYwMzQ5MzMzMg== | stale[bot] 26384082 | 2020-03-24T20:40:45Z | 2020-03-24T20:40:45Z | NONE | In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity. If this issue remains relevant, please comment here or remove the `stale` label; otherwise it will be marked as closed automatically. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
apply_ufunc support for chunks on input_core_dims 305757822 | |
384071053 | https://github.com/pydata/xarray/issues/1995#issuecomment-384071053 | https://api.github.com/repos/pydata/xarray/issues/1995 | MDEyOklzc3VlQ29tbWVudDM4NDA3MTA1Mw== | crusaderky 6213168 | 2018-04-24T20:35:21Z | 2018-04-24T20:36:00Z | MEMBER | @shoyer, you don't really need a parameter `output_chunks`: the output chunks can be deduced from the input chunks. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
apply_ufunc support for chunks on input_core_dims 305757822 | |
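[Editor's sketch of that deduction, not crusaderky's code: if the kernel preserves chunk structure, the output chunks need not be passed in, because they can be obtained by broadcasting the input chunks together.]

```python
import dask.array

def infer_output_chunks(a, b):
    # broadcast_chunks merges two chunk tuples the same way broadcasting
    # merges shapes, giving the chunking an elementwise kernel's output
    # would naturally have
    return dask.array.core.broadcast_chunks(a.chunks, b.chunks)

x = dask.array.ones((4, 6), chunks=(2, 3))
y = dask.array.ones((6,), chunks=3)
print(infer_output_chunks(x, y))  # ((2, 2), (3, 3))
```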
373870013 | https://github.com/pydata/xarray/issues/1995#issuecomment-373870013 | https://api.github.com/repos/pydata/xarray/issues/1995 | MDEyOklzc3VlQ29tbWVudDM3Mzg3MDAxMw== | crusaderky 6213168 | 2018-03-16T23:19:19Z | 2018-03-29T09:57:14Z | MEMBER | [EDIT] drastically simplified chunking algorithm

@shoyer, close, but your version doesn't work in case of broadcasting. I think I fixed it, although it won't work correctly if only one of a or b has a dask backend, and I'm not sure how to fix that:

```python
import xarray
import numpy
import dask.array

coefficients = xarray.DataArray(
    dask.array.random.random((106, 99), chunks=(25, 25)),
    dims=['formula', 'time'])
components = xarray.DataArray(
    dask.array.random.random((106, 512 * 1024), chunks=(25, 65536)),
    dims=['formula', 'scenario'])

def mulsum(a, b, dim):
    return xarray.apply_ufunc(
        _mulsum_xarray_kernel, a, b,
        input_core_dims=[[dim], [dim]],
        dask='allowed',
        output_dtypes=[float])

def _mulsum_xarray_kernel(a, b):
    if isinstance(a, dask.array.Array) and isinstance(b, dask.array.Array):
        chunks = dask.array.core.broadcast_chunks(a.chunks, b.chunks)
        chunks = chunks[:-1] + (tuple(1 for _ in chunks[-1]),)
        # ... (rest of this kernel was lost in extraction; see the
        # reconstruction sketched below this row)

def _mulsum_dask_kernel(a, b):
    a = numpy.ascontiguousarray(a)
    b = numpy.ascontiguousarray(b)
    res = numpy.einsum('...i,...i', a, b, optimize='optimal')
    return res[..., numpy.newaxis]

mulsum(coefficients, components, dim='formula')
```

**Proposal 2**

Modify apply_ufunc:

* remove the check that the input_core_dims must not be chunked
* add parameter output_chunks

My initial example would become:

```python
def mulsum_kernel(a, b):
    return numpy.einsum('...i,...i', a, b)[..., numpy.newaxis]

c = xarray.apply_ufunc(
    mulsum_kernel, a, b,
    dask='parallelized',
    input_core_dims=[['x'], ['x']],
    output_dtypes=[float],
    output_core_dims=[['__partial']],
    output_chunks={'__partial': [1 for _ in a.chunks[a.dims.index('x')]]}
).sum('__partial')
```

Although I'm not sure this approach would be unambiguous when there's more than one core dim... |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
apply_ufunc support for chunks on input_core_dims 305757822 | |
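[Editor's reconstruction of the elided middle of `_mulsum_xarray_kernel`, reusing `_mulsum_dask_kernel` and the imports from the comment above. This assumes the kernel dispatches to `map_blocks` and then sums the per-chunk partial products, in line with shoyer's `map_blocks` suggestion elsewhere in this thread; it is a plausible completion, not necessarily the original code.]

```python
def _mulsum_xarray_kernel(a, b):
    if isinstance(a, dask.array.Array) and isinstance(b, dask.array.Array):
        # output chunking: same as the broadcast inputs, except the core
        # dimension collapses to one element per input chunk
        chunks = dask.array.core.broadcast_chunks(a.chunks, b.chunks)
        chunks = chunks[:-1] + (tuple(1 for _ in chunks[-1]),)
        mapped = dask.array.map_blocks(
            _mulsum_dask_kernel, a, b, dtype=float, chunks=chunks)
        # add up the partial products contributed by each chunk
        return dask.array.sum(mapped, axis=-1)
    # numpy fallback: the kernel appends a length-1 axis, so drop it again
    return _mulsum_dask_kernel(a, b)[..., 0]
```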
373871784 | https://github.com/pydata/xarray/issues/1995#issuecomment-373871784 | https://api.github.com/repos/pydata/xarray/issues/1995 | MDEyOklzc3VlQ29tbWVudDM3Mzg3MTc4NA== | shoyer 1217238 | 2018-03-16T23:32:07Z | 2018-03-16T23:32:07Z | MEMBER |
My main concern is ensuring that someone does not inadvertently apply a function not designed for multiple chunks to dask arrays. For example, suppose the function being applied assumes it sees the whole core dimension at once: applied chunk by chunk, it would silently return wrong results (a concrete illustration follows this row). Some loud flag that makes it very obvious what's going on seems like a good idea.

Then we also need some sort of guarantee that chunked core dimensions aren't entirely removed, or else xarray/dask won't know how to stack them back up. I guess we could check to make sure that at least as many output core dimensions appear as appear in input core dimensions? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
apply_ufunc support for chunks on input_core_dims 305757822 | |
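[Editor's illustration of the failure mode, not an example from the thread: a reduction that assumes it sees the whole core dimension gives a silently wrong answer when mapped over chunks.]

```python
import dask.array as da
import numpy as np

x = da.from_array(np.arange(6, dtype=float), chunks=3)

# np.median is not chunk-safe: applying it per block yields two
# per-chunk medians, not the median of the whole array
per_chunk = x.map_blocks(lambda b: np.array([np.median(b)]), chunks=(1,))
print(per_chunk.compute())        # [1. 4.]
print(np.median(np.arange(6.0)))  # 2.5  (the correct global answer)
```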
373579142 | https://github.com/pydata/xarray/issues/1995#issuecomment-373579142 | https://api.github.com/repos/pydata/xarray/issues/1995 | MDEyOklzc3VlQ29tbWVudDM3MzU3OTE0Mg== | shoyer 1217238 | 2018-03-16T01:55:44Z | 2018-03-16T01:55:44Z | MEMBER | Try:

```python
import dask.array
import numpy as np

def mulsum_chunk(a, b):
    return np.einsum('...i,...i', a, b)[..., np.newaxis]

def mulsum(a, b):
    # needs broadcasting/rechunking for a, b
    mapped = dask.array.map_blocks(
        mulsum_chunk, a, b, dtype=float,
        chunks=a.chunks[:-1] + (tuple(1 for _ in a.chunks[-1]),))
    return dask.array.sum(mapped, axis=-1)
```
 |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
apply_ufunc support for chunks on input_core_dims 305757822 | |
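[Editor's addition: a quick sanity check of the sketch above against a plain einsum on the computed data; the definitions are repeated so the snippet runs standalone.]

```python
import dask.array
import numpy as np

def mulsum_chunk(a, b):
    return np.einsum('...i,...i', a, b)[..., np.newaxis]

def mulsum(a, b):
    mapped = dask.array.map_blocks(
        mulsum_chunk, a, b, dtype=float,
        chunks=a.chunks[:-1] + (tuple(1 for _ in a.chunks[-1]),))
    return dask.array.sum(mapped, axis=-1)

# dask random arrays bake their seed into the graph, so repeated
# computes of `a` and `b` see identical data
a = dask.array.random.random((4, 9), chunks=(2, 3))
b = dask.array.random.random((4, 9), chunks=(2, 3))

expected = np.einsum('...i,...i', a.compute(), b.compute())
print(np.allclose(mulsum(a, b).compute(), expected))  # True
```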
373578226 | https://github.com/pydata/xarray/issues/1995#issuecomment-373578226 | https://api.github.com/repos/pydata/xarray/issues/1995 | MDEyOklzc3VlQ29tbWVudDM3MzU3ODIyNg== | shoyer 1217238 | 2018-03-16T01:50:07Z | 2018-03-16T01:50:07Z | MEMBER |
OK, thinking a little more about it, this would not work with |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
apply_ufunc support for chunks on input_core_dims 305757822 | |
373576583 | https://github.com/pydata/xarray/issues/1995#issuecomment-373576583 | https://api.github.com/repos/pydata/xarray/issues/1995 | MDEyOklzc3VlQ29tbWVudDM3MzU3NjU4Mw== | crusaderky 6213168 | 2018-03-16T01:40:05Z | 2018-03-16T01:40:05Z | MEMBER |
@shoyer, could you make an example? That was my first thought, but I could not figure out how to make apply_ufunc do it. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
apply_ufunc support for chunks on input_core_dims 305757822 | |
373572878 | https://github.com/pydata/xarray/issues/1995#issuecomment-373572878 | https://api.github.com/repos/pydata/xarray/issues/1995 | MDEyOklzc3VlQ29tbWVudDM3MzU3Mjg3OA== | shoyer 1217238 | 2018-03-16T01:16:57Z | 2018-03-16T01:16:57Z | MEMBER | One way to allow chunking across core dimensions [...]. I'm reluctant to add [...]. For this specific problem, I think you could solve it with `dask.array.map_blocks`. |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
apply_ufunc support for chunks on input_core_dims 305757822 | |
373569674 | https://github.com/pydata/xarray/issues/1995#issuecomment-373569674 | https://api.github.com/repos/pydata/xarray/issues/1995 | MDEyOklzc3VlQ29tbWVudDM3MzU2OTY3NA== | fujiisoup 6815844 | 2018-03-16T00:57:21Z | 2018-03-16T00:57:21Z | MEMBER | If |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
apply_ufunc support for chunks on input_core_dims 305757822 | |
373569090 | https://github.com/pydata/xarray/issues/1995#issuecomment-373569090 | https://api.github.com/repos/pydata/xarray/issues/1995 | MDEyOklzc3VlQ29tbWVudDM3MzU2OTA5MA== | shoyer 1217238 | 2018-03-16T00:53:34Z | 2018-03-16T00:53:34Z | MEMBER | For two inputs, don't we use dask.array.tensordot? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
apply_ufunc support for chunks on input_core_dims 305757822 | |
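[Editor's sketch, not code from the thread: for a kernel that is literally a dot product, the chunk-friendly primitive already exists; this shows the 1-D case with `dask.array.tensordot`.]

```python
import dask.array as da
import numpy as np

a = da.from_array(np.array([1.0, 2.0, 3.0]), chunks=1)
b = da.from_array(np.array([4.0, 5.0, 6.0]), chunks=1)

# tensordot contracts the shared axis chunk by chunk and sums the
# partial products, so a chunked core dimension is no obstacle here
print(da.tensordot(a, b, axes=1).compute())  # 32.0
```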
373568992 | https://github.com/pydata/xarray/issues/1995#issuecomment-373568992 | https://api.github.com/repos/pydata/xarray/issues/1995 | MDEyOklzc3VlQ29tbWVudDM3MzU2ODk5Mg== | fujiisoup 6815844 | 2018-03-16T00:52:57Z | 2018-03-16T00:52:57Z | MEMBER | I think if [...]. I think it would be nice if we could have a way to allow chunking along `input_core_dims`. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
apply_ufunc support for chunks on input_core_dims 305757822 | |
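[Editor's illustration of the check being discussed; the exact error text varies by version. At the time of this thread, apply_ufunc refused chunked core dimensions outright.]

```python
import xarray as xr
import numpy as np

a = xr.DataArray(np.arange(3.0), dims=['x']).chunk({'x': 1})

try:
    xr.apply_ufunc(lambda v: v.sum(axis=-1), a,
                   input_core_dims=[['x']],
                   dask='parallelized', output_dtypes=[float])
except ValueError as err:
    # complains that core dimension 'x' consists of multiple chunks
    print(err)
```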
373568240 | https://github.com/pydata/xarray/issues/1995#issuecomment-373568240 | https://api.github.com/repos/pydata/xarray/issues/1995 | MDEyOklzc3VlQ29tbWVudDM3MzU2ODI0MA== | shoyer 1217238 | 2018-03-16T00:48:12Z | 2018-03-16T00:48:12Z | MEMBER | Have you tried the new |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
apply_ufunc support for chunks on input_core_dims 305757822 |
```sql
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
```