
issue_comments


3 rows where issue = 305757822 and user = 6213168 sorted by updated_at descending




Comment 384071053 · crusaderky (6213168) · MEMBER
created 2018-04-24T20:35:21Z · updated 2018-04-24T20:36:00Z
https://github.com/pydata/xarray/issues/1995#issuecomment-384071053

@shoyer, you don't really need a parameter possibly_chunked_core_dims=['x']: you are already specifying output_chunks, without which apply_ufunc won't know what to do and will crash...

Comment 373870013 · crusaderky (6213168) · MEMBER
created 2018-03-16T23:19:19Z · updated 2018-03-29T09:57:14Z
https://github.com/pydata/xarray/issues/1995#issuecomment-373870013

[EDIT] drastically simplified chunking algorithm

@shoyer, close, but your version doesn't work in the case of broadcasting. I think I fixed it, although it won't work correctly if only one of a and b has a dask backend, and I'm not sure how to fix that:

```python
import xarray
import numpy
import dask.array

coefficients = xarray.DataArray(
    dask.array.random.random((106, 99), chunks=(25, 25)),
    dims=['formula', 'time'])
components = xarray.DataArray(
    dask.array.random.random((106, 512 * 1024), chunks=(25, 65536)),
    dims=['formula', 'scenario'])

def mulsum(a, b, dim):
    return xarray.apply_ufunc(
        _mulsum_xarray_kernel, a, b,
        input_core_dims=[[dim], [dim]],
        dask='allowed',
        output_dtypes=[float])

def _mulsum_xarray_kernel(a, b):
    if isinstance(a, dask.array.Array) and isinstance(b, dask.array.Array):
        chunks = dask.array.core.broadcast_chunks(a.chunks, b.chunks)
        chunks = chunks[:-1] + (tuple(1 for _ in chunks[-1]), )

        mapped = dask.array.map_blocks(
            _mulsum_dask_kernel, a, b,
            dtype=float, chunks=chunks)
        return dask.array.sum(mapped, axis=-1)
    else:
        return _mulsum_dask_kernel(a, b)

def _mulsum_dask_kernel(a, b):
    a = numpy.ascontiguousarray(a)
    b = numpy.ascontiguousarray(b)
    res = numpy.einsum('...i,...i', a, b, optimize='optimal')
    return res[..., numpy.newaxis]

mulsum(coefficients, components, dim='formula')
```
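The kernel above leans on one numpy trick: `einsum('...i,...i', a, b)` is a dot product over the last axis, and re-adding a size-1 axis lets each chunk emit a partial result that can still be summed along that axis afterwards. A minimal numpy-only sketch of the trick (the arrays here are made up for illustration):

```python
import numpy

# Dot product over the last axis, then re-add a size-1 placeholder axis so
# that per-chunk partial results can be summed along it later.
a = numpy.arange(6.0).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
b = numpy.ones((2, 3))

partial = numpy.einsum('...i,...i', a, b)[..., numpy.newaxis]  # shape (2, 1)
result = partial.sum(axis=-1)         # row-wise dot products: [3., 12.]
```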

Proposal 2

Modify apply_ufunc:

* remove the check that the input_core_dims must not be chunked
* add parameter output_chunks

My initial example would become:

```python
def mulsum_kernel(a, b):
    return numpy.einsum('...i,...i', a, b)[..., numpy.newaxis]

c = xarray.apply_ufunc(
    mulsum_kernel, a, b,
    dask='parallelized',
    input_core_dims=[['x'], ['x']],
    output_dtypes=[float],
    output_core_dims=[['__partial']],
    output_chunks={'__partial': [1 for _ in a.chunks[a.dims.index('x')]]}
).sum('__partial')
```

Although I'm not sure this approach would be unambiguous when there's more than one core dim...
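For what it's worth, the output_chunks value in the proposal is just a list with one size-1 entry per input chunk along x. A stand-alone illustration (the chunks tuple below is a made-up stand-in; on a real DataArray it would come from a.chunks and a.dims.index('x')):

```python
# Stand-ins for a.chunks and a.dims.index('x') on a hypothetical DataArray
# whose 'x' dimension (axis 1) is split into chunks of 25, 25, 25 and 24.
chunks = ((4,), (25, 25, 25, 24))
x_axis = 1

# One size-1 output chunk per input chunk along 'x'.
output_chunks = {'__partial': [1 for _ in chunks[x_axis]]}
```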

Comment 373576583 · crusaderky (6213168) · MEMBER
created 2018-03-16T01:40:05Z · updated 2018-03-16T01:40:05Z
https://github.com/pydata/xarray/issues/1995#issuecomment-373576583

> For this specific problem, I think you could solve it with xarray.apply_ufunc by writing something like a gufunc that keeps the reduced axis as size 1 to apply to each chunk, and afterwards summing up along that dimension.

@shoyer, could you make an example? That was my first thought too, but I couldn't figure out how to make apply_ufunc do it.


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
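For reference, the query behind this page (issue = 305757822 AND user = 6213168, newest update first) can be reproduced against the schema above with Python's built-in sqlite3 module. The sketch below inserts only the columns the query touches; the REFERENCES clauses are omitted because the users and issues tables aren't recreated here.

```python
import sqlite3

# In-memory database with a trimmed-down issue_comments table.
conn = sqlite3.connect(':memory:')
conn.execute("""
CREATE TABLE issue_comments (
   html_url TEXT, issue_url TEXT, id INTEGER PRIMARY KEY,
   node_id TEXT, user INTEGER, created_at TEXT, updated_at TEXT,
   author_association TEXT, body TEXT, reactions TEXT,
   performed_via_github_app TEXT, issue INTEGER
)""")

# The three rows shown on this page (id, user, updated_at, issue).
rows = [
    (384071053, 6213168, '2018-04-24T20:36:00Z', 305757822),
    (373870013, 6213168, '2018-03-29T09:57:14Z', 305757822),
    (373576583, 6213168, '2018-03-16T01:40:05Z', 305757822),
]
conn.executemany(
    'INSERT INTO issue_comments (id, user, updated_at, issue)'
    ' VALUES (?, ?, ?, ?)', rows)

# ISO-8601 timestamps sort correctly as plain strings.
ids = [r[0] for r in conn.execute(
    'SELECT id FROM issue_comments'
    ' WHERE issue = 305757822 AND user = 6213168'
    ' ORDER BY updated_at DESC')]
```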