github: issue_comments: 11 rows where issue = 484752930 sorted by updated

11 rows where issue = 484752930 sorted by updated_at descending

Search:

descending

id	html_url	issue_url	node_id	user	created_at	updated_at ▲	author_association	body	reactions	issue
529168271	https://github.com/pydata/xarray/pull/3258#issuecomment-529168271	https://api.github.com/repos/pydata/xarray/issues/3258	MDEyOklzc3VlQ29tbWVudDUyOTE2ODI3MQ==	dcherian 2448579	2019-09-08T04:20:19Z	2019-09-08T04:20:19Z	MEMBER	Closing in favour of #3276	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	[WIP] Add map_blocks. 484752930
527187603	https://github.com/pydata/xarray/pull/3258#issuecomment-527187603	https://api.github.com/repos/pydata/xarray/issues/3258	MDEyOklzc3VlQ29tbWVudDUyNzE4NzYwMw==	mrocklin 306380	2019-09-02T15:37:18Z	2019-09-02T15:37:18Z	MEMBER	I'm glad to see progress here. FWIW, I think that many people would be quite happy with a version that just worked for DataArrays, in case that's faster to get in than the full solution with DataSets.	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	[WIP] Add map_blocks. 484752930
527186872	https://github.com/pydata/xarray/pull/3258#issuecomment-527186872	https://api.github.com/repos/pydata/xarray/issues/3258	MDEyOklzc3VlQ29tbWVudDUyNzE4Njg3Mg==	dcherian 2448579	2019-09-02T15:34:21Z	2019-09-02T15:34:21Z	MEMBER	Thanks. That worked. I have a new version up in #3276 that works with both DataArrays and Datasets.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	[WIP] Add map_blocks. 484752930
526756738	https://github.com/pydata/xarray/pull/3258#issuecomment-526756738	https://api.github.com/repos/pydata/xarray/issues/3258	MDEyOklzc3VlQ29tbWVudDUyNjc1NjczOA==	mrocklin 306380	2019-08-30T21:31:49Z	2019-08-30T21:32:02Z	MEMBER	Then you can construct a tuple as a task `(1, 2, 3)` -> `(tuple, [1, 2, 3])`	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	[WIP] Add map_blocks. 484752930
526751676	https://github.com/pydata/xarray/pull/3258#issuecomment-526751676	https://api.github.com/repos/pydata/xarray/issues/3258	MDEyOklzc3VlQ29tbWVudDUyNjc1MTY3Ng==	dcherian 2448579	2019-08-30T21:11:28Z	2019-08-30T21:11:28Z	MEMBER	Thanks @mrocklin. Unfortunately that doesn't work with the Dataset constructor. With a list it treats it as array-like ``` The following notations are accepted: `- mapping {var name: DataArray} - mapping {var name: Variable} - mapping {var name: (dimension name, array-like)} - mapping {var name: (tuple of dimension names, array-like)} - mapping {dimension name: array-like} (it will be automatically moved to coords, see below)` ``` Unless @shoyer has another idea, I guess I can insert creating a DataArray into the graph and then refer to those keys in the Dataset constructor.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	[WIP] Add map_blocks. 484752930
525966384	https://github.com/pydata/xarray/pull/3258#issuecomment-525966384	https://api.github.com/repos/pydata/xarray/issues/3258	MDEyOklzc3VlQ29tbWVudDUyNTk2NjM4NA==	mrocklin 306380	2019-08-28T23:54:48Z	2019-08-28T23:54:48Z	MEMBER	Dask doesn't traverse through tuples to find possible keys, so the keys here are hidden from view: `python {'a': (('x', 'y'), ('xarray-a-f178df193efafa67203f3862b3f9f0f4', 0, 0)),` I recommend changing wrapping tuples with lists: `diff - {'a': (('x', 'y'), ('xarray-a-f178df193efafa67203f3862b3f9f0f4', 0, 0)), + {'a': [('x', 'y'), ('xarray-a-f178df193efafa67203f3862b3f9f0f4', 0, 0)],`	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	[WIP] Add map_blocks. 484752930
525965607	https://github.com/pydata/xarray/pull/3258#issuecomment-525965607	https://api.github.com/repos/pydata/xarray/issues/3258	MDEyOklzc3VlQ29tbWVudDUyNTk2NTYwNw==	dcherian 2448579	2019-08-28T23:51:43Z	2019-08-28T23:53:46Z	MEMBER	I started prototyping a Dataset version. Here's what I have: ``` python import dask import xarray as xr darray = xr.DataArray(np.ones((10, 20)), dims=['x', 'y'], coords={'x': np.arange(10), 'y': np.arange(100, 120)}) dset = darray.to_dataset(name='a') dset['b'] = dset.a + 50 dset['c'] = (dset.x + 20) dset = dset.chunk({'x': 4, 'y': 5}) ``` The function I'm applying takes a dataset and returns a DataArray because that's easy to test without figuring out how to assemble everything back into a dataset. ``` python import itertools function takes dataset and returns dataarray so that I can check that things work without reconstructing a dataset def function(ds): return ds.a + 10 dataset_dims = list(dset.dims) graph = {} gname = 'dsnew' map dims to list of chunk indexes If different variables have different chunking along the same dim the call to .chunks will raise an error. ichunk = {dim: range(len(dset.chunks[dim])) for dim in dataset_dims} iterate over all possible chunk combinations for v in itertools.product(*ichunk.values()): chunk_index_dict = dict(zip(dataset_dims, v)) data_vars = {} for name, variable in dset.data_vars.items(): # why do does dask_keys have an extra level? # the [0] is not required for dataarrays var_dask_keys = variable.dask_keys()[0] `# recursively index into dask_keys nested list chunk = var_dask_keys for dim in variable.dims: chunk = chunk[chunk_index_dict[dim]] # I have key corresponding to chunk # this tuple is in a dictionary passed to xr.Dataset() # dask doesn't seem to replace this with a numpy array at execution time. data_vars[name] = (variable.dims, chunk) graph[(gname, ) + v] = (function, (xr.Dataset, data_vars))` final_graph = dask.highlevelgraph.HighLevelGraph.from_collections(name, graph, dependencies=[dset]) ``` Elements of the graph look like `('dsnew', 0, 0): (<function __main__.function(ds)>, (xarray.core.dataset.Dataset, {'a': (('x', 'y'), ('xarray-a-f178df193efafa67203f3862b3f9f0f4', 0, 0)), 'b': (('x', 'y'), ('xarray-b-e2d8d06bb9e5c1f351671a94816bd331', 0, 0)), 'c': (('x',), ('xarray-c-d90f8b2af715b53f4c170be391239655', 0))}))` This doesn't work because dask doesn't replace the keys by numpy arrays when the `xr.Dataset` call is executed. `result = dask.array.Array(final_graph, name=gname, chunks=dset.a.data.chunks, meta=dset.a.data._meta) dask.compute(result)` `ValueError: Could not convert tuple of form (dims, data[, attrs, encoding]): (('x', 'y'), ('xarray-a-f178df193efafa67203f3862b3f9f0f4', 0, 0)) to Variable.` The graph is "disconnected": I'm not sure what I'm doing wrong here. An equivalent version for DataArrays works perfectly.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	[WIP] Add map_blocks. 484752930
525427300	https://github.com/pydata/xarray/pull/3258#issuecomment-525427300	https://api.github.com/repos/pydata/xarray/issues/3258	MDEyOklzc3VlQ29tbWVudDUyNTQyNzMwMA==	shoyer 1217238	2019-08-27T18:30:34Z	2019-08-27T18:30:34Z	MEMBER	apply_ufunc is extremely powerful, and when you need to cope with all possible shape transformations, I suspect its verbosity is quite necessary. It's just that, when all you need to do is apply an elementwise, embarassingly parallel function (80% of the times in my real life experience), apply_ufunc is overkill. Yes, 100% agreed! There is a real need for a simpler version of `apply_ufunc`. The thing I have against the name map_blocks is that backends other than dask have no notion of blocks... I think the functionality in this PR is fundamentally dask specific. We shouldn't make a habit of adding backend specific features, but it makes sense in limited cases.	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	[WIP] Add map_blocks. 484752930
525425560	https://github.com/pydata/xarray/pull/3258#issuecomment-525425560	https://api.github.com/repos/pydata/xarray/issues/3258	MDEyOklzc3VlQ29tbWVudDUyNTQyNTU2MA==	crusaderky 6213168	2019-08-27T18:26:17Z	2019-08-27T18:26:17Z	MEMBER	@shoyer let me rephrase it - apply_ufunc is extremely powerful, and when you need to cope with all possible shape transformations, I suspect its verbosity is quite necessary. It's just that, when all you need to do is apply an elementwise, embarassingly parallel function (80% of the times in my real life experience), apply_ufunc is overkill. The thing I have against the name map_blocks is that backends other than dask have no notion of blocks...	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	[WIP] Add map_blocks. 484752930
525384446	https://github.com/pydata/xarray/pull/3258#issuecomment-525384446	https://api.github.com/repos/pydata/xarray/issues/3258	MDEyOklzc3VlQ29tbWVudDUyNTM4NDQ0Ng==	shoyer 1217238	2019-08-27T16:40:32Z	2019-08-27T16:40:32Z	MEMBER	could we call it just "map"? It makes sense as this thing would be very useful for non-dask based arrays too. Working routinely with scipy (chiefly with scipy.stats transforms), I tire a lot of writing very verbose `xarray.apply_ufunc` calls. I agree that `apply_ufunc` is overly verbose. See https://github.com/pydata/xarray/issues/1074 and https://github.com/pydata/xarray/issues/1618 (and issues linked therein) for discussion about alternative APIs. I still think this particular set of functionality should be called `map_blocks`, because it works by applying functions over each block, very similar to dask's `map_blocks`.	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	[WIP] Add map_blocks. 484752930
525298264	https://github.com/pydata/xarray/pull/3258#issuecomment-525298264	https://api.github.com/repos/pydata/xarray/issues/3258	MDEyOklzc3VlQ29tbWVudDUyNTI5ODI2NA==	crusaderky 6213168	2019-08-27T13:21:44Z	2019-08-27T13:21:44Z	MEMBER	Hi, A few design opinions: could we call it just "map"? It makes sense as this thing would be very useful for non-dask based arrays too. Working routinely with scipy (chiefly with scipy.stats transforms), I tire a lot of writing very verbose `xarray.apply_ufunc` calls. could we have it as a method of DataArray and Dataset, to allow for method chaining? e.g. `python myarray.map(func1).chunk().map(func2).sum().compute()`	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	[WIP] Add map_blocks. 484752930

Advanced export

JSON shape: default, array, newline-delimited, object

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);

issue_comments

11 rows where issue = 484752930 sorted by updated_at descending

function takes dataset and returns dataarray so that I can check that things work without reconstructing a dataset

map dims to list of chunk indexes

If different variables have different chunking along the same dim

the call to .chunks will raise an error.

iterate over all possible chunk combinations

Advanced export