issue_comments


52 rows where author_association = "MEMBER" and user = 1312546 sorted by updated_at descending

issue 29

  • open_mfdataset usage and limitations. 6
  • Feature/benchmark 3
  • Dataset.from_dataframe will produce a FutureWarning for DatetimeTZ data 3
  • Test failures with pandas master 3
  • dask.optimize on xarray objects 3
  • Behaviour change in xarray.Dataset.sortby/sel between dask==2.25.0 and dask==2.26.0 3
  • Fix optimize for chunked DataArray 3
  • da.plot.pcolormesh fails when there is a datetime coordinate 2
  • Implementing map_blocks and map_overlap 2
  • Make dask names change when chunking Variables by different amounts. 2
  • upstream-dev failure when installing pandas 2
  • Implement dask.sizeof for xarray.core.indexing.ImplicitToExplicitIndexingAdapter 2
  • Slow performance of `DataArray.unstack()` from checking `variable.data` 2
  • Supporting out-of-core computation/indexing for very large indexes 1
  • Data variables empty with to_zarr / from_zarr on s3 if 's3://' in root s3fs string 1
  • Fix map_blocks HLG layering 1
  • Add entrypoint for plotting backends 1
  • more upstream-dev cftime failures 1
  • Add template xarray object kwarg to map_blocks 1
  • Unexpected chunking behavior when using `xr.align` with `join='outer'` 1
  • fix the RTD timeouts 1
  • fix matplotlib errors for single level discrete colormaps 1
  • Fix map_blocks examples 1
  • Threading Lock issue with to_netcdf and Dask arrays 1
  • Comprehensive benchmarking suite 1
  • ⚠️ Nightly upstream-dev CI failed ⚠️ 1
  • ENH: Compute hash of xarray objects 1
  • Implement __sizeof__ on objects? 1
  • Avoid accessing slow .data in unstack 1

user 1

  • TomAugspurger · 52

author_association 1

  • MEMBER · 52
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
964099251 https://github.com/pydata/xarray/issues/4648#issuecomment-964099251 https://api.github.com/repos/pydata/xarray/issues/4648 IC_kwDOAMm_X845dvyz TomAugspurger 1312546 2021-11-09T12:17:32Z 2021-11-09T12:17:32Z MEMBER

"In charge of" is overstating it a bit. It's been segfaulting when building pandas and I haven't had a chance to debug it.

If / when I get around to fixing it I'll try adding xarray, but it might be a bit.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Comprehensive benchmarking suite 756425955
953858365 https://github.com/pydata/xarray/pull/5906#issuecomment-953858365 https://api.github.com/repos/pydata/xarray/issues/5906 IC_kwDOAMm_X8442rk9 TomAugspurger 1312546 2021-10-28T13:43:04Z 2021-10-28T13:43:04Z MEMBER

There are two changes here:

  1. Only check the .data of non-index variables, done at https://github.com/pydata/xarray/pull/5906/files#diff-763e3002fd954d544b05858d8d138b828b66b6a2a0ae3cd58d2040a652f14638R4161-R4163
  2. The check for whether or not a full index was needed was done inside a `for dim in dims` loop, but the condition doesn't actually depend on `dim`, so I lifted that check out of the loop (it doesn't matter much, since the relevant values are cached).

cc @dcherian

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Avoid accessing slow .data in unstack 1038531231
953379569 https://github.com/pydata/xarray/issues/5902#issuecomment-953379569 https://api.github.com/repos/pydata/xarray/issues/5902 IC_kwDOAMm_X84402rx TomAugspurger 1312546 2021-10-27T23:19:49Z 2021-10-27T23:19:49Z MEMBER

Thanks @dcherian, that seems to fix this performance problem. I'll see if the tests pass and will submit a PR.

I came across #5582 while searching, thanks :)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Slow performance of `DataArray.unstack()` from checking `variable.data` 1037894157
953344052 https://github.com/pydata/xarray/issues/5902#issuecomment-953344052 https://api.github.com/repos/pydata/xarray/issues/5902 IC_kwDOAMm_X8440uA0 TomAugspurger 1312546 2021-10-27T22:02:58Z 2021-10-27T22:03:35Z MEMBER

Oh, hmm... I'm noticing now that IndexVariable (currently) eagerly loads data into memory, so that check will always be false for the problematic IndexVariable variable.

So perhaps a slight adjustment to is_duck_dask_array to handle xarray.Variable?

```diff
diff --git a/xarray/core/dataset.py b/xarray/core/dataset.py
index 550c3587..16637574 100644
--- a/xarray/core/dataset.py
+++ b/xarray/core/dataset.py
@@ -4159,14 +4159,14 @@ class Dataset(DataWithCoords, DatasetArithmetic, Mapping):
             # Dask arrays don't support assignment by index, which the fast unstack
             # function requires.
             # https://github.com/pydata/xarray/pull/4746#issuecomment-753282125
-            any(is_duck_dask_array(v.data) for v in self.variables.values())
+            any(is_duck_dask_array(v) for v in self.variables.values())
             # Sparse doesn't currently support (though we could special-case
             # it)
             # https://github.com/pydata/sparse/issues/422
-            or any(
-                isinstance(v.data, sparse_array_type)
-                for v in self.variables.values()
-            )
+            # or any(
+            #     isinstance(v.data, sparse_array_type)
+            #     for v in self.variables.values()
+            # )
             or sparse
             # Until https://github.com/pydata/xarray/pull/4751 is resolved,
             # we check explicitly whether it's a numpy array. Once that is
@@ -4177,9 +4177,9 @@ class Dataset(DataWithCoords, DatasetArithmetic, Mapping):
             # # or any(
             # #     isinstance(v.data, pint_array_type) for v in self.variables.values()
             # # )
-            or any(
-                not isinstance(v.data, np.ndarray) for v in self.variables.values()
-            )
+            # or any(
+            #     not isinstance(v.data, np.ndarray) for v in self.variables.values()
+            # )
         ):
             result = result._unstack_full_reindex(dim, fill_value, sparse)
         else:
diff --git a/xarray/core/pycompat.py b/xarray/core/pycompat.py
index d1649235..e9669105 100644
--- a/xarray/core/pycompat.py
+++ b/xarray/core/pycompat.py
@@ -44,6 +44,12 @@ class DuckArrayModule:
 
 def is_duck_dask_array(x):
+    from xarray.core.variable import IndexVariable, Variable
+    if isinstance(x, IndexVariable):
+        return False
+    elif isinstance(x, Variable):
+        x = x.data
+
     if DuckArrayModule("dask").available:
         from dask.base import is_dask_collection
```

That's completely ignoring the accesses to v.data for the sparse and pint checks, which don't look quite as easy to solve.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Slow performance of `DataArray.unstack()` from checking `variable.data` 1037894157
932811398 https://github.com/pydata/xarray/issues/5764#issuecomment-932811398 https://api.github.com/repos/pydata/xarray/issues/5764 IC_kwDOAMm_X843mZKG TomAugspurger 1312546 2021-10-02T19:48:05Z 2021-10-02T19:48:05Z MEMBER

Mmm, for better or worse, Dask relies on sizeof to estimate the memory usage of objects at runtime. We could move that over to some new duck-typed interface, like using .nbytes if it's around, but not all objects will want to expose an nbytes attribute in their API.

IMO, the best path is for objects to implement __sizeof__, unless there's some downside I'm missing.
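As a concrete sketch of that path (the WrappedArray class below is made up for illustration): an object's __sizeof__ is what sys.getsizeof reports, and dask.sizeof falls back to sys.getsizeof for types it doesn't know about.

```python
import sys

import numpy as np


class WrappedArray:
    """Hypothetical container holding a numpy array (illustration only)."""

    def __init__(self, data: np.ndarray):
        self.data = data

    def __sizeof__(self) -> int:
        # sys.getsizeof() calls __sizeof__(), and dask.sizeof's default
        # fallback is sys.getsizeof(), so this is picked up automatically.
        return object.__sizeof__(self) + self.data.nbytes


arr = WrappedArray(np.zeros(1_000_000))
print(sys.getsizeof(arr))  # roughly 8 MB plus the object header
```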

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement __sizeof__ on objects? 988158051
852667695 https://github.com/pydata/xarray/issues/5426#issuecomment-852667695 https://api.github.com/repos/pydata/xarray/issues/5426 MDEyOklzc3VlQ29tbWVudDg1MjY2NzY5NQ== TomAugspurger 1312546 2021-06-02T02:37:18Z 2021-06-02T02:37:18Z MEMBER

Do you run into poor load balancing as well when using Zarr with Xarray?

The only thing that comes to mind is everything being assigned to one worker when the task graph has a single node at its base. But then work stealing kicks in and things level out (that was a while ago, though).

I haven't noticed any kind of systemic load balancing problem, but I can take a look at that notebook later.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement dask.sizeof for xarray.core.indexing.ImplicitToExplicitIndexingAdapter 908971901
852666211 https://github.com/pydata/xarray/issues/5426#issuecomment-852666211 https://api.github.com/repos/pydata/xarray/issues/5426 MDEyOklzc3VlQ29tbWVudDg1MjY2NjIxMQ== TomAugspurger 1312546 2021-06-02T02:33:28Z 2021-06-02T02:33:28Z MEMBER

https://github.com/dask/dask/pull/6203 and https://github.com/dask/dask/pull/6773/ are possibly the relevant PRs. I actually don't know whether that could have an effect here. I don't know (and a brief search couldn't confirm) whether or not xarray uses dask.array.from_zarr.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement dask.sizeof for xarray.core.indexing.ImplicitToExplicitIndexingAdapter 908971901
767797103 https://github.com/pydata/xarray/issues/1094#issuecomment-767797103 https://api.github.com/repos/pydata/xarray/issues/1094 MDEyOklzc3VlQ29tbWVudDc2Nzc5NzEwMw== TomAugspurger 1312546 2021-01-26T20:09:11Z 2021-01-26T20:09:11Z MEMBER

Should this and https://github.com/pydata/xarray/issues/1650 be consolidated into a single issue? I think that they're duplicates of each other.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Supporting out-of-core computation/indexing for very large indexes 187873247
752156934 https://github.com/pydata/xarray/issues/4738#issuecomment-752156934 https://api.github.com/repos/pydata/xarray/issues/4738 MDEyOklzc3VlQ29tbWVudDc1MjE1NjkzNA== TomAugspurger 1312546 2020-12-29T16:53:16Z 2020-12-29T16:53:16Z MEMBER

IIUC, something like https://github.com/dask/dask/blob/4a7a2438219c4ee493434042e50f4cdb67b6ec9f/dask/base.py#L778 is what you're looking for. Further down we register tokenizers for various types like pandas' DataFrames and ndarrays.
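For illustration, a minimal sketch of that registration mechanism: dask keeps a dispatch table, dask.base.normalize_token, and tokenize() hashes whatever the registered normalizer returns. Registering xarray.DataArray here is only an example, not the change proposed in this issue.

```python
import xarray as xr
from dask.base import normalize_token, tokenize


@normalize_token.register(xr.DataArray)
def _normalize_dataarray(da):
    # Reduce the object to plain, hashable pieces; dask hashes the result.
    return type(da).__name__, da.name, da.dims, normalize_token(da.values)


da = xr.DataArray([1, 2, 3], dims="x", name="a")
print(tokenize(da))  # deterministic token derived from the normalizer above
```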

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: Compute hash of xarray objects 775502974
749205535 https://github.com/pydata/xarray/issues/4717#issuecomment-749205535 https://api.github.com/repos/pydata/xarray/issues/4717 MDEyOklzc3VlQ29tbWVudDc0OTIwNTUzNQ== TomAugspurger 1312546 2020-12-21T21:29:56Z 2020-12-21T21:29:56Z MEMBER

I'm not sure offhand. Maybe best to post an issue on the pandas tracker.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ⚠️ Nightly upstream-dev CI failed ⚠️ 771484861
712066302 https://github.com/pydata/xarray/issues/4428#issuecomment-712066302 https://api.github.com/repos/pydata/xarray/issues/4428 MDEyOklzc3VlQ29tbWVudDcxMjA2NjMwMg== TomAugspurger 1312546 2020-10-19T11:08:13Z 2020-10-19T11:43:46Z MEMBER

Sorry, my comment in https://github.com/pydata/xarray/issues/4428#issuecomment-711034128 was incorrect in a couple of ways:

  1. We still do the splitting, even when slicing with an out-of-order indexer. I'm checking whether that's appropriate.
  2. I'm also looking into a logic bug when computing the number of chunks. I don't think we properly handle non-uniform chunking on the other axes.
{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Behaviour change in xarray.Dataset.sortby/sel between dask==2.25.0 and dask==2.26.0 702646191
711034128 https://github.com/pydata/xarray/issues/4428#issuecomment-711034128 https://api.github.com/repos/pydata/xarray/issues/4428 MDEyOklzc3VlQ29tbWVudDcxMTAzNDEyOA== TomAugspurger 1312546 2020-10-17T15:54:48Z 2020-10-17T15:54:48Z MEMBER

I assume that the indices [np.argsort(da.x.data)] are not going to be monotonically increasing. That induces a different slicing pattern. The docs at https://docs.dask.org/en/latest/array-slicing.html#efficiency describe the case where the indices are sorted, but don't discuss the non-sorted case (yet).
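A tiny, assumed setup showing the difference in question: dask slices a chunked array along its existing chunk boundaries for a monotonically increasing indexer, while the shuffled indexer that an argsort of unsorted labels produces can lead to a different chunking (details vary by dask version).

```python
import dask.array as da
import numpy as np

x = da.ones((10,), chunks=5)

sorted_idx = np.array([0, 1, 6, 7])    # monotonically increasing
shuffled_idx = np.array([7, 0, 6, 1])  # out of order

print(x[sorted_idx].chunks)    # follows the existing chunk boundaries
print(x[shuffled_idx].chunks)  # can come out differently
```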

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Behaviour change in xarray.Dataset.sortby/sel between dask==2.25.0 and dask==2.26.0 702646191
709539887 https://github.com/pydata/xarray/issues/4428#issuecomment-709539887 https://api.github.com/repos/pydata/xarray/issues/4428 MDEyOklzc3VlQ29tbWVudDcwOTUzOTg4Nw== TomAugspurger 1312546 2020-10-15T19:20:53Z 2020-10-15T19:20:53Z MEMBER

Closing the loop here: with https://github.com/dask/dask/pull/6665, the behavior of dask==2.25.0 should be restored (possibly with a warning about creating large chunks).

So this can probably be closed, though there may be parts of xarray that should be updated to avoid creating large chunks, or we could rely on the user to do that through the dask config system.
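For reference, a sketch of the config-based opt-out mentioned above, using dask's documented array.slicing.split_large_chunks setting; the Dataset built here is just a stand-in.

```python
import dask
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {"v": ("x", np.random.rand(100))},
    coords={"x": np.arange(100)[::-1]},
).chunk({"x": 10})

# Opt out of the large-chunk splitting (and its warning) for this block only.
with dask.config.set(**{"array.slicing.split_large_chunks": False}):
    result = ds.sortby("x")
```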

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Behaviour change in xarray.Dataset.sortby/sel between dask==2.25.0 and dask==2.26.0 702646191
694817581 https://github.com/pydata/xarray/pull/4432#issuecomment-694817581 https://api.github.com/repos/pydata/xarray/issues/4432 MDEyOklzc3VlQ29tbWVudDY5NDgxNzU4MQ== TomAugspurger 1312546 2020-09-18T11:36:49Z 2020-09-18T11:36:49Z MEMBER

I'm not sure, but I don't think so. It's strange that it didn't fail on the pull request.

On Thu, Sep 17, 2020 at 8:51 PM Maximilian Roos notifications@github.com wrote:

Might be best to proceed with #4434 https://github.com/pydata/xarray/pull/4434 for now. I'll need to give this a bit of thought.

OK, as you wish, I'll merge if that passes.

But your change did pass before the merge. Could it be a conflict (in functionality, not git) with recent changes on master?


{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Fix optimize for chunked DataArray 703881154
694594817 https://github.com/pydata/xarray/pull/4432#issuecomment-694594817 https://api.github.com/repos/pydata/xarray/issues/4432 MDEyOklzc3VlQ29tbWVudDY5NDU5NDgxNw== TomAugspurger 1312546 2020-09-18T01:27:30Z 2020-09-18T01:27:30Z MEMBER

Might be best to proceed with https://github.com/pydata/xarray/pull/4434 for now. I'll need to give this a bit of thought.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Fix optimize for chunked DataArray 703881154
694593225 https://github.com/pydata/xarray/pull/4432#issuecomment-694593225 https://api.github.com/repos/pydata/xarray/issues/4432 MDEyOklzc3VlQ29tbWVudDY5NDU5MzIyNQ== TomAugspurger 1312546 2020-09-18T01:22:43Z 2020-09-18T01:22:43Z MEMBER

Huh, I'm able to reproduce locally. Looking into it now.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Fix optimize for chunked DataArray 703881154
691083939 https://github.com/pydata/xarray/issues/4406#issuecomment-691083939 https://api.github.com/repos/pydata/xarray/issues/4406 MDEyOklzc3VlQ29tbWVudDY5MTA4MzkzOQ== TomAugspurger 1312546 2020-09-11T13:07:00Z 2020-09-11T13:07:00Z MEMBER

@TomAugspurger do you know off-hand if there have been any recent changes in Dask's scheduler that could have caused this?

This is just using Dask's threaded scheduler, right? I don't recall any changes there recently.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Threading Lock issue with to_netcdf and Dask arrays 694112301
690378323 https://github.com/pydata/xarray/issues/3698#issuecomment-690378323 https://api.github.com/repos/pydata/xarray/issues/3698 MDEyOklzc3VlQ29tbWVudDY5MDM3ODMyMw== TomAugspurger 1312546 2020-09-10T15:42:54Z 2020-09-10T15:42:54Z MEMBER

Thanks for confirming. I'll take another look at this today then.

On Thu, Sep 10, 2020 at 10:30 AM Deepak Cherian notifications@github.com wrote:

Reopened #3698 https://github.com/pydata/xarray/issues/3698.


{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  dask.optimize on xarray objects 550355524
689808725 https://github.com/pydata/xarray/issues/3698#issuecomment-689808725 https://api.github.com/repos/pydata/xarray/issues/3698 MDEyOklzc3VlQ29tbWVudDY4OTgwODcyNQ== TomAugspurger 1312546 2020-09-09T20:38:39Z 2020-09-09T20:38:39Z MEMBER

FYI, @dcherian your recent PR to dask fixed this example. Playing around with chunk sizes, it seems to have fixed it even when the chunk size exceeds dask.config['array']['chunk-size'].

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  dask.optimize on xarray objects 550355524
668256401 https://github.com/pydata/xarray/issues/3147#issuecomment-668256401 https://api.github.com/repos/pydata/xarray/issues/3147 MDEyOklzc3VlQ29tbWVudDY2ODI1NjQwMQ== TomAugspurger 1312546 2020-08-03T21:42:42Z 2020-08-03T21:42:42Z MEMBER

Thanks for that link. I hope that map_overlap could use pad internally for the external boundaries.

On Mon, Aug 3, 2020 at 3:22 PM Deepak Cherian notifications@github.com wrote:

This issue about coordinate labels for boundaries exists with pad too:

3868 https://github.com/pydata/xarray/issues/3868

Can map_overlap just use DataArray.pad and we can fix things there?

Or perhaps we can expect users to add a call to pad before map_overlap?


{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implementing map_blocks and map_overlap 470024896
668242904 https://github.com/pydata/xarray/pull/4305#issuecomment-668242904 https://api.github.com/repos/pydata/xarray/issues/4305 MDEyOklzc3VlQ29tbWVudDY2ODI0MjkwNA== TomAugspurger 1312546 2020-08-03T21:08:38Z 2020-08-03T21:08:38Z MEMBER

The doc failure looks unrelated:

```
Exception in /home/docs/checkouts/readthedocs.org/user_builds/xray/checkouts/4305/doc/plotting.rst at block ending on line None
Specify :okexcept: as an option in the ipython:: block to suppress this message

KeyError                                  Traceback (most recent call last)
<ipython-input-75-c7d6afd7f8c5> in <module>
----> 1 g_simple = t.plot(x="lon", y="lat", col="time", col_wrap=3)

~/checkouts/readthedocs.org/user_builds/xray/checkouts/4305/xarray/plot/plot.py in __call__(self, **kwargs)
    444
    445     def __call__(self, **kwargs):
--> 446         return plot(self._da, **kwargs)
    447
    448     # we can't use functools.wraps here since that also modifies the name / qualname

~/checkouts/readthedocs.org/user_builds/xray/checkouts/4305/xarray/plot/plot.py in plot(darray, row, col, col_wrap, ax, hue, rtol, subplot_kws, **kwargs)
    198         kwargs["ax"] = ax
    199
--> 200     return plotfunc(darray, **kwargs)
    201
    202

~/checkouts/readthedocs.org/user_builds/xray/checkouts/4305/xarray/plot/plot.py in newplotfunc(darray, x, y, figsize, size, aspect, ax, row, col, col_wrap, xincrease, yincrease, add_colorbar, add_labels, vmin, vmax, cmap, center, robust, extend, levels, infer_intervals, colors, subplot_kws, cbar_ax, cbar_kwargs, xscale, yscale, xticks, yticks, xlim, ylim, norm, **kwargs)
    636             # Need the decorated plotting function
    637             allargs["plotfunc"] = globals()[plotfunc.__name__]
--> 638             return _easy_facetgrid(darray, kind="dataarray", **allargs)
    639
    640     plt = import_matplotlib_pyplot()

~/checkouts/readthedocs.org/user_builds/xray/checkouts/4305/xarray/plot/facetgrid.py in _easy_facetgrid(data, plotfunc, kind, x, y, row, col, col_wrap, sharex, sharey, aspect, size, subplot_kws, ax, figsize, **kwargs)
    642
    643     if kind == "dataarray":
--> 644         return g.map_dataarray(plotfunc, x, y, **kwargs)
    645
    646     if kind == "dataset":

~/checkouts/readthedocs.org/user_builds/xray/checkouts/4305/xarray/plot/facetgrid.py in map_dataarray(self, func, x, y, **kwargs)
    263         # Get x, y labels for the first subplot
    264         x, y = _infer_xy_labels(
--> 265             darray=self.data.loc[self.name_dicts.flat[0]],
    266             x=x,
    267             y=y,

~/checkouts/readthedocs.org/user_builds/xray/checkouts/4305/xarray/core/dataarray.py in __getitem__(self, key)
    196             labels = indexing.expanded_indexer(key, self.data_array.ndim)
    197             key = dict(zip(self.data_array.dims, labels))
--> 198         return self.data_array.sel(**key)
    199
    200     def __setitem__(self, key, value) -> None:

~/checkouts/readthedocs.org/user_builds/xray/checkouts/4305/xarray/core/dataarray.py in sel(self, indexers, method, tolerance, drop, **indexers_kwargs)
   1147
   1148         """
-> 1149         ds = self._to_temp_dataset().sel(
   1150             indexers=indexers,
   1151             drop=drop,

~/checkouts/readthedocs.org/user_builds/xray/checkouts/4305/xarray/core/dataset.py in sel(self, indexers, method, tolerance, drop, **indexers_kwargs)
   2099         """
   2100         indexers = either_dict_or_kwargs(indexers, indexers_kwargs, "sel")
-> 2101         pos_indexers, new_indexes = remap_label_indexers(
   2102             self, indexers=indexers, method=method, tolerance=tolerance
   2103         )

~/checkouts/readthedocs.org/user_builds/xray/checkouts/4305/xarray/core/coordinates.py in remap_label_indexers(obj, indexers, method, tolerance, **indexers_kwargs)
    394     }
    395
--> 396     pos_indexers, new_indexes = indexing.remap_label_indexers(
    397         obj, v_indexers, method=method, tolerance=tolerance
    398     )

~/checkouts/readthedocs.org/user_builds/xray/checkouts/4305/xarray/core/indexing.py in remap_label_indexers(data_obj, indexers, method, tolerance)
    268         coords_dtype = data_obj.coords[dim].dtype
    269         label = maybe_cast_to_coords_dtype(label, coords_dtype)
--> 270         idxr, new_idx = convert_label_indexer(index, label, dim, method, tolerance)
    271         pos_indexers[dim] = idxr
    272         if new_idx is not None:

~/checkouts/readthedocs.org/user_builds/xray/checkouts/4305/xarray/core/indexing.py in convert_label_indexer(index, label, index_name, method, tolerance)
    187             indexer = index.get_loc(label.item())
    188         else:
--> 189             indexer = index.get_loc(
    190                 label.item(), method=method, tolerance=tolerance
    191             )

~/checkouts/readthedocs.org/user_builds/xray/conda/4305/lib/python3.8/site-packages/pandas/core/indexes/datetimes.py in get_loc(self, key, method, tolerance)
    620         else:
    621             # unrecognized type
--> 622             raise KeyError(key)
    623
    624     try:

KeyError: 1356998400000000000  <<<-------------------------------------------------------------------------
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Fix map_blocks examples 672281867
668209121 https://github.com/pydata/xarray/issues/3147#issuecomment-668209121 https://api.github.com/repos/pydata/xarray/issues/3147 MDEyOklzc3VlQ29tbWVudDY2ODIwOTEyMQ== TomAugspurger 1312546 2020-08-03T19:47:47Z 2020-08-03T19:47:57Z MEMBER

I'm thinking through a map_overlap API right now. In dask, map_overlap requires a few extra arguments:

depth: int, tuple, dict or list
    The number of elements that each block should share with its neighbors.
    If a tuple or dict then this can be different per axis. If a list then
    each element of that list must be an int, tuple or dict defining depth
    for the corresponding array in `args`. Asymmetric depths may be
    specified using a dict value of (-/+) tuples. Note that asymmetric
    depths are currently only supported when ``boundary`` is 'none'. The
    default value is 0.
boundary: str, tuple, dict or list
    How to handle the boundaries. Values include 'reflect', 'periodic',
    'nearest', 'none', or any constant value like 0 or np.nan. If a list
    then each element must be a str, tuple or dict defining the boundary
    for the corresponding array in `args`. The default value is 'reflect'.

In dask.array those must be dicts whose keys are the axis number. For xarray we would want to allow the dimension names there.

I'm not sure how to handle the DataArray labels for the boundary chunks (dask docs at https://docs.dask.org/en/latest/array-overlap.html#boundaries). For reflect / periodic I think things are OK, we perhaps just use the label associated with that value. I'm not sure what to do for constants.
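As a rough sketch of the name-to-axis translation described above: this is not an existing xarray API; the helper, its signature, and the coordinate handling it omits are all assumptions, and it relies on a recent dask where map_overlap takes the function first.

```python
import dask.array as da
import numpy as np
import xarray as xr


def map_overlap_sketch(func, obj: xr.DataArray, depth: dict, boundary: dict):
    # Translate dimension names into the axis-number dicts dask expects.
    axis_depth = {obj.get_axis_num(dim): d for dim, d in depth.items()}
    axis_boundary = {obj.get_axis_num(dim): b for dim, b in boundary.items()}
    data = da.map_overlap(func, obj.data, depth=axis_depth, boundary=axis_boundary)
    # Coordinate / label handling for the boundary regions is the open question.
    return xr.DataArray(data, dims=obj.dims)


arr = xr.DataArray(np.arange(20.0), dims="x").chunk({"x": 5})
out = map_overlap_sketch(lambda block: block, arr, depth={"x": 1}, boundary={"x": "reflect"})
```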

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implementing map_blocks and map_overlap 470024896
663584770 https://github.com/pydata/xarray/pull/4256#issuecomment-663584770 https://api.github.com/repos/pydata/xarray/issues/4256 MDEyOklzc3VlQ29tbWVudDY2MzU4NDc3MA== TomAugspurger 1312546 2020-07-24T15:06:03Z 2020-07-24T15:06:03Z MEMBER

Yep. I believe that @ogrisel can add you to the organization on anaconda.org so that you can create a key to upload packages.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  fix matplotlib errors for single level discrete colormaps 664363493
663082208 https://github.com/pydata/xarray/pull/4254#issuecomment-663082208 https://api.github.com/repos/pydata/xarray/issues/4254 MDEyOklzc3VlQ29tbWVudDY2MzA4MjIwOA== TomAugspurger 1312546 2020-07-23T15:45:57Z 2020-07-23T15:45:57Z MEMBER

FYI https://github.com/pandas-dev/pandas/pull/35393 is the PR to follow. It'll be included in pandas 1.1.0, which should be out in a week or so.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  fix the RTD timeouts 663977922
641332231 https://github.com/pydata/xarray/issues/4133#issuecomment-641332231 https://api.github.com/repos/pydata/xarray/issues/4133 MDEyOklzc3VlQ29tbWVudDY0MTMzMjIzMQ== TomAugspurger 1312546 2020-06-09T14:24:59Z 2020-06-09T14:31:26Z MEMBER

Ah, the (numpy) build failure is because pandas doesn't have a py38 entry in our pyproject.toml. Fixing that now.

edit: https://github.com/pandas-dev/pandas/pull/34667. But you'll still want to update your CI at https://github.com/pydata/xarray/blob/2a288f6ed4286910fcf3ab9895e1e9cbd44d30b4/ci/azure/install.yml#L16 and https://github.com/pydata/xarray/blob/2a288f6ed4286910fcf3ab9895e1e9cbd44d30b4/ci/azure/install.yml#L23 to pull from the new locations.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  upstream-dev failure when installing pandas 634979933
641330288 https://github.com/pydata/xarray/issues/4133#issuecomment-641330288 https://api.github.com/repos/pydata/xarray/issues/4133 MDEyOklzc3VlQ29tbWVudDY0MTMzMDI4OA== TomAugspurger 1312546 2020-06-09T14:22:02Z 2020-06-09T14:22:02Z MEMBER

@keewis not sure about the build issue, but we (along with many other projects) recently moved our wheels to upload to https://anaconda.org/scipy-wheels-nightly/. https://anaconda.org/scipy-wheels-nightly/pandas/ does have py38 wheels.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  upstream-dev failure when installing pandas 634979933
636808986 https://github.com/pydata/xarray/issues/4112#issuecomment-636808986 https://api.github.com/repos/pydata/xarray/issues/4112 MDEyOklzc3VlQ29tbWVudDYzNjgwODk4Ng== TomAugspurger 1312546 2020-06-01T11:44:23Z 2020-06-01T11:44:23Z MEMBER

Rechunking the indexer array is how I would be explicit about the desired chunk size. Opened https://github.com/dask/dask/issues/6270 to discuss this on the dask side.

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Unexpected chunking behavior when using `xr.align` with `join='outer'` 627600168
622128514 https://github.com/pydata/xarray/pull/3816#issuecomment-622128514 https://api.github.com/repos/pydata/xarray/issues/3816 MDEyOklzc3VlQ29tbWVudDYyMjEyODUxNA== TomAugspurger 1312546 2020-04-30T21:38:21Z 2020-04-30T21:38:21Z MEMBER

Makes sense. template seems fine.

On Thu, Apr 30, 2020 at 3:35 PM Deepak Cherian notifications@github.com wrote:

Thanks for the review @TomAugspurger https://github.com/TomAugspurger

Question on the name template. I think in dask.dataframe and dask.array we might call this meta. Is that keyword already used elsewhere in xarray? template is also a fine name though.

I added the meta kwarg to apply_ufunc so that users could pass that down to dask i.e. that meta = dask's meta = np.ndarray or something like that. So I'd like to avoid reusing meta here where it would exclusively be an xarray object ≠ dask's meta

BUT it seems to me like there's a better name than template. Any ideas?


{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add template xarray object kwarg to map_blocks 573768194
592101136 https://github.com/pydata/xarray/issues/3698#issuecomment-592101136 https://api.github.com/repos/pydata/xarray/issues/3698 MDEyOklzc3VlQ29tbWVudDU5MjEwMTEzNg== TomAugspurger 1312546 2020-02-27T18:13:28Z 2020-02-27T18:13:28Z MEMBER

It looks like xarray is getting a bad task graph after the optimize.

```python
In [1]: import xarray as xr
   ...: import dask

In [2]: import dask

In [3]: a = dask.array.ones((10,5), chunks=(1,3))
   ...: a = dask.optimize(a)[0]

In [4]: da = xr.DataArray(a.compute()).chunk({"dim_0": 5})
   ...: da = dask.optimize(da)[0]

In [5]: dict(da.__dask_graph__())
Out[5]:
{('xarray-<this-array>-e2865aa10d476e027154771611541f99', 1, 0): (<function _operator.getitem(a, b, /)>,
  'xarray-<this-array>-e2865aa10d476e027154771611541f99',
  (slice(5, 10, None), slice(0, 5, None))),
 ('xarray-<this-array>-e2865aa10d476e027154771611541f99', 0, 0): (<function _operator.getitem(a, b, /)>,
  'xarray-<this-array>-e2865aa10d476e027154771611541f99',
  (slice(0, 5, None), slice(0, 5, None)))}
```

Notice that there are references to xarray-<this-array>-e2865aa10d476e027154771611541f99 (just the string, not a tuple representing a chunk), but that key isn't in the graph.

If we manually insert that key, you'll see that things work:

```python
In [9]: dsk['xarray-<this-array>-e2865aa10d476e027154771611541f99'] = da._to_temp_dataset()[xr.core.dataarray._THIS_ARRAY]

In [11]: dask.get(dsk, keys=[('xarray-<this-array>-e2865aa10d476e027154771611541f99', 1, 0)])
Out[11]:
(<xarray.DataArray <this-array> (dim_0: 5, dim_1: 5)>
 dask.array<getitem, shape=(5, 5), dtype=float64, chunksize=(5, 5), chunktype=numpy.ndarray>
 Dimensions without coordinates: dim_0, dim_1,)
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  dask.optimize on xarray objects 550355524
582972083 https://github.com/pydata/xarray/issues/3751#issuecomment-582972083 https://api.github.com/repos/pydata/xarray/issues/3751 MDEyOklzc3VlQ29tbWVudDU4Mjk3MjA4Mw== TomAugspurger 1312546 2020-02-06T15:55:30Z 2020-02-06T15:55:30Z MEMBER

FWIW, I think @jbrockmendel is still progressing on an "extension index" interface where you could have a custom dtype / Index subclass that would be properly supported. Long-term, that's the best solution.

Short-term, I'm less sure what's best.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  more upstream-dev cftime failures 559873728
580462361 https://github.com/pydata/xarray/pull/3640#issuecomment-580462361 https://api.github.com/repos/pydata/xarray/issues/3640 MDEyOklzc3VlQ29tbWVudDU4MDQ2MjM2MQ== TomAugspurger 1312546 2020-01-30T21:13:09Z 2020-01-30T21:13:09Z MEMBER

Is my interpretation correct?

Yep, that's the basic idea. Every call to DataFrame.plot.<kind> begins with a check for the active backend. Based on the configured value, we look up the correct backend, make the call, and return the result.
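For illustration, a stripped-down sketch of that dispatch, loosely modeled on pandas' plotting-backend machinery; the entry-point group name and helper functions are made up, and the entry_points(group=...) call needs Python 3.10+.

```python
from importlib.metadata import entry_points

ACTIVE_BACKEND = "matplotlib"  # would normally come from a config option


def load_plot_backend(name=None):
    name = name or ACTIVE_BACKEND
    # Third-party backends would register under a dedicated entry-point group.
    for ep in entry_points(group="xarray.plotting_backends"):
        if ep.name == name:
            return ep.load()
    raise ValueError(f"no plotting backend registered as {name!r}")


def plot(darray, **kwargs):
    backend = load_plot_backend()          # check the active backend first
    return backend.plot(darray, **kwargs)  # delegate the call, return the result
```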

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add entrypoint for plotting backends 539394615
579517151 https://github.com/pydata/xarray/issues/3673#issuecomment-579517151 https://api.github.com/repos/pydata/xarray/issues/3673 MDEyOklzc3VlQ29tbWVudDU3OTUxNzE1MQ== TomAugspurger 1312546 2020-01-28T23:12:47Z 2020-01-28T23:12:47Z MEMBER

FYI, we had some failures in our nightly wheel builds so they weren't updated in a while. https://github.com/MacPython/pandas-wheels/pull/70 fixed that, so you'll hopefully get a new wheel tonight.

On Tue, Jan 28, 2020 at 5:09 PM Deepak Cherian notifications@github.com wrote:

should be closed by pandas-dev/pandas#31136 https://github.com/pandas-dev/pandas/pull/31136 . I think the tests will turn green once the wheels update


{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Test failures with pandas master 547012915
575688251 https://github.com/pydata/xarray/issues/3673#issuecomment-575688251 https://api.github.com/repos/pydata/xarray/issues/3673 MDEyOklzc3VlQ29tbWVudDU3NTY4ODI1MQ== TomAugspurger 1312546 2020-01-17T16:06:23Z 2020-01-17T16:06:23Z MEMBER

Opened https://github.com/pandas-dev/pandas/issues/31109.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Test failures with pandas master 547012915
574256856 https://github.com/pydata/xarray/issues/3673#issuecomment-574256856 https://api.github.com/repos/pydata/xarray/issues/3673 MDEyOklzc3VlQ29tbWVudDU3NDI1Njg1Ng== TomAugspurger 1312546 2020-01-14T16:25:50Z 2020-01-14T16:25:50Z MEMBER

@jbrockmendel likely knows more about the index arithmetic issue.

```python
In [22]: import xarray as xr

In [23]: import pandas as pd

In [24]: idx = pd.timedelta_range("1D", periods=5, freq="D")

In [25]: a = xr.cftime_range("2000", periods=5)

In [26]: idx + a
/Users/taugspurger/sandbox/pandas/pandas/core/arrays/datetimelike.py:1204: PerformanceWarning: Adding/subtracting array of DateOffsets to TimedeltaArray not vectorized
  PerformanceWarning,
Out[26]:
Index([2000-01-02 00:00:00, 2000-01-04 00:00:00, 2000-01-06 00:00:00,
       2000-01-08 00:00:00, 2000-01-10 00:00:00],
      dtype='object')

In [27]: a + idx
Out[27]:
CFTimeIndex([2000-01-02 00:00:00, 2000-01-04 00:00:00, 2000-01-06 00:00:00,
             2000-01-08 00:00:00, 2000-01-10 00:00:00],
            dtype='object')
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Test failures with pandas master 547012915
569820784 https://github.com/pydata/xarray/issues/2666#issuecomment-569820784 https://api.github.com/repos/pydata/xarray/issues/2666 MDEyOklzc3VlQ29tbWVudDU2OTgyMDc4NA== TomAugspurger 1312546 2019-12-30T22:58:23Z 2019-12-30T22:58:23Z MEMBER

I think this is basically the same change.

Ah, I was mistaken. I was thinking we needed to plumb a dtype argument all the way through there, but I don't think that's necessary. I may be able to submit a PR with a dtypes argument for from_dataframe tomorrow.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Dataset.from_dataframe will produce a FutureWarning for DatetimeTZ data 398107776
569810375 https://github.com/pydata/xarray/issues/2666#issuecomment-569810375 https://api.github.com/repos/pydata/xarray/issues/2666 MDEyOklzc3VlQ29tbWVudDU2OTgxMDM3NQ== TomAugspurger 1312546 2019-12-30T22:07:30Z 2019-12-30T22:07:30Z MEMBER

And there are a couple of places that need updating, even with a dtypes argument to let the user specify things. We also hit this via Dataset.__setitem__:

```pytb
~/sandbox/xarray/xarray/core/dataset.py in __setitem__(self, key, value)
   1268             )
   1269
-> 1270         self.update({key: value})
   1271
   1272     def __delitem__(self, key: Hashable) -> None:

~/sandbox/xarray/xarray/core/dataset.py in update(self, other, inplace)
   3521         """
   3522         _check_inplace(inplace)
-> 3523         merge_result = dataset_update_method(self, other)
   3524         return self._replace(inplace=True, **merge_result._asdict())
   3525

~/sandbox/xarray/xarray/core/merge.py in dataset_update_method(dataset, other)
    862                 other[key] = value.drop_vars(coord_names)
    863
--> 864     return merge_core([dataset, other], priority_arg=1, indexes=dataset.indexes)

~/sandbox/xarray/xarray/core/merge.py in merge_core(objects, compat, join, priority_arg, explicit_coords, indexes, fill_value)
    550         coerced, join=join, copy=False, indexes=indexes, fill_value=fill_value
    551     )
--> 552     collected = collect_variables_and_indexes(aligned)
    553
    554     prioritized = _get_priority_vars_and_indexes(aligned, priority_arg, compat=compat)

~/sandbox/xarray/xarray/core/merge.py in collect_variables_and_indexes(list_of_mappings)
    275                 append_all(coords, indexes)
    276
--> 277             variable = as_variable(variable, name=name)
    278             if variable.dims == (name,):
    279                 variable = variable.to_index_variable()

~/sandbox/xarray/xarray/core/variable.py in as_variable(obj, name)
    105     elif isinstance(obj, tuple):
    106         try:
--> 107             obj = Variable(*obj)
    108         except (TypeError, ValueError) as error:
    109             # use .format() instead of % because it handles tuples consistently

~/sandbox/xarray/xarray/core/variable.py in __init__(self, dims, data, attrs, encoding, fastpath)
    306             unrecognized encoding items.
    307         """
--> 308         self._data = as_compatible_data(data, fastpath=fastpath)
    309         self._dims = self._parse_dimensions(dims)
    310         self._attrs = None

~/sandbox/xarray/xarray/core/variable.py in as_compatible_data(data, fastpath)
    229     if isinstance(data, np.ndarray):
    230         if data.dtype.kind == "O":
--> 231             data = _possibly_convert_objects(data)
    232         elif data.dtype.kind == "M":
    233             data = np.asarray(data, "datetime64[ns]")

~/sandbox/xarray/xarray/core/variable.py in _possibly_convert_objects(values)
    165     datetime64 and timedelta64, according to the pandas convention.
    166     """
--> 167     return np.asarray(pd.Series(values.ravel())).reshape(values.shape)
    168
    169

~/sandbox/numpy/numpy/core/_asarray.py in asarray(a, dtype, order)
     83
     84     """
---> 85     return array(a, dtype, copy=False, order=order)
     86
     87

~/sandbox/pandas/pandas/core/series.py in __array__(self, dtype)
    730             "To keep the old behavior, pass 'dtype=\"datetime64[ns]\"'."
    731         )
--> 732         warnings.warn(msg, FutureWarning, stacklevel=3)
    733         dtype = "M8[ns]"
    734         return np.asarray(self.array, dtype)
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Dataset.from_dataframe will produce a FutureWarning for DatetimeTZ data 398107776
569805431 https://github.com/pydata/xarray/issues/2666#issuecomment-569805431 https://api.github.com/repos/pydata/xarray/issues/2666 MDEyOklzc3VlQ29tbWVudDU2OTgwNTQzMQ== TomAugspurger 1312546 2019-12-30T21:45:41Z 2019-12-30T21:48:39Z MEMBER

Just FYI, we're potentially enforcing this deprecation in https://github.com/pandas-dev/pandas/pull/30563 (which would be included in a pandas release in a week or two). Is that likely to cause problems for xarray users?

It's not clear to me what the desired behavior is (https://github.com/pydata/xarray/issues/3291 seems to want to preserve the tz, though it isn't clear they are willing to be forced into an object dtype array for it).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Dataset.from_dataframe will produce a FutureWarning for DatetimeTZ data 398107776
562310739 https://github.com/pydata/xarray/pull/3598#issuecomment-562310739 https://api.github.com/repos/pydata/xarray/issues/3598 MDEyOklzc3VlQ29tbWVudDU2MjMxMDczOQ== TomAugspurger 1312546 2019-12-05T20:47:02Z 2019-12-05T20:47:02Z MEMBER

Hopefully the new comments make sense. I'm struggling a bit to explain things since I don't fully understand them myself :)

So it was a graph construction issue.

I think so. Dask doesn't actually validate the arguments passed to HighLevelGraph, but I believe we assume that all the values in dependencies are themselves keys of layers. We didn't have that before, with things like

(Pdb) pp collections[0].dask.dependencies
{'all-84bc51ac43a9275b3662b0089710eab9': {'or_-64f95b81b2f8001b4c61f2023ac4c223'},
 ...
 'eq-abac622d95ce5055d3e7b7dea944ec37': {'lambda-e79de3edfa267f41111057d26471bce3-x',
                                         'ones-c4a83f4b990021618d55e0fa61a351d6'},
 ...
}

The 'lambda-e79de3edfa267f41111057d26471bce3-x' wasn't a layer of the graph. It was previously nested under the single new layer we were creating (gname, or lambda-e79de3edfa267f41111057d26471bce3 in this case).
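A quick way to check the invariant being discussed, on a small assumed example (any dask collection's __dask_graph__() returns a HighLevelGraph with layers and dependencies):

```python
import dask.array as da

hlg = (da.ones((10, 10), chunks=5) + 1).__dask_graph__()

# Every layer name referenced in `dependencies` should itself be a layer.
referenced = set().union(*hlg.dependencies.values())
assert referenced <= set(hlg.layers), "dependencies reference unknown layers"
print(sorted(hlg.layers))
```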

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Fix map_blocks HLG layering 533555794
561794415 https://github.com/pydata/xarray/pull/3584#issuecomment-561794415 https://api.github.com/repos/pydata/xarray/issues/3584 MDEyOklzc3VlQ29tbWVudDU2MTc5NDQxNQ== TomAugspurger 1312546 2019-12-04T19:09:34Z 2019-12-04T19:09:34Z MEMBER

@mrocklin if you get a chance, can you confirm that the values in HighLevelGraph.dependencies should be a subset of the keys of layers?

So in the following, the lambda-<...>-x is problematic, because it's not a key in layers?

```python
(Pdb) pp list(self.layers)
['eq-e98e52fb2b8e27b4b5158d399330c72d',
 'lambda-0f1d0bc5e7df462d7125839aed006e04',
 'ones-c4a83f4b990021618d55e0fa61a351d6']
(Pdb) pp self.dependencies
{'eq-e98e52fb2b8e27b4b5158d399330c72d': {'lambda-0f1d0bc5e7df462d7125839aed006e04-x',
                                         'ones-c4a83f4b990021618d55e0fa61a351d6'},
 'lambda-0f1d0bc5e7df462d7125839aed006e04': {'ones-c4a83f4b990021618d55e0fa61a351d6'},
 'ones-c4a83f4b990021618d55e0fa61a351d6': set()}
```

That's coming from the name of the DataArray / the dask array in DataArray.data.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Make dask names change when chunking Variables by different amounts. 530657789
561773837 https://github.com/pydata/xarray/pull/3584#issuecomment-561773837 https://api.github.com/repos/pydata/xarray/issues/3584 MDEyOklzc3VlQ29tbWVudDU2MTc3MzgzNw== TomAugspurger 1312546 2019-12-04T18:17:56Z 2019-12-04T18:17:56Z MEMBER

So this is enough to fix this in Dask

```diff
diff --git a/dask/blockwise.py b/dask/blockwise.py
index 52a36c246..84e0ecc08 100644
--- a/dask/blockwise.py
+++ b/dask/blockwise.py
@@ -818,7 +818,7 @@ def fuse_roots(graph: HighLevelGraph, keys: list):
         if (
             isinstance(layer, Blockwise)
             and len(deps) > 1
-            and not any(dependencies[dep] for dep in deps)  # no need to fuse if 0 or 1
+            and not any(dependencies.get(dep, {}) for dep in deps)  # no need to fuse if 0 or 1
             and all(len(dependents[dep]) == 1 for dep in deps)
         ):
             new = toolz.merge(layer, *[layers[dep] for dep in deps])
```

I'm trying to understand why we're getting this KeyError though. I want to make sure that we have a valid HighLevelGraph before making that change.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Make dask names change when chunking Variables by different amounts. 530657789
510217080 https://github.com/pydata/xarray/issues/2501#issuecomment-510217080 https://api.github.com/repos/pydata/xarray/issues/2501 MDEyOklzc3VlQ29tbWVudDUxMDIxNzA4MA== TomAugspurger 1312546 2019-07-10T20:30:41Z 2019-07-10T20:30:41Z MEMBER

Yep, that’s my suspicion as well. I’m still plugging away at it. Currently the pausing logic isn’t quite working well.

On Jul 10, 2019, at 12:10, Ryan Abernathey notifications@github.com wrote:

I believe that the memory issue is basically the same as dask/distributed#2602.

The graphs look like: read --> rechunk --> write.

Reading and rechunking increase memory consumption. Writing relieves it. In Rich's case, the workers just load too much data before they write it. Eventually they run out of memory.


{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset usage and limitations. 372848074
510167911 https://github.com/pydata/xarray/issues/2501#issuecomment-510167911 https://api.github.com/repos/pydata/xarray/issues/2501 MDEyOklzc3VlQ29tbWVudDUxMDE2NzkxMQ== TomAugspurger 1312546 2019-07-10T18:05:07Z 2019-07-10T18:05:07Z MEMBER

Great, thanks. I’ll look into the memory issue when writing. We may already have an issue for it.

On Jul 10, 2019, at 10:59, Rich Signell notifications@github.com wrote:

@TomAugspurger , I sat down here at Scipy with @rabernat and he instantly realized that we needed to drop the feature_id coordinate to prevent open_mfdataset from trying to harmonize that coordinate from all the chunks.

So if I use this code, the open_mfdataset command finishes:

def drop_coords(ds):
    ds = ds.drop(['reference_time','feature_id'])
    return ds.reset_coords(drop=True)

and I can then add back in the dropped coordinate values at the end:

dsets = [xr.open_dataset(f) for f in files[:3]]
ds.coords['feature_id'] = dsets[0].coords['feature_id']

I'm now running into memory issues when I write the zarr data -- but I should raise that as a new issue, right?


{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset usage and limitations. 372848074
509346055 https://github.com/pydata/xarray/issues/2501#issuecomment-509346055 https://api.github.com/repos/pydata/xarray/issues/2501 MDEyOklzc3VlQ29tbWVudDUwOTM0NjA1NQ== TomAugspurger 1312546 2019-07-08T18:46:58Z 2019-07-08T18:46:58Z MEMBER

@rsignell-usgs very helpful, thanks. I'd noticed that there was a pause after the open_dataset tasks finish, indicating that either the scheduler or (more likely) the client was doing work rather than the cluster. Most likely @rabernat's guess

In open_mfdataset, all of the dimensions and coordinates of the individual files have to be checked and verified to be compatible. That is often the source of slow performance with open_mfdataset.

is correct. Verifying all that now, and looking into whether / how that can be done on the workers.
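For reference, a hedged example of the usual mitigations (the file glob and dimension name are placeholders): open each file in a dask task on the cluster and keep the cross-file coordinate compatibility checking to a minimum.

```python
import xarray as xr

ds = xr.open_mfdataset(
    "data/*.nc",
    parallel=True,       # open/preprocess each file in a dask task
    combine="nested",
    concat_dim="time",
    coords="minimal",
    compat="override",   # skip expensive per-variable equality checks
)
```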

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset usage and limitations. 372848074
509307081 https://github.com/pydata/xarray/issues/2501#issuecomment-509307081 https://api.github.com/repos/pydata/xarray/issues/2501 MDEyOklzc3VlQ29tbWVudDUwOTMwNzA4MQ== TomAugspurger 1312546 2019-07-08T16:57:15Z 2019-07-08T16:57:15Z MEMBER

I'm looking into it today. Can you clarify

The memory use kept growing until the process died.

by "process" do you mean a dask worker process, or just the main python process executing the ds = xr.open_mfdataset(...) code?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset usage and limitations. 372848074
506497180 https://github.com/pydata/xarray/issues/2501#issuecomment-506497180 https://api.github.com/repos/pydata/xarray/issues/2501 MDEyOklzc3VlQ29tbWVudDUwNjQ5NzE4MA== TomAugspurger 1312546 2019-06-27T20:24:26Z 2019-06-27T20:24:26Z MEMBER

The datasets in our cloud datastore are designed explicitly to avoid this problem!

Good to know!

FYI, https://github.com/pydata/xarray/issues/2501#issuecomment-506478508 was user error (I can access it, but need to specify the us-east-1 region). Taking a look now.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset usage and limitations. 372848074
506486503 https://github.com/pydata/xarray/issues/2927#issuecomment-506486503 https://api.github.com/repos/pydata/xarray/issues/2927 MDEyOklzc3VlQ29tbWVudDUwNjQ4NjUwMw== TomAugspurger 1312546 2019-06-27T19:51:58Z 2019-06-27T19:51:58Z MEMBER

Spoke with @martindurant about this today. The mapping should probably strip the protocol from the root provided by the user. Tracking in https://github.com/intake/filesystem_spec/issues/56 (this issue can probably be closed).
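A common pattern after that fix is to let fsspec build the mapping and deal with the protocol itself; the bucket and path here are made up.

```python
import fsspec

store = fsspec.get_mapper("s3://my-bucket/my-dataset.zarr")
# then: ds.to_zarr(store, mode="w")  or  xr.open_zarr(store)
```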

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Data variables empty with to_zarr / from_zarr on s3 if 's3://' in root s3fs string 438166604
506478508 https://github.com/pydata/xarray/issues/2501#issuecomment-506478508 https://api.github.com/repos/pydata/xarray/issues/2501 MDEyOklzc3VlQ29tbWVudDUwNjQ3ODUwOA== TomAugspurger 1312546 2019-06-27T19:25:05Z 2019-06-27T19:25:05Z MEMBER

Thanks, will take a look this afternoon. Are there any datasets on https://pangeo-data.github.io/pangeo-datastore/ that would exhibit this poor behavior? I may not have access to the bucket (or I'm misusing rclone)

2019/06/27 14:23:50 NOTICE: Config file "/Users/taugspurger/.config/rclone/rclone.conf" not found - using defaults
2019/06/27 14:23:50 Failed to create file system for "aws-east:nwm-archive/2009": didn't find section in config file

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset usage and limitations. 372848074
339525582 https://github.com/pydata/xarray/issues/1661#issuecomment-339525582 https://api.github.com/repos/pydata/xarray/issues/1661 MDEyOklzc3VlQ29tbWVudDMzOTUyNTU4Mg== TomAugspurger 1312546 2017-10-26T01:49:12Z 2017-10-26T01:49:12Z MEMBER

Yep, that was the change.

The fix is to explicitly register the converters before plotting:

```python
from pandas.tseries import converter
converter.register()
```
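In later pandas versions the same thing is exposed through a public helper, so the equivalent call there would be:

```python
from pandas.plotting import register_matplotlib_converters

register_matplotlib_converters()
```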

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  da.plot.pcolormesh fails when there is a datetime coordinate 268487752
339510522 https://github.com/pydata/xarray/issues/1661#issuecomment-339510522 https://api.github.com/repos/pydata/xarray/issues/1661 MDEyOklzc3VlQ29tbWVudDMzOTUxMDUyMg== TomAugspurger 1312546 2017-10-26T00:05:57Z 2017-10-26T00:05:57Z MEMBER

Pandas used to register a matplotlib converter for datetimes on import. I’ll take a closer look in a bit.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  da.plot.pcolormesh fails when there is a datetime coordinate 268487752
318451800 https://github.com/pydata/xarray/pull/1457#issuecomment-318451800 https://api.github.com/repos/pydata/xarray/issues/1457 MDEyOklzc3VlQ29tbWVudDMxODQ1MTgwMA== TomAugspurger 1312546 2017-07-27T18:45:36Z 2017-07-27T18:45:36Z MEMBER

Yep, thanks again for setting that up.

On Thu, Jul 27, 2017 at 11:39 AM, Wes McKinney notifications@github.com wrote:

cool, are these numbers coming off the pandabox?


{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/benchmark 236347050
318376827 https://github.com/pydata/xarray/pull/1457#issuecomment-318376827 https://api.github.com/repos/pydata/xarray/issues/1457 MDEyOklzc3VlQ29tbWVudDMxODM3NjgyNw== TomAugspurger 1312546 2017-07-27T14:21:30Z 2017-07-27T14:21:30Z MEMBER

These are now being run and published to https://tomaugspurger.github.io/asv-collection/xarray/

I plan to find a more permanent home to publish the results rather than my personal GitHub Pages site, but that may take a while before I can get to it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/benchmark 236347050
315402471 https://github.com/pydata/xarray/pull/1457#issuecomment-315402471 https://api.github.com/repos/pydata/xarray/issues/1457 MDEyOklzc3VlQ29tbWVudDMxNTQwMjQ3MQ== TomAugspurger 1312546 2017-07-14T16:21:29Z 2017-07-14T16:21:29Z MEMBER

About hardware, we should be able to run these on the machine running the pandas benchmarks. Once it's merged I should be able to add it easily to https://github.com/TomAugspurger/asv-runner/blob/master/tests/full.yml and the benchmarks will be run and published (to https://tomaugspurger.github.io/asv-collection/ right now; not the permanent home)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/benchmark 236347050

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);