html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/4648#issuecomment-964099251,https://api.github.com/repos/pydata/xarray/issues/4648,964099251,IC_kwDOAMm_X845dvyz,1312546,2021-11-09T12:17:32Z,2021-11-09T12:17:32Z,MEMBER,"""In charge of"" is overstating it a bit. It's been segfaulting when building pandas and I haven't had a chance to debug it.
If / when I get around to fixing it I'll try adding xarray, but it might be a bit.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,756425955
https://github.com/pydata/xarray/pull/5906#issuecomment-953858365,https://api.github.com/repos/pydata/xarray/issues/5906,953858365,IC_kwDOAMm_X8442rk9,1312546,2021-10-28T13:43:04Z,2021-10-28T13:43:04Z,MEMBER,"There are two changes here
1. Only check the `.data` of non-index variables, done at https://github.com/pydata/xarray/pull/5906/files#diff-763e3002fd954d544b05858d8d138b828b66b6a2a0ae3cd58d2040a652f14638R4161-R4163
2. The check for whether or not a full index was needed was done in a `for dim in dims` loop, but the condition doesn't actually depend on `dim`, so I lifted it out of the loop (this doesn't matter much, since the relevant results are cached).
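A generic sketch of the two changes (the names here are illustrative, not the actual xarray code):
```python
def needs_full_reindex(ds):
    # 1. Only look at the .data of non-index variables; index variables
    #    already hold their data in memory.
    return any(
        is_duck_dask_array(v.data)
        for name, v in ds.variables.items()
        if name not in ds.indexes
    )

# 2. The condition does not depend on `dim`, so hoist it out of the loop.
full_reindex = needs_full_reindex(ds)
for dim in dims:
    if full_reindex:
        ...
```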
cc @dcherian","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1038531231
https://github.com/pydata/xarray/issues/5902#issuecomment-953379569,https://api.github.com/repos/pydata/xarray/issues/5902,953379569,IC_kwDOAMm_X84402rx,1312546,2021-10-27T23:19:49Z,2021-10-27T23:19:49Z,MEMBER,"Thanks @dcherian, that seems to fix this performance problem. I'll see if the tests pass and will submit a PR.
I came across #5582 while searching, thanks :)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1037894157
https://github.com/pydata/xarray/issues/5902#issuecomment-953344052,https://api.github.com/repos/pydata/xarray/issues/5902,953344052,IC_kwDOAMm_X8440uA0,1312546,2021-10-27T22:02:58Z,2021-10-27T22:03:35Z,MEMBER,"Oh, hmm... I'm noticing now that `IndexVariable` (currently) eagerly loads its data into memory, so that check will *always* be false for the problematic `IndexVariable`.
So perhaps a slight adjustment to `is_duck_dask_array` to handle `xarray.Variable`?
```diff
diff --git a/xarray/core/dataset.py b/xarray/core/dataset.py
index 550c3587..16637574 100644
--- a/xarray/core/dataset.py
+++ b/xarray/core/dataset.py
@@ -4159,14 +4159,14 @@ class Dataset(DataWithCoords, DatasetArithmetic, Mapping):
# Dask arrays don't support assignment by index, which the fast unstack
# function requires.
# https://github.com/pydata/xarray/pull/4746#issuecomment-753282125
- any(is_duck_dask_array(v.data) for v in self.variables.values())
+ any(is_duck_dask_array(v) for v in self.variables.values())
# Sparse doesn't currently support (though we could special-case
# it)
# https://github.com/pydata/sparse/issues/422
- or any(
- isinstance(v.data, sparse_array_type)
- for v in self.variables.values()
- )
+ # or any(
+ # isinstance(v.data, sparse_array_type)
+ # for v in self.variables.values()
+ # )
or sparse
# Until https://github.com/pydata/xarray/pull/4751 is resolved,
# we check explicitly whether it's a numpy array. Once that is
@@ -4177,9 +4177,9 @@ class Dataset(DataWithCoords, DatasetArithmetic, Mapping):
# # or any(
# # isinstance(v.data, pint_array_type) for v in self.variables.values()
# # )
- or any(
- not isinstance(v.data, np.ndarray) for v in self.variables.values()
- )
+ # or any(
+ # not isinstance(v.data, np.ndarray) for v in self.variables.values()
+ # )
):
result = result._unstack_full_reindex(dim, fill_value, sparse)
else:
diff --git a/xarray/core/pycompat.py b/xarray/core/pycompat.py
index d1649235..e9669105 100644
--- a/xarray/core/pycompat.py
+++ b/xarray/core/pycompat.py
@@ -44,6 +44,12 @@ class DuckArrayModule:
def is_duck_dask_array(x):
+ from xarray.core.variable import IndexVariable, Variable
+ if isinstance(x, IndexVariable):
+ return False
+ elif isinstance(x, Variable):
+ x = x.data
+
if DuckArrayModule(""dask"").available:
from dask.base import is_dask_collection
```
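A sketch of the behavior that patch is aiming for (assuming dask is installed; this is not a test from an actual PR):
```python
import numpy as np
import xarray as xr
from xarray.core.pycompat import is_duck_dask_array

v = xr.Variable(('x',), np.arange(4)).chunk({'x': 2})
assert is_duck_dask_array(v)  # a chunked Variable: inspect its .data
assert not is_duck_dask_array(xr.IndexVariable(('x',), np.arange(4)))  # always in memory
```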
That's completely ignoring the accesses to `v.data` for the sparse and pint checks, which don't look quite as easy to solve.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1037894157
https://github.com/pydata/xarray/issues/5764#issuecomment-932811398,https://api.github.com/repos/pydata/xarray/issues/5764,932811398,IC_kwDOAMm_X843mZKG,1312546,2021-10-02T19:48:05Z,2021-10-02T19:48:05Z,MEMBER,"Mmm, for better or worse, Dask relies on `sizeof` to estimate the memory usage of objects at runtime. We could move that over to some new duck-typed interface, like using `.nbytes` when it's present, but not all objects will want to expose an `nbytes` attribute in their API.
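For example, objects can already teach Dask their size through the `dask.sizeof` dispatch; a minimal sketch (`MyBuffer` is a hypothetical type):
```python
from dask.sizeof import sizeof

class MyBuffer:
    # A hypothetical container wrapping a bytes payload.
    def __init__(self, data: bytes):
        self.data = data

@sizeof.register(MyBuffer)
def sizeof_mybuffer(obj):
    # Tell Dask how many bytes this object occupies at runtime.
    return len(obj.data)
```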
IMO the best path is for objects to implement `__sizeof__` unless there's some downside I'm missing.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,988158051
https://github.com/pydata/xarray/issues/5426#issuecomment-852667695,https://api.github.com/repos/pydata/xarray/issues/5426,852667695,MDEyOklzc3VlQ29tbWVudDg1MjY2NzY5NQ==,1312546,2021-06-02T02:37:18Z,2021-06-02T02:37:18Z,MEMBER,"> Do you run into poor load balancing as well when using Zarr with Xarray?
The only thing that comes to mind is everything being assigned to one worker when the task graph has a single node at its base. But then work stealing kicks in and things level out (that was a while ago, though).
I haven't noticed any kind of systemic load balancing problem, but I can take a look at that notebook later. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,908971901
https://github.com/pydata/xarray/issues/5426#issuecomment-852666211,https://api.github.com/repos/pydata/xarray/issues/5426,852666211,MDEyOklzc3VlQ29tbWVudDg1MjY2NjIxMQ==,1312546,2021-06-02T02:33:28Z,2021-06-02T02:33:28Z,MEMBER,https://github.com/dask/dask/pull/6203 and https://github.com/dask/dask/pull/6773/ are possibly the relevant PRs. I actually don't know if they could have an effect here. I don't know (and a brief search couldn't confirm) whether or not xarray uses `dask.array.from_zarr`. ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,908971901
https://github.com/pydata/xarray/issues/1094#issuecomment-767797103,https://api.github.com/repos/pydata/xarray/issues/1094,767797103,MDEyOklzc3VlQ29tbWVudDc2Nzc5NzEwMw==,1312546,2021-01-26T20:09:11Z,2021-01-26T20:09:11Z,MEMBER,Should this and https://github.com/pydata/xarray/issues/1650 be consolidated into a single issue? I think that they're duplicates of each other.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,187873247
https://github.com/pydata/xarray/issues/4738#issuecomment-752156934,https://api.github.com/repos/pydata/xarray/issues/4738,752156934,MDEyOklzc3VlQ29tbWVudDc1MjE1NjkzNA==,1312546,2020-12-29T16:53:16Z,2020-12-29T16:53:16Z,MEMBER,"IIUC, something like https://github.com/dask/dask/blob/4a7a2438219c4ee493434042e50f4cdb67b6ec9f/dask/base.py#L778 is what you're looking for. Further down we register tokenizers for various types like pandas' DataFrames and ndarrays.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,775502974
https://github.com/pydata/xarray/issues/4717#issuecomment-749205535,https://api.github.com/repos/pydata/xarray/issues/4717,749205535,MDEyOklzc3VlQ29tbWVudDc0OTIwNTUzNQ==,1312546,2020-12-21T21:29:56Z,2020-12-21T21:29:56Z,MEMBER,I'm not sure offhand. Maybe best to post an issue on the pandas tracker.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,771484861
https://github.com/pydata/xarray/issues/4428#issuecomment-712066302,https://api.github.com/repos/pydata/xarray/issues/4428,712066302,MDEyOklzc3VlQ29tbWVudDcxMjA2NjMwMg==,1312546,2020-10-19T11:08:13Z,2020-10-19T11:43:46Z,MEMBER,"Sorry, my comment in https://github.com/pydata/xarray/issues/4428#issuecomment-711034128 was incorrect in a couple of ways:
1. We still do the splitting, even when slicing with an out-of-order indexer. I'm checking whether that's appropriate.
2. I'm looking into a logic bug in computing the number of chunks. I don't think we properly handle non-uniform chunking on the other axes.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,702646191
https://github.com/pydata/xarray/issues/4428#issuecomment-711034128,https://api.github.com/repos/pydata/xarray/issues/4428,711034128,MDEyOklzc3VlQ29tbWVudDcxMTAzNDEyOA==,1312546,2020-10-17T15:54:48Z,2020-10-17T15:54:48Z,MEMBER,"I assume that the indices `[np.argsort(da.x.data)]` are not going to be monotonically increasing. That induces a different slicing pattern. The docs at https://docs.dask.org/en/latest/array-slicing.html#efficiency describe the case where the indices are sorted, but don't discuss the non-sorted case (yet).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,702646191
https://github.com/pydata/xarray/issues/4428#issuecomment-709539887,https://api.github.com/repos/pydata/xarray/issues/4428,709539887,MDEyOklzc3VlQ29tbWVudDcwOTUzOTg4Nw==,1312546,2020-10-15T19:20:53Z,2020-10-15T19:20:53Z,MEMBER,"Closing the loop here, with https://github.com/dask/dask/pull/6665 the behavior of Dask=2.25.0 should be restored (possibly with a warning about creating large chunks).
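For reference, the user-side opt-out would look something like this (a sketch; the `array.slicing.split_large_chunks` config key assumes a dask release that includes the PR above):
```python
import dask
import dask.array as da

x = da.ones((100, 100), chunks=(10, 100))
indexer = [5, 1, 3] * 30  # an out-of-order indexer that can trigger splitting
with dask.config.set({'array.slicing.split_large_chunks': False}):
    y = x[indexer]  # keep the resulting chunks as-is, without splitting
```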
So this can probably be closed, though there *may* be parts of xarray that should be updated to avoid creating large chunks, or we could rely on the user to do that through the dask config system, as sketched above.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,702646191
https://github.com/pydata/xarray/pull/4432#issuecomment-694817581,https://api.github.com/repos/pydata/xarray/issues/4432,694817581,MDEyOklzc3VlQ29tbWVudDY5NDgxNzU4MQ==,1312546,2020-09-18T11:36:49Z,2020-09-18T11:36:49Z,MEMBER,"I'm not sure, but I don't think so. It's strange that it didn't fail on the pull request.
On Thu, Sep 17, 2020 at 8:51 PM Maximilian Roos wrote:
> Might be best to proceed with #4434 for now. I'll need to give this a bit of thought.
>
> OK, as you wish, I'll merge if that passes.
>
> But your change did pass before the merge. Could it be a conflict (in
> functionality, not git) with recent changes on master?
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,703881154
https://github.com/pydata/xarray/pull/4432#issuecomment-694594817,https://api.github.com/repos/pydata/xarray/issues/4432,694594817,MDEyOklzc3VlQ29tbWVudDY5NDU5NDgxNw==,1312546,2020-09-18T01:27:30Z,2020-09-18T01:27:30Z,MEMBER,Might be best to proceed with https://github.com/pydata/xarray/pull/4434 for now. I'll need to give this a bit of thought.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,703881154
https://github.com/pydata/xarray/pull/4432#issuecomment-694593225,https://api.github.com/repos/pydata/xarray/issues/4432,694593225,MDEyOklzc3VlQ29tbWVudDY5NDU5MzIyNQ==,1312546,2020-09-18T01:22:43Z,2020-09-18T01:22:43Z,MEMBER,"Huh, I'm able to reproduce locally. Looking into it now.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,703881154
https://github.com/pydata/xarray/issues/4406#issuecomment-691083939,https://api.github.com/repos/pydata/xarray/issues/4406,691083939,MDEyOklzc3VlQ29tbWVudDY5MTA4MzkzOQ==,1312546,2020-09-11T13:07:00Z,2020-09-11T13:07:00Z,MEMBER,"> @TomAugspurger do you know off-hand if there have been any recent changes in Dask's scheduler that could have caused this?
This is just using Dask's threaded scheduler, right? I don't recall any changes there recently.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,694112301
https://github.com/pydata/xarray/issues/3698#issuecomment-690378323,https://api.github.com/repos/pydata/xarray/issues/3698,690378323,MDEyOklzc3VlQ29tbWVudDY5MDM3ODMyMw==,1312546,2020-09-10T15:42:54Z,2020-09-10T15:42:54Z,MEMBER,"Thanks for confirming. I'll take another look at this today then.
On Thu, Sep 10, 2020 at 10:30 AM Deepak Cherian wrote:
> Reopened #3698.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,550355524
https://github.com/pydata/xarray/issues/3698#issuecomment-689808725,https://api.github.com/repos/pydata/xarray/issues/3698,689808725,MDEyOklzc3VlQ29tbWVudDY4OTgwODcyNQ==,1312546,2020-09-09T20:38:39Z,2020-09-09T20:38:39Z,MEMBER,"FYI, @dcherian your recent PR to dask fixed this example. Playing around with chunk sizes, it seems to have fixed it even when the chunk size exceeds `dask.config['array']['chunk-size']`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,550355524
https://github.com/pydata/xarray/issues/3147#issuecomment-668256401,https://api.github.com/repos/pydata/xarray/issues/3147,668256401,MDEyOklzc3VlQ29tbWVudDY2ODI1NjQwMQ==,1312546,2020-08-03T21:42:42Z,2020-08-03T21:42:42Z,MEMBER,"Thanks for that link. I hope that map_overlap could use pad internally for the external boundaries.
On Mon, Aug 3, 2020 at 3:22 PM Deepak Cherian wrote:
> This issue about coordinate labels for boundaries exists with pad too: #3868
>
> Can map_overlap just use DataArray.pad and we can fix things there?
>
> Or perhaps we can expect users to add a call to pad before map_overlap?
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,470024896
https://github.com/pydata/xarray/pull/4305#issuecomment-668242904,https://api.github.com/repos/pydata/xarray/issues/4305,668242904,MDEyOklzc3VlQ29tbWVudDY2ODI0MjkwNA==,1312546,2020-08-03T21:08:38Z,2020-08-03T21:08:38Z,MEMBER,"The doc failure looks unrelated:
```
>>>-------------------------------------------------------------------------
Exception in /home/docs/checkouts/readthedocs.org/user_builds/xray/checkouts/4305/doc/plotting.rst at block ending on line None
Specify :okexcept: as an option in the ipython:: block to suppress this message
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-...> in <module>
----> 1 g_simple = t.plot(x=""lon"", y=""lat"", col=""time"", col_wrap=3)
~/checkouts/readthedocs.org/user_builds/xray/checkouts/4305/xarray/plot/plot.py in __call__(self, **kwargs)
444
445 def __call__(self, **kwargs):
--> 446 return plot(self._da, **kwargs)
447
448 # we can't use functools.wraps here since that also modifies the name / qualname
~/checkouts/readthedocs.org/user_builds/xray/checkouts/4305/xarray/plot/plot.py in plot(darray, row, col, col_wrap, ax, hue, rtol, subplot_kws, **kwargs)
198 kwargs[""ax""] = ax
199
--> 200 return plotfunc(darray, **kwargs)
201
202
~/checkouts/readthedocs.org/user_builds/xray/checkouts/4305/xarray/plot/plot.py in newplotfunc(darray, x, y, figsize, size, aspect, ax, row, col, col_wrap, xincrease, yincrease, add_colorbar, add_labels, vmin, vmax, cmap, center, robust, extend, levels, infer_intervals, colors, subplot_kws, cbar_ax, cbar_kwargs, xscale, yscale, xticks, yticks, xlim, ylim, norm, **kwargs)
636 # Need the decorated plotting function
637 allargs[""plotfunc""] = globals()[plotfunc.__name__]
--> 638 return _easy_facetgrid(darray, kind=""dataarray"", **allargs)
639
640 plt = import_matplotlib_pyplot()
~/checkouts/readthedocs.org/user_builds/xray/checkouts/4305/xarray/plot/facetgrid.py in _easy_facetgrid(data, plotfunc, kind, x, y, row, col, col_wrap, sharex, sharey, aspect, size, subplot_kws, ax, figsize, **kwargs)
642
643 if kind == ""dataarray"":
--> 644 return g.map_dataarray(plotfunc, x, y, **kwargs)
645
646 if kind == ""dataset"":
~/checkouts/readthedocs.org/user_builds/xray/checkouts/4305/xarray/plot/facetgrid.py in map_dataarray(self, func, x, y, **kwargs)
263 # Get x, y labels for the first subplot
264 x, y = _infer_xy_labels(
--> 265 darray=self.data.loc[self.name_dicts.flat[0]],
266 x=x,
267 y=y,
~/checkouts/readthedocs.org/user_builds/xray/checkouts/4305/xarray/core/dataarray.py in __getitem__(self, key)
196 labels = indexing.expanded_indexer(key, self.data_array.ndim)
197 key = dict(zip(self.data_array.dims, labels))
--> 198 return self.data_array.sel(**key)
199
200 def __setitem__(self, key, value) -> None:
~/checkouts/readthedocs.org/user_builds/xray/checkouts/4305/xarray/core/dataarray.py in sel(self, indexers, method, tolerance, drop, **indexers_kwargs)
1147
1148 """"""
-> 1149 ds = self._to_temp_dataset().sel(
1150 indexers=indexers,
1151 drop=drop,
~/checkouts/readthedocs.org/user_builds/xray/checkouts/4305/xarray/core/dataset.py in sel(self, indexers, method, tolerance, drop, **indexers_kwargs)
2099 """"""
2100 indexers = either_dict_or_kwargs(indexers, indexers_kwargs, ""sel"")
-> 2101 pos_indexers, new_indexes = remap_label_indexers(
2102 self, indexers=indexers, method=method, tolerance=tolerance
2103 )
~/checkouts/readthedocs.org/user_builds/xray/checkouts/4305/xarray/core/coordinates.py in remap_label_indexers(obj, indexers, method, tolerance, **indexers_kwargs)
394 }
395
--> 396 pos_indexers, new_indexes = indexing.remap_label_indexers(
397 obj, v_indexers, method=method, tolerance=tolerance
398 )
~/checkouts/readthedocs.org/user_builds/xray/checkouts/4305/xarray/core/indexing.py in remap_label_indexers(data_obj, indexers, method, tolerance)
268 coords_dtype = data_obj.coords[dim].dtype
269 label = maybe_cast_to_coords_dtype(label, coords_dtype)
--> 270 idxr, new_idx = convert_label_indexer(index, label, dim, method, tolerance)
271 pos_indexers[dim] = idxr
272 if new_idx is not None:
~/checkouts/readthedocs.org/user_builds/xray/checkouts/4305/xarray/core/indexing.py in convert_label_indexer(index, label, index_name, method, tolerance)
187 indexer = index.get_loc(label.item())
188 else:
--> 189 indexer = index.get_loc(
190 label.item(), method=method, tolerance=tolerance
191 )
~/checkouts/readthedocs.org/user_builds/xray/conda/4305/lib/python3.8/site-packages/pandas/core/indexes/datetimes.py in get_loc(self, key, method, tolerance)
620 else:
621 # unrecognized type
--> 622 raise KeyError(key)
623
624 try:
KeyError: 1356998400000000000
<<<-------------------------------------------------------------------------
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,672281867
https://github.com/pydata/xarray/issues/3147#issuecomment-668209121,https://api.github.com/repos/pydata/xarray/issues/3147,668209121,MDEyOklzc3VlQ29tbWVudDY2ODIwOTEyMQ==,1312546,2020-08-03T19:47:47Z,2020-08-03T19:47:57Z,MEMBER,"I'm thinking through a `map_overlap` API right now. In dask, `map_overlap` requires a few extra arguments:
```
depth: int, tuple, dict or list
The number of elements that each block should share with its neighbors
If a tuple or dict then this can be different per axis.
If a list then each element of that list must be an int, tuple or dict
defining depth for the corresponding array in `args`.
Asymmetric depths may be specified using a dict value of (-/+) tuples.
Note that asymmetric depths are currently only supported when
``boundary`` is 'none'.
The default value is 0.
boundary: str, tuple, dict or list
How to handle the boundaries.
Values include 'reflect', 'periodic', 'nearest', 'none',
or any constant value like 0 or np.nan.
If a list then each element must be a str, tuple or dict defining the
boundary for the corresponding array in `args`.
The default value is 'reflect'.
```
In `dask.array` those must be dicts whose keys are axis numbers. For xarray we would want to allow dimension names there.
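A hypothetical sketch of that translation (this wrapper API is imagined, not implemented):
```python
def map_overlap(func, obj, depth, boundary, **kwargs):
    # Translate dimension-name keys into the axis-number keys that
    # dask.array's map_overlap expects.
    depth = {obj.get_axis_num(dim): d for dim, d in depth.items()}
    boundary = {obj.get_axis_num(dim): b for dim, b in boundary.items()}
    data = obj.data.map_overlap(func, depth=depth, boundary=boundary, **kwargs)
    return obj.copy(data=data)
```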
I'm not sure how to handle the DataArray labels for the boundary chunks (dask docs at https://docs.dask.org/en/latest/array-overlap.html#boundaries). For `reflect` / `periodic` I think things are OK, we perhaps just use the label associated with that value. I'm not sure what to do for constants.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,470024896
https://github.com/pydata/xarray/pull/4256#issuecomment-663584770,https://api.github.com/repos/pydata/xarray/issues/4256,663584770,MDEyOklzc3VlQ29tbWVudDY2MzU4NDc3MA==,1312546,2020-07-24T15:06:03Z,2020-07-24T15:06:03Z,MEMBER,Yep. I believe that @ogrisel can add you to the organization on anaconda.org so that you can create a key to upload packages.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,664363493
https://github.com/pydata/xarray/pull/4254#issuecomment-663082208,https://api.github.com/repos/pydata/xarray/issues/4254,663082208,MDEyOklzc3VlQ29tbWVudDY2MzA4MjIwOA==,1312546,2020-07-23T15:45:57Z,2020-07-23T15:45:57Z,MEMBER,"FYI https://github.com/pandas-dev/pandas/pull/35393 is the PR to follow. It'll be included in pandas 1.1.0, which should be out in a week or so.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,663977922
https://github.com/pydata/xarray/issues/4133#issuecomment-641332231,https://api.github.com/repos/pydata/xarray/issues/4133,641332231,MDEyOklzc3VlQ29tbWVudDY0MTMzMjIzMQ==,1312546,2020-06-09T14:24:59Z,2020-06-09T14:31:26Z,MEMBER,"Ah, the (numpy) build failure is because pandas doesn't have a py38 entry in our pyproject.toml. Fixing that now.
edit: https://github.com/pandas-dev/pandas/pull/34667. But you'll still want to update your CI at https://github.com/pydata/xarray/blob/2a288f6ed4286910fcf3ab9895e1e9cbd44d30b4/ci/azure/install.yml#L16 and https://github.com/pydata/xarray/blob/2a288f6ed4286910fcf3ab9895e1e9cbd44d30b4/ci/azure/install.yml#L23 to pull from the new locations.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,634979933
https://github.com/pydata/xarray/issues/4133#issuecomment-641330288,https://api.github.com/repos/pydata/xarray/issues/4133,641330288,MDEyOklzc3VlQ29tbWVudDY0MTMzMDI4OA==,1312546,2020-06-09T14:22:02Z,2020-06-09T14:22:02Z,MEMBER,"@keewis not sure about the build issue, but we (along with many other projects) recently moved our wheels to upload to https://anaconda.org/scipy-wheels-nightly/. https://anaconda.org/scipy-wheels-nightly/pandas/ does have py38 wheels.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,634979933
https://github.com/pydata/xarray/issues/4112#issuecomment-636808986,https://api.github.com/repos/pydata/xarray/issues/4112,636808986,MDEyOklzc3VlQ29tbWVudDYzNjgwODk4Ng==,1312546,2020-06-01T11:44:23Z,2020-06-01T11:44:23Z,MEMBER,Rechunking the `indexer` array is how I would be explicit about the desired chunk size. Opened https://github.com/dask/dask/issues/6270 to discuss this on the dask side.,"{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,627600168
https://github.com/pydata/xarray/pull/3816#issuecomment-622128514,https://api.github.com/repos/pydata/xarray/issues/3816,622128514,MDEyOklzc3VlQ29tbWVudDYyMjEyODUxNA==,1312546,2020-04-30T21:38:21Z,2020-04-30T21:38:21Z,MEMBER,"Makes sense. `template` seems fine.
On Thu, Apr 30, 2020 at 3:35 PM Deepak Cherian wrote:
> Thanks for the review @TomAugspurger
>
> Question on the name template. I think in dask.dataframe and dask.array
> we might call this meta. Is that keyword already used elsewhere in
> xarray? template is also a fine name though.
>
> I added the meta kwarg to apply_ufunc so that users could pass that down
> to dask i.e. that meta = dask's meta = np.ndarray or something like that.
> So I'd like to avoid reusing meta here where it would exclusively be an
> xarray object ≠ dask's meta
>
> BUT it seems to me like there's a better name than template. Any ideas?
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,573768194
https://github.com/pydata/xarray/issues/3698#issuecomment-592101136,https://api.github.com/repos/pydata/xarray/issues/3698,592101136,MDEyOklzc3VlQ29tbWVudDU5MjEwMTEzNg==,1312546,2020-02-27T18:13:28Z,2020-02-27T18:13:28Z,MEMBER,"It looks like xarray is getting a bad task graph after the optimize.
```python
In [1]: import xarray as xr
In [2]: import dask, dask.array
In [3]: a = dask.array.ones((10,5), chunks=(1,3))
...: a = dask.optimize(a)[0]
In [4]: da = xr.DataArray(a.compute()).chunk({""dim_0"": 5})
...: da = dask.optimize(da)[0]
In [5]: dict(da.__dask_graph__())
Out[5]:
{('xarray--e2865aa10d476e027154771611541f99', 1, 0): (<function getter at 0x...>,
  'xarray--e2865aa10d476e027154771611541f99',
  (slice(5, 10, None), slice(0, 5, None))),
 ('xarray--e2865aa10d476e027154771611541f99', 0, 0): (<function getter at 0x...>,
  'xarray--e2865aa10d476e027154771611541f99',
  (slice(0, 5, None), slice(0, 5, None)))}
```
Notice that there are references to `xarray--e2865aa10d476e027154771611541f99` (just the string, not a tuple representing a chunk) but that key isn't in the graph.
If we manually insert that, you'll see things work
```python
In [9]: dsk['xarray--e2865aa10d476e027154771611541f99'] = da._to_temp_dataset()[xr.core.dataarray._THIS_ARRAY]
In [11]: dask.get(dsk, keys=[('xarray--e2865aa10d476e027154771611541f99', 1, 0)])
Out[11]:
(<xarray.DataArray <this-array> (dim_0: 5, dim_1: 5)>
dask.array<...>
Dimensions without coordinates: dim_0, dim_1,)
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,550355524
https://github.com/pydata/xarray/issues/3751#issuecomment-582972083,https://api.github.com/repos/pydata/xarray/issues/3751,582972083,MDEyOklzc3VlQ29tbWVudDU4Mjk3MjA4Mw==,1312546,2020-02-06T15:55:30Z,2020-02-06T15:55:30Z,MEMBER,"FWIW, I think @jbrockmendel is still progressing on an ""extension index"" interface where you could have a custom dtype / Index subclass that would be properly supported. Long-term, that's the best solution.
Short-term, I'm less sure what's best.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,559873728
https://github.com/pydata/xarray/pull/3640#issuecomment-580462361,https://api.github.com/repos/pydata/xarray/issues/3640,580462361,MDEyOklzc3VlQ29tbWVudDU4MDQ2MjM2MQ==,1312546,2020-01-30T21:13:09Z,2020-01-30T21:13:09Z,MEMBER,"> Is my interpretation correct?
Yep, that's the basic idea. Every call to `DataFrame.plot.<kind>` begins with a [check for the active backend](https://github.com/pandas-dev/pandas/blob/v1.0.0/pandas/plotting/_core.py#L767). Based on the configured value, we look up the correct backend, make the call, and return the result.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,539394615
https://github.com/pydata/xarray/issues/3673#issuecomment-579517151,https://api.github.com/repos/pydata/xarray/issues/3673,579517151,MDEyOklzc3VlQ29tbWVudDU3OTUxNzE1MQ==,1312546,2020-01-28T23:12:47Z,2020-01-28T23:12:47Z,MEMBER,"FYI, we had some failures in our nightly wheel builds so they weren't updated in a while. https://github.com/MacPython/pandas-wheels/pull/70 fixed that, so you'll hopefully get a new wheel tonight.
On Tue, Jan 28, 2020 at 5:09 PM Deepak Cherian wrote:
> should be closed by pandas-dev/pandas#31136. I think the tests will turn green once the wheels update
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,547012915
https://github.com/pydata/xarray/issues/3673#issuecomment-575688251,https://api.github.com/repos/pydata/xarray/issues/3673,575688251,MDEyOklzc3VlQ29tbWVudDU3NTY4ODI1MQ==,1312546,2020-01-17T16:06:23Z,2020-01-17T16:06:23Z,MEMBER,Opened https://github.com/pandas-dev/pandas/issues/31109.,"{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,547012915
https://github.com/pydata/xarray/issues/3673#issuecomment-574256856,https://api.github.com/repos/pydata/xarray/issues/3673,574256856,MDEyOklzc3VlQ29tbWVudDU3NDI1Njg1Ng==,1312546,2020-01-14T16:25:50Z,2020-01-14T16:25:50Z,MEMBER,"@jbrockmendel likely knows more about the index arithmetic issue.
```python
In [22]: import xarray as xr
In [23]: import pandas as pd
In [24]: idx = pd.timedelta_range(""1D"", periods=5, freq=""D"")
In [25]: a = xr.cftime_range(""2000"", periods=5)
In [26]: idx + a
/Users/taugspurger/sandbox/pandas/pandas/core/arrays/datetimelike.py:1204: PerformanceWarning: Adding/subtracting array of DateOffsets to TimedeltaArray not vectorized
PerformanceWarning,
Out[26]:
Index([2000-01-02 00:00:00, 2000-01-04 00:00:00, 2000-01-06 00:00:00,
2000-01-08 00:00:00, 2000-01-10 00:00:00],
dtype='object')
In [27]: a + idx
Out[27]:
CFTimeIndex([2000-01-02 00:00:00, 2000-01-04 00:00:00, 2000-01-06 00:00:00,
2000-01-08 00:00:00, 2000-01-10 00:00:00],
dtype='object')
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,547012915
https://github.com/pydata/xarray/issues/2666#issuecomment-569820784,https://api.github.com/repos/pydata/xarray/issues/2666,569820784,MDEyOklzc3VlQ29tbWVudDU2OTgyMDc4NA==,1312546,2019-12-30T22:58:23Z,2019-12-30T22:58:23Z,MEMBER,"> I think this is basically the same change.
Ah, I was mistaken. I was thinking we needed to plumb a `dtype` argument all the way through there, but I don't think that's necessary. I may be able to submit a PR with a `dtypes` argument for `from_dataframe` tomorrow.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,398107776
https://github.com/pydata/xarray/issues/2666#issuecomment-569810375,https://api.github.com/repos/pydata/xarray/issues/2666,569810375,MDEyOklzc3VlQ29tbWVudDU2OTgxMDM3NQ==,1312546,2019-12-30T22:07:30Z,2019-12-30T22:07:30Z,MEMBER,"And there are a couple places that need updating, even with a `dtypes` argument to let the user specify things. We also hit this via `Dataset.__setitem__`
```pytb
~/sandbox/xarray/xarray/core/dataset.py in __setitem__(self, key, value)
1268 )
1269
-> 1270 self.update({key: value})
1271
1272 def __delitem__(self, key: Hashable) -> None:
~/sandbox/xarray/xarray/core/dataset.py in update(self, other, inplace)
3521 """"""
3522 _check_inplace(inplace)
-> 3523 merge_result = dataset_update_method(self, other)
3524 return self._replace(inplace=True, **merge_result._asdict())
3525
~/sandbox/xarray/xarray/core/merge.py in dataset_update_method(dataset, other)
862 other[key] = value.drop_vars(coord_names)
863
--> 864 return merge_core([dataset, other], priority_arg=1, indexes=dataset.indexes)
~/sandbox/xarray/xarray/core/merge.py in merge_core(objects, compat, join, priority_arg, explicit_coords, indexes, fill_value)
550 coerced, join=join, copy=False, indexes=indexes, fill_value=fill_value
551 )
--> 552 collected = collect_variables_and_indexes(aligned)
553
554 prioritized = _get_priority_vars_and_indexes(aligned, priority_arg, compat=compat)
~/sandbox/xarray/xarray/core/merge.py in collect_variables_and_indexes(list_of_mappings)
275 append_all(coords, indexes)
276
--> 277 variable = as_variable(variable, name=name)
278 if variable.dims == (name,):
279 variable = variable.to_index_variable()
~/sandbox/xarray/xarray/core/variable.py in as_variable(obj, name)
105 elif isinstance(obj, tuple):
106 try:
--> 107 obj = Variable(*obj)
108 except (TypeError, ValueError) as error:
109 # use .format() instead of % because it handles tuples consistently
~/sandbox/xarray/xarray/core/variable.py in __init__(self, dims, data, attrs, encoding, fastpath)
306 unrecognized encoding items.
307 """"""
--> 308 self._data = as_compatible_data(data, fastpath=fastpath)
309 self._dims = self._parse_dimensions(dims)
310 self._attrs = None
~/sandbox/xarray/xarray/core/variable.py in as_compatible_data(data, fastpath)
229 if isinstance(data, np.ndarray):
230 if data.dtype.kind == ""O"":
--> 231 data = _possibly_convert_objects(data)
232 elif data.dtype.kind == ""M"":
233 data = np.asarray(data, ""datetime64[ns]"")
~/sandbox/xarray/xarray/core/variable.py in _possibly_convert_objects(values)
165 datetime64 and timedelta64, according to the pandas convention.
166 """"""
--> 167 return np.asarray(pd.Series(values.ravel())).reshape(values.shape)
168
169
~/sandbox/numpy/numpy/core/_asarray.py in asarray(a, dtype, order)
83
84 """"""
---> 85 return array(a, dtype, copy=False, order=order)
86
87
~/sandbox/pandas/pandas/core/series.py in __array__(self, dtype)
730 ""To keep the old behavior, pass 'dtype=\""datetime64[ns]\""'.""
731 )
--> 732 warnings.warn(msg, FutureWarning, stacklevel=3)
733 dtype = ""M8[ns]""
734 return np.asarray(self.array, dtype)
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,398107776
https://github.com/pydata/xarray/issues/2666#issuecomment-569805431,https://api.github.com/repos/pydata/xarray/issues/2666,569805431,MDEyOklzc3VlQ29tbWVudDU2OTgwNTQzMQ==,1312546,2019-12-30T21:45:41Z,2019-12-30T21:48:39Z,MEMBER,"Just FYI, we're potentially enforcing this deprecation in https://github.com/pandas-dev/pandas/pull/30563 (which would be included in a pandas release in a week or two). Is that likely to cause problems for xarray users?
It's not clear to me what the desired behavior is (https://github.com/pydata/xarray/issues/3291 seems to want to preserve the tz, though it isn't clear they are willing to be forced into an object dtype array for it).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,398107776
https://github.com/pydata/xarray/pull/3598#issuecomment-562310739,https://api.github.com/repos/pydata/xarray/issues/3598,562310739,MDEyOklzc3VlQ29tbWVudDU2MjMxMDczOQ==,1312546,2019-12-05T20:47:02Z,2019-12-05T20:47:02Z,MEMBER,"Hopefully the new comments make sense. I'm struggling a bit to explain things since I don't fully understand them myself :)
> So it was a graph construction issue.
I *think* so. Dask doesn't actually validate arguments passed to HighLevelGraph, but I believe we assume that all the values in `dependencies` are themselves keys of `layers`. We didn't have that before, with things like
```
(Pdb) pp collections[0].dask.dependencies
{'all-84bc51ac43a9275b3662b0089710eab9': {'or_-64f95b81b2f8001b4c61f2023ac4c223'},
...
'eq-abac622d95ce5055d3e7b7dea944ec37': {'lambda-e79de3edfa267f41111057d26471bce3-x',
'ones-c4a83f4b990021618d55e0fa61a351d6'},
...
}
```
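In other words, roughly this invariant (a sketch; dask itself doesn't enforce it):
```python
def check_hlg_dependencies(hlg):
    # Every name a layer depends on should itself be a layer of the graph.
    for layer_name, deps in hlg.dependencies.items():
        missing = deps - hlg.layers.keys()
        assert not missing, f'{layer_name} depends on missing layers: {missing}'
```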
The `'lambda-e79de3edfa267f41111057d26471bce3-x'` wasn't a layer of the graph. It was previously nested under the single new layer we were creating, `gname` (`lambda-e79de3edfa267f41111057d26471bce3` in this case).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,533555794
https://github.com/pydata/xarray/pull/3584#issuecomment-561794415,https://api.github.com/repos/pydata/xarray/issues/3584,561794415,MDEyOklzc3VlQ29tbWVudDU2MTc5NDQxNQ==,1312546,2019-12-04T19:09:34Z,2019-12-04T19:09:34Z,MEMBER,"@mrocklin if you get a chance, can you confirm that the values in `HighLevelGraph.dependencies` should be a subset of the keys of `layers`?
So in the following, the `lambda-<...>-x` is problematic, because it's not a key in `layers`?
```python
(Pdb) pp list(self.layers)
['eq-e98e52fb2b8e27b4b5158d399330c72d',
'lambda-0f1d0bc5e7df462d7125839aed006e04',
'ones-c4a83f4b990021618d55e0fa61a351d6']
(Pdb) pp self.dependencies
{'eq-e98e52fb2b8e27b4b5158d399330c72d': {'lambda-0f1d0bc5e7df462d7125839aed006e04-x',
'ones-c4a83f4b990021618d55e0fa61a351d6'},
'lambda-0f1d0bc5e7df462d7125839aed006e04': {'ones-c4a83f4b990021618d55e0fa61a351d6'},
'ones-c4a83f4b990021618d55e0fa61a351d6': set()}
```
That's coming from the `name` of the DataArray / the dask array in `DataArray.data`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,530657789
https://github.com/pydata/xarray/pull/3584#issuecomment-561773837,https://api.github.com/repos/pydata/xarray/issues/3584,561773837,MDEyOklzc3VlQ29tbWVudDU2MTc3MzgzNw==,1312546,2019-12-04T18:17:56Z,2019-12-04T18:17:56Z,MEMBER,"So this is enough to fix this in Dask
```diff
diff --git a/dask/blockwise.py b/dask/blockwise.py
index 52a36c246..84e0ecc08 100644
--- a/dask/blockwise.py
+++ b/dask/blockwise.py
@@ -818,7 +818,7 @@ def fuse_roots(graph: HighLevelGraph, keys: list):
if (
isinstance(layer, Blockwise)
and len(deps) > 1
- and not any(dependencies[dep] for dep in deps) # no need to fuse if 0 or 1
+ and not any(dependencies.get(dep, {}) for dep in deps) # no need to fuse if 0 or 1
and all(len(dependents[dep]) == 1 for dep in deps)
):
new = toolz.merge(layer, *[layers[dep] for dep in deps])
```
I'm trying to understand why we're getting this KeyError though. I want to make sure that we have a valid HighLevelGraph before making that change.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,530657789
https://github.com/pydata/xarray/issues/2501#issuecomment-510217080,https://api.github.com/repos/pydata/xarray/issues/2501,510217080,MDEyOklzc3VlQ29tbWVudDUxMDIxNzA4MA==,1312546,2019-07-10T20:30:41Z,2019-07-10T20:30:41Z,MEMBER,"Yep, that’s my suspicion as well. I’m still plugging away at it; currently the pausing logic isn’t working quite right.
> On Jul 10, 2019, at 12:10, Ryan Abernathey wrote:
>
> I believe that the memory issue is basically the same as dask/distributed#2602.
>
> The graphs look like: read --> rechunk --> write.
>
> Reading and rechunking increase memory consumption. Writing relieves it. In Rich's case, the workers just load too much data before they write it. Eventually they run out of memory.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,372848074
https://github.com/pydata/xarray/issues/2501#issuecomment-510167911,https://api.github.com/repos/pydata/xarray/issues/2501,510167911,MDEyOklzc3VlQ29tbWVudDUxMDE2NzkxMQ==,1312546,2019-07-10T18:05:07Z,2019-07-10T18:05:07Z,MEMBER,"Great, thanks. I’ll look into the memory issue when writing. We may already have an issue for it.
> On Jul 10, 2019, at 10:59, Rich Signell wrote:
>
> @TomAugspurger , I sat down here at Scipy with @rabernat and he instantly realized that we needed to drop the feature_id coordinate to prevent open_mfdataset from trying to harmonize that coordinate from all the chunks.
>
> So if I use this code, the open_mfdataset command finishes:
>
> def drop_coords(ds):
> ds = ds.drop(['reference_time','feature_id'])
> return ds.reset_coords(drop=True)
> and I can then add back in the dropped coordinate values at the end:
>
> dsets = [xr.open_dataset(f) for f in files[:3]]
> ds.coords['feature_id'] = dsets[0].coords['feature_id']
> I'm now running into memory issues when I write the zarr data -- but I should raise that as a new issue, right?
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,372848074
https://github.com/pydata/xarray/issues/2501#issuecomment-509346055,https://api.github.com/repos/pydata/xarray/issues/2501,509346055,MDEyOklzc3VlQ29tbWVudDUwOTM0NjA1NQ==,1312546,2019-07-08T18:46:58Z,2019-07-08T18:46:58Z,MEMBER,"@rsignell-usgs very helpful, thanks. I'd noticed that there was a pause after the open_dataset tasks finish, indicating that either the scheduler or (more likely) the client was doing work rather than the cluster. Most likely @rabernat's guess
> In open_mfdataset, all of the dimensions and coordinates of the individual files have to be checked and verified to be compatible. That is often the source of slow performance with open_mfdataset.
is correct. Verifying all that now, and looking into whether / how that can be done on the workers.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,372848074
https://github.com/pydata/xarray/issues/2501#issuecomment-509307081,https://api.github.com/repos/pydata/xarray/issues/2501,509307081,MDEyOklzc3VlQ29tbWVudDUwOTMwNzA4MQ==,1312546,2019-07-08T16:57:15Z,2019-07-08T16:57:15Z,MEMBER,"I'm looking into it today. Can you clarify
> The memory use kept growing until the process died.
by ""process"" do you mean a dask worker process, or just the main python process executing the `ds = xr.open_mfdataset(...)` code?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,372848074
https://github.com/pydata/xarray/issues/2501#issuecomment-506497180,https://api.github.com/repos/pydata/xarray/issues/2501,506497180,MDEyOklzc3VlQ29tbWVudDUwNjQ5NzE4MA==,1312546,2019-06-27T20:24:26Z,2019-06-27T20:24:26Z,MEMBER,"> The datasets in our cloud datastore are designed explicitly to avoid this problem!
Good to know!
FYI, https://github.com/pydata/xarray/issues/2501#issuecomment-506478508 was user error (I can access it, but need to specify the us-east-1 region). Taking a look now.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,372848074
https://github.com/pydata/xarray/issues/2927#issuecomment-506486503,https://api.github.com/repos/pydata/xarray/issues/2927,506486503,MDEyOklzc3VlQ29tbWVudDUwNjQ4NjUwMw==,1312546,2019-06-27T19:51:58Z,2019-06-27T19:51:58Z,MEMBER,Spoke with @martindurant about this today. The mapping should probably strip the protocol from the `root` provided by the user. Tracking in https://github.com/intake/filesystem_spec/issues/56 (this issue can probably be closed).,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,438166604
https://github.com/pydata/xarray/issues/2501#issuecomment-506478508,https://api.github.com/repos/pydata/xarray/issues/2501,506478508,MDEyOklzc3VlQ29tbWVudDUwNjQ3ODUwOA==,1312546,2019-06-27T19:25:05Z,2019-06-27T19:25:05Z,MEMBER,"Thanks, will take a look this afternoon. Are there any datasets on https://pangeo-data.github.io/pangeo-datastore/ that would exhibit this poor behavior? I may not have access to the bucket (or I'm misusing `rclone`)
```
2019/06/27 14:23:50 NOTICE: Config file ""/Users/taugspurger/.config/rclone/rclone.conf"" not found - using defaults
2019/06/27 14:23:50 Failed to create file system for ""aws-east:nwm-archive/2009"": didn't find section in config file
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,372848074
https://github.com/pydata/xarray/issues/1661#issuecomment-339525582,https://api.github.com/repos/pydata/xarray/issues/1661,339525582,MDEyOklzc3VlQ29tbWVudDMzOTUyNTU4Mg==,1312546,2017-10-26T01:49:12Z,2017-10-26T01:49:12Z,MEMBER,"Yep, that was the change.
The fix is to explicitly register the converters before plotting:
```python
from pandas.tseries import converter
converter.register()
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,268487752
https://github.com/pydata/xarray/issues/1661#issuecomment-339510522,https://api.github.com/repos/pydata/xarray/issues/1661,339510522,MDEyOklzc3VlQ29tbWVudDMzOTUxMDUyMg==,1312546,2017-10-26T00:05:57Z,2017-10-26T00:05:57Z,MEMBER,Pandas used to register a matplotlib converter for datetimes on import. I’ll take a closer look in a bit. ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,268487752
https://github.com/pydata/xarray/pull/1457#issuecomment-318451800,https://api.github.com/repos/pydata/xarray/issues/1457,318451800,MDEyOklzc3VlQ29tbWVudDMxODQ1MTgwMA==,1312546,2017-07-27T18:45:36Z,2017-07-27T18:45:36Z,MEMBER,"Yep, thanks again for setting that up.
On Thu, Jul 27, 2017 at 11:39 AM, Wes McKinney wrote:
> cool, are these numbers coming off the pandabox?
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,236347050
https://github.com/pydata/xarray/pull/1457#issuecomment-318376827,https://api.github.com/repos/pydata/xarray/issues/1457,318376827,MDEyOklzc3VlQ29tbWVudDMxODM3NjgyNw==,1312546,2017-07-27T14:21:30Z,2017-07-27T14:21:30Z,MEMBER,"These are now being run and published to https://tomaugspurger.github.io/asv-collection/xarray/
I plan to find a more permanent home to publish the results rather than my personal GitHub Pages site, but it may be a while before I can get to it.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,236347050
https://github.com/pydata/xarray/pull/1457#issuecomment-315402471,https://api.github.com/repos/pydata/xarray/issues/1457,315402471,MDEyOklzc3VlQ29tbWVudDMxNTQwMjQ3MQ==,1312546,2017-07-14T16:21:29Z,2017-07-14T16:21:29Z,MEMBER,"About hardware, we should be able to run these on the machine running the pandas benchmarks. Once it's merged I should be able to add it easily to https://github.com/TomAugspurger/asv-runner/blob/master/tests/full.yml and the benchmarks will be run and published (to https://tomaugspurger.github.io/asv-collection/ right now; not the permanent home)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,236347050