issue_comments

44 rows where user = 7441788 sorted by updated_at descending

issue 20

  • Could a DataArrayRolling object compute an arbitrary function on rolling windows? 6
  • Feature/weighted 5
  • ENH: format_array_flat() always displays first and last items. 4
  • {DataArray,Dataset}.rank() should support an optional list of dimensions 4
  • DataArray.argsort should be deleted 3
  • np.minimum.accumulate(da) doesn't work 3
  • ENH: format_array_flat() always displays first and last items. 3
  • DataArray.clip() no longer supports the out argument 3
  • BUG: datetime.date slicing doesn't work with Pandas 1.0.0 2
  • MultiIndex serialization to NetCDF 1
  • Including last coordinate values when displaying coordinates 1
  • Coordinate type changing from string to object 1
  • keepdims=True for xarray reductions 1
  • ENH: apply_ufunc logging or callback 1
  • Documentation of DataArray does not warn that inferring dimension names is deprecated 1
  • {DataArray,Dataset} accessors with parameters 1
  • open_mfdataset(paths, combine='nested') with and without concat_dim=None 1
  • Fix indexing with datetime64[ns] with pandas=1.1 1
  • Indexing a datetime64[ns] coordinate with a scalar datetime.date produces a KeyError 1
  • Export ufuncs from DataArray API 1

user 1

  • seth-p · 44 ✖

author_association 1

  • CONTRIBUTOR 44
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
856340566 https://github.com/pydata/xarray/issues/5278#issuecomment-856340566 https://api.github.com/repos/pydata/xarray/issues/5278 MDEyOklzc3VlQ29tbWVudDg1NjM0MDU2Ng== seth-p 7441788 2021-06-08T00:02:48Z 2021-06-08T00:04:28Z CONTRIBUTOR

almost no attention is paid to minimizing memory consumption (whether through in-place operations, or more generally minimizing temporary memory usage).

I think we'd be open to fixing this when it doesn't compromise readability. Can you open a new issue with some particularly bad examples?

I wouldn't necessarily say that it's particularly bad, but see the discussion following https://github.com/pydata/xarray/pull/2922#issuecomment-601496897.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  DataArray.clip() no longer supports the out argument 879033384
835597488 https://github.com/pydata/xarray/issues/5278#issuecomment-835597488 https://api.github.com/repos/pydata/xarray/issues/5278 MDEyOklzc3VlQ29tbWVudDgzNTU5NzQ4OA== seth-p 7441788 2021-05-09T00:40:08Z 2021-05-09T00:40:08Z CONTRIBUTOR

I'm not familiar at all with the various numpy interfaces, so I can't offer any input implementation-wise. But as a user, being able to do operations in place (via out or otherwise) is extremely useful when dealing with large arrays under memory constraints. In fact my one "philosophical" beef with xarray is that it seems almost no attention is paid to minimizing memory consumption (whether through in-place operations, or more generally minimizing temporary memory usage).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  DataArray.clip() no longer supports the out argument 879033384
834629871 https://github.com/pydata/xarray/issues/5278#issuecomment-834629871 https://api.github.com/repos/pydata/xarray/issues/5278 MDEyOklzc3VlQ29tbWVudDgzNDYyOTg3MQ== seth-p 7441788 2021-05-07T17:11:01Z 2021-05-07T17:11:14Z CONTRIBUTOR

What is the case for having out kwargs?

It lets you reuse memory you already have. In particular for a simple operation like clip, you can do it in-place: da.clip(..., out=da.values). Very useful if you deal with lots of data and memory is a concern.
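As a rough illustration of the memory argument, here is a minimal NumPy-only sketch. It uses plain `np.clip` rather than the xarray method, and the array contents are made up. It suggests that passing the input array as `out` writes the result into the existing buffer rather than allocating a new one.

```python
import numpy as np

# Hypothetical data standing in for a large array.
values = np.array([-2.0, -0.5, 0.3, 1.7, 4.0])
address_before = values.__array_interface__["data"][0]  # address of the data buffer

# Clip in place: the result is written into `values` itself.
result = np.clip(values, 0.0, 1.0, out=values)

address_after = values.__array_interface__["data"][0]
```

Here `result` is the same object as `values`, so no second array of the same size is kept alive.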

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  DataArray.clip() no longer supports the out argument 879033384
834415185 https://github.com/pydata/xarray/issues/5261#issuecomment-834415185 https://api.github.com/repos/pydata/xarray/issues/5261 MDEyOklzc3VlQ29tbWVudDgzNDQxNTE4NQ== seth-p 7441788 2021-05-07T13:51:46Z 2021-05-07T13:53:08Z CONTRIBUTOR

I'm wondering if one could just have a generic implementation of DataArray.func(*args, **kwargs) for any unrecognized "func" that calls np.func(self, *args, **kwargs) (perhaps conditional on some global numpy_fallthrough option)? And similarly for Dataset.func(*args, **kwargs).
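One way to picture the fallthrough idea is the sketch below. `Wrapped` is a hypothetical stand-in class, not xarray's `DataArray`, and `numpy_fallthrough` is modelled here as a per-object flag rather than a global option. Unlike the proposal, it passes the underlying values (not the object itself) to the NumPy function.

```python
import numpy as np

class Wrapped:
    """Hypothetical stand-in for DataArray; not xarray's actual class."""

    def __init__(self, values, numpy_fallthrough=True):
        self.values = np.asarray(values)
        self.numpy_fallthrough = numpy_fallthrough

    def __getattr__(self, name):
        # __getattr__ is only consulted when normal attribute lookup fails,
        # i.e. for method names the class does not define itself.
        if name.startswith("_") or not self.__dict__.get("numpy_fallthrough", False):
            raise AttributeError(name)
        func = getattr(np, name, None)
        if not callable(func):
            raise AttributeError(name)

        def method(*args, **kwargs):
            # Forward to np.<name> and re-wrap the result.
            return Wrapped(func(self.values, *args, **kwargs))

        return method

w = Wrapped([1.0, 4.0, 9.0])
roots = w.sqrt()  # no sqrt() is defined on Wrapped, so this falls through to np.sqrt
```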

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Export ufuncs from DataArray API 876394165
695165172 https://github.com/pydata/xarray/issues/4363#issuecomment-695165172 https://api.github.com/repos/pydata/xarray/issues/4363 MDEyOklzc3VlQ29tbWVudDY5NTE2NTE3Mg== seth-p 7441788 2020-09-19T05:00:50Z 2020-09-19T05:00:50Z CONTRIBUTOR

Indeed, this is reported in https://github.com/pandas-dev/pandas/issues/35466#issuecomment-678407125 and https://github.com/pandas-dev/pandas/issues/35830. Also https://github.com/pandas-dev/pandas/pull/35478.

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Indexing a datetime64[ns] coordinate with a scalar datetime.date produces a KeyError 683657289
693117098 https://github.com/pydata/xarray/pull/4292#issuecomment-693117098 https://api.github.com/repos/pydata/xarray/issues/4292 MDEyOklzc3VlQ29tbWVudDY5MzExNzA5OA== seth-p 7441788 2020-09-16T01:34:08Z 2020-09-16T01:34:08Z CONTRIBUTOR

Does this fix #4363?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Fix indexing with datetime64[ns] with pandas=1.1 669307837
625530671 https://github.com/pydata/xarray/issues/4044#issuecomment-625530671 https://api.github.com/repos/pydata/xarray/issues/4044 MDEyOklzc3VlQ29tbWVudDYyNTUzMDY3MQ== seth-p 7441788 2020-05-07T22:33:46Z 2020-05-07T22:33:46Z CONTRIBUTOR

@TomNicholas , yes, thank you.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset(paths, combine='nested') with and without concat_dim=None 614149170
601885539 https://github.com/pydata/xarray/pull/2922#issuecomment-601885539 https://api.github.com/repos/pydata/xarray/issues/2922 MDEyOklzc3VlQ29tbWVudDYwMTg4NTUzOQ== seth-p 7441788 2020-03-20T19:57:54Z 2020-03-20T20:00:20Z CONTRIBUTOR

All good points:

What could be done, though, is to only do da = da.fillna(0.0) if da contains NaNs.

Good idea, though I don't know what the performance hit would be of the extra check (in the case that da does contain NaNs, so the check is for naught).

I assume so. I don't know what kind of temporary variables np.einsum creates. Also np.einsum is wrapped in xr.apply_ufunc so all kinds of magic is going on.

Well, (da * weights) will be at least as large as da. I'm not certain, but I don't think np.einsum creates huge temporary arrays.

Do you want to leave it out for performance reasons? Because it was a deliberate decision to not support NaNs in the weights and I don't think this is going to change.

Yes. You can continue not supporting NaNs in the weights, yet not explicitly check that there are no NaNs (optionally, if the caller assures you that there are no NaNs).

None of your suggested functions support NaNs so they won't work.

Correct. These have nothing to do with the NaNs issue.

For profiling memory usage, I use psutil.Process(os.getpid()).memory_info().rss for current usage and resource.getrusage(resource.RUSAGE_SELF).ru_maxrss for peak usage (on Linux).
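A minimal sketch of the peak-usage half of that approach, using only the standard library (Unix only; the `psutil` call is left out because it is a third-party package). The helper name `peak_rss_bytes` and the 50 MB allocation are illustrative assumptions.

```python
import resource
import sys

def peak_rss_bytes():
    """Peak resident set size of this process so far (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    return peak if sys.platform == "darwin" else peak * 1024

before = peak_rss_bytes()
scratch = bytearray(50 * 1024 * 1024)  # request roughly 50 MB
after = peak_rss_bytes()
```

Because this is a high-water mark, `after` can never be lower than `before`. Whether it rises by the full allocation depends on how the operating system backs the memory.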

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/weighted 437765416
601709733 https://github.com/pydata/xarray/pull/2922#issuecomment-601709733 https://api.github.com/repos/pydata/xarray/issues/2922 MDEyOklzc3VlQ29tbWVudDYwMTcwOTczMw== seth-p 7441788 2020-03-20T13:47:39Z 2020-03-20T16:31:14Z CONTRIBUTOR

@mathause, have you considered using these functions?

- np.average() to calculate weighted mean().
- np.cov() to calculate weighted cov(), var(), and std().
- sp.stats.cumfreq() to calculate weighted median() (I haven't thought this through).
- sp.spatial.distance.correlation() to calculate weighted corrcoef(). (Of course one could also calculate this from weighted cov() (see above), but first need to mask the two arrays simultaneously.)
- sklearn.utils.extmath.weighted_mode() to calculate weighted mode().
- gmisclib.weighted_percentile.{wp,wtd_median}() to calculate weighted quantile() and median().

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/weighted 437765416
601708110 https://github.com/pydata/xarray/pull/2922#issuecomment-601708110 https://api.github.com/repos/pydata/xarray/issues/2922 MDEyOklzc3VlQ29tbWVudDYwMTcwODExMA== seth-p 7441788 2020-03-20T13:44:03Z 2020-03-20T13:52:06Z CONTRIBUTOR

@mathause, ideally dot() would support skipna, so you could eliminate the da = da.fillna(0.0) and pass the skipna down the line. But alas it doesn't...

(da * weights).sum(dim=dim, skipna=skipna) would likely make things worse, I think, as it would necessarily create a temporary array at least as large as da, no?

Either way, this only addresses the da = da.fillna(0.0), not the mask = da.notnull().

Also, perhaps the test if weights.isnull().any() in Weighted.__init__() should be optional?

Maybe I'm more sensitive to this than others, but I regularly deal with 10-100GB arrays.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/weighted 437765416
601699091 https://github.com/pydata/xarray/pull/2922#issuecomment-601699091 https://api.github.com/repos/pydata/xarray/issues/2922 MDEyOklzc3VlQ29tbWVudDYwMTY5OTA5MQ== seth-p 7441788 2020-03-20T13:25:21Z 2020-03-20T13:25:21Z CONTRIBUTOR

@max-sixty, I wish I could, but I'm afraid that I cannot submit code due to employer limitations.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/weighted 437765416
601496897 https://github.com/pydata/xarray/pull/2922#issuecomment-601496897 https://api.github.com/repos/pydata/xarray/issues/2922 MDEyOklzc3VlQ29tbWVudDYwMTQ5Njg5Nw== seth-p 7441788 2020-03-20T02:11:53Z 2020-03-20T02:12:24Z CONTRIBUTOR

I realize this is a bit late, but I'm still concerned about memory usage, specifically in https://github.com/pydata/xarray/blob/master/xarray/core/weighted.py#L130 and https://github.com/pydata/xarray/blob/master/xarray/core/weighted.py#L143. If da.sizes = {'dim_0': 100000, 'dim_1': 100000}, the two lines above will cause da.weighted(weights).mean('dim_0') to create two simultaneous temporary 100000x100000 arrays, which could be problematic.

I would have implemented this using apply_ufunc, so that one creates these temporary variables only on as small an array as absolutely necessary -- in this case just of size sizes['dim_0'] = 100000. (Much as I would like to, I'm afraid I'm not able to contribute code.) Of course this won't help in the case one is summing over all dimensions, but might as well minimize memory usage in some cases even if not in all.
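The difference between the two strategies can be sketched in plain NumPy. This is not the `weighted.py` implementation and not an `apply_ufunc` call; the function name and the tiny shapes are assumptions, and NaN handling is omitted. The loop plays the role of a vectorized `apply_ufunc` over the non-reduced dimension.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((4, 3))   # dims: (dim_0, dim_1)
weights = rng.random(4)     # weights along dim_0

# Whole-array form: `data * weights[:, None]` is a temporary as large as `data`.
naive = (data * weights[:, None]).sum(axis=0) / weights.sum()

def weighted_mean_columnwise(data, weights):
    """Reduce over dim_0 one column at a time (hypothetical helper)."""
    out = np.empty(data.shape[1])
    total_weight = weights.sum()
    for j in range(data.shape[1]):
        # Any temporary created here has at most dim_0 elements.
        out[j] = np.dot(data[:, j], weights) / total_weight
    return out

columnwise = weighted_mean_columnwise(data, weights)
```

Both forms give the same numbers; the column-wise form trades speed for a much smaller peak footprint.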

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/weighted 437765416
594682466 https://github.com/pydata/xarray/issues/3829#issuecomment-594682466 https://api.github.com/repos/pydata/xarray/issues/3829 MDEyOklzc3VlQ29tbWVudDU5NDY4MjQ2Ng== seth-p 7441788 2020-03-04T17:27:09Z 2020-03-04T17:27:09Z CONTRIBUTOR

@keewis, thanks for the suggestions. Both seem reasonable.

In your first example, if you wanted to prohibit obj.weighted.sum(dim), you could just check for self._weight in sum(). Though I suppose it would be nice to be able to have the interpreter enforce the requirement and not have to do an explicit check in every method.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  {DataArray,Dataset} accessors with parameters 575564170
594128647 https://github.com/pydata/xarray/issues/3820#issuecomment-594128647 https://api.github.com/repos/pydata/xarray/issues/3820 MDEyOklzc3VlQ29tbWVudDU5NDEyODY0Nw== seth-p 7441788 2020-03-03T19:34:39Z 2020-03-03T19:34:39Z CONTRIBUTOR

Note that inferring dimensions from coords when it is a list of tuples does still work (with no deprecation warning):

```
In [1]: import numpy as np, xarray as xr

In [2]: xr.DataArray(np.zeros((2, 2)), coords=[('x', [1, 2]), ('y', [1, 2])])
Out[2]:
<xarray.DataArray (x: 2, y: 2)>
array([[0., 0.],
       [0., 0.]])
Coordinates:
  * x        (x) int64 1 2
  * y        (y) int64 1 2
```

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Documentation of DataArray does not warn that inferring dimension names is deprecated 574097799
592737661 https://github.com/pydata/xarray/issues/3810#issuecomment-592737661 https://api.github.com/repos/pydata/xarray/issues/3810 MDEyOklzc3VlQ29tbWVudDU5MjczNzY2MQ== seth-p 7441788 2020-02-28T21:29:58Z 2020-02-28T21:31:31Z CONTRIBUTOR

Note that with the apply_ufunc implementation we're only reshaping dims-sized ndarrays, not (necessarily) the whole DataArray, so maybe it's not too bad? It might be better to first sort dims to be in the same order as self.dims. i.e. dims = [dim_ for dim_ in self.dims if dim_ in dims]. But I'm just speculating.
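The reordering suggested here is a one-line list comprehension. The dimension names below are made up for illustration, and `self_dims` stands in for `self.dims`.

```python
self_dims = ("time", "lat", "lon")  # hypothetical stand-in for self.dims
dims = ["lon", "time"]              # dimensions requested by the caller, in arbitrary order

# Keep only the requested dimensions, in the order they appear on the object.
ordered = [dim_ for dim_ in self_dims if dim_ in dims]
```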

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  {DataArray,Dataset}.rank() should support an optional list of dimensions 572875480
592715925 https://github.com/pydata/xarray/issues/3810#issuecomment-592715925 https://api.github.com/repos/pydata/xarray/issues/3810 MDEyOklzc3VlQ29tbWVudDU5MjcxNTkyNQ== seth-p 7441788 2020-02-28T20:33:43Z 2020-02-28T20:35:57Z CONTRIBUTOR

A few minor tweaks needed:

```
In [20]: import bottleneck

In [21]: xr.apply_ufunc(
    ...:     lambda x: bottleneck.rankdata(x).reshape(x.shape),
    ...:     d,
    ...:     input_core_dims=[['xyz', 'abc']],
    ...:     output_core_dims=[['xyz', 'abc']],
    ...:     vectorize=True
    ...: ).transpose(*d.dims)
Out[21]:
<xarray.DataArray (abc: 4, xyz: 3)>
array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.],
       [ 7.,  8.,  9.],
       [10., 11., 12.]])
Dimensions without coordinates: abc, xyz
```

Despite what the docs say, bottleneck.{nan}rankdata(a) returns a 1-dimensional ndarray, not an array with the same shape as a.
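The "rank the flattened array, then restore the shape" pattern can be shown without bottleneck. The sketch below uses a NumPy double `argsort` as a stand-in for `bottleneck.rankdata`; unlike the real function it does not average tied values, and the input array is made up.

```python
import numpy as np

a = np.array([[30, 10],
              [20, 40]])

# Rank the flattened values (1-based; ties are not handled in this stand-in).
flat_ranks = a.ravel().argsort().argsort() + 1

# The ranks come back 1-dimensional, so reshape them to match the input.
ranks = flat_ranks.reshape(a.shape)
```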

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  {DataArray,Dataset}.rank() should support an optional list of dimensions 572875480
592672463 https://github.com/pydata/xarray/issues/3810#issuecomment-592672463 https://api.github.com/repos/pydata/xarray/issues/3810 MDEyOklzc3VlQ29tbWVudDU5MjY3MjQ2Mw== seth-p 7441788 2020-02-28T18:51:18Z 2020-02-28T18:52:29Z CONTRIBUTOR

What's wrong with the following? (Still need to deal with pct and keep_attrs.)

    apply_ufunc(
        bottleneck.{nan}rankdata,
        self,
        input_core_dims=[dims],
        output_core_dims=[dims],
        vectorize=True
    )

Per https://kwgoodman.github.io/bottleneck-doc/reference.html#bottleneck.rankdata, "The default (axis=None) is to rank the elements of the flattened array."

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  {DataArray,Dataset}.rank() should support an optional list of dimensions 572875480
592654794 https://github.com/pydata/xarray/issues/3810#issuecomment-592654794 https://api.github.com/repos/pydata/xarray/issues/3810 MDEyOklzc3VlQ29tbWVudDU5MjY1NDc5NA== seth-p 7441788 2020-02-28T18:06:57Z 2020-02-28T18:06:57Z CONTRIBUTOR

Assuming dims is a non-empty list of dimensions, the following code seems to work:

    temp_dim = '__temp_dim__'
    return da.stack(**{temp_dim: dims}).\
        rank(temp_dim, pct=pct, keep_attrs=keep_attrs).\
        unstack(temp_dim).transpose(*da.dims).\
        drop_vars([dim_ for dim_ in dims if dim_ not in da.coords])

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  {DataArray,Dataset}.rank() should support an optional list of dimensions 572875480
592151913 https://github.com/pydata/xarray/issues/2017#issuecomment-592151913 https://api.github.com/repos/pydata/xarray/issues/2017 MDEyOklzc3VlQ29tbWVudDU5MjE1MTkxMw== seth-p 7441788 2020-02-27T20:04:44Z 2020-02-27T20:04:44Z CONTRIBUTOR

I'm afraid I'm not able to submit a PR. Sorry.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  np.minimum.accumulate(da) doesn't work 309098246
592033172 https://github.com/pydata/xarray/issues/2017#issuecomment-592033172 https://api.github.com/repos/pydata/xarray/issues/2017 MDEyOklzc3VlQ29tbWVudDU5MjAzMzE3Mg== seth-p 7441788 2020-02-27T15:55:23Z 2020-02-27T15:55:23Z CONTRIBUTOR

I think the only necessary changes are (a) delete the if method != "__call__" check (https://github.com/pydata/xarray/blob/master/xarray/core/arithmetic.py#L49), and (b) in the apply_ufunc() call, replace ufunc with getattr(ufunc, method) (https://github.com/pydata/xarray/blob/master/xarray/core/arithmetic.py#L71).
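Change (b) relies on the fact that ufunc methods such as `accumulate` are ordinary attributes of the ufunc object, so they can be looked up by name. A small NumPy-only sketch with made-up values (this is not the `arithmetic.py` code):

```python
import numpy as np

ufunc = np.minimum
values = np.array([3, 5, 2, 4, 1])

# Look the method up by name, as __array_ufunc__ receives it as a string.
running_min = getattr(ufunc, "accumulate")(values)

# The default method name resolves to the plain call.
pairwise_min = getattr(ufunc, "__call__")(values, 3)
```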

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  np.minimum.accumulate(da) doesn't work 309098246
592027630 https://github.com/pydata/xarray/issues/2017#issuecomment-592027630 https://api.github.com/repos/pydata/xarray/issues/2017 MDEyOklzc3VlQ29tbWVudDU5MjAyNzYzMA== seth-p 7441788 2020-02-27T15:38:05Z 2020-02-27T15:38:05Z CONTRIBUTOR

This issue is still relevant.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  np.minimum.accumulate(da) doesn't work 309098246
582613810 https://github.com/pydata/xarray/issues/3736#issuecomment-582613810 https://api.github.com/repos/pydata/xarray/issues/3736 MDEyOklzc3VlQ29tbWVudDU4MjYxMzgxMA== seth-p 7441788 2020-02-05T21:09:43Z 2020-02-05T21:09:43Z CONTRIBUTOR

This is fixed in Pandas 1.0.1.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  BUG: datetime.date slicing doesn't work with Pandas 1.0.0 558204984
580897245 https://github.com/pydata/xarray/issues/3736#issuecomment-580897245 https://api.github.com/repos/pydata/xarray/issues/3736 MDEyOklzc3VlQ29tbWVudDU4MDg5NzI0NQ== seth-p 7441788 2020-01-31T20:24:30Z 2020-01-31T20:30:22Z CONTRIBUTOR

Pandas bug: https://github.com/pandas-dev/pandas/issues/31501

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  BUG: datetime.date slicing doesn't work with Pandas 1.0.0 558204984
545261919 https://github.com/pydata/xarray/issues/1635#issuecomment-545261919 https://api.github.com/repos/pydata/xarray/issues/1635 MDEyOklzc3VlQ29tbWVudDU0NTI2MTkxOQ== seth-p 7441788 2019-10-23T04:35:37Z 2019-10-23T04:35:37Z CONTRIBUTOR

I think this issue is still relevant.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  DataArray.argsort should be deleted 266133430
523374500 https://github.com/pydata/xarray/issues/3236#issuecomment-523374500 https://api.github.com/repos/pydata/xarray/issues/3236 MDEyOklzc3VlQ29tbWVudDUyMzM3NDUwMA== seth-p 7441788 2019-08-21T09:22:35Z 2019-08-21T09:24:43Z CONTRIBUTOR

I was thinking a tuple/list (corresponding to args) of dicts (dim -> value) containing the non-input_core_dims being evaluated. (If it weren't for exclude_dims, if I understand it correctly, I think one would need only a single dict (dim -> value).)

I would be fine with this being an optional kwarg to the actual func.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: apply_ufunc logging or callback 483028482
500217864 https://github.com/pydata/xarray/issues/1266#issuecomment-500217864 https://api.github.com/repos/pydata/xarray/issues/1266 MDEyOklzc3VlQ29tbWVudDUwMDIxNzg2NA== seth-p 7441788 2019-06-09T14:50:17Z 2019-06-09T14:50:17Z CONTRIBUTOR

I think this is still an issue.

{
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Coordinate type changing from string to object 207317762
438749995 https://github.com/pydata/xarray/issues/1666#issuecomment-438749995 https://api.github.com/repos/pydata/xarray/issues/1666 MDEyOklzc3VlQ29tbWVudDQzODc0OTk5NQ== seth-p 7441788 2018-11-14T17:35:17Z 2018-11-14T17:38:31Z CONTRIBUTOR

Also, the following code seems to accomplish the same as the above:

```
import xarray as xr

def apply_func_rolling(func, *args, **kwargs):
    # determine rolling parameters, and remove them from kwargs
    apply_func_kwargs = {'input_core_dims', 'output_core_dims', 'vectorize', 'join',
                         'dataset_join', 'keep_attrs', 'exclude_dims', 'dataset_fill_value',
                         'kwargs', 'dask', 'output_dtypes', 'output_sizes'}
    min_periods = kwargs.pop('min_periods', None)
    center = kwargs.pop('center', False)
    dim = xr.core.utils.either_dict_or_kwargs(
        kwargs.pop('dim', None),
        {k: v for k, v in kwargs.items() if k not in apply_func_kwargs},
        'apply_func_rolling')
    if len(dim) != 1:
        raise ValueError("precisely one rolling dimension must be specified")
    rolling_dim = list(dim.keys())[0]
    # the rolling dimension is absent from kwargs if it was passed via dim={...}
    kwargs.pop(rolling_dim, None)
    temp_rolling_dim = 'temp{}__'.format(rolling_dim)

    # change input_core_dims rolling_dim values to temp_rolling_dim
    input_core_dims = kwargs.get('input_core_dims', None)
    if input_core_dims:
        kwargs['input_core_dims'] = [[(temp_rolling_dim if (dim_ == rolling_dim) else dim_)
                                      for dim_ in dims_]
                                     for dims_ in input_core_dims]

    # change exclude_dims rolling_dim values to temp_rolling_dim
    # (exclude_dims is a flat set of dimension names, not a nested list)
    exclude_dims = kwargs.get('exclude_dims', None)
    if exclude_dims:
        kwargs['exclude_dims'] = {(temp_rolling_dim if (dim_ == rolling_dim) else dim_)
                                  for dim_ in exclude_dims}

    # call apply_ufunc() with rolling-constructed objects
    return xr.apply_ufunc(
        func,
        *[(arg.rolling(dim=dim, min_periods=min_periods, center=center).
           construct(temp_rolling_dim) if (rolling_dim in arg.dims) else arg)
          for arg in args],
        **kwargs)

apply_func_rolling(lambda a, b, w: ...,
                   variables, observations, weights,
                   date=N,
                   input_core_dims=[['date', 'dim1', 'dim2', 'var'],
                                    ['date', 'dim1', 'dim2'],
                                    ['date', 'dim1', 'dim2']],
                   output_core_dims=[['var']],
                   vectorize=True)
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Could a DataArrayRolling object compute an arbitrary function on rolling windows? 269297904
438372589 https://github.com/pydata/xarray/issues/1666#issuecomment-438372589 https://api.github.com/repos/pydata/xarray/issues/1666 MDEyOklzc3VlQ29tbWVudDQzODM3MjU4OQ== seth-p 7441788 2018-11-13T17:56:17Z 2018-11-13T20:16:57Z CONTRIBUTOR

construct method does not allocate that large array in memory. It uses the strided trick and therefore consumes memory only on the order of 1000x1000x1000.

Ah. I didn't realize that. Good to know.
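The "strided trick" can be seen directly in NumPy. The sketch below uses `numpy.lib.stride_tricks.sliding_window_view` (available in NumPy 1.20 and later) as a stand-in for what `construct` does; xarray additionally pads the start with NaN, which this example does not reproduce.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

base = np.arange(10.0)

# Every length-4 window along the array, exposed as an extra trailing axis.
windows = sliding_window_view(base, window_shape=4)

# The windowed array is a view onto `base`, not a copy of it.
is_view = np.shares_memory(base, windows)
```

The window axis multiplies the apparent size by the window length, but the underlying buffer is still the original ten values.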

What I'm actually looking to do is a rolling weighted regression. I have three DataArrays:

- observations, dims=('date', 'dim1', 'dim2')
- variables, dims=('date', 'dim1', 'dim2', 'var')
- weights, dims=('date', 'dim1', 'dim2')

I want to calculate a regression_coefficients DataArray with dims=('date', 'var'), where for each date it has the weighted regression coefficients calculated over the trailing N dates (over 'dim1' and 'dim2'). One way would be to put the three DataArrays in a Dataset, and then use a newly-defined Dataset.rolling().apply(). Another way would be to use an enhanced version of apply_ufunc() that can take Rolling objects. But now that I know that DataArrayRolling.construct() won't kill my machine, I'll try apply_ufunc() with the three DataArrayRolling.construct() objects. I'd welcome other suggestions.

OK, I seem to have got my problem working using:

    apply_ufunc(lambda a, b, w: ...,
                variables.rolling(date=N).construct('temp_date'),
                observations.rolling(date=N).construct('temp_date'),
                weights.rolling(date=N).construct('temp_date'),
                input_core_dims=[['temp_date', 'dim1', 'dim2', 'var'],
                                 ['temp_date', 'dim1', 'dim2'],
                                 ['temp_date', 'dim1', 'dim2']],
                output_core_dims=[['var']],
                vectorize=True)

Still, I wonder if there isn't a more "natural" way of accomplishing this.
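The elided `lambda a, b, w: ...` is not given in the thread. One possible per-window function is sketched below as a NumPy weighted least-squares fit. The name `weighted_regression`, the synthetic data, and the absence of NaN handling (windows built by `construct` are NaN-padded at the start) are all assumptions.

```python
import numpy as np

def weighted_regression(variables, observations, weights):
    """Weighted least squares over all leading axes (hypothetical helper).

    variables: (..., n_var); observations and weights: (...).
    Returns an array of n_var coefficients.
    """
    X = variables.reshape(-1, variables.shape[-1])
    y = observations.reshape(-1)
    sqrt_w = np.sqrt(weights.reshape(-1))
    # Scaling rows by sqrt(w) turns weighted least squares into ordinary least squares.
    coef, *_ = np.linalg.lstsq(X * sqrt_w[:, None], y * sqrt_w, rcond=None)
    return coef

rng = np.random.default_rng(0)
variables = rng.normal(size=(5, 3, 2, 2))    # (temp_date, dim1, dim2, var)
true_coef = np.array([2.0, -1.0])
observations = variables @ true_coef         # noise-free, so the fit should recover true_coef
weights = rng.uniform(0.5, 1.5, size=(5, 3, 2))

coef = weighted_regression(variables, observations, weights)
```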

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Could a DataArrayRolling object compute an arbitrary function on rolling windows? 269297904
438351586 https://github.com/pydata/xarray/issues/1666#issuecomment-438351586 https://api.github.com/repos/pydata/xarray/issues/1666 MDEyOklzc3VlQ29tbWVudDQzODM1MTU4Ng== seth-p 7441788 2018-11-13T17:06:38Z 2018-11-13T17:06:38Z CONTRIBUTOR

I think there are actually a couple different ways Rolling.apply could work, but this seems like one possible way:

```
from xarray.core.utils import maybe_wrap_array
from xarray.core.combine import concat

def rolling_apply(rolling, func, *args, **kwargs):
    applied = [maybe_wrap_array(label, func(arr, *args, **kwargs))
               for label, arr in rolling]
    combined = concat(applied, dim=rolling.obj.coords[rolling.dim])
    return combined
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Could a DataArrayRolling object compute an arbitrary function on rolling windows? 269297904
438345386 https://github.com/pydata/xarray/issues/1666#issuecomment-438345386 https://api.github.com/repos/pydata/xarray/issues/1666 MDEyOklzc3VlQ29tbWVudDQzODM0NTM4Ng== seth-p 7441788 2018-11-13T16:50:08Z 2018-11-13T16:52:13Z CONTRIBUTOR

The problem I have with Rolling.construct is the same that I have with Rolling.reduce: it's very (potentially) memory-inefficient. E.g. consider a 1000x1000x1000 array for which I want to apply rolling window of length 500 along the final dimension; I believe Rolling.construct/reduce will construct a 1000x1000x1000x500 array. This can quickly get out of hand.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Could a DataArrayRolling object compute an arbitrary function on rolling windows? 269297904
438344608 https://github.com/pydata/xarray/issues/1666#issuecomment-438344608 https://api.github.com/repos/pydata/xarray/issues/1666 MDEyOklzc3VlQ29tbWVudDQzODM0NDYwOA== seth-p 7441788 2018-11-13T16:48:09Z 2018-11-13T16:48:09Z CONTRIBUTOR

Separately, maybe apply_ufunc should accept Rolling objects?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Could a DataArrayRolling object compute an arbitrary function on rolling windows? 269297904
438322013 https://github.com/pydata/xarray/issues/1666#issuecomment-438322013 https://api.github.com/repos/pydata/xarray/issues/1666 MDEyOklzc3VlQ29tbWVudDQzODMyMjAxMw== seth-p 7441788 2018-11-13T16:06:24Z 2018-11-13T16:06:24Z CONTRIBUTOR

I think what is needed are DataArrayRolling.apply and DatasetRolling.apply (like DataArrayGroupBy.apply and DatasetGroupBy.apply). The problem with the reduce methods is that they are memory-inefficient.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Could a DataArrayRolling object compute an arbitrary function on rolling windows? 269297904
436015893 https://github.com/pydata/xarray/issues/1077#issuecomment-436015893 https://api.github.com/repos/pydata/xarray/issues/1077 MDEyOklzc3VlQ29tbWVudDQzNjAxNTg5Mw== seth-p 7441788 2018-11-05T20:03:48Z 2018-11-05T20:03:48Z CONTRIBUTOR

This code isn't particularly pretty, and I'm not sure if it handles all cases, but it enables serialization of MultiIndex indices by calling ds.mi.encode_multiindices() before serializing and ds.mi.decode_multiindices() after deserializing.

```
import pandas as pd
import xarray as xr

# Note: MultiIndex.labels was renamed to MultiIndex.codes in pandas 0.24.

@xr.register_dataset_accessor('mi')
class MiscDatasetAccessor():
    def __init__(self, xarray_obj):
        self._obj = xarray_obj

    def encode_multiindices(self):
        # replace each MultiIndex with plain coordinates holding its levels and labels
        result = self._obj
        for name, index in list(self._obj.indexes.items()):
            if isinstance(index, pd.MultiIndex):
                temp_name = '__' + name
                new_coords = {'{}__{}'.format(temp_name, level_name): level_values.rename(None)
                              for level_name, level_values in zip(index.names, index.levels)}
                new_coords[temp_name] = xr.DataArray(index.labels,
                                                     dims=('{}__names__'.format(temp_name),
                                                           '{}__num__'.format(temp_name)),
                                                     coords={'{}__names__'.format(temp_name): index.names,
                                                             '{}__num__'.format(temp_name): list(range(len(index)))},
                                                     attrs={'__is_multiindex': 1})
                result = result.drop(name).assign_coords(**new_coords)
        return result

    def decode_multiindices(self):
        # rebuild each MultiIndex from the coordinates written by encode_multiindices()
        result = self._obj
        for temp_name, da in list(self._obj.coords.items()):
            if temp_name.startswith('__') and da.attrs.get('__is_multiindex', False):
                name = temp_name[2:]
                level_names = da.coords['{}__names__'.format(temp_name)].values
                levels = [result.coords['{}__{}'.format(temp_name, level_name)].values for level_name in level_names]
                labels = da.values
                result = result.assign_coords(**{name: pd.MultiIndex(levels=levels, labels=labels, names=level_names)})
                result = result.drop(['{}__{}'.format(temp_name, level_name) for level_name in level_names] +
                                     list(da.dims) + [temp_name])
        return result
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  MultiIndex serialization to NetCDF 187069161
407159487 https://github.com/pydata/xarray/issues/2170#issuecomment-407159487 https://api.github.com/repos/pydata/xarray/issues/2170 MDEyOklzc3VlQ29tbWVudDQwNzE1OTQ4Nw== seth-p 7441788 2018-07-23T18:39:52Z 2018-07-23T18:39:52Z CONTRIBUTOR

I second this request.

The following may not be optimal, but seems to work for me as a keepdims=True version of reduce():

    def dim_preserving_reduce(self, func, dim=None, axis=None, label=None, keep_attrs=False, **kwargs):
        if axis is not None:
            dim = np.take(self._obj.dims, axis, mode='wrap')
        dims = dim if isinstance(dim, (list, tuple)) else [dim]
        dims_coords = {dim: [lab] for dim, lab in zip(dims, (label if isinstance(label, list) else [label]))}
        return self._obj.reduce(func, dim=dims, keep_attrs=keep_attrs, **kwargs). \
            expand_dims(dims, axis=[self._obj.dims.index(dim) for dim in dims]). \
            assign_coords(**dims_coords)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  keepdims=True for xarray reductions 325436508
406639293 https://github.com/pydata/xarray/pull/2293#issuecomment-406639293 https://api.github.com/repos/pydata/xarray/issues/2293 MDEyOklzc3VlQ29tbWVudDQwNjYzOTI5Mw== seth-p 7441788 2018-07-20T15:40:23Z 2018-07-20T15:40:23Z CONTRIBUTOR

I added a note to whats-new.rst.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: format_array_flat() always displays first and last items. 341664808
406485754 https://github.com/pydata/xarray/pull/2293#issuecomment-406485754 https://api.github.com/repos/pydata/xarray/issues/2293 MDEyOklzc3VlQ29tbWVudDQwNjQ4NTc1NA== seth-p 7441788 2018-07-20T04:27:51Z 2018-07-20T04:28:35Z CONTRIBUTOR

One slight oddity is that `formatting.format_array_flat(np.arange(4), 0)` returns `0 ... 3` even though `0 1 2 3` would take up the same number of characters. I'm not inclined to add a special case for this, but let me know if you think I should.
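The oddity can be reproduced with a small pure-Python sketch of a formatter that always keeps the first and last items. This is not xarray's implementation; the function name `format_flat` and the alternating front/back strategy are assumptions made for illustration.

```python
def format_flat(items, max_width):
    """Join items with spaces, eliding the middle with '...' when too wide.

    Illustrative sketch only: the first and last items are always kept,
    even when they alone exceed max_width.
    """
    strs = [str(x) for x in items]
    full = " ".join(strs)
    if len(strs) <= 2 or len(full) <= max_width:
        return full

    front, back = [strs[0]], [strs[-1]]
    lo, hi = 1, len(strs) - 2      # next unused positions from each end
    take_front = True              # alternate: front, back, front, ...
    while lo <= hi:
        if take_front:
            trial_front, trial_back = front + [strs[lo]], back
        else:
            trial_front, trial_back = front, [strs[hi]] + back
        trial = " ".join(trial_front + ["..."] + trial_back)
        if len(trial) > max_width:
            break                  # adding another item would overflow
        front, back = trial_front, trial_back
        if take_front:
            lo += 1
        else:
            hi -= 1
        take_front = not take_front
    return " ".join(front + ["..."] + back)


print(format_flat(range(4), 0))    # 0 ... 3
print(format_flat(range(4), 7))    # 0 1 2 3
print(format_flat(range(10), 11))  # 0 1 ... 8 9
```

With a width of 0 the sketch falls back to the first and last items joined by `...`, which happens to be exactly as long as the full `0 1 2 3`.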

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: format_array_flat() always displays first and last items. 341664808
406390564 https://github.com/pydata/xarray/pull/2293#issuecomment-406390564 https://api.github.com/repos/pydata/xarray/issues/2293 MDEyOklzc3VlQ29tbWVudDQwNjM5MDU2NA== seth-p 7441788 2018-07-19T19:38:24Z 2018-07-19T19:38:24Z CONTRIBUTOR

@shoyer, I think I've implemented all your suggestions. Let me know what you think. (I haven't yet updated whats-new.rst.)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: format_array_flat() always displays first and last items. 341664808
405700114 https://github.com/pydata/xarray/issues/1186#issuecomment-405700114 https://api.github.com/repos/pydata/xarray/issues/1186 MDEyOklzc3VlQ29tbWVudDQwNTcwMDExNA== seth-p 7441788 2018-07-17T19:28:49Z 2018-07-17T19:28:49Z CONTRIBUTOR

I included sample output in https://github.com/pydata/xarray/pull/2293#issuecomment-405369643 and https://github.com/pydata/xarray/pull/2293/files#diff-f82411dbe6aa53e3b6a5d9c2b601094c.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Including last coordinate values when displaying coordinates 197709208
405369880 https://github.com/pydata/xarray/pull/2285#issuecomment-405369880 https://api.github.com/repos/pydata/xarray/issues/2285 MDEyOklzc3VlQ29tbWVudDQwNTM2OTg4MA== seth-p 7441788 2018-07-16T20:23:50Z 2018-07-16T20:23:50Z CONTRIBUTOR

Replaced with https://github.com/pydata/xarray/pull/2293.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: format_array_flat() always displays first and last items. 341149017
405369643 https://github.com/pydata/xarray/pull/2293#issuecomment-405369643 https://api.github.com/repos/pydata/xarray/issues/2293 MDEyOklzc3VlQ29tbWVudDQwNTM2OTY0Mw== seth-p 7441788 2018-07-16T20:23:02Z 2018-07-16T20:23:02Z CONTRIBUTOR

Sample output:

```
(base) C:\Users\Seth\github\xarray>ipython
Python 3.6.5 | packaged by conda-forge | (default, Apr 6 2018, 16:13:55) [MSC v.1900 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 6.4.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import xarray as xr

In [2]: words = "This is the time for all good men to come to the aid of their country".split(' ')

In [3]: for i in range(0, len(words) + 1):
   ...:     print("-------------------------------------------------------------------------------")
   ...:     print(xr.DataArray(words[:i], dims=('foo',), coords={'foo': words[:i]}))
   ...:

<xarray.DataArray (foo: 0)>
array([], dtype=float64)
Coordinates:
  * foo      (foo) float64

<xarray.DataArray (foo: 1)>
array(['This'], dtype='<U4')
Coordinates:
  * foo      (foo) <U4 'This'

<xarray.DataArray (foo: 2)>
array(['This', 'is'], dtype='<U4')
Coordinates:
  * foo      (foo) <U4 'This' 'is'

<xarray.DataArray (foo: 3)>
array(['This', 'is', 'the'], dtype='<U4')
Coordinates:
  * foo      (foo) <U4 'This' 'is' 'the'

<xarray.DataArray (foo: 4)>
array(['This', 'is', 'the', 'time'], dtype='<U4')
Coordinates:
  * foo      (foo) <U4 'This' 'is' 'the' 'time'

<xarray.DataArray (foo: 5)>
array(['This', 'is', 'the', 'time', 'for'], dtype='<U4')
Coordinates:
  * foo      (foo) <U4 'This' 'is' 'the' 'time' 'for'

<xarray.DataArray (foo: 6)>
array(['This', 'is', 'the', 'time', 'for', 'all'], dtype='<U4')
Coordinates:
  * foo      (foo) <U4 'This' 'is' 'the' 'time' 'for' 'all'

<xarray.DataArray (foo: 7)>
array(['This', 'is', 'the', 'time', 'for', 'all', 'good'], dtype='<U4')
Coordinates:
  * foo      (foo) <U4 'This' 'is' 'the' 'time' 'for' 'all' 'good'

<xarray.DataArray (foo: 8)>
array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men'], dtype='<U4')
Coordinates:
  * foo      (foo) <U4 'This' 'is' 'the' 'time' 'for' 'all' 'good' 'men'

<xarray.DataArray (foo: 9)>
array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to'], dtype='<U4')
Coordinates:
  * foo      (foo) <U4 'This' 'is' 'the' 'time' 'for' 'all' 'good' 'men' 'to'

<xarray.DataArray (foo: 10)>
array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come'], dtype='<U4')
Coordinates:
  * foo      (foo) <U4 'This' 'is' 'the' 'time' ... 'good' 'men' 'to' 'come'

<xarray.DataArray (foo: 11)>
array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come', 'to'], dtype='<U4')
Coordinates:
  * foo      (foo) <U4 'This' 'is' 'the' 'time' 'for' ... 'men' 'to' 'come' 'to'

<xarray.DataArray (foo: 12)>
array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come', 'to', 'the'], dtype='<U4')
Coordinates:
  * foo      (foo) <U4 'This' 'is' 'the' 'time' 'for' ... 'to' 'come' 'to' 'the'

<xarray.DataArray (foo: 13)>
array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come', 'to', 'the', 'aid'], dtype='<U4')
Coordinates:
  * foo      (foo) <U4 'This' 'is' 'the' 'time' ... 'come' 'to' 'the' 'aid'

<xarray.DataArray (foo: 14)>
array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come', 'to', 'the', 'aid', 'of'], dtype='<U4')
Coordinates:
  * foo      (foo) <U4 'This' 'is' 'the' 'time' 'for' ... 'to' 'the' 'aid' 'of'

<xarray.DataArray (foo: 15)>
array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come', 'to', 'the', 'aid', 'of', 'their'], dtype='<U5')
Coordinates:
  * foo      (foo) <U5 'This' 'is' 'the' 'time' ... 'the' 'aid' 'of' 'their'

<xarray.DataArray (foo: 16)>
array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come', 'to', 'the', 'aid', 'of', 'their', 'country'], dtype='<U7')
Coordinates:
  * foo      (foo) <U7 'This' 'is' 'the' 'time' ... 'aid' 'of' 'their' 'country'
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: format_array_flat() always displays first and last items. 341664808
405368970 https://github.com/pydata/xarray/pull/2285#issuecomment-405368970 https://api.github.com/repos/pydata/xarray/issues/2285 MDEyOklzc3VlQ29tbWVudDQwNTM2ODk3MA== seth-p 7441788 2018-07-16T20:20:38Z 2018-07-16T20:20:38Z CONTRIBUTOR

For some reason (presumably due to GitHub's outage today), this PR isn't updating to reflect the latest commits in my branch. So I'm going to close this one and create a new one.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: format_array_flat() always displays first and last items. 341149017
405305811 https://github.com/pydata/xarray/pull/2285#issuecomment-405305811 https://api.github.com/repos/pydata/xarray/issues/2285 MDEyOklzc3VlQ29tbWVudDQwNTMwNTgxMQ== seth-p 7441788 2018-07-16T16:28:21Z 2018-07-16T16:28:21Z CONTRIBUTOR

Sample output:

```
(base) C:\Users\Seth\github\xarray>ipython
Python 3.6.5 | packaged by conda-forge | (default, Apr 6 2018, 16:13:55) [MSC v.1900 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 6.4.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import xarray as xr

In [2]: words = "This is the time for all good men to come to the aid of their country".split(' ')

In [3]: for i in range(0, len(words) + 1):
   ...:     print("-------------------------------------------------------------------------------")
   ...:     print(xr.DataArray(words[:i], dims=('foo',), coords={'foo': words[:i]}))
   ...:

<xarray.DataArray (foo: 0)>
array([], dtype=float64)
Coordinates:
  * foo      (foo) float64

<xarray.DataArray (foo: 1)>
array(['This'], dtype='<U4')
Coordinates:
  * foo      (foo) <U4 'This'

<xarray.DataArray (foo: 2)>
array(['This', 'is'], dtype='<U4')
Coordinates:
  * foo      (foo) <U4 'This' 'is'

<xarray.DataArray (foo: 3)>
array(['This', 'is', 'the'], dtype='<U4')
Coordinates:
  * foo      (foo) <U4 'This' 'is' 'the'

<xarray.DataArray (foo: 4)>
array(['This', 'is', 'the', 'time'], dtype='<U4')
Coordinates:
  * foo      (foo) <U4 'This' 'is' 'the' 'time'

<xarray.DataArray (foo: 5)>
array(['This', 'is', 'the', 'time', 'for'], dtype='<U4')
Coordinates:
  * foo      (foo) <U4 'This' 'is' 'the' 'time' 'for'

<xarray.DataArray (foo: 6)>
array(['This', 'is', 'the', 'time', 'for', 'all'], dtype='<U4')
Coordinates:
  * foo      (foo) <U4 'This' 'is' 'the' 'time' 'for' 'all'

<xarray.DataArray (foo: 7)>
array(['This', 'is', 'the', 'time', 'for', 'all', 'good'], dtype='<U4')
Coordinates:
  * foo      (foo) <U4 'This' 'is' 'the' 'time' 'for' 'all' 'good'

<xarray.DataArray (foo: 8)>
array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men'], dtype='<U4')
Coordinates:
  * foo      (foo) <U4 'This' 'is' 'the' 'time' 'for' 'all' 'good' 'men'

<xarray.DataArray (foo: 9)>
array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to'], dtype='<U4')
Coordinates:
  * foo      (foo) <U4 'This' 'is' 'the' 'time' ... 'all' 'good' 'men' 'to'

<xarray.DataArray (foo: 10)>
array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come'], dtype='<U4')
Coordinates:
  * foo      (foo) <U4 'This' 'is' 'the' 'time' ... 'good' 'men' 'to' 'come'

<xarray.DataArray (foo: 11)>
array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come', 'to'], dtype='<U4')
Coordinates:
  * foo      (foo) <U4 'This' 'is' 'the' 'time' ... 'men' 'to' 'come' 'to'

<xarray.DataArray (foo: 12)>
array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come', 'to', 'the'], dtype='<U4')
Coordinates:
  * foo      (foo) <U4 'This' 'is' 'the' 'time' ... 'to' 'come' 'to' 'the'

<xarray.DataArray (foo: 13)>
array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come', 'to', 'the', 'aid'], dtype='<U4')
Coordinates:
  * foo      (foo) <U4 'This' 'is' 'the' 'time' ... 'come' 'to' 'the' 'aid'

<xarray.DataArray (foo: 14)>
array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come', 'to', 'the', 'aid', 'of'], dtype='<U4')
Coordinates:
  * foo      (foo) <U4 'This' 'is' 'the' 'time' 'for' ... 'to' 'the' 'aid' 'of'

<xarray.DataArray (foo: 15)>
array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come', 'to', 'the', 'aid', 'of', 'their'], dtype='<U5')
Coordinates:
  * foo      (foo) <U5 'This' 'is' 'the' 'time' ... 'the' 'aid' 'of' 'their'

<xarray.DataArray (foo: 16)>
array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come', 'to', 'the', 'aid', 'of', 'their', 'country'], dtype='<U7')
Coordinates:
  * foo      (foo) <U7 'This' 'is' 'the' 'time' ... 'of' 'their' 'country'
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: format_array_flat() always displays first and last items. 341149017
337672616 https://github.com/pydata/xarray/issues/1635#issuecomment-337672616 https://api.github.com/repos/pydata/xarray/issues/1635 MDEyOklzc3VlQ29tbWVudDMzNzY3MjYxNg== seth-p 7441788 2017-10-18T17:48:28Z 2017-10-18T18:36:25Z CONTRIBUTOR

I'm not a fan of auto-flattening either, but that's what nd.argsort() does...

One option is to have DataArray.arg{min,max,sort}() all take an optional flag argument specifying whether to return integer indices or index labels. But I think my preference would be to have six separate functions: DataArray.{idx,}arg{min,max,sort}() (or some such nomenclature that includes arg in all six functions).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  DataArray.argsort should be deleted 266133430
337623613 https://github.com/pydata/xarray/issues/1635#issuecomment-337623613 https://api.github.com/repos/pydata/xarray/issues/1635 MDEyOklzc3VlQ29tbWVudDMzNzYyMzYxMw== seth-p 7441788 2017-10-18T15:08:57Z 2017-10-18T15:08:57Z CONTRIBUTOR

I think that makes sense, though I don't quite understand what would go in its place. Another possibility -- perhaps a bad one -- is to permute the values in the sorted dimension so that they match the resulting values (i.e. something like result.coords[dim] = np.take(da.coords[dim].values, result.values, axis=axis)).

Note that ndarray.argsort(axis=None) sorts the flattened array, so the returned DataArray should respect this

Alternative suggestion: have DataArray.argsort() return an ndarray filled with labels from the sorted dimension, i.e. something like:

    class DataArray:
        def argsort(self, **kwargs):
            # TODO: update kwargs['axis'] based on 'axis' and 'dim', and remove 'dim'
            if kwargs['axis'] is None:
                kwargs['axis'] = -1
                return self.stack(dim=self.dims).argsort(**kwargs)
            return np.take(self.coords[self.dims[kwargs['axis']]].values,
                           self.values.argsort(**kwargs))

BTW, I'm just thinking in terms of ndarrays. Someone more knowledgeable than me may want to consider how to make it work intelligently with dask.
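To make the difference between integer positions and labels concrete, here is a minimal pure-Python sketch for the 1-D case. The function names `argsort_positions` and `argsort_labels` are hypothetical and are not xarray or NumPy APIs.

```python
def argsort_positions(values):
    """Integer positions that would sort `values` (stable, ascending)."""
    return sorted(range(len(values)), key=lambda i: values[i])


def argsort_labels(values, labels):
    """Coordinate labels listed in the order that would sort `values`."""
    return [labels[i] for i in argsort_positions(values)]


values = [30, 10, 20]
labels = ["a", "b", "c"]   # stands in for the coordinate of the sorted dimension

print(argsort_positions(values))       # [1, 2, 0]
print(argsort_labels(values, labels))  # ['b', 'c', 'a']
```

The first function corresponds to what an integer-returning argsort gives; the second corresponds to the label-returning variant suggested above.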

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  DataArray.argsort should be deleted 266133430

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 23.833ms · About: xarray-datasette