issue_comments

13 rows where issue = 352677925 sorted by updated_at descending


user 3

  • jsignell 7
  • shoyer 4
  • fujiisoup 2

author_association 2

  • CONTRIBUTOR 7
  • MEMBER 6

issue 1

  • Make `dim` optional on unstack · 13 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
417430252 https://github.com/pydata/xarray/pull/2375#issuecomment-417430252 https://api.github.com/repos/pydata/xarray/issues/2375 MDEyOklzc3VlQ29tbWVudDQxNzQzMDI1Mg== jsignell 4806877 2018-08-30T18:58:35Z 2018-08-30T18:58:35Z CONTRIBUTOR

Great! Thanks so much for all the feedback :)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Make `dim` optional on unstack 352677925
415647225 https://github.com/pydata/xarray/pull/2375#issuecomment-415647225 https://api.github.com/repos/pydata/xarray/issues/2375 MDEyOklzc3VlQ29tbWVudDQxNTY0NzIyNQ== shoyer 1217238 2018-08-24T04:24:33Z 2018-08-24T04:24:33Z MEMBER

It might make sense to use a list instead of a set here.

On Thu, Aug 23, 2018 at 8:37 PM Keisuke Fujii notifications@github.com wrote:

@fujiisoup commented on this pull request.

Thanks. A few comments.

In xarray/core/dataset.py https://github.com/pydata/xarray/pull/2375#discussion_r212513643:

    +
    +        missing_dims = [dim for dim in dims if dim not in self.dims]
    +        if missing_dims:
    +            raise ValueError('Dataset does not contain the dimensions: %s'
    +                             % missing_dims)
    +
    +        non_multi_dims = [dim for dim in dims
    +                          if not isinstance(self.get_index(dim), pd.MultiIndex)]
    +        if non_multi_dims and dim_from_kwarg:
    +            raise ValueError('cannot unstack dimensions that do not '
    +                             'have a MultiIndex: %s' % non_multi_dims)
    +
    +        dims = dims - set(non_multi_dims)
    +        if len(dims) == 0:
    +            raise ValueError('cannot unstack an object that does not have '
    +                             'MultiIndex dimensions')

I think we can allow unstacking an object without a MultiIndex, which would just return the object as is. That would be useful if users want to remove any MultiIndexes from an object.


In xarray/core/dataset.py https://github.com/pydata/xarray/pull/2375#discussion_r212513859:


        unstacked : Dataset
            Dataset with unstacked data.

        See also

        Dataset.stack
        """
        dim_from_kwarg = dim is not None

        if isinstance(dim, basestring):
            dims = set([dim])
        elif dim is None:
            dims = set(self.dims)
        else:
            dims = set(dim)

Maybe we can use OrderedSet instead of set so that the resultant dimension order is fixed.
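The ordering concern raised in this review can be illustrated with a minimal sketch in plain Python. It is not the code from the PR: `select_dims_to_unstack`, its parameters, and the idea of passing the MultiIndex dimensions in as a plain collection are all assumptions made for illustration. It keeps the two error messages from the diff above, uses lists so that the result order follows the input order, and returns an empty list (rather than raising) when nothing carries a MultiIndex, as suggested in the first review comment.

```python
def select_dims_to_unstack(requested, all_dims, multiindex_dims, explicit):
    """Hypothetical helper: pick the dims to unstack, in a stable order.

    requested       -- candidate dimension names, in the caller's order
    all_dims        -- every dimension present on the object
    multiindex_dims -- the dimensions that carry a MultiIndex (assumed known)
    explicit        -- True if the caller named the dims, False for the default
    """
    # dict.fromkeys drops duplicates while keeping first-seen order,
    # which a plain set would not guarantee.
    ordered = list(dict.fromkeys(requested))

    missing = [d for d in ordered if d not in all_dims]
    if missing:
        raise ValueError('Dataset does not contain the dimensions: %s'
                         % missing)

    non_multi = [d for d in ordered if d not in multiindex_dims]
    if non_multi and explicit:
        raise ValueError('cannot unstack dimensions that do not '
                         'have a MultiIndex: %s' % non_multi)

    # An empty result means "nothing to unstack"; the caller can return
    # the object unchanged instead of raising.
    return [d for d in ordered if d in multiindex_dims]
```

With the default (non-explicit) call, dimensions without a MultiIndex are skipped silently and the remaining ones come back in the order they were given.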


{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Make `dim` optional on unstack 352677925
415493314 https://github.com/pydata/xarray/pull/2375#issuecomment-415493314 https://api.github.com/repos/pydata/xarray/issues/2375 MDEyOklzc3VlQ29tbWVudDQxNTQ5MzMxNA== jsignell 4806877 2018-08-23T17:02:15Z 2018-08-23T17:02:15Z CONTRIBUTOR

Thanks for the context!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Make `dim` optional on unstack 352677925
415492951 https://github.com/pydata/xarray/pull/2375#issuecomment-415492951 https://api.github.com/repos/pydata/xarray/issues/2375 MDEyOklzc3VlQ29tbWVudDQxNTQ5Mjk1MQ== shoyer 1217238 2018-08-23T17:01:23Z 2018-08-23T17:01:23Z MEMBER

Dataset.transpose accepts *args based on the design of numpy.ndarray.transpose, but that API is probably a mistake (both in NumPy and xarray). Everything else uses an axis/dim argument that can take a scalar or sequence value.

On Thu, Aug 23, 2018 at 9:56 AM Julia Signell notifications@github.com wrote:

I can change it. I guess I was looking at Dataset.transpose: https://github.com/pydata/xarray/blob/master/xarray/core/dataset.py#L2498


{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Make `dim` optional on unstack 352677925
415490818 https://github.com/pydata/xarray/pull/2375#issuecomment-415490818 https://api.github.com/repos/pydata/xarray/issues/2375 MDEyOklzc3VlQ29tbWVudDQxNTQ5MDgxOA== jsignell 4806877 2018-08-23T16:56:07Z 2018-08-23T16:56:07Z CONTRIBUTOR

I can change it. I guess I was looking at Dataset.transpose: https://github.com/pydata/xarray/blob/master/xarray/core/dataset.py#L2498

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Make `dim` optional on unstack 352677925
415486919 https://github.com/pydata/xarray/pull/2375#issuecomment-415486919 https://api.github.com/repos/pydata/xarray/issues/2375 MDEyOklzc3VlQ29tbWVudDQxNTQ4NjkxOQ== shoyer 1217238 2018-08-23T16:46:41Z 2018-08-23T16:46:41Z MEMBER

I chose to use *dims rather than a list of dims so that this change will have a very small impact on people. Most people probably do something like unstack('z') right now, and that will still work.

Usually we prefer to stick to a single argument, but use isinstance checks to support both single dimensions and lists of dimensions, e.g., see how dim is parsed in Dataset.reduce: https://github.com/pydata/xarray/blob/master/xarray/core/dataset.py#L2774-L2779
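As a rough sketch of the single-argument pattern described here (not the actual code in Dataset.reduce), the function below accepts a single name, a sequence of names, or None. The name `normalize_dim` and the `all_dims` parameter are hypothetical, and `str` stands in for the Python 2 `basestring` check on the assumption that this runs under Python 3.

```python
def normalize_dim(dim, all_dims):
    """Hypothetical helper: turn a scalar-or-sequence `dim` into a list."""
    if dim is None:
        # No argument given: fall back to every dimension on the object.
        return list(all_dims)
    if isinstance(dim, str):
        # A single name such as 'z' must not be iterated character by character.
        return [dim]
    # Any other iterable of names (list, tuple, ...) is taken as given.
    return list(dim)
```

Under this scheme an existing call such as unstack('z') keeps working, and unstack(['a', 'b']) or unstack() need no extra positional arguments.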

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Make `dim` optional on unstack 352677925
415473341 https://github.com/pydata/xarray/pull/2375#issuecomment-415473341 https://api.github.com/repos/pydata/xarray/issues/2375 MDEyOklzc3VlQ29tbWVudDQxNTQ3MzM0MQ== jsignell 4806877 2018-08-23T16:06:51Z 2018-08-23T16:06:51Z CONTRIBUTOR

I chose to use *dims rather than a list of dims so that this change will have a very small impact on people. Most people probably do something like unstack('z') right now, and that will still work.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Make `dim` optional on unstack 352677925
415437261 https://github.com/pydata/xarray/pull/2375#issuecomment-415437261 https://api.github.com/repos/pydata/xarray/issues/2375 MDEyOklzc3VlQ29tbWVudDQxNTQzNzI2MQ== jsignell 4806877 2018-08-23T14:29:01Z 2018-08-23T14:29:01Z CONTRIBUTOR

Ok so in this PR I will make unstack accept multiple dims like xr.DataFrame.unstack(*dims). The order of the dims will only be roundtripped if all dims are stacked into one, but I think that is reasonable.

In a follow on PR I will make xarray.label_like(array, other). I think that notation speaks more to what we are trying to convey, but I do think the position of the arguments isn't intuitive.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Make `dim` optional on unstack 352677925
415227870 https://github.com/pydata/xarray/pull/2375#issuecomment-415227870 https://api.github.com/repos/pydata/xarray/issues/2375 MDEyOklzc3VlQ29tbWVudDQxNTIyNzg3MA== shoyer 1217238 2018-08-23T00:06:56Z 2018-08-23T00:06:56Z MEMBER

I think unstack() unstacking all dimensions by default would make sense.

Should we be using xr.full_like in this way?

I'm not really opposed to full_like working this way, but it does look a little strange to my eye. The "full" part of the name doesn't really make sense to me. I would usually suggest using the DataArray constructor here, e.g., xr.DataArray(output_values, flat_input.coords, flat_input.dims, flat_input.attrs).

Maybe we can figure a better way to spell "label these arrays like this template xarray object" that doesn't require referencing flat_input multiple times. Maybe xarray.label_like(array, source) or source.with_data(array)?

Would something like xr.unstack_like be desirable?

I'm not sure that a dedicated function unstack_like would make sense for xarray. This is the sort of helper function that you can write yourself in a couple of lines.
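A minimal sketch of such a user-written helper, built on the DataArray constructor call mentioned above, might look like the following. `label_like` is only the name floated in this thread, not an existing xarray function, and the template array and its coordinates are made up for illustration.

```python
import numpy as np
import xarray as xr


def label_like(values, template):
    """Hypothetical helper: label `values` with the dims, coords and attrs of `template`."""
    return xr.DataArray(values, coords=template.coords,
                        dims=template.dims, attrs=template.attrs)


# Illustrative template; the shapes of `values` and `template` must match.
template = xr.DataArray(
    np.zeros((2, 3)),
    coords={'x': [10, 20], 'y': ['a', 'b', 'c']},
    dims=('x', 'y'),
    attrs={'units': 'm'},
)
labeled = label_like(np.arange(6.0).reshape(2, 3), template)
```

The template is referenced once at the call site, which is the ergonomic point made above; whether the result is then unstacked is left to the caller.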

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Make `dim` optional on unstack 352677925
415201313 https://github.com/pydata/xarray/pull/2375#issuecomment-415201313 https://api.github.com/repos/pydata/xarray/issues/2375 MDEyOklzc3VlQ29tbWVudDQxNTIwMTMxMw== fujiisoup 6815844 2018-08-22T22:22:48Z 2018-08-22T22:22:48Z MEMBER

But maybe it is better to choose the first dim that is MultiIndex rather than the first dim.

The first dimension is not well defined in a Dataset, as its dims are the union of the dims of all the DataArrays it contains. In the following example, ds.unstack()['var'] and ds['var'].unstack() will give different results.

```python
In [15]: import numpy as np
    ...: import xarray as xr
    ...:
    ...: ds = xr.Dataset({'var': (('x', 'y', 'z', 'w'), np.random.randn(2,3,4,5))})
    ...: ds = ds.stack(b=['z', 'w']).stack(a=['x', 'y'])
    ...: ds
    ...:
Out[15]:
<xarray.Dataset>
Dimensions:  (a: 6, b: 20)
Coordinates:
  * b        (b) MultiIndex
  - z        (b) int64 0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
  - w        (b) int64 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4
  * a        (a) MultiIndex
  - x        (a) int64 0 0 0 1 1 1
  - y        (a) int64 0 1 2 0 1 2
Data variables:
    var      (b, a) float64 -1.277 -0.4031 -0.3816 ... 1.398 0.6763 -0.6735

In [16]: list(ds.dims)
Out[16]: ['a', 'b']

In [17]: list(ds['var'].dims)
Out[17]: ['b', 'a']
```

but in that case should we allow passing in multiple dims?

I like this direction. stack accepts multiple pairs of dimensions to be stacked, like ds.stack(a=['x', 'y'], b=['z', 'w']). In this method, it repeatedly calls the _stack_once method. I think unstack can use similar logic.
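The "call a single-dimension step repeatedly" idea can be sketched with a toy model in plain Python. This is not xarray's implementation: dimensions are modeled as a list of names, `levels` maps each stacked dimension to its level names, and both function names are hypothetical. Appending the levels at the end of the dimension list is an assumption of the toy model.

```python
def unstack_once(dims, levels, dim):
    """Toy stand-in for one unstack step: drop `dim`, append its levels."""
    remaining = [d for d in dims if d != dim]
    return remaining + list(levels[dim])


def unstack_many(dims, levels, targets):
    """Apply the single-dimension step once per target, in the order given."""
    result = list(dims)
    for target in targets:
        result = unstack_once(result, levels, target)
    return result


# Mirrors the example above: 'a' stacks (x, y) and 'b' stacks (z, w).
levels = {'a': ['x', 'y'], 'b': ['z', 'w']}
```

In this model the order of `targets` determines the order of the resulting dimensions, which is why a deterministic iteration order over the dimensions matters.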

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Make `dim` optional on unstack 352677925
415099271 https://github.com/pydata/xarray/pull/2375#issuecomment-415099271 https://api.github.com/repos/pydata/xarray/issues/2375 MDEyOklzc3VlQ29tbWVudDQxNTA5OTI3MQ== jsignell 4806877 2018-08-22T16:44:47Z 2018-08-22T17:46:24Z CONTRIBUTOR

we have similar method reset_index. Do we also want to make dim optional?

I don't have an opinion on that except to say that reset_index takes an iter of dims so it is at least slightly different. So to me it seems fine to only make dim optional on unstack.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Make `dim` optional on unstack 352677925
415053454 https://github.com/pydata/xarray/pull/2375#issuecomment-415053454 https://api.github.com/repos/pydata/xarray/issues/2375 MDEyOklzc3VlQ29tbWVudDQxNTA1MzQ1NA== jsignell 4806877 2018-08-22T14:32:51Z 2018-08-22T15:23:27Z CONTRIBUTOR

what should be done if DataArray or Dataset has multiple MultiIndexes. Maybe do we unstack all the MultiIndexes?

I like the idea of unstacking all the MultiIndexes, but in that case should we allow passing in multiple dims? It seems weird to do a recursive unstack in the case of no argument passed without allowing the user to specifically choose multiple dims along which to unstack.

I think it is probably better to just choose a default dim to unstack like this PR does. But maybe it is better to choose the first dim that is MultiIndex rather than the first dim. That way if you do a stack().unstack() you will roundtrip your data since the stacked index gets added to the end of dims. And if you just pass an object with one MultiIndex (probably the most common scenario) unstack will do the right thing. And if you pass an object with multiple MultiIndexes and unstack repeatedly, you will get your original data out.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Make `dim` optional on unstack 352677925
414860789 https://github.com/pydata/xarray/pull/2375#issuecomment-414860789 https://api.github.com/repos/pydata/xarray/issues/2375 MDEyOklzc3VlQ29tbWVudDQxNDg2MDc4OQ== fujiisoup 6815844 2018-08-22T00:01:45Z 2018-08-22T00:01:45Z MEMBER

Thanks, @jsignell.

I like this idea (unstack without explicit dimension names), but I think we may need to decide what API would be the best. My particular concerns are:

  • What should be done if a DataArray or Dataset has multiple MultiIndexes? Maybe we unstack all the MultiIndexes?
  • We have a similar method, reset_index. Do we also want to make dim optional there?

For unstack_like, I'm not sure it is worth adding as a top-level function, as xr.full_like(other, data).unstack() is simple enough...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Make `dim` optional on unstack 352677925


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);