issue_comments
24 rows where issue = 60303760 sorted by updated_at descending
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
751259037 | https://github.com/pydata/xarray/issues/364#issuecomment-751259037 | https://api.github.com/repos/pydata/xarray/issues/364 | MDEyOklzc3VlQ29tbWVudDc1MTI1OTAzNw== | stale[bot] 26384082 | 2020-12-25T14:49:52Z | 2020-12-25T14:49:52Z | NONE | In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity. If this issue remains relevant, please comment here or remove the |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pd.Grouper support? 60303760 | |
347792648 | https://github.com/pydata/xarray/issues/364#issuecomment-347792648 | https://api.github.com/repos/pydata/xarray/issues/364 | MDEyOklzc3VlQ29tbWVudDM0Nzc5MjY0OA== | shoyer 1217238 | 2017-11-29T08:51:19Z | 2017-11-29T08:51:19Z | MEMBER | Well, the functionality is still there, it's just recommended that you use pd.Grouper. On Wed, Nov 29, 2017 at 2:47 AM lexual notifications@github.com wrote:
|
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pd.Grouper support? 60303760 | |
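[Editorial note: since `pd.TimeGrouper` was removed from pandas, the grouping discussed in this thread is spelled with `pd.Grouper`. A minimal pandas-only sketch (not from the thread):]

```python
import pandas as pd

# pd.TimeGrouper is gone in modern pandas; pd.Grouper(freq=...) is the
# replacement for time-based grouping.
s = pd.Series(range(6), index=pd.date_range('2017-01-01', periods=6, freq='12h'))
daily = s.groupby(pd.Grouper(freq='D')).mean()
print(daily.tolist())  # one mean per calendar day -> [0.5, 2.5, 4.5]
```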
347735817 | https://github.com/pydata/xarray/issues/364#issuecomment-347735817 | https://api.github.com/repos/pydata/xarray/issues/364 | MDEyOklzc3VlQ29tbWVudDM0NzczNTgxNw== | lexual 410907 | 2017-11-29T02:47:44Z | 2017-11-29T02:47:44Z | NONE | pd.TimeGrouper is deprecated in latest pandas release, so I imagine this bug should be closed. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pd.Grouper support? 60303760 | |
341614052 | https://github.com/pydata/xarray/issues/364#issuecomment-341614052 | https://api.github.com/repos/pydata/xarray/issues/364 | MDEyOklzc3VlQ29tbWVudDM0MTYxNDA1Mg== | shoyer 1217238 | 2017-11-03T03:13:14Z | 2017-11-03T03:13:14Z | MEMBER | Have you tried iterating over a resample object in the v0.10 release candidate? I believe the new resample API supports iteration. On Thu, Nov 2, 2017 at 5:40 PM hazbottles notifications@github.com wrote:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pd.Grouper support? 60303760 | |
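[Editorial note: shoyer's suggestion above concerns xarray's resample object; the analogous pandas pattern, iterating a Resampler to get (bin label, chunk) pairs, looks like this (a sketch, not code from the thread):]

```python
import pandas as pd

s = pd.Series(range(6), index=pd.date_range('2017-01-01', periods=6, freq='12h'))

# A pandas Resampler is iterable, yielding (bin_label, sub_series) pairs,
# much like iterating an xarray resample/groupby object.
for label, chunk in s.resample('D'):
    print(label.date(), chunk.tolist())
```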
341598412 | https://github.com/pydata/xarray/issues/364#issuecomment-341598412 | https://api.github.com/repos/pydata/xarray/issues/364 | MDEyOklzc3VlQ29tbWVudDM0MTU5ODQxMg== | hazbottles 14136435 | 2017-11-03T00:40:14Z | 2017-11-03T00:40:39Z | CONTRIBUTOR | Hi, being able to pass a `pd.TimeGrouper`:

```python
import pandas as pd
import xarray as xr

dates = pd.DatetimeIndex(['2017-01-01 15:00', '2017-01-02 14:00', '2017-01-02 23:00'])
da = xr.DataArray([1, 2, 3], dims=['time'], coords={'time': dates})
time_grouper = pd.TimeGrouper(freq='24h', base=15)
```

digging around the source code for `xr.DataArray.resample` i found this

```python
grouped = xr.core.groupby.DataArrayGroupBy(da, 'time', grouper=time_grouper)
for _, sub_da in grouped:
    print(sub_da)
```

which prints:
Would it be possible to add a |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pd.Grouper support? 60303760 | |
231021167 | https://github.com/pydata/xarray/issues/364#issuecomment-231021167 | https://api.github.com/repos/pydata/xarray/issues/364 | MDEyOklzc3VlQ29tbWVudDIzMTAyMTE2Nw== | saulomeirelles 7504461 | 2016-07-07T08:54:46Z | 2016-07-07T08:59:15Z | NONE | Thanks, @shoyer ! Here is an example of how I circumvented the problem:
In my case, the |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pd.Grouper support? 60303760 | |
230935548 | https://github.com/pydata/xarray/issues/364#issuecomment-230935548 | https://api.github.com/repos/pydata/xarray/issues/364 | MDEyOklzc3VlQ29tbWVudDIzMDkzNTU0OA== | shoyer 1217238 | 2016-07-06T23:15:27Z | 2016-07-06T23:15:56Z | MEMBER | @saulomeirelles Nope, this hasn't been added yet, beyond what you can do with the current |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pd.Grouper support? 60303760 | |
228723336 | https://github.com/pydata/xarray/issues/364#issuecomment-228723336 | https://api.github.com/repos/pydata/xarray/issues/364 | MDEyOklzc3VlQ29tbWVudDIyODcyMzMzNg== | saulomeirelles 7504461 | 2016-06-27T11:45:09Z | 2016-06-27T11:45:09Z | NONE | This is a very useful functionality. I am wondering if I can specify the time window, for example, like |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pd.Grouper support? 60303760 | |
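[Editorial note: in current pandas, the `base=` argument seen earlier in this thread has been replaced by `offset=`/`origin=`. A hedged sketch of 24-hour bins anchored at 15:00, reusing the example timestamps from this thread:]

```python
import pandas as pd

idx = pd.DatetimeIndex(['2017-01-01 15:00', '2017-01-02 14:00', '2017-01-02 23:00'])
s = pd.Series([1, 2, 3], index=idx)

# 24h bins shifted to start at 15:00 instead of midnight
# (offset= replaces the deprecated base= argument).
binned = s.groupby(pd.Grouper(freq='24h', offset='15h')).sum()
print(binned.tolist())  # [3, 3]: {1, 2} fall in the first bin, {3} in the second
```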
78239807 | https://github.com/pydata/xarray/issues/364#issuecomment-78239807 | https://api.github.com/repos/pydata/xarray/issues/364 | MDEyOklzc3VlQ29tbWVudDc4MjM5ODA3 | naught101 167164 | 2015-03-11T10:38:05Z | 2015-03-11T10:38:05Z | NONE | Ah, yep, making the dimension using |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pd.Grouper support? 60303760 | |
78214797 | https://github.com/pydata/xarray/issues/364#issuecomment-78214797 | https://api.github.com/repos/pydata/xarray/issues/364 | MDEyOklzc3VlQ29tbWVudDc4MjE0Nzk3 | shoyer 1217238 | 2015-03-11T07:06:57Z | 2015-03-11T07:06:57Z | MEMBER | The problem is that you've created a new
Also, unlike pandas, xray currently does the core loop for all groupby operations in pure Python, which means that yes, it will be slow when you have a very large number of groups (and it loops again to handle your 15 different variables). Using something like Cython or Numba to speed up groupby operations is on my to-do list, but I've found this to be less of a barrier than you might expect for multi-dimensional datasets -- individual group members tend to include more elements than in DataFrames. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pd.Grouper support? 60303760 | |
78211171 | https://github.com/pydata/xarray/issues/364#issuecomment-78211171 | https://api.github.com/repos/pydata/xarray/issues/364 | MDEyOklzc3VlQ29tbWVudDc4MjExMTcx | naught101 167164 | 2015-03-11T06:17:10Z | 2015-03-11T06:17:10Z | NONE | Ok, weird. That example works for me, but even if I take a really short slice of my data set, the same thing won't work:

```
In [61]: d = data.sel(time=slice('2002-01-01','2002-01-03')); d
Out[61]:
<xray.Dataset>
Dimensions:           (time: 143, timeofday: 70128, x: 1, y: 1, z: 1)
Coordinates:
  * x                 (x) >f8 1.0
  * y                 (y) >f8 1.0
  * z                 (z) >f8 1.0
  * time              (time) datetime64[ns] 2002-01-01T00:30:00 ...
  * timeofday         (timeofday) timedelta64[ns] 1800000000000 nanoseconds ...
Data variables:
    SWdown            (time, y, x) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 14.58 ...
    Rainf_qc          (time, y, x) float64 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 ...
    SWdown_qc         (time, y, x) float64 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 ...
    Tair              (time, z, y, x) float64 282.9 282.9 282.7 282.6 282.4 281.7 281.0 ...
    Tair_qc           (time, y, x) float64 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 ...
    LWdown            (time, y, x) float64 296.7 297.3 297.3 297.3 297.2 295.9 294.5 ...
    PSurf_qc          (time, y, x) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
    latitude          (y, x) float64 -35.66
    Wind              (time, z, y, x) float64 2.2 2.188 1.9 2.2 2.5 2.5 2.5 2.25 2.0 2.35 ...
    LWdown_qc         (time, y, x) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
    Rainf             (time, y, x) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
    Qair_qc           (time, y, x) float64 1.0 0.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 ...
    longitude         (y, x) float64 148.2
    PSurf             (time, y, x) float64 8.783e+04 8.783e+04 8.782e+04 8.781e+04 ...
    reference_height  (y, x) float64 70.0
    elevation         (y, x) float64 1.2e+03
    Qair              (time, z, y, x) float64 0.00448 0.004608 0.004692 0.004781 ...
    Wind_qc           (time, y, x) float64 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 ...
Attributes:
    Production_time: 2012-09-27 12:44:42
    Production_source: PALS automated netcdf conversion
    Contact: palshelp@gmail.com
    PALS_fluxtower_template_version: 1.0.2
    PALS_dataset_name: TumbaFluxnet
    PALS_dataset_version: 1.4

In [62]: d.groupby('timeofday').mean('time')
```

That last command will not complete - it will run for minutes. Not really sure how to debug that behaviour. Perhaps it's to do with the long/lat/height variables that really should be coordinates (I'm just using the data as it came, but I can clean that, if necessary) |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pd.Grouper support? 60303760 | |
78192774 | https://github.com/pydata/xarray/issues/364#issuecomment-78192774 | https://api.github.com/repos/pydata/xarray/issues/364 | MDEyOklzc3VlQ29tbWVudDc4MTkyNzc0 | shoyer 1217238 | 2015-03-11T03:08:31Z | 2015-03-11T03:08:31Z | MEMBER | I don't think the timeofday issue is related to using Timedeltas in the index (and it's certainly not related to the Here's an example that seems to be working properly (except for uselessly displaying timedeltas in nanoseconds):

```
In [29]: time = pd.date_range('2000-01-01', freq='H', periods=100)

In [30]: daystart = time.to_period(freq='1D').to_datetime()

In [31]: timeofday = time.values - daystart.values

In [32]: ds = xray.Dataset({'data': ('time', range(100))},
                           {'time': time, 'timeofday': ('time', timeofday)})

In [33]: ds
Out[33]:
<xray.Dataset>
Dimensions:    (time: 100)
Coordinates:
    timeofday  (time) timedelta64[ns] 0 nanoseconds ...
  * time       (time) datetime64[ns] 2000-01-01 2000-01-01T01:00:00 ...
Data variables:
    data       (time) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 ...

In [34]: ds.groupby('timeofday').mean('time')
Out[34]:
<xray.Dataset>
Dimensions:    (timeofday: 24)
Coordinates:
  * timeofday  (timeofday) timedelta64[ns] 0 nanoseconds ...
Data variables:
    data       (timeofday) float64 48.0 49.0 50.0 51.0 40.0 41.0 42.0 43.0 44.0 45.0 46.0 ...
``` |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pd.Grouper support? 60303760 | |
78191526 | https://github.com/pydata/xarray/issues/364#issuecomment-78191526 | https://api.github.com/repos/pydata/xarray/issues/364 | MDEyOklzc3VlQ29tbWVudDc4MTkxNTI2 | naught101 167164 | 2015-03-11T03:00:03Z | 2015-03-11T03:00:03Z | NONE | same problem with |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pd.Grouper support? 60303760 | |
78036587 | https://github.com/pydata/xarray/issues/364#issuecomment-78036587 | https://api.github.com/repos/pydata/xarray/issues/364 | MDEyOklzc3VlQ29tbWVudDc4MDM2NTg3 | naught101 167164 | 2015-03-10T11:30:10Z | 2015-03-10T11:30:10Z | NONE | Dunno if this is related to the |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pd.Grouper support? 60303760 | |
78008962 | https://github.com/pydata/xarray/issues/364#issuecomment-78008962 | https://api.github.com/repos/pydata/xarray/issues/364 | MDEyOklzc3VlQ29tbWVudDc4MDA4OTYy | naught101 167164 | 2015-03-10T07:51:45Z | 2015-03-10T07:51:45Z | NONE | Nice. Ok, I have hit a stumbling block, and this is much more of a support request, so feel free to direct me else where, but since we're on the topic, I want to do something like:
where The assignment of |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pd.Grouper support? 60303760 | |
77984506 | https://github.com/pydata/xarray/issues/364#issuecomment-77984506 | https://api.github.com/repos/pydata/xarray/issues/364 | MDEyOklzc3VlQ29tbWVudDc3OTg0NTA2 | shoyer 1217238 | 2015-03-10T02:21:37Z | 2015-03-10T02:21:37Z | MEMBER | Hmm. However, it should work in pandas -- you can do

```
In [13]: t = pd.date_range('2000-01-01', periods=10000, freq='H')

In [14]: t.time
Out[14]:
array([datetime.time(0, 0), datetime.time(1, 0), datetime.time(2, 0), ...,
       datetime.time(13, 0), datetime.time(14, 0), datetime.time(15, 0)], dtype=object)
```

The simplest way to do timeofday, though, is probably just to calculate |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pd.Grouper support? 60303760 | |
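[Editorial note: shoyer's comment above is cut off mid-sentence, but it concerns computing a time-of-day value from a datetime index. One way to derive it as a timedelta since midnight in pandas (a sketch under that assumption, not shoyer's exact code):]

```python
import pandas as pd

t = pd.Series(pd.date_range('2000-01-01', periods=48, freq='h'))

# Timedelta since midnight; dt.normalize() floors each timestamp to 00:00,
# so the difference is the time of day. Groupable, unlike datetime.time objects.
timeofday = t - t.dt.normalize()
print(timeofday.nunique())  # 24 distinct times of day over two full days
```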
77978458 | https://github.com/pydata/xarray/issues/364#issuecomment-77978458 | https://api.github.com/repos/pydata/xarray/issues/364 | MDEyOklzc3VlQ29tbWVudDc3OTc4NDU4 | naught101 167164 | 2015-03-10T01:16:25Z | 2015-03-10T01:16:25Z | NONE | Ah, cool, thanks for that link, I missed that in the docs. One thing that would be nice (in both pandas and xray) is a |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pd.Grouper support? 60303760 | |
77898592 | https://github.com/pydata/xarray/issues/364#issuecomment-77898592 | https://api.github.com/repos/pydata/xarray/issues/364 | MDEyOklzc3VlQ29tbWVudDc3ODk4NTky | shoyer 1217238 | 2015-03-09T17:16:14Z | 2015-03-09T17:16:14Z | MEMBER | For pandas resample, see here: http://pandas.pydata.org/pandas-docs/stable/timeseries.html#up-and-downsampling The doc string could definitely use an update there, too -- see https://github.com/pydata/pandas/issues/5023 (I think I'll try to update this, too) For I'm going to consolidate all the time/date functionality into a new documentation page for the next release of xray, since this is kind of all over the place now. Also, I should probably break up that monolithic page on "Data structures", perhaps into "Basics" and "Advanced" pages. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pd.Grouper support? 60303760 | |
77824657 | https://github.com/pydata/xarray/issues/364#issuecomment-77824657 | https://api.github.com/repos/pydata/xarray/issues/364 | MDEyOklzc3VlQ29tbWVudDc3ODI0NjU3 | naught101 167164 | 2015-03-09T09:46:15Z | 2015-03-09T09:46:15Z | NONE | Heh, I meant the pandas docs - they don't specify the
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pd.Grouper support? 60303760 | |
77818399 | https://github.com/pydata/xarray/issues/364#issuecomment-77818399 | https://api.github.com/repos/pydata/xarray/issues/364 | MDEyOklzc3VlQ29tbWVudDc3ODE4Mzk5 | shoyer 1217238 | 2015-03-09T08:54:34Z | 2015-03-09T08:54:34Z | MEMBER | Indeed, I need to complete the For your other use case, you just want to group by |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pd.Grouper support? 60303760 | |
77810787 | https://github.com/pydata/xarray/issues/364#issuecomment-77810787 | https://api.github.com/repos/pydata/xarray/issues/364 | MDEyOklzc3VlQ29tbWVudDc3ODEwNzg3 | naught101 167164 | 2015-03-09T07:34:49Z | 2015-03-09T07:34:49Z | NONE | Unfortunately I'm not familiar enough with pd.resample and pd.TimeGrouper to know the difference in what they can do. One thing that I would like to be able to do that is not covered by resample, and might be covered by TimeGrouper, is to group over month only (not month and year), in order to create a plot of mean seasonal cycle (at monthly resolution), or similarly, a daily cycle at hourly resolution. I haven't figured out if I can do that with TimeGrouper yet though. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pd.Grouper support? 60303760 | |
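[Editorial note: the month-only grouping naught101 asks for above is what xarray now exposes as `groupby('time.month')`. The equivalent pandas sketch, grouping by calendar month across years:]

```python
import numpy as np
import pandas as pd

t = pd.date_range('2000-01-01', '2001-12-31', freq='D')
s = pd.Series(np.arange(len(t), dtype=float), index=t)

# Group by calendar month (not month-and-year): 12 groups, i.e. a mean
# seasonal cycle at monthly resolution.
cycle = s.groupby(s.index.month).mean()
print(len(cycle))  # 12
```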
77808372 | https://github.com/pydata/xarray/issues/364#issuecomment-77808372 | https://api.github.com/repos/pydata/xarray/issues/364 | MDEyOklzc3VlQ29tbWVudDc3ODA4Mzcy | shoyer 1217238 | 2015-03-09T07:01:11Z | 2015-03-09T07:01:11Z | MEMBER | Well, I guess the first question is -- are there uses for TimeGrouper that you can't easily do with resample? I suppose the simplest (no new method) would be to allow passing a dict where the key is the time dimension and the value is the grouper. Something like |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pd.Grouper support? 60303760 | |
77807590 | https://github.com/pydata/xarray/issues/364#issuecomment-77807590 | https://api.github.com/repos/pydata/xarray/issues/364 | MDEyOklzc3VlQ29tbWVudDc3ODA3NTkw | naught101 167164 | 2015-03-09T06:49:55Z | 2015-03-09T06:49:55Z | NONE | Looks good to me. I don't know enough to be able to comment on the API question. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pd.Grouper support? 60303760 | |
77806318 | https://github.com/pydata/xarray/issues/364#issuecomment-77806318 | https://api.github.com/repos/pydata/xarray/issues/364 | MDEyOklzc3VlQ29tbWVudDc3ODA2MzE4 | shoyer 1217238 | 2015-03-09T06:31:14Z | 2015-03-09T06:31:28Z | MEMBER | I wrote a resample function last week based on TimeGrouper. See the dev docs for more details: http://xray.readthedocs.org/en/latest/whats-new.html This should go out in the 0.4.1 release, which I'd like to get out later this week (everyone likes faster release cycles if they are backwards compatible). It would be pretty straightforward to create some sort of API that gives direct access to the resulting GroupBy object. I was considering something like |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pd.Grouper support? 60303760 |