
issue_comments


18 rows where user = 102827 sorted by updated_at descending




issue 7

  • Speed up `decode_cf_datetime` 7
  • interpolate_na with limit argument changes size of chunks 4
  • `decode_cf_datetime()` slow because `pd.to_timedelta()` is slow if floats are passed 2
  • HDF5 error when working with compressed NetCDF files and the dask multiprocessing scheduler 2
  • DataArray.rolling() does not preserve chunksizes in some cases 1
  • [WIP] Fix problem with wrong chunksizes when using rolling_window on dask.array 1
  • Improving documentation on `apply_ufunc` 1

user 1

  • cchwala · 18

author_association 1

  • CONTRIBUTOR 18
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
896827548 https://github.com/pydata/xarray/issues/2808#issuecomment-896827548 https://api.github.com/repos/pydata/xarray/issues/2808 IC_kwDOAMm_X841dICc cchwala 102827 2021-08-11T13:28:08Z 2021-08-11T13:28:08Z CONTRIBUTOR

Thanks @keewis for linking the new tutorial. It helped me a lot in figuring out how to use apply_ufunc for my 1D case. The fact that the tutorial shows the "typical" error messages you get when trying to use it makes it really nice to follow.
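
A minimal sketch of the kind of 1D apply_ufunc usage referred to here; the function and dimension names are illustrative and not taken from the tutorial or this comment:

```python
# Minimal sketch: wrap a 1D NumPy function with xr.apply_ufunc.
# detrend_1d and the dimension names are illustrative assumptions.
import numpy as np
import xarray as xr

def detrend_1d(arr):
    """Toy 1D function: remove the mean along the last axis."""
    return arr - arr.mean(axis=-1, keepdims=True)

da = xr.DataArray(np.random.rand(3, 100), dims=("sensor", "time"))

result = xr.apply_ufunc(
    detrend_1d,
    da,
    input_core_dims=[["time"]],   # the 1D function operates along "time"
    output_core_dims=[["time"]],  # and returns an array that still has "time"
)
print(result.dims)  # ('sensor', 'time')
```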

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 1,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Improving documentation on `apply_ufunc` 420584430
434966059 https://github.com/pydata/xarray/pull/2532#issuecomment-434966059 https://api.github.com/repos/pydata/xarray/issues/2532 MDEyOklzc3VlQ29tbWVudDQzNDk2NjA1OQ== cchwala 102827 2018-11-01T08:13:48Z 2018-11-01T08:13:48Z CONTRIBUTOR

Yes, tests are still failing. The PR is WIP. I just wanted to open the PR now to have the discussion here instead of in the issues.

I will work on fixing the code to pass all current tests. I will also check how the rechunking affects performance.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  [WIP] Fix problem with wrong chunksizes when using rolling_window on dask.array 376162232
433454137 https://github.com/pydata/xarray/issues/2514#issuecomment-433454137 https://api.github.com/repos/pydata/xarray/issues/2514 MDEyOklzc3VlQ29tbWVudDQzMzQ1NDEzNw== cchwala 102827 2018-10-26T15:49:20Z 2018-10-31T21:14:48Z CONTRIBUTOR

EDIT: The issue described in this post has now been split out into #2531

I think I have a fix, but wanted to write some failing tests before committing the changes. While doing this, I discovered that DataArray.rolling() also does not preserve the chunk sizes, apparently depending on the applied method.

```python
import pandas as pd
import numpy as np
import xarray as xr

t = pd.date_range(start='2018-01-01', end='2018-02-01', freq='H')
bar = np.sin(np.arange(len(t)))
baz = np.cos(np.arange(len(t)))

da_test = xr.DataArray(data=np.stack([bar, baz]),
                       coords={'time': t, 'sensor': ['one', 'two']},
                       dims=('sensor', 'time'))

print(da_test.chunk({'time': 100}).rolling(time=60).mean().chunks)
print(da_test.chunk({'time': 100}).rolling(time=60).count().chunks)
```

Output for mean: `((2,), (745,))`
Output for count: `((2,), (100, 100, 100, 100, 100, 100, 100, 45))`
Desired output: `((2,), (100, 100, 100, 100, 100, 100, 100, 45))`

My fix solves my initial problem, but, if done correctly, it should perhaps also solve this bug.

Any idea why this depends on whether .mean() or .count() is used?

I have already pushed some WIP changes. Should I open a PR already, even though most of the new tests still fail?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  interpolate_na with limit argument changes size of chunks 374279704
434843563 https://github.com/pydata/xarray/issues/2531#issuecomment-434843563 https://api.github.com/repos/pydata/xarray/issues/2531 MDEyOklzc3VlQ29tbWVudDQzNDg0MzU2Mw== cchwala 102827 2018-10-31T20:52:49Z 2018-10-31T20:52:49Z CONTRIBUTOR

The cause has been explained by @fujiisoup here https://github.com/pydata/xarray/issues/2514#issuecomment-433528586

Nice catch!

For some historical reasons, mean and some other reduction methods use bottleneck by default, while count does not.

mean goes through this function (xarray/core/dask_array_ops.py, line 23 at b622c5e):

def dask_rolling_wrapper(moving_func, a, window, min_count=None, axis=-1):

It looks like there is another bug in this function.
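
A minimal sketch of a possible user-side workaround (not the fix from this PR): rechunk the result of the rolling reduction back to the intended chunk size.

```python
# Sketch of a workaround (assumption, not the PR's fix): explicitly restore the
# chunking after a rolling reduction that collapses it into a single chunk.
import numpy as np
import pandas as pd
import xarray as xr

t = pd.date_range(start='2018-01-01', end='2018-02-01', freq='H')
da = xr.DataArray(np.sin(np.arange(len(t))), coords={'time': t}, dims='time').chunk({'time': 100})

rolled = da.rolling(time=60).mean()   # may come back as one chunk along "time"
rolled = rolled.chunk({'time': 100})  # rechunk to the intended chunk size
print(rolled.chunks)
```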

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  DataArray.rolling() does not preserve chunksizes in some cases 376154741
433992180 https://github.com/pydata/xarray/issues/2514#issuecomment-433992180 https://api.github.com/repos/pydata/xarray/issues/2514 MDEyOklzc3VlQ29tbWVudDQzMzk5MjE4MA== cchwala 102827 2018-10-29T17:01:12Z 2018-10-29T17:01:12Z CONTRIBUTOR

@dcherian Okay. A WIP PR will follow, but might take some days.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  interpolate_na with limit argument changes size of chunks 374279704
433369567 https://github.com/pydata/xarray/issues/2514#issuecomment-433369567 https://api.github.com/repos/pydata/xarray/issues/2514 MDEyOklzc3VlQ29tbWVudDQzMzM2OTU2Nw== cchwala 102827 2018-10-26T10:53:32Z 2018-10-26T10:53:32Z CONTRIBUTOR

Thanks @fujiisoup for the quick response and the pointers. I will have a look and report back if a PR is within my capabilities or not.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  interpolate_na with limit argument changes size of chunks 374279704
433346685 https://github.com/pydata/xarray/issues/2514#issuecomment-433346685 https://api.github.com/repos/pydata/xarray/issues/2514 MDEyOklzc3VlQ29tbWVudDQzMzM0NjY4NQ== cchwala 102827 2018-10-26T09:27:19Z 2018-10-26T09:27:19Z CONTRIBUTOR

The problem seems to occur here

https://github.com/pydata/xarray/blob/5940100761478604080523ebb1291ecff90e779e/xarray/core/missing.py#L368-L376

because of the usage of .construct(). A quick try without it shows that the chunk size is then preserved.

Hence, .construct() might need a fix to deal correctly with the chunks of dask arrays.
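
A minimal illustration (not the missing.py code linked above) of how one can check whether rolling(...).construct() preserves the dask chunking of its input:

```python
# Sketch: compare the dask chunking of a DataArray before and after
# rolling(...).construct(). Names and sizes here are illustrative.
import numpy as np
import pandas as pd
import xarray as xr

t = pd.date_range(start='2018-01-01', end='2018-02-01', freq='H')
da = xr.DataArray(np.arange(len(t), dtype=float), coords={'time': t}, dims='time').chunk({'time': 100})

print(da.chunks)                                   # chunking of the input along "time"
windowed = da.rolling(time=3).construct('window')  # adds a new "window" dimension
print(windowed.chunks)                             # compare the "time" chunking with the input's
```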

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  interpolate_na with limit argument changes size of chunks 374279704
361532119 https://github.com/pydata/xarray/issues/1836#issuecomment-361532119 https://api.github.com/repos/pydata/xarray/issues/1836 MDEyOklzc3VlQ29tbWVudDM2MTUzMjExOQ== cchwala 102827 2018-01-30T09:32:26Z 2018-01-30T09:32:26Z CONTRIBUTOR

Thanks @jhamman for looking into this.

Currently I am fine with using persist(), since I can break my analysis workflow down into time periods whose data fits into RAM on a large machine. As I have written, the distributed scheduler failed for me because of #1464, but I would like to use it in the future. From other discussions on the dask schedulers (here or on SO), using the distributed scheduler seems to be the general recommendation anyway.

In summary, I am fine with my current workaround. I do not think that solving this issue has a high priority, particularly as the distributed scheduler is further improved. The main annoyance was tracking down the problem described in my first post. Hence, maybe the limitations of the schedulers could be described a bit better in the documentation. Would you like a PR on this?
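
A minimal sketch of the persist() workaround described above; the file name, chunking, and time period are made up for illustration:

```python
# Sketch of the workaround (assumed workflow, not from the issue): open the data
# lazily, persist a time slice that fits into RAM, then reduce it.
import xarray as xr

ds = xr.open_dataset('data.nc', chunks={'time': 1000})              # lazy, dask-backed
subset = ds.sel(time=slice('2017-01-01', '2017-03-31')).persist()   # materialize this period once
result = subset.mean('time').compute()
```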

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  HDF5 error when working with compressed NetCDF files and the dask multiprocessing scheduler 289342234
358445479 https://github.com/pydata/xarray/issues/1836#issuecomment-358445479 https://api.github.com/repos/pydata/xarray/issues/1836 MDEyOklzc3VlQ29tbWVudDM1ODQ0NTQ3OQ== cchwala 102827 2018-01-17T21:07:43Z 2018-01-17T21:07:43Z CONTRIBUTOR

Thanks for the quick answer.

The problem is that my actual use case also involves writing an xarray.Dataset back via to_netcdf(). I left this out of the example above to isolate the problem. With the distributed scheduler and to_netcdf(), I ran into issue #1464. As far as I can see, this might be fixed "soon" (#1793).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  HDF5 error when working with compressed NetCDF files and the dask multiprocessing scheduler 289342234
317786250 https://github.com/pydata/xarray/pull/1414#issuecomment-317786250 https://api.github.com/repos/pydata/xarray/issues/1414 MDEyOklzc3VlQ29tbWVudDMxNzc4NjI1MA== cchwala 102827 2017-07-25T16:03:46Z 2017-07-25T16:03:46Z CONTRIBUTOR

@jhamman @shoyer This should be ready to merge.

Should I open an xarray issue about the bug in pandas.to_timedelta(), or is it enough to have the issue I submitted for pandas? I think the bug will be resolved in xarray once it is resolved in pandas, because then the overflow check here should catch the cases I discovered.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Speed up `decode_cf_datetime` 229807027
316963228 https://github.com/pydata/xarray/pull/1414#issuecomment-316963228 https://api.github.com/repos/pydata/xarray/issues/1414 MDEyOklzc3VlQ29tbWVudDMxNjk2MzIyOA== cchwala 102827 2017-07-21T10:10:54Z 2017-07-21T10:10:54Z CONTRIBUTOR

hmm... it's still complicated. To avoid the NaTs in my code, I tried to extend the current overflow check so that it switches to _decode_datetime_with_netcdf4() earlier. This was my attempt:

```python
(pd.to_timedelta(flat_num_dates.min(), delta) - pd.to_timedelta(1, 'd') + ref_date)
(pd.to_timedelta(flat_num_dates.max(), delta) + pd.to_timedelta(1, 'd') + ref_date)
```

But unfortunately, as shown in my notebook above, pandas.to_timedelta() has a bug and does not detect the overflow in those esoteric cases that I have identified... I have filed this issue, pandas-dev/pandas/issues/17037, because it should be solved there.

Since I do not think this will be fixed soon (I would gladly look at it, but have no time and probably not enough knowledge about the pandas core stuff), I am not sure what to do.

Do you want to merge this PR, knowing that there still is the overflow issue that was in the code before? Or should I continue to try to fix the current overflow bug in this PR?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Speed up `decode_cf_datetime` 229807027
315643209 https://github.com/pydata/xarray/pull/1414#issuecomment-315643209 https://api.github.com/repos/pydata/xarray/issues/1414 MDEyOklzc3VlQ29tbWVudDMxNTY0MzIwOQ== cchwala 102827 2017-07-16T22:41:50Z 2017-07-16T22:41:50Z CONTRIBUTOR

...but wait. The NaTs that my code produces beyond the int64 overflow should be valid dates, produced using _decode_datetime_with_netcdf4, right?

Hence, I should still add a check for NaT results and then fall back to the netCDF version.
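
A minimal sketch of that NaT-check-and-fallback idea; this is an assumption about the intended logic, not the PR's actual code, and the two decoder callables are hypothetical stand-ins for the pandas-based and netCDF4-based paths:

```python
# Sketch (assumption, not the PR code): if the fast decoding path produced NaT,
# redo the decoding with the netCDF4-based path.
import numpy as np

def decode_with_fallback(flat_num_dates, units, calendar, fast_decode, netcdf4_decode):
    # fast_decode / netcdf4_decode: hypothetical callables for the two decoding paths
    dates = fast_decode(flat_num_dates, units, calendar)
    if np.isnat(dates).any():
        # at least one value overflowed the fast path; fall back for the whole array
        dates = netcdf4_decode(flat_num_dates, units, calendar)
    return dates
```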

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Speed up `decode_cf_datetime` 229807027
315637844 https://github.com/pydata/xarray/pull/1414#issuecomment-315637844 https://api.github.com/repos/pydata/xarray/issues/1414 MDEyOklzc3VlQ29tbWVudDMxNTYzNzg0NA== cchwala 102827 2017-07-16T21:15:04Z 2017-07-16T21:34:12Z CONTRIBUTOR

@jhamman - I found some differences between the old code in master and my code when decoding values close to the np.datetime64 overflow. My code produces NaT where the old code returned some date.

First, I wanted to test and fix that. However, I may have found that the old implementation did not behave correctly when crossing the "overflow" line just slightly.

I have summed that up in a notebook here.

My conclusion would be that the code in this PR is not only faster but also more correct than the old one. However, since it is quite late in the evening and my head needs some rest, I would like to get a second (or third) opinion...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Speed up `decode_cf_datetime` 229807027
315322859 https://github.com/pydata/xarray/pull/1414#issuecomment-315322859 https://api.github.com/repos/pydata/xarray/issues/1414 MDEyOklzc3VlQ29tbWVudDMxNTMyMjg1OQ== cchwala 102827 2017-07-14T10:05:04Z 2017-07-14T10:05:04Z CONTRIBUTOR

@jhamman - Sorry, I was away from the office (and everything related to work) for more than a month and had to catch up on a lot of things. I will sum up my stuff and post it here, hopefully after today's lunch break.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Speed up `decode_cf_datetime` 229807027
305469383 https://github.com/pydata/xarray/pull/1414#issuecomment-305469383 https://api.github.com/repos/pydata/xarray/issues/1414 MDEyOklzc3VlQ29tbWVudDMwNTQ2OTM4Mw== cchwala 102827 2017-06-01T11:43:27Z 2017-06-01T11:43:27Z CONTRIBUTOR

Just a short notice: sorry for the delay. I am still working on this PR, but I am too busy right now to finish the overflow testing. I think I have found some edge cases which have to be handled. I will provide more details soon.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Speed up `decode_cf_datetime` 229807027
302943727 https://github.com/pydata/xarray/pull/1414#issuecomment-302943727 https://api.github.com/repos/pydata/xarray/issues/1414 MDEyOklzc3VlQ29tbWVudDMwMjk0MzcyNw== cchwala 102827 2017-05-21T15:28:15Z 2017-05-21T15:28:15Z CONTRIBUTOR

Thanks @shoyer and @jhamman for the feedback. I will change things accordingly.

Concerning tests, I will think again about additional checks for the correct handling of overflow. I must admit that I am not 100% sure that every case is handled correctly by the current code and checked by the current tests. I will have to think about it a little when I find time within the next few days...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Speed up `decode_cf_datetime` 229807027
300072972 https://github.com/pydata/xarray/issues/1399#issuecomment-300072972 https://api.github.com/repos/pydata/xarray/issues/1399 MDEyOklzc3VlQ29tbWVudDMwMDA3Mjk3Mg== cchwala 102827 2017-05-09T06:26:36Z 2017-05-09T06:26:36Z CONTRIBUTOR

Okay. I will try to come up with a PR within the next days.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  `decode_cf_datetime()` slow because `pd.to_timedelta()` is slow if floats are passed 226549366
299819380 https://github.com/pydata/xarray/issues/1399#issuecomment-299819380 https://api.github.com/repos/pydata/xarray/issues/1399 MDEyOklzc3VlQ29tbWVudDI5OTgxOTM4MA== cchwala 102827 2017-05-08T09:32:58Z 2017-05-08T09:32:58Z CONTRIBUTOR

Hmm... The "nanosecond"-issue seems to need a fix very much at the foundation. As long as pandas and xarray rely on datetime64[ns] you cannot avoid nanoseconds, right? pd.to_datetime() forces the conversion to nanoscends even if you pass integers but for a time unit different to ns. This does not make me as nervous as Fabien since my data is always quite recent, but I see that this is far from ideal for a tool for climate scientists.

An intermediate fix (@shoyer, do you actually want one?) that I could think of for the performance issue right now would be to do the conversion to datetime64[ns] depending on the time unit, e.g.

  • multiply raw values (most likely floats) with number of nanoseconds in time unit for units smaller then days (or hours?) and use these values as integers in pd.to_datetime()
  • else, fall back to using netCDF4/netcdftime for months and years (as suggested by shoyer) casting the raw values to floats

The only thing that bothers me is that I am not sure if the "number of nanoseconds" is always the same in every day or hour in the view of datetime64, due to leap seconds or other particularities.

@shoyer: Does this sound reasonable or did I forget to take into account any side effects?
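
A rough sketch of the first bullet above (my reading of the idea, not xarray's implementation; the function and dictionary names are made up):

```python
# Sketch (assumption, not xarray code): scale float offsets in a sub-day unit to
# integer nanoseconds and let pd.to_datetime() work on integers relative to ref_date.
import numpy as np
import pandas as pd

NS_PER_UNIT = {
    'seconds': 1_000_000_000,
    'minutes': 60 * 1_000_000_000,
    'hours': 3_600 * 1_000_000_000,
    'days': 86_400 * 1_000_000_000,
}

def decode_small_unit(raw_values, unit, ref_date):
    """raw_values: float offsets since ref_date, expressed in `unit` (a key of NS_PER_UNIT)."""
    ns = (np.asarray(raw_values, dtype='float64') * NS_PER_UNIT[unit]).round().astype('int64')
    return pd.to_datetime(ns, unit='ns', origin=pd.Timestamp(ref_date))

print(decode_small_unit([0.0, 1.5, 24.0], 'hours', '2000-01-01'))
```

As far as I know, numpy's datetime64 follows POSIX-style time and ignores leap seconds, so a day or hour always corresponds to a fixed number of nanoseconds in this representation.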

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  `decode_cf_datetime()` slow because `pd.to_timedelta()` is slow if floats are passed 226549366


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);