issue_comments
19 rows where author_association = "CONTRIBUTOR" and user = 145117 sorted by updated_at descending
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
1269235342 | https://github.com/pydata/xarray/pull/6812#issuecomment-1269235342 | https://api.github.com/repos/pydata/xarray/issues/6812 | IC_kwDOAMm_X85Lpv6O | mankoff 145117 | 2022-10-06T02:48:22Z | 2022-10-06T02:48:22Z | CONTRIBUTOR | A bit more detail about the existing tests that don't match the CF spec. Per the spec, There is 1 test in In addition, the expected I am concerned that this is a significant change and I'm not sure what the process is for making this change. I would like to have some idea, even if not a guarantee, that it would be welcomed and accepted before doing all the work. I note that another recent large PR attempting to fix CF decoding has also stalled, and I'm not sure why (see #2751) |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Improved CF decoding 1309966595 | |
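The decoding rule these comments debate is the CF packed-data convention: unpacked = packed * scale_factor + add_offset, with the result dtype driven by the (float) type of the attributes rather than by the integer type of the packed data. A minimal sketch of that rule, not xarray's actual implementation (the helper name `cf_unpack` is made up):

```python
import numpy as np

def cf_unpack(packed, scale_factor=1.0, add_offset=0.0):
    # Per CF conventions (section 8.1), the unpacked value is:
    #   unpacked = packed * scale_factor + add_offset
    # and the unpacked dtype should follow the (float) type of
    # scale_factor / add_offset, not the packed integer type.
    result_dtype = np.result_type(np.asarray(scale_factor), np.asarray(add_offset))
    if not np.issubdtype(result_dtype, np.floating):
        result_dtype = np.float64  # fall back if attributes are not floats
    return (packed.astype(result_dtype) * result_dtype.type(scale_factor)
            + result_dtype.type(add_offset))

packed = np.array([100, 200, 300], dtype=np.int16)
unpacked = cf_unpack(packed, scale_factor=np.float64(0.01), add_offset=np.float64(5.0))
```

Under this reading, an `int16` variable with `float32` attributes decodes to `float32`, and to `float64` only when the attributes are doubles.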
1266136366 | https://github.com/pydata/xarray/pull/6812#issuecomment-1266136366 | https://api.github.com/repos/pydata/xarray/issues/6812 | IC_kwDOAMm_X85Ld7Uu | mankoff 145117 | 2022-10-03T22:29:28Z | 2022-10-03T22:29:28Z | CONTRIBUTOR | Hi @dcherian - I dropped this because I went down a rabbit hole that seemed very, very deep. Xarray has written 10s (100s?) of tests that touch this decoding function that make assumptions that I believe are incorrect after a careful reading of the CF spec. I believe the path forward will take some conversation before coding, so perhaps this should be moved to an issue rather than a pull request? A big decision is whether the decode option strictly follows CF guidelines. If so, then a lot of tests need to be changed (for example, to follow the simple rule of Enforcing this would probably break Furthermore, the CF conventions are themselves not very clear, and possibly ambiguous. I started a conversation here: https://github.com/cf-convention/cf-conventions/issues/374 on this, but that is also unresolved at the moment. The CF convention mentions |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Improved CF decoding 1309966595 | |
1201464999 | https://github.com/pydata/xarray/issues/2304#issuecomment-1201464999 | https://api.github.com/repos/pydata/xarray/issues/2304 | IC_kwDOAMm_X85HnOan | mankoff 145117 | 2022-08-01T16:56:01Z | 2022-08-01T16:56:01Z | CONTRIBUTOR | Packing Qs
Unpacking Qs
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
float32 instead of float64 when decoding int16 with scale_factor netcdf var using xarray 343659822 | |
1201461626 | https://github.com/pydata/xarray/issues/2304#issuecomment-1201461626 | https://api.github.com/repos/pydata/xarray/issues/2304 | IC_kwDOAMm_X85HnNl6 | mankoff 145117 | 2022-08-01T16:52:47Z | 2022-08-01T16:52:47Z | CONTRIBUTOR |
I think this means double is advised? If so, this should be stated. The text should be rephrased to advise what to do (if there are only one or a few choices) rather than what not to do, or at least include that alongside the current wording. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
float32 instead of float64 when decoding int16 with scale_factor netcdf var using xarray 343659822 | |
1201443208 | https://github.com/pydata/xarray/pull/6851#issuecomment-1201443208 | https://api.github.com/repos/pydata/xarray/issues/6851 | IC_kwDOAMm_X85HnJGI | mankoff 145117 | 2022-08-01T16:35:44Z | 2022-08-01T16:35:44Z | CONTRIBUTOR |
There's a whole table of tests! https://github.com/pydata/xarray/issues/2304#issuecomment-1200627783 But now I'm building a test for the code as-is, which isn't CF-compliant. Is this worth writing? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Fix logic bug - add_offset is in encoding, not attrs. 1322645651 | |
1200627783 | https://github.com/pydata/xarray/issues/2304#issuecomment-1200627783 | https://api.github.com/repos/pydata/xarray/issues/2304 | IC_kwDOAMm_X85HkCBH | mankoff 145117 | 2022-08-01T02:49:28Z | 2022-08-01T05:55:15Z | CONTRIBUTOR | Current algorithm
Due to a calling bug,
Here I call the function twice, once with `has_offset=False` and once with `has_offset=True`:

```python
import numpy as np

def _choose_float_dtype(dtype, has_offset):
    if dtype.itemsize <= 4 and np.issubdtype(dtype, np.floating):
        return np.float32
    if dtype.itemsize <= 2 and np.issubdtype(dtype, np.integer):
        if not has_offset:
            return np.float32
    return np.float64

# generic types
for dtype in [np.byte, np.ubyte, np.short, np.ushort, np.intc, np.uintc,
              np.int_, np.uint, np.longlong, np.ulonglong,
              np.half, np.float16, np.single, np.double, np.longdouble,
              np.csingle, np.cdouble, np.clongdouble,
              np.int8, np.int16, np.int32, np.int64,
              np.uint8, np.uint16, np.uint32, np.uint64,
              np.float16, np.float32, np.float64]:
    print("|", dtype, "|", _choose_float_dtype(np.dtype(dtype), False),
          "|", _choose_float_dtype(np.dtype(dtype), True), "|")
```

| Input | Output as called | Output as written |
|-----------------------------|---------------------------|--------------------------|
| <class 'numpy.int8'> | <class 'numpy.float32'> | <class 'numpy.float64'> |
| <class 'numpy.uint8'> | <class 'numpy.float32'> | <class 'numpy.float64'> |
| <class 'numpy.int16'> | <class 'numpy.float32'> | <class 'numpy.float64'> |
| <class 'numpy.uint16'> | <class 'numpy.float32'> | <class 'numpy.float64'> |
| <class 'numpy.int32'> | <class 'numpy.float64'> | <class 'numpy.float64'> |
| <class 'numpy.uint32'> | <class 'numpy.float64'> | <class 'numpy.float64'> |
| <class 'numpy.int64'> | <class 'numpy.float64'> | <class 'numpy.float64'> |
| <class 'numpy.uint64'> | <class 'numpy.float64'> | <class 'numpy.float64'> |
| <class 'numpy.longlong'> | <class 'numpy.float64'> | <class 'numpy.float64'> |
| <class 'numpy.ulonglong'> | <class 'numpy.float64'> | <class 'numpy.float64'> |
| <class 'numpy.float16'> | <class 'numpy.float32'> | <class 'numpy.float32'> |
| <class 'numpy.float16'> | <class 'numpy.float32'> | <class 'numpy.float32'> |
| <class 'numpy.float32'> | <class 'numpy.float32'> | <class 'numpy.float32'> |
| <class 'numpy.float64'> | <class 'numpy.float64'> | <class 'numpy.float64'> |
| <class 'numpy.float128'> | <class 'numpy.float64'> | <class 'numpy.float64'> |
| <class 'numpy.complex64'> | <class 'numpy.float64'> | <class 'numpy.float64'> |
| <class 'numpy.complex128'> | <class 'numpy.float64'> | <class 'numpy.float64'> |
| <class 'numpy.complex256'> | <class 'numpy.float64'> | <class 'numpy.float64'> |
| <class 'numpy.int8'> | <class 'numpy.float32'> | <class 'numpy.float64'> |
| <class 'numpy.int16'> | <class 'numpy.float32'> | <class 'numpy.float64'> |
| <class 'numpy.int32'> | <class 'numpy.float64'> | <class 'numpy.float64'> |
| <class 'numpy.int64'> | <class 'numpy.float64'> | <class 'numpy.float64'> |
| <class 'numpy.uint8'> | <class 'numpy.float32'> | <class 'numpy.float64'> |
| <class 'numpy.uint16'> | <class 'numpy.float32'> | <class 'numpy.float64'> |
| <class 'numpy.uint32'> | <class 'numpy.float64'> | <class 'numpy.float64'> |
| <class 'numpy.uint64'> | <class 'numpy.float64'> | <class 'numpy.float64'> |
| <class 'numpy.float16'> | <class 'numpy.float32'> | <class 'numpy.float32'> |
| <class 'numpy.float32'> | <class 'numpy.float32'> | <class 'numpy.float32'> |
| <class 'numpy.float64'> | <class 'numpy.float64'> | <class 'numpy.float64'> |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
float32 instead of float64 when decoding int16 with scale_factor netcdf var using xarray 343659822 | |
1200266255 | https://github.com/pydata/xarray/issues/2304#issuecomment-1200266255 | https://api.github.com/repos/pydata/xarray/issues/2304 | IC_kwDOAMm_X85HipwP | mankoff 145117 | 2022-07-30T17:58:51Z | 2022-07-30T17:58:51Z | CONTRIBUTOR | This issue, based on its title and initial post, is fixed by PR #6851. The code to select dtype was already correct, but the outer function that called it had a bug in the call. Per the CF spec,
I find this ambiguous. is The broader discussion here is about CF compliance. I find the spec ambiguous and xarray non-compliant. So many tests rely on the existing behavior that I am unsure how best to proceed to improve compliance. I worry it may be a major refactor, and possibly break things relying on the existing behavior. I'd like to discuss architecture. Should this be in a new issue, if this closes with PR #6851? Should there be a new keyword for |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
float32 instead of float64 when decoding int16 with scale_factor netcdf var using xarray 343659822 | |
1189512813 | https://github.com/pydata/xarray/pull/6812#issuecomment-1189512813 | https://api.github.com/repos/pydata/xarray/issues/6812 | IC_kwDOAMm_X85G5oZt | mankoff 145117 | 2022-07-19T20:19:29Z | 2022-07-19T20:19:29Z | CONTRIBUTOR | I'm reading more in https://github.com/pydata/xarray/blob/2a5686c6fe855502523e495e43bd381d14191c7b/xarray/coding/variables.py and I'm confused about some logic:
I think this is happening based on inspecting with the debugger. Furthermore, the fix I implemented in this Pull Request which returns
should not run, but do run because of this issue. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Improved CF decoding 1309966595 | |
1189485451 | https://github.com/pydata/xarray/pull/6812#issuecomment-1189485451 | https://api.github.com/repos/pydata/xarray/issues/6812 | IC_kwDOAMm_X85G5huL | mankoff 145117 | 2022-07-19T19:46:23Z | 2022-07-19T19:46:23Z | CONTRIBUTOR | Note - I also have not run the "Running the performance test suite" code in https://xarray.pydata.org/en/stable/contributing.html - I assume changing from |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Improved CF decoding 1309966595 | |
1188529343 | https://github.com/pydata/xarray/issues/2304#issuecomment-1188529343 | https://api.github.com/repos/pydata/xarray/issues/2304 | IC_kwDOAMm_X85G14S_ | mankoff 145117 | 2022-07-19T02:35:30Z | 2022-07-19T03:20:51Z | CONTRIBUTOR | I've run into this issue too, and the xarray decision to use The data value is 1395. The scale is 0.0001.
Because we are using |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
float32 instead of float64 when decoding int16 with scale_factor netcdf var using xarray 343659822 | |
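The precision problem described above can be checked with the numbers given (value 1395, scale 0.0001). This is a sketch of the bare arithmetic, not xarray code:

```python
import numpy as np

packed = np.int16(1395)
scale = 0.0001

# Decode in float32: both the scale and the product are rounded to 24-bit precision.
as32 = np.float32(packed) * np.float32(scale)

# Decode in float64: the product is much closer to the intended 0.1395.
as64 = np.float64(packed) * np.float64(scale)

print(repr(as32), repr(as64))
```

The float32 result differs from the float64 result in roughly the 8th decimal digit, which is the discrepancy the issue title refers to.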
708594913 | https://github.com/pydata/xarray/issues/2139#issuecomment-708594913 | https://api.github.com/repos/pydata/xarray/issues/2139 | MDEyOklzc3VlQ29tbWVudDcwODU5NDkxMw== | mankoff 145117 | 2020-10-14T18:52:38Z | 2020-10-14T18:52:38Z | CONTRIBUTOR | The issue is that if you pass in This multi-index came from a small 12 MB file - 5000 rows and 40 variables. When I then did Now that I've figured all this out, I don't think that any bugs exist in |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
From pandas to xarray without blowing up memory 323703742 | |
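The blow-up described above follows from how `Dataset.from_dataframe` handles a MultiIndex: each index level becomes a dimension, and the result is a dense array over the full cross-product of level values, even if the frame itself is small. A minimal sketch (the sizes here are illustrative, not from the original 12 MB file):

```python
import numpy as np
import pandas as pd
import xarray as xr

# A "long" frame: 1000 rows, but the two index levels have 100 and 1000
# unique values respectively, so the dense cross-product is 100 * 1000 cells.
idx = pd.MultiIndex.from_arrays(
    [np.arange(1000) % 100, np.arange(1000)], names=["a", "b"])
df = pd.DataFrame({"v": np.random.random(1000)}, index=idx)

ds = xr.Dataset.from_dataframe(df)

# 1000 stored values densify to 100 * 1000 = 100000 cells (mostly NaN).
print(df.shape, ds["v"].shape)
```

With many mostly-independent index levels, that cross-product grows multiplicatively, which is how a small frame can exhaust memory on conversion.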
708513119 | https://github.com/pydata/xarray/issues/2139#issuecomment-708513119 | https://api.github.com/repos/pydata/xarray/issues/2139 | MDEyOklzc3VlQ29tbWVudDcwODUxMzExOQ== | mankoff 145117 | 2020-10-14T16:23:36Z | 2020-10-14T16:23:36Z | CONTRIBUTOR | @max-sixty Sorry for posting this here. This memory blow-up was a byproduct of another bug that it took me a few more hours to track down. This other bug is in Pandas, not xarray. |
{ "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 1, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
From pandas to xarray without blowing up memory 323703742 | |
708339519 | https://github.com/pydata/xarray/issues/2139#issuecomment-708339519 | https://api.github.com/repos/pydata/xarray/issues/2139 | MDEyOklzc3VlQ29tbWVudDcwODMzOTUxOQ== | mankoff 145117 | 2020-10-14T11:25:03Z | 2020-10-14T11:25:03Z | CONTRIBUTOR | Late reply, but if anyone else finds this issue, I was filling memory with:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
From pandas to xarray without blowing up memory 323703742 | |
706688398 | https://github.com/pydata/xarray/issues/4498#issuecomment-706688398 | https://api.github.com/repos/pydata/xarray/issues/4498 | MDEyOklzc3VlQ29tbWVudDcwNjY4ODM5OA== | mankoff 145117 | 2020-10-11T11:11:47Z | 2020-10-11T11:19:56Z | CONTRIBUTOR | Thanks for the clarification that this is a real issue not due to just my coding, and the suggestion to solve this elsewhere. For now I just use the fast Pandas version with this code:
|
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Resample is ~100x slower than Pandas resample; Speed is related to resample period (unlike Pandas) 718436141 | |
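The code for this workaround did not survive the export; a sketch of the approach described (resample in pandas, then convert back to xarray) might look like the following, with sizes and variable names invented for illustration:

```python
import numpy as np
import pandas as pd
import xarray as xr

# A small 1-D time-indexed Dataset standing in for the real data.
times = pd.date_range("2000-01-01", periods=1000, freq="10min")
ds = xr.Dataset({"foo": ("time", np.random.random(1000))},
                coords={"time": times})

# Round-trip through pandas, whose resample is much faster for this case:
# Dataset -> DataFrame -> resample/mean -> Dataset.
ds_r = xr.Dataset.from_dataframe(ds.to_dataframe().resample("1h").mean())
```

This only works cleanly when every variable shares the single time dimension, since `to_dataframe` flattens the Dataset to a time-indexed frame.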
706688498 | https://github.com/pydata/xarray/issues/4498#issuecomment-706688498 | https://api.github.com/repos/pydata/xarray/issues/4498 | MDEyOklzc3VlQ29tbWVudDcwNjY4ODQ5OA== | mankoff 145117 | 2020-10-11T11:12:47Z | 2020-10-11T11:12:47Z | CONTRIBUTOR | The linked issues refer to |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Resample is ~100x slower than Pandas resample; Speed is related to resample period (unlike Pandas) 718436141 | |
706548763 | https://github.com/pydata/xarray/issues/4498#issuecomment-706548763 | https://api.github.com/repos/pydata/xarray/issues/4498 | MDEyOklzc3VlQ29tbWVudDcwNjU0ODc2Mw== | mankoff 145117 | 2020-10-10T13:23:24Z | 2020-10-10T13:23:24Z | CONTRIBUTOR | The every 4th or 5th lag is not in the creation, it's in the

```
#+BEGIN_SRC jupyter-python :kernel ds :session bugreport
for i in np.arange(25):
    start = time.time()
    ds_r = ds.resample({'time':"1H"})
    print('xr', str(time.time() - start))
#+END_SRC

#+RESULTS:
#+begin_example
xr 0.04479050636291504
xr 0.047682762145996094
xr 0.8904871940612793
xr 0.05605506896972656
xr 0.0452876091003418
xr 0.0467374324798584
xr 0.8709239959716797
xr 0.05595755577087402
xr 0.046492576599121094
xr 0.04648017883300781
xr 0.045223236083984375
xr 0.8187246322631836
xr 0.05060911178588867
xr 0.04763054847717285
xr 0.8156075477600098
xr 0.055490970611572266
xr 0.047312259674072266
xr 0.04651069641113281
xr 0.8001837730407715
xr 0.05546212196350098
xr 0.04549074172973633
xr 0.04680013656616211
xr 0.04383039474487305
xr 0.7662224769592285
xr 0.04914355278015137
#+end_example
``` |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Resample is ~100x slower than Pandas resample; Speed is related to resample period (unlike Pandas) 718436141 | |
706548513 | https://github.com/pydata/xarray/issues/4498#issuecomment-706548513 | https://api.github.com/repos/pydata/xarray/issues/4498 | MDEyOklzc3VlQ29tbWVudDcwNjU0ODUxMw== | mankoff 145117 | 2020-10-10T13:21:19Z | 2020-10-10T13:21:19Z | CONTRIBUTOR | "performance" is a good tag. My actual use case is a dataset with 500,000 timestamps and 15 variables (10 minute weather station for a decade). In this case, pandas takes 0.03 seconds, and xarray takes 200 seconds. 4 orders of magnitude. Should I change the title to reflect the larger difference in performance? Here is that MWE:

```python
import numpy as np
import xarray as xr
import pandas as pd
import time

size = 500000
times = pd.date_range('2000-01-01', periods=size, freq="10Min")
ds = xr.Dataset({
    'foo': xr.DataArray(
        data = np.random.random(size),
        dims = ['time'],
        coords = {'time': times}
    )})
for v in 'abcdefghijelm':
    ds[v] = (('time'), np.random.random(size))

start = time.time()
ds_r = ds.resample({'time':"1H"}).mean()
print('xr', str(time.time() - start))

start = time.time()
ds_r = ds.to_dataframe().resample("1H").mean()
print('pd', str(time.time() - start))
```

Result:
The strange thing here is if I drop the
But every 4th or 5th time that I run this, I get this:
This is repeatable. I've run this code 100s of times now, and every 4th or 5th run it takes 10x longer. Nothing else is going on on my computer. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Resample is ~100x slower than Pandas resample; Speed is related to resample period (unlike Pandas) 718436141 | |
368456391 | https://github.com/pydata/xarray/issues/1917#issuecomment-368456391 | https://api.github.com/repos/pydata/xarray/issues/1917 | MDEyOklzc3VlQ29tbWVudDM2ODQ1NjM5MQ== | mankoff 145117 | 2018-02-26T10:28:16Z | 2018-02-26T10:28:16Z | CONTRIBUTOR | Appears fixed. Thank you! |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Decode times adds micro-second noise to standard calendar 297780998 | |
366382745 | https://github.com/pydata/xarray/issues/1917#issuecomment-366382745 | https://api.github.com/repos/pydata/xarray/issues/1917 | MDEyOklzc3VlQ29tbWVudDM2NjM4Mjc0NQ== | mankoff 145117 | 2018-02-16T22:58:14Z | 2018-02-16T22:58:14Z | CONTRIBUTOR | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Decode times adds micro-second noise to standard calendar 297780998 |
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
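Given this schema, the view at the top of the page (CONTRIBUTOR comments from user 145117, newest first) corresponds to a simple query. A self-contained `sqlite3` sketch, with one made-up row standing in for the real export:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE [issue_comments] (
   [html_url] TEXT, [issue_url] TEXT, [id] INTEGER PRIMARY KEY,
   [node_id] TEXT, [user] INTEGER, [created_at] TEXT, [updated_at] TEXT,
   [author_association] TEXT, [body] TEXT, [reactions] TEXT,
   [performed_via_github_app] TEXT, [issue] INTEGER
);
""")
# One illustrative row (not real data from the export).
con.execute(
    "INSERT INTO issue_comments (id, [user], author_association, updated_at, body)"
    " VALUES (1, 145117, 'CONTRIBUTOR', '2022-10-06T02:48:22Z', 'example')")

# The filter behind this page's listing.
rows = con.execute(
    "SELECT id, body FROM issue_comments"
    " WHERE author_association = 'CONTRIBUTOR' AND [user] = 145117"
    " ORDER BY updated_at DESC").fetchall()
print(rows)
```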