issue_comments
14 rows where user = 691772 sorted by updated_at descending
id | html_url | issue_url | node_id | user | created_at | updated_at | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
1268031159 | https://github.com/pydata/xarray/issues/7059#issuecomment-1268031159 | https://api.github.com/repos/pydata/xarray/issues/7059 | IC_kwDOAMm_X85LlJ63 | lumbric 691772 | 2022-10-05T07:02:23Z | 2022-10-05T07:02:48Z | CONTRIBUTOR |
What do you mean by that?
Uhm yes, you are right, this should be removed, not sure how this happened. Removing
Oh wow, thanks! Haven't seen flox before. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pandas.errors.InvalidIndexError raised when running computation in parallel using dask 1379372915 | |
1254873700 | https://github.com/pydata/xarray/issues/7059#issuecomment-1254873700 | https://api.github.com/repos/pydata/xarray/issues/7059 | IC_kwDOAMm_X85Ky9pk | lumbric 691772 | 2022-09-22T11:09:16Z | 2022-09-22T11:09:16Z | CONTRIBUTOR | I have managed to reduce the reproducing example (see "Minimal Complete Verifiable Example 2" above) and then also found a proper solution to fix this issue. I am still not sure whether this is a bug or intended behavior, so I won't close the issue for now. Basically the issue occurs when a chunked NetCDF file is loaded from disk, passed to

```diff
--- run-broken.py	2022-09-22 13:00:41.095555961 +0200
+++ run.py	2022-09-22 13:01:14.452696511 +0200
@@ -30,17 +30,17 @@
 def resample_annually(data):
     return data.sortby("time").resample(time="1A", label="left", loffset="1D").mean(dim="time")
```
This seems to fix the issue and seems to be the proper solution anyway. I still don't see why I am not allowed to use |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pandas.errors.InvalidIndexError raised when running computation in parallel using dask 1379372915 | |
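For context, here is a self-contained sketch of the pattern the diff in the comment above is editing. The synthetic data, the file name `data.nc`, and the chunk size are assumptions; only the `resample_annually` body is taken from the diff's context lines (the actual changed lines were lost):

```python
import numpy as np
import pandas as pd
import xarray as xr

# synthetic stand-in for the chunked NetCDF file mentioned above
time = pd.date_range("2000-01-01", periods=4 * 365, freq="D")
xr.Dataset(
    {"x": ("time", np.random.rand(time.size))}, coords={"time": time}
).to_netcdf("data.nc")

def resample_annually(data):
    # taken from the diff's context line
    return data.sortby("time").resample(
        time="1A", label="left", loffset="1D"
    ).mean(dim="time")

ds = xr.open_dataset("data.nc", chunks={"time": 100})  # lazy, dask-backed load
result = resample_annually(ds).compute()               # runs under dask's default scheduler
```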
1252561840 | https://github.com/pydata/xarray/issues/7059#issuecomment-1252561840 | https://api.github.com/repos/pydata/xarray/issues/7059 | IC_kwDOAMm_X85KqJOw | lumbric 691772 | 2022-09-20T15:54:48Z | 2022-09-20T15:54:48Z | CONTRIBUTOR | @benbovy thanks for the hint! I tried passing an explicit lock to |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pandas.errors.InvalidIndexError raised when running computation in parallel using dask 1379372915 | |
1243864752 | https://github.com/pydata/xarray/issues/6816#issuecomment-1243864752 | https://api.github.com/repos/pydata/xarray/issues/6816 | IC_kwDOAMm_X85KI96w | lumbric 691772 | 2022-09-12T14:55:06Z | 2022-09-13T09:39:48Z | CONTRIBUTOR | Not sure what changed, but now I do get the same error also with my small and synthetic test data. This way I was able to debug a bit further. I am pretty sure this is a bug in xarray or pandas. I think something in

I can create a new ticket if you prefer, but since I am not sure in which project, I will continue to collect information here. Unfortunately I have not yet managed to create a minimal example, as this is quite tricky with timing issues.

Additional debugging print and proof of race condition

If I add the following debugging print to the pandas code:

```diff
--- /tmp/base.py	2022-09-12 16:35:53.739971953 +0200
+++ /home/lumbric/.conda/envs/my_project/lib/python3.8/site-packages/pandas/core/indexes/base.py	2022-09-12 16:35:58.864144801 +0200
@@ -3718,7 +3718,6 @@
         self._check_indexing_method(method, limit, tolerance)
```
So the index seems to be unique, but

To confirm that the race condition is at this point, we wait for 1s and then check again for uniqueness:

```diff
--- /tmp/base.py	2022-09-12 16:35:53.739971953 +0200
+++ /home/lumbric/.conda/envs/my_project/lib/python3.8/site-packages/pandas/core/indexes/base.py	2022-09-12 16:35:58.864144801 +0200
@@ -3718,7 +3718,10 @@
         self._check_indexing_method(method, limit, tolerance)
```
This outputs:
Traceback
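The truncated patches above add a debug print and a delayed re-check of index uniqueness inside pandas. As a standalone illustration of the class of race they point at (a toy sketch, not pandas's actual code), two threads sharing unsynchronized state can transiently report a perfectly unique sequence as non-unique:

```python
import threading

values = list(range(200_000))  # all unique
seen = set()                   # shared, unsynchronized state
false_duplicates = []

def check_unique():
    # each thread believes it is building 'seen' alone, so it mistakes
    # the other thread's inserts for duplicates -- a transient
    # "index is not unique" result on a unique index
    for v in values:
        if v in seen:
            false_duplicates.append(v)
        seen.add(v)

threads = [threading.Thread(target=check_unique) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("false duplicates observed:", len(false_duplicates) > 0)  # usually True
```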
Workaround

The issue does not occur if I use the synchronous dask scheduler by adding the following at the very beginning of my script:
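The snippet itself was not preserved here; what follows is a reconstruction using dask's standard configuration API, which is presumably what was meant:

```python
import dask

# force the single-threaded, in-process ("synchronous") scheduler
# before any dask computation runs
dask.config.set(scheduler="synchronous")
```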
Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:10)
[GCC 10.3.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-124-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1
xarray: 2022.3.0
pandas: 1.4.2
numpy: 1.22.4
scipy: 1.8.1
netCDF4: 1.5.8
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.2.10
cfgrib: None
iris: None
bottleneck: None
dask: 2022.05.2
distributed: 2022.5.2
matplotlib: 3.5.2
cartopy: None
seaborn: 0.11.2
numbagg: None
fsspec: 2022.5.0
cupy: None
pint: None
sparse: None
setuptools: 62.3.2
pip: 22.1.2
conda: 4.12.0
pytest: 7.1.2
IPython: 8.4.0
sphinx: None
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pandas.errors.InvalidIndexError is raised in some runs when using chunks and map_blocks() 1315111684 | |
1243882465 | https://github.com/pydata/xarray/issues/6816#issuecomment-1243882465 | https://api.github.com/repos/pydata/xarray/issues/6816 | IC_kwDOAMm_X85KJCPh | lumbric 691772 | 2022-09-12T15:07:45Z | 2022-09-12T15:07:45Z | CONTRIBUTOR | I think these are the values of the index; the values seem to be unique and monotonic. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pandas.errors.InvalidIndexError is raised in some runs when using chunks and map_blocks() 1315111684 | |
1220519740 | https://github.com/pydata/xarray/issues/6816#issuecomment-1220519740 | https://api.github.com/repos/pydata/xarray/issues/6816 | IC_kwDOAMm_X85Iv6c8 | lumbric 691772 | 2022-08-19T10:33:59Z | 2022-08-19T10:33:59Z | CONTRIBUTOR | Thanks a lot for your quick reply and your helpful hints! In the meantime I have verified that:

Unfortunately I was not able to reproduce the error often enough lately to test it with the synchronous scheduler, nor to create a smaller synthetic example which reproduces the problem. One run takes about an hour until the exception occurs (or not), which makes things hard to debug. But I will continue trying and keep this ticket updated. Any further suggestions are very welcome :) Thanks a lot! |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
pandas.errors.InvalidIndexError is raised in some runs when using chunks and map_blocks() 1315111684 | |
1046665303 | https://github.com/pydata/xarray/issues/2186#issuecomment-1046665303 | https://api.github.com/repos/pydata/xarray/issues/2186 | IC_kwDOAMm_X84-YthX | lumbric 691772 | 2022-02-21T09:41:00Z | 2022-02-21T09:41:00Z | CONTRIBUTOR | I just stumbled across the same issue and created a minimal example similar to @lkilcher's. I am using

What seems to work: do not use the

If I understand things correctly, this indicates that the issue is a consequence of dask/dask#3530. Not sure if there is anything to be fixed on the xarray side or what the best workaround would be. I will try to use the processes scheduler. I can create a new (xarray) ticket with all details about the minimal example, if anyone thinks that this might be helpful (to collect workarounds or discuss fixes on the xarray side). |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Memory leak while looping through a Dataset 326533369 | |
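As a reference for the workaround mentioned in the comment above, a minimal sketch of selecting dask's processes scheduler (standard dask configuration API):

```python
import dask

# run tasks in separate worker processes instead of threads; memory is
# returned to the OS when the worker processes exit
dask.config.set(scheduler="processes")
```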
510939525 | https://github.com/pydata/xarray/issues/2928#issuecomment-510939525 | https://api.github.com/repos/pydata/xarray/issues/2928 | MDEyOklzc3VlQ29tbWVudDUxMDkzOTUyNQ== | lumbric 691772 | 2019-07-12T15:56:28Z | 2019-07-12T15:56:28Z | CONTRIBUTOR | Fixed in 714ae8661a829d. (Sorry for the delay... I actually prepared a PR but never finished it completely, even though it was such a simple thing.) |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Dask outputs warning: "The da.atop function has moved to da.blockwise" 438389323 | |
487759590 | https://github.com/pydata/xarray/issues/2928#issuecomment-487759590 | https://api.github.com/repos/pydata/xarray/issues/2928 | MDEyOklzc3VlQ29tbWVudDQ4Nzc1OTU5MA== | lumbric 691772 | 2019-04-29T22:00:58Z | 2019-04-29T22:00:58Z | CONTRIBUTOR |
Yes, I can do so. When writing the report, I actually thought maybe preparing a PR would be easier to write and to read than the ticket... :) In this case it really shouldn't be a big deal to fix. Maybe a bit off-topic: the thing I don't really understand, and why I wanted to ask first: is there a clear paradigm about compatibility in the pydata universe? Despite its 0.x version number, I guess xarray tries to stay backward compatible regarding its public interface, right? When are the minimum versions of dependencies increased? Simply motivated by the need for new features in one of the dependent libraries? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Dask outputs warning: "The da.atop function has moved to da.blockwise" 438389323 | |
484239080 | https://github.com/pydata/xarray/pull/2904#issuecomment-484239080 | https://api.github.com/repos/pydata/xarray/issues/2904 | MDEyOklzc3VlQ29tbWVudDQ4NDIzOTA4MA== | lumbric 691772 | 2019-04-17T20:00:22Z | 2019-04-17T20:00:22Z | CONTRIBUTOR | Ah yes, true! I've confused something here. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Minor improvement of docstring for Dataset 434444058 | |
484232306 | https://github.com/pydata/xarray/pull/2904#issuecomment-484232306 | https://api.github.com/repos/pydata/xarray/issues/2904 | MDEyOklzc3VlQ29tbWVudDQ4NDIzMjMwNg== | lumbric 691772 | 2019-04-17T19:39:42Z | 2019-04-17T19:39:42Z | CONTRIBUTOR | Hm yes, good error messages would be great, but I feel like it is widely accepted that error messages in the scientific Python ecosystem are quite often hard to read. Maybe this is the downside of duck typing? I've mentioned this only as an explanation of why I was so confused after running |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Minor improvement of docstring for Dataset 434444058 | |
464338041 | https://github.com/pydata/xarray/issues/1346#issuecomment-464338041 | https://api.github.com/repos/pydata/xarray/issues/1346 | MDEyOklzc3VlQ29tbWVudDQ2NDMzODA0MQ== | lumbric 691772 | 2019-02-16T11:20:20Z | 2019-02-16T11:20:20Z | CONTRIBUTOR | Oh yes, of course! I've underestimated the low precision of float32 values above 2**24. Thanks for the hint. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
bottleneck : Wrong mean for float32 array 218459353 | |
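The precision limit referred to in the comment above is easy to demonstrate: above 2**24, float32 can no longer represent every integer, so adding 1.0 is silently lost:

```python
import numpy as np

x = np.float32(2**24)             # 16777216.0
print(x + np.float32(1.0))        # 16777216.0 -- the added 1.0 is lost
print(x + np.float32(1.0) == x)   # True
```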
463324373 | https://github.com/pydata/xarray/issues/1346#issuecomment-463324373 | https://api.github.com/repos/pydata/xarray/issues/1346 | MDEyOklzc3VlQ29tbWVudDQ2MzMyNDM3Mw== | lumbric 691772 | 2019-02-13T19:02:52Z | 2019-02-16T10:53:51Z | CONTRIBUTOR | I think (!) xarray is no longer affected, but pandas is. Bisecting the Git history leads to commit 0b9ab2d1, which means that xarray >= v0.10.9 should not be affected. Uninstalling bottleneck is also a valid workaround. <s>Bottleneck's documentation explicitly mentions that no error is raised in case of an overflow. But it seems to be very evil behavior, so it might be worth reporting upstream.</s> What do you think? (I think kwgoodman/bottleneck#164 is something different, isn't it?) Edit: this is not an overflow. It's a numerical error caused by not applying pairwise summation. A couple of minimal examples:
Done with the following versions:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
bottleneck : Wrong mean for float32 array 218459353 | |
464016154 | https://github.com/pydata/xarray/issues/1346#issuecomment-464016154 | https://api.github.com/repos/pydata/xarray/issues/1346 | MDEyOklzc3VlQ29tbWVudDQ2NDAxNjE1NA== | lumbric 691772 | 2019-02-15T11:41:36Z | 2019-02-15T11:41:36Z | CONTRIBUTOR | Oh hm, I think I didn't really understand what happens in

Isn't this what bottleneck is doing? Summing up a bunch of float32 values and then dividing by the length?
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
bottleneck : Wrong mean for float32 array 218459353 |
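That is indeed the failure mode discussed in this thread, and it can be shown without bottleneck itself: numpy's mean uses pairwise summation, while a naive left-to-right float32 accumulation (what the thread attributes to bottleneck) stalls once the running sum reaches 2**24. A sketch using cumsum to force sequential accumulation:

```python
import numpy as np

a = np.ones(2**25, dtype=np.float32)

# pairwise summation keeps the rounding error tiny
print(a.mean())               # 1.0

# cumsum accumulates strictly left to right in float32; once the running
# sum reaches 2**24, adding another 1.0 no longer changes it
naive_sum = a.cumsum()[-1]    # 16777216.0 instead of 33554432.0
print(naive_sum / a.size)     # 0.5 -- the "wrong mean"
```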
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
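Given the schema above, the row selection shown on this page can be reproduced directly against the SQLite database. A minimal sketch (the database file name is an assumption):

```python
import sqlite3

conn = sqlite3.connect("github.db")  # hypothetical path to this database
rows = conn.execute(
    "SELECT id, issue, created_at, body FROM issue_comments "
    "WHERE user = ? ORDER BY updated_at DESC",
    (691772,),
).fetchall()
print(len(rows))  # 14 rows for this user
```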