issue_comments

11 comments where issue = 1392878100 ("New deep copy behavior in 2022.9.0 causes maximum recursion error", pydata/xarray#7111) and user = 1828519 (djhoese, author_association CONTRIBUTOR on all 11), sorted by updated_at descending.

1267531839 · djhoese (1828519) · CONTRIBUTOR · created 2022-10-04T20:20:19Z · updated 2022-10-04T20:20:19Z
https://github.com/pydata/xarray/issues/7111#issuecomment-1267531839 · node_id IC_kwDOAMm_X85LjQA_

We talked about this today in our pytroll/satpy meeting. We're not sure we agree with cf-xarray's choice to expose ancillary variables as coordinates, or that it would work for us, so we may eventually remove any "automatic" ancillary variable loading and require users to explicitly request the ancillary variables they want from Satpy's readers.

That said, this will take a lot of work to change. Since it seems like #7112 fixes a majority of our issues, I'm hoping it can still be merged. I'd hope that the memo logic when deepcopying will still protect against other recursive objects (and possibly optimize repeated copies?) even if they can't be directly serialized to NetCDF.
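
For reference, the standard-library behavior being referred to: copy.deepcopy keeps a memo dict keyed by id() of already-copied objects, so cyclic references are copied exactly once rather than recursing. A minimal sketch:

```
import copy

# A self-referential container: without deepcopy's memo dict this would
# recurse forever; with it, the cycle is copied exactly once.
d = {}
d["self"] = d

d2 = copy.deepcopy(d)
assert d2 is not d        # a genuinely new object...
assert d2["self"] is d2   # ...with the cycle preserved in the copy
```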

Side note: I feel like there is a difference between the NetCDF model and serializing/saving to a NetCDF file.

Reactions: none

1266889779 · djhoese (1828519) · CONTRIBUTOR · created 2022-10-04T12:07:12Z · updated 2022-10-04T12:07:37Z
https://github.com/pydata/xarray/issues/7111#issuecomment-1266889779 · node_id IC_kwDOAMm_X85LgzQz

@mraspaud See the cf-xarray link from Deepak. We could make them coordinates, or we could reference them by name:

```
ds = xr.open_dataset(...)
anc_name = ds["my_var"].attrs["ancillary_variables"][0]
anc_var = ds[anc_name]
```

Edit: Let's talk more in the pytroll meeting today.
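
For contrast, a minimal sketch of the coordinates option in plain xarray (variable names here are hypothetical, and this assumes the ancillary variable shares a dimension with the data variable):

```
import numpy as np
import xarray as xr

# Hypothetical setup: "my_var" plus a quality flag along the same dimension.
ds = xr.Dataset(
    {
        "my_var": ("y", np.random.random(8)),
        "my_var_dqf": ("y", np.zeros(8, dtype="uint8")),
    }
)

# Promote the ancillary variable to a non-dimension coordinate so it travels
# with the data variable instead of being embedded in .attrs:
ds = ds.set_coords("my_var_dqf")
data = ds["my_var"]
anc_var = data.coords["my_var_dqf"]  # attached to the extracted DataArray
```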

Reactions: none

1265910157 · djhoese (1828519) · CONTRIBUTOR · created 2022-10-03T19:13:58Z · updated 2022-10-03T19:13:58Z
https://github.com/pydata/xarray/issues/7111#issuecomment-1265910157 · node_id IC_kwDOAMm_X85LdEGN

@dcherian Thanks for the feedback. When these decisions were made in Satpy, xarray was not able to contain dask arrays as coordinates, and we depend heavily on dask for our use cases. Putting some of these datasets in coordinates, as cf-xarray does, may have caused extra unnecessary loading/computation. I'm not sure that would still be the case with modern xarray.

Note that ancillary_variables are not the only case of "embedded" DataArrays in our code. We also needed something for CRS + bounds and other geolocation information. As you know, I'm very much interested in CRS and geolocation handling in xarray, but for backwards compatibility we also store pyresample AreaDefinition and SwathDefinition objects in our DataArrays' .attrs["area"]. A SwathDefinition can contain two DataArray objects for longitude and latitude, and these also get copied with the new deep copy behavior.
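
A schematic of that nesting (illustrative values; SwathDefinition is pyresample's real class, but this is not Satpy's actual reader code):

```
import numpy as np
import xarray as xr
from pyresample.geometry import SwathDefinition

# Geolocation DataArrays wrapped in a SwathDefinition, which is itself stored
# in the data variable's attrs -- deep-copying `data` now also copies lons/lats.
lons = xr.DataArray(np.linspace(-120.0, -90.0, 8), dims=("y",))
lats = xr.DataArray(np.linspace(30.0, 50.0, 8), dims=("y",))
swath = SwathDefinition(lons=lons, lats=lats)

data = xr.DataArray(np.random.random(8), dims=("y",))
data.attrs["area"] = swath
```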

We have a monthly Pytroll/Satpy meeting tomorrow so if you have any other suggestions or points for or against our usage please comment here and we'll see what we can do.

Reactions: none

1265792923 · djhoese (1828519) · CONTRIBUTOR · created 2022-10-03T17:27:55Z · updated 2022-10-03T17:27:55Z
https://github.com/pydata/xarray/issues/7111#issuecomment-1265792923 · node_id IC_kwDOAMm_X85Lcneb

@TomNicholas Do you mean the "name" of the sub-DataArray? Or the numpy/dask array of the sub-DataArray?

This is what I was trying to describe in https://github.com/pydata/xarray/issues/7111#issuecomment-1264386173. In Satpy we have our own Dataset-like/DataTree-like object where the user explicitly says "I want to load X from input files". As a convenience we put any ancillary variables (e.g. data quality flags) in the DataArray's .attrs for easier access. In Satpy there is no other direct connection between one DataArray and another; they are independent objects at the processing level, so there may be no higher-level Dataset-like container available from which to look up ancillary variables by name.
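
Schematically, the convenience pattern described above looks something like this (all names are illustrative):

```
import numpy as np
import xarray as xr

# A data quality flag attached directly to the variable that uses it, so it
# stays reachable without any Dataset-level container:
dqf = xr.DataArray(np.zeros(8, dtype="uint8"), dims=("y",), name="dqf")
data = xr.DataArray(np.random.random(8), dims=("y",), name="brightness_temp")
data.attrs["ancillary_variables"] = [dqf]

flag = data.attrs["ancillary_variables"][0]  # no parent Dataset needed
```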

@mraspaud was one of the original people who proposed our current design, so maybe he can provide more context.

Reactions: none

1264515251 · djhoese (1828519) · CONTRIBUTOR · created 2022-10-02T00:25:36Z · updated 2022-10-02T00:41:00Z
https://github.com/pydata/xarray/issues/7111#issuecomment-1264515251 · node_id IC_kwDOAMm_X85LXviz

Sorry, false alarm. I was running with an old environment. With this new PR the ancillary_variables tests that were failing now pass, but the dask .copy()-related ones still fail, which is expected, so I'm OK with that.

Edit: I hacked variable.py so it had this:

```
if deep:
    if is_duck_dask_array(ndata):
        ndata = ndata
    else:
        ndata = copy.deepcopy(ndata, memo)
```

and that fixed a lot of my dask-related tests, but it also seems to have introduced two new failures from what I can tell. So :man_shrugging:

Reactions: none

1264437857 · djhoese (1828519) · CONTRIBUTOR · created 2022-10-01T18:01:16Z · updated 2022-10-01T18:01:16Z
https://github.com/pydata/xarray/issues/7111#issuecomment-1264437857 · node_id IC_kwDOAMm_X85LXcph

It looks like that PR fixes all of my Satpy unit tests. I'm not sure how that is possible if it doesn't also change when dask arrays are copied.

Reactions: 🎉 2

1264388047 · djhoese (1828519) · CONTRIBUTOR · created 2022-10-01T14:56:45Z · updated 2022-10-01T14:56:45Z
https://github.com/pydata/xarray/issues/7111#issuecomment-1264388047 · node_id IC_kwDOAMm_X85LXQfP

Also note the other important change in this new behavior: dask arrays are now copied (.copy()) when they weren't before. This is causing some equality issues for us in Satpy, but I agree with the change on xarray's side (xarray should be able to call .copy() on whatever array it has).

https://github.com/dask/dask/issues/9533
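
A minimal sketch of the kind of check that changes once the underlying dask array is deep-copied too (assuming tests rely on object identity rather than computed values):

```
import copy

import dask.array as da

arr = da.zeros(5)
arr2 = copy.deepcopy(arr)

# Identity checks that held when xarray reused the same dask array no longer do:
assert arr2 is not arr
# ...but the computed values are still equal:
assert bool((arr == arr2).all().compute())
```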

Reactions: none

1264386173 · djhoese (1828519) · CONTRIBUTOR · created 2022-10-01T14:47:51Z · updated 2022-10-01T14:47:51Z
https://github.com/pydata/xarray/issues/7111#issuecomment-1264386173 · node_id IC_kwDOAMm_X85LXQB9

I'm a little torn on this. Obviously I'm not an xarray maintainer, so I'm not the one who would have to maintain it or answer support questions about it. We actually had the user-side of this discussion in the Satpy library group a while ago, which is what leads to this whole problem for us now. In Satpy we don't typically use or deal with xarray Datasets (the new DataTree library is likely what we'll move to), so when we have relationships between DataArrays we use something like ancillary variables to connect them; for example, a data quality flag that is used by the other variables in a file. Our users don't usually care about the DQF, but we don't want to stop them from being able to access it easily. I was never a huge fan of putting a DataArray in the attrs of another DataArray, but nothing seemed to disallow it, so I ultimately lost that argument.

So on one hand, I agree there shouldn't be a need, in most cases, to have a DataArray inside a DataArray, especially a circular dependency. On the other hand, I'm not looking forward to the updates I'll need to make to Satpy to fix this. Note that we don't do this everywhere in Satpy; it's just something we use for a few formats we read.

Reactions: none

1263967009 · djhoese (1828519) · CONTRIBUTOR · created 2022-09-30T19:58:11Z · updated 2022-09-30T19:58:11Z
https://github.com/pydata/xarray/issues/7111#issuecomment-1263967009 · node_id IC_kwDOAMm_X85LVpsh

I'd have to check, but I think this structure was originally produced by xarray reading a CF-compliant NetCDF file. That is my memory at least. It could be that our library (Satpy) is doing this as a convenience, replacing the name of an ancillary variable with the DataArray of that ancillary variable.

My other new issue seems to be related to .copy() calling .copy() on dask arrays, which then makes them no longer compare as equivalent. Working on an MVCE now.

Reactions: none

1263952252 · djhoese (1828519) · CONTRIBUTOR · created 2022-09-30T19:41:30Z · updated 2022-09-30T19:41:30Z
https://github.com/pydata/xarray/issues/7111#issuecomment-1263952252 · node_id IC_kwDOAMm_X85LVmF8

I get a similar error for different structures, and also if I do something like data_arr.where(data_arr > 5, drop=True). In this case I have dask-array-based DataArrays; dask ends up trying to hash the object and gets stuck in a loop, with xarray trying to hash the DataArray and then the DataArrays inside its .attrs.

```
In [9]: import dask.array as da

In [15]: a = xr.DataArray(da.zeros(5.0), attrs={}, dims=("a_dim",))

In [16]: b = xr.DataArray(da.zeros(8.0), attrs={}, dims=("b_dim",))

In [20]: a.attrs["other"] = b

In [24]: lons = xr.DataArray(da.random.random(8), attrs={"ancillary_variables": [b]})

In [25]: lats = xr.DataArray(da.random.random(8), attrs={"ancillary_variables": [b]})

In [26]: b.attrs["some_attr"] = [lons, lats]

In [27]: cond = a > 5

In [28]: c = a.where(cond, drop=True)
...
File ~/miniconda3/envs/satpy_py310/lib/python3.10/site-packages/dask/utils.py:1982, in _HashIdWrapper.__hash__(self)
   1981 def __hash__(self):
-> 1982     return id(self.wrapped)

RecursionError: maximum recursion depth exceeded while calling a Python object
```

Reactions: none

1263927640 · djhoese (1828519) · CONTRIBUTOR · created 2022-09-30T19:14:48Z · updated 2022-09-30T19:14:48Z
https://github.com/pydata/xarray/issues/7111#issuecomment-1263927640 · node_id IC_kwDOAMm_X85LVgFY

CC @headtr1ck any idea if this is supposed to work with your new #7089?

Reactions: none

Table schema:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);