issue_comments

4 rows where issue = 485708282 and user = 1956032 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
539465775 https://github.com/pydata/xarray/issues/3268#issuecomment-539465775 https://api.github.com/repos/pydata/xarray/issues/3268 MDEyOklzc3VlQ29tbWVudDUzOTQ2NTc3NQ== gmaze 1956032 2019-10-08T11:13:25Z 2019-10-08T11:13:25Z CONTRIBUTOR

Alright, I think I get it, thanks for the clarification @crusaderky

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Stateful user-defined accessors 485708282
539383066 https://github.com/pydata/xarray/issues/3268#issuecomment-539383066 https://api.github.com/repos/pydata/xarray/issues/3268 MDEyOklzc3VlQ29tbWVudDUzOTM4MzA2Ng== gmaze 1956032 2019-10-08T07:28:07Z 2019-10-08T07:28:07Z CONTRIBUTOR

OK, I get it. The accessor is probably not the best solution in my case. And yes, an attribute was in fact my first implementation of the add/clean idea, but I was afraid it would be less reliable than the internal list in the long run (that was before running into the trouble described above).

But why is asking accessor developers to define a copy method an issue? It wouldn't be mandatory, only required in situations where propagating functional information could be useful. Sorry if that's a naive question for you guys.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Stateful user-defined accessors 485708282
539174999 https://github.com/pydata/xarray/issues/3268#issuecomment-539174999 https://api.github.com/repos/pydata/xarray/issues/3268 MDEyOklzc3VlQ29tbWVudDUzOTE3NDk5OQ== gmaze 1956032 2019-10-07T19:49:41Z 2019-10-07T19:49:41Z CONTRIBUTOR

@crusaderky thanks for the explanation, that's a solution to my problem.

However, I understand that since the accessor will be created from scratch, a dataset copy won't propagate the accessor properties (in this case the list of added variables):

```python
ds = xarray.Dataset()
ds['ext_data'] = xarray.DataArray(1.)

my_estimator = BaseEstimator()
# With "clean" method from @crusaderky
ds.my_accessor.fit(my_estimator, x=2.)
ds.my_accessor.transform(my_estimator, y=3.)

ds2 = ds.copy()

ds = ds.my_accessor.clean()
ds2 = ds2.my_accessor.clean()

print(ds.data_vars)
print(ds2.data_vars)
```

gives:

```python
Data variables:
    ext_data  float64 1.0
Data variables:
    ext_data  float64 1.0
    fit_data  float64 4.0
    trf_data  float64 7.0
```

"Cleaning" the dataset works as expected, but the copy (ds2) has an empty list of added variables, so the "clean" method doesn't have the expected result. We get the same behavior with a deep copy.

Would it make any sense for the xr.Dataset.copy() method to also return a copy of the accessors?
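An alternative to copying accessors is to keep the bookkeeping on the dataset itself, so that whatever a copy carries over includes the list of added variables. The following is only a minimal sketch of that idea on a toy stand-in: `ToyDataset`, `ToyAccessor` and the `"added"` key are hypothetical names, not part of xarray, and the sketch assumes that a copy carries the `attrs` dictionary along with the variables.

```python
from copy import deepcopy


class ToyDataset:
    """Hypothetical stand-in for a dataset; this is NOT the xarray API."""

    def __init__(self, variables=None, attrs=None):
        self.variables = dict(variables or {})
        self.attrs = dict(attrs or {})

    @property
    def my_accessor(self):
        # A fresh accessor on every access: it holds no state of its own.
        return ToyAccessor(self)

    def copy(self):
        # Assumption: a copy carries `attrs` along with the variables.
        return ToyDataset(self.variables, deepcopy(self.attrs))

    def drop(self, name):
        # Non-mutating drop: returns a new object without `name`.
        kept = {k: v for k, v in self.variables.items() if k != name}
        return ToyDataset(kept, deepcopy(self.attrs))


class ToyAccessor:
    def __init__(self, obj):
        self.obj = obj

    def add(self, name, value):
        self.obj.variables[name] = value
        # The bookkeeping lives on the dataset, not on the accessor.
        self.obj.attrs.setdefault("added", []).append(name)
        return self.obj

    def clean(self):
        out = self.obj
        for name in list(out.attrs.get("added", [])):
            out = out.drop(name)
        out.attrs["added"] = []
        return out


ds = ToyDataset({"ext_data": 1.0})
ds.my_accessor.add("fit_data", 4.0)
ds.my_accessor.add("trf_data", 7.0)

ds2 = ds.copy()

ds = ds.my_accessor.clean()
ds2 = ds2.my_accessor.clean()

ds_vars = sorted(ds.variables)
ds2_vars = sorted(ds2.variables)
print(ds_vars, ds2_vars)
```

In this toy model both the original and the copy are cleaned, because the list of added names travels with the data rather than with the accessor instance.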

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Stateful user-defined accessors 485708282
538461456 https://github.com/pydata/xarray/issues/3268#issuecomment-538461456 https://api.github.com/repos/pydata/xarray/issues/3268 MDEyOklzc3VlQ29tbWVudDUzODQ2MTQ1Ng== gmaze 1956032 2019-10-04T16:07:21Z 2019-10-04T16:09:00Z CONTRIBUTOR

Hi all, I recently encountered an issue with an accessor that looks like this one, but I'm not sure. Here is a piece of code that reproduces the issue.

Starting from a class with the core of the code and an accessor to implement the user API:

```python
import xarray


class BaseEstimator():
    def fit(self, this_ds, x=None):
        # Do something with this_ds:
        x = x**2
        # and create a new array with results:
        da = xarray.DataArray(x).rename('fit_data')
        # Return results:
        return da

    def transform(self, this_ds, **kw):
        # Do something with this_ds:
        val = kw['y'] + this_ds['fit_data']
        # and create a new array with results:
        da = xarray.DataArray(val).rename('trf_data')
        # Return results:
        return da


@xarray.register_dataset_accessor('my_accessor')
class Foo:
    def __init__(self, obj):
        self.obj = obj
        self.added = list()

    def add(self, da):
        self.obj[da.name] = da
        self.added.append(da.name)
        return self.obj

    def clean(self):
        for v in self.added:
            self.obj = self.obj.drop(v)
            self.added.remove(v)
        return self.obj

    def fit(self, estimator, **kw):
        this_da = estimator.fit(self, **kw)
        return self.add(this_da)

    def transform(self, estimator, **kw):
        this_da = estimator.transform(self.obj, **kw)
        return self.add(this_da)
```
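A side note on the `clean` loop: it removes items from `self.added` while iterating over that same list, which in plain Python skips elements as soon as more than one name has been added. The snippet below is a small illustration of that list behavior only; the helper names `clean_in_place` and `clean_snapshot` are made up for this sketch and do not touch xarray.

```python
def clean_in_place(added):
    """Mimics the loop in `clean`: removes from the list being iterated."""
    dropped = []
    for name in added:
        dropped.append(name)
        added.remove(name)  # shifts the remaining items under the iterator
    return dropped


def clean_snapshot(added):
    """Same loop, but iterating over a snapshot of the list."""
    dropped = []
    for name in list(added):
        dropped.append(name)
        added.remove(name)
    return dropped


skipped = clean_in_place(["fit_data", "trf_data"])
complete = clean_snapshot(["fit_data", "trf_data"])
print(skipped)   # only the first name is visited
print(complete)  # both names are visited
```

With a single added variable, as in the workflow that follows, the two loops behave the same, so this does not affect the reproduction.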

Now if we consider this workflow:

```python
ds = xarray.Dataset()
ds['ext_data'] = xarray.DataArray(1.)

my_estimator = BaseEstimator()
ds = ds.my_accessor.fit(my_estimator, x=2.)

print("Before clean:")
print("xr.DataSet var :", list(ds.data_vars))
print("accessor.obj var:", list(ds.my_accessor.obj.data_vars))

print("\nAfter clean:")
# ds.my_accessor.clean() # This does nothing to ds but clean the accessor.obj
# ds = ds.my_accessor.clean() # Cleaning ok for both ds and accessor.obj
ds_clean = ds.my_accessor.clean() # Cleaning ok on new ds, does nothing to ds as expected but clean in accessor.obj
print("xr.DataSet var :", list(ds.data_vars))
print("accessor.obj var :", list(ds.my_accessor.obj.data_vars))
print("Cleaned xr.DataSet var:", list(ds_clean.data_vars))
```

We have the following output:

```python
Before clean:
xr.DataSet var : ['ext_data', 'fit_data']
accessor.obj var: ['ext_data', 'fit_data']

After clean:
xr.DataSet var : ['ext_data', 'fit_data']
accessor.obj var : ['ext_data']
Cleaned xr.DataSet var: ['ext_data']
```

The issue is clear here: the dataset in the base namespace still has the 'fit_data' variable, but the accessor's object does not. They've been "disconnected", and this is not apparent to users.

So if users later proceed to run the "transform":

`ds.my_accessor.transform(my_estimator, y=2.)`

they get a KeyError because 'fit_data' is not in the accessor's object, although it still appears in the list of the ds variables, which is more than confusing.

Sorry for this long post, I'm not sure it's relevant to this issue but it seems so to me. I don't see a solution to this from the accessor developer side, except for not "interfering" with the content of the accessed object.
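The disconnection can also be reproduced without xarray, which may help to show that it comes from rebinding `self.obj` to a new object rather than from anything specific to the library. This is only a toy sketch: `ToyDataset` and `ToyAccessor` are hypothetical stand-ins, and the cached-accessor property is an assumption made to mirror the behavior observed above.

```python
class ToyDataset:
    """Hypothetical stand-in for a dataset; this is NOT the xarray API."""

    def __init__(self, variables=None):
        self.variables = dict(variables or {})
        self._accessor = None  # created lazily, then cached

    @property
    def my_accessor(self):
        # Assumption: the accessor is created once and cached on the object.
        if self._accessor is None:
            self._accessor = ToyAccessor(self)
        return self._accessor

    def drop(self, name):
        # Non-mutating drop: returns a NEW object without `name`.
        return ToyDataset({k: v for k, v in self.variables.items() if k != name})


class ToyAccessor:
    def __init__(self, obj):
        self.obj = obj
        self.added = []

    def add(self, name, value):
        self.obj.variables[name] = value  # mutates the accessed object in place
        self.added.append(name)
        return self.obj

    def clean(self):
        for name in list(self.added):
            self.obj = self.obj.drop(name)  # rebinds self.obj to a new object
            self.added.remove(name)
        return self.obj


ds = ToyDataset({"ext_data": 1.0})
ds.my_accessor.add("fit_data", 4.0)
ds_clean = ds.my_accessor.clean()

ds_vars = sorted(ds.variables)
accessor_vars = sorted(ds.my_accessor.obj.variables)
print(ds_vars)        # the original object still holds 'fit_data'
print(accessor_vars)  # the accessor now points at a different, cleaned object
```

After `clean`, `ds.my_accessor.obj` is no longer `ds`: `add` changed the original in place, while `clean` swapped the accessor's reference for a new object and left the original untouched.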

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Stateful user-defined accessors 485708282

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);