
issue_comments


12 rows where author_association = "CONTRIBUTOR" and user = 1956032 sorted by updated_at descending


Issues (6)

  • Stateful user-defined accessors 4
  • IndexError when printing dataset from an Argo file 3
  • Decoding time according to CF conventions raises error if a NaN is found 2
  • Cannot open NetCDF file if dimension with time coordinate has length 0 (`ValueError` when decoding CF datetime) 1
  • bfill behavior dask arrays with small chunk size 1
  • Recommendations for domain-specific accessor documentation 1

id html_url issue_url node_id user created_at updated_at author_association body reactions performed_via_github_app issue
709570971 https://github.com/pydata/xarray/issues/1329#issuecomment-709570971 https://api.github.com/repos/pydata/xarray/issues/1329 MDEyOklzc3VlQ29tbWVudDcwOTU3MDk3MQ== gmaze 1956032 2020-10-15T20:25:10Z 2020-10-15T20:25:10Z CONTRIBUTOR

I don't know if this issue is still relevant for xarray, but I encountered the same error with https://github.com/euroargodev/argopy and, maybe surprisingly, only with xarray version 0.16.1.

  Cannot open NetCDF file if dimension with time coordinate has length 0 (`ValueError` when decoding CF datetime) 217216935
539465775 https://github.com/pydata/xarray/issues/3268#issuecomment-539465775 https://api.github.com/repos/pydata/xarray/issues/3268 MDEyOklzc3VlQ29tbWVudDUzOTQ2NTc3NQ== gmaze 1956032 2019-10-08T11:13:25Z 2019-10-08T11:13:25Z CONTRIBUTOR

Alright, I think I get it, thanks for the clarification @crusaderky

  Stateful user-defined accessors 485708282
539383066 https://github.com/pydata/xarray/issues/3268#issuecomment-539383066 https://api.github.com/repos/pydata/xarray/issues/3268 MDEyOklzc3VlQ29tbWVudDUzOTM4MzA2Ng== gmaze 1956032 2019-10-08T07:28:07Z 2019-10-08T07:28:07Z CONTRIBUTOR

Ok, I get it. The accessor is probably not the best solution in my case. And yes, an attribute was in fact my first implementation of the add/clean idea, but I was afraid it would be less reliable than the internal list over the long term (that was before running into the troubles described above).

But why is asking accessor developers to define a copy method an issue? It wouldn't be mandatory, only needed in situations where propagating functional information could be useful. Sorry if that's a naive question for you guys. A purely hypothetical sketch of such a hook follows.

  Stateful user-defined accessors 485708282
539174999 https://github.com/pydata/xarray/issues/3268#issuecomment-539174999 https://api.github.com/repos/pydata/xarray/issues/3268 MDEyOklzc3VlQ29tbWVudDUzOTE3NDk5OQ== gmaze 1956032 2019-10-07T19:49:41Z 2019-10-07T19:49:41Z CONTRIBUTOR

@crusaderky thanks for the explanation, that's a solution to my problem.

Although I understand that, since the accessor will be re-created from scratch, a dataset copy won't propagate the accessor properties (in this case, the list of added variables):

```python
ds = xarray.Dataset()
ds['ext_data'] = xarray.DataArray(1.)

my_estimator = BaseEstimator()  # With "clean" method from @crusaderky
ds.my_accessor.fit(my_estimator, x=2.)
ds.my_accessor.transform(my_estimator, y=3.)

ds2 = ds.copy()

ds = ds.my_accessor.clean()
ds2 = ds2.my_accessor.clean()

print(ds.data_vars)
print(ds2.data_vars)
```

gives:

```python
Data variables:
    ext_data    float64 1.0
Data variables:
    ext_data    float64 1.0
    fit_data    float64 4.0
    trf_data    float64 7.0
```

"Cleaning" the dataset works as expected, but the copy (ds2) has an empty list of added variables, so the "clean" method doesn't have the expected result. We see the same behavior for a deep copy.

Would it make sense for the xr.Dataset.copy() method to also return a copy of the accessors? (In the meantime, see the attrs-based sketch below.)

  Stateful user-defined accessors 485708282
538461456 https://github.com/pydata/xarray/issues/3268#issuecomment-538461456 https://api.github.com/repos/pydata/xarray/issues/3268 MDEyOklzc3VlQ29tbWVudDUzODQ2MTQ1Ng== gmaze 1956032 2019-10-04T16:07:21Z 2019-10-04T16:09:00Z CONTRIBUTOR

Hi all, I recently encountered an issue with accessors that looks like this one, but I'm not sure. Here is a piece of code that reproduces it.

Starting from a class with the core of the code and an accessor to implement the user API:

```python
import xarray

class BaseEstimator():
    def fit(self, this_ds, x=None):
        # Do something with this_ds:
        x = x**2
        # and create a new array with results:
        da = xarray.DataArray(x).rename('fit_data')
        # Return results:
        return da

    def transform(self, this_ds, **kw):
        # Do something with this_ds:
        val = kw['y'] + this_ds['fit_data']
        # and create a new array with results:
        da = xarray.DataArray(val).rename('trf_data')
        # Return results:
        return da

@xarray.register_dataset_accessor('my_accessor')
class Foo:
    def __init__(self, obj):
        self.obj = obj
        self.added = list()

    def add(self, da):
        self.obj[da.name] = da
        self.added.append(da.name)
        return self.obj

    def clean(self):
        for v in self.added:
            self.obj = self.obj.drop(v)
            self.added.remove(v)
        return self.obj

    def fit(self, estimator, **kw):
        this_da = estimator.fit(self, **kw)
        return self.add(this_da)

    def transform(self, estimator, **kw):
        this_da = estimator.transform(self.obj, **kw)
        return self.add(this_da)
```

Now consider this workflow:

```python
ds = xarray.Dataset()
ds['ext_data'] = xarray.DataArray(1.)

my_estimator = BaseEstimator()
ds = ds.my_accessor.fit(my_estimator, x=2.)

print("Before clean:")
print("xr.DataSet var :", list(ds.data_vars))
print("accessor.obj var:", list(ds.my_accessor.obj.data_vars))

print("\nAfter clean:")

ds.my_accessor.clean()  # This does nothing to ds but cleans the accessor.obj

ds = ds.my_accessor.clean()  # Cleaning ok for both ds and accessor.obj

ds_clean = ds.my_accessor.clean()  # Cleaning ok on new ds, does nothing to ds as expected but cleans accessor.obj

print("xr.DataSet var :", list(ds.data_vars))
print("accessor.obj var :", list(ds.my_accessor.obj.data_vars))
print("Cleaned xr.DataSet var:", list(ds_clean.data_vars))
```

We have the following output:

```python
Before clean:
xr.DataSet var : ['ext_data', 'fit_data']
accessor.obj var: ['ext_data', 'fit_data']

After clean:
xr.DataSet var : ['ext_data', 'fit_data']
accessor.obj var : ['ext_data']
Cleaned xr.DataSet var: ['ext_data']
```

The issue is clear here: the base dataset still has the 'fit_data' variable but the accessor object does not; they've been "disconnected", and that's not apparent to users.

So if users later proceed to run the "transform":

```python
ds.my_accessor.transform(my_estimator, y=2.)
```

they get a KeyError, because 'fit_data' is no longer in the accessor's object even though it still appears in the list of ds variables, which is more than confusing.

Sorry for the long post; I'm not sure it's relevant to this issue, but it seems so to me. I don't see a solution to this from the accessor developer's side, except not "interfering" with the content of the accessed object. A small sketch of the caching behavior behind this follows.

  Stateful user-defined accessors 485708282
537669216 https://github.com/pydata/xarray/issues/3361#issuecomment-537669216 https://api.github.com/repos/pydata/xarray/issues/3361 MDEyOklzc3VlQ29tbWVudDUzNzY2OTIxNg== gmaze 1956032 2019-10-02T20:35:26Z 2019-10-02T20:35:26Z CONTRIBUTOR

Thanks for the example @jthielen, this looks great!

Maybe it would be better not to show readers the accessor class name, since it will never be seen on the API frontend; only the scope name will.

So it would be great if the documentation could read something like:

xarray.DataSet.<scope_name>.<accessor_method_or_property>

instead of:

<package_name>.xarray.<accessor_class_name>.<accessor_method_or_property> (in the case where the accessor class lives in the package's xarray.py file)

But I don't know how to manage that with the Sphinx documentation.

  Recommendations for domain-specific accessor documentation 500949040
537018647 https://github.com/pydata/xarray/issues/2699#issuecomment-537018647 https://api.github.com/repos/pydata/xarray/issues/2699 MDEyOklzc3VlQ29tbWVudDUzNzAxODY0Nw== gmaze 1956032 2019-10-01T12:43:46Z 2019-10-01T12:43:46Z CONTRIBUTOR

I also recently encountered this bug, and with no user warning it took me a while to identify its origin. I'll use this temporary fix; thanks. (A sketch of this kind of workaround follows.)

  bfill behavior dask arrays with small chunk size 402413097
347108958 https://github.com/pydata/xarray/issues/1732#issuecomment-347108958 https://api.github.com/repos/pydata/xarray/issues/1732 MDEyOklzc3VlQ29tbWVudDM0NzEwODk1OA== gmaze 1956032 2017-11-27T08:21:15Z 2017-11-27T08:21:15Z CONTRIBUTOR

The scipy backend has proven to be a good alternative for now; if not, I'll write a workaround. Thanks for your help! (A minimal sketch of the backend switch follows.)

  IndexError when printing dataset from an Argo file 275744315
346066403 https://github.com/pydata/xarray/issues/1732#issuecomment-346066403 https://api.github.com/repos/pydata/xarray/issues/1732 MDEyOklzc3VlQ29tbWVudDM0NjA2NjQwMw== gmaze 1956032 2017-11-21T15:42:02Z 2017-11-21T16:01:04Z CONTRIBUTOR

Sorry guys, just found out that the issue is still going on with some of the variables in the dataset:

It works OK for temperature TEMP, for instance:

```python
ds = xr.open_dataset(argofile, autoclose=True, decode_cf=True)
ds['TEMP']
Out[89]:
<xarray.DataArray 'TEMP' (N_PROF: 338, N_LEVELS: 51)>
array([[ 27.393   ,  27.392   ,  27.393   , ...,   3.597   ,   3.34    ,         nan],
       [ 27.57    ,  27.572001,  27.570999, ...,   3.543   ,   3.265   ,         nan],
       [ 28.094999,  28.091999,  28.096001, ...,   3.544   ,   3.287   ,         nan],
       ...,
       [ 27.157   ,  27.156   ,  27.159   , ...,   3.318   ,         nan,         nan],
       [ 27.608999,  27.610001,  27.608999, ...,   3.419   ,         nan,         nan],
       [ 27.569   ,  27.566999,  27.561001, ...,   3.422   ,         nan,         nan]])
Dimensions without coordinates: N_PROF, N_LEVELS
Attributes:
    long_name:       Sea temperature in-situ ITS-90 scale
    standard_name:   sea_water_temperature
    units:           degree_Celsius
    valid_min:       -2.5
    valid_max:       40.0
    C_format:        %9.3f
    FORTRAN_format:  F9.3
    resolution:      0.001
```

but for the variable "HISTORY_STEP", I get the error:

```python
ds['HISTORY_STEP']
Out[90]:
Traceback (most recent call last):
  File "/Users/gmaze/anaconda/envs/obidam/lib/python2.7/site-packages/IPython/core/formatters.py", line 190, in catch_format_error
    r = method(self, *args, **kwargs)
  File "/Users/gmaze/anaconda/envs/obidam/lib/python2.7/site-packages/IPython/core/formatters.py", line 672, in __call__
    printer.pretty(obj)
  File "/Users/gmaze/anaconda/envs/obidam/lib/python2.7/site-packages/IPython/lib/pretty.py", line 383, in pretty
    return _default_pprint(obj, self, cycle)
  File "/Users/gmaze/anaconda/envs/obidam/lib/python2.7/site-packages/IPython/lib/pretty.py", line 503, in _default_pprint
    _repr_pprint(obj, p, cycle)
  File "/Users/gmaze/anaconda/envs/obidam/lib/python2.7/site-packages/IPython/lib/pretty.py", line 701, in _repr_pprint
    output = repr(obj)
  File "/Users/gmaze/anaconda/envs/obidam/lib/python2.7/site-packages/xarray/core/common.py", line 100, in __repr__
    return formatting.array_repr(self)
  File "/Users/gmaze/anaconda/envs/obidam/lib/python2.7/site-packages/xarray/core/formatting.py", line 393, in array_repr
    summary.append(short_array_repr(arr.values))
  File "/Users/gmaze/anaconda/envs/obidam/lib/python2.7/site-packages/xarray/core/dataarray.py", line 412, in values
    return self.variable.values
  File "/Users/gmaze/anaconda/envs/obidam/lib/python2.7/site-packages/xarray/core/variable.py", line 396, in values
    return _as_array_or_item(self._data)
  File "/Users/gmaze/anaconda/envs/obidam/lib/python2.7/site-packages/xarray/core/variable.py", line 217, in _as_array_or_item
    data = np.asarray(data)
  File "/Users/gmaze/anaconda/envs/obidam/lib/python2.7/site-packages/numpy/core/numeric.py", line 482, in asarray
    return array(a, dtype, copy=False, order=order)
  File "/Users/gmaze/anaconda/envs/obidam/lib/python2.7/site-packages/xarray/core/indexing.py", line 557, in __array__
    self._ensure_cached()
  File "/Users/gmaze/anaconda/envs/obidam/lib/python2.7/site-packages/xarray/core/indexing.py", line 554, in _ensure_cached
    self.array = NumpyIndexingAdapter(np.asarray(self.array))
  File "/Users/gmaze/anaconda/envs/obidam/lib/python2.7/site-packages/numpy/core/numeric.py", line 482, in asarray
    return array(a, dtype, copy=False, order=order)
  File "/Users/gmaze/anaconda/envs/obidam/lib/python2.7/site-packages/xarray/core/indexing.py", line 538, in __array__
    return np.asarray(self.array, dtype=dtype)
  File "/Users/gmaze/anaconda/envs/obidam/lib/python2.7/site-packages/numpy/core/numeric.py", line 482, in asarray
    return array(a, dtype, copy=False, order=order)
  File "/Users/gmaze/anaconda/envs/obidam/lib/python2.7/site-packages/xarray/core/indexing.py", line 505, in __array__
    return np.asarray(array[self.key], dtype=None)
  File "/Users/gmaze/anaconda/envs/obidam/lib/python2.7/site-packages/xarray/conventions.py", line 388, in __getitem__
    return mask_and_scale(self.array[key], self.fill_value,
  File "/Users/gmaze/anaconda/envs/obidam/lib/python2.7/site-packages/xarray/conventions.py", line 498, in __getitem__
    return char_to_bytes(self.array[key])
  File "/Users/gmaze/anaconda/envs/obidam/lib/python2.7/site-packages/xarray/conventions.py", line 640, in char_to_bytes
    arr = np.array(arr, copy=False, order='C')
  File "/Users/gmaze/anaconda/envs/obidam/lib/python2.7/site-packages/xarray/core/indexing.py", line 505, in __array__
    return np.asarray(array[self.key], dtype=None)
  File "/Users/gmaze/anaconda/envs/obidam/lib/python2.7/site-packages/xarray/backends/netCDF4_.py", line 72, in __getitem__
    raise IndexError(msg)
IndexError: The indexing operation you are attempting to perform is not valid on netCDF4.Variable object. Try loading your data into memory first by calling .load().

Original traceback:
Traceback (most recent call last):
  File "/Users/gmaze/anaconda/envs/obidam/lib/python2.7/site-packages/xarray/backends/netCDF4_.py", line 61, in __getitem__
    data = getitem(self.get_array(), key)
  File "netCDF4/_netCDF4.pyx", line 3961, in netCDF4._netCDF4.Variable.__getitem__
  File "netCDF4/_netCDF4.pyx", line 4796, in netCDF4._netCDF4.Variable._get
IndexError
```

The new state of the versions:

```python
INSTALLED VERSIONS
------------------
commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None

xarray: 0.10.0
pandas: 0.21.0
numpy: 1.11.3
scipy: 0.18.1
netCDF4: 1.3.1
h5netcdf: 0.3.1
Nio: None
bottleneck: 1.2.0
cyordereddict: 1.0.0
dask: 0.16.0
matplotlib: 1.5.3
cartopy: 0.15.1
seaborn: 0.7.1
setuptools: 36.5.0
pip: 9.0.1
conda: None
pytest: None
IPython: 5.2.2
sphinx: 1.5.2
```

  IndexError when printing dataset from an Argo file 275744315
346058667 https://github.com/pydata/xarray/issues/1732#issuecomment-346058667 https://api.github.com/repos/pydata/xarray/issues/1732 MDEyOklzc3VlQ29tbWVudDM0NjA1ODY2Nw== gmaze 1956032 2017-11-21T15:18:37Z 2017-11-21T15:18:37Z CONTRIBUTOR

Ok, upgrading to 0.10.0 solves the issue! Thanks. I should have tried that in the first place.

  IndexError when printing dataset from an Argo file 275744315
340008407 https://github.com/pydata/xarray/issues/1662#issuecomment-340008407 https://api.github.com/repos/pydata/xarray/issues/1662 MDEyOklzc3VlQ29tbWVudDM0MDAwODQwNw== gmaze 1956032 2017-10-27T15:44:11Z 2017-10-27T15:44:11Z CONTRIBUTOR

Note that if xarray's decode_cf is given a NaT in a datetime64, it works:

```python
attrs = {'units': 'days since 1950-01-01 00:00:00 UTC'}  # Classic Argo data Julian Day reference
jd = [24658.46875, 24658.46366898, 24658.47256944, np.NaN]  # Sample

def dirtyfixNaNjd(ref, day):
    td = pd.NaT
    if not np.isnan(day):
        td = pd.Timedelta(days=day)
    return pd.Timestamp(ref) + td

jd = [dirtyfixNaNjd('1950-01-01', day) for day in jd]
print jd
```

```python
[Timestamp('2017-07-06 11:15:00'), Timestamp('2017-07-06 11:07:40.999872'), Timestamp('2017-07-06 11:20:29.999616'), NaT]
```

then:

```python
ds = xr.Dataset({'time': ('time', jd, {'units': 'ns'})})  # Update the units attribute appropriately
ds = xr.decode_cf(ds)
print ds['time'].values
```

```python
['2017-07-06T11:15:00.000000000' '2017-07-06T11:07:40.999872000'
 '2017-07-06T11:20:29.999616000' 'NaT']
```

  Decoding time according to CF conventions raises error if a NaN is found 268725471
339991529 https://github.com/pydata/xarray/issues/1662#issuecomment-339991529 https://api.github.com/repos/pydata/xarray/issues/1662 MDEyOklzc3VlQ29tbWVudDMzOTk5MTUyOQ== gmaze 1956032 2017-10-27T14:42:56Z 2017-10-27T14:42:56Z CONTRIBUTOR

Hi Ryan, I've never been far away, following/promoting xarray around here, and congrats on Pangeo!

Ok, I get that the datatype is wrong, but about the issue coming from pandas' TimedeltaIndex: does this mean that a quick/dirty fix would be to decode value by value rather than on a vector?

  Decoding time according to CF conventions raises error if a NaN is found 268725471


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);