issue_comments

8 rows where author_association = "CONTRIBUTOR" and user = 1325771 sorted by updated_at descending

issue 4

  • Automate interpretation of _Unsigned attribute 4
  • Best practice when the _Unsigned attribute is present in NetCDF files 2
  • Cannot open NetCDF file if dimension with time coordinate has length 0 (`ValueError` when decoding CF datetime) 1
  • open_mfdataset memory leak, very simple case. v0.12 1

user 1

  • deeplycloudy · 8 ✖

author_association 1

  • CONTRIBUTOR · 8 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1416446874 https://github.com/pydata/xarray/issues/3200#issuecomment-1416446874 https://api.github.com/repos/pydata/xarray/issues/3200 IC_kwDOAMm_X85UbUOa deeplycloudy 1325771 2023-02-03T21:52:57Z 2023-02-03T21:52:57Z CONTRIBUTOR

I was iterating today over a large dataset loaded with open_mfdataset, and had been observing memory usage growing from 2GB to 8GB+.

I can confirm that xr.set_options(file_cache_maxsize=1) kept memory use at a steady 2GB, properly releasing memory.

libnetcdf    4.8.1       nompi_h261ec11_106        conda-forge
netcdf4      1.6.0       nompi_py310h0a86a1f_103   conda-forge
xarray       2023.1.0    pyhd8ed1ab_0              conda-forge
dask         2023.1.0    pyhd8ed1ab_0              conda-forge
dask-core    2023.1.0    pyhd8ed1ab_0              conda-forge
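
Why a cache size of 1 would keep memory flat can be illustrated with a toy least-recently-used cache. This is only a sketch of the general idea, assuming each open file handle holds on to some memory until it is closed; `FakeFile` and `TinyFileCache` are hypothetical names and this is not xarray's implementation.

```python
from collections import OrderedDict


class FakeFile:
    """Hypothetical stand-in for an open file handle holding a buffer."""

    def __init__(self, name):
        self.name = name
        self.closed = False

    def close(self):
        self.closed = True


class TinyFileCache:
    """Keep at most `maxsize` files open, closing the least recently used."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self._files = OrderedDict()

    def open(self, name):
        if name in self._files:
            self._files.move_to_end(name)  # mark as most recently used
            return self._files[name]
        handle = FakeFile(name)
        self._files[name] = handle
        while len(self._files) > self.maxsize:
            _, oldest = self._files.popitem(last=False)
            oldest.close()  # an evicted handle releases its resources
        return handle

    def open_count(self):
        return len(self._files)


cache = TinyFileCache(maxsize=1)
handles = [cache.open(f"file_{i}.nc") for i in range(5)]
print(cache.open_count())           # files still held open
print([h.closed for h in handles])  # which handles were closed on eviction
```

With `maxsize=1`, iterating over many files leaves only the most recent one open, which is consistent with the steady memory use reported above.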

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset memory leak, very simple case. v0.12 479190812
491952681 https://github.com/pydata/xarray/issues/1329#issuecomment-491952681 https://api.github.com/repos/pydata/xarray/issues/1329 MDEyOklzc3VlQ29tbWVudDQ5MTk1MjY4MQ== deeplycloudy 1325771 2019-05-13T19:23:16Z 2019-05-13T19:23:16Z CONTRIBUTOR

I ran into this issue with a file from the GOES-17 lightning mapper.

A simple script to reproduce is:

    d = xr.open_dataset('OR_GLM-L2-LCFA_G17_s20191110644200_e20191110644370_c20191110645086.nc')
    d.load()

giving the error

```
----> 1 d.load()

~/anaconda/envs/isatss/lib/python3.7/site-packages/xarray/core/dataset.py in load(self, **kwargs)
    516         for k, v in self.variables.items():
    517             if k not in lazy_data:
--> 518                 v.load()
    519
    520         return self

~/anaconda/envs/isatss/lib/python3.7/site-packages/xarray/core/variable.py in load(self, **kwargs)
    325             self._data = as_compatible_data(self._data.compute(**kwargs))
    326         elif not isinstance(self._data, np.ndarray):
--> 327             self._data = np.asarray(self._data)
    328         return self
    329

~/anaconda/envs/isatss/lib/python3.7/site-packages/numpy/core/numeric.py in asarray(a, dtype, order)
    499
    500     """
--> 501     return array(a, dtype, copy=False, order=order)
    502
    503

~/anaconda/envs/isatss/lib/python3.7/site-packages/xarray/core/indexing.py in __array__(self, dtype)
    624
    625     def __array__(self, dtype=None):
--> 626         self._ensure_cached()
    627         return np.asarray(self.array, dtype=dtype)
    628

~/anaconda/envs/isatss/lib/python3.7/site-packages/xarray/core/indexing.py in _ensure_cached(self)
    621     def _ensure_cached(self):
    622         if not isinstance(self.array, NumpyIndexingAdapter):
--> 623             self.array = NumpyIndexingAdapter(np.asarray(self.array))
    624
    625     def __array__(self, dtype=None):

~/anaconda/envs/isatss/lib/python3.7/site-packages/numpy/core/numeric.py in asarray(a, dtype, order)
    499
    500     """
--> 501     return array(a, dtype, copy=False, order=order)
    502
    503

~/anaconda/envs/isatss/lib/python3.7/site-packages/xarray/core/indexing.py in __array__(self, dtype)
    602
    603     def __array__(self, dtype=None):
--> 604         return np.asarray(self.array, dtype=dtype)
    605
    606     def __getitem__(self, key):

~/anaconda/envs/isatss/lib/python3.7/site-packages/numpy/core/numeric.py in asarray(a, dtype, order)
    499
    500     """
--> 501     return array(a, dtype, copy=False, order=order)
    502
    503

~/anaconda/envs/isatss/lib/python3.7/site-packages/xarray/core/indexing.py in __array__(self, dtype)
    508     def __array__(self, dtype=None):
    509         array = as_indexable(self.array)
--> 510         return np.asarray(array[self.key], dtype=None)
    511
    512     def transpose(self, order):

~/anaconda/envs/isatss/lib/python3.7/site-packages/numpy/core/numeric.py in asarray(a, dtype, order)
    499
    500     """
--> 501     return array(a, dtype, copy=False, order=order)
    502
    503

~/anaconda/envs/isatss/lib/python3.7/site-packages/xarray/coding/variables.py in __array__(self, dtype)
     66
     67     def __array__(self, dtype=None):
---> 68         return self.func(self.array)
     69
     70     def __repr__(self):

~/anaconda/envs/isatss/lib/python3.7/site-packages/xarray/coding/times.py in decode_cf_datetime(num_dates, units, calendar, use_cftime)
    174         try:
    175             dates = _decode_datetime_with_pandas(flat_num_dates, units,
--> 176                                                  calendar)
    177         except (OutOfBoundsDatetime, OverflowError):
    178             dates = _decode_datetime_with_cftime(

~/anaconda/envs/isatss/lib/python3.7/site-packages/xarray/coding/times.py in _decode_datetime_with_pandas(flat_num_dates, units, calendar)
    139         warnings.filterwarnings('ignore', 'invalid value encountered',
    140                                 RuntimeWarning)
--> 141         pd.to_timedelta(flat_num_dates.min(), delta) + ref_date
    142         pd.to_timedelta(flat_num_dates.max(), delta) + ref_date
    143

~/anaconda/envs/isatss/lib/python3.7/site-packages/numpy/core/_methods.py in _amin(a, axis, out, keepdims, initial)
     30 def _amin(a, axis=None, out=None, keepdims=False,
     31           initial=_NoValue):
---> 32     return umr_minimum(a, axis, None, out, keepdims, initial)
     33
     34 def _sum(a, axis=None, dtype=None, out=None, keepdims=False,

ValueError: zero-size array to reduction operation minimum which has no identity
```

Versions: xarray = 0.12.1, pandas = 0.24.1
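
The failure in the traceback comes from taking the minimum of an empty array before decoding. A minimal sketch of one way to guard against that, in plain Python with the standard library only, is below. `decode_times` is a hypothetical helper written for illustration; it is not xarray's code and not necessarily how the issue was fixed.

```python
from datetime import datetime, timedelta


def decode_times(offsets, ref_date, unit="seconds"):
    """Convert numeric offsets from `ref_date` into datetimes.

    An empty input returns an empty list instead of failing in min()/max().
    """
    if len(offsets) == 0:
        return []  # nothing to range-check, nothing to decode
    # Range-check the extremes first, mirroring the min()/max() calls in
    # the traceback; the addition raises if the result is out of range.
    for extreme in (min(offsets), max(offsets)):
        ref_date + timedelta(**{unit: extreme})
    return [ref_date + timedelta(**{unit: value}) for value in offsets]


ref = datetime(2000, 1, 1, 12)
empty_result = decode_times([], ref)      # zero-length time dimension
two_values = decode_times([0, 60], ref)   # ordinary case
print(empty_result, two_values)
```

Python's built-in `min([])` raises a `ValueError` in the same way, so the early return is what makes the zero-length case safe here.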

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Cannot open NetCDF file if dimension with time coordinate has length 0 (`ValueError` when decoding CF datetime) 217216935
315620355 https://github.com/pydata/xarray/pull/1453#issuecomment-315620355 https://api.github.com/repos/pydata/xarray/issues/1453 MDEyOklzc3VlQ29tbWVudDMxNTYyMDM1NQ== deeplycloudy 1325771 2017-07-16T16:29:10Z 2017-07-16T16:29:10Z CONTRIBUTOR

Tests now pass after I realized I wasn't converting the _FillValue to unsigned.

I also turned off PyNIO's internal support for masking, in keeping with the philosophy that xarray should only use the backends to retrieve the bytes as represented on disk.

Note that some of the CI builds are skipping most of their tests (e.g., py=3.4; you can tell by the run time). This is a problem in other PRs as well.
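
To illustrate why the fill value also has to be converted, here is a small sketch in plain Python. It assumes 16-bit two's-complement storage; `to_unsigned` and `decode_unsigned` are hypothetical helpers for illustration, not the code in the PR.

```python
def to_unsigned(value, bits=16):
    """Reinterpret a two's-complement signed integer as unsigned."""
    return value % (1 << bits)


def decode_unsigned(raw, fill_value, bits=16):
    """Return unsigned values, with None wherever the fill value occurs."""
    # The fill value is stored as a signed number, so it must be converted
    # too; otherwise it never matches the converted data.
    unsigned_fill = to_unsigned(fill_value, bits)
    decoded = []
    for value in raw:
        unsigned = to_unsigned(value, bits)
        decoded.append(None if unsigned == unsigned_fill else unsigned)
    return decoded


raw = [0, 1, -32768, -2, -1]  # assumed int16 values as stored on disk
decoded = decode_unsigned(raw, fill_value=-1)
print(decoded)  # → [0, 1, 32768, 65534, None]
```

If the comparison used the signed fill value `-1` against the converted data, the last element would come through as 65535 instead of being masked.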

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Automate interpretation of _Unsigned attribute 235761029
309848945 https://github.com/pydata/xarray/issues/1444#issuecomment-309848945 https://api.github.com/repos/pydata/xarray/issues/1444 MDEyOklzc3VlQ29tbWVudDMwOTg0ODk0NQ== deeplycloudy 1325771 2017-06-20T18:35:11Z 2017-06-20T18:35:23Z CONTRIBUTOR

I'm dropping a quick note here to flag that PR #1453 has been written to address this issue. I have used the draft PR in earnest as part of some ongoing analyses of the data that led me to raise the issue.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Best practice when the _Unsigned attribute is present in NetCDF files 233992696
309767695 https://github.com/pydata/xarray/pull/1453#issuecomment-309767695 https://api.github.com/repos/pydata/xarray/issues/1453 MDEyOklzc3VlQ29tbWVudDMwOTc2NzY5NQ== deeplycloudy 1325771 2017-06-20T14:11:36Z 2017-06-20T14:11:36Z CONTRIBUTOR

The CI failure is for 2.7/cdat/pynio in a couple of my new tests. In one, the fill value is not being applied, while in the other the unsigned conversion isn't happening. Are there any known differences in that cdat/pynio stack that would cause these to fail while others pass?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Automate interpretation of _Unsigned attribute 235761029
309623307 https://github.com/pydata/xarray/pull/1453#issuecomment-309623307 https://api.github.com/repos/pydata/xarray/issues/1453 MDEyOklzc3VlQ29tbWVudDMwOTYyMzMwNw== deeplycloudy 1325771 2017-06-20T02:03:12Z 2017-06-20T02:03:12Z CONTRIBUTOR

I've created a new UnsignedIntTypeArray and have separated the logic from mask_and_scale. Lint has been removed and docs updated.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Automate interpretation of _Unsigned attribute 235761029
308326393 https://github.com/pydata/xarray/pull/1453#issuecomment-308326393 https://api.github.com/repos/pydata/xarray/issues/1453 MDEyOklzc3VlQ29tbWVudDMwODMyNjM5Mw== deeplycloudy 1325771 2017-06-14T05:49:52Z 2017-06-14T05:49:52Z CONTRIBUTOR

In addition to the included (basic) test I've also tested this with the real-world data that motivated the PR and #1444. While it's a working draft, I'd welcome comments on the basic approach and appropriateness of the test coverage.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Automate interpretation of _Unsigned attribute 235761029
306848675 https://github.com/pydata/xarray/issues/1444#issuecomment-306848675 https://api.github.com/repos/pydata/xarray/issues/1444 MDEyOklzc3VlQ29tbWVudDMwNjg0ODY3NQ== deeplycloudy 1325771 2017-06-07T16:25:01Z 2017-06-07T16:29:33Z CONTRIBUTOR

Ah, I see the distinction concerning the xarray implementation being made in the earlier discussion. There certainly are some tradeoffs being made by the data provider, and I'm far enough removed from those decisions that there's not much I can do.

Thanks for the clarification on the background discussion and the willingness to consider support for the attribute. It's my first time dealing with xarray internals, but given that there's a reference implementation in NetCDF4-python, I'm willing to have a go at a PR for this.

If I had to take a guess, it belongs in conventions.py, fitting with the logic beginning at line 789 where mask_and_scale is handled. It could be built into the MaskedAndScaledArray class, though perhaps a new UnsignedIntArray class might be preferred?

The docstrings note that the implementation is designed around CF-style data. As noted above, this attribute is outside the CF conventions, so I might note that exception in a comment in the code.
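
The separate-class idea can be sketched as a thin wrapper that reinterprets values only when they are indexed. This is an illustration in plain Python assuming 8-bit storage; `UnsignedView` is a hypothetical name and does not reflect xarray's array-wrapper interface.

```python
class UnsignedView:
    """Hypothetical lazy wrapper: reinterpret signed values on indexing."""

    def __init__(self, signed_values, bits=8):
        self._signed = signed_values
        self._modulus = 1 << bits

    def __len__(self):
        return len(self._signed)

    def __getitem__(self, key):
        item = self._signed[key]
        if isinstance(key, slice):
            # Convert each element of the requested slice.
            return [value % self._modulus for value in item]
        return item % self._modulus


stored = [0, 127, -128, -1]  # assumed int8 values as read from disk
view = UnsignedView(stored, bits=8)
print(view[2], view[1:])  # → 128 [127, 128, 255]
```

Keeping the conversion in its own wrapper leaves masking and scaling logic untouched, which is the trade-off raised in the comment above.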

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Best practice when the _Unsigned attribute is present in NetCDF files 233992696

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
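
The filter described at the top of the page can be reproduced against this schema with Python's built-in `sqlite3` module. The sketch below is an assumption-laden illustration: it uses an in-memory database, drops the `REFERENCES` clauses because the `users` and `issues` tables are not shown here, and loads only three of the eight rows with a subset of columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE [issue_comments] (
       [html_url] TEXT,
       [issue_url] TEXT,
       [id] INTEGER PRIMARY KEY,
       [node_id] TEXT,
       [user] INTEGER,
       [created_at] TEXT,
       [updated_at] TEXT,
       [author_association] TEXT,
       [body] TEXT,
       [reactions] TEXT,
       [performed_via_github_app] TEXT,
       [issue] INTEGER
    );
    CREATE INDEX [idx_issue_comments_user]
        ON [issue_comments] ([user]);
    """
)

# A few rows taken from the table above: id, user, updated_at,
# author_association, issue.
rows = [
    (306848675, 1325771, "2017-06-07T16:29:33Z", "CONTRIBUTOR", 233992696),
    (1416446874, 1325771, "2023-02-03T21:52:57Z", "CONTRIBUTOR", 479190812),
    (491952681, 1325771, "2019-05-13T19:23:16Z", "CONTRIBUTOR", 217216935),
]
conn.executemany(
    "INSERT INTO issue_comments "
    "([id], [user], [updated_at], [author_association], [issue]) "
    "VALUES (?, ?, ?, ?, ?)",
    rows,
)

# ISO 8601 timestamps stored as TEXT sort correctly as plain strings.
query = """
    SELECT [id], [updated_at] FROM issue_comments
    WHERE [author_association] = ? AND [user] = ?
    ORDER BY [updated_at] DESC
"""
result = conn.execute(query, ("CONTRIBUTOR", 1325771)).fetchall()
print([row[0] for row in result])
```

The newest comment comes first, matching the "sorted by updated_at descending" ordering of the page.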
Powered by Datasette · About: xarray-datasette