issue_comments

16 rows where author_association = "CONTRIBUTOR" and user = 10512793 sorted by updated_at descending

Columns: id · html_url · issue_url · node_id · user · created_at · updated_at ▲ · author_association · body · reactions · performed_via_github_app · issue
359708569 https://github.com/pydata/xarray/pull/1848#issuecomment-359708569 https://api.github.com/repos/pydata/xarray/issues/1848 MDEyOklzc3VlQ29tbWVudDM1OTcwODU2OQ== braaannigan 10512793 2018-01-23T08:07:40Z 2018-01-23T08:07:40Z CONTRIBUTOR

Added the extra underline dots.

  added apply_ufunc example to toy weather data 290396804
359366336 https://github.com/pydata/xarray/issues/1844#issuecomment-359366336 https://api.github.com/repos/pydata/xarray/issues/1844 MDEyOklzc3VlQ29tbWVudDM1OTM2NjMzNg== braaannigan 10512793 2018-01-22T09:21:56Z 2018-01-22T09:21:56Z CONTRIBUTOR

Example for the docs proposed here: https://github.com/pydata/xarray/pull/1848

  How to broadcast along dayofyear 290023410
358926430 https://github.com/pydata/xarray/pull/1835#issuecomment-358926430 https://api.github.com/repos/pydata/xarray/issues/1835 MDEyOklzc3VlQ29tbWVudDM1ODkyNjQzMA== braaannigan 10512793 2018-01-19T10:26:17Z 2018-01-19T10:26:17Z CONTRIBUTOR

I've pushed an updated version with a successful test on my machine.

I've modified the dataset name in the test case to ensure the file path has exactly 88 characters (of course, for this to work the file path in the GitHub checkout has to have fewer than 88 characters, which seems like a safe assumption).

I've made some formatting changes to netCDF4_.py to pass the pyflakes checks. I've moved the import statement for LooseVersion to the start of the script.

Thanks for the reviews and advice, @shoyer.
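A minimal sketch of that padding, assuming a hypothetical helper (the actual test is in test_backends.py and may differ):

```python
import os

def make_88_char_path(tmpdir):
    # Pad a .nc filename so the full path is exactly 88 characters,
    # the length that triggers the netCDF4 1.2.4 filepath() bug.
    prefix = os.path.join(str(tmpdir), '')  # tmpdir plus trailing separator
    pad = 88 - len(prefix) - len('.nc')
    assert pad > 0, 'base path too long to pad the total up to 88 characters'
    return prefix + 'x' * pad + '.nc'
```

The test can then write a dataset to this path and assert that opening it emits the warning.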

  Add warning for netCDF4 bug 288957853
358336467 https://github.com/pydata/xarray/pull/1835#issuecomment-358336467 https://api.github.com/repos/pydata/xarray/issues/1835 MDEyOklzc3VlQ29tbWVudDM1ODMzNjQ2Nw== braaannigan 10512793 2018-01-17T15:18:21Z 2018-01-17T15:18:21Z CONTRIBUTOR

There are some other small issues I need to deal with; I'll leave a message when I've sorted those out.

  Add warning for netCDF4 bug 288957853
358333272 https://github.com/pydata/xarray/pull/1835#issuecomment-358333272 https://api.github.com/repos/pydata/xarray/issues/1835 MDEyOklzc3VlQ29tbWVudDM1ODMzMzI3Mg== braaannigan 10512793 2018-01-17T15:08:17Z 2018-01-17T15:08:17Z CONTRIBUTOR

Thanks for the advice @shoyer

I've moved the warning to NetCDF4DataStore.open() and reverted backends/api.py back to the original.

I've used LooseVersion for the version comparison and made the pep8 change.

There is a problem with the test. I've added the test to NetCDF4DataTest. However, when the filename has 88 characters the seg fault occurs and the test gets stopped. The output from $ py.test tests/test_backends.py is just:

```
============================= test session starts ==============================
platform linux -- Python 3.6.2, pytest-3.2.1, py-1.4.34, pluggy-0.4.0
rootdir: /my_python_path/xarray, inifile: pytest.ini
plugins: pylama-7.4.3
collected 1042 items

tests/test_backends.py . *** Error in `/path_to/python': double free or corruption (!prev): *** Aborted (core dumped)
```

If the filename doesn't have 88 characters then the test result is "Failed: DID NOT WARN.", which is fine as it shouldn't trigger the warning.

I've been trying to see how people handle this issue, but haven't been able to find examples of people trying to trigger seg faults in a test yet.
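For the record, one generic way to exercise a potential segfault without killing the whole test session is to run the crashing call in a subprocess. This is only a sketch of that pattern (the test name and structure are mine, not the PR's), assuming pytest's tmp_path fixture:

```python
import subprocess
import sys

import xarray as xr

def test_open_88_char_path(tmp_path):
    # Pad the filename so the absolute path is exactly 88 characters long.
    pad = 88 - len(str(tmp_path)) - 1 - len('.nc')  # 1 for the separator
    assert pad > 0, 'base temp path too long to pad up to 88 characters'
    path = str(tmp_path / ('x' * pad + '.nc'))
    xr.Dataset({'v': ('x', [1, 2, 3])}).to_netcdf(path)

    # Open the file in a child process so a segfault aborts only the child,
    # not the whole test session.
    code = 'import xarray as xr; xr.open_dataset({!r}).load()'.format(path)
    result = subprocess.run([sys.executable, '-c', code])

    # A crash surfaces as a negative return code (-SIGSEGV or -SIGABRT);
    # with a fixed netCDF4 the child exits cleanly.
    assert result.returncode == 0
```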

  Add warning for netCDF4 bug 288957853
356255726 https://github.com/pydata/xarray/issues/1745#issuecomment-356255726 https://api.github.com/repos/pydata/xarray/issues/1745 MDEyOklzc3VlQ29tbWVudDM1NjI1NTcyNg== braaannigan 10512793 2018-01-09T11:17:12Z 2018-01-09T11:17:12Z CONTRIBUTOR

Hi @shoyer

Updating netcdf4 to version 1.3.1 solves the problem. I'm trying to think what the potential solutions are. Essentially, we would need to modify the function ds.filepath(). However, this isn't possible inside xarray.

Is there anything we can do other than add a warning message with the recommendation to upgrade netcdf4 when the file path has 88 characters and netcdf4 is version 1.2.4?

Should we also submit an issue to anaconda to get the default package updated?

Happy to prepare these if you think it's the best way to proceed.

Liam
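A minimal sketch of the check being proposed here, assuming a standalone helper (the real warning ended up in the netCDF4 backend, and the version bound and message wording are illustrative):

```python
import warnings
from distutils.version import LooseVersion

import netCDF4

def maybe_warn_88_char_path(filename):
    # The thread reports segfaults with netCDF4 1.2.4 when the file path
    # is exactly 88 characters long; 1.3.1 is reported to fix it.
    if len(filename) == 88 and LooseVersion(netCDF4.__version__) < '1.3.1':
        warnings.warn('A segmentation fault may occur when the file path '
                      'has exactly 88 characters; upgrading netCDF4 to '
                      '1.3.1 or newer avoids it.')
```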

  open_mfdataset() memory error in v0.10 277538485
351944694 https://github.com/pydata/xarray/issues/1745#issuecomment-351944694 https://api.github.com/repos/pydata/xarray/issues/1745 MDEyOklzc3VlQ29tbWVudDM1MTk0NDY5NA== braaannigan 10512793 2017-12-15T08:30:57Z 2017-12-15T08:30:57Z CONTRIBUTOR

Hi @shoyer, I've tried the print(ds.filepath()) suggestion and it reproduces the segfault when I use the full-length file path, which has 88 characters.
Again, the segfault doesn't arise if I add or subtract a character from the file path (after copying the underlying file to a new name).

This dependence on 88 characters is consistent with the bug here: https://github.com/Unidata/netcdf4-python/issues/585
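In code form, the reproducer this implies (the path below is illustrative; the real one is exactly 88 characters):

```python
import netCDF4

# Open a netCDF file whose absolute path is exactly 88 characters long.
ds = netCDF4.Dataset('/some/absolute/path/padded/to/exactly/88/characters/grid.nc')
print(ds.filepath())  # segfaults under netCDF4 1.2.4; fine in 1.3.1
```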

  open_mfdataset() memory error in v0.10 277538485
351786510 https://github.com/pydata/xarray/issues/1745#issuecomment-351786510 https://api.github.com/repos/pydata/xarray/issues/1745 MDEyOklzc3VlQ29tbWVudDM1MTc4NjUxMA== braaannigan 10512793 2017-12-14T17:51:11Z 2017-12-14T17:51:11Z CONTRIBUTOR

Interesting. I've tried to look at this a bit more by running, in netCDF4_.py:

```python
self._filename = self.ds.filepath()
print(self.ds)
self.is_remote = is_remote_uri(self._filename)
```

So all I did was add a print statement, print(self.ds).

In this case the open_dataset call worked fine.

  open_mfdataset() memory error in v0.10 277538485
351782302 https://github.com/pydata/xarray/issues/1745#issuecomment-351782302 https://api.github.com/repos/pydata/xarray/issues/1745 MDEyOklzc3VlQ29tbWVudDM1MTc4MjMwMg== braaannigan 10512793 2017-12-14T17:35:24Z 2017-12-14T17:35:24Z CONTRIBUTOR

I've also now tried out the re.match approach you suggest above, but it generates the same core dump as the re.search('^...') approach.

  open_mfdataset() memory error in v0.10 277538485
351781405 https://github.com/pydata/xarray/issues/1745#issuecomment-351781405 https://api.github.com/repos/pydata/xarray/issues/1745 MDEyOklzc3VlQ29tbWVudDM1MTc4MTQwNQ== braaannigan 10512793 2017-12-14T17:32:04Z 2017-12-14T17:32:04Z CONTRIBUTOR

With print(repr(path)) I get:

```
'grid.nc'
'/path/verification/cabl/y2d/mnc_test_0008_1day_restoring/grid.nc'
'/path/verification/cabl/y2d/mnc_test_0008_1day_restoring/grid.nc'
```

where I've changed the first part of the filename to "/path/".

  open_mfdataset() memory error in v0.10 277538485
351774833 https://github.com/pydata/xarray/issues/1745#issuecomment-351774833 https://api.github.com/repos/pydata/xarray/issues/1745 MDEyOklzc3VlQ29tbWVudDM1MTc3NDgzMw== braaannigan 10512793 2017-12-14T17:07:49Z 2017-12-14T17:07:49Z CONTRIBUTOR

If the ^ isn't strictly necessary, I'm happy to put together a PR with it removed.

  open_mfdataset() memory error in v0.10 277538485
351768893 https://github.com/pydata/xarray/issues/1745#issuecomment-351768893 https://api.github.com/repos/pydata/xarray/issues/1745 MDEyOklzc3VlQ29tbWVudDM1MTc2ODg5Mw== braaannigan 10512793 2017-12-14T16:50:43Z 2017-12-14T16:50:43Z CONTRIBUTOR

Hi @shoyer

The crash does not occur when the ^ is removed.

When I run python -c 'import sys; print(sys.getfilesystemencoding())' the output is: utf-8

The file loads with the scipy engine. I get a module import error with h5netcdf, even though conda list shows that I have version 0.5 installed.

xr.show_versions() gives:

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-101-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

xarray: 0.10.0
pandas: 0.21.0
numpy: 1.13.1
scipy: 0.19.1
netCDF4: 1.2.4
h5netcdf: None
Nio: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.16.0
matplotlib: 2.0.2
cartopy: None
seaborn: 0.7.1
setuptools: 36.7.1
pip: 9.0.1
conda: None
pytest: 3.2.1
IPython: 6.2.1
sphinx: None
```

  open_mfdataset() memory error in v0.10 277538485
351733817 https://github.com/pydata/xarray/issues/1745#issuecomment-351733817 https://api.github.com/repos/pydata/xarray/issues/1745 MDEyOklzc3VlQ29tbWVudDM1MTczMzgxNw== braaannigan 10512793 2017-12-14T14:56:04Z 2017-12-14T16:36:31Z CONTRIBUTOR

There is also some filename dependence. The file load works for g.nc, gr.nc, and gri.nc, and then fails for grid.nc. The file load also works for grida.nc.

  open_mfdataset() memory error in v0.10 277538485
351732553 https://github.com/pydata/xarray/issues/1745#issuecomment-351732553 https://api.github.com/repos/pydata/xarray/issues/1745 MDEyOklzc3VlQ29tbWVudDM1MTczMjU1Mw== braaannigan 10512793 2017-12-14T14:51:36Z 2017-12-14T14:51:36Z CONTRIBUTOR

Hi @shoyer, thanks for getting back to me.

That hasn't worked, unfortunately. The only difference including the with LOCK statement makes is that the file load seems to work, but then the core dump happens when you try to access the object, e.g. with the ds line below:

```python
import xarray as xr
ds = xr.open_dataset('grid.nc')
ds
```

As above, removing the ^ avoids the crash when the with LOCK statement is used.

  open_mfdataset() memory error in v0.10 277538485
351345428 https://github.com/pydata/xarray/issues/1745#issuecomment-351345428 https://api.github.com/repos/pydata/xarray/issues/1745 MDEyOklzc3VlQ29tbWVudDM1MTM0NTQyOA== braaannigan 10512793 2017-12-13T10:11:50Z 2017-12-13T10:11:50Z CONTRIBUTOR

I've played around with it a bit more. It seems like it's the ^ character in the re.search term that's causing the issue. If this is removed and the function is simply:

```python
def is_remote_uri(path):
    return bool(re.search('https?\://', path))
```

then I can load the file.
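As a sanity check on the regex semantics (an aside, not from the thread): anchoring re.search with a leading ^ accepts exactly the same strings as re.match with the unanchored pattern, so the two crashing variants really are equivalent:

```python
import re

# For this pattern, re.search with a leading ^ and re.match without it
# accept exactly the same strings.
for s in ('grid.nc', '/data/grid.nc', 'https://example.com/data.nc'):
    assert bool(re.search(r'^https?\://', s)) == bool(re.match(r'https?\://', s))
```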

  open_mfdataset() memory error in v0.10 277538485
351339156 https://github.com/pydata/xarray/issues/1745#issuecomment-351339156 https://api.github.com/repos/pydata/xarray/issues/1745 MDEyOklzc3VlQ29tbWVudDM1MTMzOTE1Ng== braaannigan 10512793 2017-12-13T09:49:13Z 2017-12-13T09:49:13Z CONTRIBUTOR

I'm getting a similar error. The file size is very small (a few KB), so I don't think it's the size issue above. Instead, the error I get is due to something strange happening in core.utils.is_remote_uri(path). The error occurs when I'm reading netcdf3 files with the default netcdf4 engine (which should, of course, be able to handle netcdf3).
There is a workaround: I can use the scipy reader to read netcdf3 files with no problems. Note that whenever I refer to "error" below, it means the error that produces the following output rather than a Python exception.

The error message is:

```
*** Error in `/path/anaconda2/envs/base3/bin/python': corrupted size vs. prev_size: 0x0000000001814930 ***
Aborted (core dumped)
```

The function where the problem arises is:

```python
def is_remote_uri(path):
    return bool(re.search('^https?\://', path))
```

The function is called a few times during the open_dataset (or open_mfdataset, I get the same error). On the third or fourth call it triggers the error. As I'm not using remote datasets, I can hard-code the output of the function to return False, and then the file reads with no problems.

The is_remote_uri(path) call is made a few times. However, it's only on line 233 of netCDF4_.py with is_remote_uri(self._filename) that the error is triggered.

I've output the argument to the is_remote_uri() function for each time it's called. In the first call the argument is the filename, in the second call the argument is the filename with the absolute path and in the third (and fatal) call the argument is also the filename with the absolute path.

I can't see any difference between the arguments to the function on the second and third call. When I copy them, assign them to variables and check equality in python it evaluates to True.

I've added in a simpler call to re.search in the function:

```python
def is_remote_uri(path):
    print((re.search('.nc', '.nc')))
    return bool(re.search('^https?\://', path))
```

This also triggers the error on the third call to the function. As such, we can rule out something to do with the path name.

I've played around with the print((re.search('.nc','.nc'))) line that I've added in. It only triggers an error on the third call when the first argument of re.search has a dot in the string, so re.search('.nc','.nc') causes the error, but re.search('nc','.nc') doesn't. The error isn't dependent on .nc in any way; '.AAA' in the arguments will cause the same error. The error doesn't replicate if I simply import re in ipython.

The error does not occur in xarray 0.9.6. The same function is called in a similar way and the function evaluates to False each time.

I'm not really sure what to do next, though. The obvious workaround is to set engine='scipy' if you're working with netcdf3 files.

Can anyone replicate this error?
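For anyone else hitting this, the workaround described above in code form:

```python
import xarray as xr

# Workaround: read netCDF3 files with the scipy backend instead of the
# default netcdf4 engine, which triggers the core dump described above.
ds = xr.open_dataset('grid.nc', engine='scipy')
```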

  open_mfdataset() memory error in v0.10 277538485

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);