html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue https://github.com/pydata/xarray/issues/1745#issuecomment-356255726,https://api.github.com/repos/pydata/xarray/issues/1745,356255726,MDEyOklzc3VlQ29tbWVudDM1NjI1NTcyNg==,10512793,2018-01-09T11:17:12Z,2018-01-09T11:17:12Z,CONTRIBUTOR,"Hi @shoyer Updating netcdf4 to version 1.3.1 solves the problem. I'm trying to think what the potential solutions are. Essentially, we would need to modify the function ds.filepath(). However, this isn't possible inside xarray. Is there anything we can do other than add a warning message with the recommendation to upgrade netcdf4 when the file path has 88 characters and netcdf4 is version 1.2.4? Should we also submit an issue to anaconda to get the default package updates? Happy to prepare these if you think it's the best way to proceed. Liam ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,277538485 https://github.com/pydata/xarray/issues/1745#issuecomment-351944694,https://api.github.com/repos/pydata/xarray/issues/1745,351944694,MDEyOklzc3VlQ29tbWVudDM1MTk0NDY5NA==,10512793,2017-12-15T08:30:57Z,2017-12-15T08:30:57Z,CONTRIBUTOR,"Hi @shoyer I've tried this print(ds.filepath()) suggestion and it reproduces when I use the full length file path which has 88 characters. Again, the segfault doesn't arise if I add or subtract a character to the file path (after copying the underlying file to a new name). 
This dependence on 88 characters is consistent with the bug here: https://github.com/Unidata/netcdf4-python/issues/585","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,277538485 https://github.com/pydata/xarray/issues/1745#issuecomment-351786510,https://api.github.com/repos/pydata/xarray/issues/1745,351786510,MDEyOklzc3VlQ29tbWVudDM1MTc4NjUxMA==,10512793,2017-12-14T17:51:11Z,2017-12-14T17:51:11Z,CONTRIBUTOR,"Interesting. I've tried to look at this a bit more by running the following in netCDF4_.py: ``` self._filename = self.ds.filepath() print(self.ds) self.is_remote = is_remote_uri(self._filename) ``` So, all I did was add a print statement ```print(self.ds)```. In this case the open_dataset call worked fine.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,277538485 https://github.com/pydata/xarray/issues/1745#issuecomment-351782302,https://api.github.com/repos/pydata/xarray/issues/1745,351782302,MDEyOklzc3VlQ29tbWVudDM1MTc4MjMwMg==,10512793,2017-12-14T17:35:24Z,2017-12-14T17:35:24Z,CONTRIBUTOR,"I've also now tried out the re.match approach you suggest above, but it generates the same core dump as the re.search('^...') approach","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,277538485 https://github.com/pydata/xarray/issues/1745#issuecomment-351781405,https://api.github.com/repos/pydata/xarray/issues/1745,351781405,MDEyOklzc3VlQ29tbWVudDM1MTc4MTQwNQ==,10512793,2017-12-14T17:32:04Z,2017-12-14T17:32:04Z,CONTRIBUTOR,"With print(repr(path)) I get: 'grid.nc' '/path/verification/cabl/y2d/mnc_test_0008_1day_restoring/grid.nc' '/path/verification/cabl/y2d/mnc_test_0008_1day_restoring/grid.nc' where I've changed the first part of the filename to ""/path/"" ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, 
""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,277538485 https://github.com/pydata/xarray/issues/1745#issuecomment-351774833,https://api.github.com/repos/pydata/xarray/issues/1745,351774833,MDEyOklzc3VlQ29tbWVudDM1MTc3NDgzMw==,10512793,2017-12-14T17:07:49Z,2017-12-14T17:07:49Z,CONTRIBUTOR,If the ^ isn't strictly necessary I'm happy to put together a PR with it removed.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,277538485 https://github.com/pydata/xarray/issues/1745#issuecomment-351768893,https://api.github.com/repos/pydata/xarray/issues/1745,351768893,MDEyOklzc3VlQ29tbWVudDM1MTc2ODg5Mw==,10512793,2017-12-14T16:50:43Z,2017-12-14T16:50:43Z,CONTRIBUTOR,"Hi @shoyer The crash does not occur when the ^ is removed. When I run ```python -c 'import sys; print(sys.getfilesystemencoding())'``` The output is: utf-8 The file loads with the scipy engine. I get a module import error with h5netcdf, even though ```conda list``` shows that I have version 0.5 installed. 
```xr.show_versions()``` gives: INSTALLED VERSIONS ------------------ commit: None python: 3.6.2.final.0 python-bits: 64 OS: Linux OS-release: 4.4.0-101-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 xarray: 0.10.0 pandas: 0.21.0 numpy: 1.13.1 scipy: 0.19.1 netCDF4: 1.2.4 h5netcdf: None Nio: None bottleneck: 1.2.1 cyordereddict: None dask: 0.16.0 matplotlib: 2.0.2 cartopy: None seaborn: 0.7.1 setuptools: 36.7.1 pip: 9.0.1 conda: None pytest: 3.2.1 IPython: 6.2.1 sphinx: None","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,277538485 https://github.com/pydata/xarray/issues/1745#issuecomment-351733817,https://api.github.com/repos/pydata/xarray/issues/1745,351733817,MDEyOklzc3VlQ29tbWVudDM1MTczMzgxNw==,10512793,2017-12-14T14:56:04Z,2017-12-14T16:36:31Z,CONTRIBUTOR,"There is also some filename dependence. The file load works for g.nc, gr.nc, gri.nc and then fails for grid.nc. The file load also works for grida.nc","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,277538485 https://github.com/pydata/xarray/issues/1745#issuecomment-351732553,https://api.github.com/repos/pydata/xarray/issues/1745,351732553,MDEyOklzc3VlQ29tbWVudDM1MTczMjU1Mw==,10512793,2017-12-14T14:51:36Z,2017-12-14T14:51:36Z,CONTRIBUTOR,"Hi @shoyer, thanks for getting back to me. That hasn't worked unfortunately. The only difference including the with LOCK statement makes is that the file load seems to work, but then the core dump happens when you try to access the object, e.g. 
with the ```ds``` line below: ``` import xarray as xr ds = xr.open_dataset('grid.nc') ds ``` As above, removing the ```^``` avoids the crash when the with LOCK statement is used.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,277538485 https://github.com/pydata/xarray/issues/1745#issuecomment-351345428,https://api.github.com/repos/pydata/xarray/issues/1745,351345428,MDEyOklzc3VlQ29tbWVudDM1MTM0NTQyOA==,10512793,2017-12-13T10:11:50Z,2017-12-13T10:11:50Z,CONTRIBUTOR,"I've played around with it a bit more. It seems like it's the ^ character in the re.search term that's causing the issue. If this is removed and the function is simply: ``` def is_remote_uri(path): return bool(re.search('https?\://', path)) ``` then I can load the file.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,277538485 https://github.com/pydata/xarray/issues/1745#issuecomment-351339156,https://api.github.com/repos/pydata/xarray/issues/1745,351339156,MDEyOklzc3VlQ29tbWVudDM1MTMzOTE1Ng==,10512793,2017-12-13T09:49:13Z,2017-12-13T09:49:13Z,CONTRIBUTOR,"I'm getting a similar error. The file size is very small (Kbs), so I don't think it's the size issue above. Instead, the error I get is due to something strange happening in core.utils.is_remote_uri(path). The error occurs when I'm reading netcdf3 files with the default netcdf4 engine (which should be able to handle netcdf3 of course). There is a workaround in that I can use the scipy reader to read netcdf3 files with no problems. Note that whenever I refer to ""error"" below it means the error that gives the following output rather than a python exception. The error message is: *** Error in `/path/anaconda2/envs/base3/bin/python': corrupted size vs. 
prev_size: 0x0000000001814930 *** Aborted (core dumped) The function where the problem arises is: ``` def is_remote_uri(path): return bool(re.search('^https?\://', path)) ``` The function is called a few times during the open_dataset (or open_mfdataset, I get the same error). On the third or fourth call it triggers the error. As I'm not using remote datasets, I can hard-code the output of the function to be ``` return False ``` and then the file reads with no problems. The ```is_remote_uri(path)``` call is made a few times. However, it's only on line 233 of netCDF4_.py with ```is_remote_uri(self._filename)``` that the error is triggered. I've output the argument to the ```is_remote_uri()``` function for each time it's called. In the first call the argument is the filename, in the second call the argument is the filename with the absolute path and in the third (and fatal) call the argument is also the filename with the absolute path. I can't see any difference between the arguments to the function on the second and third call. When I copy them, assign them to variables and check equality in python it evaluates to True. I've added in a simpler call to ```re.search``` in the function: ``` def is_remote_uri(path): print((re.search('.nc','.nc'))) return bool(re.search('^https?\://', path)) ``` This also triggers the error on the third call to the function. As such we can rule out something to do with the path name. I've played around with the ```print((re.search('.nc','.nc')))``` line that I've added in. It only triggers an error on the third call when the first argument of re.search has a dot in the string, so ```re.search('.nc','.nc')``` causes the error, but ```re.search('nc','.nc')``` doesn't. The error isn't dependent on .nc in any way, '.AAA' in the arguments will cause the same error. The error doesn't replicate if I simply import re in ipython. The error does not occur in xarray 0.9.6. 
The same function is called in a similar way and the function evaluates to False each time. I'm not really sure what to do next, though. The obvious workaround is to set engine='scipy' if you're working with netcdf3 files. Can anyone replicate this error?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,277538485