pydata/xarray issue #2554: open_mfdataset crashes with segfault
Opened 2018-11-10 by user 1217238 (MEMBER) · closed 2019-01-17 · 10 comments

Copied from the report on the xarray mailing list:


This crashes with SIGSEGV:

```python
# foo.py
import xarray as xr

ds = xr.open_mfdataset('/tmp/nam/bufr.701940/bufr201012011.nc',
                       data_vars='minimal', parallel=True)
print(ds)
```

Traceback:

```
[gtrojan@asok precip]$ gdb python3
GNU gdb (GDB) Fedora 8.1.1-3.fc28
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from python3...done.
(gdb) r
Starting program: /mnt/sdc1/local/Python-3.6.5/bin/python3 foo.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7fffe6dfb700 (LWP 11176)]
[New Thread 0x7fffe4dfa700 (LWP 11177)]
[New Thread 0x7fffdedf9700 (LWP 11178)]
[New Thread 0x7fffdadf8700 (LWP 11179)]
[New Thread 0x7fffd6df7700 (LWP 11180)]
[New Thread 0x7fffd2df6700 (LWP 11181)]
[New Thread 0x7fffcedf5700 (LWP 11182)]
warning: Loadable section ".note.gnu.property" outside of ELF segments
[Thread 0x7fffdadf8700 (LWP 11179) exited]
[Thread 0x7fffd2df6700 (LWP 11181) exited]
[Thread 0x7fffcedf5700 (LWP 11182) exited]
[Thread 0x7fffd6df7700 (LWP 11180) exited]
[Thread 0x7fffdedf9700 (LWP 11178) exited]
[Thread 0x7fffe4dfa700 (LWP 11177) exited]
[Thread 0x7fffe6dfb700 (LWP 11176) exited]
Detaching after fork from child process 11183.
[New Thread 0x7fffcedf5700 (LWP 11184)]
[New Thread 0x7fffe56f1700 (LWP 11185)]
[New Thread 0x7fffdedf9700 (LWP 11186)]
[New Thread 0x7fffdadf8700 (LWP 11187)]
[New Thread 0x7fffd6df7700 (LWP 11188)]
[New Thread 0x7fffd2df6700 (LWP 11189)]
[New Thread 0x7fffa7fff700 (LWP 11190)]
[New Thread 0x7fff9bfff700 (LWP 11191)]
[New Thread 0x7fff93fff700 (LWP 11192)]
[New Thread 0x7fff8bfff700 (LWP 11193)]
[New Thread 0x7fff83fff700 (LWP 11194)]
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments

Thread 9 "python3" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffcedf5700 (LWP 11184)]
0x00007fffbd95cca9 in H5SL_insert_common () from /usr/lib64/libhdf5.so.10
```

This happens with the most recent dask and xarray:

INSTALLED VERSIONS

```
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Linux
OS-release: 4.18.14-200.fc28.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_CA.UTF-8
LOCALE: en_CA.UTF-8

xarray: 0.11.0
pandas: 0.23.0
numpy: 1.15.2
scipy: 1.1.0
netCDF4: 1.4.0
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.0b1
PseudonetCDF: None
rasterio: None
iris: None
bottleneck: 1.3.0.dev0
cyordereddict: None
dask: 0.20.1
distributed: 1.22.1
matplotlib: 3.0.0
cartopy: None
seaborn: 0.9.0
setuptools: 39.0.1
pip: 18.1
conda: None
pytest: 3.6.3
IPython: 6.3.1
sphinx: 1.8.1
```

When I change the code in open_mfdataset to use the processes scheduler, the code runs as expected:

```python
# Line 619 in api.py, original:
datasets, file_objs = dask.compute(datasets, file_objs)

# changed to:
datasets, file_objs = dask.compute(datasets, file_objs, scheduler='processes')
```
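The crash site (`H5SL_insert_common` in libhdf5) suggests why this change helps: dask's default scheduler runs the per-file opens in threads of one process, so they all share a single loaded copy of libhdf5, which is not thread-safe in many builds; `scheduler='processes'` gives each worker its own interpreter and its own libhdf5. A minimal sketch of that distinction using only the stdlib executors — the function and file names here are hypothetical stand-ins, not xarray internals:

```python
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def open_one(path):
    # Stand-in for a per-file open; the real code would call into
    # netCDF4/libhdf5 here, which is where the reported segfault occurs.
    return {"path": path, "vars": ["precip"]}

def open_all(executor_cls, paths):
    # Same map, different isolation: threads share one libhdf5 instance,
    # processes each load their own.
    with executor_cls(max_workers=2) as ex:
        return list(ex.map(open_one, paths))

if __name__ == "__main__":
    paths = ["bufr201012011.nc", "bufr201012012.nc"]
    # Threads: all workers inside one process (the crash-prone configuration).
    via_threads = open_all(ThreadPoolExecutor, paths)
    # Processes: isolated workers, the effect of scheduler='processes'.
    via_processes = open_all(ProcessPoolExecutor, paths)
    print(via_threads == via_processes)  # → True
```

The results are identical either way for a deterministic open; only the failure mode under a non-thread-safe library differs.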

The files are about 300 kB each, and my example reads only 2 of them.

