issue_comments
26 rows where issue = 91184107 sorted by updated_at descending
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
120447670 | https://github.com/pydata/xarray/issues/444#issuecomment-120447670 | https://api.github.com/repos/pydata/xarray/issues/444 | MDEyOklzc3VlQ29tbWVudDEyMDQ0NzY3MA== | shoyer 1217238 | 2015-07-10T16:11:19Z | 2015-07-10T16:11:19Z | MEMBER | @razvanc87 I've gotten a few other reports of issues with multithreading (not just you), so I think we do definitely need to add our own lock when accessing these files. Misconfigured hdf5 installs may not be so uncommon. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
segmentation fault with `open_mfdataset` 91184107 | |
119698728 | https://github.com/pydata/xarray/issues/444#issuecomment-119698728 | https://api.github.com/repos/pydata/xarray/issues/444 | MDEyOklzc3VlQ29tbWVudDExOTY5ODcyOA== | razcore-rad 1177508 | 2015-07-08T19:07:41Z | 2015-07-08T19:07:41Z | NONE | I think this issue can be closed, after some digging and playing with different |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
segmentation fault with `open_mfdataset` 91184107 | |
118436430 | https://github.com/pydata/xarray/issues/444#issuecomment-118436430 | https://api.github.com/repos/pydata/xarray/issues/444 | MDEyOklzc3VlQ29tbWVudDExODQzNjQzMA== | andrewcollette 3101370 | 2015-07-03T23:02:52Z | 2015-07-03T23:02:52Z | NONE | @shoyer, there are basically two levels of thread safety for HDF5/h5py. First, the HDF5 library has an optional compile-time "threadsafe" build option that wraps all API access in a lock. This is all-or-nothing; I'm not aware of any per-file effects. Second, h5py uses its own global lock on the Python side to serialize access, which is only disabled in MPI mode. For added protection, h5py also does not presently release the GIL around reads/writes. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
segmentation fault with `open_mfdataset` 91184107 | |
118435615 | https://github.com/pydata/xarray/issues/444#issuecomment-118435615 | https://api.github.com/repos/pydata/xarray/issues/444 | MDEyOklzc3VlQ29tbWVudDExODQzNTYxNQ== | shoyer 1217238 | 2015-07-03T22:43:41Z | 2015-07-03T22:43:41Z | MEMBER | @razvanc87 netcdf4 and h5py use the same HDF5 libraries, but have different bindings from Python. H5py likely does a more careful job of using its own locks to ensure thread safety, which likely explains the difference you are seeing (the attribute encoding is a separate issue). |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
segmentation fault with `open_mfdataset` 91184107 | |
118435484 | https://github.com/pydata/xarray/issues/444#issuecomment-118435484 | https://api.github.com/repos/pydata/xarray/issues/444 | MDEyOklzc3VlQ29tbWVudDExODQzNTQ4NA== | shoyer 1217238 | 2015-07-03T22:40:57Z | 2015-07-03T22:40:57Z | MEMBER |
@andrewcollette could you comment on this for h5py/hdf5? @mrocklin based on my reading of Andrew's comment in the h5py issue, this is indeed the case. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
segmentation fault with `open_mfdataset` 91184107 | |
118373477 | https://github.com/pydata/xarray/issues/444#issuecomment-118373477 | https://api.github.com/repos/pydata/xarray/issues/444 | MDEyOklzc3VlQ29tbWVudDExODM3MzQ3Nw== | razcore-rad 1177508 | 2015-07-03T15:28:16Z | 2015-07-03T15:28:16Z | NONE | Per file basis ( |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
segmentation fault with `open_mfdataset` 91184107 | |
118373188 | https://github.com/pydata/xarray/issues/444#issuecomment-118373188 | https://api.github.com/repos/pydata/xarray/issues/444 | MDEyOklzc3VlQ29tbWVudDExODM3MzE4OA== | mrocklin 306380 | 2015-07-03T15:26:18Z | 2015-07-03T15:26:18Z | MEMBER | The library itself is not threadsafe? What about on a per-file basis? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
segmentation fault with `open_mfdataset` 91184107 | |
118195247 | https://github.com/pydata/xarray/issues/444#issuecomment-118195247 | https://api.github.com/repos/pydata/xarray/issues/444 | MDEyOklzc3VlQ29tbWVudDExODE5NTI0Nw== | shoyer 1217238 | 2015-07-02T23:45:01Z | 2015-07-02T23:45:01Z | MEMBER | Ah, I think I know why the seg faults are still occurring. By default, @mrocklin maybe |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
segmentation fault with `open_mfdataset` 91184107 | |
118091969 | https://github.com/pydata/xarray/issues/444#issuecomment-118091969 | https://api.github.com/repos/pydata/xarray/issues/444 | MDEyOklzc3VlQ29tbWVudDExODA5MTk2OQ== | razcore-rad 1177508 | 2015-07-02T16:55:02Z | 2015-07-02T16:55:02Z | NONE | Yes, I'm using the same files that I once uploaded on Dropbox for you to play with for #443. I'm not doing anything special, just passing in the glob pattern to |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
segmentation fault with `open_mfdataset` 91184107 | |
118090209 | https://github.com/pydata/xarray/issues/444#issuecomment-118090209 | https://api.github.com/repos/pydata/xarray/issues/444 | MDEyOklzc3VlQ29tbWVudDExODA5MDIwOQ== | shoyer 1217238 | 2015-07-02T16:46:57Z | 2015-07-02T16:46:57Z | MEMBER | Thanks for your help debugging! I made a new issue for ascii attributes handling: https://github.com/xray/xray/issues/451 This is one case where Python 3's insistence that bytes and strings are different is annoying. I'll probably have to decode all bytes type attributes read from h5netcdf. How do you trigger the seg-fault with netcdf4-python? Just using |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
segmentation fault with `open_mfdataset` 91184107 | |
117993960 | https://github.com/pydata/xarray/issues/444#issuecomment-117993960 | https://api.github.com/repos/pydata/xarray/issues/444 | MDEyOklzc3VlQ29tbWVudDExNzk5Mzk2MA== | razcore-rad 1177508 | 2015-07-02T10:36:06Z | 2015-07-02T12:18:09Z | NONE | OK... as a follow-up, I did some tests and with
This is simple to solve.. just have every edit: boy... there are some differences between these packages (
I didn't put the full error because I don't think it's relevant. Anyway, needless to say... edit2: so I was going through the posts here and just now saw you addressed this issue using that |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
segmentation fault with `open_mfdataset` 91184107 | |
117217039 | https://github.com/pydata/xarray/issues/444#issuecomment-117217039 | https://api.github.com/repos/pydata/xarray/issues/444 | MDEyOklzc3VlQ29tbWVudDExNzIxNzAzOQ== | razcore-rad 1177508 | 2015-06-30T14:55:58Z | 2015-06-30T14:55:58Z | NONE | Well... I have a couple of remarks to make. After some more thought about this, it might have been my fault all along. Let me explain. I have this machine at work where I don't have administrative privileges so I decided to give |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
segmentation fault with `open_mfdataset` 91184107 | |
116787098 | https://github.com/pydata/xarray/issues/444#issuecomment-116787098 | https://api.github.com/repos/pydata/xarray/issues/444 | MDEyOklzc3VlQ29tbWVudDExNjc4NzA5OA== | shoyer 1217238 | 2015-06-29T18:30:48Z | 2015-06-29T18:30:48Z | MEMBER | @razvanc87 What version of h5py were you using with h5netcdf? @andrewcollette suggests (https://github.com/h5py/h5py/issues/591#issuecomment-116785660) that h5py should already have the lock that fixes this issue if you were using h5py 2.4.0 or later. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
segmentation fault with `open_mfdataset` 91184107 | |
116779716 | https://github.com/pydata/xarray/issues/444#issuecomment-116779716 | https://api.github.com/repos/pydata/xarray/issues/444 | MDEyOklzc3VlQ29tbWVudDExNjc3OTcxNg== | shoyer 1217238 | 2015-06-29T18:07:52Z | 2015-06-29T18:07:52Z | MEMBER | Just merged the fix to master. @razvanc87 if you could try installing the development version, I would love to hear if this resolves your issues. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
segmentation fault with `open_mfdataset` 91184107 | |
116189535 | https://github.com/pydata/xarray/issues/444#issuecomment-116189535 | https://api.github.com/repos/pydata/xarray/issues/444 | MDEyOklzc3VlQ29tbWVudDExNjE4OTUzNQ== | shoyer 1217238 | 2015-06-28T03:34:30Z | 2015-06-28T03:34:30Z | MEMBER | I have a tentative fix (adding the threading lock) in https://github.com/xray/xray/pull/446 Still wondering why multi-threading can't use more than one CPU -- hopefully my h5py issue (referenced above) will get us some answers. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
segmentation fault with `open_mfdataset` 91184107 | |
116182511 | https://github.com/pydata/xarray/issues/444#issuecomment-116182511 | https://api.github.com/repos/pydata/xarray/issues/444 | MDEyOklzc3VlQ29tbWVudDExNjE4MjUxMQ== | mrocklin 306380 | 2015-06-28T01:55:39Z | 2015-06-28T01:55:39Z | MEMBER | Oh, I didn't realize that that was built in already. Sounds like you could handle this easily on the xray side. On Jun 27, 2015 4:40 PM, "Stephan Hoyer" notifications@github.com wrote:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
segmentation fault with `open_mfdataset` 91184107 | |
116165986 | https://github.com/pydata/xarray/issues/444#issuecomment-116165986 | https://api.github.com/repos/pydata/xarray/issues/444 | MDEyOklzc3VlQ29tbWVudDExNjE2NTk4Ng== | shoyer 1217238 | 2015-06-27T23:40:29Z | 2015-06-27T23:40:29Z | MEMBER | Of course, concurrent access to HDF5 files works fine on my laptop, using Anaconda's build of HDF5 (version 1.8.14). I have no idea what special flags they invoked when building it :). That said, I have been unable to produce any benchmarks that show improved performance when simply doing multithreaded reads without doing any computation (e.g., Given these considerations, it seems like we should use a lock when reading data into xray with dask. @mrocklin we could just use |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
segmentation fault with `open_mfdataset` 91184107 | |
116162351 | https://github.com/pydata/xarray/issues/444#issuecomment-116162351 | https://api.github.com/repos/pydata/xarray/issues/444 | MDEyOklzc3VlQ29tbWVudDExNjE2MjM1MQ== | mrocklin 306380 | 2015-06-27T22:12:37Z | 2015-06-27T22:12:37Z | MEMBER | There was a similar problem with PyTables, which didn't support concurrency well. This resulted in the from-hdf5 function in dask.array, which uses explicit locks to avoid concurrent access. We could repeat this treatment more generally without much trouble, forcing single-threaded access to HDF5 while still allowing parallelism elsewhere. On Jun 27, 2015 2:33 PM, "Răzvan Rădulescu" notifications@github.com wrote:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
segmentation fault with `open_mfdataset` 91184107 | |
116146897 | https://github.com/pydata/xarray/issues/444#issuecomment-116146897 | https://api.github.com/repos/pydata/xarray/issues/444 | MDEyOklzc3VlQ29tbWVudDExNjE0Njg5Nw== | razcore-rad 1177508 | 2015-06-27T21:33:30Z | 2015-06-27T21:33:30Z | NONE | So I just tried @mrocklin's idea with using single-threaded stuff. This seems to fix the segmentation fault, but I am very curious as to why there's a problem with working in parallel. I tried two different hdf5 libraries (I think version 1.8.13 and 1.8.14) but I got the same segmentation fault. Anyway, working on a single thread is not a big deal, I'll just do that for the time being... I already tried @shoyer, the files are not the issue here, they're the same ones I provided in #443. Question: does the hdf5 library need to be built with parallel support (mpi or something) maybe?... thanks guys |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
segmentation fault with `open_mfdataset` 91184107 | |
115930797 | https://github.com/pydata/xarray/issues/444#issuecomment-115930797 | https://api.github.com/repos/pydata/xarray/issues/444 | MDEyOklzc3VlQ29tbWVudDExNTkzMDc5Nw== | mrocklin 306380 | 2015-06-27T01:09:44Z | 2015-06-27T01:09:44Z | MEMBER | Alternatively can we try doing the operations that xray would do manually and see if one of them triggers something? One could also try
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
segmentation fault with `open_mfdataset` 91184107 | |
115930685 | https://github.com/pydata/xarray/issues/444#issuecomment-115930685 | https://api.github.com/repos/pydata/xarray/issues/444 | MDEyOklzc3VlQ29tbWVudDExNTkzMDY4NQ== | mrocklin 306380 | 2015-06-27T01:08:13Z | 2015-06-27T01:08:13Z | MEMBER | @shoyer asked me to chime in in case this is an issue with dask. One thing to try would be to remove multi-threading from the equation. I'm not sure how this would affect things but it's worth a shot.
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
segmentation fault with `open_mfdataset` 91184107 | |
115925776 | https://github.com/pydata/xarray/issues/444#issuecomment-115925776 | https://api.github.com/repos/pydata/xarray/issues/444 | MDEyOklzc3VlQ29tbWVudDExNTkyNTc3Ng== | shoyer 1217238 | 2015-06-27T00:49:19Z | 2015-06-27T00:49:19Z | MEMBER | do you have an example file? this might also be your HDF5 install.... |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
segmentation fault with `open_mfdataset` 91184107 | |
115906191 | https://github.com/pydata/xarray/issues/444#issuecomment-115906191 | https://api.github.com/repos/pydata/xarray/issues/444 | MDEyOklzc3VlQ29tbWVudDExNTkwNjE5MQ== | razcore-rad 1177508 | 2015-06-26T22:10:46Z | 2015-06-26T22:22:11Z | NONE | Just tried edit: I was right... it's actually the |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
segmentation fault with `open_mfdataset` 91184107 | |
115902800 | https://github.com/pydata/xarray/issues/444#issuecomment-115902800 | https://api.github.com/repos/pydata/xarray/issues/444 | MDEyOklzc3VlQ29tbWVudDExNTkwMjgwMA== | shoyer 1217238 | 2015-06-26T22:01:41Z | 2015-06-26T22:01:41Z | MEMBER | Another backend to try would be That might help us identify if this is a netCDF4-python bug. I am also baffled by how inserting |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
segmentation fault with `open_mfdataset` 91184107 | |
115900337 | https://github.com/pydata/xarray/issues/444#issuecomment-115900337 | https://api.github.com/repos/pydata/xarray/issues/444 | MDEyOklzc3VlQ29tbWVudDExNTkwMDMzNw== | razcore-rad 1177508 | 2015-06-26T21:50:01Z | 2015-06-26T21:53:50Z | NONE | Unfortunately I can't use

```
print(arr1.dtype, arr2.dtype)
print((arr1 == arr2))
print((arr1 == arr2) | (isnull(arr1) & isnull(arr2)))
```

gives:

```
float64 float64
dask.array<x_1, shape=(50, 39, 59), chunks=((50,), (39,), (59,)), dtype=bool>
dask.array<x_6, shape=(50, 39, 59), chunks=((50,), (39,), (59,)), dtype=bool>
```

Funny thing is, when I'm adding these print statements and so on, I get some traceback from Python (sometimes). Without them I would only get a segmentation fault with no additional information. For example, just now, after introducing these edit: oh yeah... this is a funny thing. If I do |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
segmentation fault with `open_mfdataset` 91184107 | |
115887568 | https://github.com/pydata/xarray/issues/444#issuecomment-115887568 | https://api.github.com/repos/pydata/xarray/issues/444 | MDEyOklzc3VlQ29tbWVudDExNTg4NzU2OA== | shoyer 1217238 | 2015-06-26T21:25:50Z | 2015-06-26T21:25:50Z | MEMBER | Oh my, that's bad! Can you experiment with the It would be also be helpful to report the dtypes of the arrays that trigger failure in |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
segmentation fault with `open_mfdataset` 91184107 |
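The thread above converges on two workarounds for the segfault: serialize all HDF5 reads behind an explicit lock (the approach merged in xray PR #446 and already used by dask's from-hdf5 code path), or drop to a single-threaded scheduler entirely. A minimal sketch of both, using the modern dask API (`lock=` on `da.from_array`, `scheduler="synchronous"`; the 2015-era equivalent was `dask.async.get_sync`). The filename and dataset name are illustrative, not from the issue:

```python
import threading

import h5py
import numpy as np
import dask.array as da

# Build a small demo file (a stand-in for the netCDF files from the issue).
with h5py.File("demo.h5", "w") as f:
    f["temperature"] = np.arange(24.0).reshape(2, 3, 4)

# One shared lock serializing every HDF5 read, as in the xray PR #446 fix.
hdf5_lock = threading.Lock()

f = h5py.File("demo.h5", "r")
# dask acquires `lock` around each chunk read, so a non-threadsafe
# HDF5 build is never entered from two threads at once.
x = da.from_array(f["temperature"], chunks=(1, 3, 4), lock=hdf5_lock)

# Alternative workaround: run the whole graph on a single thread.
mean = x.mean().compute(scheduler="synchronous")
print(mean)  # mean of 0..23 -> 11.5
f.close()
```

The lock approach keeps parallelism for the computation itself and only serializes the I/O, which matches @mrocklin's observation that single-threaded *access* need not mean single-threaded execution overall.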