html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/463#issuecomment-120668247,https://api.github.com/repos/pydata/xarray/issues/463,120668247,MDEyOklzc3VlQ29tbWVudDEyMDY2ODI0Nw==,1197350,2015-07-11T23:01:38Z,2015-07-11T23:01:38Z,MEMBER,"8 MB. This is daily satellite data, with one file per time point. (Most satellite data is distributed this way.) There are many other workarounds to this problem. You can try to increase your ulimits. Or you can join these small netcdf files together into a big one. I had daily data files, and I used NCO to concatenate them into monthly files. That basically solved my problem. But of course that involves going outside of xray. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,94328498
https://github.com/pydata/xarray/issues/463#issuecomment-120662901,https://api.github.com/repos/pydata/xarray/issues/463,120662901,MDEyOklzc3VlQ29tbWVudDEyMDY2MjkwMQ==,1197350,2015-07-11T21:37:42Z,2015-07-11T21:37:42Z,MEMBER,"I came up with a solution for this, but it is so slow that it is useless. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,94328498
https://github.com/pydata/xarray/issues/463#issuecomment-120449743,https://api.github.com/repos/pydata/xarray/issues/463,120449743,MDEyOklzc3VlQ29tbWVudDEyMDQ0OTc0Mw==,1197350,2015-07-10T16:19:15Z,2015-07-10T16:19:15Z,MEMBER,"Ok, I will have a look at this. I would be happy to contribute to this awesome project. By the way, by monitoring /proc, I was able to see that the scipy backend actually opens each file _TWICE_, exacerbating the problem. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,94328498
https://github.com/pydata/xarray/issues/463#issuecomment-120446569,https://api.github.com/repos/pydata/xarray/issues/463,120446569,MDEyOklzc3VlQ29tbWVudDEyMDQ0NjU2OQ==,1197350,2015-07-10T16:08:48Z,2015-07-10T16:08:48Z,MEMBER,"I am using the scipy backend because the netcdf4 backend doesn't work for me at all. It core dumps with the error
```
python: posixio.c:366: px_rel: Assertion `pxp->bf_offset <= offset && offset < pxp->bf_offset + (off_t) pxp->bf_extent' failed.
Aborted (core dumped)
```
Are you suggesting I work on the scipy backend? ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,94328498
https://github.com/pydata/xarray/issues/463#issuecomment-120442769,https://api.github.com/repos/pydata/xarray/issues/463,120442769,MDEyOklzc3VlQ29tbWVudDEyMDQ0Mjc2OQ==,1197350,2015-07-10T15:53:48Z,2015-07-10T15:53:48Z,MEMBER,"Just a little follow up...I tried to work around the file limit by serializing the processing of the files and creating xray datasets with fewer files in them. However, I still eventually hit this error, suggesting that the files are never being closed.
For example, I would like to do
```python
ds = xray.open_mfdataset(ddir + '*.nc', engine='scipy')
EKE = (ds.variables['u']**2 + ds.variables['v']**2).mean(dim='time').load()
```
This tries to open 8031 files and produces `error: [Errno 24] Too many open files`. So then I try to create a new dataset for each year:
```python
EKE = []
for yr in xrange(1993, 2015):
    print yr
    # this opens about 365 files
    ds = xray.open_mfdataset(ddir + '/dt_global_allsat_msla_uv_%04d*.nc' % yr, engine='scipy')
    EKE.append((ds.variables['u']**2 + ds.variables['v']**2).mean(dim='time').load())
```
This works okay for the first two years. However, by the third year, I _still_ get `error: [Errno 24] Too many open files`. This is when the ulimit of 1024 files is exceeded. Using xray version 0.5.1 installed via conda. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,94328498
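Below is a minimal sketch, not taken from the thread, of how the per-year workaround in the last comment could release its file handles by explicitly closing each dataset before the next year is opened. The value of `ddir` is hypothetical, the filename pattern is copied from the comment above, and it assumes the installed version exposes `Dataset.close()` (current xarray does; the xray 0.5.1 build discussed here may not).

```python
import xray  # on current installs this would be `import xarray as xray`

ddir = '/path/to/daily/aviso/files'  # hypothetical data directory

EKE = []
for yr in range(1993, 2015):
    # opens roughly 365 daily files for one year
    ds = xray.open_mfdataset(ddir + '/dt_global_allsat_msla_uv_%04d*.nc' % yr,
                             engine='scipy')
    # .load() pulls the yearly mean into memory, so nothing lazy still
    # references the underlying files after this line
    EKE.append((ds['u'] ** 2 + ds['v'] ** 2).mean(dim='time').load())
    ds.close()  # explicitly release the file handles before the next year
```

If the handles are still not released, as the comment reports for xray 0.5.1, raising the per-process limit with `ulimit -n` or pre-concatenating the daily files with NCO, as described earlier in the thread, remains the fallback.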