html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/1086#issuecomment-259044805,https://api.github.com/repos/pydata/xarray/issues/1086,259044805,MDEyOklzc3VlQ29tbWVudDI1OTA0NDgwNQ==,1217238,2016-11-08T04:46:23Z,2016-11-08T04:46:23Z,MEMBER,"> So it would be more efficient to concat all of the datasets (subset for the relevant variables), and then just use a single .to_dataframe() call on the entire dataset? If so, that would require quite a bit of refactoring on my part, but it could be worth it.
Maybe? I'm not confident enough to advise you to go to that trouble.
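If you do go down that path, here is a minimal sketch of the pattern I have in mind (the file names and variable list are hypothetical):

```python
import xarray as xr

# Hypothetical inputs -- one netCDF file per site, plus the subset of
# variables shared by all files.
paths = ['site1.nc', 'site2.nc', 'site3.nc']
data_vars = ['tas', 'precip']

# Open each single-site file, subset to the shared variables, and
# concatenate once along a new 'site' dimension.
datasets = [xr.open_dataset(p)[data_vars] for p in paths]
combined = xr.concat(datasets, dim='site')

# A single to_dataframe() call on the combined dataset, instead of
# one call per file.
df = combined.to_dataframe()
```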
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,187608079
https://github.com/pydata/xarray/issues/1086#issuecomment-259035428,https://api.github.com/repos/pydata/xarray/issues/1086,259035428,MDEyOklzc3VlQ29tbWVudDI1OTAzNTQyOA==,1217238,2016-11-08T03:25:58Z,2016-11-08T03:25:58Z,MEMBER,"Under the covers, `open_mfdataset` just uses `open_dataset` and `merge`/`concat`, so this would be similar either way.
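For reference, a rough sketch of the equivalent manual pattern (file names hypothetical):

```python
import xarray as xr

paths = ['a.nc', 'b.nc']

# Roughly what open_mfdataset does internally: open each file, then
# combine the results.
datasets = [xr.open_dataset(p) for p in paths]
combined = xr.concat(datasets, dim='time')  # or xr.merge(datasets)
```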
On Mon, Nov 7, 2016 at 7:14 PM naught101 notifications@github.com wrote:
> Yeah, I'm loading each file separately with xr.open_dataset(), since it's
> not really a multi-file dataset (it's a lot of single-site datasets, some
> of which have different variables, and overlapping time dimensions). I
> don't think I can avoid loading them separately...
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,187608079
https://github.com/pydata/xarray/issues/1086#issuecomment-259028693,https://api.github.com/repos/pydata/xarray/issues/1086,259028693,MDEyOklzc3VlQ29tbWVudDI1OTAyODY5Mw==,1217238,2016-11-08T02:36:16Z,2016-11-08T02:36:16Z,MEMBER,"One thing that might hurt is that xarray (lazily) decodes times from each file separately, rather than decoding times all at once. But this hasn't been much of an issue before, even with hundreds of files, so I'm not sure what's going on here.
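If per-file time decoding does turn out to be the bottleneck, one experiment worth trying (a sketch, not a confirmed fix; file names hypothetical) is to skip decoding on open and decode the combined result in one pass:

```python
import xarray as xr

# Skip per-file time decoding, concatenate, then decode the whole
# combined dataset at once with decode_cf.
datasets = [xr.open_dataset(p, decode_times=False) for p in ['a.nc', 'b.nc']]
combined = xr.concat(datasets, dim='time')
combined = xr.decode_cf(combined)
```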
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,187608079
https://github.com/pydata/xarray/issues/1086#issuecomment-258884141,https://api.github.com/repos/pydata/xarray/issues/1086,258884141,MDEyOklzc3VlQ29tbWVudDI1ODg4NDE0MQ==,1217238,2016-11-07T16:27:21Z,2016-11-07T16:27:21Z,MEMBER,"Can you give me a copy/pasteable script that reproduces the slowness with that file?
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,187608079
https://github.com/pydata/xarray/issues/1086#issuecomment-258755912,https://api.github.com/repos/pydata/xarray/issues/1086,258755912,MDEyOklzc3VlQ29tbWVudDI1ODc1NTkxMg==,1217238,2016-11-07T06:20:18Z,2016-11-07T06:20:18Z,MEMBER,"How did you construct this dataset?
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,187608079
https://github.com/pydata/xarray/issues/1086#issuecomment-258754037,https://api.github.com/repos/pydata/xarray/issues/1086,258754037,MDEyOklzc3VlQ29tbWVudDI1ODc1NDAzNw==,1217238,2016-11-07T06:02:56Z,2016-11-07T06:02:56Z,MEMBER,"Try calling `.load()` before `.to_dataframe()`.
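Something like this (the file name is hypothetical):

```python
import xarray as xr

ds = xr.open_dataset('site.nc')

# load() pulls everything into memory up front, so to_dataframe()
# operates on in-memory arrays instead of triggering many small
# lazy reads.
df = ds.load().to_dataframe()
```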
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,187608079
https://github.com/pydata/xarray/issues/1086#issuecomment-258748969,https://api.github.com/repos/pydata/xarray/issues/1086,258748969,MDEyOklzc3VlQ29tbWVudDI1ODc0ODk2OQ==,1217238,2016-11-07T05:14:11Z,2016-11-07T05:14:24Z,MEMBER,"The simplest thing to try is `.squeeze()`, e.g., `dataset[data_vars].squeeze().to_dataframe()`. Does that perform any better? At least it's a bit less typing.
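For illustration (the dataset and variable names are hypothetical):

```python
import xarray as xr

dataset = xr.open_dataset('site.nc')
data_vars = ['tas', 'precip']

# squeeze() drops every length-1 dimension (e.g. a scalar lat/lon for a
# single site), so the frame ends up indexed only by the dimensions that
# actually vary, rather than a MultiIndex over degenerate dimensions.
df = dataset[data_vars].squeeze().to_dataframe()
```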
I'm not sure why `pandas.tslib.array_to_timedelta64` is slow here, or even how it is being called in your example. I would need a complete example that I can run to debug that.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,187608079