**user 1197350 · MEMBER · 2020-05-13T14:13:06Z**
https://github.com/pydata/xarray/issues/4043#issuecomment-628016841

> Using this chunk of time=500Mb the code runs properly but it is really slow compared with the response through local network.

You might want to experiment with smaller chunks. In general, OPeNDAP will always introduce overhead compared to direct file access.

---

**user 1197350 · MEMBER · 2020-05-12T14:38:37Z**
https://github.com/pydata/xarray/issues/4043#issuecomment-627387025

> Just for my understanding, so theoretically it is not possible to make big requests without using chunking?

This depends entirely on the TDS server configuration; see the comment in https://github.com/Unidata/netcdf-c/issues/1667#issuecomment-597372065. The default limit appears to be 500 MB.

It's important to note that _none of this_ has to do with xarray. Xarray is simply the top layer of a very deep software stack. If the TDS server could deliver larger data requests, and the netCDF4-python library could accept them, xarray would have no problem.

---

**user 1197350 · MEMBER · 2020-05-12T14:07:39Z**
https://github.com/pydata/xarray/issues/4043#issuecomment-627368616

I have spent plenty of time debugging these sorts of issues. It really helps to take xarray out of the equation. Try making your request with just the netCDF4 library; that's all that xarray uses under the hood. Overall your example is very complicated, which makes it hard to find the core issue. You generally want to try something like this:

```python
import netCDF4

# Open the remote dataset directly, bypassing xarray
ncds = netCDF4.Dataset(OPENDAP_url)
# Request the full variable
data = ncds[variable_name][:]
```

Try playing around with the slice `[:]` to see under what circumstances the OPeNDAP server fails. Then use chunking in xarray to limit the size of each individual request. That's what's described in pangeo-data/pangeo#767.

A few additional comments about your code:

```python
# Select spatial subset [lon, lat]
ds = ds.where((ds.lon >= Lon[0] - dl) & (ds.lon <= Lon[1] + dl)
              & (ds.lat >= Lat[0] - dl) & (ds.lat <= Lat[1] + dl),
              drop=True)
```

This is **NOT** how you do subsetting with xarray; `where` is meant for masking. I recommend reviewing the xarray docs on [indexing and selecting](http://xarray.pydata.org/en/stable/indexing.html). Your call should be something like:

```python
ds = ds.sel(lon=slice(...), lat=slice(...))
```

What's the difference? `where` downloads all of the data from the OPeNDAP server and then fills everything outside your selection with NaNs, while `sel` lazily limits the size of the request to the server. This could make a big difference in terms of the server's memory usage.

```python
ds = ds.sortby('lon', 'lat')
```

Can you do this sorting *after* loading the data? It's an expensive operation and might not interact well with the OPeNDAP server.
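
---

To make the chunking advice above concrete, here is a minimal sketch of opening an OPeNDAP dataset with xarray's `chunks` argument so that each request to the server stays small. The URL, variable name, and chunk size are hypothetical placeholders; tune the chunk size to your dataset and the server's response limit.

```python
import xarray as xr

# Hypothetical THREDDS/OPeNDAP endpoint; substitute your own URL
url = "https://example.com/thredds/dodsC/some/dataset"

# chunks={"time": 100} makes xarray/dask split reads along the time
# dimension, so each individual OPeNDAP request stays well below the
# server's response-size limit (reportedly ~500 MB by default)
ds = xr.open_dataset(url, chunks={"time": 100})

# Everything is lazy until .load()/.compute(), which triggers the
# chunk-by-chunk downloads
data = ds["some_variable"].isel(time=slice(0, 500)).load()
```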
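Likewise, a sketch of the `sel`-then-sort pattern suggested in the last comment, with made-up coordinate bounds. Note that `sortby` takes a list when sorting by several coordinates, and that the `slice` bounds must follow the ordering of the coordinate as stored in the file.

```python
import xarray as xr

ds = xr.open_dataset(url)  # url: the same hypothetical endpoint as above

# Lazy, server-side subsetting: only the selected window is requested.
# Bounds are placeholders; if lat is stored in descending order, the
# slice must be written as slice(50, 10) instead.
subset = ds.sel(lon=slice(-80, -30), lat=slice(10, 50))

# Download just the subset, then do the expensive sort locally
subset = subset.load()
subset = subset.sortby(["lon", "lat"])
```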