Comments on [pydata/xarray#1396](https://github.com/pydata/xarray/issues/1396), newest first.

---

**user 2443309** (MEMBER) · 2019-01-13 06:32 UTC · [permalink](https://github.com/pydata/xarray/issues/1396#issuecomment-453806119)

Closed via https://github.com/dask/dask/pull/2364 (a long time ago).

---

**user 1197350** (MEMBER) · 2017-06-07 16:07 UTC · [permalink](https://github.com/pydata/xarray/issues/1396#issuecomment-306843285)

Hi @JanisGailis. Thanks for looking into this issue! I will give your solution a try as soon as I get some free time. However, I would like to point out that the issue is completely resolved by dask/dask#2364, so this can probably be closed after the next dask release.

---

**user 1197350** (MEMBER) · 2017-05-23 00:16 UTC · [permalink](https://github.com/pydata/xarray/issues/1396#issuecomment-303254497)

This dask bug also explains why it is so slow to generate the `repr` for these big datasets.

---

**user 1197350** (MEMBER) · 2017-05-16 16:27 UTC · [permalink](https://github.com/pydata/xarray/issues/1396#issuecomment-301837577)

I have created a self-contained, reproducible example of this serious performance problem: https://gist.github.com/rabernat/7175328ee04a3167fa4dede1821964c6

This issue is becoming a big problem for me. I imagine other people must be experiencing it too. I am happy to try to dig in and fix it, but I think some of @shoyer's backend insight would be valuable first.

---

**user 1197350** (MEMBER) · 2017-05-02 20:45 UTC · [permalink](https://github.com/pydata/xarray/issues/1396#issuecomment-298755789)

> dask may be loading full arrays to do this computation

This is definitely what I suspect is happening. The problem with adding more chunks is that I quickly hit my system ulimit (see #1394), since, for some reason, all 1754 files are opened as soon as I call `.load()`. Putting more chunks in just multiplies that number.

---

**user 1217238** (MEMBER) · 2017-05-02 20:42 UTC · [permalink](https://github.com/pydata/xarray/issues/1396#issuecomment-298755027)

One thing worth trying is specifying `chunks` manually in `open_mfdataset`. Point-wise indexing should not really require chunks specified ahead of time, but the optimizations dask uses to make these operations efficient are somewhat fragile, so dask may be loading full arrays to do this computation.
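For illustration, a minimal sketch of what specifying `chunks` manually in `open_mfdataset` plus point-wise (vectorized) indexing might look like; the file pattern, variable name, chunk sizes, and index values here are hypothetical, not from the issue:

```python
import numpy as np
import xarray as xr

# Hypothetical file pattern, dimension names, and chunk sizes.
ds = xr.open_mfdataset(
    "data/*.nc",
    chunks={"time": 100, "lat": 90, "lon": 180},  # explicit dask chunks
)

# Point-wise (vectorized) indexing: one value per (lat, lon) pair,
# rather than a full hyperslab.
pts = ds["temp"].isel(
    lat=xr.DataArray(np.array([10, 20, 30]), dims="points"),
    lon=xr.DataArray(np.array([40, 50, 60]), dims="points"),
)
pts.load()  # with explicit chunks, dask should read only the needed blocks
```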
---

**user 1217238** (MEMBER) · 2017-05-02 20:40 UTC · [permalink](https://github.com/pydata/xarray/issues/1396#issuecomment-298754480)

OK, so that isn't terribly useful -- the slow-down is somewhere in dask-land. If it was an issue with alignment, that would come up when building the dask graph, not computing it.

---

**user 1197350** (MEMBER) · 2017-05-02 20:06 UTC · [permalink](https://github.com/pydata/xarray/issues/1396#issuecomment-298745833)

The relevant part of the stack trace is

```
/home/rpa/.conda/envs/dask_distributed/lib/python3.5/site-packages/xarray/core/dataarray.py in load(self)
    571         working with many file objects on disk.
    572         """
--> 573         ds = self._to_temp_dataset().load()
    574         new = self._from_temp_dataset(ds)
    575         self._variable = new._variable

/home/rpa/.conda/envs/dask_distributed/lib/python3.5/site-packages/xarray/core/dataset.py in load(self)
    467
    468         # evaluate all the dask arrays simultaneously
--> 469         evaluated_data = da.compute(*lazy_data.values())
    470
    471         for k, data in zip(lazy_data, evaluated_data):

/home/rpa/.conda/envs/dask_distributed/lib/python3.5/site-packages/dask/base.py in compute(*args, **kwargs)
    200     dsk = collections_to_dsk(variables, optimize_graph, **kwargs)
    201     keys = [var._keys() for var in variables]
--> 202     results = get(dsk, keys, **kwargs)
    203
    204     results_iter = iter(results)

/home/rpa/.conda/envs/dask_distributed/lib/python3.5/site-packages/distributed/client.py in get(self, dsk, keys, restrictions, loose_restrictions, resources, **kwargs)
   1523         return sync(self.loop, self._get, dsk, keys, restrictions=restrictions,
   1524                     loose_restrictions=loose_restrictions,
-> 1525                     resources=resources)
   1526
   1527     def _optimize_insert_futures(self, dsk, keys):

/home/rpa/.conda/envs/dask_distributed/lib/python3.5/site-packages/distributed/utils.py in sync(loop, func, *args, **kwargs)
    200         loop.add_callback(f)
    201     while not e.is_set():
--> 202         e.wait(1000000)
    203     if error[0]:
    204         six.reraise(*error[0])

/home/rpa/.conda/envs/dask_distributed/lib/python3.5/threading.py in wait(self, timeout)
    547         signaled = self._flag
    548         if not signaled:
--> 549             signaled = self._cond.wait(timeout)
    550         return signaled
    551

/home/rpa/.conda/envs/dask_distributed/lib/python3.5/threading.py in wait(self, timeout)
    295         else:
    296             if timeout > 0:
--> 297                 gotit = waiter.acquire(True, timeout)
    298             else:
    299                 gotit = waiter.acquire(False)

KeyboardInterrupt:
```

I think the issue you are referring to is also mine (#1385).
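The reasoning above (the hang is inside `compute`, not in graph construction) can be checked directly by timing the two phases separately. A minimal sketch, with a hypothetical file pattern and variable name:

```python
import time
import xarray as xr

ds = xr.open_mfdataset("data/*.nc")  # hypothetical input files

t0 = time.time()
lazy = ds["temp"].mean("time")  # only builds the dask graph; no I/O yet
print("graph construction: {:.2f}s".format(time.time() - t0))

t0 = time.time()
result = lazy.compute()  # runs the graph; I/O and the reduction happen here
print("computation: {:.2f}s".format(time.time() - t0))
```

If the first timing is small and the second is where everything stalls, the problem lives in dask's execution, as the comment above argues.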
---

**user 1217238** (MEMBER) · 2017-05-02 19:38 UTC · [permalink](https://github.com/pydata/xarray/issues/1396#issuecomment-298738745)

Can you try using `Ctrl + C` to interrupt things and report the stack trace? This might be an issue with xarray verifying aligned coordinates, which we should have an option to disable. (This came up somewhere else recently, but I couldn't find the issue with a quick search.)
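On the "option to disable" point: later xarray releases did add `open_mfdataset` keywords that skip most of this per-file coordinate checking. A sketch of that usage, with a hypothetical file pattern and concat dimension:

```python
import xarray as xr

ds = xr.open_mfdataset(
    "data/*.nc",          # hypothetical file pattern
    combine="nested",
    concat_dim="time",    # hypothetical concat dimension
    coords="minimal",     # only concatenate coords that contain the concat dim
    data_vars="minimal",  # likewise for data variables
    compat="override",    # take values from the first file; skip equality checks
    join="override",      # use indexes from the first file; skip alignment
)
```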