html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/2912#issuecomment-542369777,https://api.github.com/repos/pydata/xarray/issues/2912,542369777,MDEyOklzc3VlQ29tbWVudDU0MjM2OTc3Nw==,668201,2019-10-15T19:32:50Z,2019-10-15T19:32:50Z,NONE,"Thanks for the explanations @jhamman and @shoyer :) Actually it turns out that I was not using particularly small chunks, but the filesystem for /tmp was faulty... After trying on a reliable filesystem, the results are much more reasonable.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,435535284
https://github.com/pydata/xarray/issues/2912#issuecomment-533801682,https://api.github.com/repos/pydata/xarray/issues/2912,533801682,MDEyOklzc3VlQ29tbWVudDUzMzgwMTY4Mg==,668201,2019-09-21T14:21:17Z,2019-09-21T14:21:17Z,NONE,"> There are ways to side step some of these challenges (`save_mfdataset` and the distributed dask scheduler)

@jhamman Could you elaborate on these ways? I am having severe slow-downs when writing Datasets by blocks (backed by dask). I have also noticed that the slowdowns do not occur when writing to ramdisk. Here are the timings of `to_netcdf`, which uses the default engine and encoding (the nc file is 4.3 GB):
- When writing to ramdisk (`/dev/shm/`): 2min 1s
- When writing to `/tmp/`: 27min 28s
- When writing to `/tmp/` after `.load()`, as suggested here: 34s (`.load` takes 1min 43s)

The workaround suggested here works, but the datasets may not always fit in memory, and it defeats the essential purpose of dask...
Note: I am using dask 2.3.0 and xarray 0.12.3","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,435535284
https://github.com/pydata/xarray/issues/1378#issuecomment-295657656,https://api.github.com/repos/pydata/xarray/issues/1378,295657656,MDEyOklzc3VlQ29tbWVudDI5NTY1NzY1Ng==,668201,2017-04-20T09:50:19Z,2017-04-20T09:53:33Z,NONE,"I cannot see a use case in which repeated dims actually make sense. In my case this situation originates from h5 files which indeed contain repeated dimensions (`variables(dimensions): uint16 B0(phony_dim_0,phony_dim_0), ..., uint8 VAA(phony_dim_1,phony_dim_1)`), thus xarray is not to blame here. These are ""dummy"" dimensions, not associated with physical values. What we do to circumvent this problem is to ""re-dimension"" all variables. Maybe a safe approach would be for open_dataset to raise a warning by default when encountering such variables, possibly with an option to perform automatic or custom dimension naming to avoid repeated dims.
I also agree with @shoyer that failing loudly when operating on such DataArrays, instead of providing confusing results, would be an improvement.","{""total_count"": 5, ""+1"": 1, ""-1"": 4, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,222676855
https://github.com/pydata/xarray/issues/1378#issuecomment-295593740,https://api.github.com/repos/pydata/xarray/issues/1378,295593740,MDEyOklzc3VlQ29tbWVudDI5NTU5Mzc0MA==,668201,2017-04-20T06:11:02Z,2017-04-20T06:11:02Z,NONE,"Right, positional indexing also works unexpectedly in this case, though I understand it's tricky and should probably be discouraged:
```python
A[0,:]  # returns A
A[:,0]  # returns A.isel(dim0=0)
```
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,222676855
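Below is a minimal sketch of the two mitigations discussed in the issue #2912 comment above: loading the dataset into memory before `to_netcdf`, and splitting the write across several files with `save_mfdataset`. The dataset, chunking, and output paths are made-up placeholders, not taken from the original report.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical dask-backed dataset; variable name, sizes, chunking and
# output paths are assumptions for illustration only.
time = pd.date_range("2015-01-01", periods=730)
ds = xr.Dataset(
    {"var": (("time", "x"), np.random.rand(730, 1000))},
    coords={"time": time},
).chunk({"time": 73})

# Workaround from the thread: load into memory first, then write.
# Fast, but only possible when the dataset fits in RAM.
ds.load().to_netcdf("/tmp/with_load.nc")

# Alternative hinted at by @jhamman: split the dataset along a dimension and
# write several files in one save_mfdataset call; they can be reopened later
# with open_mfdataset.
years, yearly = zip(*ds.groupby("time.year"))
paths = [f"/tmp/part_{year}.nc" for year in years]
xr.save_mfdataset(yearly, paths)
```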
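And a sketch of the "re-dimension" workaround mentioned in the issue #1378 comment, assuming a hypothetical file name and that each variable's axes should simply receive distinct, variable-specific dimension names so that no variable keeps a repeated dimension:

```python
import xarray as xr

# Hypothetical file containing "phony" repeated dimensions such as
# (phony_dim_0, phony_dim_0); the file name is an assumption.
ds = xr.open_dataset("example.h5")

fixed = {}
for name, var in ds.data_vars.items():
    # Give every axis of this variable its own dimension name.
    new_dims = [f"{name}_dim{i}" for i in range(var.ndim)]
    fixed[name] = xr.DataArray(var.values, dims=new_dims, attrs=var.attrs)

ds_fixed = xr.Dataset(fixed, attrs=ds.attrs)
```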