## Issue #6924: Memory Leakage Issue When Running to_netcdf

*Status: closed (opened 2022-08-16, closed 2023-01-17, 2 comments)*

### What is your issue?

I have a zarr file that is too large to fit in memory, and I'd like to convert it to netCDF. My computer has 32 GB of RAM, so writing ~5.5 GB chunks shouldn't be a problem. However, within seconds of running this script, memory usage tops out, consuming the available ~20 GB, and the script fails.

Data: [Dropbox link](https://www.dropbox.com/sh/xmcz93p53n1w3ft/AACjI9EskzwKsA8sp-WmM2BFa?dl=0) to a zarr file containing radar rainfall data for 6/28/2014 over the United States, around 1.8 GB in total.

Code:

```python
import xarray as xr
import zarr

fpath_zarr = "out_zarr_20140628.zarr"
ds_from_zarr = xr.open_zarr(store=fpath_zarr, chunks={'outlat': 3500, 'outlon': 7000, 'time': 30})
ds_from_zarr.to_netcdf("ds_zarr_to_nc.nc", encoding={"rainrate": {"zlib": True}})
```

Output:

```python
MemoryError: Unable to allocate 5.48 GiB for an array with shape (30, 3500, 7000) and data type float64
```

Package versions:

```
dask    2022.7.0
xarray  2022.3.0
zarr    2.8.1
```

![memory_screenshot](https://user-images.githubusercontent.com/64621312/185004542-7c91bcbc-7e7b-4656-a306-732bc1d2e9c3.jpg)
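One workaround sketch, assuming the store can simply be reopened with smaller chunks: `to_netcdf(..., compute=False)` builds the write as a dask graph, and computing it with the synchronous scheduler processes one chunk at a time instead of pulling many chunks into memory at once. The smaller chunk sizes here are illustrative, not taken from the report above.

```python
import xarray as xr
from dask.diagnostics import ProgressBar

fpath_zarr = "out_zarr_20140628.zarr"
# Quarter the spatial chunks: 30 x 875 x 1750 float64 is ~350 MiB per task
ds = xr.open_zarr(fpath_zarr, chunks={'outlat': 875, 'outlon': 1750, 'time': 30})

# Build the task graph without executing it
delayed_write = ds.to_netcdf("ds_zarr_to_nc.nc",
                             encoding={"rainrate": {"zlib": True}},
                             compute=False)

with ProgressBar():
    # The synchronous scheduler runs one task at a time, so peak memory
    # stays near a single chunk rather than the whole dataset
    delayed_write.compute(scheduler="synchronous")
```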
## Issue #6920: Writing a netCDF file is slow

*Status: closed (opened 2022-08-16, closed 2022-08-16, 3 comments)*

### What is your issue?

This has been discussed in [another thread](https://github.com/pydata/xarray/issues/2912), but the proposed solution there (first `.load()` the dataset into memory before running `to_netcdf`) does not work for me, since my dataset is too large to fit into memory.

The following code takes around 8 hours to run. You'll notice that I tried both `xr.open_mfdataset` and `xr.concat` in case it would make a difference, but it doesn't. I also tried profiling the code according to [this example](https://docs.dask.org/en/latest/diagnostics-local.html#example). The results are in this [html](https://www.dropbox.com/sh/42gzmne9a06qo8m/AAB6qqiFFQOScg8Ou4hH5GoZa?dl=0) (Dropbox link), but I'm not really sure what I'm looking at.

Data: [Dropbox link](https://www.dropbox.com/sh/onr9l7g7n254848/AAD9vkvWFg1FbinZ-EHHC7L2a?dl=0) to 717 netCDF files containing radar rainfall data for 6/28/2014 over the United States, around 1 GB in total.

Code:

```python
#%% Import libraries
import xarray as xr
from glob import glob
import pandas as pd
import time
import dask

dask.config.set(**{'array.slicing.split_large_chunks': False})

files = glob("data/*.nc")

#%% functions
def extract_file_timestep(fname):
    # Pull the trailing timestamp out of a .nc or .grib2 filename
    fname = fname.split('/')[-1]
    fname = fname.split(".")
    ftype = fname.pop(-1)
    fname = ''.join(fname)
    str_tstep = fname.split("_")[-1]
    if ftype == "nc":
        date_format = '%Y%m%d%H%M'
    if ftype == "grib2":
        date_format = '%Y%m%d-%H%M%S'
    tstep = pd.to_datetime(str_tstep, format=date_format)
    return tstep

def ds_preprocessing(ds):
    # Stamp each file's dataset with its time coordinate and harmonize names
    tstamp = extract_file_timestep(ds.encoding['source'])
    ds.coords["time"] = tstamp
    ds = ds.expand_dims({"time": 1})
    ds = ds.rename({"lon": "longitude", "lat": "latitude", "mrms_a2m": "rainrate"})
    ds = ds.chunk(chunks={"latitude": 3500, "longitude": 7000, "time": 1})
    return ds

#%% Loading and formatting data
lst_ds = []
start_time = time.time()
for f in files:
    ds = xr.open_dataset(f, chunks={"latitude": 3500, "longitude": 7000})
    ds = ds_preprocessing(ds)
    lst_ds.append(ds)
ds_comb_frm_lst = xr.concat(lst_ds, dim="time")
print("Time to load dataset using concat on list of datasets: {}".format(time.time() - start_time))

start_time = time.time()
ds_comb_frm_open_mfdataset = xr.open_mfdataset(files, chunks={"latitude": 3500, "longitude": 7000},
                                               concat_dim="time", preprocess=ds_preprocessing,
                                               combine="nested")
print("Time to load dataset using open_mfdataset: {}".format(time.time() - start_time))

#%% exporting to netcdf
start_time = time.time()
ds_comb_frm_lst.to_netcdf("ds_comb_frm_lst.nc", encoding={"rainrate": {"zlib": True}})
print("Time to export dataset created using concat on list of datasets: {}".format(time.time() - start_time))

start_time = time.time()
ds_comb_frm_open_mfdataset.to_netcdf("ds_comb_frm_open_mfdataset.nc", encoding={"rainrate": {"zlib": True}})
print("Time to export dataset created using open_mfdataset: {}".format(time.time() - start_time))
```
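Since the report says the dask profiling output was hard to interpret, here is a minimal profiling sketch in the spirit of the dask diagnostics example linked above. It reuses `ds_comb_frm_open_mfdataset` from the script; the output filename is arbitrary, and `visualize` needs bokeh installed.

```python
from dask.diagnostics import Profiler, ResourceProfiler, ProgressBar, visualize

# Wrap the slow write in task-level and CPU/memory profilers
with Profiler() as prof, ResourceProfiler(dt=1.0) as rprof, ProgressBar():
    ds_comb_frm_open_mfdataset.to_netcdf("ds_comb_frm_open_mfdataset.nc",
                                         encoding={"rainrate": {"zlib": True}})

# Render both timelines into one interactive HTML report; long stretches
# with little task overlap suggest the compressed write is the bottleneck
visualize([prof, rprof], filename="to_netcdf_profile.html", show=False)
```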
## Issue #6892: 2 Dimension Plot Producing Discontinuous Grid

*Status: closed (opened 2022-08-08, closed 2022-08-08, 1 comment)*

### What is your issue?

**Problem:** I'm expecting a plot that looks like the one [here](https://docs.xarray.dev/en/stable/user-guide/plotting.html#id2) (Plotting --> Two Dimensions --> Simple Example) with a continuous grid, but instead I'm getting the plot below, which has a discontinuous grid. This could be due to different spacing in the x and y dimensions (0.005 spacing in the `outlat` dimension and 0.00328768 spacing in the `outlon` dimension), but I don't know what to do about it.

![image](https://user-images.githubusercontent.com/64621312/183471078-e2a76231-1f5e-4b13-8ca5-511af22bf792.png)

**Data:** [Dropbox download link for 20 years of monthly rainfall totals covering Norfolk, VA in netCDF format (2.2 MB)](https://www.dropbox.com/s/so61kkqosvru9q6/monthly_rainfall.nc?dl=0)

**Reprex:**

```python
import xarray as xr

ds = xr.open_dataset("monthly_rainfall.nc")
ds.rainrate.isel(time=100).plot()
```
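A diagnostic sketch, not from the report: for 2-D data, `.plot()` dispatches to matplotlib's `pcolormesh`, which draws one quadrilateral per cell, so gaps usually trace back to uneven or non-monotonic coordinate values rather than to the data itself. Checking the step sizes directly should confirm or rule that out (the `outlat`/`outlon` names come from the spacing figures quoted above).

```python
import numpy as np
import xarray as xr

ds = xr.open_dataset("monthly_rainfall.nc")

for dim in ("outlat", "outlon"):
    steps = np.diff(ds[dim].values)
    # A regular grid has uniform steps of one sign; mixed signs or
    # outlier step sizes show up as broken cells in pcolormesh
    print(f"{dim}: min step {steps.min():.6g}, max step {steps.max():.6g}")
```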
## Issue #6805: PermissionError: [Errno 13] Permission denied

*Status: closed (opened 2022-07-18, closed 2022-07-18, 5 comments)*

### What is your issue?

This was raised about a year ago but still seems to be unresolved, so I'm hoping this will bring attention back to the issue (https://github.com/pydata/xarray/issues/5488).

**Data**: [Dropbox sharing link](https://www.dropbox.com/sh/1jfwpzas0vfqd3o/AAAOaQsgjLBqYIc37ucshOMwa?dl=0)

**Description**: This folder contains 2 files, each containing 1 day's worth of 1 km x 1 km gridded precipitation-rate data from the National Severe Storms Laboratory. Each is about a gig (sorry they're so big, but it's what I'm working with!).

**Code**:

```python
import xarray as xr

f_in_ncs = "data/"
f_in_nc = "data/20190520.nc"

#%% works
ds = xr.open_dataset(f_in_nc, chunks={'outlat': 3500, 'outlon': 7000, 'time': 50})

#%% doesn't work
mf_ds = xr.open_mfdataset(f_in_ncs, concat_dim="time",
                          chunks={'outlat': 3500, 'outlon': 7000, 'time': 50},
                          combine="nested", engine='netcdf4')
```

**Error** (traceback abridged; `<...>` marks object reprs stripped by the issue renderer):

```python
KeyError                                  Traceback (most recent call last)
File c:\Users\Daniel\anaconda3\envs\mrms\lib\site-packages\xarray\backends\file_manager.py:199, in CachingFileManager._acquire_with_cache_info(self, needs_lock)
    198 try:
--> 199     file = self._cache[self._key]
    200 except KeyError:

File c:\Users\Daniel\anaconda3\envs\mrms\lib\site-packages\xarray\backends\lru_cache.py:53, in LRUCache.__getitem__(self, key)
     52 with self._lock:
---> 53     value = self._cache[key]
     54     self._cache.move_to_end(key)

KeyError: [<...>, ('d:\\mrms_processing\\_reprex\\2022-7-18_open_mfdataset\\data',), 'r', (('clobber', True), ('diskless', False), ('format', 'NETCDF4'), ('persist', False))]

During handling of the above exception, another exception occurred:

PermissionError                           Traceback (most recent call last)
Input In [4], in <...>()
      1 import xarray as xr
      3 f_in_ncs = "data/"
----> 5 ds = xr.open_mfdataset(f_in_ncs, concat_dim="time",
      6                        chunks={'outlat': 3500, 'outlon': 7000, 'time': 50},
      7                        combine="nested", engine='netcdf4')

File c:\Users\Daniel\anaconda3\envs\mrms\lib\site-packages\xarray\backends\api.py:908, in open_mfdataset(paths, chunks, concat_dim, compat, preprocess, engine, data_vars, coords, combine, parallel, join, attrs_file, combine_attrs, **kwargs)
...
File src\netCDF4\_netCDF4.pyx:2307, in netCDF4._netCDF4.Dataset.__init__()
File src\netCDF4\_netCDF4.pyx:1925, in netCDF4._netCDF4._ensure_nc_success()

PermissionError: [Errno 13] Permission denied: b'd:\\mrms_processing\\_reprex\\2022-7-18_open_mfdataset\\data'
```
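The traceback points at a likely cause: the netCDF4 engine ends up trying to open the `data` directory itself as a dataset (`Permission denied: b'd:\\...\\data'`). Assuming the goal is to combine every `.nc` file in that folder, one probable fix is to hand `open_mfdataset` a glob pattern or an explicit list of paths rather than the bare directory.

```python
import xarray as xr

# A glob pattern (or a list of file paths) instead of the bare "data/"
# directory, so the netcdf4 engine only ever sees actual .nc files
mf_ds = xr.open_mfdataset("data/*.nc", concat_dim="time",
                          chunks={'outlat': 3500, 'outlon': 7000, 'time': 50},
                          combine="nested", engine='netcdf4')
```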