id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
2037869483,I_kwDOAMm_X855d2ur,8544,Reading netcdf file with string coordinates makes IPython kernel crash (netcdf4 engine),36678697,closed,0,,,14,2023-12-12T14:26:42Z,2024-01-04T21:49:34Z,2023-12-22T09:29:45Z,NONE,,,,"### What happened?
Opening a netCDF file that has string coordinates crashes the notebook kernel.
This only happens when `engine=netcdf4`, and not when `engine=h5netcdf`.
The bug occurs in IPython, in Jupyter in the web browser and in VSCode notebooks at least.
The bug can be reproduced consistently by reading the same file twice in the same cell and running that cell twice.
### What did you expect to happen?
`engine=netcdf4` is expected to behave the same as `engine=h5netcdf`, i.e. not crash the kernel.
### Minimal Complete Verifiable Example
```Python
# %%
import numpy as np
import xarray as xr
# %%
fpath = ""test.nc""
da = xr.DataArray(
    data=np.random.randn(3, 10),
    dims=[""label"", ""values""],
    coords=dict(
        label=[""a"", ""b"", ""c""],
    ),
)
da.to_netcdf(fpath)
# %%
# engine = ""h5netcdf""
engine = ""netcdf4""
xr.open_dataarray(fpath, engine=engine)
xr.open_dataarray(fpath, engine=engine)
```
### MVCE confirmation
- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
- [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.
### Relevant log output
IPython crashes with: `Segmentation fault (core dumped)`
Jupyter Notebook logs:
```
[I 2023-12-12 15:20:00.474 ServerApp] Kernel restarted: 054a63c4-4f46-4dc2-b58f-4dcd4ce9951c
[I 2023-12-12 15:20:00.482 ServerApp] Starting buffering for 054a63c4-4f46-4dc2-b58f-4dcd4ce9951c:0bd5dcd6-faa7-413a-b6c5-080b1c774933
[I 2023-12-12 15:20:00.494 ServerApp] Connecting to kernel 054a63c4-4f46-4dc2-b58f-4dcd4ce9951c.
[I 2023-12-12 15:20:00.494 ServerApp] Restoring connection for 054a63c4-4f46-4dc2-b58f-4dcd4ce9951c:0bd5dcd6-faa7-413a-b6c5-080b1c774933
0.00s - Debugger warning: It seems that frozen modules are being used, which may
0.00s - make the debugger miss breakpoints. Please pass -Xfrozen_modules=off
0.00s - to python to disable frozen modules.
0.00s - Note: Debugging will proceed. Set PYDEVD_DISABLE_FILE_VALIDATION=1 to disable this validation.
[IPKernelApp] WARNING | Unknown error in handling startup files:
[I 2023-12-12 15:20:09.463 ServerApp] AsyncIOLoopKernelRestarter: restarting kernel (1/5), keep random ports
[W 2023-12-12 15:20:09.463 ServerApp] kernel 054a63c4-4f46-4dc2-b58f-4dcd4ce9951c restarted
[I 2023-12-12 15:20:09.470 ServerApp] Starting buffering for 054a63c4-4f46-4dc2-b58f-4dcd4ce9951c:0bd5dcd6-faa7-413a-b6c5-080b1c774933
[I 2023-12-12 15:20:09.504 ServerApp] Connecting to kernel 054a63c4-4f46-4dc2-b58f-4dcd4ce9951c.
[I 2023-12-12 15:20:09.505 ServerApp] Restoring connection for 054a63c4-4f46-4dc2-b58f-4dcd4ce9951c:0bd5dcd6-faa7-413a-b6c5-080b1c774933
0.00s - Debugger warning: It seems that frozen modules are being used, which may
0.00s - make the debugger miss breakpoints. Please pass -Xfrozen_modules=off
0.00s - to python to disable frozen modules.
0.00s - Note: Debugging will proceed. Set PYDEVD_DISABLE_FILE_VALIDATION=1 to disable this validation.
[IPKernelApp] WARNING | Unknown error in handling startup files:
```
VSCode notebook Jupyter logs:
```
15:23:03.501 [info] Restart requested ~/Desktop/bug_xarray_notebook/bug.ipynb
15:23:03.502 [info] Dispose Kernel process 2763594.
15:23:03.589 [info] Process Execution: ~/miniconda3/bin/python -c ""import ipykernel; print(ipykernel.__version__); print(""5dc3a68c-e34e-4080-9c3e-2a532b2ccb4d""); print(ipykernel.__file__)""
15:23:03.671 [info] Process Execution: ~/miniconda3/bin/python -m ipykernel_launcher --f=~/.local/share/jupyter/runtime/kernel-v2-2727807dzOm3m1LEA5V.json
> cwd: ~/Desktop/bug_xarray_notebook
15:23:04.149 [warn] StdErr from Kernel Process [IPKernelApp] WARNING | Unknown error in handling startup files:
15:23:04.454 [info] Restarted bd04fd87-98e7-486d-a6c6-7308101edcdf
15:23:08.046 [info] Handle Execution of Cells 0 for ~/Desktop/bug_xarray_notebook/bug.ipynb
15:23:08.055 [info] Kernel acknowledged execution of cell 0 @ 1702390988054
15:23:08.412 [info] End cell 0 execution after 0.358s, completed @ 1702390988412, started @ 1702390988054
15:23:09.260 [info] Handle Execution of Cells 1 for ~/Desktop/bug_xarray_notebook/bug.ipynb
15:23:09.269 [info] Kernel acknowledged execution of cell 1 @ 1702390989268
15:23:09.305 [info] End cell 1 execution after 0.036s, completed @ 1702390989304, started @ 1702390989268
15:23:10.893 [info] Handle Execution of Cells 2 for ~/Desktop/bug_xarray_notebook/bug.ipynb
15:23:10.907 [info] Kernel acknowledged execution of cell 2 @ 1702390990907
15:23:10.971 [info] End cell 2 execution after 0.064s, completed @ 1702390990971, started @ 1702390990907
15:23:12.255 [info] Handle Execution of Cells 2 for ~/Desktop/bug_xarray_notebook/bug.ipynb
15:23:12.262 [info] Kernel acknowledged execution of cell 2 @ 1702390992262
15:23:12.504 [error] Disposing session as kernel process died ExitCode: undefined, Reason: [IPKernelApp] WARNING | Unknown error in handling startup files:
15:23:12.505 [info] Dispose Kernel process 2764104.
15:23:12.518 [info] End cell 2 execution after -1702390992.262s, completed @ undefined, started @ 1702390992262
```
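To capture the segmentation fault outside a notebook, one option is to run the reproduction in a child interpreter with `faulthandler` enabled and inspect the exit status. This is a minimal sketch using only the standard library; `run_isolated` and `describe` are hypothetical helper names, and the snippet passed in is a placeholder for the two `xr.open_dataarray` calls above.

```python
import signal
import subprocess
import sys


def run_isolated(snippet):
    # Run the snippet in a fresh interpreter; -X faulthandler makes Python
    # dump a traceback to stderr if the process receives a fatal signal.
    return subprocess.run(
        [sys.executable, '-X', 'faulthandler', '-c', snippet],
        capture_output=True,
        text=True,
    )


def describe(result):
    # On POSIX, a negative return code -N means the child was killed by signal N.
    if result.returncode < 0:
        return f'killed by {signal.Signals(-result.returncode).name}'
    return f'exited with code {result.returncode}'


if __name__ == '__main__':
    # Placeholder snippet (assumption): substitute the MVCE code here.
    result = run_isolated('print(1 + 1)')
    print(describe(result))
    print(result.stderr)
```

If the child is reported as killed by a signal, the traceback in `result.stderr` may help locate the call that triggers the crash.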
### Anything else we need to know?
_No response_
### Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.12.0 | packaged by conda-forge | (main, Oct 3 2023, 08:43:22) [GCC 12.3.0]
python-bits: 64
OS: Linux
OS-release: 5.15.0-91-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.3
libnetcdf: 4.9.2
xarray: 2023.12.0
pandas: 2.1.4
numpy: 1.26.2
scipy: None
netCDF4: 1.6.5
pydap: None
h5netcdf: 1.3.0
h5py: 3.10.0
Nio: None
zarr: None
cftime: 1.6.3
nc_time_axis: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 68.2.2
pip: 23.3.1
conda: None
pytest: None
mypy: None
IPython: 8.18.1
sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8544/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
627735640,MDU6SXNzdWU2Mjc3MzU2NDA=,4113,xarray.DataArray.stack load data into memory,36678697,closed,0,,,6,2020-05-30T13:45:38Z,2022-04-19T16:10:26Z,2022-04-19T16:10:25Z,NONE,,,,"Stacking is loading the data into memory, which is unexpected, or at least undocumented, afaik.
#### MCVE Code Sample
```python
import os
import psutil
import numpy as np
import xarray as xr
def main():
    xr.DataArray(
        np.random.randn(1024, 1024, 100),
        dims=(""x"", ""y"", ""z""),
    ).to_netcdf(""da.nc"")

    da = xr.open_dataarray(""da.nc"")
    print(f"" da: {mb(da.nbytes)} MB"")
    print_ram_state()

    mda = da.stack(px=(""x"", ""y""))
    print_ram_state()


def print_ram_state():
    # https://stackoverflow.com/a/21632554
    process = psutil.Process(os.getpid())
    ram_state = process.memory_info().rss
    print(f""RAM: {mb(ram_state) :.2f} MB"")


def mb(nbytes):
    return nbytes / (1024 * 1024)


if __name__ == ""__main__"":
    main()
```
#### Problem Description
The [`xarray.DataArray.stack`](http://xarray.pydata.org/en/stable/generated/xarray.DataArray.stack.html) method loads the data into memory, which is unexpected behavior, or at least undocumented as far as I know.
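As a cross-check on the RSS readings, the allocation made by a single step can be measured in isolation with `tracemalloc`. This is a sketch under assumptions: `measure` is a hypothetical helper, the `bytearray` call is a stand-in for `da.stack(...)`, and `tracemalloc` only sees memory requested through Python's allocators (recent NumPy versions report their array buffers to it, but allocations made inside C libraries such as HDF5 are not counted).

```python
import tracemalloc


def measure(func):
    # Call func() and return (result, peak traced bytes during the call).
    tracemalloc.start()
    try:
        result = func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


if __name__ == '__main__':
    # Stand-in workload (assumption): replace with lambda: da.stack(px=('x', 'y')).
    _, peak = measure(lambda: bytearray(10 * 1024 * 1024))
    print(f'peak: {peak / (1024 * 1024):.2f} MB')
```

A peak close to `da.nbytes` for the stacking step alone would support the observation that the call materializes the array.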
#### Versions
Output of xr.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.6 | packaged by conda-forge | (default, Mar 23 2020, 23:03:20)
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 5.3.0-53-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.5
libnetcdf: None
xarray: 0.15.1
pandas: 1.0.3
numpy: 1.17.5
scipy: 1.4.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2.16.0
distributed: 2.16.0
matplotlib: 3.2.1
cartopy: None
seaborn: 0.10.1
numbagg: None
setuptools: 46.4.0.post20200518
pip: 20.1.1
conda: 4.8.3
pytest: 5.4.2
IPython: 7.14.0
sphinx: 3.0.4
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4113/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue