id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
435535284,MDU6SXNzdWU0MzU1MzUyODQ=,2912,Writing a netCDF file is unexpectedly slow,2014301,closed,0,,,12,2019-04-21T18:31:36Z,2023-09-12T15:58:18Z,2023-09-12T15:58:18Z,NONE,,,,"```python
ncdat=xr.open_mfdataset(nclist, concat_dim='time')
ncdat['lat']=ncdat['lat'].isel(time=0).drop('time')
ncdat['lon']=ncdat['lon'].isel(time=0).drop('time')
ncdat=ncdat.rename({'north_south':'lat', 'east_west':'lon'})
lat_coords=ncdat.lat[:,0] #Extract latitudes
lon_coords=ncdat.lon[0,:] #Extract longitudes
ncdat=ncdat.drop(['lat','lon'])
reformatted_ncdat=ncdat.assign_coords(lat=lat_coords,lon=lon_coords, time=ncdat.coords['time'])
ncdat = reformatted_ncdat.sortby('time')
ncdat.to_netcdf('testing.nc')
```

#### Problem description

After some processing, I am left with this xarray dataset `ncdat` which I want to export to a netCDF file.

```
Dimensions:  (lat: 59, lon: 75, time: 500)
Coordinates:
  * time     (time) datetime64[ns] 2007-01-22 ... 2008-06-04
  * lat      (lat) float32 -4.25 -4.15 ... 1.4500003 1.5500002
  * lon      (lon) float32 29.049988 29.149994 ...
36.450012
Data variables:
    Streamflow_tavg         (time, lat, lon) float32 dask.array
    RiverDepth_tavg         (time, lat, lon) float32 dask.array
    RiverFlowVelocity_tavg  (time, lat, lon) float32 dask.array
    FloodedFrac_tavg        (time, lat, lon) float32 dask.array
    SurfElev_tavg           (time, lat, lon) float32 dask.array
    SWS_tavg                (time, lat, lon) float32 dask.array
Attributes:
    missing_value:           -9999.0
    NUM_SOIL_LAYERS:         1
    SOIL_LAYER_THICKNESSES:  1.0
    title:                   LIS land surface model output
    institution:             NASA GSFC
    source:                  model_not_specified
    history:                 created on date: 2019-04-19T09:11:12.992
    references:              Kumar_etal_EMS_2006, Peters-Lidard_etal_ISSE_2007
    conventions:             CF-1.6
    comment:                 website: http://lis.gsfc.nasa.gov/
    MAP_PROJECTION:          EQUIDISTANT CYLINDRICAL
    SOUTH_WEST_CORNER_LAT:   -4.25
    SOUTH_WEST_CORNER_LON:   29.05
    DX:                      0.1
    DY:                      0.1
```

But the problem is it takes an inordinately long time to export. Almost 10 mins for this particular file which is only 35M. How can I expedite this process? Is there anything wrong with the structure of `ncdat`?

#### Expected Output

A netCDF file

#### Output of ``xr.show_versions()``
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.3 | packaged by conda-forge | (default, Mar 27 2019, 23:01:00) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.0.101-0.47.105-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2
xarray: 0.12.1
pandas: 0.24.2
numpy: 1.16.2
scipy: 1.2.1
netCDF4: 1.5.0.1
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudonetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 1.2.0
distributed: 1.27.0
matplotlib: 3.0.3
cartopy: 0.17.0
seaborn: 0.9.0
setuptools: 41.0.0
pip: 19.0.3
conda: None
pytest: None
IPython: 7.4.0
sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2912/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
455200681,MDU6SXNzdWU0NTUyMDA2ODE=,3017,Why am I running into a IndexVariable error message while assigning dims?,2014301,closed,0,,,1,2019-06-12T13:01:44Z,2020-04-06T22:43:53Z,2020-04-06T22:43:53Z,NONE,,,,"#### Code Sample, a copy-pastable example if possible

I am trying to remove the (north_south, east_west) dimensions from `lat` and `lon` and make them coordinate dimensions. The following piece of code was working fine earlier, but now I am running into an error message. I think I updated xarray. I would like to know what I am doing wrong, or whether there is another way to do this.

```python
ncdat = xr.open_mfdataset(files_lis)
ncdat['lat']=ncdat['lat'].isel(time=0).drop('time')
ncdat['lon']=ncdat['lon'].isel(time=0).drop('time')
ncdat=ncdat.rename({'north_south':'lat', 'east_west':'lon'})
lat_coords = ncdat['lat'].values[:,-1] #Extract latitudes
lon_coords = ncdat['lon'].values[-1,:] #Extract longitudes
reformatted_ncdat=ncdat.assign_coords(lat=lat_coords, lon=lon_coords, time=ncdat.coords['time'])
```

#### Problem description

This is `ncdat` after `xr.open_mfdataset(files_lis)`

```
Dimensions:           (SoilMoist_profiles: 3, east_west: 172, north_south: 92, time: 10)
Coordinates:
  * time              (time) datetime64[ns] 2016-08-01 2016-08-02 ...
2016-08-10
Dimensions without coordinates: SoilMoist_profiles, east_west, north_south
Data variables:
    lat               (time, north_south, east_west) float32 dask.array
    lon               (time, north_south, east_west) float32 dask.array
    Snowf_tavg        (time, north_south, east_west) float32 dask.array
    Rainf_tavg        (time, north_south, east_west) float32 dask.array
    Evap_tavg         (time, north_south, east_west) float32 dask.array
    Qs_tavg           (time, north_south, east_west) float32 dask.array
    Qsb_tavg          (time, north_south, east_west) float32 dask.array
    Qsm_tavg          (time, north_south, east_west) float32 dask.array
    SWE_tavg          (time, north_south, east_west) float32 dask.array
    SnowDepth_tavg    (time, north_south, east_west) float32 dask.array
    SoilMoist_tavg    (time, SoilMoist_profiles, north_south, east_west) float32 dask.array
    CanopInt_tavg     (time, north_south, east_west) float32 dask.array
    WaterTableD_tavg  (time, north_south, east_west) float32 dask.array
    TWS_tavg          (time, north_south, east_west) float32 dask.array
    GWS_tavg          (time, north_south, east_west) float32 dask.array
    SnowCover_tavg    (time, north_south, east_west) float32 dask.array
    TotalPrecip_tavg  (time, north_south, east_west) float32 dask.array
```

#### Error

`ValueError: IndexVariable objects must be 1-dimensional`

#### Expected Output

Please ignore values. Just an illustration.

```
Dimensions:           (SoilMoist_profiles: 3, lat: 92, lon: 172, time: 10)
Coordinates:
  * lat               (lat) float32 4.1249995 4.375 4.625 ... 26.624996 26.875
  * lon               (lon) float32 nan nan nan ... 24.375 24.625004 24.874996
  * time              (time) datetime64[ns] 2016-08-01 2016-08-02 ... 2017-12-31
Dimensions without coordinates: SoilMoist_profiles
Data variables:
    Snowf_tavg        (time, lat, lon) float32 ...
    Rainf_tavg        (time, lat, lon) float32 ...
    Evap_tavg         (time, lat, lon) float32 ...
    Qs_tavg           (time, lat, lon) float32 ...
    Qsb_tavg          (time, lat, lon) float32 ...
    Qsm_tavg          (time, lat, lon) float32 ...
    SWE_tavg          (time, lat, lon) float32 ...
    SnowDepth_tavg    (time, lat, lon) float32 ...
    SoilMoist_tavg    (time, SoilMoist_profiles, lat, lon) float32 ...
    CanopInt_tavg     (time, lat, lon) float32 ...
    WaterTableD_tavg  (time, lat, lon) float32 ...
    TWS_tavg          (time, lat, lon) float32 ...
    GWS_tavg          (time, lat, lon) float32 ...
    SnowCover_tavg    (time, lat, lon) float32 ...
    TotalPrecip_tavg  (time, lat, lon) float32 ...
```

#### Output of ``xr.show_versions()``
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.3 (default, Mar 27 2019, 16:54:48) [Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 18.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2
xarray: 0.12.1
pandas: 0.24.2
numpy: 1.16.4
scipy: 1.2.1
netCDF4: 1.4.2
pydap: None
h5netcdf: None
h5py: 2.9.0
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudonetCDF: None
rasterio: 1.0.21
cfgrib: None
iris: None
bottleneck: None
dask: 1.2.2
distributed: 1.28.1
matplotlib: 3.1.0
cartopy: None
seaborn: 0.9.0
setuptools: 41.0.1
pip: 19.1.1
conda: None
pytest: None
IPython: 7.5.0
sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3017/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
392816513,MDU6SXNzdWUzOTI4MTY1MTM=,2623,Why is my export to netcdf command leading to a __truediv__ error?,2014301,closed,0,,,7,2018-12-19T23:16:14Z,2018-12-24T15:58:03Z,2018-12-24T15:58:03Z,NONE,,,,"#### Code Sample, a copy-pastable example if possible

```python
import numpy as np
import xarray as xr
import glob, os

NCDIR = './output/out/'
finalfile = 'summaout.nc'
outfilelist = glob.glob((NCDIR+'/*{}*.nc').format('basin_*timestep'))
ds=xr.open_mfdataset(outfilelist, concat_dim='hru')
replace = ds['pptrate']
runoff = ds['averageInstantRunoff'].values
runoff = np.squeeze(runoffdata, axis=2)
runoff = runoff.transpose()
replace.values = runoff
ncconvert = ds.drop('averageInstantRunoff')
runoffarray = xr.DataArray(runoff, dims=['time','hru'])
ds['averageInstantRunoff'] = runoffarray
ds.to_netcdf('test.nc')
```

#### Problem description

This is `ds` just before export.

```
Dimensions:               (hru: 17, time: 233)
Coordinates:
  * hru                   (hru) int64 9 17 11 8 3 2 6 4 7 12 1 13 10 16 15 5 14
  * time                  (time) datetime64[ns] 2010-01-01 ... 2010-01-30
Data variables:
    pptrate               (time, hru) float64 9.241e-05 9.241e-05 ... 2.717e-09
    hruId                 (hru) int64 dask.array
    averageInstantRunoff  (time, hru) float64 9.241e-05 9.241e-05 ... 2.717e-09
    nSnow                 (time, hru) int32 dask.array
    nSoil                 (time, hru) int32 dask.array
    nLayers               (time, hru) int32 dask.array
```

I get this error message:

```
TypeError: cannot perform __truediv__ with this index type:
```

#### Expected Output

netCDF

#### Output of ``xr.show_versions()``
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Linux
OS-release: 3.12.62-60.64.8-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
xarray: 0.11.0
pandas: 0.21.0
numpy: 1.13.3
scipy: 1.0.0
netCDF4: 1.4.0
h5netcdf: 0.5.0
h5py: 2.7.1
Nio: None
zarr: None
cftime: 1.0.0
PseudonetCDF: None
rasterio: None
iris: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.19.3
distributed: 1.23.3
matplotlib: 2.1.2
cartopy: 0.15.1
seaborn: 0.9.0
setuptools: 36.7.2
pip: 18.1
conda: 4.5.11
pytest: 3.2.5
IPython: 6.2.1
sphinx: 1.6.5
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2623/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue