html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/5396#issuecomment-1028911838,https://api.github.com/repos/pydata/xarray/issues/5396,1028911838,IC_kwDOAMm_X849U_Le,37222223,2022-02-03T11:49:29Z,2022-02-03T11:49:29Z,NONE,"I'm receiving a very similar error when trying to work with data from the [NOAA ObsPack](https://gml.noaa.gov/ccgg/obspack/data.php) (obspack_multi-species_1_CCGGTowerInsitu_v1.0_2018-02-08). This is collated greenhouse gas data from global measurement stations, in a largely standardised format.

This is an example of the code written:

```
import xarray as xr
from pathlib import Path

# Input file from the ObsPack download
directory = Path(""/user/work/rt17603/Input_files/obs_raw/NOAA_ObsPack_test_multi2/data/nc"")
filename1 = directory / ""co2_bao_tower-insitu_1_ccgg_all.nc""

# Output file: same name with ""_all.nc"" replaced by ""_cut.nc""
home_directory = Path(""/user/home/rt17603/Input_files_temp/"")
out_filename1 = home_directory / str(filename1.name).replace(""_all.nc"", ""_cut.nc"")

# Read the file and immediately write it back out
ds1 = xr.open_dataset(filename1)
ds1.to_netcdf(out_filename1)
```

### What happened

If I read this data and then immediately write it back to a netCDF file, as shown above, I receive a `ValueError: operands could not be broadcast together with remapped shapes`. I also tried this for multiple files from the ObsPack, with the same result.
Full error message

```
Traceback (most recent call last):
  File ""src/netCDF4/_netCDF4.pyx"", line 4916, in netCDF4._netCDF4.Variable.__setitem__
ValueError: cannot reshape array of size 612918 into shape (204306,100)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File ""/user/home/rt17603/Code_BP1/OpenGHG/Create_NOAA_nc_test_files.py"", line 78, in <module>
    ds1.to_netcdf(out_filename1)
  File ""/user/home/rt17603/work/environments/openghg_env/lib/python3.8/site-packages/xarray/core/dataset.py"", line 1902, in to_netcdf
    return to_netcdf(
  File ""/user/home/rt17603/work/environments/openghg_env/lib/python3.8/site-packages/xarray/backends/api.py"", line 1072, in to_netcdf
    dump_to_store(
  File ""/user/home/rt17603/work/environments/openghg_env/lib/python3.8/site-packages/xarray/backends/api.py"", line 1119, in dump_to_store
    store.store(variables, attrs, check_encoding, writer, unlimited_dims=unlimited_dims)
  File ""/user/home/rt17603/work/environments/openghg_env/lib/python3.8/site-packages/xarray/backends/common.py"", line 265, in store
    self.set_variables(
  File ""/user/home/rt17603/work/environments/openghg_env/lib/python3.8/site-packages/xarray/backends/common.py"", line 307, in set_variables
    writer.add(source, target)
  File ""/user/home/rt17603/work/environments/openghg_env/lib/python3.8/site-packages/xarray/backends/common.py"", line 156, in add
    target[...] = source
  File ""/user/home/rt17603/work/environments/openghg_env/lib/python3.8/site-packages/xarray/backends/netCDF4_.py"", line 69, in __setitem__
    data[key] = value
  File ""src/netCDF4/_netCDF4.pyx"", line 4918, in netCDF4._netCDF4.Variable.__setitem__
  File ""<__array_function__ internals>"", line 5, in broadcast_to
  File ""/user/home/rt17603/work/environments/openghg_env/lib/python3.8/site-packages/numpy/lib/stride_tricks.py"", line 411, in broadcast_to
    return _broadcast_to(array, shape, subok=subok, readonly=True)
  File ""/user/home/rt17603/work/environments/openghg_env/lib/python3.8/site-packages/numpy/lib/stride_tricks.py"", line 348, in _broadcast_to
    it = np.nditer(
ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (204306,3) and requested shape (204306,100)
```
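
As a sanity check on the numbers in the traceback, here is a minimal sketch in plain Python that relates the two error messages. It only restates NumPy's broadcasting rule and does not touch xarray or netCDF4; `can_broadcast_to` is a hypothetical helper written for illustration, not a NumPy function. It suggests the array being written is 3 characters wide per observation while the target variable expects 100, but it does not identify which variable is responsible.

```python
# Shapes copied from the traceback above.
n_obs = 204306
flat_size = 612918            # size reported by the failed reshape
target_shape = (n_obs, 100)   # shape netCDF4 tried to write into

# Character width actually present in the data being written.
actual_width, remainder = divmod(flat_size, n_obs)
source_shape = (n_obs, actual_width)


def can_broadcast_to(src_shape, dst_shape):
    # Plain-Python restatement of NumPy's broadcasting rule (hypothetical
    # helper): compare trailing dimensions; each source dimension must
    # equal the target dimension or be 1.
    if len(src_shape) > len(dst_shape):
        return False
    return all(s == d or s == 1
               for s, d in zip(reversed(src_shape), reversed(dst_shape)))


print(source_shape, remainder)                        # (204306, 3) 0
print(can_broadcast_to(source_shape, target_shape))   # False
```

Since 612918 = 204306 x 3 exactly, the first `ValueError` (reshape) and the second (broadcast) describe the same mismatch: a trailing dimension of 3 is neither 1 nor 100, so it cannot be broadcast to 100.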
I can upload an example netCDF file if needed (the files are between 0.1 and 5 MB in size). This data can also be downloaded from the [NOAA ObsPack](https://gml.noaa.gov/ccgg/obspack/data.php) pages (the ""obspack_multi-species_1_CCGGTowerInsitu_v1.0_2018-02-08"" option).

This is a printed example of the contents of one of the files:

```
Dimensions:               (obs: 204306, calendar_components: 6)
Dimensions without coordinates: obs, calendar_components
Data variables: (12/18)
    time                  (obs) datetime64[ns] ...
    time_decimal          (obs) float64 ...
    time_components       (obs, calendar_components) float64 ...
    solartime_components  (obs, calendar_components) float64 ...
    value                 (obs) float32 ...
    value_unc             (obs) float32 ...
    ...                    ...
    qcflag                (obs) object ...
    instrument            (obs) object ...
    obs_num               (obs) int32 ...
    obs_id                (obs) |S100 ...
    obspack_num           (obs) int32 ...
    obspack_id            (obs) |S200 ...
Attributes: (12/69)
    site_code:          BAO
    site_name:          Boulder Atmospheric Observator...
    site_country:       United States
    site_country_flag:  http://www.esrl.noaa.gov/gmd/w...
    site_latitude:      40.05
    site_longitude:     -105.004
    ...                 ...
    Conventions:        CF-1.6
    institution:        NOAA ESRL GMD CCGG
    source:             tower-insitu
    references:         see dataset_references
    title:              co2_bao_tower-insitu_1_ccgg_all
    history:            2018-02-08: File creation
```

### What was expected

I expected the data read using xarray to be written to file. The fix suggested above would not be suitable for me, as I require the data to be stored with its original dtype and not as object.
Output of `xr.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 3.8.8 (default, Apr 13 2021, 19:58:26) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-1160.45.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.0
libnetcdf: 4.7.4

xarray: 0.20.1
pandas: 1.3.4
numpy: 1.21.4
scipy: 1.7.3
netCDF4: 1.5.8
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.5.1.1
nc_time_axis: 1.4.0
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2021.11.2
distributed: None
matplotlib: 3.5.0
cartopy: None
seaborn: None
numbagg: None
fsspec: 2021.11.1
cupy: None
pint: None
sparse: None
setuptools: 49.2.1
pip: 20.2.3
conda: None
pytest: 6.2.5
IPython: 7.30.1
sphinx: None
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,905476569