issues: 1266738659

id node_id number title user state locked assignee milestone comments created_at updated_at closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1266738659 I_kwDOAMm_X85LgOXj 6684 pass `**kwargs` through from `save_mfdataset` to `to_netcdf` 8796694 closed 0     2 2022-06-09T22:37:43Z 2022-06-11T18:26:47Z 2022-06-11T18:26:47Z CONTRIBUTOR      

Is your feature request related to a problem?

Based on the documentation of xarray.save_mfdataset, I would expect that arguments that can be passed to xarray.Dataset.to_netcdf() can also be passed to xarray.save_mfdataset:

> When not using dask, it is no different than calling to_netcdf repeatedly.

But it appears that the unlimited_dims and encoding arguments available in to_netcdf are not also available in save_mfdataset:

test_save_mfdataset_encoding_opt.py:

```python
import xarray as xr

# create a timeseries to store in a netCDF file
times = list(range(0,3652))
time = xr.DataArray(times, dims = ("time",))

# create a simple dataset to write using save_mfdataset
test_ds = xr.Dataset()
test_ds['time'] = time

# tell netCDF to write the times as doubles
encoding = dict(time = dict(dtype = "double"))

# set the output file name
output_path = "test.nc"

# the test fails when encoding is added as an argument to save_mfdataset
# but it works if instead the dataset is saved using
# test_ds.to_netcdf(output_path, encoding = encoding)

xr.save_mfdataset([test_ds], [output_path], encoding = encoding)
```

```bash
$ python3 test_save_mfdataset_encoding_opt.py
Traceback (most recent call last):
  File "test_save_mfdataset_encoding_opt.py", line 21, in <module>
    xr.save_mfdataset([test_ds], [output_path], encoding = encoding)
TypeError: save_mfdataset() got an unexpected keyword argument 'encoding'
```

This appears to be because save_mfdataset does not accept the encoding argument, nor does it accept and pass along **kwargs.

This means that datasets written with save_mfdataset are less flexible than those written with to_netcdf.
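The failure mode and the proposed fix can be illustrated without xarray. The following is a minimal sketch using stand-in functions: `write_one`, `write_many_strict`, and `write_many_forwarding` are hypothetical names invented for this illustration, not xarray APIs. It only shows how a wrapper with a fixed signature rejects unknown keywords, while a wrapper that accepts `**kwargs` hands them on to the inner call.

```python
def write_one(data, path, mode="w", encoding=None, unlimited_dims=None):
    """Stand-in for a single-file writer; returns a record of what it was asked to do."""
    return {
        "path": path,
        "mode": mode,
        "encoding": encoding,
        "unlimited_dims": unlimited_dims,
    }


def write_many_strict(items, paths, mode="w"):
    """Fixed signature: extra keywords such as `encoding` are rejected by Python."""
    return [write_one(item, path, mode) for item, path in zip(items, paths)]


def write_many_forwarding(items, paths, mode="w", **kwargs):
    """Accepts extra keywords and forwards them unchanged to write_one."""
    return [write_one(item, path, mode, **kwargs) for item, path in zip(items, paths)]


encoding = {"time": {"dtype": "double"}}

# The strict wrapper fails before any writing happens.
try:
    write_many_strict([[0, 1, 2]], ["test.nc"], encoding=encoding)
except TypeError as err:
    strict_error = str(err)  # the message names the unexpected keyword

# The forwarding wrapper passes `encoding` through to the inner writer.
results = write_many_forwarding([[0, 1, 2]], ["test.nc"], encoding=encoding)
```

One trade-off of this pattern is that a misspelled keyword is only reported by the inner function, so the error message refers to the inner function's name rather than the wrapper's.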

Describe the solution you'd like

A simple fix, which I have verified, is to modify save_mfdataset to accept and pass along **kwargs:

```diff
diff --git a/xarray/backends/api.py b/xarray/backends/api.py
index d1166624..8baca58c 100644
--- a/xarray/backends/api.py
+++ b/xarray/backends/api.py
@@ -1258,7 +1258,7 @@ def dump_to_store(

 def save_mfdataset(
-    datasets, paths, mode="w", format=None, groups=None, engine=None, compute=True
+    datasets, paths, mode="w", format=None, groups=None, engine=None, compute=True, **kwargs
 ):
     """Write multiple datasets to disk as netCDF files simultaneously.

@@ -1280,6 +1280,7 @@ def save_mfdataset(
         these locations will be overwritten.
     format : {"NETCDF4", "NETCDF4_CLASSIC", "NETCDF3_64BIT", \
               "NETCDF3_CLASSIC"}, optional
+    **kwargs : additional arguments are passed along to to_netcdf

         File format for the resulting netCDF file:

@@ -1358,7 +1359,7 @@ def save_mfdataset(
     writers, stores = zip(
         *[
             to_netcdf(
-                ds, path, mode, format, group, engine, compute=compute, multifile=True
+                ds, path, mode, format, group, engine, compute=compute, multifile=True, **kwargs
             )
             for ds, path, group in zip(datasets, paths, groups)
         ]
```

With a version of xarray in which xarray/backends/api.py is patched as above, the test file shown earlier runs as expected, with the encoding passed along:

```bash
$ python3 test_save_mfdataset_encoding_opt.py
$ ncdump -h test.nc
netcdf test {
dimensions:
        time = 3652 ;
variables:
        double time(time) ;
                time:_FillValue = NaN ;
}
```

Describe alternatives you've considered

I attempted to set the encoding dictionary directly on the dataset prior to calling save_mfdataset, but that didn't seem to have an effect.
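Another possible workaround follows from the documentation's statement that, without dask, `save_mfdataset` is no different from calling `to_netcdf` repeatedly: loop over the datasets directly. This is a sketch, not part of xarray. `save_each` is a hypothetical helper name, and it assumes eager (non-dask) writes where each object only needs a `to_netcdf(path, **kwargs)` method.

```python
def save_each(datasets, paths, **to_netcdf_kwargs):
    """Write each dataset to its path, forwarding keywords to its to_netcdf method.

    Hypothetical helper for the non-dask case; it does not reproduce the
    delayed or parallel writing that save_mfdataset offers with dask.
    """
    datasets = list(datasets)
    paths = list(paths)
    if len(datasets) != len(paths):
        raise ValueError("datasets and paths must have the same length")
    for ds, path in zip(datasets, paths):
        # Keywords such as encoding or unlimited_dims reach to_netcdf unchanged.
        ds.to_netcdf(path, **to_netcdf_kwargs)
```

With the test script above, `save_each([test_ds], [output_path], encoding=encoding)` would amount to the `test_ds.to_netcdf(output_path, encoding = encoding)` call that the script's comments report as working.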

Additional context

Here is version information, in case it is relevant:

```bash
$ python3 -c 'import xarray; print(xarray.show_versions())'

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.4 (default, Aug 13 2019, 15:17:50)
[Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 21.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.1

xarray: 0.15.0
pandas: 0.25.1
numpy: 1.17.2
scipy: 1.6.3
netCDF4: 1.4.2
pydap: installed
h5netcdf: None
h5py: 2.9.0
Nio: None
zarr: None
cftime: 1.1.1.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 2.5.2
distributed: 2.5.2
matplotlib: 3.1.3
cartopy: None
seaborn: 0.9.0
numbagg: None
setuptools: 41.4.0
pip: 19.2.3
conda: 4.8.3
pytest: 5.2.1
IPython: 7.8.0
sphinx: 2.2.0
None
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6684/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed 13221727 issue
