html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/2995#issuecomment-658540125,https://api.github.com/repos/pydata/xarray/issues/2995,658540125,MDEyOklzc3VlQ29tbWVudDY1ODU0MDEyNQ==,1217238,2020-07-15T04:35:35Z,2020-07-15T04:35:35Z,MEMBER,"> That's because it falls back to the `'scipy'` engine. Would be nice to have a non-hacky way to write netcdf4 files to byte streams. 😃

I agree, this would be a welcome improvement! Currently `Dataset.to_netcdf()` without a `path` argument always uses the SciPy netCDF writer, which only supports netCDF3. This is mostly because support for bytestreams is a relatively new feature in netCDF4-Python and h5py.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,449706080
https://github.com/pydata/xarray/issues/2995#issuecomment-497063685,https://api.github.com/repos/pydata/xarray/issues/2995,497063685,MDEyOklzc3VlQ29tbWVudDQ5NzA2MzY4NQ==,10050469,2019-05-29T18:49:37Z,2019-05-29T18:49:37Z,MEMBER,"> This takes about a minute to open for me.

It took me much longer earlier this week when I tried :roll_eyes: Is the bottleneck in the parsing of the coordinates?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,449706080
https://github.com/pydata/xarray/issues/2995#issuecomment-497038453,https://api.github.com/repos/pydata/xarray/issues/2995,497038453,MDEyOklzc3VlQ29tbWVudDQ5NzAzODQ1Mw==,1197350,2019-05-29T17:42:45Z,2019-05-29T17:42:45Z,MEMBER,"Forget about zarr for a minute. Let's stick with the original goal of remote access to netcdf4 files in S3. You can use [s3fs](https://s3fs.readthedocs.io/en/latest/) (or [gcsfs](https://gcsfs.readthedocs.io/en/latest/)) for this.
```python
import xarray as xr
import s3fs

fs_s3 = s3fs.S3FileSystem(anon=True)
s3path = 'era5-pds/2008/01/data/air_temperature_at_2_metres.nc'
remote_file_obj = fs_s3.open(s3path, mode='rb')
ds = xr.open_dataset(remote_file_obj, engine='h5netcdf')
```
```
Dimensions:                      (lat: 640, lon: 1280, time0: 744)
Coordinates:
  * lon                          (lon) float32 0.0 0.2812494 ... 359.718
  * lat                          (lat) float32 89.784874 89.5062 ... -89.784874
  * time0                        (time0) datetime64[ns] 2008-01-01T07:00:00 ... 2008-02-01T06:00:00
Data variables:
    air_temperature_at_2_metres  (time0, lat, lon) float32 ...
Attributes:
    source:       Reanalysis
    institution:  ECMWF
    title:        ""ERA5 forecasts""
    history:      Wed Jul 4 22:08:50 2018: ncatted /data.e1/wrk/s3_out_in/20...
```
This takes about a minute to open for me. I have not tried writing, but this is perhaps a starting point. If you are unsatisfied by the performance of netcdf4 on cloud, I would indeed encourage you to investigate zarr.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,449706080