html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/2995#issuecomment-658540125,https://api.github.com/repos/pydata/xarray/issues/2995,658540125,MDEyOklzc3VlQ29tbWVudDY1ODU0MDEyNQ==,1217238,2020-07-15T04:35:35Z,2020-07-15T04:35:35Z,MEMBER,"> That's because it falls back to the `'scipy'` engine. Would be nice to have a non-hacky way to write netcdf4 files to byte streams. 😃

I agree, this would be a welcome improvement! Currently `Dataset.to_netcdf()` without a `path` argument always uses the SciPy netCDF writer, which only supports netCDF3. This is mostly because support for bytestreams is a relatively new feature in netCDF4-Python and h5py.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,449706080
https://github.com/pydata/xarray/issues/2995#issuecomment-497063685,https://api.github.com/repos/pydata/xarray/issues/2995,497063685,MDEyOklzc3VlQ29tbWVudDQ5NzA2MzY4NQ==,10050469,2019-05-29T18:49:37Z,2019-05-29T18:49:37Z,MEMBER,"> This takes about a minute to open for me.

It took me much longer earlier this week when I tried :roll_eyes: Is the bottleneck in the parsing of the coordinates?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,449706080
https://github.com/pydata/xarray/issues/2995#issuecomment-497038453,https://api.github.com/repos/pydata/xarray/issues/2995,497038453,MDEyOklzc3VlQ29tbWVudDQ5NzAzODQ1Mw==,1197350,2019-05-29T17:42:45Z,2019-05-29T17:42:45Z,MEMBER,"Forget about zarr for a minute. Let's stick with the original goal of remote access to netcdf4 files in S3. You can use [s3fs](https://s3fs.readthedocs.io/en/latest/) (or [gcsfs](https://gcsfs.readthedocs.io/en/latest/)) for this.
```python
import xarray as xr
import s3fs

fs_s3 = s3fs.S3FileSystem(anon=True)
s3path = 'era5-pds/2008/01/data/air_temperature_at_2_metres.nc'
remote_file_obj = fs_s3.open(s3path, mode='rb')
ds = xr.open_dataset(remote_file_obj, engine='h5netcdf')
```
```
Dimensions:                      (lat: 640, lon: 1280, time0: 744)
Coordinates:
  * lon                          (lon) float32 0.0 0.2812494 ... 359.718
  * lat                          (lat) float32 89.784874 89.5062 ... -89.784874
  * time0                        (time0) datetime64[ns] 2008-01-01T07:00:00 ... 2008-02-01T06:00:00
Data variables:
    air_temperature_at_2_metres  (time0, lat, lon) float32 ...
Attributes:
    source:       Reanalysis
    institution:  ECMWF
    title:        ""ERA5 forecasts""
    history:      Wed Jul 4 22:08:50 2018: ncatted /data.e1/wrk/s3_out_in/20...
```
This takes about a minute to open for me. I have not tried writing, but this is perhaps a starting point. If you are unsatisfied by the performance of netcdf4 on cloud, I would indeed encourage you to investigate zarr.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,449706080