id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
332471780,MDU6SXNzdWUzMzI0NzE3ODA=,2233,Problem opening unstructured grid ocean forecasts with 4D vertical coordinates,1872600,closed,0,,,15,2018-06-14T16:15:56Z,2023-07-19T18:25:35Z,2023-07-19T18:25:35Z,NONE,,,,"We can't open the IOOS New England triangular mesh ocean forecasts with Xarray because it doesn't understand their more complex CF vertical coordinate system.

```python
import xarray as xr

url = 'http://www.smast.umassd.edu:8080/thredds/dodsC/FVCOM/NECOFS/Forecasts/NECOFS_GOM3_FORECAST.nc'
xr.open_dataset(url)
```
fails with:
```
MissingDimensionsError: 'siglay' has more than 1-dimension and the same name as one of its dimensions ('siglay', 'node'). xarray disallows such variables because they conflict with the coordinates used to label dimensions.
```
If you open this dataset with `nc = netCDF4.Dataset(url)` you can see what the data variables (e.g. `temp`) and coordinates (e.g. `siglay`) look like:
```
print(nc['temp'])

float32 temp(time, siglay, node)
    long_name: temperature
    standard_name: sea_water_potential_temperature
    units: degrees_C
    coordinates: time siglay lat lon
    type: data
    coverage_content_type: modelResult
    mesh: fvcom_mesh
    location: node
unlimited dimensions: time
current shape = (145, 40, 53087)

print(nc['siglay'])

float32 siglay(siglay, node)
    long_name: Sigma Layers
    standard_name: ocean_sigma_coordinate
    positive: up
    valid_min: -1.0
    valid_max: 0.0
    formula_terms: sigma: siglay eta: zeta depth: h
unlimited dimensions:
current shape = (40, 53087)
```
So the `siglay` variable in this dataset specifies the fraction of the water column contained in each layer, and because this fraction changes over the grid, it has dimensions of `siglay` and `node`. The `siglay` variable is just one of the variables used in the calculation of this CF-compliant vertical coordinate: the actual vertical coordinate (after computation via `formula_terms`) ends up being 4D. While we understand that there is no way to represent this vertical coordinate with a one-dimensional coordinate that xarray would like, it would be nice if there were a way to at least load the variable array data like `temp` into xarray. We tried:
```
ds = xr.open_dataset(url, decode_times=False, decode_coords=False, decode_cf=False)
```
and we get the same error. Is there any workaround for this?
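A workaround that may be worth trying (a sketch, not verified against this server): since `open_dataset` accepts a `drop_variables` argument, skipping the offending coordinate variable should let the rest of the dataset load, and the sigma values can then be fetched separately with `netCDF4` and carried along as a plain array:

```python
import numpy as np
import netCDF4
import xarray as xr

url = ('http://www.smast.umassd.edu:8080/thredds/dodsC/FVCOM/NECOFS/'
       'Forecasts/NECOFS_GOM3_FORECAST.nc')

# Skip the 2D 'siglay' variable whose name collides with its own dimension
# (a 'siglev' variable, if present, may need the same treatment).
ds = xr.open_dataset(url, drop_variables=['siglay'])

# Fetch the sigma values separately and keep them as a plain numpy array,
# since xarray cannot hold a >1D variable named like one of its dimensions.
nc = netCDF4.Dataset(url)
siglay_vals = np.asarray(nc['siglay'][:])
```

The full traceback: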
```
---------------------------------------------------------------------------
MissingDimensionsError                    Traceback (most recent call last)
in ()
----> 1 xr.open_dataset(url)

~/miniconda3/envs/pangeo/lib/python3.6/site-packages/xarray/backends/api.py in open_dataset(filename_or_obj, group, decode_cf, mask_and_scale, decode_times, autoclose, concat_characters, decode_coords, engine, chunks, lock, cache, drop_variables, backend_kwargs)
    344 lock = _default_lock(filename_or_obj, engine)
    345 with close_on_error(store):
--> 346 return maybe_decode_store(store, lock)
    347 else:
    348 if engine is not None and engine != 'scipy':

~/miniconda3/envs/pangeo/lib/python3.6/site-packages/xarray/backends/api.py in maybe_decode_store(store, lock)
    256 store, mask_and_scale=mask_and_scale, decode_times=decode_times,
    257 concat_characters=concat_characters, decode_coords=decode_coords,
--> 258 drop_variables=drop_variables)
    259
    260 _protect_dataset_variables_inplace(ds, cache)

~/miniconda3/envs/pangeo/lib/python3.6/site-packages/xarray/conventions.py in decode_cf(obj, concat_characters, mask_and_scale, decode_times, decode_coords, drop_variables)
    428 vars, attrs, concat_characters, mask_and_scale, decode_times,
    429 decode_coords, drop_variables=drop_variables)
--> 430 ds = Dataset(vars, attrs=attrs)
    431 ds = ds.set_coords(coord_names.union(extra_coords).intersection(vars))
    432 ds._file_obj = file_obj

~/miniconda3/envs/pangeo/lib/python3.6/site-packages/xarray/core/dataset.py in __init__(self, data_vars, coords, attrs, compat)
    363 coords = {}
    364 if data_vars is not None or coords is not None:
--> 365 self._set_init_vars_and_dims(data_vars, coords, compat)
    366 if attrs is not None:
    367 self.attrs = attrs

~/miniconda3/envs/pangeo/lib/python3.6/site-packages/xarray/core/dataset.py in _set_init_vars_and_dims(self, data_vars, coords, compat)
    381
    382 variables, coord_names, dims = merge_data_and_coords(
--> 383 data_vars, coords, compat=compat)
    384
    385 self._variables = variables

~/miniconda3/envs/pangeo/lib/python3.6/site-packages/xarray/core/merge.py in merge_data_and_coords(data, coords, compat, join)
    363 indexes = dict(extract_indexes(coords))
    364 return merge_core(objs, compat, join, explicit_coords=explicit_coords,
--> 365 indexes=indexes)
    366
    367

~/miniconda3/envs/pangeo/lib/python3.6/site-packages/xarray/core/merge.py in merge_core(objs, compat, join, priority_arg, explicit_coords, indexes)
    433 coerced = coerce_pandas_values(objs)
    434 aligned = deep_align(coerced, join=join, copy=False, indexes=indexes)
--> 435 expanded = expand_variable_dicts(aligned)
    436
    437 coord_names, noncoord_names = determine_coords(coerced)

~/miniconda3/envs/pangeo/lib/python3.6/site-packages/xarray/core/merge.py in expand_variable_dicts(list_of_variable_dicts)
    209 var_dicts.append(coords)
    210
--> 211 var = as_variable(var, name=name)
    212 sanitized_vars[name] = var
    213

~/miniconda3/envs/pangeo/lib/python3.6/site-packages/xarray/core/variable.py in as_variable(obj, name)
    112 'dimensions %r. xarray disallows such variables because they '
    113 'conflict with the coordinates used to label '
--> 114 'dimensions.' % (name, obj.dims))
    115 obj = obj.to_index_variable()
    116

MissingDimensionsError: 'siglay' has more than 1-dimension and the same name as one of its dimensions ('siglay', 'node'). xarray disallows such variables because they conflict with the coordinates used to label dimensions.
```
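For completeness, once the pieces are loaded as above, the CF `ocean_sigma_coordinate` formula given by `formula_terms` (z = eta + sigma * (depth + eta), with sigma: `siglay`, eta: `zeta`, depth: `h`) can be evaluated with ordinary xarray broadcasting; a sketch building on the previous snippet:

```python
# Wrap the raw sigma values in an (unnamed) DataArray; reusing the
# dimension names is fine as long as the variable itself has no name.
sigma = xr.DataArray(siglay_vals, dims=('siglay', 'node'))

# Vertical coordinate varying over time, layer, and node.
z = ds['zeta'] + sigma * (ds['h'] + ds['zeta'])
```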
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2233/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 631085856,MDU6SXNzdWU2MzEwODU4NTY=,4122,Document writing netcdf from xarray directly to S3,1872600,open,0,,,24,2020-06-04T19:20:54Z,2023-03-03T18:12:01Z,,NONE,,,,"I'm trying to write a netcdf file directly from xarray to S3 object storage. I'm wondering: 1. Why writing NetCDF files requires a ""seek"" 2. Why the `scipy` engine is getting used instead of the specified `netcdf4` engine. 3. If there are nice workarounds (besides writing the NetCDF file locally, then using the AWS CLI to transfer to S3) #### Code sample: ```python import fsspec import xarray as xr ds = xr.open_dataset('http://geoport.usgs.esipfed.org/thredds/dodsC' '/silt/usgs/Projects/stellwagen/CF-1.6/BUZZ_BAY/2651-A.cdf') outfile = fsspec.open('s3://chs-pangeo-data-bucket/rsignell/test.nc', mode='wb', profile='default') with outfile as f: ds.to_netcdf(f, engine='netcdf4') ``` which produces: ```python-traceback --------------------------------------------------------------------------- OSError Traceback (most recent call last) in 2 mode='wb', profile='default') 3 with outfile as f: ----> 4 ds.to_netcdf(f, engine='netcdf4') /srv/conda/envs/pangeo/lib/python3.7/site-packages/xarray/core/dataset.py in to_netcdf(self, path, mode, format, group, engine, encoding, unlimited_dims, compute, invalid_netcdf) 1552 unlimited_dims=unlimited_dims, 1553 compute=compute, -> 1554 invalid_netcdf=invalid_netcdf, 1555 ) 1556 /srv/conda/envs/pangeo/lib/python3.7/site-packages/xarray/backends/api.py in to_netcdf(dataset, path_or_file, mode, format, group, engine, encoding, unlimited_dims, compute, multifile, invalid_netcdf) 1102 finally: 1103 if not multifile and compute: -> 1104 store.close() 1105 1106 if not compute: /srv/conda/envs/pangeo/lib/python3.7/site-packages/xarray/backends/scipy_.py in close(self) 221 222 def close(self): --> 223 self._manager.close() /srv/conda/envs/pangeo/lib/python3.7/site-packages/xarray/backends/file_manager.py in close(***failed resolving arguments***) 331 def close(self, needs_lock=True): 332 del needs_lock # ignored --> 333 self._value.close() /srv/conda/envs/pangeo/lib/python3.7/site-packages/scipy/io/netcdf.py in close(self) 297 if hasattr(self, 'fp') and not self.fp.closed: 298 try: --> 299 self.flush() 300 finally: 301 self.variables = OrderedDict() /srv/conda/envs/pangeo/lib/python3.7/site-packages/scipy/io/netcdf.py in flush(self) 407 """""" 408 if hasattr(self, 'mode') and self.mode in 'wa': --> 409 self._write() 410 sync = flush 411 /srv/conda/envs/pangeo/lib/python3.7/site-packages/scipy/io/netcdf.py in _write(self) 411 412 def _write(self): --> 413 self.fp.seek(0) 414 self.fp.write(b'CDF') 415 self.fp.write(array(self.version_byte, '>b').tostring()) /srv/conda/envs/pangeo/lib/python3.7/site-packages/fsspec/spec.py in seek(self, loc, whence) 1122 loc = int(loc) 1123 if not self.mode == ""rb"": -> 1124 raise OSError(""Seek only available in read mode"") 1125 if whence == 0: 1126 nloc = loc OSError: Seek only available in read mode ```
Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.6 | packaged by conda-forge | (default, Mar 23 2020, 23:03:20) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 4.14.138-114.102.amzn2.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: C.UTF-8
LANG: C.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.7.4

xarray: 0.15.1
pandas: 1.0.3
numpy: 1.18.1
scipy: 1.4.1
netCDF4: 1.5.3
pydap: installed
h5netcdf: 0.8.0
h5py: 2.10.0
Nio: None
zarr: 2.4.0
cftime: 1.1.1.2
nc_time_axis: 1.2.0
PseudoNetCDF: None
rasterio: 1.1.3
cfgrib: None
iris: 2.4.0
bottleneck: None
dask: 2.14.0
distributed: 2.14.0
matplotlib: 3.2.1
cartopy: 0.17.0
seaborn: None
numbagg: None
setuptools: 46.1.3.post20200325
pip: 20.1
conda: None
pytest: 5.4.1
IPython: 7.13.0
sphinx: None
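On question 2: when `to_netcdf` is handed a file-like object rather than a path, xarray falls back to the `scipy` backend, since the netCDF4 C library can only write to a filesystem path; that is why the traceback above runs through `scipy/io/netcdf.py` despite `engine='netcdf4'`. On question 3, another candidate (an approach to verify, using fsspec's `simplecache::` caching protocol) is to let fsspec stage the write in a local temporary file and upload it to S3 on close, giving the writer a real, seekable file:

```python
import fsspec
import xarray as xr

ds = xr.open_dataset('http://geoport.usgs.esipfed.org/thredds/dodsC'
                     '/silt/usgs/Projects/stellwagen/CF-1.6/BUZZ_BAY/2651-A.cdf')

# fsspec writes to a local temp file and uploads the finished file to S3
# when the context closes, so seeking works normally during the write.
with fsspec.open('simplecache::s3://chs-pangeo-data-bucket/rsignell/test.nc',
                 mode='wb') as f:
    ds.to_netcdf(f)
```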
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4122/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,reopened,13221727,issue 1157163377,I_kwDOAMm_X85E-Olx,6318,'numpy.datetime64' object has no attribute 'year' when writing to zarr or netcdf,1872600,open,0,,,3,2022-03-02T12:59:22Z,2022-03-16T04:23:25Z,,NONE,,,,"### What happened? I have a [reproducible notebook](https://nbviewer.org/gist/rsignell-usgs/029b39f0c428b07914f5a6b1129da572) where I've loaded a `referenceFileSystem` dataset into xarray and everything seems fine with time being understood correctly, but when I try to save a subset to zarr or netcdf, I get: ``` numpy.datetime64' object has no attribute 'year' ``` I don't understand this since it seems time is always a `datetime64` object in xarray, and I've never had this problem before. ### What did you expect to happen? Expected the file to be written as usual without error. ### Minimal Complete Verifiable Example ```Python https://nbviewer.org/gist/rsignell-usgs/029b39f0c428b07914f5a6b1129da572 ``` ### Relevant log output _No response_ ### Anything else we need to know? I asked the question first over at https://github.com/fsspec/kerchunk/issues/130 and @martindurant thought this looked like an xarray issue, not a kerchunk issue. ### Environment INSTALLED VERSIONS ------------------ commit: None python: 3.9.10 | packaged by conda-forge | (main, Feb 1 2022, 21:24:11) [GCC 9.4.0] python-bits: 64 OS: Linux OS-release: 4.12.14-150.17_5.0.85-cray_ari_c machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: None LOCALE: ('en_US', 'UTF-8') libhdf5: 1.12.1 libnetcdf: 4.8.1 xarray: 0.21.0 pandas: 1.4.0 numpy: 1.21.5 scipy: 1.7.3 netCDF4: 1.5.8 pydap: None h5netcdf: 0.13.1 h5py: 3.6.0 Nio: None zarr: 2.10.3 cftime: 1.5.2 nc_time_axis: None PseudoNetCDF: None rasterio: 1.2.10 cfgrib: 0.9.10.0 iris: None bottleneck: 1.3.2 dask: 2021.12.0 distributed: 2021.12.0 matplotlib: 3.5.1 cartopy: 0.20.2 seaborn: None numbagg: None fsspec: 2022.01.0 cupy: None pint: 0.18 sparse: 0.13.0 setuptools: 59.8.0 pip: 22.0.2 conda: 4.11.0 pytest: None IPython: 8.0.1 sphinx: 4.4.0 ​","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6318/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 485988536,MDU6SXNzdWU0ODU5ODg1MzY=,3269,Accessing COG overviews with read_rasterio,1872600,closed,0,,,3,2019-08-27T19:21:07Z,2021-07-30T07:09:14Z,2021-07-30T07:09:14Z,NONE,,,,"[It's considered best practice to create cloud-optimized geotiff (COG) with overviews](https://medium.com/@_VincentS_/do-you-really-want-people-using-your-data-ec94cd94dc3f), which are essentially copies of the dataset at different resolutions to allow fast representation at different scales. It would be nice if we could pick a specific overview using `xr.read_rasterio`, perhaps by just an additional parameter on the call, like `xr.read_rasterio(url, overview=512)` or something. 
Currently we need to use rasterio to find out what the overviews are, for example:
```python
import rasterio

url = 'https://esip-pangeo-uswest2.s3-us-west-2.amazonaws.com/sciencebase/Southern_California_Topobathy_DEM_1m_cog.tif'
src = rasterio.open(url, 'r')
[src.overviews(i) for i in src.indexes]
```
which results in:
```
[[4, 8, 16, 32, 64, 128, 256, 512, 1023]]
```
See the notebook here for the hack workaround to extract an overview from a COG: https://nbviewer.jupyter.org/gist/rsignell-usgs/dc4cf94fae97d085f6f2b9b896ec5336
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3269/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
345354038,MDU6SXNzdWUzNDUzNTQwMzg=,2323,znetcdf: h5netcdf analog for zarr? ,1872600,closed,0,,,4,2018-07-27T20:17:58Z,2020-07-28T06:25:33Z,2020-07-28T06:25:33Z,NONE,,,,"I've been making good use of the zarr backend via `xr.open_zarr`, but with the National Water Model datasets I've been working with lately, it would be really handy to be able to open them with `xr.open_mfdataset(filelist, engine='znetcdf')`, so that we could more easily create xarray datasets from the assemblage of overlapping forecast data files.

So just as we have [h5netcdf](https://github.com/shoyer/h5netcdf), a ""Pythonic interface to netCDF4 via h5py"", we could have ""znetcdf"", a ""Pythonic interface to netCDF4 via Zarr"".

I see that @shoyer previously had this idea (https://github.com/pydata/xarray/issues/1223#issuecomment-274230041) and @rabernat also thinks it would be a good idea (https://github.com/pydata/xarray/pull/1528#issuecomment-325226495), so I'm just piling on from a user perspective!
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2323/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
497823072,MDU6SXNzdWU0OTc4MjMwNzI=,3339,Version 0.13 broke my ufunc,1872600,closed,0,,,5,2019-09-24T17:25:09Z,2019-09-24T20:32:40Z,2019-09-24T19:56:17Z,NONE,,,,"This simple xarray ufunc computation of wind speed worked under `xarray=0.12.3`:
```python
import xarray as xr

url = 'http://thredds.ucar.edu/thredds/dodsC/grib/NCEP/HRRR/CONUS_2p5km/Best'
ds = xr.open_dataset(url, chunks={'time1': 1})

windspeed = xr.ufuncs.sqrt(ds['u-component_of_wind_height_above_ground']**2 + ds['v-component_of_wind_height_above_ground']**2)
```
but with `xarray=0.13.0`, `merge_variables` fails and dumps this traceback:
```python-traceback
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
in
----> 1 windspeed = xr.ufuncs.sqrt(ds['u-component_of_wind_height_above_ground']**2 + ds['v-component_of_wind_height_above_ground']**2)

/srv/conda/envs/notebook/lib/python3.7/site-packages/xarray/core/dataarray.py in func(self, other)
   2495 else f(other_variable, self.variable)
   2496 )
-> 2497 coords = self.coords._merge_raw(other_coords)
   2498 name = self._result_name(other)
   2499

/srv/conda/envs/notebook/lib/python3.7/site-packages/xarray/core/coordinates.py in _merge_raw(self, other)
    128 else:
    129 # don't align because we already called xarray.align
--> 130 variables = expand_and_merge_variables([self.variables, other.variables])
    131 return variables
    132

/srv/conda/envs/notebook/lib/python3.7/site-packages/xarray/core/merge.py in expand_and_merge_variables(objs, priority_arg)
    380 expanded = expand_variable_dicts(objs)
    381 priority_vars = _get_priority_vars(objs, priority_arg)
--> 382 variables = merge_variables(expanded, priority_vars)
    383 return variables
    384

/srv/conda/envs/notebook/lib/python3.7/site-packages/xarray/core/merge.py in merge_variables(list_of_variables_dicts, priority_vars, compat)
    202 else:
    203 try:
--> 204 merged[name] = unique_variable(name, var_list, compat)
    205 except MergeError:
    206 if compat != ""minimal"":

/srv/conda/envs/notebook/lib/python3.7/site-packages/xarray/core/merge.py in unique_variable(name, variables, compat, equals)
    116 out = out.compute()
    117 for var in variables[1:]:
--> 118 equals = getattr(out, compat)(var)
    119 if not equals:
    120 break

/srv/conda/envs/notebook/lib/python3.7/site-packages/xarray/core/variable.py in broadcast_equals(self, other, equiv)
   1574 except (ValueError, AttributeError):
   1575 return False
-> 1576 return self.equals(other, equiv=equiv)
   1577
   1578 def identical(self, other):

/srv/conda/envs/notebook/lib/python3.7/site-packages/xarray/core/variable.py in equals(self, other, equiv)
   1558 try:
   1559 return self.dims == other.dims and (
-> 1560 self._data is other._data or equiv(self.data, other.data)
   1561 )
   1562 except (TypeError, AttributeError):

/srv/conda/envs/notebook/lib/python3.7/site-packages/xarray/core/duck_array_ops.py in array_equiv(arr1, arr2)
    200 with warnings.catch_warnings():
    201 warnings.filterwarnings(""ignore"", ""In the future, 'NAT == x'"")
--> 202 flag_array = (arr1 == arr2) | (isnull(arr1) & isnull(arr2))
    203 return bool(flag_array.all())
    204

/srv/conda/envs/notebook/lib/python3.7/site-packages/dask/array/core.py in __eq__(self, other)
   1740
   1741 def __eq__(self, other):
-> 1742 return elemwise(operator.eq, self, other)
   1743
   1744 def __gt__(self, other):

/srv/conda/envs/notebook/lib/python3.7/site-packages/dask/array/core.py in elemwise(op, *args, **kwargs)
   3765 for a in args
   3766 ),
-> 3767 **blockwise_kwargs
   3768 )
   3769

/srv/conda/envs/notebook/lib/python3.7/site-packages/dask/array/blockwise.py in blockwise(func, out_ind, name, token, dtype, adjust_chunks, new_axes, align_arrays, concatenate, meta, *args, **kwargs)
    143
    144 if align_arrays:
--> 145 chunkss, arrays = unify_chunks(*args)
    146 else:
    147 arginds = [(a, i) for (a, i) in toolz.partition(2, args) if i is not None]

/srv/conda/envs/notebook/lib/python3.7/site-packages/dask/array/core.py in unify_chunks(*args, **kwargs)
   3034
   3035 arginds = [
-> 3036 (asanyarray(a) if ind is not None else a, ind) for a, ind in partition(2, args)
   3037 ]  # [x, ij, y, jk]
   3038 args = list(concat(arginds))  # [(x, ij), (y, jk)]

/srv/conda/envs/notebook/lib/python3.7/site-packages/dask/array/core.py in (.0)
   3034
   3035 arginds = [
-> 3036 (asanyarray(a) if ind is not None else a, ind) for a, ind in partition(2, args)
   3037 ]  # [x, ij, y, jk]
   3038 args = list(concat(arginds))  # [(x, ij), (y, jk)]

/srv/conda/envs/notebook/lib/python3.7/site-packages/dask/array/core.py in asanyarray(a)
   3609 elif hasattr(a, ""to_dask_array""):
   3610 return a.to_dask_array()
-> 3611 elif hasattr(a, ""data"") and type(a).__module__.startswith(""xarray.""):
   3612 return asanyarray(a.data)
   3613 elif isinstance(a, (list, tuple)) and any(isinstance(i, Array) for i in a):

ValueError: cannot include dtype 'M' in a buffer
```
Is this a bug or a feature that I should be handling differently in my code?

#### Output of ``xr.show_versions()``
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.3 | packaged by conda-forge | (default, Jul 1 2019, 21:52:21) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 4.14.138-114.102.amzn2.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.6.2

xarray: 0.13.0
pandas: 0.24.2
numpy: 1.17.2
scipy: 1.3.1
netCDF4: 1.5.1.2
pydap: installed
h5netcdf: 0.7.4
h5py: 2.10.0
Nio: None
zarr: 2.3.2
cftime: 1.0.3.4
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.0.28
cfgrib: None
iris: 2.2.0
bottleneck: None
dask: 2.2.0
distributed: 2.2.0
matplotlib: 3.1.1
cartopy: 0.17.0
seaborn: None
numbagg: None
setuptools: 41.2.0
pip: 19.2.3
conda: None
pytest: None
IPython: 7.8.0
sphinx: None
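A possible interim workaround (an untested sketch): the failure occurs while comparing the operands' non-index coordinates during the binary op, so dropping those coordinates before the arithmetic may sidestep it; numpy ufuncs dispatch directly on DataArrays, so `xr.ufuncs` is not needed either:

```python
import numpy as np
import xarray as xr

url = 'http://thredds.ucar.edu/thredds/dodsC/grib/NCEP/HRRR/CONUS_2p5km/Best'
ds = xr.open_dataset(url, chunks={'time1': 1})

# Drop non-index coordinates so the binary op has no auxiliary
# (datetime-valued) coordinates to merge and compare.
u = ds['u-component_of_wind_height_above_ground'].reset_coords(drop=True)
v = ds['v-component_of_wind_height_above_ground'].reset_coords(drop=True)

windspeed = np.sqrt(u**2 + v**2)
```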
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3339/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 314326128,MDU6SXNzdWUzMTQzMjYxMjg=,2057,Problem reading dtype=S64 with open_zarr ,1872600,closed,0,,,1,2018-04-14T12:42:52Z,2018-04-30T17:17:27Z,2018-04-30T17:17:27Z,NONE,,,,"@jhamman suggested I raise [this SO question](https://stackoverflow.com/questions/49756981/round-tripping-zarr-data-from-xarray/49831196#49831196) as an issue here. I have a dataset that looks like: ``` Dimensions: (nv: 2, reference_time: 746, time: 746, x: 4608, y: 3840) Coordinates: * reference_time (reference_time) datetime64[ns] 2018-03-07 ... * x (x) float64 -2.304e+06 -2.303e+06 -2.302e+06 ... * y (y) float64 -1.92e+06 -1.919e+06 -1.918e+06 ... * time (time) datetime64[ns] 2018-03-07T01:00:00 ... Dimensions without coordinates: nv Data variables: time_bounds (time, nv) datetime64[ns] dask.array ProjectionCoordinateSystem (time) |S64 b'' b'' b'' b'' b'' b'' b'' b'' ... T2D (time, y, x) float64 dask.array ``` When writing this dataset using `ds.to_zarr` containing the `ProjectionCoordinateSystem` variable with `dtype=S64` , there can be an issue reading it using `ds.open_zarr` with the default `auto_chunk=True`. This example illustrates the problem: ```python import xarray as xr import s3fs f_zarr = 'rsignell/nwm/test02' fs = s3fs.S3FileSystem(anon=False) d = s3fs.S3Map(f_zarr, s3=fs) xr.open_zarr(d) ``` returning ```python-traceback --------------------------------------------------------------------------- ValueError Traceback (most recent call last) in () 2 fs = s3fs.S3FileSystem(anon=False) 3 d = s3fs.S3Map(f_zarr, s3=fs) ----> 4 xr.open_zarr(d) /opt/conda/lib/python3.6/site-packages/xarray/backends/zarr.py in open_zarr(store, group, synchronizer, auto_chunk, decode_cf, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables) 476 477 variables = OrderedDict([(k, maybe_chunk(k, v)) --> 478 for k, v in ds.variables.items()]) 479 return ds._replace_vars_and_dims(variables) 480 else: /opt/conda/lib/python3.6/site-packages/xarray/backends/zarr.py in (.0) 476 477 variables = OrderedDict([(k, maybe_chunk(k, v)) --> 478 for k, v in ds.variables.items()]) 479 return ds._replace_vars_and_dims(variables) 480 else: /opt/conda/lib/python3.6/site-packages/xarray/backends/zarr.py in maybe_chunk(name, var) 471 token2 = tokenize(name, var._data) 472 name2 = 'zarr-%s' % token2 --> 473 return var.chunk(chunks, name=name2, lock=None) 474 else: 475 return var /opt/conda/lib/python3.6/site-packages/xarray/core/variable.py in chunk(self, chunks, name, lock) 820 data = indexing.ImplicitToExplicitIndexingAdapter( 821 data, indexing.OuterIndexer) --> 822 data = da.from_array(data, chunks, name=name, lock=lock) 823 824 return type(self)(self.dims, data, self._attrs, self._encoding, /opt/conda/lib/python3.6/site-packages/dask/array/core.py in from_array(x, chunks, name, lock, asarray, fancy, getitem) 1988 >>> a = da.from_array(x, chunks=(1000, 1000), lock=True) # doctest: +SKIP 1989 """""" -> 1990 chunks = normalize_chunks(chunks, x.shape) 1991 if name in (None, True): 1992 token = tokenize(x, chunks) /opt/conda/lib/python3.6/site-packages/dask/array/core.py in normalize_chunks(chunks, shape) 1918 raise ValueError( 1919 ""Chunks and shape must be of the same length/dimension. 
"" -> 1920 ""Got chunks=%s, shape=%s"" % (chunks, shape)) 1921 1922 if shape is not None: ValueError: Chunks and shape must be of the same length/dimension. Got chunks=(3, 64), shape=(3,) ``` The full notebook is at https://gist.github.com/rsignell-usgs/dce09aae4f7cd174a141247a56ddea2c","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2057/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 105688738,MDU6SXNzdWUxMDU2ODg3Mzg=,567,Best way to find data variables by standard_name,1872600,closed,0,,,6,2015-09-09T21:32:02Z,2016-08-03T17:53:42Z,2016-08-03T17:53:42Z,NONE,,,,"Is there a way to return the data variables that match a specified `standard_name`? I came up with this, but maybe the functionality already exists or there is a better way. ``` def get_std_name_vars(ds,std_name): return {k: v for k, v in ds.data_vars.iteritems() if 'standard_name' in v.attrs.keys() and std_name in v.standard_name} ``` as in this example: http://nbviewer.ipython.org/gist/rsignell-usgs/5b263906e92ce47bf05e ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/567/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 95222803,MDU6SXNzdWU5NTIyMjgwMw==,476,to_netcdf failing for datasets with a single time value,1872600,closed,0,,,2,2015-07-15T15:26:02Z,2015-07-16T02:17:31Z,2015-07-16T02:17:31Z,NONE,,,,"In this notebook: http://nbviewer.ipython.org/gist/rsignell-usgs/047235496029529585cc, the `ds.to_netcdf` method in cell [12] is failing for this dataset with a single time value: ``` /home/usgs/miniconda/envs/ioos/lib/python2.7/site-packages/xray/conventions.pyc in infer_datetime_units(dates) 185 unique_timedeltas = np.unique(np.diff(dates[pd.notnull(dates)])) 186 units = _infer_time_units_from_diff(unique_timedeltas) --> 187 return '%s since %s' % (units, pd.Timestamp(dates[0])) 188 189 IndexError: too many indices for array ``` When there is a single value, I guess `dates` must not be an array. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/476/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue