html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/2368#issuecomment-1382671908,https://api.github.com/repos/pydata/xarray/issues/2368,1382671908,IC_kwDOAMm_X85SaeYk,35968931,2023-01-14T06:10:39Z,2023-01-14T06:10:39Z,MEMBER,"@ronygolderku thanks for your example. Looks like it fails for the [same reason as was mentioned](https://github.com/pydata/xarray/issues/2368#issuecomment-1006639506) for some of the other examples above.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,350899839
https://github.com/pydata/xarray/issues/2368#issuecomment-1320994455,https://api.github.com/repos/pydata/xarray/issues/2368,1320994455,IC_kwDOAMm_X85OvMaX,13301940,2022-11-19T23:53:57Z,2022-11-19T23:54:43Z,MEMBER,"@maxaragon, i'm curious. what version of xarray/netcdf4 are you using? i'm asking because this appears to be working fine on my end
```python
In [1]: import xarray as xr
In [2]: ds = xr.open_dataset(""20200825_hyytiala_icon-iglo-12-23.nc"")
In [3]: ds
Out[3]:
Dimensions: (time: 25, level: 90, flux_level: 91,
frequency: 2, soil_level: 9)
Coordinates:
* time (time) datetime64[ns] 2020-08-25 ... 2020-0...
* level (level) float32 90.0 89.0 88.0 ... 3.0 2.0 1.0
* flux_level (flux_level) float32 91.0 90.0 ... 2.0 1.0
* frequency (frequency) float32 34.96 94.0
Dimensions without coordinates: soil_level
Data variables: (12/62)
latitude float32 ...
longitude float32 ...
altitude float32 ...
horizontal_resolution float32 ...
forecast_time (time) timedelta64[ns] ...
height (time, level) float32 ...
... ...
gas_atten (frequency, time, level) float32 ...
specific_gas_atten (frequency, time, level) float32 ...
specific_saturated_gas_atten (frequency, time, level) float32 ...
specific_dry_gas_atten (frequency, time, level) float32 ...
K2 (frequency, time, level) float32 ...
specific_liquid_atten (frequency, time, level) float32 ...
Attributes: (12/13)
institution: Max Planck Institute for Meteorology/Deutscher Wette...
references: see MPIM/DWD publications
source: svn://xceh.dwd.de/for0adm/SVN_icon/tags/icon-2.6.0-n...
Conventions: CF-1.7
location: hyytiala
file_uuid: ace15f8ba477497c8d1dd0833b5ac674
... ...
year: 2020
month: 08
day: 25
history: 2021-01-25 08:24:29 - File content harmonized by the...
title: Model file from Hyytiala
pid: https://hdl.handle.net/21.12132/1.ace15f8ba477497c
```
here are the versions i'm using
```python
In [4]: xr.show_versions()
/Users/andersy005/mambaforge/envs/playground/lib/python3.10/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
warnings.warn(""Setuptools is replacing distutils."")
INSTALLED VERSIONS
------------------
commit: None
python: 3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:41:22) [Clang 13.0.1 ]
python-bits: 64
OS: Darwin
OS-release: 22.1.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.8.1
xarray: 2022.10.0
pandas: 1.5.1
numpy: 1.23.4
scipy: 1.9.3
netCDF4: 1.6.1
pydap: installed
h5netcdf: 1.0.2
h5py: 3.7.0
Nio: None
zarr: 2.13.3
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2022.10.2
distributed: 2022.10.2
matplotlib: 3.6.1
cartopy: None
seaborn: 0.12.0
numbagg: None
fsspec: 2022.10.0
cupy: None
pint: 0.20.1
sparse: None
flox: None
numpy_groupies: None
setuptools: 65.5.0
pip: 22.3
conda: None
pytest: None
IPython: 8.6.0
sphinx: None
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,350899839
https://github.com/pydata/xarray/issues/2368#issuecomment-1006639506,https://api.github.com/repos/pydata/xarray/issues/2368,1006639506,IC_kwDOAMm_X848ABmS,4160723,2022-01-06T14:36:12Z,2022-01-06T14:36:12Z,MEMBER,"@TomNicholas yes with the explicit index refactor we should be able to relax the 1D coordinate / dimension matching name constraint in the Xarray data model.
> I'm sure there are some cases internally where we currently rely on this assumption, but it should be relatively easy to relax.
I also initially thought it would be easy to relax, but I'm not so sure anymore. I don't think it is a hard task, but it might still require a fair amount of work. I've already refactored a bunch of such internal cases in #5692, but there's a good chance that some (not sure how many) cases will still need a fix.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,350899839
https://github.com/pydata/xarray/issues/2368#issuecomment-1005916151,https://api.github.com/repos/pydata/xarray/issues/2368,1005916151,IC_kwDOAMm_X8479Q_3,35968931,2022-01-05T17:14:35Z,2022-01-05T17:14:35Z,MEMBER,"> Currently, xarray requires that variables with a name matching a dimension are 1D variables along that dimension, e.g.,
```python
for dim in dataset.dims:
    if dim in dataset.variables:
        assert dataset.variables[dim].dims == (dim,)
```
> I agree that this unnecessarily complicates our data model. There's no particular advantage to this invariant, besides removing the need to check the dimensions of variables used for indexing lookups. I'm sure there are some cases internally where we currently rely on this assumption, but it should be relatively easy to relax.
> It seems like this relaxation is compatible with the refactoring of indexes.
@benbovy will the explicit indexes refactor fix this case?
> This is mentioned elsewhere (can't find the issue right now) and may be out of scope for this issue but I'm going to say it anyway: opening a NetCDF file with groups was not as easy as I wanted it to be when first starting out with xarray.
@djhoese For anything to do with opening netCDF files with groups see #4118 and the linked issues from there.
If people have examples of other weird cases involving groups (like groups within themselves or anything like that) then I would be interested to have those files to test with!
","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,350899839
https://github.com/pydata/xarray/issues/2368#issuecomment-785334802,https://api.github.com/repos/pydata/xarray/issues/2368,785334802,MDEyOklzc3VlQ29tbWVudDc4NTMzNDgwMg==,2448579,2021-02-24T19:58:16Z,2021-02-24T19:58:16Z,MEMBER,"Clearly we can detect this failure, so shall we rename the `date` dimension to `date_` in this example? We can raise a warning saying round-tripping will not work for such datasets","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,350899839
https://github.com/pydata/xarray/issues/2368#issuecomment-443305634,https://api.github.com/repos/pydata/xarray/issues/2368,443305634,MDEyOklzc3VlQ29tbWVudDQ0MzMwNTYzNA==,1197350,2018-11-30T19:03:07Z,2018-11-30T19:03:07Z,MEMBER,"We are working on fixing this in #2405. That PR (mine) has most of the basic functionality there, but it still needs more testing. Unfortunately, I don't have bandwidth right now to complete the required work.
If anyone here needs this fixed urgently and actually has time to work on it, I encourage you to pick up that PR and try to finish it off. We will be happy to provide help and support along the way.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,350899839
https://github.com/pydata/xarray/issues/2368#issuecomment-419212304,https://api.github.com/repos/pydata/xarray/issues/2368,419212304,MDEyOklzc3VlQ29tbWVudDQxOTIxMjMwNA==,1217238,2018-09-06T19:24:05Z,2018-09-06T19:24:05Z,MEMBER,"> Or no index at all?
This would be my inclination (for the default behavior). It would mean that you could no longer count on always being able to do labeled indexing along each dimension, but in the broader scheme of things I don't think that's a big deal.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,350899839
https://github.com/pydata/xarray/issues/2368#issuecomment-419207959,https://api.github.com/repos/pydata/xarray/issues/2368,419207959,MDEyOklzc3VlQ29tbWVudDQxOTIwNzk1OQ==,1197350,2018-09-06T19:08:06Z,2018-09-06T19:08:06Z,MEMBER,"It seems like this relaxation is compatible with the refactoring of indexes. Right now, we automatically create 1D indexes for all coordinate variables. The problem with 2D dimensions is that such indexes don't make sense:
```
data.sel(y=3.14)
```
But maybe we could turn multi-dimensional coordinate variables into multi-indexes? Or no index at all? In any case, we could still do
```
data.isel(y=3)
```
i.e. logical indexing on the dimension axis.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,350899839
https://github.com/pydata/xarray/issues/2368#issuecomment-419202871,https://api.github.com/repos/pydata/xarray/issues/2368,419202871,MDEyOklzc3VlQ29tbWVudDQxOTIwMjg3MQ==,1217238,2018-09-06T18:51:11Z,2018-09-06T18:51:11Z,MEMBER,"Currently, xarray requires that variables with a name matching a dimension are 1D variables along that dimension, e.g.,
```python
for dim in dataset.dims:
    if dim in dataset.variables:
        assert dataset.variables[dim].dims == (dim,)
```
I agree that this unnecessarily complicates our data model. There's no particular advantage to this invariant, besides removing the need to check the dimensions of variables used for indexing lookups. I'm sure there are some cases internally where we currently rely on this assumption, but it should be relatively easy to relax.","{""total_count"": 4, ""+1"": 4, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,350899839
https://github.com/pydata/xarray/issues/2368#issuecomment-419188538,https://api.github.com/repos/pydata/xarray/issues/2368,419188538,MDEyOklzc3VlQ29tbWVudDQxOTE4ODUzOA==,1197350,2018-09-06T18:05:00Z,2018-09-06T18:05:00Z,MEMBER,"Perhaps part of the confusion is simply that `y` has different meanings in different contexts. When used as a dimension (e.g. to ""define the array shape of a Variable"" in CDM terms), it is indeed 1D. When used as a variable (or ""CoordinateAxis""), it is 2D. xarray doesn't have a separate namespace for dimensions and variables.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,350899839
https://github.com/pydata/xarray/issues/2368#issuecomment-419187692,https://api.github.com/repos/pydata/xarray/issues/2368,419187692,MDEyOklzc3VlQ29tbWVudDQxOTE4NzY5Mg==,1197350,2018-09-06T18:02:19Z,2018-09-06T18:02:19Z,MEMBER,"@dopplershift - thanks for the clarifications! I agree that it's good for netCDF to be as open-ended as possible.
So I guess my quarrel is with the [CDM](https://www.unidata.ucar.edu/software/thredds/current/netcdf-java/CDM/). This is what it says about variables and dimensions:
> A Variable is a container for data. It has a DataType, a set of Dimensions that define its array shape, and optionally a set of Attributes. Any shared Dimension it uses must be in the same Group or a parent Group.
>
> A Dimension is used to define the array shape of a Variable. It may be shared among Variables, which provides a simple yet powerful way of associating Variables. When a Dimension is shared, it has a unique name within the Group. If unlimited, a Dimension's length may increase. If variableLength, then the actual length is data dependent, and can only be found by reading the data. A variableLength Dimension cannot be shared or unlimited.
then later
> A Variable can have zero or more Coordinate Systems containing one or more CoordinateAxis. A CoordinateAxis can only be part of a Variable's CoordinateSystem if the CoordinateAxis' set of Dimensions is a subset of the Variable's set of Dimensions. This ensures that every data point in the Variable has a corresponding coordinate value for each of the CoordinateAxis in the CoordinateSystem.
>
> A Coordinate System has one or more CoordinateAxis, and zero or more CoordinateTransforms.
>
> A CoordinateAxis is a subtype of Variable, and is optionally classified according to the types in AxisType.
>
> These are the rules which restrict which Variables can be used as Coordinate Axes:
>
> Shared Dimensions: All dimensions used by a Coordinate Axis must be shared with the data variable. When a variable is part of a Structure, the dimensions used by the parent Structure(s) are considered to be part of the nested Variable.
I have a very hard time understanding what all of this means. Can the same variable be a ""Dimension"" and a ""CoordinateAxis"" in CDM?
It seems much simpler to me to use the CF approach to describe the physical coordinates of the data using ""auxiliary coordinate variables"" and to keep the dimensions as purely 1D ""coordinate variables"".
> IMO, xarray is being overly pedantic here.
What would you like xarray to do with these datasets, given the fact that orthogonality of dimensions is central to its data model?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,350899839
https://github.com/pydata/xarray/issues/2368#issuecomment-419160841,https://api.github.com/repos/pydata/xarray/issues/2368,419160841,MDEyOklzc3VlQ29tbWVudDQxOTE2MDg0MQ==,1197350,2018-09-06T16:37:51Z,2018-09-06T16:37:51Z,MEMBER,"@djhoese - it would be great if you could track down a more specific example of the issue you are referring to.
Excluding this possible problem with groups, my assessment of the feedback above is that, actually, the only problem is #2233: we can't have multidimensional variables that are also their own dimensions. This is a good thing. It means we have a specific problem to fix.
Right now this is ok:
```
dimensions:
    x = 4;
    y = 3;
variables:
    int x(x);
    int y(y);
    float data(y, x);
```
But this is not
```
dimensions:
    x = 4;
    y = 3;
variables:
    int x(x);
    float y(y, x);
    float data(y, x);
```
Personally I find this to be an incredibly confusing, recursive use of the concept of ""dimensions"". For me, dimensions should be orthogonal. In the second example, `y` is a [non-dimension] coordinate, not a dimension! The actual dimension is implicit, some sort of logical `y_index`. I wish that CF / netCDF had never chosen to accept this as a valid schema. But I admit that perhaps my internal mental model is too wrapped up with xarray!
So the question is: what can we do about it?
I propose the following general outline:
- Create a new decoding function to effectively ""fix"" the recursively defined dimension by renaming `y(y, x)` into something like `y_coordinate(y, x)`
- Add a new option to `open_dataset` called `decode_recursive_dimension` which defaults to `False`
- Raise a more informative error when these types of datasets are encountered which suggests calling `open_dataset` with `decode_recursive_dimension=True`
Finally, we might want to raise this upstream with netCDF or CF conventions to try to understand better why this sort of schema is being encouraged.
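As a rough illustration of the first bullet above, here is a minimal sketch of the renaming step. It works on plain dictionaries rather than on xarray or netCDF objects, and the function names, the schema layout, and the `_coordinate` suffix are all assumptions made for this example, not existing xarray API. The `decode_recursive_dimension` option is only a proposal in this comment.
```python
def find_recursive_dimensions(dimensions, variables):
    # Names that are both a dimension and a variable,
    # but where the variable is not 1D along that dimension.
    return [
        name
        for name, dims in variables.items()
        if name in dimensions and tuple(dims) != (name,)
    ]


def rename_recursive_dimensions(dimensions, variables, suffix='_coordinate'):
    # Return a new variable mapping in which every offending
    # variable is renamed; the dimension names are left untouched.
    offending = set(find_recursive_dimensions(dimensions, variables))
    renamed = {}
    for name, dims in variables.items():
        new_name = name + suffix if name in offending else name
        if new_name != name and new_name in variables:
            raise ValueError(
                f'cannot rename {name!r} to {new_name!r}: name already in use'
            )
        renamed[new_name] = tuple(dims)
    return renamed


# The second schema from this comment, written as plain dictionaries.
dimensions = {'x': 4, 'y': 3}
variables = {'x': ('x',), 'y': ('y', 'x'), 'data': ('y', 'x')}

fixed = rename_recursive_dimensions(dimensions, variables)
print(fixed)
```
After the rename, `y` survives only as a dimension name and the 2D variable is stored under `y_coordinate`, so the 1D invariant holds again for every remaining name.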
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,350899839
https://github.com/pydata/xarray/issues/2368#issuecomment-413545600,https://api.github.com/repos/pydata/xarray/issues/2368,413545600,MDEyOklzc3VlQ29tbWVudDQxMzU0NTYwMA==,10050469,2018-08-16T13:29:33Z,2018-08-16T13:29:33Z,MEMBER,The two examples by @dopplershift are the same problem as in https://github.com/pydata/xarray/issues/2233,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,350899839