id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type 690624634,MDU6SXNzdWU2OTA2MjQ2MzQ=,4401,problem with time axis values in line plot,15570875,closed,0,,,4,2020-09-02T01:15:14Z,2021-10-23T16:27:32Z,2021-08-10T22:45:20Z,NONE,,,,"When I run the following code inside a jupyter notebook, the values on the x axis (`time`) in the generated plot, inserted after the code, appear to run from 0 to ~4. I expect them to run from 1 to 4, like the `time` values do. I can't tell if this is a problem with what `xarray` passes to `nc_time_axis`, or if it's a problem with `nc_time_axis` itself. Could this be looked into please? ``` import cftime import xarray as xr time_vals = [cftime.DatetimeNoLeap(1+year, 1+month, 15) for year in range(3) for month in range(12)] x_vals = [time_val.year + time_val.dayofyr / 365.0 for time_val in time_vals] x_da = xr.DataArray(x_vals, coords=[time_vals], dims=[""time""]) x_da.plot.line(""-o""); ``` ![image](https://user-images.githubusercontent.com/15570875/91920712-1ed16180-ec87-11ea-97d4-c46e38bf7698.png) **Environment**:
Output of xr.show_versions() INSTALLED VERSIONS ------------------ commit: None python: 3.7.8 | packaged by conda-forge | (default, Jul 31 2020, 02:25:08) [GCC 7.5.0] python-bits: 64 OS: Linux OS-release: 3.10.0-1127.13.1.el7.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: en_US.UTF-8 LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.10.6 libnetcdf: 4.7.4 xarray: 0.16.0 pandas: 1.1.1 numpy: 1.19.1 scipy: 1.5.2 netCDF4: 1.5.4 pydap: None h5netcdf: None h5py: None Nio: None zarr: 2.4.0 cftime: 1.2.1 nc_time_axis: 1.2.0 PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.3.2 dask: 2.14.0 distributed: 2.14.0 matplotlib: 3.3.1 cartopy: 0.18.0 seaborn: 0.10.1 numbagg: None pint: 0.15 setuptools: 49.6.0.post20200814 pip: 20.2.2 conda: None pytest: 6.0.1 IPython: 7.17.0 sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4401/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 535043825,MDU6SXNzdWU1MzUwNDM4MjU=,3606,confused by reference to dataset in docs for xarray.DataArray.copy,15570875,closed,0,,,2,2019-12-09T16:30:23Z,2020-07-24T19:20:45Z,2020-07-24T19:20:45Z,NONE,,,,"The documentation for [xarray.DataArray.copy](https://xarray.pydata.org/en/stable/generated/xarray.DataArray.copy.html) > If `deep=True`, a deep copy is made of the data array. Otherwise, a shallow copy is made, so each variable in the new array’s dataset is also a variable in this array’s dataset. I do not understand what dataset is being referred to here. In particular, there are no xarray datasets in the examples provided in this documentation. Could someone provide clarification?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3606/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 554376164,MDU6SXNzdWU1NTQzNzYxNjQ=,3718,losing shallowness of ds.copy() on ds from xr.open_dataset,15570875,open,0,,,1,2020-01-23T20:02:55Z,2020-01-23T21:39:05Z,,NONE,,,,"#### MCVE Code Sample ``` import numpy as np import xarray as xr xlen = 4 x = xr.DataArray(np.linspace(0.0, 1.0, xlen), dims=('x')) varname = 'foo' xr.Dataset({varname: xr.DataArray(np.arange(xlen, dtype='float64'), dims=('x'), coords={'x': x})}).to_netcdf('ds.nc') with xr.open_dataset('ds.nc') as ds: ds2 = ds.copy() ds2[varname][0] = 11.0 print(f'ds.equals = {ds.equals(ds2)}') with xr.open_dataset('ds.nc') as ds: ds2 = ds.copy() print(f'ds.equals = {ds.equals(ds2)}') ds2[varname][0] = 11.0 print(f'ds.equals = {ds.equals(ds2)}') ``` #### Expected Output I expect the code to write out ``` ds.equals = True ds.equals = True ds.equals = True ``` However, 
when I run it, the last line is ``` ds.equals = False ``` #### Problem Description The code above writes a small `xr.Dataset` to a netCDF file. There are 2 context managers opening the netCDF file as `ds`. Both context manager blocks start by setting `ds2` to a shallow copy of `ds`. In the first context manager block, a value in `ds2` is modified, and `ds2` is compared to `ds`. The Datasets are still equal, confirming that the copy is shallow. The second context manager block is the same as the first, except that `ds2` is compared to `ds` prior to changing the value in `ds2`. When this is done, the Datasets are no longer equal, indicating that `ds2` is no longer a shallow copy of `ds`. I don't understand how evaluating `ds.equals(ds2)`, prior to changing a value in `ds2`, could decouple `ds2` from `ds`. I only observe this behavior when `ds` is set via `xr.open_dataset`. I don't see it when I create `ds` directly using `xr.Dataset`. I'm rather perplexed by this. #### Output of ``xr.show_versions()``
INSTALLED VERSIONS ------------------ commit: None python: 3.7.6 | packaged by conda-forge | (default, Jan 7 2020, 22:33:48) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: 3.10.0-693.21.1.el7.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: en_US.UTF-8 LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.10.5 libnetcdf: 4.7.3 xarray: 0.14.1 pandas: 0.25.3 numpy: 1.17.5 scipy: 1.4.1 netCDF4: 1.5.3 pydap: None h5netcdf: None h5py: None Nio: None zarr: None cftime: 1.0.3.4 nc_time_axis: 1.2.0 PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.3.1 dask: 2.9.2 distributed: 2.9.3 matplotlib: 3.1.2 cartopy: None seaborn: None numbagg: None setuptools: 45.1.0.post20200119 pip: 19.3.1 conda: None pytest: 5.3.4 IPython: 7.11.1 sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3718/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 437418525,MDU6SXNzdWU0Mzc0MTg1MjU=,2921,to_netcdf with decoded time can create file with inconsistent time:units and time_bounds:units,15570875,closed,0,,,4,2019-04-25T22:08:52Z,2019-06-25T00:24:42Z,2019-06-25T00:24:42Z,NONE,,,,"#### Code Sample, a copy-pastable example if possible ```python import numpy as np import xarray as xr # create time and time_bounds DataArrays for Jan-1850 and Feb-1850 time_bounds_vals = np.array([[0.0, 31.0], [31.0, 59.0]]) time_vals = time_bounds_vals.mean(axis=1) time_var = xr.DataArray(time_vals, dims='time', coords={'time':time_vals}) time_bounds_var = xr.DataArray(time_bounds_vals, dims=('time', 'd2'), coords={'time':time_vals}) # create Dataset of time and time_bounds ds = xr.Dataset(coords={'time':time_var}, data_vars={'time_bounds':time_bounds_var}) ds.time.attrs = {'bounds':'time_bounds', 'calendar':'noleap', 'units':'days since 1850-01-01'} # write Jan-1850 values to file ds.isel(time=slice(0,1)).to_netcdf('Jan-1850.nc', unlimited_dims='time') # write Feb-1850 values to file ds.isel(time=slice(1,2)).to_netcdf('Feb-1850.nc', unlimited_dims='time') # use open_mfdataset to read in files, combining into 1 Dataset decode_times = True decode_cf = True ds = xr.open_mfdataset(['Jan-1850.nc', 'Feb-1850.nc'], decode_cf=decode_cf, decode_times=decode_times) # write combined Dataset out ds.to_netcdf('JanFeb-1850.nc', unlimited_dims='time') ``` #### Problem description The above code initially creates 2 netCDF files, for Jan-1850 and Feb-1850, that have the variables `time` and `time_bounds`, and `time:bounds='time_bounds'`. It then reads the 2 files back in as a single Dataset, using `open_mfdataset`, and this Dataset is written back out to a netCDF file. 
ncdump of this final file is ``` netcdf JanFeb-1850 { dimensions: time = UNLIMITED ; // (2 currently) d2 = 2 ; variables: int64 time(time) ; time:bounds = ""time_bounds"" ; time:units = ""hours since 1850-01-16 12:00:00.000000"" ; time:calendar = ""noleap"" ; double time_bounds(time, d2) ; time_bounds:_FillValue = NaN ; time_bounds:units = ""days since 1850-01-01"" ; time_bounds:calendar = ""noleap"" ; data: time = 0, 708 ; time_bounds = 0, 31, 31, 59 ; } ``` The problem is that the units attribute for `time` and `time_bounds` are different in this file, contrary to what [CF conventions](http://cfconventions.org/cf-conventions/cf-conventions.html#cell-boundaries) requires. The final call to `to_netcdf` is creating a file where `time`'s units (and type) differ from what they are in the intermediate files. These transformations are not being applied to `time_bounds`. While the change to `time`'s type is not necessarily an issue, I do find it surprising. This inconsistency goes away if either of `decode_times` or `decode_cf` is set to `False` in the python code above. In particular, the transformations to `time`'s units and type do not happen. The inconsistency also goes away if `open_mfdataset` opens a single file. In this scenario also, the transformations to `time`'s units and type do not happen. I think that the desired behavior is to either not apply the units and type transformations to `time`, or to also apply them to `time_bounds`. The first option would be consistent with the current single-file behavior.
INSTALLED VERSIONS ------------------ commit: None python: 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: 3.12.62-60.64.8-default machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.10.4 libnetcdf: 4.6.2 xarray: 0.12.1 pandas: 0.24.2 numpy: 1.16.2 scipy: 1.2.1 netCDF4: 1.4.2 pydap: None h5netcdf: None h5py: None Nio: None zarr: None cftime: 1.0.3.4 nc_time_axis: None PseudonetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 1.1.5 distributed: 1.26.1 matplotlib: 3.0.3 cartopy: None seaborn: None setuptools: 40.8.0 pip: 19.0.3 conda: None pytest: 4.3.1 IPython: 7.4.0 sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2921/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 433916353,MDU6SXNzdWU0MzM5MTYzNTM=,2902,DataArray sum().values depends on chunk size,15570875,closed,0,,,1,2019-04-16T18:09:33Z,2019-04-17T02:01:55Z,2019-04-17T02:01:55Z,NONE,,,,"Hi, The code below creates a Dataset with an `NxNxN` DataArray that is equal to a constant `val`. For various re-chunked copies of the Dataset, the code computes the sum of the array, and compares it to the exact value `N*N*N*val`. I find that the printed values are different, at the round-off level, for different chunk sizes. While I'm not surprised at these round-off differences, I could not find mention of such behavior in the xarray documentation. Is this feature known to xarray developers? Do xarray developers consider it a feature or a bug? Either way, I think it would be useful if the xarray documentation would mention that the results of some operations depends on chunk size. 
code: ```import numpy as np import xarray as xr N = 128 val = 1.9 val_array = np.full((N, N, N), val) exact_sum = N * N * N * val ds = xr.DataArray(val_array, name='val_array', dims=['x', 'y', 'z']).to_dataset() rel_diff = (ds['val_array'].sum().values - exact_sum) / exact_sum print('no chunking, rel_diff = %e' % rel_diff) for chunk_x in [N//16, N//4, N]: for chunk_y in [N//16, N//4, N]: for chunk_z in [N//16, N//4, N]: ds2 = ds.chunk({'x':chunk_x, 'y':chunk_y, 'z':chunk_z}) rel_diff = (ds2['val_array'].sum().values - exact_sum) / exact_sum print('chunk_x = %3d, chunk_y = %3d, chunk_z = %3d, rel_diff = %e' \ % (chunk_x, chunk_y, chunk_z, rel_diff)) ``` results: ```no chunking, rel_diff = -4.557758e-15 chunk_x = 8, chunk_y = 8, chunk_z = 8, rel_diff = -2.337312e-16 chunk_x = 8, chunk_y = 8, chunk_z = 32, rel_diff = -2.337312e-16 chunk_x = 8, chunk_y = 8, chunk_z = 128, rel_diff = -2.337312e-16 chunk_x = 8, chunk_y = 32, chunk_z = 8, rel_diff = -2.337312e-16 chunk_x = 8, chunk_y = 32, chunk_z = 32, rel_diff = -2.337312e-16 chunk_x = 8, chunk_y = 32, chunk_z = 128, rel_diff = -2.337312e-16 chunk_x = 8, chunk_y = 128, chunk_z = 8, rel_diff = -2.337312e-16 chunk_x = 8, chunk_y = 128, chunk_z = 32, rel_diff = -2.337312e-16 chunk_x = 8, chunk_y = 128, chunk_z = 128, rel_diff = -5.843279e-16 chunk_x = 32, chunk_y = 8, chunk_z = 8, rel_diff = -2.337312e-16 chunk_x = 32, chunk_y = 8, chunk_z = 32, rel_diff = -2.337312e-16 chunk_x = 32, chunk_y = 8, chunk_z = 128, rel_diff = -2.337312e-16 chunk_x = 32, chunk_y = 32, chunk_z = 8, rel_diff = -2.337312e-16 chunk_x = 32, chunk_y = 32, chunk_z = 32, rel_diff = -2.337312e-16 chunk_x = 32, chunk_y = 32, chunk_z = 128, rel_diff = -5.843279e-16 chunk_x = 32, chunk_y = 128, chunk_z = 8, rel_diff = -2.337312e-16 chunk_x = 32, chunk_y = 128, chunk_z = 32, rel_diff = -5.843279e-16 chunk_x = 32, chunk_y = 128, chunk_z = 128, rel_diff = 1.168656e-15 chunk_x = 128, chunk_y = 8, chunk_z = 8, rel_diff = -2.337312e-16 chunk_x = 128, chunk_y = 
8, chunk_z = 32, rel_diff = -2.337312e-16 chunk_x = 128, chunk_y = 8, chunk_z = 128, rel_diff = -5.843279e-16 chunk_x = 128, chunk_y = 32, chunk_z = 8, rel_diff = -2.337312e-16 chunk_x = 128, chunk_y = 32, chunk_z = 32, rel_diff = -5.843279e-16 chunk_x = 128, chunk_y = 32, chunk_z = 128, rel_diff = 1.168656e-15 chunk_x = 128, chunk_y = 128, chunk_z = 8, rel_diff = -5.843279e-16 chunk_x = 128, chunk_y = 128, chunk_z = 32, rel_diff = 1.168656e-15 chunk_x = 128, chunk_y = 128, chunk_z = 128, rel_diff = -4.557758e-15 ``` Output of ``xr.show_versions()``
INSTALLED VERSIONS ------------------ commit: None python: 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: 3.10.0-693.21.1.el7.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: en_US.UTF-8 LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.10.4 libnetcdf: 4.6.2 xarray: 0.12.1 pandas: 0.24.2 numpy: 1.16.2 scipy: 1.2.1 netCDF4: 1.4.2 pydap: None h5netcdf: None h5py: None Nio: None zarr: None cftime: 1.0.3.4 nc_time_axis: None PseudonetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 1.1.5 distributed: 1.26.1 matplotlib: 3.0.3 cartopy: None seaborn: None setuptools: 40.8.0 pip: 19.0.3 conda: None pytest: 4.3.1 IPython: 7.4.0 sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2902/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 407750967,MDU6SXNzdWU0MDc3NTA5Njc=,2752,"document defaults for optional arguments to open_dataset, open_mfdataset",15570875,closed,0,,,5,2019-02-07T15:19:05Z,2019-02-07T18:28:57Z,2019-02-07T17:22:49Z,NONE,,,,"It would be useful if the docs for [`open_dataset`](https://xarray.pydata.org/en/stable/generated/xarray.open_dataset.html#xarray.open_dataset) and [`open_mfdataset`](https://xarray.pydata.org/en/stable/generated/xarray.open_mfdataset.html#xarray.open_mfdataset) listed the default values of optional arguments (where there is a default). For example, the docs for [`open_dataset`](https://xarray.pydata.org/en/stable/generated/xarray.open_dataset.html#xarray.open_dataset) do not list the defaults for `decode_times` and `decode_coords`, and the docs for [`open_mfdataset`](https://xarray.pydata.org/en/stable/generated/xarray.open_mfdataset.html#xarray.open_mfdataset) do not list the defaults for `data_vars` and `coords`.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2752/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue