
issues


5 rows where state = "closed", type = "issue" and user = 15570875 sorted by updated_at descending

id: 690624634 · node_id: MDU6SXNzdWU2OTA2MjQ2MzQ= · number: 4401
title: problem with time axis values in line plot
user: klindsay28 (15570875) · state: closed · locked: 0 · comments: 4
created_at: 2020-09-02T01:15:14Z · updated_at: 2021-10-23T16:27:32Z · closed_at: 2021-08-10T22:45:20Z
author_association: NONE

When I run the following code inside a Jupyter notebook, the values on the x axis (time) of the generated plot appear to run from 0 to ~4. I expect them to run from 1 to 4, as the time values do. I can't tell whether this is a problem with what xarray passes to nc_time_axis, or a problem with nc_time_axis itself. Could this be looked into, please?

```python
import cftime
import xarray as xr

time_vals = [
    cftime.DatetimeNoLeap(1 + year, 1 + month, 15)
    for year in range(3)
    for month in range(12)
]

x_vals = [time_val.year + time_val.dayofyr / 365.0 for time_val in time_vals]

x_da = xr.DataArray(x_vals, coords=[time_vals], dims=["time"])

x_da.plot.line("-o")
```
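For what it's worth, the 1-to-4 expectation follows directly from the decimal-year values the snippet computes; here is a stdlib-only sketch of that arithmetic (the cumulative-day table is an assumption matching a 365-day no-leap calendar, standing in for cftime's `dayofyr`):

```python
# Days before each month in a 365-day (noleap) calendar.
days_before = [0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334]

# Decimal-year value for the 15th of each month over three years,
# mirroring year + dayofyr / 365.0 in the snippet above.
x_vals = [
    (1 + year) + (days_before[month] + 15) / 365.0
    for year in range(3)
    for month in range(12)
]

print(min(x_vals), max(x_vals))  # all 36 values lie strictly between 1 and 4
```

So every plotted x value lies between 1 and 4, which is why an axis running from 0 looks wrong.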

Environment:

Output of `xr.show_versions()`

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.8 | packaged by conda-forge | (default, Jul 31 2020, 02:25:08) [GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-1127.13.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4
xarray: 0.16.0
pandas: 1.1.1
numpy: 1.19.1
scipy: 1.5.2
netCDF4: 1.5.4
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.4.0
cftime: 1.2.1
nc_time_axis: 1.2.0
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2.14.0
distributed: 2.14.0
matplotlib: 3.3.1
cartopy: 0.18.0
seaborn: 0.10.1
numbagg: None
pint: 0.15
setuptools: 49.6.0.post20200814
pip: 20.2.2
conda: None
pytest: 6.0.1
IPython: 7.17.0
sphinx: None
```
reactions: total_count 0 (https://api.github.com/repos/pydata/xarray/issues/4401/reactions)
state_reason: completed · repo: xarray (13221727) · type: issue
id: 535043825 · node_id: MDU6SXNzdWU1MzUwNDM4MjU= · number: 3606
title: confused by reference to dataset in docs for xarray.DataArray.copy
user: klindsay28 (15570875) · state: closed · locked: 0 · comments: 2
created_at: 2019-12-09T16:30:23Z · updated_at: 2020-07-24T19:20:45Z · closed_at: 2020-07-24T19:20:45Z
author_association: NONE

The documentation for xarray.DataArray.copy says:

> If deep=True, a deep copy is made of the data array. Otherwise, a shallow copy is made, so each variable in the new array's dataset is also a variable in this array's dataset.

I do not understand what dataset is being referred to here. In particular, there are no xarray datasets in the examples provided in this documentation. Could someone provide clarification?
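For readers hitting the same confusion: a DataArray carries an internal mapping of variables (its coordinates plus the array itself), and the quoted sentence is about whether that mapping's underlying data is shared. A plain-Python analogy of shallow versus deep copying of such a mapping (the dict here is a stand-in for illustration, not xarray's actual internals):

```python
import copy

# Stand-in for a variable mapping: names -> underlying value buffers.
variables = {"temp": [1.0, 2.0], "lat": [10.0, 20.0]}

shallow = copy.copy(variables)   # new mapping, but the same inner lists
deep = copy.deepcopy(variables)  # new mapping and new inner lists

shallow["temp"][0] = 99.0        # mutates the shared buffer

print(variables["temp"][0])  # 99.0 -- the shallow copy shares data
print(deep["temp"][0])       # 1.0  -- the deep copy does not
```

With deep=False, both DataArrays point at the same underlying variables in this sense, which is what the docstring's "dataset" wording is getting at.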

reactions: total_count 0 (https://api.github.com/repos/pydata/xarray/issues/3606/reactions)
state_reason: completed · repo: xarray (13221727) · type: issue
id: 437418525 · node_id: MDU6SXNzdWU0Mzc0MTg1MjU= · number: 2921
title: to_netcdf with decoded time can create file with inconsistent time:units and time_bounds:units
user: klindsay28 (15570875) · state: closed · locked: 0 · comments: 4
created_at: 2019-04-25T22:08:52Z · updated_at: 2019-06-25T00:24:42Z · closed_at: 2019-06-25T00:24:42Z
author_association: NONE

Code sample (copy-pastable):

```python
import numpy as np
import xarray as xr

# create time and time_bounds DataArrays for Jan-1850 and Feb-1850
time_bounds_vals = np.array([[0.0, 31.0], [31.0, 59.0]])
time_vals = time_bounds_vals.mean(axis=1)

time_var = xr.DataArray(time_vals, dims='time', coords={'time': time_vals})
time_bounds_var = xr.DataArray(time_bounds_vals, dims=('time', 'd2'),
                               coords={'time': time_vals})

# create Dataset of time and time_bounds
ds = xr.Dataset(coords={'time': time_var}, data_vars={'time_bounds': time_bounds_var})
ds.time.attrs = {'bounds': 'time_bounds', 'calendar': 'noleap',
                 'units': 'days since 1850-01-01'}

# write Jan-1850 values to file
ds.isel(time=slice(0, 1)).to_netcdf('Jan-1850.nc', unlimited_dims='time')

# write Feb-1850 values to file
ds.isel(time=slice(1, 2)).to_netcdf('Feb-1850.nc', unlimited_dims='time')

# use open_mfdataset to read in the files, combining into one Dataset
decode_times = True
decode_cf = True
ds = xr.open_mfdataset(['Jan-1850.nc', 'Feb-1850.nc'],
                       decode_cf=decode_cf, decode_times=decode_times)

# write the combined Dataset out
ds.to_netcdf('JanFeb-1850.nc', unlimited_dims='time')
```

Problem description

The above code first creates two netCDF files, for Jan-1850 and Feb-1850, each containing the variables time and time_bounds, with time:bounds = "time_bounds". It then reads the two files back in as a single Dataset, using open_mfdataset, and writes this Dataset back out to a netCDF file. ncdump of the final file is

```
netcdf JanFeb-1850 {
dimensions:
	time = UNLIMITED ; // (2 currently)
	d2 = 2 ;
variables:
	int64 time(time) ;
		time:bounds = "time_bounds" ;
		time:units = "hours since 1850-01-16 12:00:00.000000" ;
		time:calendar = "noleap" ;
	double time_bounds(time, d2) ;
		time_bounds:_FillValue = NaN ;
		time_bounds:units = "days since 1850-01-01" ;
		time_bounds:calendar = "noleap" ;
data:

 time = 0, 708 ;

 time_bounds =
  0, 31,
  31, 59 ;
}
```

The problem is that the units attributes for `time` and `time_bounds` differ in this file, contrary to what CF conventions require.

The final call to to_netcdf is creating a file where time's units (and type) differ from what they are in the intermediate files. These transformations are not being applied to time_bounds.

While the change to time's type is not necessarily an issue, I do find it surprising.

This inconsistency goes away if either of decode_times or decode_cf is set to False in the python code above. In particular, the transformations to time's units and type do not happen.

The inconsistency also goes away if open_mfdataset opens a single file. In this scenario also, the transformations to time's units and type do not happen.

I think that the desired behavior is to either not apply the units and type transformations to time, or to also apply them to time_bounds. The first option would be consistent with the current single-file behavior.
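One point worth separating out: the numeric values in the final file are still self-consistent once converted to a common unit, so only the mismatched units attributes are at issue. A stdlib-only check using the numbers from the ncdump above (the 15.5-day offset is just 1850-01-16 12:00 expressed as days since 1850-01-01):

```python
# time is stored as hours since 1850-01-16 12:00; convert back to
# days since 1850-01-01, the unit time_bounds still uses.
offset_days = 15.5
time_hours = [0, 708]
time_days = [offset_days + h / 24.0 for h in time_hours]

# the bounds midpoints, which is how time_vals was constructed
bounds = [(0.0, 31.0), (31.0, 59.0)]
midpoints = [(lo + hi) / 2.0 for lo, hi in bounds]

print(time_days, midpoints)  # both are [15.5, 45.0]
```

A CF-aware reader that honors each variable's own units attribute would still decode the file correctly; the violation is that CF requires time_bounds to share time's units, so tools that assume this will misread the bounds.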

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.12.62-60.64.8-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2
xarray: 0.12.1
pandas: 0.24.2
numpy: 1.16.2
scipy: 1.2.1
netCDF4: 1.4.2
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudonetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 1.1.5
distributed: 1.26.1
matplotlib: 3.0.3
cartopy: None
seaborn: None
setuptools: 40.8.0
pip: 19.0.3
conda: None
pytest: 4.3.1
IPython: 7.4.0
sphinx: None
```
reactions: total_count 0 (https://api.github.com/repos/pydata/xarray/issues/2921/reactions)
state_reason: completed · repo: xarray (13221727) · type: issue
id: 433916353 · node_id: MDU6SXNzdWU0MzM5MTYzNTM= · number: 2902
title: DataArray sum().values depends on chunk size
user: klindsay28 (15570875) · state: closed · locked: 0 · comments: 1
created_at: 2019-04-16T18:09:33Z · updated_at: 2019-04-17T02:01:55Z · closed_at: 2019-04-17T02:01:55Z
author_association: NONE

Hi,

The code below creates a Dataset containing an NxNxN DataArray equal to a constant val. For various re-chunked copies of the Dataset, the code computes the sum of the array and compares it to the exact value N*N*N*val. The printed values differ, at round-off level, for different chunk sizes.

While I'm not surprised at these round-off differences, I could not find mention of such behavior in the xarray documentation.

Is this feature known to xarray developers? Do xarray developers consider it a feature or a bug?

Either way, I think it would be useful if the xarray documentation mentioned that the results of some operations depend on chunk size.

code:

```python
import numpy as np
import xarray as xr

N = 128

val = 1.9
val_array = np.full((N, N, N), val)
exact_sum = N * N * N * val

ds = xr.DataArray(val_array, name='val_array', dims=['x', 'y', 'z']).to_dataset()

rel_diff = (ds['val_array'].sum().values - exact_sum) / exact_sum
print('no chunking, rel_diff = %e' % rel_diff)

for chunk_x in [N//16, N//4, N]:
    for chunk_y in [N//16, N//4, N]:
        for chunk_z in [N//16, N//4, N]:
            ds2 = ds.chunk({'x': chunk_x, 'y': chunk_y, 'z': chunk_z})
            rel_diff = (ds2['val_array'].sum().values - exact_sum) / exact_sum
            print('chunk_x = %3d, chunk_y = %3d, chunk_z = %3d, rel_diff = %e'
                  % (chunk_x, chunk_y, chunk_z, rel_diff))
```

results:

```
no chunking, rel_diff = -4.557758e-15
chunk_x =   8, chunk_y =   8, chunk_z =   8, rel_diff = -2.337312e-16
chunk_x =   8, chunk_y =   8, chunk_z =  32, rel_diff = -2.337312e-16
chunk_x =   8, chunk_y =   8, chunk_z = 128, rel_diff = -2.337312e-16
chunk_x =   8, chunk_y =  32, chunk_z =   8, rel_diff = -2.337312e-16
chunk_x =   8, chunk_y =  32, chunk_z =  32, rel_diff = -2.337312e-16
chunk_x =   8, chunk_y =  32, chunk_z = 128, rel_diff = -2.337312e-16
chunk_x =   8, chunk_y = 128, chunk_z =   8, rel_diff = -2.337312e-16
chunk_x =   8, chunk_y = 128, chunk_z =  32, rel_diff = -2.337312e-16
chunk_x =   8, chunk_y = 128, chunk_z = 128, rel_diff = -5.843279e-16
chunk_x =  32, chunk_y =   8, chunk_z =   8, rel_diff = -2.337312e-16
chunk_x =  32, chunk_y =   8, chunk_z =  32, rel_diff = -2.337312e-16
chunk_x =  32, chunk_y =   8, chunk_z = 128, rel_diff = -2.337312e-16
chunk_x =  32, chunk_y =  32, chunk_z =   8, rel_diff = -2.337312e-16
chunk_x =  32, chunk_y =  32, chunk_z =  32, rel_diff = -2.337312e-16
chunk_x =  32, chunk_y =  32, chunk_z = 128, rel_diff = -5.843279e-16
chunk_x =  32, chunk_y = 128, chunk_z =   8, rel_diff = -2.337312e-16
chunk_x =  32, chunk_y = 128, chunk_z =  32, rel_diff = -5.843279e-16
chunk_x =  32, chunk_y = 128, chunk_z = 128, rel_diff = 1.168656e-15
chunk_x = 128, chunk_y =   8, chunk_z =   8, rel_diff = -2.337312e-16
chunk_x = 128, chunk_y =   8, chunk_z =  32, rel_diff = -2.337312e-16
chunk_x = 128, chunk_y =   8, chunk_z = 128, rel_diff = -5.843279e-16
chunk_x = 128, chunk_y =  32, chunk_z =   8, rel_diff = -2.337312e-16
chunk_x = 128, chunk_y =  32, chunk_z =  32, rel_diff = -5.843279e-16
chunk_x = 128, chunk_y =  32, chunk_z = 128, rel_diff = 1.168656e-15
chunk_x = 128, chunk_y = 128, chunk_z =   8, rel_diff = -5.843279e-16
chunk_x = 128, chunk_y = 128, chunk_z =  32, rel_diff = 1.168656e-15
chunk_x = 128, chunk_y = 128, chunk_z = 128, rel_diff = -4.557758e-15
```
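The effect is not specific to xarray or dask: floating-point addition is not associative, so any scheme that sums per-chunk partials and then combines them can take a slightly different rounding path than a single pass over the data. A stdlib-only illustration of the mechanism (the chunk size 64 is an arbitrary choice; both residuals stay tiny, and they may or may not coincide):

```python
# 1.9 is not exactly representable in binary floating point, so the
# order in which the additions happen changes the accumulated round-off.
vals = [1.9] * (8 * 8 * 8)
exact = 1.9 * len(vals)

sequential = 0.0
for v in vals:               # single-pass accumulation
    sequential += v

chunk = 64                   # per-chunk partial sums, then a combine step,
partials = []                # mimicking how a chunked reduction is organized
for i in range(0, len(vals), chunk):
    s = 0.0
    for v in vals[i:i + chunk]:
        s += v
    partials.append(s)
chunked = 0.0
for p in partials:
    chunked += p

print(sequential - exact, chunked - exact)
```

Both results are within round-off of the exact value; changing `chunk` changes which round-off path is taken, which is exactly what varying chunk_x/chunk_y/chunk_z does above.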

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-693.21.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2
xarray: 0.12.1
pandas: 0.24.2
numpy: 1.16.2
scipy: 1.2.1
netCDF4: 1.4.2
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudonetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 1.1.5
distributed: 1.26.1
matplotlib: 3.0.3
cartopy: None
seaborn: None
setuptools: 40.8.0
pip: 19.0.3
conda: None
pytest: 4.3.1
IPython: 7.4.0
sphinx: None
```
reactions: total_count 0 (https://api.github.com/repos/pydata/xarray/issues/2902/reactions)
state_reason: completed · repo: xarray (13221727) · type: issue
id: 407750967 · node_id: MDU6SXNzdWU0MDc3NTA5Njc= · number: 2752
title: document defaults for optional arguments to open_dataset, open_mfdataset
user: klindsay28 (15570875) · state: closed · locked: 0 · comments: 5
created_at: 2019-02-07T15:19:05Z · updated_at: 2019-02-07T18:28:57Z · closed_at: 2019-02-07T17:22:49Z
author_association: NONE

It would be useful if the docs for open_dataset and open_mfdataset listed the default values of optional arguments (where there is a default).

For example, the docs for open_dataset do not list the defaults for decode_times and decode_coords, and the docs for open_mfdataset do not list the defaults for data_vars and coords.
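In the meantime, defaults can be read off the function signature at runtime. A small sketch using a stand-in function (`open_example` and its defaults are hypothetical, purely for illustration; the same trick applies to xr.open_dataset):

```python
import inspect

def open_example(path, decode_times=True, decode_coords=True):
    """Hypothetical stand-in for a function whose docs omit its defaults."""

# Walk the signature and report every parameter that has a default value.
for name, param in inspect.signature(open_example).parameters.items():
    if param.default is not inspect.Parameter.empty:
        print(f"{name} defaults to {param.default!r}")
```

This only works when the default appears in the signature itself; defaults resolved inside the function body (e.g. `arg=None` sentinels) still need documentation.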

reactions: total_count 0 (https://api.github.com/repos/pydata/xarray/issues/2752/reactions)
state_reason: completed · repo: xarray (13221727) · type: issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);