issues

3 rows where repo = 13221727, state = "open" and user = 601177 sorted by updated_at descending

id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
656089264 MDU6SXNzdWU2NTYwODkyNjQ= 4220 combine_first of Datasets changes dtype of variable present only in one Dataset equaeghe 601177 open 0     2 2020-07-13T19:36:41Z 2023-05-12T08:55:10Z   NONE      

What happened: I was combining two Datasets using combine_first and to my surprise the dtype of one of the DataArrays in the merged Dataset was changed (from bool to float64).

What you expected to happen: No change in dtype.

Minimal Complete Verifiable Example:

```python
import xarray as xr

ds = xr.Dataset(coords={'abc': list('abc')})

ds['x'] = ('abc', [1., 2., 3.])
ds['y'] = ('abc', [-1., -2., -3.])

ds['t'] = ('abc', [True, False, True])
ds
<xarray.Dataset>
Dimensions:  (abc: 3)
Coordinates:
  * abc      (abc) <U1 'a' 'b' 'c'
Data variables:
    x        (abc) float64 1.0 2.0 3.0
    y        (abc) float64 -1.0 -2.0 -3.0
    t        (abc) bool True False True

xy4b = ds[['x', 'y']].sel(abc=~ds.t) * 10
xy4b.combine_first(ds)
Out[14]:
<xarray.Dataset>
Dimensions:  (abc: 3)
Coordinates:
  * abc      (abc) object 'a' 'b' 'c'
Data variables:
    x        (abc) float64 1.0 20.0 3.0
    y        (abc) float64 -1.0 -20.0 -3.0
    t        (abc) float64 1.0 0.0 1.0
```
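Until this is addressed in xarray itself, one possible workaround is to cast the affected variables back after combining. The sketch below is only an illustration: `combine_first_keep_dtypes` is a hypothetical helper, not part of xarray, and it assumes that restoring the original dtype is safe whenever the combined variable contains no missing values. It uses `isel` with a boolean NumPy mask instead of the `sel` call from the report to build the partial Dataset.

```python
import xarray as xr


def combine_first_keep_dtypes(first, second):
    """combine_first, then restore dtypes of variables found in only one input.

    Hypothetical helper, not part of xarray.
    """
    combined = first.combine_first(second)
    for name in list(combined.data_vars):
        in_first = name in first.data_vars
        in_second = name in second.data_vars
        if in_first == in_second:
            continue  # present in both inputs: leave xarray's result alone
        original = (first if in_first else second)[name].dtype
        var = combined[name]
        # Cast back only when no missing values would be lost by the cast.
        if var.dtype != original and not bool(var.isnull().any()):
            combined[name] = var.astype(original)
    return combined


# Same data as in the report above.
ds = xr.Dataset(coords={"abc": list("abc")})
ds["x"] = ("abc", [1.0, 2.0, 3.0])
ds["y"] = ("abc", [-1.0, -2.0, -3.0])
ds["t"] = ("abc", [True, False, True])

# Rows where t is False, scaled by 10 (boolean mask passed to isel).
xy4b = ds[["x", "y"]].isel(abc=~ds["t"].values) * 10
result = combine_first_keep_dtypes(xy4b, ds)
```

Variables present in both inputs are deliberately left untouched, since for those an upcast may be needed to hold the merged values.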

Anything else we need to know?: No.

Environment:

Output of xr.show_versions():

commit: None
python: 3.7.8 (default, Jul 5 2020, 21:51:42) [GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 5.4.48-gentoo
machine: x86_64
processor: Intel(R) Core(TM) i7-2620M CPU @ 2.70GHz
byteorder: little
LC_ALL: None
LANG: nl_BE.UTF-8
LOCALE: nl_BE.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.6.1
xarray: 0.12.1
pandas: 1.0.4
numpy: 1.18.5
scipy: 1.4.1
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: None
cftime: 1.1.3
nc_time_axis: None
PseudonetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 1.2.0
distributed: None
matplotlib: 3.2.1
cartopy: None
seaborn: None
setuptools: 46.4.0
pip: 20.0.2
conda: None
pytest: None
IPython: 7.16.1
sphinx: 3.0.4
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4220/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
294380515 MDU6SXNzdWUyOTQzODA1MTU= 1888 Getting DataArrays from netCDF4 files correctly and without hassle equaeghe 601177 open 0     4 2018-02-05T12:45:42Z 2020-12-06T18:07:47Z   NONE      

Context

Consider a netCDF4 file with a group structure. For example, the following toy:

```python
import netCDF4 as nc

# netCDF4 file
f = nc.Dataset('simple_hierarchy.nc', 'w')

# coordinates in root
f.createDimension('x', 3)
f.createVariable('x', 'f4', ('x',), fill_value=False)
f['x'][:] = [1.1, 2.2, 3.3]
f.createDimension('y', 2)
f.createVariable('y', 'f4', ('y',), fill_value=False)
f['y'][:] = [-0.9, -1.8]

# variables in root
f.createVariable('u', 'i1', (), fill_value=False)
f.createVariable('v', 'u1', ('x','y'), fill_value=False)

# group
f.createGroup('g')
g = f['g']

# new/modified coordinates in g
g.createDimension('y', 3)
g.createVariable('y', 'f4', ('y',), fill_value=False)
g['y'][:] = [-0.9, -1.8, -2.7]

# variable in g
g.createVariable('w', 'u1', ('x', 'y'), fill_value=False)
f.close()
```

Current behavior

  1. It is currently a hassle to get a DataArray from a variable in a group with multiple non-coordinate variables:

     ```python
     >>> xr.open_dataarray('simple_hierarchy.nc')
     …
     ValueError: Given file dataset contains more than one data variable. Please read with xarray.open_dataset and then select the variable you want.
     >>> xr.open_dataarray('simple_hierarchy.nc', group='v')
     …
     OSError: [Errno group not found: v] 'v'
     >>> xr.open_dataarray('simple_hierarchy.nc', drop_variables='u')
     <xarray.DataArray 'v' (x: 3, y: 2)>
     array([[120, 219],
            [178, 172],
            [  9, 127]], dtype=uint8)
     Coordinates:
       * x        (x) float32 1.1 2.2 3.3
       * y        (y) float32 -0.9 -1.8
     ```
  2. Also, coordinates defined at a group level closer to the root are not taken into account:

     ```python
     >>> xr.open_dataarray('simple_hierarchy.nc', group='g')
     <xarray.DataArray 'w' (x: 3, y: 3)>
     array([[216, 219, 178],
            [172,   9, 127],
            [  0,   0,  64]], dtype=uint8)
     Coordinates:
       * y        (y) float32 -0.9 -1.8 -2.7
     Dimensions without coordinates: x
     ```
     So the DataArray is not loaded correctly, as some of its defining coordinates are missing.
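The route suggested by the `ValueError` above (read with `xarray.open_dataset`, then select the variable) can be wrapped in a small helper. This is a minimal sketch: `select_variable` and `open_variable` are hypothetical names, not xarray functions, and the sketch assumes it is acceptable to load the selected variable into memory before the file is closed.

```python
import xarray as xr


def select_variable(ds, variable):
    """Return one data variable of an already opened Dataset as a DataArray."""
    if variable not in ds.data_vars:
        raise KeyError(
            f"{variable!r} is not a data variable; available: {list(ds.data_vars)}"
        )
    return ds[variable]


def open_variable(path, variable, group=None):
    """Hypothetical helper: open a file (optionally a group) and load one variable."""
    with xr.open_dataset(path, group=group) as ds:
        # Load into memory before the file is closed.
        return select_variable(ds, variable).load()
```

With the toy file from the Context section, `open_variable('simple_hierarchy.nc', 'v')` would be the equivalent of the `drop_variables='u'` call, without having to name every other variable.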

Suggested behavior

  1. Add a variable kwarg in the open_dataarray method:

     ```python
     >>> xr.open_dataarray('simple_hierarchy.nc', variable='v')
     <xarray.DataArray 'v' (x: 3, y: 2)>
     array([[120, 219],
            [178, 172],
            [  9, 127]], dtype=uint8)
     Coordinates:
       * x        (x) float32 1.1 2.2 3.3
       * y        (y) float32 -0.9 -1.8
     ```
  2. Have the function that loads variables go up the group hierarchy to see if some coordinate arrays can be found for dimensions lacking them within this group:

     ```python
     >>> xr.open_dataarray('simple_hierarchy.nc', group='g')
     <xarray.DataArray 'w' (x: 3, y: 3)>
     array([[216, 219, 178],
            [172,   9, 127],
            [  0,   0,  64]], dtype=uint8)
     Coordinates:
       * x        (x) float32 1.1 2.2 3.3
       * y        (y) float32 -0.9 -1.8 -2.7
     ```
     I guess care needs to be taken as well upon writing to netCDF, to make sure no spurious dimension/coordinate definitions are added.
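The second suggestion can be approximated in user code today. The sketch below works on Datasets that are already open, so it needs no file; `inherit_coords` is a hypothetical helper, and it assumes a parent coordinate should only be inherited when the group has no coordinate of that name and the dimension sizes match.

```python
import numpy as np
import xarray as xr


def inherit_coords(child, parent):
    """Attach parent coordinates to dimensions of child that lack their own."""
    inherited = {}
    for dim, size in child.sizes.items():
        if dim in child.coords:
            continue  # the group defines (or overrides) this coordinate
        if (
            dim in parent.coords
            and parent[dim].dims == (dim,)
            and parent.sizes[dim] == size
        ):
            inherited[dim] = parent[dim].values
    return child.assign_coords(inherited)


# In-memory stand-ins for the root and the group 'g' of the toy file.
root = xr.Dataset(coords={"x": [1.1, 2.2, 3.3], "y": [-0.9, -1.8]})
group = xr.Dataset(
    {"w": (("x", "y"), np.zeros((3, 3), dtype="u1"))},
    coords={"y": [-0.9, -1.8, -2.7]},
)
w = inherit_coords(group, root)["w"]
```

For the toy file, `root` and `group` would come from `xr.open_dataset('simple_hierarchy.nc')` and `xr.open_dataset('simple_hierarchy.nc', group='g')`; the group's own `y` takes precedence, while `x` is taken from the root.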

Version

xarray 0.9.6

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1888/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
498297720 MDU6SXNzdWU0OTgyOTc3MjA= 3343 Interpolation using non-dimension coordinates equaeghe 601177 open 0     4 2019-09-25T13:45:38Z 2019-09-30T12:53:35Z   NONE      

MCVE Code Sample

```python
import xarray as xr

da = xr.DataArray([1., 2.], coords=[('dim', [.5, 1.])])
da.coords['nondim'] = ('dim', [0., 1.])
da
<xarray.DataArray (dim: 2)>
array([1., 2.])
Coordinates:
  * dim      (dim) float64 0.5 1.0
    nondim   (dim) float64 0.0 1.0
da.interp(dim=.75)
<xarray.DataArray ()>
array(1.5)
Coordinates:
    nondim   float64 0.5
    dim      float64 0.75
da.interp(nondim=.5)
Traceback (most recent call last):

  File "<ipython-input-192-e3df34cff90f>", line 1, in <module>
    da.interp(nondim=.5)

  File "/usr/lib64/python3.6/site-packages/xarray/core/dataarray.py", line 951, in interp
    **coords_kwargs)

  File "/usr/lib64/python3.6/site-packages/xarray/core/dataset.py", line 1860, in interp
    indexers = OrderedDict(self._validate_indexers(coords))

  File "/usr/lib64/python3.6/site-packages/xarray/core/dataset.py", line 1316, in _validate_indexers
    raise ValueError("dimensions %r do not exist" % invalid)

ValueError: dimensions ['nondim'] do not exist
```

Expected Output

The same interpolation as for 'dim'.

Problem Description

Apparently, xarray currently cannot interpolate using non-dimension coordinates.
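A possible workaround is to promote the non-dimension coordinate to a dimension coordinate with `swap_dims` and interpolate along that. The following is a sketch under assumptions: `interp_by_coord` is a hypothetical helper, the coordinate must be one-dimensional and monotonic along its dimension, and `interp` may need SciPy to be installed.

```python
import xarray as xr


def interp_by_coord(da, coord, value, **interp_kwargs):
    """Interpolate along a 1-D non-dimension coordinate (hypothetical helper).

    Assumes `coord` is one-dimensional and monotonic along its dimension.
    """
    (dim,) = da[coord].dims
    # Make the coordinate the dimension coordinate, then interpolate as usual.
    return da.swap_dims({dim: coord}).interp({coord: value}, **interp_kwargs)


# Same data as in the MCVE above.
da = xr.DataArray([1.0, 2.0], coords=[("dim", [0.5, 1.0])])
da.coords["nondim"] = ("dim", [0.0, 1.0])

# After swapping, 'nondim' indexes the data and 'dim' becomes an ordinary coordinate.
swapped = da.swap_dims({"dim": "nondim"})
```

Calling `interp_by_coord(da, 'nondim', 0.5)` then performs the interpolation that `da.interp(nondim=.5)` rejects.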

Output of xr.show_versions()

commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Linux
OS-release: 4.19.72-gentoo
machine: x86_64
processor: Intel(R) Core(TM) i7-2620M CPU @ 2.70GHz
byteorder: little
LC_ALL: None
LANG: nl_BE.UTF-8
LOCALE: nl_BE.UTF-8
xarray: 0.10.8
pandas: 0.24.2
numpy: 1.14.5
scipy: 1.1.0
netCDF4: 1.5.1.2
h5netcdf: None
h5py: 2.9.0
Nio: None
zarr: None
bottleneck: 1.2.1
cyordereddict: None
dask: 1.2.0
distributed: None
matplotlib: 2.2.2
cartopy: None
seaborn: None
setuptools: 40.6.3
pip: 19.1
conda: None
pytest: 3.10.1
IPython: 5.4.1
sphinx: 1.7.5
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3343/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
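
The filter shown at the top of this page (repo = 13221727, state = "open", user = 601177, sorted by updated_at descending) can be reproduced against this schema with Python's built-in `sqlite3` module. The sketch below is illustrative only: it uses an in-memory database, keeps just the columns needed for the filter and sort (a simplification of the table above), and the SQL is an assumption about an equivalent query, not the exact statement Datasette runs.

```python
import sqlite3

# In-memory database with a trimmed-down [issues] table; only the columns
# used by the filter and sort are kept (a simplification for brevity).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE [issues] (
       [id] INTEGER PRIMARY KEY,
       [number] INTEGER,
       [title] TEXT,
       [user] INTEGER,
       [state] TEXT,
       [updated_at] TEXT,
       [repo] INTEGER
    )
    """
)
conn.execute("CREATE INDEX [idx_issues_repo] ON [issues] ([repo])")
conn.execute("CREATE INDEX [idx_issues_user] ON [issues] ([user])")

# The three rows listed on this page, inserted in arbitrary order.
rows = [
    (294380515, 1888,
     "Getting DataArrays from netCDF4 files correctly and without hassle",
     601177, "open", "2020-12-06T18:07:47Z", 13221727),
    (498297720, 3343,
     "Interpolation using non-dimension coordinates",
     601177, "open", "2019-09-30T12:53:35Z", 13221727),
    (656089264, 4220,
     "combine_first of Datasets changes dtype of variable present only in one Dataset",
     601177, "open", "2023-05-12T08:55:10Z", 13221727),
]
conn.executemany("INSERT INTO [issues] VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

query = """
    SELECT [number], [title], [updated_at]
    FROM [issues]
    WHERE [repo] = :repo AND [state] = :state AND [user] = :user
    ORDER BY [updated_at] DESC
"""
result = conn.execute(
    query, {"repo": 13221727, "state": "open", "user": 601177}
).fetchall()
numbers = [row[0] for row in result]
```

Because `updated_at` is stored as ISO 8601 text, a plain text sort already orders the rows chronologically.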