
issues

Table actions
  • GraphQL API for issues

3 rows where state = "closed", type = "issue" and user = 10819524 sorted by updated_at descending

#7012 — Time-based resampling drops lat/lon coordinate metadata
id: 1367029446 (I_kwDOAMm_X85RezbG) · user: Zeitsperre (10819524) · state: closed · comments: 5 · author_association: CONTRIBUTOR
created_at: 2022-09-08T21:55:30Z · updated_at: 2022-09-13T15:30:32Z · closed_at: 2022-09-13T15:30:32Z

What happened?

When performing a DataArray resampling on a time dimension, the metadata attributes of non-affected coordinate variables are dropped. This behaviour breaks compatibility with cf_xarray as the coordinate metadata is needed to identify the X, Y, Z coordinates.

What did you expect to happen?

Metadata fields of unaffected coordinates (lat, lon, height) to be preserved.

Minimal Complete Verifiable Example

```python
import xarray as xr
import cf_xarray

ds = xr.open_dataset("my_dataset_that_has_lat_and_lon_coordinates.nc")
tas = ds.tas.resample(time="MS").mean(dim="time")

tas.cf["latitude"]
```
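As a stopgap (an editorial sketch, not part of the original report), the dropped attributes can be copied back from the pre-resample object, or the reduction can be run under `xr.set_options(keep_attrs=True)`. The file above is not included, so the dataset and its lat/lon attrs here are synthetic stand-ins:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the NetCDF file in the report (hypothetical values).
time = pd.date_range("2000-01-01", periods=60, freq="D")
ds = xr.Dataset(
    {"tas": (("time", "lat", "lon"), np.random.default_rng(0).random((60, 4, 4)))},
    coords={"time": time, "lat": np.linspace(-60, 60, 4), "lon": np.linspace(0, 270, 4)},
)
ds["lat"].attrs = {"units": "degrees_north", "standard_name": "latitude", "axis": "Y"}
ds["lon"].attrs = {"units": "degrees_east", "standard_name": "longitude", "axis": "X"}

tas = ds.tas
resampled = tas.resample(time="MS").mean(dim="time")

# Workaround 1: copy coordinate attrs back from the pre-resample object.
for name in resampled.coords:
    resampled[name].attrs.update(tas[name].attrs)

# Workaround 2: run the reduction with attribute retention enabled globally.
with xr.set_options(keep_attrs=True):
    resampled_keep = tas.resample(time="MS").mean(dim="time")
```

Either route leaves `resampled.cf["latitude"]` resolvable again, since cf_xarray only needs the `standard_name`/`axis` attrs to be present on the coordinates.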

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [ ] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [ ] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

```python
KeyError                                  Traceback (most recent call last)
File ~/mambaforge/envs/xclim310/lib/python3.10/site-packages/xarray/core/dataarray.py:760, in DataArray._getitem_coord(self, key)
    759 try:
--> 760     var = self._coords[key]
    761 except KeyError:

KeyError: 'latitude'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
File ~/mambaforge/envs/xclim310/lib/python3.10/site-packages/cf_xarray/accessor.py:706, in _getitem(accessor, key, skip)
    705 for name in allnames:
--> 706     extravars = accessor.get_associated_variable_names(
    707         name, skip_bounds=scalar_key, error=False
    708     )
    709     coords.extend(itertools.chain(*extravars.values()))

File ~/mambaforge/envs/xclim310/lib/python3.10/site-packages/cf_xarray/accessor.py:1597, in CFAccessor.get_associated_variable_names(self, name, skip_bounds, error)
   1596 coords: dict[str, list[str]] = {k: [] for k in keys}
-> 1597 attrs_or_encoding = ChainMap(self._obj[name].attrs, self._obj[name].encoding)
   1599 if "coordinates" in attrs_or_encoding:

File ~/mambaforge/envs/xclim310/lib/python3.10/site-packages/xarray/core/dataarray.py:769, in DataArray.__getitem__(self, key)
    768 if isinstance(key, str):
--> 769     return self._getitem_coord(key)
    770 else:
    771     # xarray-style array indexing

File ~/mambaforge/envs/xclim310/lib/python3.10/site-packages/xarray/core/dataarray.py:763, in DataArray._getitem_coord(self, key)
    762 dim_sizes = dict(zip(self.dims, self.shape))
--> 763 _, key, var = _get_virtual_variable(self._coords, key, dim_sizes)
    765 return self._replace_maybe_drop_dims(var, name=key)

File ~/mambaforge/envs/xclim310/lib/python3.10/site-packages/xarray/core/dataset.py:175, in _get_virtual_variable(variables, key, dim_sizes)
    174 if len(split_key) != 2:
--> 175     raise KeyError(key)
    177 ref_name, var_name = split_key

KeyError: 'latitude'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
Input In [7], in <cell line: 1>()
----> 1 tas.cf["latitude"]

File ~/mambaforge/envs/xclim310/lib/python3.10/site-packages/cf_xarray/accessor.py:2526, in CFDataArrayAccessor.__getitem__(self, key)
   2521 if not isinstance(key, str):
   2522     raise KeyError(
   2523         f"Cannot use a list of keys with DataArrays. Expected a single string. Received {key!r} instead."
   2524     )
-> 2526 return _getitem(self, key)

File ~/mambaforge/envs/xclim310/lib/python3.10/site-packages/cf_xarray/accessor.py:749, in _getitem(accessor, key, skip)
    746     return ds.set_coords(coords)
    748 except KeyError:
--> 749     raise KeyError(
    750         f"{kind}.cf does not understand the key {k!r}. "
    751         f"Use 'repr({kind}.cf)' (or '{kind}.cf' in a Jupyter environment) to see a list of key names that can be interpreted."
    752     )

KeyError: "DataArray.cf does not understand the key 'latitude'. Use 'repr(DataArray.cf)' (or 'DataArray.cf' in a Jupyter environment) to see a list of key names that can be interpreted."
```

Anything else we need to know?

Before

```
netcdf tas_Amon_CanESM2_rcp85_r1i1p1_200701-200712 {
dimensions:
	time = UNLIMITED ; // (12 currently)
	bnds = 2 ;
	lat = 64 ;
	lon = 128 ;
variables:
	double time(time) ;
		time:_FillValue = NaN ;
		time:bounds = "time_bnds" ;
		time:axis = "T" ;
		time:long_name = "time" ;
		time:standard_name = "time" ;
		time:units = "days since 1850-01-01" ;
		time:calendar = "365_day" ;
	double time_bnds(time, bnds) ;
		time_bnds:_FillValue = NaN ;
		time_bnds:coordinates = "height" ;
	double lat(lat) ;
		lat:_FillValue = NaN ;
		lat:bounds = "lat_bnds" ;
		lat:units = "degrees_north" ;
		lat:axis = "Y" ;
		lat:long_name = "latitude" ;
		lat:standard_name = "latitude" ;
	double lat_bnds(lat, bnds) ;
		lat_bnds:_FillValue = NaN ;
		lat_bnds:coordinates = "height" ;
	double lon(lon) ;
		lon:_FillValue = NaN ;
		lon:bounds = "lon_bnds" ;
		lon:units = "degrees_east" ;
		lon:axis = "X" ;
		lon:long_name = "longitude" ;
		lon:standard_name = "longitude" ;
	double lon_bnds(lon, bnds) ;
		lon_bnds:_FillValue = NaN ;
		lon_bnds:coordinates = "height" ;
	double height ;
		height:_FillValue = NaN ;
		height:units = "m" ;
		height:axis = "Z" ;
		height:positive = "up" ;
		height:long_name = "height" ;
		height:standard_name = "height" ;
	float tas(time, lat, lon) ;
		tas:_FillValue = 1.e+20f ;
		tas:standard_name = "air_temperature" ;
		tas:long_name = "Near-Surface Air Temperature" ;
		tas:units = "K" ;
		tas:original_name = "ST" ;
		tas:cell_methods = "time: mean (interval: 15 minutes)" ;
		tas:cell_measures = "area: areacella" ;
		tas:history = "2011-03-10T05:13:26Z altered by CMOR: Treated scalar dimension: \'height\'. 2011-03-10T05:13:26Z altered by CMOR: replaced missing value flag (1e+38) with standard missing value (1e+20)." ;
		tas:associated_files = "baseURL: http://cmip-pcmdi.llnl.gov/CMIP5/dataLocation gridspecFile: gridspec_atmos_fx_CanESM2_rcp85_r0i0p0.nc areacella: areacella_fx_CanESM2_rcp85_r0i0p0.nc" ;
		tas:coordinates = "height" ;
		tas:missing_value = 1.e+20f ;
```

After

```
netcdf test_cf_lat_new {
dimensions:
	lat = 64 ;
	lon = 128 ;
	time = 11 ;
variables:
	double lat(lat) ;
		lat:_FillValue = NaN ;
	double lon(lon) ;
		lon:_FillValue = NaN ;
	double height ;
		height:_FillValue = NaN ;
		height:units = "m" ;
		height:axis = "Z" ;
		height:positive = "up" ;
		height:long_name = "height" ;
		height:standard_name = "height" ;
	int64 time(time) ;
		time:units = "days since 2007-01-01 00:00:00.000000" ;
		time:calendar = "noleap" ;
	float tas(time, lat, lon) ;
		tas:_FillValue = NaNf ;
		tas:coordinates = "height" ;
```

Environment

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.10.5 | packaged by conda-forge | (main, Jun 14 2022, 07:06:46) [GCC 10.3.0]
python-bits: 64
OS: Linux
OS-release: 5.19.6-200.fc36.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_CA.UTF-8
LOCALE: ('en_CA', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1

xarray: 2022.6.0
pandas: 1.3.5
numpy: 1.22.4
scipy: 1.8.1
netCDF4: 1.6.0
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.2
nc_time_axis: 1.4.1
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.5
dask: 2022.6.1
distributed: 2022.6.1
matplotlib: 3.5.2
cartopy: None
seaborn: None
numbagg: None
fsspec: 2022.7.1
cupy: None
pint: 0.19.2
sparse: 0.13.0
flox: 0.5.10.dev8+gfbc2af8
numpy_groupies: 0.9.19
setuptools: 59.8.0
pip: 22.2.1
conda: None
pytest: 7.1.2
IPython: 8.4.0
sphinx: 5.1.1
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7012/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue
#2417 — Limiting threads/cores used by xarray(/dask?)
id: 361016974 (MDU6SXNzdWUzNjEwMTY5NzQ=) · user: Zeitsperre (10819524) · state: closed · comments: 9 · author_association: CONTRIBUTOR
created_at: 2018-09-17T19:50:07Z · updated_at: 2019-02-11T18:07:41Z · closed_at: 2019-02-11T18:07:40Z

I'm fairly new to xarray and I'm currently trying to leverage it to subset some NetCDFs. I'm running this on a shared server and would like to know how best to limit the processing power used by xarray so that it plays nicely with others. I've read through the dask and xarray documentation a bit, but it isn't clear to me how to set a cap on CPUs/threads. Here's an example of a spatial subset:

```python
import glob
import os

import xarray as xr
from multiprocessing.pool import ThreadPool
import dask

wd = os.getcwd()

test_data = os.path.join(wd, 'test_data')
lat_bnds = (43, 50)
lon_bnds = (-67, -80)
output = 'test_data_subset'


def subset_nc(ncfile, lat_bnds, lon_bnds, output):
    if not glob.os.path.exists(output):
        glob.os.makedirs(output)
    outfile = os.path.join(output, os.path.basename(ncfile).replace('.nc', '_subset.nc'))

    with dask.config.set(scheduler='threads', pool=ThreadPool(5)):
        ds = xr.open_dataset(ncfile, decode_times=False)

        ds_sub = ds.where(
            (ds.lon >= min(lon_bnds)) & (ds.lon <= max(lon_bnds))
            & (ds.lat >= min(lat_bnds)) & (ds.lat <= max(lat_bnds)),
            drop=True)
        comp = dict(zlib=True, complevel=5)
        encoding = {var: comp for var in ds.data_vars}
        ds_sub.to_netcdf(outfile, format='NETCDF4', encoding=encoding)


list_files = glob.glob(os.path.join(test_data, '*'))
print(list_files)

for i in list_files:
    subset_nc(i, lat_bnds, lon_bnds, output)
```

I've tried a few variations on this by moving the ThreadPool configuration around, but I still see far too much activity in the server's `top` (>3000% CPU). I'm not sure where the issue lies.
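An editorial aside, not from the thread: `xr.open_dataset` only produces dask-backed arrays when `chunks=` is passed, so the `ThreadPool` above may never actually be consulted, and the extra threads can also come from BLAS/OpenMP (cappable via `OMP_NUM_THREADS` and friends). When dask is in play, a worker cap can be set once, process-wide, instead of per-call. A minimal sketch (the array sizes and the cap of 2 are arbitrary):

```python
import dask
import dask.array as da

# Cap the threaded scheduler at 2 workers for the whole process
# (hypothetical number; tune for the shared server).
dask.config.set(scheduler="threads", num_workers=2)

# Any subsequent dask computation respects the cap.
x = da.ones((1000, 1000), chunks=(250, 250))
print(x.sum().compute())  # → 1000000.0
```

Setting this globally at the top of the script avoids having to thread the pool through every function call.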

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2417/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue
#2664 — Xarray fails to build with bottleneck on Travis CI
id: 397950349 (MDU6SXNzdWUzOTc5NTAzNDk=) · user: Zeitsperre (10819524) · state: closed · comments: 3 · author_association: CONTRIBUTOR
created_at: 2019-01-10T18:06:24Z · updated_at: 2019-01-10T18:30:48Z · closed_at: 2019-01-10T18:15:44Z

I'm currently having trouble building a project built on xarray (https://github.com/Ouranosinc/xclim/pull/139). Problems seem to have arisen during the holidays, as even our previously stable master branch is now failing.

We believe the issue is due to incompatibilities with bottleneck and/or numpy, which seem to pop up every now and then. The build only fails when run via Travis CI; running tests via tox locally does not raise errors. Fixes from other similar issues (such as #1294) haven't been successful. I've attempted the following:
  • manually upgrading pip before install
  • adjusting the base versions of xarray, bottleneck and numpy prior to installation
  • reinstalling bottleneck after xarray is installed

```python
tests/test_checks.py:4: in <module>
    import xarray as xr
.tox/py36/lib/python3.6/site-packages/xarray/__init__.py:14: in <module>
    from .core.extensions import (register_dataarray_accessor,
.tox/py36/lib/python3.6/site-packages/xarray/core/extensions.py:6: in <module>
    from .dataarray import DataArray
.tox/py36/lib/python3.6/site-packages/xarray/core/dataarray.py:9: in <module>
    from . import computation, groupby, indexing, ops, resample, rolling, utils
.tox/py36/lib/python3.6/site-packages/xarray/core/rolling.py:457: in <module>
    inject_bottleneck_rolling_methods(DataArrayRolling)
.tox/py36/lib/python3.6/site-packages/xarray/core/ops.py:357: in inject_bottleneck_rolling_methods
    f = getattr(bn, bn_name)
E   AttributeError: module 'bottleneck' has no attribute 'move_sum'
------------------------------- Captured stderr --------------------------------
ModuleNotFoundError: No module named 'numpy.core._multiarray_umath'
ModuleNotFoundError: No module named 'numpy.core._multiarray_umath'
ModuleNotFoundError: No module named 'numpy.core._multiarray_umath'
ModuleNotFoundError: No module named 'numpy.core._multiarray_umath'
```

I believe this could be a bug in either xarray or in Travis CI. Help appreciated. :)
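The `numpy.core._multiarray_umath` errors in the captured stderr are the classic symptom of a bottleneck binary built against a different numpy ABI than the one installed. One workaround in that spirit (an editorial sketch of a CI install step, not a fix confirmed by this thread) is to force bottleneck to compile from source against the numpy already in the environment:

```shell
# Hypothetical Travis CI install step: fix the build order so bottleneck
# is compiled against the numpy actually installed in the environment.
pip install --upgrade pip
pip install --upgrade numpy
# --no-cache-dir skips any stale cached wheel; --no-binary forces a source build
pip install --no-cache-dir --no-binary bottleneck bottleneck
pip install xarray
```

Since Travis caches pip wheels between runs, a cached bottleneck wheel can silently outlive the numpy it was built against, which would explain why local tox runs (with fresh installs) pass while CI fails.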

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2664/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue


CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
Powered by Datasette · Queries took 561.281ms · About: xarray-datasette