issues

5 rows where type = "issue" and user = 13837821 sorted by updated_at descending

id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
689384366 MDU6SXNzdWU2ODkzODQzNjY= 4393 Dimension attrs lost when creating new variable with that dimension dnowacki-usgs 13837821 closed 0     2 2020-08-31T17:52:01Z 2020-09-10T17:37:07Z 2020-09-10T17:37:07Z CONTRIBUTOR      

What happened: When creating a new variable based on an existing dimension, the attrs of the dimension are lost.

What you expected to happen: The attrs should be preserved.

Minimal Complete Verifiable Example:

```python
import xarray as xr

ds = xr.Dataset()
ds['x'] = xr.DataArray(range(10), dims='x')
ds['y'] = xr.DataArray(range(len(ds['x'])), dims='x')
ds['x'].attrs['foo'] = 'bar'
print(ds['x'])  # attrs of ds['x'] are preserved

print('\n****\n')

ds = xr.Dataset()
ds['x'] = xr.DataArray(range(10), dims='x')
ds['x'].attrs['foo'] = 'bar'
ds['y'] = xr.DataArray(range(len(ds['x'])), dims='x')
print(ds['x'])  # attrs of ds['x'] are lost
```

Output of above code:

```
<xarray.DataArray 'x' (x: 10)>
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
Coordinates:
  * x        (x) int64 0 1 2 3 4 5 6 7 8 9
Attributes:
    foo:      bar

****

<xarray.DataArray 'x' (x: 10)>
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
Coordinates:
  * x        (x) int64 0 1 2 3 4 5 6 7 8 9
```

Environment:

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
libhdf5: 1.10.5
libnetcdf: 4.7.4
xarray: 0.16.0
pandas: 1.0.3
numpy: 1.19.1
scipy: 1.3.1
netCDF4: 1.5.3
pydap: None
h5netcdf: 0.8.0
h5py: 2.10.0
Nio: None
zarr: None
cftime: 1.2.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.1.3
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2.18.1
distributed: 2.25.0
matplotlib: 3.2.1
cartopy: 0.18.0
seaborn: 0.10.0
numbagg: None
pint: 0.15
setuptools: 49.6.0.post20200814
pip: 19.2.2
conda: 4.8.4
pytest: 5.4.1
IPython: 7.14.0
sphinx: None None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4393/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
355698213 MDU6SXNzdWUzNTU2OTgyMTM= 2392 Improving interpolate_na()'s limit argument dnowacki-usgs 13837821 closed 0     0 2018-08-30T18:16:07Z 2019-11-15T14:53:17Z 2019-11-15T14:53:17Z CONTRIBUTOR      

I've been working with some time-series data with occasional nans peppered throughout. I want to interpolate small gaps of nans (say, when there is a single isolated nan or perhaps a block of two) but leave larger blocks as nans. That is, it's not appropriate to fill large gaps, but it is acceptable to do so for small gaps.

I was hoping interpolate_na() with the limit argument would do exactly this, but it turns out that if you specify, say, limit=2, it will fill the first two nans of every nan-block, no matter how long the block is. There are definitely solutions for dealing with this, but it seems like a common issue, and has cropped up over on Pandas as well.

I'm not able to attempt tackling this right now, but I guess I wanted to put in a feature request for an additional argument to interpolate_na() that would do this.
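
To make the requested behavior concrete, here is a minimal pure-Python sketch of gap-limited linear interpolation. It only illustrates the semantics described above and is not xarray's or pandas' implementation; the function name `fill_small_gaps` and the parameter `max_gap` are hypothetical.

```python
import math


def fill_small_gaps(values, max_gap):
    """Linearly interpolate runs of NaN of length <= max_gap; leave longer runs as NaN.

    Hypothetical helper for illustration only (not part of xarray or pandas).
    """
    out = list(values)
    n = len(out)
    i = 0
    while i < n:
        if not math.isnan(out[i]):
            i += 1
            continue
        # Locate the current run of NaNs, covering indices [start, end).
        start = i
        while i < n and math.isnan(out[i]):
            i += 1
        end = i
        gap = end - start
        # Fill only short gaps that have valid neighbors on both sides.
        if gap <= max_gap and start > 0 and end < n:
            left, right = out[start - 1], out[end]
            for k in range(start, end):
                frac = (k - start + 1) / (gap + 1)
                out[k] = left + frac * (right - left)
    return out


nan = float("nan")
series = [0.0, nan, 2.0, nan, nan, nan, 6.0, nan, nan, 9.0]
filled = fill_small_gaps(series, max_gap=2)
print(filled)  # the 1- and 2-element gaps are filled; the 3-element gap stays NaN
```

The key difference from limit=2 as described above is that the decision is made per block: a block is either filled completely or left untouched, so long gaps are never partially filled.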

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2392/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
507524966 MDU6SXNzdWU1MDc1MjQ5NjY= 3404 groupby_bins raises ufunc 'isnan' error on 0.14.0 dnowacki-usgs 13837821 closed 0     1 2019-10-15T23:02:34Z 2019-10-17T21:13:45Z 2019-10-17T21:13:45Z CONTRIBUTOR      

I recently upgraded to xarray 0.14.0. Code that used to work in 0.13 now raises TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''. The error occurs in 0.14 when running code similar to the MCVE below. The code should return the GroupBy bins instead of raising the error.

MCVE Code Sample

```python
import xarray as xr
import pandas as pd
import numpy as np

ts = pd.date_range(start='2010-08-01', end='2010-08-15', freq='24.8H')

ds = xr.Dataset()
ds['time'] = xr.DataArray(pd.date_range('2010-08-01', '2010-08-15', freq='15min'), dims='time')
ds['val'] = xr.DataArray(np.random.rand(*ds['time'].shape), dims='time')

ds.groupby_bins('time', ts)  # error thrown here
```

Full error details below.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-35-43742bae2c94> in <module>
      9 ds['val'] = xr.DataArray(np.random.rand(*ds['time'].shape), dims='time')
     10
---> 11 ds.groupby_bins('time', ts)

~/miniconda3/lib/python3.7/site-packages/xarray/core/common.py in groupby_bins(self, group, bins, right, labels, precision, include_lowest, squeeze, restore_coord_dims)
    727                 "labels": labels,
    728                 "precision": precision,
--> 729                 "include_lowest": include_lowest,
    730             },
    731         )

~/miniconda3/lib/python3.7/site-packages/xarray/core/groupby.py in __init__(self, obj, group, squeeze, grouper, bins, restore_coord_dims, cut_kwargs)
    322
    323         if bins is not None:
--> 324             if np.isnan(bins).all():
    325                 raise ValueError("All bin edges are NaN.")
    326             binned = pd.cut(group.values, bins, **cut_kwargs)

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
libhdf5: 1.10.4
libnetcdf: 4.6.2
xarray: 0.14.0
pandas: 0.25.1
numpy: 1.17.2
scipy: 1.3.0
netCDF4: 1.5.1.2
pydap: installed
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.2.0
distributed: 2.5.2
matplotlib: 3.1.1
cartopy: 0.17.0
seaborn: 0.9.0
numbagg: None
setuptools: 41.4.0
pip: 19.3
conda: 4.7.12
pytest: 5.1.1
IPython: 7.8.0
sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3404/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
323357664 MDU6SXNzdWUzMjMzNTc2NjQ= 2134 unlimited_dims generates 0-length dimensions named as letters of unlimited dimension dnowacki-usgs 13837821 closed 0     5 2018-05-15T19:47:10Z 2018-05-18T14:48:11Z 2018-05-18T14:48:11Z CONTRIBUTOR      

I'm not sure I understand how the `unlimited_dims` option to `to_netcdf()` is supposed to work. Consider the following:

```python
import pandas as pd
import xarray as xr

ds = xr.Dataset()
ds['time'] = xr.DataArray(pd.date_range('2000-01-01', '2000-01-10'), dims='time')
ds.to_netcdf('timedim.cdf', unlimited_dims='time')
```

This results in a file that looks like this:

```
$ ncdump timedim.cdf
netcdf timedim {
dimensions:
	t = UNLIMITED ; // (0 currently)
	i = UNLIMITED ; // (0 currently)
	m = UNLIMITED ; // (0 currently)
	e = UNLIMITED ; // (0 currently)
	time = UNLIMITED ; // (10 currently)
variables:
	int64 time(time) ;
		time:units = "days since 2000-01-01 00:00:00" ;
		time:calendar = "proleptic_gregorian" ;
data:

 time = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ;
}
```

Note the dimensions named `t`, `i`, `m`, `e`, all with zero length. The `time` dimension (which is the only one that should exist) is properly set to `UNLIMITED`, but we shouldn't have the four extra dimensions. What's going on here? The same behavior occurs when setting via `ds.encoding['unlimited_dims'] = 'time'`. Everything is as expected without the `unlimited_dims` option (but the `time` dimension is not `UNLIMITED`, of course).

I thought it could be related to the variable and dimension having the same name, but this also happens when they are different.
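
One plausible explanation, which this report does not confirm, is that the string `'time'` is being iterated character by character as if it were a collection of dimension names. The sketch below shows that Python behavior and a hypothetical normalizing helper (`normalize_unlimited_dims` is an illustrative name, not an xarray function).

```python
def normalize_unlimited_dims(unlimited_dims):
    """Return dimension names as a list, treating a bare string as a single name.

    Hypothetical helper for illustration; not part of xarray.
    """
    if unlimited_dims is None:
        return []
    if isinstance(unlimited_dims, str):
        return [unlimited_dims]
    return list(unlimited_dims)


# Iterating a bare string yields its characters, which would match the
# stray 't', 'i', 'm', 'e' dimensions seen in the ncdump output.
letters = list("time")
print(letters)  # ['t', 'i', 'm', 'e']

# Wrapping the string first keeps it as one dimension name.
print(normalize_unlimited_dims("time"))  # ['time']
print(normalize_unlimited_dims(["time", "station"]))  # ['time', 'station']
```

If that assumption holds, passing a list such as `unlimited_dims=['time']` may sidestep the extra dimensions.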

Expected Output

There shouldn't be extra 0-length dimensions

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
xarray: 0.10.3
pandas: 0.22.0
numpy: 1.14.3
scipy: 1.0.0
netCDF4: 1.3.1
h5netcdf: 0.5.0
h5py: 2.7.1
Nio: None
zarr: 2.2.0
bottleneck: 1.2.1
cyordereddict: None
dask: 0.16.1
distributed: 1.20.2
matplotlib: 2.2.2
cartopy: 0.16.0
seaborn: None
setuptools: 36.5.0.post20170921
pip: 9.0.1
conda: 4.5.3
pytest: None
IPython: 6.3.1
sphinx: 1.7.1
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2134/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
312633077 MDU6SXNzdWUzMTI2MzMwNzc= 2044 Feature request: writing xarray list-type attributes to netCDF dnowacki-usgs 13837821 closed 0     2 2018-04-09T18:14:33Z 2018-04-17T15:39:34Z 2018-04-17T15:39:34Z CONTRIBUTOR      

Migrated from Stack Overflow.

NetCDF supports the NC_STRING type, which can store arrays of strings in attributes. Xarray already supports reading arrays of strings from attributes in netCDF files, and it would be great if it also supported writing the same.

Reading already works

```python
import xarray as xr
import netCDF4 as nc

rg = nc.Dataset('test_string.nc', 'w', format='NETCDF4')
rg.setncattr_string('testing', ['a', 'b'])
rg.close()
ds = xr.open_dataset('test_string.nc')
print(ds)
```

gives

```
<xarray.Dataset>
Dimensions:  ()
Data variables:
    *empty*
Attributes:
    testing:  ['a', 'b']
```

This works because I used the setncattr_string method. Setting the attributes like rg.testing = ['a', 'b'] does not work and results in a concatenated list (just like the xarray example below).

Writing doesn't work

```python
import xarray as xr

ds = xr.Dataset()
ds.attrs['testing'] = ['a', 'b']
ds.to_netcdf('asdf.nc')
ds = xr.open_dataset('asdf.nc', autoclose=True)
print(ds)
```

gives

```
<xarray.Dataset>
Dimensions:  ()
Data variables:
    *empty*
Attributes:
    testing:  ab
```

Note the list elements have been concatenated. So this is a request for xarray to implement something like netCDF4's setncattr_string. I would be happy to help do this if someone pointed me in the right direction; I looked through the code but got lost pretty quickly. Thanks.
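
Until something like that exists, one possible stopgap is to store list-valued attributes as JSON strings before writing and decode them after reading. This is only a sketch of that idea under stated assumptions: the helpers `encode_list_attrs` and `decode_list_attrs` are hypothetical, operate on plain dicts, and do not produce true NC_STRING array attributes.

```python
import json


def encode_list_attrs(attrs):
    """Return a copy of attrs with list/tuple values serialized to JSON strings.

    Hypothetical helper for illustration; not part of xarray or netCDF4.
    """
    return {
        key: json.dumps(list(value)) if isinstance(value, (list, tuple)) else value
        for key, value in attrs.items()
    }


def decode_list_attrs(attrs):
    """Return a copy of attrs with JSON-array strings turned back into lists."""
    decoded = {}
    for key, value in attrs.items():
        if isinstance(value, str) and value.startswith("["):
            try:
                value = json.loads(value)
            except json.JSONDecodeError:
                pass  # Not valid JSON, so keep the original string.
        decoded[key] = value
    return decoded


attrs = {"testing": ["a", "b"], "title": "demo"}
encoded = encode_list_attrs(attrs)
print(encoded)  # {'testing': '["a", "b"]', 'title': 'demo'}
print(decode_list_attrs(encoded))  # {'testing': ['a', 'b'], 'title': 'demo'}
```

The dicts would be applied to `ds.attrs` before `to_netcdf()` and after `open_dataset()`. A caveat of this approach is that any ordinary string attribute that happens to be a valid JSON array would also be decoded into a list.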

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2044/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue


CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
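
The filter shown at the top of this page (type = "issue", user = 13837821, sorted by updated_at descending) can be reproduced against a table of this shape. The sketch below uses Python's built-in `sqlite3` module with an in-memory database and a reduced set of columns; the exact SQL that Datasette generates is an assumption here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced version of the [issues] schema: only the columns the query needs.
conn.execute(
    """
    CREATE TABLE [issues] (
       [id] INTEGER PRIMARY KEY,
       [number] INTEGER,
       [title] TEXT,
       [user] INTEGER,
       [updated_at] TEXT,
       [type] TEXT
    )
    """
)
conn.execute("CREATE INDEX [idx_issues_user] ON [issues] ([user])")

# The five rows listed on this page.
rows = [
    (689384366, 4393, "Dimension attrs lost when creating new variable with that dimension", 13837821, "2020-09-10T17:37:07Z", "issue"),
    (355698213, 2392, "Improving interpolate_na()'s limit argument", 13837821, "2019-11-15T14:53:17Z", "issue"),
    (507524966, 3404, "groupby_bins raises ufunc 'isnan' error on 0.14.0", 13837821, "2019-10-17T21:13:45Z", "issue"),
    (323357664, 2134, "unlimited_dims generates 0-length dimensions named as letters of unlimited dimension", 13837821, "2018-05-18T14:48:11Z", "issue"),
    (312633077, 2044, "Feature request: writing xarray list-type attributes to netCDF", 13837821, "2018-04-17T15:39:34Z", "issue"),
]
conn.executemany("INSERT INTO [issues] VALUES (?, ?, ?, ?, ?, ?)", rows)

# ISO 8601 timestamps stored as TEXT sort chronologically as plain strings.
query = """
    SELECT [number], [updated_at]
    FROM [issues]
    WHERE [type] = ? AND [user] = ?
    ORDER BY [updated_at] DESC
"""
result = conn.execute(query, ("issue", 13837821)).fetchall()
numbers = [number for number, _ in result]
print(numbers)  # [4393, 2392, 3404, 2134, 2044]
```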