issues


2 rows where state = "open", type = "issue" and user = 10563614 sorted by updated_at descending

Issue #7093: xarray allows several types for netcdf attributes. Is it expected?

id: 1388326248 · node_id: I_kwDOAMm_X85SwC1o · user: ghislainp (10563614) · state: open · comments: 3 · created_at: 2022-09-27T20:20:46Z · updated_at: 2022-10-04T20:46:32Z · author_association: CONTRIBUTOR

What is your issue?

Xarray is permissive regarding the types of attributes. If a wrong type is used, the error message lists the valid ones: "For serialization to netCDF files, its value must be of one of the following types: str, Number, ndarray, number, list, tuple".

Using a non-iterable type used to raise an exception when reading the saved netCDF file, but this is now fixed by #7085.

The pending question is whether it is valid to save netCDF attributes with a type other than a string. The following lines work (in a notebook):

```python
xr.DataArray([1, 2, 3], attrs={'units': 1}, name='x').to_netcdf("tmp.nc")
!ncdump tmp.nc

xr.DataArray([1, 2, 3], attrs={'units': np.nan}, name='x').to_netcdf("tmp.nc")
!ncdump tmp.nc

xr.DataArray([1, 2, 3], attrs={'units': ['xarray', 'is', 'very', 'permissive']}, name='x').to_netcdf("tmp.nc")
!ncdump tmp.nc
```

On the other hand, the following line raises an error:

```python
xr.DataArray([1, 2, 3], attrs={'units': None}, name='x').to_netcdf("tmp.nc")
```
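As a minimal sketch of how one could pre-check attributes before writing, the helper below mirrors the type list from the error message quoted above. The helper name and the exact type tuple are assumptions for illustration, not part of xarray's API:

```python
import numbers
import numpy as np

# Types taken from xarray's error message quoted above; this tuple is an
# assumption for illustration, not imported from xarray itself.
VALID_ATTR_TYPES = (str, numbers.Number, np.ndarray, np.number, list, tuple)

def invalid_attrs(attrs):
    """Return the keys whose values fall outside the documented type set."""
    return [k for k, v in attrs.items() if not isinstance(v, VALID_ATTR_TYPES)]

print(invalid_attrs({'units': 1, 'history': None}))  # → ['history']
```

Note that under this check `np.nan` passes (it is a float, hence a `Number`), which matches the permissive behaviour shown above, while `None` is flagged.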

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7093/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: xarray (13221727) · type: issue
Issue #4465: combine_by_coords could use allclose instead of equal to compare coordinates

id: 709503596 · node_id: MDU6SXNzdWU3MDk1MDM1OTY= · user: ghislainp (10563614) · state: open · comments: 4 · created_at: 2020-09-26T09:26:05Z · updated_at: 2020-09-26T21:30:35Z · author_association: CONTRIBUTOR

Is your feature request related to a problem? Please describe.

When a coordinate in different datasets / netCDF files has slightly different values, combine_by_coords considers the coordinates to be different and attempts to concatenate them.

Concretely, I produce netCDF files with (lat, lon, time) coordinates annually. Apparently the lat coordinate is not the same in all the files (the difference is about 1e-14), which I suspect is due to different pyproj versions used to produce the lon/lat grid. Reprocessing all the annual netCDF files is not an option. When using open_mfdataset on these files, the lat coordinate is concatenated, which leads to a MemoryError in my case.

Describe the solution you'd like

Two options:

  • add a coord_tolerance argument to xr.combine_by_coords and use np.allclose to compare the coordinates. At line 69 of combine.py, the comparison uses strict equality: "if not all(index.equals(indexes[0]) for index in indexes[1:]):". This does not break compatibility because coord_tolerance=0 would be the default.

  • add an argument to explicitly list the coordinates NOT to concatenate. I tried to play with the coords argument to solve my problem, but was not successful.
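The difference between the strict comparison and the proposed tolerance-based one can be illustrated with a minimal numpy-only sketch; the 1e-14 drift below mimics the situation described above, and the atol value is just an illustrative choice:

```python
import numpy as np

lat_a = np.linspace(-90.0, 90.0, 5)
lat_b = lat_a + 1e-14  # tiny drift, e.g. from a different pyproj version

# Strict equality (what combine_by_coords effectively uses today) sees
# two distinct grids, so the lat coordinate would be concatenated:
print(np.array_equal(lat_a, lat_b))           # → False

# A comparison with a tolerance (what a coord_tolerance argument could
# use internally) treats them as the same grid:
print(np.allclose(lat_a, lat_b, atol=1e-12))  # → True
```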

Describe alternatives you've considered

  • I certainly could find a workaround for this specific case, but I have often had issues with the magic in combine_by_coords, and IMHO giving the user more control would be useful in general.

Additional context

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4465/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: xarray (13221727) · type: issue


CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
Powered by Datasette · About: xarray-datasette