issues

4 rows where state = "closed" and user = 10563614 sorted by updated_at descending

id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1415430795 I_kwDOAMm_X85UXcKL 7188 efficiently set values in a xarray using dask ghislainp 10563614 closed 0     1 2022-10-19T18:44:44Z 2023-11-06T06:07:08Z 2023-11-06T06:07:08Z CONTRIBUTOR      

What is your issue?

I have a fairly large dataset (data) with three dimensions, band=21, y=5000, x=5000, and I want to set the values of a few bands at some points (x, y) given by a boolean mask. The chunk size is band=1, y=16, x=5000. I have 4 workers with 4 GB of memory each and 1 thread per worker. The most compact form I found is this one:

band = dict(band=[17, 18, 19, 20])
data['somevar'].loc[band] = data['somevar'].loc[band].where(~points, some_complex_calculation)

points and some_complex_calculation are DataArrays with the same shape as data (in fact, points is only a DataArray over x, y). They typically have a HighLevelGraph with 106 layers and 142610 keys across all layers. These arrays depend on data, and data itself has a HighLevelGraph with a hundred layers. I cannot use "compute()" because it blows up the memory; I want to call data.to_zarr directly to exploit the chunks. Unfortunately, this calculation blocks the workers, which end up being killed.

I tried many forms, and I found this one:

for b in [17, 18, 19, 20]:
    data['somevar'] = data['somevar'].where(~((snow.band == b) & ipoints), some_complex_calculation)

It works, but it is very inefficient and I find it difficult to read.

It seems that my objective is quite simple, set a few values in a large dataset at a given dimension, and this dimension is outer and has chunksize=1. It seems very easy from a C / Fortran perspective.

Do you have any suggestions on how to perform such operations?
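
For illustration only, the loop over bands above can be collapsed into a single `where` call by building the band mask with `isin` and letting xarray broadcast it against the (y, x) mask. This is a minimal sketch on a tiny in-memory array, not a tested answer to the memory problem described here: the sizes, the band numbers, and the names `bands_to_set` and `replacement` (standing in for some_complex_calculation) are assumptions. The same expression stays lazy when the arrays are dask-backed.

```python
import numpy as np
import xarray as xr

# Tiny stand-in for the large dataset (hypothetical sizes and values).
nband, ny, nx = 6, 4, 5
data = xr.Dataset(
    {"somevar": (("band", "y", "x"), np.zeros((nband, ny, nx)))},
    coords={"band": np.arange(nband)},
)

# Boolean mask over (y, x) only, as in the issue.
points = xr.DataArray(
    np.arange(ny * nx).reshape(ny, nx) % 2 == 0, dims=("y", "x")
)

# Placeholder for some_complex_calculation (same shape as the variable).
replacement = xr.full_like(data["somevar"], 99.0)

# Bands to overwrite (hypothetical values for this small example).
bands_to_set = [3, 4]

# 1-D mask along band, broadcast against the 2-D points mask to (band, y, x).
mask = data["band"].isin(bands_to_set) & points

# Keep the original values where the mask is False, replace elsewhere.
updated = data["somevar"].where(~mask, replacement)
```

Only the selected bands change, and only at the masked points; every other value is passed through untouched.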

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7188/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  not_planned xarray 13221727 issue
1386596170 PR_kwDOAMm_X84_olQw 7085 solve a bug when the units attribute is not a string ghislainp 10563614 closed 0     2 2022-09-26T19:27:08Z 2022-09-28T19:13:11Z 2022-09-28T19:13:11Z CONTRIBUTOR   0 pydata/xarray/pulls/7085
  • [ ] Closes #xxxx
  • [x] Tests added
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [ ] New functions/methods are listed in api.rst

A colleague of mine and I ran into a sort of bug. It seems to be legal to set a numeric value for the units attribute in an xarray object or a netCDF file. xarray accepts saving such an array to netCDF: xr.DataArray([1, 2, 3], attrs={'units': 1}, name='x').to_netcdf('tmp.nc'). Reading this netCDF file with xarray.open_dataset raises an error.

It is unlikely to have a scalar for the units, but it did happen to us (the value was NaN), and it raised an exception that was very difficult to understand.

The exception is raised because "since" in attrs["units"] is evaluated in two places in the xarray codebase (in coding_times.py and in conventions.py) without checking the type of the attribute.

This PR solves this improbable bug.
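
The kind of type check described above can be sketched as follows. The helper name `is_time_units` is hypothetical and is not taken from the xarray codebase; it only illustrates guarding the `"since" in ...` test with an `isinstance` check so that numeric or NaN units no longer raise.

```python
def is_time_units(units):
    """Return True only for string units such as 'days since 2000-01-01'.

    Hypothetical helper: non-string values (int, float, NaN, None) are
    treated as "not a time unit" instead of raising a TypeError.
    """
    return isinstance(units, str) and "since" in units


# Without the guard, the membership test fails on a numeric attribute.
try:
    "since" in 1
    unguarded_raises = False
except TypeError:
    unguarded_raises = True

checks = {
    "string with since": is_time_units("days since 2000-01-01"),
    "plain string": is_time_units("m"),
    "integer": is_time_units(1),
    "nan": is_time_units(float("nan")),
}
```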

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7085/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
705182835 MDExOlB1bGxSZXF1ZXN0NDg5OTU2NDQ2 4442 Fix DataArray.to_dataframe when the array has MultiIndex ghislainp 10563614 closed 0     4 2020-09-20T20:45:12Z 2021-02-20T00:08:42Z 2021-02-20T00:08:42Z CONTRIBUTOR   0 pydata/xarray/pulls/4442
  • [X] Closes #3008
  • [x] Tests added
  • [x] Passes isort . && black . && mypy . && flake8
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4442/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
657466413 MDU6SXNzdWU2NTc0NjY0MTM= 4228 to_dataframe: no valid index for a 0-dimensional object ghislainp 10563614 closed 0     5 2020-07-15T15:58:43Z 2020-10-26T08:42:35Z 2020-10-26T08:42:35Z CONTRIBUTOR      

What happened: xr.DataArray([1], coords=[('onecoord', [2])]).sel(onecoord=2).to_dataframe(name='name') raises an exception: ValueError: no valid index for a 0-dimensional object

What you expected to happen:

the same behavior as: xr.DataArray([1], coords=[('onecoord', [2])]).to_dataframe(name='name')

Anything else we need to know?:

I see that the array no longer has any "dims" after the selection, and this is what causes the error. But it still has one entry in "coords", which is confusing. Is there any documentation about this difference?
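
To make the difference concrete, the sketch below contrasts selecting with a scalar label, which drops the dimension and leaves onecoord as a scalar coordinate, with selecting with a one-element list, which keeps the dimension so the result can be converted to a DataFrame. It is a possible workaround under those assumptions, not a statement of how the issue was resolved upstream. The variable names are illustrative.

```python
import xarray as xr

arr = xr.DataArray([1], coords=[("onecoord", [2])])

# Scalar label: the dimension is dropped, the coordinate becomes a scalar.
scalar = arr.sel(onecoord=2)
scalar_dims = scalar.dims                        # no dimensions left
still_has_coord = "onecoord" in scalar.coords    # the coordinate survives

# One-element list: the dimension is kept, so an index exists.
kept = arr.sel(onecoord=[2])
df = kept.to_dataframe(name="name")

# The dropped dimension can also be restored from the scalar coordinate.
restored = scalar.expand_dims("onecoord")
df_restored = restored.to_dataframe(name="name")
```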

Environment:

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.6 | packaged by conda-forge | (default, Jun 1 2020, 18:57:50) [GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 4.19.0-9-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.7.4

xarray: 0.15.1
pandas: 1.0.4
numpy: 1.18.5
scipy: 1.4.1
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: 2.4.0
cftime: 1.1.3
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2.18.1
distributed: 2.18.0
matplotlib: 3.2.1
cartopy: None
seaborn: 0.10.1
numbagg: None
setuptools: 47.3.1.post20200616
pip: 20.1.1
conda: 4.8.3
pytest: 5.4.3
IPython: 7.15.0
sphinx: 3.1.1
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4228/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
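
As a rough illustration of how the filter shown at the top of this page maps onto the schema, the sketch below rebuilds a reduced version of the issues table in an in-memory SQLite database and runs the same query (state = "closed", user = 10563614, sorted by updated_at descending). Only a subset of the columns is kept, and the rows are copied from the four records listed above; the reduced column list is an assumption made to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced version of the issues table (subset of columns, no foreign keys).
conn.execute(
    """
    CREATE TABLE [issues] (
        [id] INTEGER PRIMARY KEY,
        [number] INTEGER,
        [user] INTEGER,
        [state] TEXT,
        [updated_at] TEXT,
        [type] TEXT
    )
    """
)
conn.execute("CREATE INDEX [idx_issues_user] ON [issues] ([user])")

# The four rows from this page, inserted in arbitrary order.
rows = [
    (657466413, 4228, 10563614, "closed", "2020-10-26T08:42:35Z", "issue"),
    (1415430795, 7188, 10563614, "closed", "2023-11-06T06:07:08Z", "issue"),
    (705182835, 4442, 10563614, "closed", "2021-02-20T00:08:42Z", "pull"),
    (1386596170, 7085, 10563614, "closed", "2022-09-28T19:13:11Z", "pull"),
]
conn.executemany("INSERT INTO [issues] VALUES (?, ?, ?, ?, ?, ?)", rows)

# ISO 8601 timestamps sort correctly as plain text.
query = """
    SELECT [number], [type], [updated_at]
    FROM [issues]
    WHERE [state] = :state AND [user] = :user
    ORDER BY [updated_at] DESC
"""
result = conn.execute(query, {"state": "closed", "user": 10563614}).fetchall()
numbers = [row[0] for row in result]
```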
Powered by Datasette · Queries took 488.337ms · About: xarray-datasette