home / github / issues

Menu
  • Search all tables
  • GraphQL API

issues: 1415430795

This data as json

id node_id number title user state locked assignee milestone comments created_at updated_at closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1415430795 I_kwDOAMm_X85UXcKL 7188 efficiently set values in a xarray using dask 10563614 closed 0     1 2022-10-19T18:44:44Z 2023-11-06T06:07:08Z 2023-11-06T06:07:08Z CONTRIBUTOR      

What is your issue?

I have a quite dataset (data) with three coords band=21, y = 5000, x=5000, and I want to set the value for a few bands in some points (x, y) given by a boolean dataset. The chunk size is band=1, y=16, x = 5000. My memory is 4Gb per worker and I've 4 workers, 1 thread per worker. The most compact form I found is this one:

band = dict(band=[17, 18, 19, 20]) data['somevar'].loc[band] = data['somevar'].loc[band].where(~points, some_complex_calculation)

points and some_complex_calculation are DataArray's with the same shape as data (in fact points is only a DataArray of x,y), they typically have a HighLevelGraph with 106 layers and 142610 keys from all layers. These datasets depend on data. data also has a HighLevelGraph with hundred layers. I can not use "compute()", this blow up the memory, I want directly to use data.to_zarr to exploit the chunks. Unfortunately, this calculation blocks the workers, which end up to be killed.

I tried many forms, and I found this one:

for b in [17, 18, 19, 20]: data['somevar'] = data['somevar'].where(~((snow.band == b) & ipoints), some_complex_calculation)

it works! but its is very inefficient and I found it difficult to read.

It seems that my objective is quite simple, set a few values in a large dataset at a given dimension, and this dimension is outer and has chunksize=1. It seems very easy from a C / Fortran perspective.

Do you have any suggestion how to peform such operations ?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7188/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  not_planned 13221727 issue

Links from other tables

  • 1 row from issues_id in issues_labels
  • 0 rows from issue in issue_comments
Powered by Datasette · Queries took 160.915ms · About: xarray-datasette