issue_comments: 1063949669

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/6329#issuecomment-1063949669 https://api.github.com/repos/pydata/xarray/issues/6329 1063949669 IC_kwDOAMm_X84_apVl 9576982 2022-03-10T11:21:18Z 2022-03-10T11:21:18Z NONE

OK, changing to `'r+'` leads to an error that suggests using `'a'` instead:

ValueError: dataset contains non-pre-existing variables ['air'], which is not allowed in ``xarray.Dataset.to_zarr()`` with mode='r+'. To allow writing new variables, set mode='a'.

I have found something that gives me satisfactory results. I still don't know why I have issues in the cloud; I am still investigating, and it may be unrelated. The following script kind of keeps the important stuff, but it is still not very clean, as some of the parameters are not included in the final file. I ended up with the same kind of convoluted approach I was using before. Hopefully it is helpful to someone looking for a real-case example. It definitely clarified things in my head.

```python
import xarray as xr
import rioxarray  # registers the .rio accessor used below
from rasterio.enums import Resampling
import numpy as np
import dask.array as da


def init_coord(ds, X, Y):
    '''To have the geometry right'''
    arr_r = some_processing(ds.isel(time=slice(0, 1)), X, Y)
    return arr_r.x.values, arr_r.y.values


def some_processing(arr, X, Y):
    '''A reprojection routine'''
    arr = arr.rio.write_crs('EPSG:4326')
    arr_r = arr.rio.reproject('EPSG:3857', shape=(Y, X),
                              resampling=Resampling.bilinear, nodata=np.nan)
    return arr_r


filename = 'processed_dataset.zarr'
ds = xr.tutorial.open_dataset('air_temperature')
ds.air.encoding['dtype'] = np.dtype('float32')
X, Y = 250, 250  # size of each final timestep
x, y = init_coord(ds, X, Y)

# Lazy placeholder with the final shape; compute=False writes only the
# metadata and coordinates, not the zeros themselves.
dummy = da.zeros((len(ds.time.values), Y, X))
ds_to_write = xr.Dataset(
    {'air': (('time', 'y', 'x'), dummy)},
    coords={'time': ('time', ds.time.values), 'x': ('x', x), 'y': ('y', y)})
ds_to_write.to_zarr(filename, compute=False, encoding={"time": {"chunks": [1]}})

for i in range(len(ds.time)):
    # some kind of heavy processing
    arr_r = some_processing(ds.isel(time=slice(i, i + 1)), X, Y)
    # Drop variables that do not have the region dimension ('time')
    buff = arr_r.drop_vars(['spatial_ref', 'x', 'y']).chunk({'time': 1, 'x': X, 'y': Y})
    del buff.air.attrs["_FillValue"]
    buff.to_zarr(filename, mode='r+', region={'time': slice(i, i + 1)})
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  1159923690