
issues


8 rows where repo = 13221727 and user = 8453445 sorted by updated_at descending


type 2

  • issue 7
  • pull 1

state 2

  • closed 6
  • open 2

repo 1

  • xarray · 8 ✖
id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1290704150 I_kwDOAMm_X85M7pUW 6743 to_zarr(mode='w') does not overwrite correctly/successfully when changing chunk size chiaral 8453445 closed 0     1 2022-06-30T22:36:36Z 2023-11-24T22:14:40Z 2023-11-24T22:14:40Z CONTRIBUTOR      

What happened?

Reproducible example:

```
import xarray as xr
import numpy as np

# Create dataset
data_vars = {'temperature': (['lat', 'lon', 'time'], np.random.rand(400, 800, 1000), {'units': 'Celsius'})}

# define coordinates
coords = {
    'time': (['time'], np.arange(1, 1001)),
    'lat': (['lat'], np.arange(1, 401)),
    'lon': (['lon'], np.arange(1, 801)),
}

# create dataset
ds = xr.Dataset(data_vars=data_vars, coords=coords)

ds = ds.chunk({"lat": 20, "lon": 80, "time": 100})
ds.to_zarr('temperature', mode='w', consolidated=True)
```

This works. Note that `ds.temperature.encoding` is now empty and equal to `{}`.

If I load the data with `ds = xr.open_zarr('temperature')`, the chunk size is correct.

Then if I re-generate the dataset as above, change the chunk size, and overwrite the file, it works:

    ds = xr.Dataset(data_vars=data_vars, coords=coords)
    ds = ds.chunk({"lat": 100, "lon": 80, "time": 100})
    ds.to_zarr('temperature', mode='w', consolidated=True)

When I re-load the zarr files, the chunk sizes are now (100, 80, 100).

However, if I do:

    ds = xr.Dataset(data_vars=data_vars, coords=coords)
    ds = ds.chunk({"lat": 20, "lon": 80, "time": 100})  # 20 for lat
    ds.to_zarr('temperature', mode='w', consolidated=True)
    ds = xr.open_zarr('temperature')
    ds = ds.chunk({"lat": 100, "lon": 80, "time": 100})
    ds.to_zarr('temperature', mode='w', consolidated=True)

when I then re-open the file, the chunk size is still (20, 80, 100).

OK, then maybe it's the encoding. In fact, even if I change the chunk size using `.chunk`, the encoding remains unchanged:

    ds = xr.Dataset(data_vars=data_vars, coords=coords)
    ds = ds.chunk({"lat": 20, "lon": 80, "time": 100})  # 20 for lat
    ds.to_zarr('temperature', mode='w', consolidated=True)
    ds = xr.open_zarr('temperature')
    ds = ds.chunk({"lat": 100, "lon": 80, "time": 100})
    ds.temperature.encoding

gives

    {'chunks': (20, 80, 100), 'preferred_chunks': {'lat': 20, 'lon': 80, 'time': 100}, 'compressor': Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=0), 'filters': None, '_FillValue': nan, 'dtype': dtype('float64')}

but if I print `ds` to the screen it looks right, with chunks (100, 80, 100).

I then tried to:

1) set the encoding to empty: `ds.temperature.encoding = {}`
2) overwrite the encoding with the new chunk values: `ds.temperature.encoding['chunks'] = (100, 80, 100)` and `ds.temperature.encoding['preferred_chunks'] = {'lat': 100, 'lon': 80, 'time': 100}`

When I try either of these two encoding fixes and then try to overwrite the zarr file, I get the error below:

```
ds = xr.Dataset(data_vars=data_vars, coords=coords)
ds = ds.chunk({"lat": 20, "lon": 80, "time": 100})  # 20 for lat
ds.to_zarr('temperature', mode='w', consolidated=True)
ds = xr.open_zarr('temperature')
ds.temperature.encoding = {}
ds = ds.chunk({"lat": 100, "lon": 80, "time": 100})
ds.to_zarr('temperature', mode='w', consolidated=True)

ValueError: destination buffer too small; expected at least 6400000, got 1280000
```

I searched for the error above in the open issues and didn't find anything.
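The two numbers in this error line up with the two chunk shapes used above, assuming 8 bytes per element for the `float64` dtype reported in the encoding. The sketch below only checks that arithmetic; `chunk_nbytes` is a hypothetical helper, not part of xarray or zarr. The match would be consistent with the lazily opened dataset still reading from the store while `mode='w'` is already rewriting it with larger chunks, but that explanation is an assumption and is not confirmed here.

```python
from math import prod


def chunk_nbytes(chunk_shape, itemsize=8):
    """Uncompressed size in bytes of one chunk (itemsize=8 assumes float64)."""
    return prod(chunk_shape) * itemsize


old_chunk = (20, 80, 100)    # chunk shape the store was first written with
new_chunk = (100, 80, 100)   # chunk shape requested before overwriting

print(chunk_nbytes(old_chunk))  # 1280000, the "got" value in the error
print(chunk_nbytes(new_chunk))  # 6400000, the "expected at least" value
```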

What did you expect to happen?

I expected to be able to overwrite the file with whatever new combination of chunk size I want, especially after fixing the encoding.

Minimal Complete Verifiable Example

No response

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [x] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

```Python

ValueError Traceback (most recent call last) Input In [46], in <cell line: 7>() 5 ds.temperature.encoding = {} 6 ds = ds.chunk({"lat": 100, "lon": 80, "time":100}) ----> 7 ds.to_zarr('temperature',mode='w', consolidated=True)

File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/xarray/core/dataset.py:2036, in Dataset.to_zarr(self, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks, storage_options) 2033 if encoding is None: 2034 encoding = {} -> 2036 return to_zarr( 2037 self, 2038 store=store, 2039 chunk_store=chunk_store, 2040 storage_options=storage_options, 2041 mode=mode, 2042 synchronizer=synchronizer, 2043 group=group, 2044 encoding=encoding, 2045 compute=compute, 2046 consolidated=consolidated, 2047 append_dim=append_dim, 2048 region=region, 2049 safe_chunks=safe_chunks, 2050 )

File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/xarray/backends/api.py:1432, in to_zarr(dataset, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks, storage_options) 1430 # TODO: figure out how to properly handle unlimited_dims 1431 dump_to_store(dataset, zstore, writer, encoding=encoding) -> 1432 writes = writer.sync(compute=compute) 1434 if compute: 1435 _finalize_store(writes, zstore)

File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/xarray/backends/common.py:166, in ArrayWriter.sync(self, compute) 160 import dask.array as da 162 # TODO: consider wrapping targets with dask.delayed, if this makes 163 # for any discernible difference in perforance, e.g., 164 # targets = [dask.delayed(t) for t in self.targets] --> 166 delayed_store = da.store( 167 self.sources, 168 self.targets, 169 lock=self.lock, 170 compute=compute, 171 flush=True, 172 regions=self.regions, 173 ) 174 self.sources = [] 175 self.targets = []

File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/dask/array/core.py:1223, in store(failed resolving arguments) 1221 elif compute: 1222 store_dsk = HighLevelGraph(layers, dependencies) -> 1223 compute_as_if_collection(Array, store_dsk, map_keys, **kwargs) 1224 return None 1226 else:

File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/dask/base.py:344, in compute_as_if_collection(cls, dsk, keys, scheduler, get, kwargs) 341 # see https://github.com/dask/dask/issues/8991. 342 # This merge should be removed once the underlying issue is fixed. 343 dsk2 = HighLevelGraph.merge(dsk2) --> 344 return schedule(dsk2, keys, kwargs)

File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/dask/threaded.py:81, in get(dsk, result, cache, num_workers, pool, kwargs) 78 elif isinstance(pool, multiprocessing.pool.Pool): 79 pool = MultiprocessingPoolExecutor(pool) ---> 81 results = get_async( 82 pool.submit, 83 pool._max_workers, 84 dsk, 85 result, 86 cache=cache, 87 get_id=_thread_get_id, 88 pack_exception=pack_exception, 89 kwargs, 90 ) 92 # Cleanup pools associated to dead threads 93 with pools_lock:

File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/dask/local.py:508, in get_async(submit, num_workers, dsk, result, cache, get_id, rerun_exceptions_locally, pack_exception, raise_exception, callbacks, dumps, loads, chunksize, **kwargs) 506 _execute_task(task, data) # Re-execute locally 507 else: --> 508 raise_exception(exc, tb) 509 res, worker_id = loads(res_info) 510 state["cache"][key] = res

File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/dask/local.py:316, in reraise(exc, tb) 314 if exc.traceback is not tb: 315 raise exc.with_traceback(tb) --> 316 raise exc

File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/dask/local.py:221, in execute_task(key, task_info, dumps, loads, get_id, pack_exception) 219 try: 220 task, data = loads(task_info) --> 221 result = _execute_task(task, data) 222 id = get_id() 223 result = dumps((result, id))

File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/dask/core.py:119, in _execute_task(arg, cache, dsk) 115 func, args = arg[0], arg[1:] 116 # Note: Don't assign the subtask results to a variable. numpy detects 117 # temporaries by their reference count and can execute certain 118 # operations in-place. --> 119 return func(*(_execute_task(a, cache) for a in args)) 120 elif not ishashable(arg): 121 return arg

File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/dask/array/core.py:122, in getter(a, b, asarray, lock) 117 # Below we special-case np.matrix to force a conversion to 118 # np.ndarray and preserve original Dask behavior for getter, 119 # as for all purposes np.matrix is array-like and thus 120 # is_arraylike evaluates to True in that case. 121 if asarray and (not is_arraylike(c) or isinstance(c, np.matrix)): --> 122 c = np.asarray(c) 123 finally: 124 if lock:

File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/xarray/core/indexing.py:358, in ImplicitToExplicitIndexingAdapter.array(self, dtype) 357 def array(self, dtype=None): --> 358 return np.asarray(self.array, dtype=dtype)

File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/xarray/core/indexing.py:522, in CopyOnWriteArray.array(self, dtype) 521 def array(self, dtype=None): --> 522 return np.asarray(self.array, dtype=dtype)

File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/xarray/core/indexing.py:423, in LazilyIndexedArray.array(self, dtype) 421 def array(self, dtype=None): 422 array = as_indexable(self.array) --> 423 return np.asarray(array[self.key], dtype=None)

File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/xarray/backends/zarr.py:73, in ZarrArrayWrapper.getitem(self, key) 71 array = self.get_array() 72 if isinstance(key, indexing.BasicIndexer): ---> 73 return array[key.tuple] 74 elif isinstance(key, indexing.VectorizedIndexer): 75 return array.vindex[ 76 indexing._arrayize_vectorized_indexer(key, self.shape).tuple 77 ]

File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/zarr/core.py:788, in Array.getitem(self, selection) 786 result = self.vindex[selection] 787 else: --> 788 result = self.get_basic_selection(pure_selection, fields=fields) 789 return result

File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/zarr/core.py:914, in Array.get_basic_selection(self, selection, out, fields) 911 return self._get_basic_selection_zd(selection=selection, out=out, 912 fields=fields) 913 else: --> 914 return self._get_basic_selection_nd(selection=selection, out=out, 915 fields=fields)

File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/zarr/core.py:957, in Array._get_basic_selection_nd(self, selection, out, fields) 951 def _get_basic_selection_nd(self, selection, out=None, fields=None): 952 # implementation of basic selection for array with at least one dimension 953 954 # setup indexer 955 indexer = BasicIndexer(selection, self) --> 957 return self._get_selection(indexer=indexer, out=out, fields=fields)

File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/zarr/core.py:1247, in Array._get_selection(self, indexer, out, fields) 1241 if not hasattr(self.chunk_store, "getitems") or \ 1242 any(map(lambda x: x == 0, self.shape)): 1243 # sequentially get one key at a time from storage 1244 for chunk_coords, chunk_selection, out_selection in indexer: 1245 1246 # load chunk selection into output array -> 1247 self._chunk_getitem(chunk_coords, chunk_selection, out, out_selection, 1248 drop_axes=indexer.drop_axes, fields=fields) 1249 else: 1250 # allow storage to get multiple items at once 1251 lchunk_coords, lchunk_selection, lout_selection = zip(*indexer)

File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/zarr/core.py:1951, in Array._chunk_getitem(self, chunk_coords, chunk_selection, out, out_selection, drop_axes, fields) 1948 out[out_selection] = fill_value 1950 else: -> 1951 self._process_chunk(out, cdata, chunk_selection, drop_axes, 1952 out_is_ndarray, fields, out_selection)

File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/zarr/core.py:1859, in Array._process_chunk(self, out, cdata, chunk_selection, drop_axes, out_is_ndarray, fields, out_selection, partial_read_decode) 1857 if isinstance(cdata, PartialReadBuffer): 1858 cdata = cdata.read_full() -> 1859 self._compressor.decode(cdata, dest) 1860 else: 1861 chunk = ensure_ndarray(cdata).view(self._dtype)

File numcodecs/blosc.pyx:562, in numcodecs.blosc.Blosc.decode()

File numcodecs/blosc.pyx:371, in numcodecs.blosc.decompress()

ValueError: destination buffer too small; expected at least 6400000, got 1280000
```

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS
------------------
commit: None
python: 3.10.5 | packaged by conda-forge | (main, Jun 14 2022, 07:07:06) [Clang 13.0.1 ]
python-bits: 64
OS: Darwin
OS-release: 21.5.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1
xarray: 2022.3.0
pandas: 1.4.3
numpy: 1.22.4
scipy: 1.8.1
netCDF4: 1.5.8
pydap: None
h5netcdf: 1.0.0
h5py: 3.6.0
Nio: None
zarr: 2.12.0
cftime: 1.6.0
nc_time_axis: 1.4.1
PseudoNetCDF: None
rasterio: 1.2.10
cfgrib: 0.9.10.1
iris: None
bottleneck: 1.3.4
dask: 2022.6.0
distributed: 2022.6.0
matplotlib: 3.5.2
cartopy: 0.20.2
seaborn: 0.11.2
numbagg: None
fsspec: 2022.5.0
cupy: None
pint: 0.19.2
sparse: 0.13.0
setuptools: 62.6.0
pip: 22.1.2
conda: None
pytest: None
IPython: 8.4.0
sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6743/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1030768250 I_kwDOAMm_X849cEZ6 5877 Rolling() gives values different from pd.rolling() chiaral 8453445 open 0     4 2021-10-19T21:41:42Z 2022-04-09T01:29:07Z   CONTRIBUTOR      

I am not sure this is a bug - but it clearly doesn't give the results the user would expect.

The rolling sum of zeros gives me values that are not zeros

```python
import numpy as np
import xarray as xr

var = np.array([0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.31 , 0.91999996, 8.3 , 1.42 , 0.03 , 1.22 , 0.09999999, 0.14 , 0.13 , 0. , 0.12 , 0.03 , 2.53 , 0. , 0.19999999, 0.19999999, 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ], dtype='float32')

timet = np.array([ 43200000000000, 129600000000000, 216000000000000, 302400000000000, 388800000000000, 475200000000000, 561600000000000, 648000000000000, 734400000000000, 820800000000000, 907200000000000, 993600000000000, 1080000000000000, 1166400000000000, 1252800000000000, 1339200000000000, 1425600000000000, 1512000000000000, 1598400000000000, 1684800000000000, 1771200000000000, 1857600000000000, 1944000000000000, 2030400000000000, 2116800000000000, 2203200000000000, 2289600000000000, 2376000000000000, 2462400000000000, 2548800000000000, 2635200000000000, 2721600000000000, 2808000000000000, 2894400000000000, 2980800000000000], dtype='timedelta64[ns]')

ds_ex = xr.Dataset(data_vars=dict( pr=(["time"], var), ), coords=dict( time=("time", timet) ), )

ds_ex.rolling(time=3).sum().pr.values

```

It gives me this result:

array([ nan, nan, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 3.1000000e-01, 1.2300000e+00, 9.5300007e+00, 1.0640000e+01, 9.7500000e+00, 2.6700001e+00, 1.3500001e+00, 1.4600002e+00, 3.7000012e-01, 2.7000013e-01, 2.5000012e-01, 1.5000013e-01, 2.6800001e+00, 2.5600002e+00, 2.7300003e+00, 4.0000033e-01, 4.0000033e-01, 2.0000035e-01, 3.5762787e-07, 3.5762787e-07, 3.5762787e-07, 3.5762787e-07, 3.5762787e-07, 3.5762787e-07, 3.5762787e-07, 3.5762787e-07, 3.5762787e-07, 3.5762787e-07], dtype=float32)

Note the non-zero values. The non-zero value changes depending on whether I use float64 or float32 as the precision of my data, so this seems to be a precision-related issue (although the first values are correctly set to zero). In fact, other sums are not exactly what they should be either.

The small difference at the 8th/9th decimal position can be expected due to precision, but the fact that the 0s become non zeros is problematic imho, especially if not documented. Oftentimes zero in geoscience data can mean a very specific thing (i.e. zero rainfall will be characterized differently than non-zero).

In pandas this works instead:

    df_ex = ds_ex.to_dataframe()
    df_ex.rolling(window=3).sum().values.T

gives me

array([[ nan, nan, 0. , 0. , 0. , 0. , 0. , 0.31 , 1.22999996, 9.53000015, 10.6400001 , 9.75000015, 2.66999999, 1.35000001, 1.46000002, 0.36999998, 0.27 , 0.24999999, 0.15 , 2.67999997, 2.55999997, 2.72999996, 0.39999998, 0.39999998, 0.19999999, 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ]])

What you expected to happen:

the sum of zeros should be zero. If this cannot be achieved/expected because of precision issues, it should be documented.
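One possible explanation, stated here as an assumption rather than something confirmed in this report: a rolling sum that keeps a running accumulator (add the value entering the window, subtract the value leaving it) can carry rounding error forward into windows that contain only zeros, whereas re-summing each window from scratch cannot. The pure-Python sketch below contrasts the two strategies. Both function names are hypothetical, neither is the xarray or pandas implementation, and Python floats are float64, so any residual will differ from the float32 values shown above.

```python
import math


def rolling_sum_running(values, window):
    """Rolling sum with a running accumulator (add incoming, subtract outgoing)."""
    out = [math.nan] * len(values)
    acc = 0.0
    for i, v in enumerate(values):
        acc += v
        if i >= window:
            acc -= values[i - window]  # value that just left the window
        if i >= window - 1:
            out[i] = acc
    return out


def rolling_sum_recomputed(values, window):
    """Rolling sum that re-sums every window independently."""
    out = [math.nan] * len(values)
    for i in range(window - 1, len(values)):
        out[i] = math.fsum(values[i - window + 1 : i + 1])
    return out


pr = [0.0, 0.0, 0.0, 0.31, 0.92, 8.3, 1.42, 0.03, 0.0, 0.0, 0.0, 0.0]
running = rolling_sum_running(pr, window=3)
recomputed = rolling_sum_recomputed(pr, window=3)

# The last window holds only zeros: the recomputed sum is exactly 0.0,
# while the running sum may keep a tiny leftover from earlier windows.
print(running[-1], recomputed[-1])
```

If the leftover matters (for example, to tell zero rainfall from non-zero rainfall), re-summing per window or snapping values below a small tolerance to zero are two options to consider.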

Anything else we need to know?:

I discovered this behavior in my old environments, but I created a new ad hoc environment with the latest versions, and it does the same thing.

Environment:

INSTALLED VERSIONS

commit: None
python: 3.9.7 (default, Sep 16 2021, 08:50:36) [Clang 10.0.0 ]
python-bits: 64
OS: Darwin
OS-release: 17.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 0.19.0
pandas: 1.3.3
numpy: 1.21.2
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 58.0.4
pip: 21.2.4
conda: None
pytest: None
IPython: 7.28.0
sphinx: None

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5877/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
183713222 MDU6SXNzdWUxODM3MTMyMjI= 1050 xarray.Dataset.var - xarray.DataArray.var - does it have ddof=1 parameter? chiaral 8453445 closed 0     4 2016-10-18T15:03:53Z 2022-03-12T08:17:48Z 2022-03-12T08:17:48Z CONTRIBUTOR      

It is not clear from the description whether ddof = 1 is available and/or if it is set to 0. (https://docs.scipy.org/doc/numpy-1.6.0/reference/generated/numpy.var.html)

For large samples, 1 or 0 doesn't make a lot of difference, but it would be good to know whether it uses N-1 or N.
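For reference, `ddof` ("delta degrees of freedom") in the linked `numpy.var` documentation changes the divisor from N to N - ddof, and `numpy.var` itself defaults to `ddof=0`. Whether xarray's `var` accepts the same argument is the open question here. The sketch below only illustrates the arithmetic, using a hypothetical `variance` helper and made-up sample data.

```python
def variance(values, ddof=0):
    """Variance with divisor N - ddof (ddof=0: population, ddof=1: sample)."""
    n = len(values)
    mean = sum(values) / n
    squared_dev = sum((v - mean) ** 2 for v in values)
    return squared_dev / (n - ddof)


sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up data, mean 5.0

print(variance(sample, ddof=0))  # 4.0, divides by N = 8
print(variance(sample, ddof=1))  # 32 / 7, divides by N - 1 = 7
```

With only 8 values the two results differ by about 14%; the gap shrinks as N grows, which is why the choice matters mostly for small samples.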

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1050/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
875696070 MDU6SXNzdWU4NzU2OTYwNzA= 5257 Inconsistencies between 0.17.0 and 0.17.1.dev102+gf455e00f chiaral 8453445 closed 0     3 2021-05-04T17:52:08Z 2021-05-04T20:02:00Z 2021-05-04T20:02:00Z CONTRIBUTOR      

Download file: !wget https://noaa-gefs-retrospective.s3.amazonaws.com/GEFSv12/reforecast/2000/2000012900/c00/Days%3A1-10/acpcp_sfc_2000012900_c00.grib2

I work on https://staging.us-central1-b.gcp.pangeo.io/

xarray: 0.17.0 cfgrib: 0.9.9.0

and when I try:

ds = xr.open_dataset("acpcp_sfc_2000012900_c00.grib2", engine="cfgrib", backend_kwargs={"extra_coords": {"stepRange": "step"}})

it works. The keyword argument 'extra_coords' is new in cfgrib 0.9.9.0.

After installing the latest xarray version from master (when I do !pip install git+https://github.com/pangeo-forge/pangeo-forge.git) the version becomes xarray '0.17.1.dev102+gf455e00f' cfgrib remains the same

when I attempt the same loading: ds = xr.open_dataset("acpcp_sfc_2000012900_c00.grib2", engine="cfgrib", backend_kwargs={"extra_coords": {"stepRange": "step"}})

I get

```python
TypeError                                 Traceback (most recent call last)
<ipython-input-5-6673e5f2812b> in <module>
----> 1 ds = xr.open_dataset("acpcp_sfc_2000012900_c00.grib2", engine="cfgrib", backend_kwargs={"extra_coords": {"stepRange": "step"}})

/srv/conda/envs/notebook/lib/python3.8/site-packages/xarray/backends/api.py in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, backend_kwargs, *args, **kwargs)
    499
    500     overwrite_encoded_chunks = kwargs.pop("overwrite_encoded_chunks", None)
--> 501     backend_ds = backend.open_dataset(
    502         filename_or_obj,
    503         drop_variables=drop_variables,

TypeError: open_dataset() got an unexpected keyword argument 'extra_coords'
```

What you expected to happen:

I expect '0.17.1.dev102+gf455e00f' to work as '0.17.0'
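The traceback suggests that the entries of `backend_kwargs` end up forwarded as keyword arguments to a backend function whose signature does not list `extra_coords`. A minimal pure-Python sketch of that failure mode follows. `open_dataset`, `backend_open_dataset`, and their parameters are hypothetical stand-ins here, not the actual xarray or cfgrib code.

```python
def backend_open_dataset(filename, drop_variables=None):
    """Hypothetical backend entry point with a fixed set of keyword arguments."""
    return {"filename": filename, "drop_variables": drop_variables}


def open_dataset(filename, backend_kwargs=None):
    """Hypothetical front end that unpacks backend_kwargs into the backend call."""
    backend_kwargs = backend_kwargs or {}
    return backend_open_dataset(filename, **backend_kwargs)


try:
    open_dataset("example.grib2", backend_kwargs={"extra_coords": {"stepRange": "step"}})
except TypeError as err:
    message = str(err)
    print(message)  # names the unexpected keyword argument 'extra_coords'
```

A keyword that the backend function does declare (here `drop_variables`) passes through without error, so the difference between versions may come down to which keywords the backend entry point accepts.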

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5257/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
783630055 MDU6SXNzdWU3ODM2MzAwNTU= 4793 More advanced tutorial on how to manipulate facetgrid chiaral 8453445 open 0     3 2021-01-11T19:17:12Z 2021-01-11T22:37:16Z   CONTRIBUTOR      

Is your feature request related to a problem? Please describe.

I have explored a bit the object returned by faceting a plot (usually called `p` in the tutorial). It clearly stores tons of stuff that can be manipulated to make the plots more flexible.

I have an example here which I was planning to add somewhere to the tutorial for plotting.

Would this be of interest? Or not, since it makes use of, e.g., matplotlib methods?

This issue is also intended to call for people that might have been playing with obscure attributes/method/whatever stored in p and have come out with some interesting manipulation. xarray faceting is very powerful, imho, and it is a great starting point for more complicated figures.

For example, in my notebook linked above, I add some axes to the side of the facetgrid to add a meridional average, and it used to take me a while to match the location of the added axes to the location of the axes in the faceted plot. But I figured out that

    for oa in p.axes.flat:
        print(oa.get_position().bounds)

gets me the position.

I am sure tons of people have come up with similar stuff - so it would be amazing to put it all together in one spot!

Describe the solution you'd like

If there is interest, I will open a PR with an example on how to manipulate faceted plots.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4793/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
183715595 MDU6SXNzdWUxODM3MTU1OTU= 1051 to_netcdf Documentation chiaral 8453445 closed 0     3 2016-10-18T15:12:07Z 2019-02-24T23:25:40Z 2019-02-24T23:25:40Z CONTRIBUTOR      

I found this SO thread reply very helpful when I had to create some netcdf files with many attributes.

http://stackoverflow.com/questions/22933855/convert-csv-to-netcdf/28914767#28914767

I thought to bring it to your attention.

The documentation on http://xarray.pydata.org/en/stable/io.html#netcdf could use a more detailed example and this is very clear.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1051/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
314239017 MDU6SXNzdWUzMTQyMzkwMTc= 2055 Documentation on assign a value and vectorized indexing chiaral 8453445 closed 0     10 2018-04-13T20:22:18Z 2018-05-16T02:13:26Z 2018-05-16T02:13:26Z CONTRIBUTOR      

I was trying to assign a value to a dataset and kept getting no error but also not getting what I wanted.

I was then directed to the Warning at the end of the "Assigning values with indexing" section, and I realized I had wasted a lot of time on something that is not possible.

So I am suggesting a few improvements (some might be feasible, some might not):

A) I am not sure if it is possible, but adding a proper error when someone tries to assign values using any of the indexing methods would be great.

B) If A is not possible, maybe the DataArray page should repeat the Warning. On that second page you state: "Select or assign values by integer location (like numpy): x[:10] or by label (like pandas): x.loc['2014-01-01'] or x.sel(time='2014-01-01')." I think this contradicts the Warning, or at least it is not crystal clear.

C) The text of the Warning should tell people to use vectorized indexing, so they know how to fix the issue.

D) Also, for the vectorized indexing page, an example using .sel could help. For example, something like the following, which I think should work (using your same DataArray from the vectorized indexing help):

```
ind_x = da.x == 0
ind_y = da.y == 'c'
da[ind_x, ind_y] = 30
```

E) If I am completely off in the solution that I used in the code above, then add an example that takes care of this. In your examples you use 0s and 1s, which is not what you want to do if you have multiple lat, lon, and time coordinates that you want to use correctly.

I hope I made some sense..

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2055/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
323345746 MDExOlB1bGxSZXF1ZXN0MTg4MjI2NTMy 2133 DOC: Added text to Assign values with Indexing chiaral 8453445 closed 0     1 2018-05-15T19:09:19Z 2018-05-16T02:13:26Z 2018-05-16T02:13:26Z CONTRIBUTOR   0 pydata/xarray/pulls/2133
  • [x] Closes #2055
  • [ ] Tests added (for all bug fixes or enhancements)
  • [ ] Tests passed (for all non-documentation changes)
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API (remove if this change should not be visible to users, e.g., if it is an internal clean-up, or if this is part of a larger project that will be documented later)

Added examples to select and assign values to a DataArray using `.loc` and `xr.where()` in the "Assign values with indexing" section.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2133/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull


CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
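
The filter shown at the top of the page (repo = 13221727 and user = 8453445, sorted by updated_at descending) can be reproduced against this schema with a query along the following lines. This is a sketch using Python's built-in `sqlite3` module and an in-memory table with only a subset of the columns. The first three rows are copied from the listing above and the fourth is a made-up row from a hypothetical other user, included only to show the filter at work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced version of the [issues] table: only the columns the query needs.
conn.execute(
    "CREATE TABLE [issues] ("
    " [id] INTEGER PRIMARY KEY, [number] INTEGER, [title] TEXT,"
    " [user] INTEGER, [updated_at] TEXT, [repo] INTEGER)"
)
conn.executemany(
    "INSERT INTO [issues] VALUES (?, ?, ?, ?, ?, ?)",
    [
        (183713222, 1050, "xarray.Dataset.var - xarray.DataArray.var - does it have ddof=1 parameter?",
         8453445, "2022-03-12T08:17:48Z", 13221727),
        (1290704150, 6743, "to_zarr(mode='w') does not overwrite correctly/successfully when changing chunk size",
         8453445, "2023-11-24T22:14:40Z", 13221727),
        (1030768250, 5877, "Rolling() gives values different from pd.rolling()",
         8453445, "2022-04-09T01:29:07Z", 13221727),
        (1, 1, "made-up issue from another user", 42, "2024-01-01T00:00:00Z", 13221727),
    ],
)

# ISO 8601 timestamps stored as TEXT sort correctly as plain strings.
result = conn.execute(
    "SELECT [number], [title] FROM [issues]"
    " WHERE [repo] = ? AND [user] = ?"
    " ORDER BY [updated_at] DESC",
    (13221727, 8453445),
).fetchall()

for number, title in result:
    print(number, title)
```

The `idx_issues_repo` and `idx_issues_user` indexes defined above cover the two columns used in the `WHERE` clause.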