issues
5 rows where state = "closed", type = "issue" and user = 8453445 sorted by updated_at descending
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1290704150 | I_kwDOAMm_X85M7pUW | 6743 | to_zarr(mode='w') does not overwrite correctly/successfully when changing chunk size | chiaral 8453445 | closed | 0 | 1 | 2022-06-30T22:36:36Z | 2023-11-24T22:14:40Z | 2023-11-24T22:14:40Z | CONTRIBUTOR | What happened? Reproducible example:
```
import xarray as xr
import numpy as np

# Create data variables
data_vars = {'temperature': (['lat', 'lon', 'time'],
                             np.random.rand(400, 800, 1000),
                             {'units': 'Celsius'})}

# Define coordinates
coords = {'time': (['time'], np.arange(1, 1001)),
          'lat': (['lat'], np.arange(1, 401)),
          'lon': (['lon'], np.arange(1, 801))}

# Create dataset and chunk it
ds = xr.Dataset(data_vars=data_vars, coords=coords)
ds = ds.chunk({"lat": 20, "lon": 80, "time": 100})

# Write to a zarr store
ds.to_zarr('temperature', mode='w', consolidated=True)
```
If I load the data, then re-generate the dataset as above, change the chunk size, and overwrite the file, it works:
However, if I do:
when I then re-open the file, the chunk size is still (20, 80, 100). OK, so maybe it is the encoding: in fact, even if I change the chunk size, the old chunking is kept. I then tried to:
1) set the encoding to empty: when I try either one of the two fixes, I get `ValueError: destination buffer too small; expected at least 6400000, got 1280000`. I searched for this error in the open issues and didn't find anything.
What did you expect to happen? I expected to be able to overwrite the file with whatever new combination of chunk sizes I want, especially after fixing the encoding.
Minimal Complete Verifiable Example: No response
MVCE confirmation
Relevant log output```PythonValueError Traceback (most recent call last) Input In [46], in <cell line: 7>() 5 ds.temperature.encoding = {} 6 ds = ds.chunk({"lat": 100, "lon": 80, "time":100}) ----> 7 ds.to_zarr('temperature',mode='w', consolidated=True) File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/xarray/core/dataset.py:2036, in Dataset.to_zarr(self, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks, storage_options) 2033 if encoding is None: 2034 encoding = {} -> 2036 return to_zarr( 2037 self, 2038 store=store, 2039 chunk_store=chunk_store, 2040 storage_options=storage_options, 2041 mode=mode, 2042 synchronizer=synchronizer, 2043 group=group, 2044 encoding=encoding, 2045 compute=compute, 2046 consolidated=consolidated, 2047 append_dim=append_dim, 2048 region=region, 2049 safe_chunks=safe_chunks, 2050 ) File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/xarray/backends/api.py:1432, in to_zarr(dataset, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks, storage_options) 1430 # TODO: figure out how to properly handle unlimited_dims 1431 dump_to_store(dataset, zstore, writer, encoding=encoding) -> 1432 writes = writer.sync(compute=compute) 1434 if compute: 1435 _finalize_store(writes, zstore) File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/xarray/backends/common.py:166, in ArrayWriter.sync(self, compute) 160 import dask.array as da 162 # TODO: consider wrapping targets with dask.delayed, if this makes 163 # for any discernible difference in perforance, e.g., 164 # targets = [dask.delayed(t) for t in self.targets] --> 166 delayed_store = da.store( 167 self.sources, 168 self.targets, 169 lock=self.lock, 170 compute=compute, 171 flush=True, 172 regions=self.regions, 173 ) 174 self.sources = [] 175 self.targets = [] File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/dask/array/core.py:1223, in store(failed resolving arguments) 1221 elif compute: 1222 store_dsk = HighLevelGraph(layers, dependencies) -> 1223 compute_as_if_collection(Array, store_dsk, map_keys, **kwargs) 1224 return None 1226 else: File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/dask/base.py:344, in compute_as_if_collection(cls, dsk, keys, scheduler, get, kwargs) 341 # see https://github.com/dask/dask/issues/8991. 342 # This merge should be removed once the underlying issue is fixed. 
343 dsk2 = HighLevelGraph.merge(dsk2) --> 344 return schedule(dsk2, keys, kwargs) File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/dask/threaded.py:81, in get(dsk, result, cache, num_workers, pool, kwargs) 78 elif isinstance(pool, multiprocessing.pool.Pool): 79 pool = MultiprocessingPoolExecutor(pool) ---> 81 results = get_async( 82 pool.submit, 83 pool._max_workers, 84 dsk, 85 result, 86 cache=cache, 87 get_id=_thread_get_id, 88 pack_exception=pack_exception, 89 kwargs, 90 ) 92 # Cleanup pools associated to dead threads 93 with pools_lock: File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/dask/local.py:508, in get_async(submit, num_workers, dsk, result, cache, get_id, rerun_exceptions_locally, pack_exception, raise_exception, callbacks, dumps, loads, chunksize, **kwargs) 506 _execute_task(task, data) # Re-execute locally 507 else: --> 508 raise_exception(exc, tb) 509 res, worker_id = loads(res_info) 510 state["cache"][key] = res File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/dask/local.py:316, in reraise(exc, tb) 314 if exc.traceback is not tb: 315 raise exc.with_traceback(tb) --> 316 raise exc File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/dask/local.py:221, in execute_task(key, task_info, dumps, loads, get_id, pack_exception) 219 try: 220 task, data = loads(task_info) --> 221 result = _execute_task(task, data) 222 id = get_id() 223 result = dumps((result, id)) File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/dask/core.py:119, in _execute_task(arg, cache, dsk) 115 func, args = arg[0], arg[1:] 116 # Note: Don't assign the subtask results to a variable. numpy detects 117 # temporaries by their reference count and can execute certain 118 # operations in-place. --> 119 return func(*(_execute_task(a, cache) for a in args)) 120 elif not ishashable(arg): 121 return arg File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/dask/array/core.py:122, in getter(a, b, asarray, lock)
117 # Below we special-case File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/xarray/core/indexing.py:358, in ImplicitToExplicitIndexingAdapter.array(self, dtype) 357 def array(self, dtype=None): --> 358 return np.asarray(self.array, dtype=dtype) File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/xarray/core/indexing.py:522, in CopyOnWriteArray.array(self, dtype) 521 def array(self, dtype=None): --> 522 return np.asarray(self.array, dtype=dtype) File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/xarray/core/indexing.py:423, in LazilyIndexedArray.array(self, dtype) 421 def array(self, dtype=None): 422 array = as_indexable(self.array) --> 423 return np.asarray(array[self.key], dtype=None) File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/xarray/backends/zarr.py:73, in ZarrArrayWrapper.getitem(self, key) 71 array = self.get_array() 72 if isinstance(key, indexing.BasicIndexer): ---> 73 return array[key.tuple] 74 elif isinstance(key, indexing.VectorizedIndexer): 75 return array.vindex[ 76 indexing._arrayize_vectorized_indexer(key, self.shape).tuple 77 ] File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/zarr/core.py:788, in Array.getitem(self, selection) 786 result = self.vindex[selection] 787 else: --> 788 result = self.get_basic_selection(pure_selection, fields=fields) 789 return result File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/zarr/core.py:914, in Array.get_basic_selection(self, selection, out, fields) 911 return self._get_basic_selection_zd(selection=selection, out=out, 912 fields=fields) 913 else: --> 914 return self._get_basic_selection_nd(selection=selection, out=out, 915 fields=fields) File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/zarr/core.py:957, in Array._get_basic_selection_nd(self, selection, out, fields) 951 def _get_basic_selection_nd(self, selection, out=None, fields=None): 952 # implementation of basic selection for array with at least one dimension 953 954 # setup indexer 955 indexer = BasicIndexer(selection, self) --> 957 return self._get_selection(indexer=indexer, out=out, fields=fields) File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/zarr/core.py:1247, in Array._get_selection(self, indexer, out, fields) 1241 if not hasattr(self.chunk_store, "getitems") or \ 1242 any(map(lambda x: x == 0, self.shape)): 1243 # sequentially get one key at a time from storage 1244 for chunk_coords, chunk_selection, out_selection in indexer: 1245 1246 # load chunk selection into output array -> 1247 self._chunk_getitem(chunk_coords, chunk_selection, out, out_selection, 1248 drop_axes=indexer.drop_axes, fields=fields) 1249 else: 1250 # allow storage to get multiple items at once 1251 lchunk_coords, lchunk_selection, lout_selection = zip(*indexer) File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/zarr/core.py:1951, in Array._chunk_getitem(self, chunk_coords, chunk_selection, out, out_selection, drop_axes, fields) 1948 out[out_selection] = fill_value 1950 else: -> 1951 self._process_chunk(out, cdata, chunk_selection, drop_axes, 1952 out_is_ndarray, fields, out_selection) File /opt/miniconda3/envs/june2022/lib/python3.10/site-packages/zarr/core.py:1859, in Array._process_chunk(self, out, cdata, chunk_selection, drop_axes, out_is_ndarray, fields, out_selection, partial_read_decode) 1857 if isinstance(cdata, PartialReadBuffer): 1858 cdata = cdata.read_full() -> 1859 self._compressor.decode(cdata, dest) 1860 else: 1861 chunk = ensure_ndarray(cdata).view(self._dtype) 
File numcodecs/blosc.pyx:562, in numcodecs.blosc.Blosc.decode()
File numcodecs/blosc.pyx:371, in numcodecs.blosc.decompress()
ValueError: destination buffer too small; expected at least 6400000, got 1280000
```
Anything else we need to know? No response
Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.10.5 | packaged by conda-forge | (main, Jun 14 2022, 07:07:06) [Clang 13.0.1 ]
python-bits: 64
OS: Darwin
OS-release: 21.5.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1
xarray: 2022.3.0
pandas: 1.4.3
numpy: 1.22.4
scipy: 1.8.1
netCDF4: 1.5.8
pydap: None
h5netcdf: 1.0.0
h5py: 3.6.0
Nio: None
zarr: 2.12.0
cftime: 1.6.0
nc_time_axis: 1.4.1
PseudoNetCDF: None
rasterio: 1.2.10
cfgrib: 0.9.10.1
iris: None
bottleneck: 1.3.4
dask: 2022.6.0
distributed: 2022.6.0
matplotlib: 3.5.2
cartopy: 0.20.2
seaborn: 0.11.2
numbagg: None
fsspec: 2022.5.0
cupy: None
pint: 0.19.2
sparse: 0.13.0
setuptools: 62.6.0
pip: 22.1.2
conda: None
pytest: None
IPython: 8.4.0
sphinx: None
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6743/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
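As context for the report above, here is a minimal sketch of one way to rewrite a zarr store with a new chunk layout: drop the chunk sizes recorded in each variable's encoding before re-chunking, and write to a different path than the one still being read lazily. This is an illustrative workaround under those assumptions, not the fix adopted in the issue; the path `temperature_rechunked` is made up.
```
import xarray as xr

# Re-open the store written by the example above ('temperature', as in the issue).
ds = xr.open_zarr("temperature")

# Drop the chunk layout remembered in each variable's encoding; otherwise it
# can conflict with the new dask chunking when writing.
for var in ds.variables.values():
    var.encoding.pop("chunks", None)
    var.encoding.pop("preferred_chunks", None)

# Re-chunk and write to a *different* store, so we are not overwriting the
# store we are still lazily reading from.
ds = ds.chunk({"lat": 100, "lon": 80, "time": 100})
ds.to_zarr("temperature_rechunked", mode="w", consolidated=True)
```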
183713222 | MDU6SXNzdWUxODM3MTMyMjI= | 1050 | xarray.Dataset.var - xarray.DataArray.var - does it have ddof=1 parameter? | chiaral 8453445 | closed | 0 | 4 | 2016-10-18T15:03:53Z | 2022-03-12T08:17:48Z | 2022-03-12T08:17:48Z | CONTRIBUTOR | It is not clear from the description whether ddof=1 is available and/or whether it is set to 0 (https://docs.scipy.org/doc/numpy-1.6.0/reference/generated/numpy.var.html). For large samples, 1 or 0 makes little difference, but it would be good to know whether it uses N-1 or N. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1050/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
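For reference on the question above: in current xarray, var (and std) take a ddof argument that defaults to 0, i.e. divide by N as NumPy does, and passing ddof=1 gives the unbiased N-1 estimator. A minimal sketch, assuming a reasonably recent xarray:
```
import numpy as np
import xarray as xr

da = xr.DataArray(np.array([1.0, 2.0, 3.0, 4.0]), dims="x")

print(da.var().item())        # 1.25   -> ddof=0, divides by N (NumPy default)
print(da.var(ddof=1).item())  # ~1.667 -> divides by N - 1
```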
875696070 | MDU6SXNzdWU4NzU2OTYwNzA= | 5257 | Inconsistencies between 0.17.0 and 0.17.1.dev102+gf455e00f | chiaral 8453445 | closed | 0 | 3 | 2021-05-04T17:52:08Z | 2021-05-04T20:02:00Z | 2021-05-04T20:02:00Z | CONTRIBUTOR | Download file:
I work on https://staging.us-central1-b.gcp.pangeo.io/ (xarray 0.17.0, cfgrib 0.9.9.0), and when I try:
it works: the keyword argument is accepted. After installing the latest version, when I attempt the same loading:
I get:
```python
TypeError                                 Traceback (most recent call last)
<ipython-input-5-6673e5f2812b> in <module>
----> 1 ds = xr.open_dataset("acpcp_sfc_2000012900_c00.grib2", engine="cfgrib", backend_kwargs={"extra_coords": {"stepRange": "step"}})

/srv/conda/envs/notebook/lib/python3.8/site-packages/xarray/backends/api.py in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, backend_kwargs, *args, **kwargs)
    499
    500     overwrite_encoded_chunks = kwargs.pop("overwrite_encoded_chunks", None)
--> 501     backend_ds = backend.open_dataset(
    502         filename_or_obj,
    503         drop_variables=drop_variables,

TypeError: open_dataset() got an unexpected keyword argument 'extra_coords'
```
What you expected to happen: I expect '0.17.1.dev102+gf455e00f' to work as '0.17.0' |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5257/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
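As a hedged illustration of the two ways backend options can be forwarded after xarray's backend refactor (whether `extra_coords` is accepted depends on the installed cfgrib version, so treat this as a sketch rather than a confirmed fix for the report above):
```
import xarray as xr

# Option 1: forward backend options through backend_kwargs, as in the report.
ds = xr.open_dataset(
    "acpcp_sfc_2000012900_c00.grib2",
    engine="cfgrib",
    backend_kwargs={"extra_coords": {"stepRange": "step"}},
)

# Option 2: with the refactored backend API, extra keyword arguments are
# passed straight on to the selected backend, so this is equivalent:
ds = xr.open_dataset(
    "acpcp_sfc_2000012900_c00.grib2",
    engine="cfgrib",
    extra_coords={"stepRange": "step"},
)
```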
183715595 | MDU6SXNzdWUxODM3MTU1OTU= | 1051 | to_netcdf Documentation | chiaral 8453445 | closed | 0 | 3 | 2016-10-18T15:12:07Z | 2019-02-24T23:25:40Z | 2019-02-24T23:25:40Z | CONTRIBUTOR | I found this SO thread reply very helpful when I had to create some netcdf files with many attributes: http://stackoverflow.com/questions/22933855/convert-csv-to-netcdf/28914767#28914767 I thought I'd bring it to your attention. The documentation on http://xarray.pydata.org/en/stable/io.html#netcdf could use a more detailed example, and this one is very clear. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1051/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
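Along the lines of the linked Stack Overflow answer, here is a minimal sketch of the kind of example the documentation could include: building a dataset with variable, coordinate, and global attributes and writing it to netCDF. All names and attribute values are illustrative, not taken from the answer.
```
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {
        # (dims, data, attrs) tuples attach attributes to each variable.
        "precip": (("time", "lat", "lon"), np.random.rand(4, 3, 2),
                   {"units": "mm/day", "long_name": "Daily precipitation"})
    },
    coords={
        "time": ("time", np.arange(4), {"long_name": "days since start"}),
        "lat": ("lat", [10.0, 20.0, 30.0], {"units": "degrees_north"}),
        "lon": ("lon", [100.0, 110.0], {"units": "degrees_east"}),
    },
    # Global (file-level) attributes.
    attrs={"title": "Example dataset", "institution": "N/A"},
)

ds.to_netcdf("example.nc")
```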
314239017 | MDU6SXNzdWUzMTQyMzkwMTc= | 2055 | Documentation on assign a value and vectorized indexing | chiaral 8453445 | closed | 0 | 10 | 2018-04-13T20:22:18Z | 2018-05-16T02:13:26Z | 2018-05-16T02:13:26Z | CONTRIBUTOR | I was trying to assign a value to a dataset and kept getting no error, but also not getting what I wanted. I was then directed to the Warning at the end of the "Assigning values with indexing" section, and I realized I had wasted a lot of time on something that is not possible. So I am suggesting a few improvements (some might be feasible, some might not):
A) I am not sure if it is possible, but it would be great to add a proper error for this case - trying to assign values when using any of the indexing methods.
B) If A is not possible, maybe you should repeat the Warning in the DataArray page. In this second page, in fact, you state: "Select or assign values by integer location (like numpy): x[:10] or by label (like pandas): x.loc['2014-01-01'] or x.sel(time='2014-01-01')." which I think is a contradiction, or at least not crystal clear.
C) You should add to the text of the Warning a note to use vectorized indexing, so people know how to fix the issue (see the sketch after this entry).
```
ind_y = da.y == 'c'
da[ind_x, ind_y] = 30
```
E) If I am completely off in the solution I used in the code above, then add an example that takes care of this. In your examples you use 0s and 1s, which is not what you want to do if you have multiple lat, lon, and time coordinates that you want to use correctly. I hope I made some sense. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/2055/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue |
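To make the pitfall in the report above concrete, here is a small sketch contrasting chained-indexing assignment (which silently modifies a temporary copy) with assignment forms that do work. The coordinate labels are illustrative, not the ones from the report.
```
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.zeros((3, 3)),
    coords={"x": ["a", "b", "c"], "y": ["d", "e", "f"]},
    dims=("x", "y"),
)

# Chained indexing: .sel() with a list returns a copy, so this assignment
# is silently lost and `da` is unchanged -- the trap the docs warn about.
da.sel(x=["a"])[0, 0] = 30

# Assigning on `da` itself works, by label or by dimension name:
da.loc[dict(x="a", y="f")] = 30
da[dict(x=0, y=2)] = 30

# For condition-based updates, xr.where returns a new array with the
# replacement applied, which sidesteps in-place assignment entirely.
da2 = xr.where((da.x == "b") & (da.y == "e"), 30, da)
```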
CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo] ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone] ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee] ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user] ON [issues] ([user]);