issues

5 rows where type = "issue" and user = 4753005 sorted by updated_at descending

#7812 · Appending to existing zarr store writes mostly NaN from dask arrays, but not numpy arrays
grahamfindlay · open · 1 comment · created 2023-05-03T19:30:13Z · updated 2023-11-15T18:56:09Z

What is your issue?

I am using xarray to consolidate ~24 pre-existing, moderately large netCDF files into a single zarr store. Each file contains a DataArray with dimensions `(channel, time)`, and no values are NaN. Each file's timeseries picks up right where the previous one left off, making this a perfect use case for out-of-memory file concatenation.

```python
for i, f in enumerate(tqdm(files)):
    da = xr.open_dataarray(f)  # Open the netCDF file
    da = da.chunk({'channel': da.channel.size, 'time': 'auto'})  # Chunk along the time dimension
    if i == 0:
        da.to_zarr(zarr_file, mode="w")
    else:
        da.to_zarr(zarr_file, append_dim='time')
    da.close()
```

This always writes the first file correctly, and every other file appends without warning or error, but when I read the resulting zarr store, ~25% of all timepoints (probably, time chunks) derived from files `i > 0` are NaN.

Admittedly, the above code seems dangerous: there is no guarantee that `da.chunk({'time': 'auto'})` will always return chunks of the same size, even though the files are nearly identical in size, and I don't know what the expected behavior is if the dask chunksizes don't match the chunksizes of the pre-existing zarr store. I checked the docs but didn't find the answer.

Even if the chunksizes always do match, I am not sure what will happen when appending to an existing store. If the last chunk in the store before appending is not a full chunk, will it be "filled in" when new data are appended to the store? Presumably, but this seems like it could cause problems with parallel writing, since the source chunks from a dask array almost certainly won't line up with the new chunks in the zarr store, unless you've been careful to make it so.
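One way to check this concretely (a sketch, not from the original report; it assumes the same `zarr_file` as above) is to inspect the store's on-disk chunking between appends:

```python
import xarray as xr

# Inspect how the zarr store is actually chunked after each append;
# the last chunk along the append dimension may be partial.
existing = xr.open_zarr(zarr_file)
print(existing.chunks)         # mapping of dim name -> tuple of chunk sizes
print(existing.sizes["time"])  # total length along the append dimension
```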

In any case, the following change seems to solve the issue, and the zarr store no longer contains NaN.

```python
for i, f in enumerate(tqdm(files)):
    da = xr.open_dataarray(f)  # Open the netCDF file
    if i == 0:
        da = da.chunk({'channel': da.channel.size, 'time': 'auto'})  # Chunk along the time dimension
        da.to_zarr(zarr_file, mode="w")
    else:
        da.to_zarr(zarr_file, append_dim='time')
    da.close()
```

I didn't file this as a bug, because I was doing something that was a bad idea, but it does seem like `to_zarr` should have stopped me from doing it in the first place.
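If rechunking every file is still desired, a minimal sketch of a more deterministic variant (assuming the same `files` and `zarr_file` as above; it does not resolve the partial-final-chunk question raised earlier) is to record the chunk size that `'auto'` resolves to for the first file and reuse it for every append:

```python
import xarray as xr
from tqdm import tqdm

for i, f in enumerate(tqdm(files)):
    da = xr.open_dataarray(f)
    if i == 0:
        da = da.chunk({'channel': da.channel.size, 'time': 'auto'})
        # Record the concrete chunk size 'auto' chose, so later files
        # are chunked identically instead of re-resolving 'auto'.
        time_chunk = da.chunks[da.dims.index('time')][0]
        da.to_zarr(zarr_file, mode="w")
    else:
        da = da.chunk({'channel': da.channel.size, 'time': time_chunk})
        da.to_zarr(zarr_file, append_dim='time')
    da.close()
```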

#7853 · Surprising behavior of DataArray.chunk when using automatic chunksize determination
grahamfindlay · closed as completed · 2 comments · created 2023-05-19T20:31:25Z · closed 2023-08-01T16:27:19Z

What is your issue?

I have a DataArray `da` with dims `(x, y)`, and additional coordinates such as `x_coord` on dim `x`. If I try to chunk this array using `da.chunk(chunks={'x': 'auto'})`, I end up with a situation where:

1. The data themselves are chunked along `x` with chunksize `a`.
2. The `x` coordinate itself is not chunked.
3. The `x_coord` coordinate on dim `x` is chunked, with chunksize `b != a`.

As far as I can tell, what is going on is that `da.chunk(chunks={'x': 'auto'})` determines the chunksize independently for each "thing" (data, variable, coordinate, etc.) on the `x` dimension. What I expected was for it to determine one chunksize based on the data in the array, then apply that chunksize (or no chunking) to each coordinate as well. Maybe there could be an option to yield unified chunks by default.

I discovered this because after chunking, `da.chunksizes` raises a ValueError because of the mismatch between the data and `x_coord`, and the proposed solution -- calling `da.unify_chunks()` -- then results in irregular chunksizes on both the data and `x_coord`. To get the behavior that I expected, I have to call `da.chunk(da.encoding['preferred_chunks'])`, which also, incidentally, seems like what I would have expected from `da.unify_chunks()`.
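A minimal sketch for observing the per-variable `'auto'` resolution described above (shapes and dtypes are made up for illustration; whether the sizes actually diverge depends on dask's auto-chunking heuristics and the array sizes):

```python
import numpy as np
import xarray as xr

# A data variable and a non-index coordinate sharing dim "x".
da = xr.DataArray(
    np.zeros((1_000_000, 50)),                        # ~400 MB of float64 data
    dims=["x", "y"],
    coords={"x_coord": ("x", np.arange(1_000_000))},  # small 1-D coordinate
)

chunked = da.chunk({"x": "auto"})
print(chunked.data.chunks[0])  # chunk sizes chosen for the data along x
print(getattr(chunked.x_coord.data, "chunks", None))  # chunking of the coordinate, if any
# If these disagree, accessing chunked.chunksizes raises the ValueError
# described above.
```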

#7207 · Difficulties with selecting from numpy.datetime64[ns] dimensions
grahamfindlay · closed as completed · 3 comments · created 2022-10-24T17:35:01Z · closed 2022-10-24T22:45:36Z

What is your issue?

I have a DataArray ("spgs") containing time-frequency data, with a time dimension of dtype numpy.datetime64[ns]. I used to be able to select using: ```

Select using datetime strings

spgs.sel(time=slice("2022-10-13T09:00:00", "2022-10-13T21:00:00")

Select using Timestamp objects

rng = tuple(pd.to_datetime(x) for x in ["2022-10-13T09:00:00", "2022-10-13T21:00:00"]) spgs.sel(time=slice(rng)) # Select using numpy.datetime64[ns] objects, such that rng[0].dtype == spgs.time.values.dtype rng = tuple(pd.to_datetime(["2022-10-13T09:00:00", "2022-10-13T21:00:00"]).values) spg.sel(time=slice(rng)) None of these work after upgrading to v2022.10.0. The first method yields: Traceback (most recent call last): File "<string>", line 1, in <module> File "/home/gfindlay/miniconda3/envs/seahorse/lib/python3.10/site-packages/xarray/core/dataarray.py", line 1523, in sel ds = self._to_temp_dataset().sel( File "/home/gfindlay/miniconda3/envs/seahorse/lib/python3.10/site-packages/xarray/core/dataset.py", line 2550, in sel query_results = map_index_queries( File "/home/gfindlay/miniconda3/envs/seahorse/lib/python3.10/site-packages/xarray/core/indexing.py", line 183, in map_index_queries results.append(index.sel(labels, **options)) # type: ignore[call-arg] File "/home/gfindlay/miniconda3/envs/seahorse/lib/python3.10/site-packages/xarray/core/indexes.py", line 434, in sel indexer = _query_slice(self.index, label, coord_name, method, tolerance) File "/home/gfindlay/miniconda3/envs/seahorse/lib/python3.10/site-packages/xarray/core/indexes.py", line 210, in _query_slice raise KeyError( KeyError: "cannot represent labeled-based slice indexer for coordinate 'time' with a slice over integer positions; the index is unsorted or non-unique" The second two methods yield: Traceback (most recent call last): File "pandas/_libs/index.pyx", line 545, in pandas._libs.index.DatetimeEngine.get_loc File "pandas/_libs/hashtable_class_helper.pxi", line 2131, in pandas._libs.hashtable.Int64HashTable.get_item File "pandas/_libs/hashtable_class_helper.pxi", line 2140, in pandas._libs.hashtable.Int64HashTable.get_item KeyError: 1665651600000000000 ... KeyError: Timestamp('2022-10-13 09:00:00') Interestingly, this works: start = spgs.time.values.min() stop = spgs.time.values.max() spgs.sel(time=slice(start, stop)) This does not: start = spgs.time.values.min() stop = start + pd.to_timedelta('10s') spgs.sel(time=slice(start, stop)) ```

I filed this as an issue and not a bug, because from reading other issues here and over at pandas, it seems like this may be an unintended consequence of changes to Datetime/Timestamp handling, especially within pandas, rather than a bug with xarray per se. This is supported by the fact that downgrading xarray to 2022.9.0, without touching other dependencies (e.g. pandas), does not restore the old behavior.
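Since the first KeyError complains that the index is "unsorted or non-unique", a quick diagnostic (a sketch assuming the `spgs` array from the report) is to check the underlying pandas index directly:

```python
# Label-based slicing requires a monotonic, unique index, so inspect
# the pandas DatetimeIndex backing the "time" dimension.
time_index = spgs.indexes["time"]
print(time_index.dtype)                    # e.g. datetime64[ns]
print(time_index.is_monotonic_increasing)
print(time_index.is_unique)
```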

#6960 · Unable to import xarray after installing "io" extras in Python 3.10.*
grahamfindlay · closed as completed · 3 comments · created 2022-08-27T02:50:48Z · closed 2022-09-01T10:15:30Z

What happened?

When installed into a Python 3.10 environment with a basic `pip install xarray`, there are no issues importing xarray. But when installing with `pip install xarray[io]`, the following error results upon import:

```
Python 3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:36:39) [GCC 10.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import xarray as xr
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/gfindlay/miniconda3/envs/foo/lib/python3.10/site-packages/xarray/__init__.py", line 1, in <module>
    from . import testing, tutorial
  File "/home/gfindlay/miniconda3/envs/foo/lib/python3.10/site-packages/xarray/tutorial.py", line 13, in <module>
    from .backends.api import open_dataset as open_dataset
  File "/home/gfindlay/miniconda3/envs/foo/lib/python3.10/site-packages/xarray/backends/__init__.py", line 14, in <module>
    from .pydap import PydapDataStore
  File "/home/gfindlay/miniconda3/envs/foo/lib/python3.10/site-packages/xarray/backends/pydap_.py", line 20, in <module>
    import pydap.client
  File "/home/gfindlay/miniconda3/envs/foo/lib/python3.10/site-packages/pydap/client.py", line 50, in <module>
    from .model import DapType
  File "/home/gfindlay/miniconda3/envs/foo/lib/python3.10/site-packages/pydap/model.py", line 175, in <module>
    from collections import OrderedDict, Mapping
ImportError: cannot import name 'Mapping' from 'collections' (/home/gfindlay/miniconda3/envs/foo/lib/python3.10/collections/__init__.py)
```

It appears that having the extras installed causes an alternate series of imports within xarray that have not been updated for Python 3.10 (`from collections import Mapping` should be `from collections.abc import Mapping`).
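For context, the standard compatibility pattern for this import (a generic sketch, not pydap's actual patch) looks like:

```python
# On Python 3.10+, Mapping lives only in collections.abc; the fallback
# keeps the import working on very old Pythons.
try:
    from collections.abc import Mapping
except ImportError:
    from collections import Mapping  # Python < 3.3 fallback
```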

What did you expect to happen?

No response

Minimal Complete Verifiable Example

```
mamba create -n foo python=3
mamba activate foo
pip install xarray[io]
python
>>> import xarray as xr
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

No response

Environment

N/A
#6826 · Success of DataArray.plot() depends on object's history.
grahamfindlay · closed as completed · 1 comment · created 2022-07-25T23:40:07Z · closed 2022-07-26T22:48:39Z

What happened?

I have a 2D DataArray, `ldda`.

I can select a portion of it like so:

```python
da1 = ldda.sel(component=0)
da1
```

I can get what seems like an equivalent array (equal values, matching dtypes, etc.) in the following way:

```python
da2 = ldda.to_dataset(dim="component")[0]
da2
```

And yet, while `da1.plot()` succeeds, `da2.plot()` results in the following error:

```
AttributeError: 'int' object has no attribute 'startswith'
```

See below for the full traceback and a minimal working example.

What did you expect to happen?

I expected da1 and da2 to be functionally equivalent.

Minimal Complete Verifiable Example

```Python
import xarray as xr
import numpy as np

da = xr.DataArray(
    data=np.asarray([[1, 2], [3, 4], [5, 6]]),
    dims=["x", "y"],
)

da.sel(x=0).plot()  # Succeeds
da.to_dataset(dim='x')[0].plot()  # Fails
```
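The traceback below suggests the failure happens because the selected variable's name is the int `0`, which `label_from_attrs` cannot turn into an axis label. A workaround sketch (not from the report; the name `"x0"` is an arbitrary choice) is to give the array a string name before plotting:

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.asarray([[1, 2], [3, 4], [5, 6]]), dims=["x", "y"])

# Renaming the variable to a string lets xarray build an axis label from it.
da2 = da.to_dataset(dim="x")[0].rename("x0")
da2.plot()
```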

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

```Python
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/Volumes/scratch/neuropixels/t2_shared_projects/discoflow_v2/discoflow/analysis/ANPIX30/discoflow-day2/get_senzai_ic_loadings.ipynb Cell 18 in <cell line: 1>()
----> 1 da2.plot()

File /Volumes/scratch/neuropixels/t2_shared_envs/discoflow_v2/lib/python3.8/site-packages/xarray/plot/plot.py:866, in _PlotMethods.__call__(self, **kwargs)
    865 def __call__(self, **kwargs):
--> 866     return plot(self._da, **kwargs)

File /Volumes/scratch/neuropixels/t2_shared_envs/discoflow_v2/lib/python3.8/site-packages/xarray/plot/plot.py:332, in plot(darray, row, col, col_wrap, ax, hue, rtol, subplot_kws, **kwargs)
    328     plotfunc = hist
    330 kwargs["ax"] = ax
--> 332 return plotfunc(darray, **kwargs)

File /Volumes/scratch/neuropixels/t2_shared_envs/discoflow_v2/lib/python3.8/site-packages/xarray/plot/plot.py:436, in line(darray, row, col, figsize, aspect, size, ax, hue, x, y, xincrease, yincrease, xscale, yscale, xticks, yticks, xlim, ylim, add_legend, _labels, *args, **kwargs)
    432 xplt_val, yplt_val, x_suffix, y_suffix, kwargs = _resolve_intervals_1dplot(
    433     xplt.to_numpy(), yplt.to_numpy(), kwargs
    434 )
    435 xlabel = label_from_attrs(xplt, extra=x_suffix)
--> 436 ylabel = label_from_attrs(yplt, extra=y_suffix)
    438 _ensure_plottable(xplt_val, yplt_val)
    440 primitive = ax.plot(xplt_val, yplt_val, *args, **kwargs)

File /Volumes/scratch/neuropixels/t2_shared_envs/discoflow_v2/lib/python3.8/site-packages/xarray/plot/utils.py:491, in label_from_attrs(da, extra)
    488 units = _get_units_from_attrs(da)
...
    493     textwrap.wrap(name + extra + units, 60, break_long_words=False)
    494 )
    495 else:

AttributeError: 'int' object has no attribute 'startswith'
```

Anything else we need to know?

Thank you for one of my favorite packages!

Environment

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:18) [GCC 10.3.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-122-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: None

xarray: 2022.3.0
pandas: 1.4.3
numpy: 1.21.0
scipy: 1.8.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: 3.7.0
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2022.7.0
distributed: None
matplotlib: 3.5.1
cartopy: None
seaborn: 0.11.2
numbagg: None
fsspec: 2022.5.0
cupy: None
pint: None
sparse: None
setuptools: 63.2.0
pip: 22.2
conda: None
pytest: 7.1.2
IPython: 8.4.0
sphinx: None
```
