id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type 427410885,MDU6SXNzdWU0Mjc0MTA4ODU=,2857,Quadratic slowdown when saving multiple datasets to the same h5 file (h5netcdf),2418513,closed,0,,,24,2019-03-31T15:47:40Z,2022-01-12T07:19:06Z,2022-01-12T07:19:06Z,NONE,,,,"I can't quite understand what's wrong with my side of the code, wondering if this kind of slowdown is expected or not? Basically, what I'm doing is something like this: ```python with h5py.File('file.h5', 'w') as f: f.flush() # reset the file for i, ds in enumerate(datasets): ds.to_netcdf('file.h5', group=str(i), engine='h5netcdf', mode='a') ``` And here's the log for saving 20 datasets, the listed times are for each dataset independently. Instead of the expected 10 sec (which is already kind of slow, but whatever), I get 2 minutes. The time to save each dataset seems to increase linearly, which leads to a quadratic overall slowdown: ``` saving dataset... 00:00:00.559135 saving dataset... 00:00:00.924617 saving dataset... 00:00:01.351670 saving dataset... 00:00:01.818111 saving dataset... 00:00:02.356307 saving dataset... 00:00:02.971077 saving dataset... 00:00:03.685565 saving dataset... 00:00:04.375104 saving dataset... 00:00:04.575837 saving dataset... 00:00:05.179975 saving dataset... 00:00:05.793876 saving dataset... 00:00:06.517916 saving dataset... 00:00:07.190257 saving dataset... 00:00:07.993795 saving dataset... 00:00:08.786421 saving dataset... 00:00:09.414821 saving dataset... 00:00:10.729006 saving dataset... 00:00:11.584044 saving dataset... 00:00:14.160655 saving dataset... 
00:00:14.460564 CPU times: user 1min 49s, sys: 12.8 s, total: 2min 2s Wall time: 2min 4s ```","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2857/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 423749397,MDU6SXNzdWU0MjM3NDkzOTc=,2836,xarray.concat() with compat='identical' fails for DataArray attrs,2418513,open,0,,,9,2019-03-21T14:11:29Z,2021-07-08T17:42:52Z,,NONE,,,,"Not sure if it was ever supposed to work with numpy arrays, but it actually does :thinking:: ```python >>> attr = np.array([[3, 4]]) >>> d1 = xr.Dataset({'z': 1}, attrs={'y': attr}) >>> d2 = xr.Dataset({'z': 2}, attrs={'y': attr.copy()}) >>> xr.concat([d1, d2], dim='z', compat='identical') ``` However, it fails if you use DataArray attrs: ```python >>> attr = xr.DataArray([3, 4], {'x': [1, 2]}, 'x') >>> d1 = xr.Dataset({'z': 1}, attrs={'y': attr}) >>> d2 = xr.Dataset({'z': 2}, attrs={'y': attr.copy()}) >>> xr.concat([d1, d2], dim='z', compat='identical') ValueError: The truth value of an array with more than one element is ambiguous. 
Use a.any() or a.all() ``` Given that the check is simply `(a is b) or (a == b)`, should it try to do something smarter for array-like attrs?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2836/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 423016453,MDU6SXNzdWU0MjMwMTY0NTM=,2824,Dataset.from_records()?,2418513,open,0,,,4,2019-03-20T00:46:19Z,2021-05-13T20:20:52Z,,NONE,,,,"Currently, to easily create a `Dataset` from an existing numpy recarray (not a `DataArray`, which is currently bugged anyway with recarrays due to #1434), I couldn't find an easier way than ```python df = xr.Dataset.from_dataframe(pd.DataFrame(my_recarray).set_index('foo')) ``` (which is kind of dumb since it allocates the memory twice) It would definitely be nice to be able to do just this (perhaps with extra arguments to set index on the fly etc): ```python df = xr.Dataset.from_records(my_recarray, ...) ``` (Apologies if I'm missing something obvious.)","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2824/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 549712566,MDU6SXNzdWU1NDk3MTI1NjY=,3695,mypy --strict fails on scripts/packages depending on xarray; __all__ required,2418513,closed,0,6213168,,3,2020-01-14T17:27:44Z,2020-01-17T20:42:25Z,2020-01-17T20:42:25Z,NONE,,,,"Checked this with both 0.14.1 and master branch. 
Create `foo.py`: ```python from xarray import DataArray ``` and run: ```sh $ mypy --strict foo.py ``` which results in ``` foo.py:1: error: Module 'xarray' has no attribute 'DataArray' Found 1 error in 1 file (checked 1 source file) ``` I did a bit of digging trying to make it work, it looks like what makes the above script work with mypy is adding ```python __all__ = ('DataArray',) ``` to `xarray/__init__.py`, otherwise mypy treats those imports as ""private"" (and is correct in doing so). Should `__all__` be added to the root `__init__.py`? To any `__init__.py` in subpackages as well?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3695/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 423774214,MDU6SXNzdWU0MjM3NzQyMTQ=,2837,DataArray plotting: pyplot compat and passing the style,2418513,open,0,,,6,2019-03-21T14:57:12Z,2019-04-11T16:25:49Z,,NONE,,,,"These are two unrelated issues in one really that I've noticed while trying to plot things directly from DataArray objects. 
---- The following works as expected (by converting DataArray to pandas first): ```python >>> arr.to_series().plot(style='.-') >>> arr.to_series().plot.line(style='.-') ``` Passing Series to pyplot.plot() directly also works and retains index: ```python >>> plt.plot(arr.to_series(), '.-') ``` Trying to set style directly when plotting from DataArray doesn't work: ```python >>> arr.plot(style='.-') AttributeError: Unknown property style >>> arr.plot.line(style='.-') AttributeError: Unknown property style ``` Passing DataArray to pyplot.plot() loses index: ```python >>> plt.plot(arr, '.-') # works but loses coords; same as plt.plot(arr.values, '.-') ```","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2837/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 423023519,MDU6SXNzdWU0MjMwMjM1MTk=,2825,KeyError on selecting empty time slice from a datetime-indexed Dataset,2418513,open,0,,,4,2019-03-20T01:21:56Z,2019-03-20T17:58:24Z,,NONE,,,,"(xarray version: 0.11.3) Just wanted to confirm this is expected behaviour: `sel()` with a date that would yield an empty selection throws an exception (I would naturally expect it to return a zero-length dataarray/dataset instead): ```python >>> foo = xr.DataArray( np.array([1, 2, 3]), {'t': pd.to_datetime(['2018-01-01', '2018-02-02T01:01', '2018-02-02T02:02'])}, dims=['t'] ) >>> foo.sel(t='2018-01-01').size 1 >>> foo.sel(t='2018-02-02').size 2 >>> foo.sel(t='2018-03-03').size # expected 0?
pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc() pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item() TypeError: an integer is required During handling of the above exception, another exception occurred: pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc() pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc() pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine._date_check_type() KeyError: '2018-03-03' During handling of the above exception, another exception occurred: pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc() pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item() TypeError: an integer is required During handling of the above exception, another exception occurred: pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc() pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc() pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine._date_check_type() KeyError: '2018-03-03' During handling of the above exception, another exception occurred: pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item() pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item() KeyError: 1520035200000000000 During handling of the above exception, another exception occurred: pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc() pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc() KeyError: Timestamp('2018-03-03 00:00:00') During handling of the above exception, another exception occurred: pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item() pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item() KeyError: 1520035200000000000 During handling of the above exception, another exception occurred: pandas/_libs/index.pyx in 
pandas._libs.index.DatetimeEngine.get_loc() pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc() KeyError: Timestamp('2018-03-03 00:00:00') ... ```","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2825/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue