issues
7,034 rows where state = "closed" sorted by updated_at descending
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at ▲ | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
503583044 | MDU6SXNzdWU1MDM1ODMwNDQ= | 3379 | `ds.to_zarr(mode="a", append_dim="time")` not capturing any time steps under Hours | jminsk-cc 48155582 | closed | 0 | 3 | 2019-10-07T17:17:06Z | 2024-05-03T18:34:50Z | 2024-05-03T18:34:50Z | NONE | MCVE Code Sample
```python
import datetime
import xarray as xr

date = datetime.datetime(2019, 1, 1, 1, 10)

# Reading in 2 min time stepped MRMS data
ds = xr.open_rasterio(dir_path)
ds.name = "mrms"
ds["time"] = date
ds = ds.expand_dims("time")
ds = ds.to_dataset()
ds.to_zarr("fin_zarr", compute=False, mode="w-")

date = datetime.datetime(2019, 1, 1, 1, 12)

# Reading in 2 min time stepped MRMS data
# This can be the same file since we are adding time manually
ds = xr.open_rasterio(dir_path)
ds.name = "mrms"
ds["time"] = date
ds = ds.expand_dims("time")
ds = ds.to_dataset()
ds.to_zarr("fin_zarr", compute=False, mode="a", append_dim="time")
```
Expected Output
Problem Description: The output looks like this:
Where the minutes are repeated for the whole hour until a new hour is appended. It seems not to be handling the minutes correctly. Output of
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/3379/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
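The follow-up issues referenced here (#5969, #3942) point at the time encoding inferred from a single-timestep first write (e.g. hour-resolution units), which truncates sub-hour appends. A minimal sketch of one commonly suggested workaround, using a toy dataset rather than the reporter's MRMS rasters and reusing the `fin_zarr` store name from the report: pin the time encoding explicitly on the first write.

```python
import datetime
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {"mrms": (("time", "y", "x"), np.zeros((1, 2, 2)))},
    coords={"time": [datetime.datetime(2019, 1, 1, 1, 10)]},
)
# Fix the time encoding up front so appended 2-minute steps are representable.
encoding = {"time": {"units": "seconds since 1970-01-01", "dtype": "int64"}}
ds.to_zarr("fin_zarr", mode="w", encoding=encoding)

ds2 = ds.assign_coords(time=[datetime.datetime(2019, 1, 1, 1, 12)])
ds2.to_zarr("fin_zarr", mode="a", append_dim="time")
```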
1050082137 | I_kwDOAMm_X84-lvtZ | 5969 | `to_zarr(append_dim="time")` appends incorrect datetimes | JackKelly 460756 | closed | 0 | 3 | 2021-11-10T17:00:53Z | 2024-05-03T17:09:31Z | 2024-05-03T17:09:30Z | NONE | DescriptionIf you create a Zarr with a single timestep and then append to the Minimal Complete Verifiable ExampleCreate a really simple
Write just the first timestep to a new Zarr store:
So far, so good! Now things get weird... let's append the remainder of
This throws a warning, which is probably relevant:
What happened: Let's load the Zarr and print the contents on the
(I've removed the seconds and milliseconds to make it a bit easier to read) The first and fifth time coords (2000-01-01T00:35 and 2000-01-02T00:35) are correct. None of the others are correct! The encoding is not appropriate (see #3942)... notice that the
What you expected to happen: The correct Anything else we need to know? There are three workarounds that I'm aware of: 1) When first creating the Zarr, write two or more timesteps into the Zarr. Then you can append any number of timesteps to the Zarr and everything works fine.
2) Convert the
Related issuesIt's possible that the root cause of this issue is #3942. And I think #3379 is another symptom of this issue. EnvironmentOutput of <tt>xr.show_versions()</tt>INSTALLED VERSIONS ------------------ commit: None python: 3.9.7 | packaged by conda-forge | (default, Sep 29 2021, 19:20:46) [GCC 9.4.0] python-bits: 64 OS: Linux OS-release: 5.13.0-21-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_GB.UTF-8 LOCALE: ('en_GB', 'UTF-8') libhdf5: 1.12.1 libnetcdf: 4.8.1 xarray: 0.20.1 pandas: 1.3.4 numpy: 1.21.4 scipy: 1.7.2 netCDF4: 1.5.8 pydap: None h5netcdf: 0.11.0 h5py: 3.4.0 Nio: None zarr: 2.10.1 cftime: 1.5.1.1 nc_time_axis: None PseudoNetCDF: None rasterio: 1.2.8 cfgrib: 0.9.9.1 iris: None bottleneck: 1.3.2 dask: 2021.10.0 distributed: None matplotlib: 3.4.3 cartopy: None seaborn: None numbagg: None fsspec: 2021.11.0 cupy: None pint: None sparse: None setuptools: 58.5.3 pip: 21.3.1 conda: None pytest: 6.2.5 IPython: 7.29.0 sphinx: None |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5969/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
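A runnable sketch of workaround (1) listed above, assuming a toy dataset and a local store name of my own choosing: write at least two timesteps on the initial call so the inferred time encoding can resolve the actual step, then append the rest.

```python
import numpy as np
import pandas as pd
import xarray as xr

time = pd.date_range("2000-01-01T00:35", periods=5, freq="30min")
ds = xr.Dataset({"foo": ("time", np.arange(5.0))}, coords={"time": time})

# First write contains two timesteps, so the 30-minute step can be encoded exactly.
ds.isel(time=slice(0, 2)).to_zarr("test.zarr", mode="w")
# Appending the remainder now round-trips correctly.
ds.isel(time=slice(2, None)).to_zarr("test.zarr", mode="a", append_dim="time")

print(xr.open_zarr("test.zarr").time.values)
```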
2275404926 | PR_kwDOAMm_X85uWjVP | 8993 | call `np.cross` with 3D vectors only | keewis 14808389 | closed | 0 | 1 | 2024-05-02T12:21:30Z | 2024-05-03T15:56:49Z | 2024-05-03T15:22:26Z | MEMBER | 0 | pydata/xarray/pulls/8993 |
In the tests, we've been calling For a later PR: add tests to check if |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8993/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2203689075 | PR_kwDOAMm_X85qjXJq | 8870 | Enable explicit use of key tuples (instead of *Indexer objects) in indexing adapters and explicitly indexed arrays | andersy005 13301940 | closed | 0 | 1 | 2024-03-23T04:34:18Z | 2024-05-03T15:27:38Z | 2024-05-03T15:27:22Z | MEMBER | 0 | pydata/xarray/pulls/8870 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8870/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2276732187 | PR_kwDOAMm_X85ubH0P | 8996 | Mark `test_use_cftime_false_standard_calendar_in_range` as an expected failure | spencerkclark 6628425 | closed | 0 | 0 | 2024-05-03T01:05:21Z | 2024-05-03T15:21:48Z | 2024-05-03T15:21:48Z | MEMBER | 0 | pydata/xarray/pulls/8996 | Per https://github.com/pydata/xarray/issues/8844#issuecomment-2089427222, for the time being this marks |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8996/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2266442492 | PR_kwDOAMm_X85t4NhR | 8976 | Migration of datatree/ops.py -> datatree_ops.py | flamingbear 479480 | closed | 0 | 4 | 2024-04-26T20:14:11Z | 2024-05-02T19:49:39Z | 2024-05-02T19:49:39Z | CONTRIBUTOR | 0 | pydata/xarray/pulls/8976 | I considered wedging this into core/ops.py, but it didn't look like it fit there. This is a basic lift and shift from datatree_/ops.py to core/datatree_ops.py I did fix the document addendum injection and added a couple of tests.
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8976/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2241526039 | PR_kwDOAMm_X85skMs0 | 8939 | avoid a couple of warnings in `polyfit` | keewis 14808389 | closed | 0 | 14 | 2024-04-13T11:49:13Z | 2024-05-01T16:42:06Z | 2024-05-01T15:34:20Z | MEMBER | 0 | pydata/xarray/pulls/8939 | - [x] towards #8844
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8939/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2270984193 | PR_kwDOAMm_X85uHk70 | 8986 | clean up the upstream-dev setup script | keewis 14808389 | closed | 0 | 1 | 2024-04-30T09:34:04Z | 2024-04-30T23:26:13Z | 2024-04-30T20:59:56Z | MEMBER | 0 | pydata/xarray/pulls/8986 | In trying to install packages that are compatible with As it seems
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8986/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2272299822 | PR_kwDOAMm_X85uL82a | 8989 | Skip flaky `test_open_mfdataset_manyfiles` test | max-sixty 5635139 | closed | 0 | 0 | 2024-04-30T19:24:41Z | 2024-04-30T20:27:04Z | 2024-04-30T19:46:34Z | MEMBER | 0 | pydata/xarray/pulls/8989 | Don't just xfail, and not only on windows, since it can crash the worker |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8989/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2271670475 | PR_kwDOAMm_X85uJ5Er | 8988 | Remove `.drop` warning allow | max-sixty 5635139 | closed | 0 | 0 | 2024-04-30T14:39:35Z | 2024-04-30T19:26:17Z | 2024-04-30T19:26:16Z | MEMBER | 0 | pydata/xarray/pulls/8988 | { "url": "https://api.github.com/repos/pydata/xarray/issues/8988/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
2271652603 | PR_kwDOAMm_X85uJ122 | 8987 | Add notes on when to add ignores to warnings | max-sixty 5635139 | closed | 0 | 0 | 2024-04-30T14:34:52Z | 2024-04-30T14:56:47Z | 2024-04-30T14:56:46Z | MEMBER | 0 | pydata/xarray/pulls/8987 | { "url": "https://api.github.com/repos/pydata/xarray/issues/8987/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
2262468762 | PR_kwDOAMm_X85tqnJm | 8973 | Docstring and documentation improvement for the Dataset class | noahbenson 2005723 | closed | 0 | 7 | 2024-04-25T01:39:02Z | 2024-04-30T14:40:32Z | 2024-04-30T14:40:14Z | CONTRIBUTOR | 0 | pydata/xarray/pulls/8973 | The example in the doc-string of the Additionally, this PR contains updates to the documentation, specifically the See issue #8970 for more information.
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8973/reactions", "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 1, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2261858401 | I_kwDOAMm_X86G0Thh | 8970 | Example code in the documentation for `Dataset` is not clear | noahbenson 2005723 | closed | 0 | 13 | 2024-04-24T17:50:46Z | 2024-04-30T14:40:15Z | 2024-04-30T14:40:15Z | CONTRIBUTOR | What is your issue?The example code in the documentation for the ```python np.random.seed(0) temperature = 15 + 8 * np.random.randn(2, 2, 3) precipitation = 10 * np.random.rand(2, 2, 3) lon = [[-99.83, -99.32], [-99.79, -99.23]] lat = [[42.25, 42.21], [42.63, 42.59]] time = pd.date_range("2014-09-06", periods=3) reference_time = pd.Timestamp("2014-09-05") ds = xr.Dataset( data_vars=dict( temperature=(["x", "y", "time"], temperature), precipitation=(["x", "y", "time"], precipitation), ), coords=dict( lon=(["x", "y"], lon), lat=(["x", "y"], lat), time=time, reference_time=reference_time, ), attrs=dict(description="Weather related data."), ) ``` To be clear, I understand each individual line of code, but I don't understand why there is both a latitude/longitude and an x/y in this example or how they are supposed to be related to each other (and there do not appear to be any additional details about this dataset's intended structure). Probably due to this lack of clarity I'm having a hard time wrapping my head around what the x/y coordinates and the lat/lon coordinates are supposed to demonstrate about xarray here, or how the x/y and lat/lon values are represented in the data structure. Are the x and y coordinates in a map projection of some kind? I have worked successfully with I suspect that all that is needed is a clear description of what these data are supposed to represent, how they are intended to be used, and how x/y and lat/lon are related. If someone can explain this to me, I'd be happy to submit a PR for the docs. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8970/reactions", "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
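A hedged reading of the docstring example under discussion (my interpretation, not text from the issue): x and y are purely logical grid dimensions, while lat and lon are two-dimensional non-dimension coordinates giving the physical location of each (x, y) cell, i.e. a curvilinear grid. A trimmed-down version of the example illustrates the distinction:

```python
import numpy as np
import pandas as pd
import xarray as xr

temperature = 15 + 8 * np.random.randn(2, 2, 3)
lon = [[-99.83, -99.32], [-99.79, -99.23]]
lat = [[42.25, 42.21], [42.63, 42.59]]
time = pd.date_range("2014-09-06", periods=3)

ds = xr.Dataset(
    {"temperature": (["x", "y", "time"], temperature)},
    coords={"lon": (["x", "y"], lon), "lat": (["x", "y"], lat), "time": time},
)

print(ds.temperature.dims)  # ('x', 'y', 'time'): the logical dimensions
print(ds.lat.dims)          # ('x', 'y'): a 2-D coordinate, not a dimension
# Select a cell by grid position and read off its physical lat/lon:
cell = ds.isel(x=0, y=1)
print(cell.lat.item(), cell.lon.item())
```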
2212435865 | PR_kwDOAMm_X85rAwYu | 8885 | add `.oindex` and `.vindex` to `BackendArray` | andersy005 13301940 | closed | 0 | 8 | 2024-03-28T06:14:43Z | 2024-04-30T12:12:50Z | 2024-04-17T01:53:23Z | MEMBER | 0 | pydata/xarray/pulls/8885 | this PR builds towards the primary objective is to partially address
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8885/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1250939008 | I_kwDOAMm_X85Kj9CA | 6646 | `dim` vs `dims` | max-sixty 5635139 | closed | 0 | 4 | 2022-05-27T16:15:02Z | 2024-04-29T18:24:56Z | 2024-04-29T18:24:56Z | MEMBER | What is your issue?I've recently been hit with this when experimenting with Should we standardize on one of these? |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6646/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
2268058661 | PR_kwDOAMm_X85t9f5f | 8982 | Switch all methods to `dim` | max-sixty 5635139 | closed | 0 | 0 | 2024-04-29T03:42:34Z | 2024-04-29T18:24:56Z | 2024-04-29T18:24:55Z | MEMBER | 0 | pydata/xarray/pulls/8982 | I think this is the final set of methods
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8982/reactions", "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 1, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2267810980 | PR_kwDOAMm_X85t8q4s | 8981 | Enable ffill for datetimes | max-sixty 5635139 | closed | 0 | 5 | 2024-04-28T20:53:18Z | 2024-04-29T18:09:48Z | 2024-04-28T23:02:11Z | MEMBER | 0 | pydata/xarray/pulls/8981 | Notes inline. Would fix #4587 |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8981/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2019566184 | I_kwDOAMm_X854YCJo | 8494 | Filter expected warnings in the test suite | TomNicholas 35968931 | closed | 0 | 1 | 2023-11-30T21:50:15Z | 2024-04-29T16:57:07Z | 2024-04-29T16:56:16Z | MEMBER | FWIW one thing I'd be keen for to do generally — though maybe this isn't the place to start it — is handle warnings in the test suite when we add a new warning — i.e. filter them out where we expect them. In this case, that would be the loading the netCDF files that have duplicate dims. Otherwise warnings become a huge block of text without much salience. I mostly see the 350 lines of them and think "meh mostly units & cftime", but then something breaks on a new upstream release that was buried in there, or we have a supported code path that is raising warnings internally. (I'm not sure whether it's possible to generally enforce that — maybe we could raise on any warnings coming from within xarray? Would be a non-trivial project to get us there though...) Originally posted by @max-sixty in https://github.com/pydata/xarray/issues/8491#issuecomment-1834615826 |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8494/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
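A minimal sketch of what the issue asks for, assuming pytest-style tests (the warning message and category are placeholders, not xarray's actual text): silence an expected warning exactly where it is produced instead of letting it accumulate in the test output.

```python
import warnings

import pytest


def test_open_file_with_duplicate_dims():
    with warnings.catch_warnings():
        warnings.filterwarnings(
            "ignore",
            message="duplicate dimension names",  # placeholder pattern
            category=UserWarning,
        )
        ...  # code that is known to emit the expected warning


# Or declaratively, per test:
@pytest.mark.filterwarnings("ignore:duplicate dimension names")
def test_open_file_with_duplicate_dims_marker():
    ...
```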
2261855627 | PR_kwDOAMm_X85togwQ | 8969 | CI: python 3.12 by default. | dcherian 2448579 | closed | 0 | 2 | 2024-04-24T17:49:25Z | 2024-04-29T16:21:20Z | 2024-04-29T16:21:08Z | MEMBER | 0 | pydata/xarray/pulls/8969 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8969/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2118308210 | I_kwDOAMm_X85-QtFy | 8707 | Weird interaction between aggregation and multiprocessing on DaskArrays | saschahofmann 24508496 | closed | 0 | 10 | 2024-02-05T11:35:28Z | 2024-04-29T16:20:45Z | 2024-04-29T16:20:44Z | CONTRIBUTOR | What happened?When I try to run a modified version of the example from the dropna documentation (see below), it creates a never terminating process. To reproduce it I added a rolling operation before dropping nans and then run 4 processes using the standard library multiprocessing What did you expect to happen?There is nothing obvious to me why this wouldn't just work unless there is a weird interaction between the Dask threads and the different processes. Using Xarray+Dask+Multiprocessing seems to work for me on other functions, it seems to be this particular combination that is problematic. Minimal Complete Verifiable Example```Python import xarray as xr import numpy as np from multiprocessing import Pool datasets = [xr.Dataset( { "temperature": ( ["time", "location"], [[23.4, 24.1], [np.nan if i>1 else 23.4, 22.1 if i<2 else np.nan], [21.8 if i<3 else np.nan, 24.2], [20.5, 25.3]], ) }, coords={"time": [1, 2, 3, 4], "location": ["A", "B"]}, ).chunk(time=2) for i in range(4)] def process(dataset): return dataset.rolling(dim={'time':2}).sum().dropna(dim="time", how="all").compute() This works as expecteddropped = [] for dataset in datasets: dropped.append(process(dataset)) This seems to never finishwith Pool(4) as p: dropped = p.map(process, datasets) ``` MVCE confirmation
Relevant log outputNo response Anything else we need to know?I am still running on 2023.08.0 see below for more details about the environment Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.11.6 (main, Jan 25 2024, 20:42:03) [GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-124-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.9.3-development
xarray: 2023.8.0
pandas: 2.1.4
numpy: 1.26.3
scipy: 1.12.0
netCDF4: 1.6.5
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.16.1
cftime: 1.6.3
nc_time_axis: 1.4.1
PseudoNetCDF: None
iris: None
bottleneck: 1.3.7
dask: 2024.1.1
distributed: 2024.1.1
matplotlib: 3.8.2
cartopy: 0.22.0
seaborn: None
numbagg: None
fsspec: 2023.12.2
cupy: None
pint: 0.23
sparse: None
flox: 0.9.0
numpy_groupies: 0.10.2
setuptools: 69.0.3
pip: 23.2.1
conda: None
pytest: 8.0.0
mypy: None
IPython: 8.18.1
sphinx: None
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8707/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
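A hedged sketch, not a fix taken from the issue thread: hangs like this often come from fork-based worker processes inheriting locked thread state from dask's threaded scheduler. Using the "spawn" start method and keeping dask single-threaded inside the workers is one way to probe that; all names below are illustrative.

```python
import multiprocessing as mp

import numpy as np
import xarray as xr


def make_ds(i):
    return xr.Dataset(
        {"temperature": (["time", "location"], np.random.rand(4, 2))},
        coords={"time": [1, 2, 3, 4], "location": ["A", "B"]},
    ).chunk(time=2)


def process(dataset):
    return (
        dataset.rolling(dim={"time": 2})
        .sum()
        .dropna(dim="time", how="all")
        .compute(scheduler="single-threaded")  # keep dask threads out of the workers
    )


if __name__ == "__main__":
    datasets = [make_ds(i) for i in range(4)]
    ctx = mp.get_context("spawn")  # avoid fork()-inherited thread state
    with ctx.Pool(4) as p:
        dropped = p.map(process, datasets)
```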
2267711587 | PR_kwDOAMm_X85t8VWy | 8978 | more engine environment tricks in preparation for `numpy>=2` | keewis 14808389 | closed | 0 | 7 | 2024-04-28T17:54:38Z | 2024-04-29T14:56:22Z | 2024-04-29T14:56:21Z | MEMBER | 0 | pydata/xarray/pulls/8978 | Turns out And finally, the
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8978/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2262478932 | PR_kwDOAMm_X85tqpUi | 8974 | Raise errors on new warnings from within xarray | max-sixty 5635139 | closed | 0 | 2 | 2024-04-25T01:50:48Z | 2024-04-29T12:18:42Z | 2024-04-29T02:50:21Z | MEMBER | 0 | pydata/xarray/pulls/8974 | Notes are inline.
Done with some help from an LLM — quite good for doing tedious tasks that we otherwise wouldn't want to do — can paste in all the warnings output and get a decent start on rules for exclusions |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8974/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1997537503 | PR_kwDOAMm_X85fqp3A | 8459 | Check for aligned chunks when writing to existing variables | max-sixty 5635139 | closed | 0 | 5 | 2023-11-16T18:56:06Z | 2024-04-29T03:05:36Z | 2024-03-29T14:35:50Z | MEMBER | 0 | pydata/xarray/pulls/8459 | While I don't feel super confident that this is designed to protect against any bugs, it does solve the immediate problem in #8371, by hoisting the encoding check above the code that runs for only new variables. The encoding check is somewhat implicit, so this was an easy thing to miss prior.
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8459/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1574694462 | I_kwDOAMm_X85d2-4- | 7513 | intermittent failures with h5netcdf, h5py on macos | dcherian 2448579 | closed | 0 | 5 | 2023-02-07T16:58:43Z | 2024-04-28T23:35:21Z | 2024-04-28T23:35:21Z | MEMBER | What is your issue?cc @hmaarrfk @kmuehlbauer Passed: https://github.com/pydata/xarray/actions/runs/4115923717/jobs/7105298426 Failed: https://github.com/pydata/xarray/actions/runs/4115946392/jobs/7105345290 Versions:
``` =================================== FAILURES =================================== ___ test_open_mfdataset_manyfiles[h5netcdf-20-True-5-5] ______ [gw1] darwin -- Python 3.10.9 /Users/runner/micromamba-root/envs/xarray-tests/bin/python readengine = 'h5netcdf', nfiles = 20, parallel = True, chunks = 5 file_cache_maxsize = 5
/Users/runner/work/xarray/xarray/xarray/tests/test_backends.py:3267: /Users/runner/work/xarray/xarray/xarray/backends/api.py:991: in open_mfdataset datasets, closers = dask.compute(datasets, closers) /Users/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/base.py:599: in compute results = schedule(dsk, keys, kwargs) /Users/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/threaded.py:89: in get results = get_async( /Users/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/local.py:511: in get_async raise_exception(exc, tb) /Users/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/local.py:319: in reraise raise exc /Users/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/local.py:224: in execute_task result = _execute_task(task, data) /Users/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/core.py:119: in _execute_task return func((_execute_task(a, cache) for a in args)) /Users/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/utils.py:72: in apply return func(args, kwargs) /Users/runner/work/xarray/xarray/xarray/backends/api.py:526: in open_dataset backend_ds = backend.open_dataset( /Users/runner/work/xarray/xarray/xarray/backends/h5netcdf_.py:417: in open_dataset ds = store_entrypoint.open_dataset( /Users/runner/work/xarray/xarray/xarray/backends/store.py:32: in open_dataset vars, attrs = store.load() /Users/runner/work/xarray/xarray/xarray/backends/common.py:129: in load (decode_variable_name(k), v) for k, v in self.get_variables().items() /Users/runner/work/xarray/xarray/xarray/backends/h5netcdf.py:220: in get_variables return FrozenDict( /Users/runner/work/xarray/xarray/xarray/core/utils.py:471: in FrozenDict return Frozen(dict(args, *kwargs)) /Users/runner/work/xarray/xarray/xarray/backends/h5netcdf_.py:221: in <genexpr> (k, self.open_store_variable(k, v)) for k, v in self.ds.variables.items() /Users/runner/work/xarray/xarray/xarray/backends/h5netcdf_.py:200: in open_store_variable elif var.compression is not None: /Users/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/h5netcdf/core.py:394: in compression return self._h5ds.compression self = <[AttributeError("'NoneType' object has no attribute '_root'") raised in repr()] Variable object at 0x151378970>
``` |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7513/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
1579956621 | I_kwDOAMm_X85eLDmN | 7519 | Selecting variables from Dataset with view on dict keys is of type DataArray | derhintze 25172489 | closed | 0 | 7 | 2023-02-10T16:02:19Z | 2024-04-28T21:01:28Z | 2024-04-28T21:01:27Z | NONE | What happened?When selecting variables from a Dataset using a view on dict keys, the type returned is a DataArray, whereas the same using a list is a Dataset. What did you expect to happen?The type returned should be a Dataset. Minimal Complete Verifiable Example```Python import xarray as xr d = {"a": ("dim", range(1, 4)), "b": ("dim", range(2, 5))} data = xr.Dataset(d) select_dict = data[d.keys()] select_list = data[list(d)] reveal_type(select_dict) reveal_type(select_list) ``` MVCE confirmation
Relevant log output
Anything else we need to know?No response Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.9.10 (main, Mar 15 2022, 15:56:56)
[GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-1160.49.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.9.0
xarray: 2022.12.0
pandas: 1.5.2
numpy: 1.23.5
scipy: 1.10.0
netCDF4: 1.6.2
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.6.3
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 58.1.0
pip: 23.0
conda: None
pytest: 7.2.1
mypy: 0.991
IPython: 8.8.0
sphinx: None
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7519/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
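A small sketch of the obvious workaround (materialise the keys view before indexing), which is implied by the report rather than stated in it:

```python
import xarray as xr

d = {"a": ("dim", range(1, 4)), "b": ("dim", range(2, 5))}
data = xr.Dataset(d)

subset = data[list(d.keys())]  # a list of names selects a Dataset
print(type(subset))            # <class 'xarray.core.dataset.Dataset'>
```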
1024011835 | I_kwDOAMm_X849CS47 | 5857 | Incorrect results when using xarray.ufuncs.angle(..., deg=True) | cvr 1119116 | closed | 0 | 4 | 2021-10-12T16:24:11Z | 2024-04-28T20:58:55Z | 2024-04-28T20:58:54Z | NONE | What happened: The What you expected to happen: To have the result of Minimal Complete Verifiable Example: ```python Put your MCVE code hereimport numpy as np import xarray as xr ds = xr.Dataset(coords={'wd': ('wd', np.arange(0, 360, 30, dtype=float))}) Z = xr.ufuncs.exp(1j * xr.ufuncs.radians(ds.wd)) D = xr.ufuncs.angle(Z, deg=True) # YIELDS INCORRECT RESULTS if not np.allclose(ds.wd, (D % 360)): print(f"Issue with angle operation: {D.values%360} instead of {ds.wd.values}" \ + f"\n\tERROR xr.ufuncs.angle(Z, deg=True) gives incorrect results !!!") D = xr.ufuncs.degrees(xr.ufuncs.angle(Z)) # Works OK if not np.allclose(ds.wd, (D % 360)): print(f"Issue with angle operation: {D%360} instead of {ds.wd}" \ + f"\n\tERROR xr.ufuncs.degrees(xr.ufuncs.angle(Z)) gives incorrect results!!!") D = xr.apply_ufunc(np.angle, Z, kwargs={'deg': True}) # Works OK if not np.allclose(ds.wd, (D % 360)): print(f"Issue with angle operation: {D%360} instead of {ds.wd}" \ + f"\n\tERROR xr.apply_ufunc(np.angle, Z, kwargs={{'deg': True}}) gives incorrect results!!!") ``` Anything else we need to know?: Though ```python import numpy as np import xarray as xr ds = xr.Dataset(coords={'wd': ('wd', np.arange(0, 360, 30, dtype=float))}) Z = np.exp(1j * np.radians(ds.wd)) print(Z) print(f"Is Z an XArray? {isinstance(Z, xr.DataArray)}") D = np.angle(ds.wd, deg=True)
print(D)
print(f"Is D an XArray? {isinstance(D, xr.DataArray)}")
Environment: No issues with xarray versions 0.16.2 and 0.17.0. This error happens from 0.18.0 onwards, up to 0.19.0 (recentmost). Output of <tt>xr.show_versions()</tt>INSTALLED VERSIONS ------------------ commit: None python: 3.7.10 (default, Feb 26 2021, 18:47:35) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: 4.19.0-18-amd64 machine: x86_64 processor: byteorder: little LC_ALL: en_US.utf8 LANG: C.UTF-8 LOCALE: ('en_US', 'UTF-8') libhdf5: None libnetcdf: None xarray: 0.19.0 pandas: 1.2.3 numpy: 1.20.2 scipy: 1.5.3 netCDF4: None pydap: None h5netcdf: None h5py: None Nio: None zarr: None cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: None distributed: None matplotlib: None cartopy: None seaborn: None numbagg: None pint: None setuptools: 58.2.0 pip: 21.3 conda: 4.10.3 pytest: None IPython: None sphinx: None |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5857/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
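The report itself notes that routing `np.angle` through `apply_ufunc` gives the expected degrees; a minimal reproduction of that workaround (relevant mainly to the affected 0.18-0.19 releases, since `xarray.ufuncs` was, as far as I know, deprecated and later removed):

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(coords={"wd": ("wd", np.arange(0, 360, 30, dtype=float))})
Z = np.exp(1j * np.radians(ds.wd))

D = xr.apply_ufunc(np.angle, Z, kwargs={"deg": True})  # degrees, as expected
assert np.allclose(ds.wd, D % 360)
```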
1518812301 | I_kwDOAMm_X85ahzyN | 7414 | Error using xarray.interp - function signature does not match with scipy.interpn | Florian1209 20089326 | closed | 0 | 2 | 2023-01-04T11:30:48Z | 2024-04-28T20:55:33Z | 2024-04-28T20:55:33Z | NONE | What happened?I am experiencing an error when using the array.interp function. The error message indicates that the function signature does not match with scipy interpn. It 's linked to scipy update 1.10.0 (2023/01/03). What did you expect to happen?I would interpolate 2D data of numpy float64 : two data lattitudes and longitudes following <xarray.DataArray (row: 32, col: 32)>.
da is a xarray dataset :
Minimal Complete Verifiable Example
MVCE confirmation
Relevant log output```Python interpolated_da = da.interp( venv/lib/python3.8/site-packages/xarray/core/dataset.py:3378: in interp variables[name] = missing.interp(var, var_indexers, method, kwargs) venv/lib/python3.8/site-packages/xarray/core/missing.py:639: in interp interped = interp_func( venv/lib/python3.8/site-packages/xarray/core/missing.py:764: in interp_func return _interpnd(var, x, new_x, func, kwargs) venv/lib/python3.8/site-packages/xarray/core/missing.py:788: in _interpnd rslt = func(x, var, xi, kwargs) venv/lib/python3.8/site-packages/scipy/interpolate/_rgi.py:654: in interpn return interp(xi) venv/lib/python3.8/site-packages/scipy/interpolate/_rgi.py:336: in call result = evaluate_linear_2d(self.values,
_rgi_cython.pyx:19: TypeError ``` Anything else we need to know?No response EnvironmentINSTALLED VERSIONScommit: None python: 3.8.10 (default, Nov 14 2022, 12:59:47) [GCC 9.4.0] python-bits: 64 OS: Linux OS-release: 5.4.0-135-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: ('en_US', 'UTF-8') libhdf5: 1.12.2 libnetcdf: 4.9.0 xarray: 2022.12.0 pandas: 1.5.2 numpy: 1.22.4 scipy: 1.10.0 netCDF4: 1.6.2 pydap: None h5netcdf: None h5py: None Nio: None zarr: None cftime: 1.6.2 nc_time_axis: None PseudoNetCDF: None rasterio: 1.3.4 cfgrib: None iris: None bottleneck: None dask: 2022.12.1 distributed: 2022.12.1 matplotlib: 3.6.2 cartopy: None seaborn: None numbagg: None fsspec: 2022.11.0 cupy: None pint: None sparse: None flox: None numpy_groupies: None setuptools: 65.6.3 pip: 22.3.1 conda: None pytest: 7.2.0 mypy: None IPython: 8.7.0 sphinx: 5.3.0 None |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7414/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
1039113959 | I_kwDOAMm_X849757n | 5913 | Invalid characters in OpenDAP URL | pmartineauGit 57886986 | closed | 0 | 5 | 2021-10-29T02:54:14Z | 2024-04-28T20:55:17Z | 2024-04-28T20:55:17Z | NONE | Hello, I have successfully opened an OpenDAP URL with ds = xarray.open_dataset(url) However, after selecting a subset with ds = ds.isel(time=0) and attempting to load the data with ds.load(), I get the following error: HTTP Status 400 – Bad Request: Invalid character found in the request
I suspect the reason is that square brackets are passed in the URL when attempting to load: ...zg_6hrPlevPt_MIROC6_historical_r1i1p1f1_gn_185001010600-185101010000.nc.dods?zg.zg[0][0:6][0:127][0:255]] because of the index selection with .isel() In fact, some servers do forbid square brackets: https://www.unidata.ucar.edu/mailing_lists/archives/thredds/2020/msg00056.html Would it be possible to provide an option to encode URLs? ( [ becomes %5B, and ] becomes %5D ) Or, instead of loading directly with ds.load(), is there a way for me to retrieve the URL with offending brackets that is generated automatically by xarray, encode it myself, and then use ds2 = xarray.load_dataset(encoded_url) to load? Thank you for your help! |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5913/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
not_planned | xarray 13221727 | issue | ||||||
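A hedged sketch of the percent-encoding the reporter asks about, not a confirmed xarray option: build the OPeNDAP constraint expression by hand and encode the brackets before handing the URL to the client. The server URL and variable name are placeholders.

```python
from urllib.parse import quote

base = "https://example.org/thredds/dodsC/zg_6hrPlevPt.nc"
constraint = "zg[0][0:6][0:127][0:255]"
encoded_url = base + "?" + quote(constraint, safe=":,")  # '[' -> %5B, ']' -> %5D

print(encoded_url)
# Whether xarray/pydap will accept and forward such a pre-encoded URL unchanged
# depends on the backend and server; this only shows the encoding step itself.
```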
1244977848 | I_kwDOAMm_X85KNNq4 | 6629 | `plot.imshow` with datetime coordinate fails | shaharkadmiel 6872529 | closed | 0 | 5 | 2022-05-23T10:56:46Z | 2024-04-28T20:16:44Z | 2024-04-28T20:16:44Z | NONE | What happened?When trying to plot a 2d DataArray that has one of the 2 coordinates as datetime with
I know that I can use Here is a minimal working example: ```python import numpy as np from xarray import DataArray from pandas import date_range time = date_range('2020-01-01', periods=7, freq='D') y = np.linspace(0, 10, 11) da = DataArray( np.random.rand(time.size, y.size), coords=dict(time=time, y=y), dims=('time', 'y') ) da.plot.imshow(x='time', y='y') ``` What did you expect to happen?I suggest the following solution which can be added after https://github.com/pydata/xarray/blob/4da7fdbd85bb82e338ad65a532dd7a9707e18ce0/xarray/plot/plot.py#L1366
and then adding:
Minimal Complete Verifiable Example```Python import numpy as np from xarray import DataArray from pandas import date_range creating the datatime = date_range('2020-01-01', periods=7, freq='D') y = np.linspace(0, 10, 11) da = DataArray( np.random.rand(time.size, y.size), coords=dict(time=time, y=y), dims=('time', 'y') ) import matplotlib.pyplot as plt from matplotlib.dates import date2num, AutoDateFormatter from https://github.com/pydata/xarray/blob/4da7fdbd85bb82e338ad65a532dd7a9707e18ce0/xarray/plot/plot.py#L1348def _center_pixels(x): """Center the pixels on the coordinates.""" if np.issubdtype(x.dtype, str): # When using strings as inputs imshow converts it to # integers. Choose extent values which puts the indices in # in the center of the pixels: return 0 - 0.5, len(x) - 0.5
Center the pixels:left, right = _center_pixels(da.time) top, bottom = _center_pixels(da.y) the magical stepvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvleft, right = map(date2num, (left, right)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^plottingfig, ax = plt.subplots() ax.imshow( da.T, extent=(left, right, top, bottom), origin='lower', aspect='auto' ) ax.xaxis_date() plt.setp(ax.get_xticklabels(), rotation=30, ha='right') ``` MVCE confirmation
Relevant log output```PythonTypeError Traceback (most recent call last) /var/folders/bj/czjbfh496258q1lc3p01lyz00000gn/T/ipykernel_59425/1460104966.py in <module> ----> 1 da.plot.imshow(x='time', y='y') ~/miniconda3/lib/python3.8/site-packages/xarray/plot/plot.py in plotmethod(_PlotMethods_obj, x, y, figsize, size, aspect, ax, row, col, col_wrap, xincrease, yincrease, add_colorbar, add_labels, vmin, vmax, cmap, colors, center, robust, extend, levels, infer_intervals, subplot_kws, cbar_ax, cbar_kwargs, xscale, yscale, xticks, yticks, xlim, ylim, norm, kwargs) 1306 for arg in ["_PlotMethods_obj", "newplotfunc", "kwargs"]: 1307 del allargs[arg] -> 1308 return newplotfunc(allargs) 1309 1310 # Add to class _PlotMethods ~/miniconda3/lib/python3.8/site-packages/xarray/plot/plot.py in newplotfunc(darray, x, y, figsize, size, aspect, ax, row, col, col_wrap, xincrease, yincrease, add_colorbar, add_labels, vmin, vmax, cmap, center, robust, extend, levels, infer_intervals, colors, subplot_kws, cbar_ax, cbar_kwargs, xscale, yscale, xticks, yticks, xlim, ylim, norm, kwargs) 1208 ax = get_axis(figsize, size, aspect, ax, subplot_kws) 1209 -> 1210 primitive = plotfunc( 1211 xplt, 1212 yplt, ~/miniconda3/lib/python3.8/site-packages/xarray/plot/plot.py in imshow(x, y, z, ax, kwargs) 1394 z[np.any(z.mask, axis=-1), -1] = 0 1395 -> 1396 primitive = ax.imshow(z, defaults) 1397 1398 # If x or y are strings the ticklabels have been replaced with ~/miniconda3/lib/python3.8/site-packages/matplotlib/_api/deprecation.py in wrapper(args, kwargs) 454 "parameter will become keyword-only %(removal)s.", 455 name=name, obj_type=f"parameter of {func.name}()") --> 456 return func(args, kwargs) 457 458 # Don't modify func's signature, as boilerplate.py needs it. ~/miniconda3/lib/python3.8/site-packages/matplotlib/init.py in inner(ax, data, args, kwargs) 1410 def inner(ax, args, data=None, kwargs): 1411 if data is None: -> 1412 return func(ax, *map(sanitize_sequence, args), kwargs) 1413 1414 bound = new_sig.bind(ax, args, *kwargs) ~/miniconda3/lib/python3.8/site-packages/matplotlib/axes/_axes.py in imshow(self, X, cmap, norm, aspect, interpolation, alpha, vmin, vmax, origin, extent, interpolation_stage, filternorm, filterrad, resample, url, **kwargs) 5450 # update ax.dataLim, and, if autoscaling, set viewLim 5451 # to tightly fit the image, regardless of dataLim. -> 5452 im.set_extent(im.get_extent()) 5453 5454 self.add_image(im) ~/miniconda3/lib/python3.8/site-packages/matplotlib/image.py in set_extent(self, extent) 980 self._extent = xmin, xmax, ymin, ymax = extent 981 corners = (xmin, ymin), (xmax, ymax) --> 982 self.axes.update_datalim(corners) 983 self.sticky_edges.x[:] = [xmin, xmax] 984 self.sticky_edges.y[:] = [ymin, ymax] ~/miniconda3/lib/python3.8/site-packages/matplotlib/axes/_base.py in update_datalim(self, xys, updatex, updatey) 2474 """ 2475 xys = np.asarray(xys) -> 2476 if not np.any(np.isfinite(xys)): 2477 return 2478 self.dataLim.update_from_data_xy(xys, self.ignore_existing_data_limits, TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'' ``` Anything else we need to know?No response Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.12 | packaged by conda-forge | (default, Oct 12 2021, 21:21:17)
[Clang 11.1.0 ]
python-bits: 64
OS: Darwin
OS-release: 21.4.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: None
LOCALE: (None, 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1
xarray: 2022.3.0
pandas: 1.3.4
numpy: 1.21.4
scipy: 1.7.3
netCDF4: 1.5.8
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.10.3
cftime: 1.6.0
nc_time_axis: 1.4.0
PseudoNetCDF: None
rasterio: None
cfgrib: 0.9.8.5
iris: None
bottleneck: 1.3.2
dask: 2022.04.0
distributed: 2022.4.0
matplotlib: 3.5.0
cartopy: 0.20.2
seaborn: 0.11.2
numbagg: None
fsspec: 2022.5.0
cupy: None
pint: 0.18
sparse: None
setuptools: 62.3.2
pip: 22.1.1
conda: 4.12.0
pytest: None
IPython: 7.30.1
sphinx: None
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6629/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
803075280 | MDU6SXNzdWU4MDMwNzUyODA= | 4880 | Datetime as coordinates does not convert back to datetime (returns int) | feefladder 33122845 | closed | 0 | 6 | 2021-02-07T22:20:11Z | 2024-04-28T20:13:33Z | 2024-04-28T20:13:32Z | CONTRIBUTOR | What happened:
datetime was in
```python
# Put your MCVE code here
import xarray as xr
import numpy as np
import datetime
import pandas as pd  # needed for pd.date_range below
date_frame = xr.DataArray(dims='time',coords={'time':pd.date_range('2000-01-01',periods=365)},data=np.zeros(365))
print('pandas date range (datetime): ',pd.date_range('2000-01-01',periods=365)[0])
print('dataframe datetime converted to datetime (int): ',date_frame.coords['time'].data[0].astype(datetime.datetime))
print("normal numpy datetime64 converted to datetime (datetime): ",np.datetime64(datetime.datetime(2000,1,1)).astype(datetime.datetime))
```
if converted to int, it also gives different lengths of int : date_frame: 946684800000000000 946684800000000 normal datetime64^ Anything else we need to know?: it is also mentioned in this SO thread appears to be a problem in the datetime64.... numpy version 1.20.0 pandas version 1.2.1 Environment: Output of <tt>xr.show_versions()</tt>INSTALLED VERSIONS ------------------ commit: None python: 3.7.9 | packaged by conda-forge | (default, Dec 9 2020, 21:08:20) [GCC 9.3.0] python-bits: 64 OS: Linux OS-release: 5.4.0-59-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: None libnetcdf: None xarray: 0.16.2 pandas: 1.2.1 numpy: 1.20.0 scipy: None netCDF4: None pydap: None h5netcdf: None h5py: None Nio: None zarr: 2.6.1 cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2021.01.1 distributed: 2021.01.1 matplotlib: None cartopy: None seaborn: None numbagg: None pint: None setuptools: 49.6.0.post20210108 pip: 21.0.1 conda: None pytest: None IPython: 7.20.0 sphinx: None |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/4880/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
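A hedged sketch of the underlying numpy behaviour and two ways around it (not an xarray API): a nanosecond-precision `datetime64` cast with `.astype(datetime.datetime)` yields a plain int, while casting to microsecond precision first, or going through pandas, returns a real `datetime` object.

```python
import datetime

import numpy as np
import pandas as pd
import xarray as xr

da = xr.DataArray(
    np.zeros(3),
    dims="time",
    coords={"time": pd.date_range("2000-01-01", periods=3)},
)

t0 = da.time.values[0]               # numpy.datetime64 with ns resolution
print(t0.astype(datetime.datetime))  # int (nanoseconds since the epoch)
print(t0.astype("datetime64[us]").astype(datetime.datetime))  # datetime.datetime
print(pd.Timestamp(t0).to_pydatetime())                       # datetime.datetime
```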
1402002645 | I_kwDOAMm_X85TkNzV | 7146 | Segfault writing large netcdf files to s3fs | d1mach 11075246 | closed | 0 | 17 | 2022-10-08T16:56:31Z | 2024-04-28T20:11:59Z | 2024-04-28T20:11:59Z | NONE | What happened?It seems netcdf4 does not work well currently with Here is an example
The output with ``` There are 1 HDF5 objects open! Report: open objects on 72057594037927936 Segmentation fault (core dumped) ``` I have tried the other engine that handles NETCDF4 in xarray with A quick workaround seems to be to use the local filesystem to write the NetCDF file and then move the complete file to S3.
What did you expect to happen?With NTIMES=24 I am getting a file Minimal Complete Verifiable Example
MVCE confirmation
Relevant log output```Python There are 1 HDF5 objects open! Report: open objects on 72057594037927936 Segmentation fault (core dumped) ``` Anything else we need to know?No response Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.3 | packaged by conda-forge | (default, Jun 1 2020, 17:43:00)
[GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-26-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: None
LOCALE: en_US.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4
xarray: 0.16.1
pandas: 1.1.3
numpy: 1.19.1
scipy: 1.5.2
netCDF4: 1.5.4
pydap: None
h5netcdf: 1.0.2
h5py: 3.1.0
Nio: None
zarr: None
cftime: 1.2.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.30.0
distributed: None
matplotlib: 3.3.1
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 50.3.0.post20201006
pip: 20.2.3
conda: 22.9.0
pytest: 6.1.1
IPython: 7.18.1
sphinx: None
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7146/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
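A sketch of the workaround the reporter mentions (write locally, then move the finished file to S3), with placeholder bucket and helper names and no claim that this is the issue's eventual resolution:

```python
import tempfile
from pathlib import Path

import s3fs
import xarray as xr


def to_netcdf_s3(ds: xr.Dataset, s3_uri: str) -> None:
    """Write the dataset to a local temp file, then upload the complete file."""
    fs = s3fs.S3FileSystem()
    with tempfile.TemporaryDirectory() as tmp:
        local_path = Path(tmp) / "out.nc"
        ds.to_netcdf(local_path, engine="netcdf4")
        fs.put(str(local_path), s3_uri)


# to_netcdf_s3(ds, "s3://my-bucket/out.nc")
```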
2132500634 | I_kwDOAMm_X85_G2Ca | 8742 | Y-axis is reversed when using to_zarr() | alistaireverett 7837535 | closed | 0 | 3 | 2024-02-13T14:48:30Z | 2024-04-28T20:08:13Z | 2024-04-28T20:08:13Z | NONE | What happened?When I export a dataset to NetCDF and Zarr, the y axis appears to have been reversed with gdalinfo. I also cannot build a vrt file with the Zarr file since it complains about positive NS axis, but this works fine with the NetCDF file. Example NetCDF file as input: in.nc.zip gdalinfo on output NetCDF file:
gdalinfo on output Zarr file:
The main issue is that the origin and y-axis direction is reversed, as you can see from the origin and pixel size. I have tried taking the CRS from the netcdf and adding it to the Zarr file as a What did you expect to happen?Origin, pixel size and corner coords should match those in the netcdf file.
Minimal Complete Verifiable Example
```Python
import xarray as xr
from pyproj import CRS

ds = xr.open_dataset("in.nc")

# Optionally copy the CRS to the Zarr (produces an error, but does work)
crs_wkt = CRS.from_cf(ds["projection_lambert"].attrs).to_wkt()
ds["air_temperature_2m"] = ds["air_temperature_2m"].assign_attrs(_CRS={"wkt": crs_wkt})

ds.to_zarr("out.zarr")
ds.to_netcdf("out.nc")
```
MVCE confirmation
Relevant log outputNo response Anything else we need to know?No response Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.11.7 (main, Dec 15 2023, 18:12:31) [GCC 11.2.0]
python-bits: 64
OS: Linux
OS-release: 6.5.0-15-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.9.3-development
xarray: 2024.1.1
pandas: 2.2.0
numpy: 1.26.4
scipy: None
netCDF4: 1.6.5
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.16.1
cftime: 1.6.3
nc_time_axis: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 68.2.2
pip: 23.3.1
conda: None
pytest: None
mypy: None
IPython: None
sphinx: None
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8742/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
2021386895 | PR_kwDOAMm_X85g7QZD | 8500 | Deprecate ds.dims returning dict | TomNicholas 35968931 | closed | 0 | 1 | 2023-12-01T18:29:28Z | 2024-04-28T20:04:00Z | 2023-12-06T17:52:24Z | MEMBER | 0 | pydata/xarray/pulls/8500 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8500/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2115555965 | I_kwDOAMm_X85-GNJ9 | 8695 | Return a 3D object alongside 1D object in apply_ufunc | ahuang11 15331990 | closed | 0 | 7 | 2024-02-02T18:47:14Z | 2024-04-28T19:59:31Z | 2024-04-28T19:59:31Z | CONTRIBUTOR | Is your feature request related to a problem?Currently, I have something similar to this, where the Since Any ideas on how I can modify this to make it more efficient? ```python import xarray as xr import numpy as np air = xr.tutorial.open_dataset("air_temperature")["air"] input_lat = np.arange(20, 45) def interp1d_np(data, base_lat, input_lat): new_lat = input_lat + 0.25 return np.interp(new_lat, base_lat, data), new_lat ds, new_lat = xr.apply_ufunc( interp1d_np, # first the function air, air.lat, # as above input_lat, # as above input_core_dims=[["lat"], ["lat"], ["lat"]], # list with one entry per arg output_core_dims=[["lat"], ["lat"]], # returned data has one dimension exclude_dims=set(("lat",)), # dimensions allowed to change size. Must be a set! vectorize=True, # loop over non-core dims ) new_lat = new_lat.isel(lon=0, time=0).values ds["lat"] = new_lat ``` Describe the solution you'd likeEither be able to automatically assign the new_lat to the returned xarray object, or allow a 1D dataset to be returned Describe alternatives you've consideredNo response Additional contextNo response |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8695/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
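One hedged way to avoid carrying the broadcast 3-D latitude output around (my sketch, not necessarily the answer given in the thread): since the new grid does not depend on the data, compute it once outside `apply_ufunc` and return only the interpolated values. Synthetic data is used here instead of the tutorial dataset.

```python
import numpy as np
import xarray as xr

air = xr.DataArray(
    np.random.rand(3, 25, 4),
    dims=("time", "lat", "lon"),
    coords={"lat": np.linspace(20, 44, 25)},
)
new_lat = np.arange(20, 44) + 0.25  # target grid, a plain 1-D array


def interp1d_np(data, base_lat):
    return np.interp(new_lat, base_lat, data)


out = xr.apply_ufunc(
    interp1d_np,
    air,
    air.lat,
    input_core_dims=[["lat"], ["lat"]],
    output_core_dims=[["lat"]],
    exclude_dims={"lat"},  # "lat" changes size
    vectorize=True,
)
out = out.assign_coords(lat=new_lat)  # attach the 1-D coordinate once, at the end
```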
576337745 | MDU6SXNzdWU1NzYzMzc3NDU= | 3831 | Errors using to_zarr for an s3 store | JarrodBWong 15351025 | closed | 0 | 15 | 2020-03-05T15:30:40Z | 2024-04-28T19:59:02Z | 2024-04-28T19:59:02Z | NONE | Hello,
I have been trying to write zarr files from xarray directly into an s3 store but keep getting errors for missing arrays. It looks like the structure of the zarr archive is created in my s3 bucket, I can see MCVE Code Sample```python s3 = s3fs.S3FileSystem(anon=False) store= s3fs.S3Map(root=f's3://my-bucket/data.zarr', s3=s3, check=False) ds.to_zarr(store=store, consolidated=True, mode='w') ``` OutputThe variable name of the array changes by the run, it's not always the same one that it says is missing. logs-------------------------------------------------------------------------- NoSuchKey Traceback (most recent call last) /opt/conda/lib/python3.7/site-packages/s3fs/core.py in _fetch_range(client, bucket, key, version_id, start, end, max_attempts, req_kw) 1196 Range='bytes=%i-%i' % (start, end - 1), -> 1197 **kwargs) 1198 return resp['Body'].read() ~/.local/lib/python3.7/site-packages/botocore/client.py in _api_call(self, *args, **kwargs) 315 # The "self" in this scope is referring to the BaseClient. --> 316 return self._make_api_call(operation_name, kwargs) 317 ~/.local/lib/python3.7/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params) 625 error_class = self.exceptions.from_code(error_code) --> 626 raise error_class(parsed_response, operation_name) 627 else: NoSuchKey: An error occurred (NoSuchKey) when calling the GetObject operation: The specified key does not exist. During handling of the above exception, another exception occurred: FileNotFoundError Traceback (most recent call last) /opt/conda/lib/python3.7/site-packages/fsspec/mapping.py in __getitem__(self, key, default) 75 try: ---> 76 result = self.fs.cat(key) 77 except: # noqa: E722 /opt/conda/lib/python3.7/site-packages/fsspec/spec.py in cat(self, path) 545 """ Get the content of a file """ --> 546 return self.open(path, "rb").read() 547 /opt/conda/lib/python3.7/site-packages/fsspec/spec.py in read(self, length) 1129 return b"" -> 1130 out = self.cache._fetch(self.loc, self.loc + length) 1131 self.loc += len(out) /opt/conda/lib/python3.7/site-packages/fsspec/caching.py in _fetch(self, start, end) 338 # First read, or extending both before and after --> 339 self.cache = self.fetcher(start, bend) 340 self.start = start /opt/conda/lib/python3.7/site-packages/s3fs/core.py in _fetch_range(self, start, end) 1059 def _fetch_range(self, start, end): -> 1060 return _fetch_range(self.fs.s3, self.bucket, self.key, self.version_id, start, end, req_kw=self.req_kw) 1061 /opt/conda/lib/python3.7/site-packages/s3fs/core.py in _fetch_range(client, bucket, key, version_id, start, end, max_attempts, req_kw) 1212 return b'' -> 1213 raise translate_boto_error(e) 1214 except Exception as e: FileNotFoundError: The specified key does not exist. 
During handling of the above exception, another exception occurred: KeyError Traceback (most recent call last) /opt/conda/lib/python3.7/site-packages/zarr/core.py in _load_metadata_nosync(self) 149 mkey = self._key_prefix + array_meta_key --> 150 meta_bytes = self._store[mkey] 151 except KeyError: /opt/conda/lib/python3.7/site-packages/fsspec/mapping.py in __getitem__(self, key, default) 79 return default ---> 80 raise KeyError(key) 81 return result KeyError: 'my-bucket/data.zarr/lv_HTGL7_l1/.zarray' During handling of the above exception, another exception occurred: ValueError Traceback (most recent call last) <ipython-input-7-c21938cc83d3> in <module> 7 ds.to_zarr(store=s3_store_dest, 8 consolidated=True, ----> 9 mode='w') /opt/conda/lib/python3.7/site-packages/xarray/core/dataset.py in to_zarr(self, store, mode, synchronizer, group, encoding, compute, consolidated, append_dim) 1623 compute=compute, 1624 consolidated=consolidated, -> 1625 append_dim=append_dim, 1626 ) 1627 /opt/conda/lib/python3.7/site-packages/xarray/backends/api.py in to_zarr(dataset, store, mode, synchronizer, group, encoding, compute, consolidated, append_dim) 1341 writer = ArrayWriter() 1342 # TODO: figure out how to properly handle unlimited_dims -> 1343 dump_to_store(dataset, zstore, writer, encoding=encoding) 1344 writes = writer.sync(compute=compute) 1345 /opt/conda/lib/python3.7/site-packages/xarray/backends/api.py in dump_to_store(dataset, store, writer, encoder, encoding, unlimited_dims) 1133 variables, attrs = encoder(variables, attrs) 1134 -> 1135 store.store(variables, attrs, check_encoding, writer, unlimited_dims=unlimited_dims) 1136 1137 /opt/conda/lib/python3.7/site-packages/xarray/backends/zarr.py in store(self, variables, attributes, check_encoding_set, writer, unlimited_dims) 385 self.set_dimensions(variables_encoded, unlimited_dims=unlimited_dims) 386 self.set_variables( --> 387 variables_encoded, check_encoding_set, writer, unlimited_dims=unlimited_dims 388 ) 389 /opt/conda/lib/python3.7/site-packages/xarray/backends/zarr.py in set_variables(self, variables, check_encoding_set, writer, unlimited_dims) 444 dtype = str 445 zarr_array = self.ds.create( --> 446 name, shape=shape, dtype=dtype, fill_value=fill_value, **encoding 447 ) 448 zarr_array.attrs.put(encoded_attrs) /opt/conda/lib/python3.7/site-packages/zarr/hierarchy.py in create(self, name, **kwargs) 877 """Create an array. 
Keyword arguments as per 878 :func:`zarr.creation.create`.""" --> 879 return self._write_op(self._create_nosync, name, **kwargs) 880 881 def _create_nosync(self, name, **kwargs): /opt/conda/lib/python3.7/site-packages/zarr/hierarchy.py in _write_op(self, f, *args, **kwargs) 656 657 with lock: --> 658 return f(*args, **kwargs) 659 660 def create_group(self, name, overwrite=False): /opt/conda/lib/python3.7/site-packages/zarr/hierarchy.py in _create_nosync(self, name, **kwargs) 884 kwargs.setdefault('cache_attrs', self.attrs.cache) 885 return create(store=self._store, path=path, chunk_store=self._chunk_store, --> 886 **kwargs) 887 888 def empty(self, name, **kwargs): /opt/conda/lib/python3.7/site-packages/zarr/creation.py in create(shape, chunks, dtype, compressor, fill_value, order, store, synchronizer, overwrite, path, chunk_store, filters, cache_metadata, cache_attrs, read_only, object_codec, **kwargs) 123 # instantiate array 124 z = Array(store, path=path, chunk_store=chunk_store, synchronizer=synchronizer, --> 125 cache_metadata=cache_metadata, cache_attrs=cache_attrs, read_only=read_only) 126 127 return z /opt/conda/lib/python3.7/site-packages/zarr/core.py in __init__(self, store, path, read_only, chunk_store, synchronizer, cache_metadata, cache_attrs) 122 123 # initialize metadata --> 124 self._load_metadata() 125 126 # initialize attributes /opt/conda/lib/python3.7/site-packages/zarr/core.py in _load_metadata(self) 139 """(Re)load metadata from store.""" 140 if self._synchronizer is None: --> 141 self._load_metadata_nosync() 142 else: 143 mkey = self._key_prefix + array_meta_key /opt/conda/lib/python3.7/site-packages/zarr/core.py in _load_metadata_nosync(self) 150 meta_bytes = self._store[mkey] 151 except KeyError: --> 152 err_array_not_found(self._path) 153 else: 154 /opt/conda/lib/python3.7/site-packages/zarr/errors.py in err_array_not_found(path) 19 20 def err_array_not_found(path): ---> 21 raise ValueError('array not found at path %r' % path) 22 23 ValueError: array not found at path 'lv_HTGL7_l1'Output of
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/3831/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
2224036575 | I_kwDOAMm_X86EkBrf | 8905 | Variable doesn't have an .expand_dims method | TomNicholas 35968931 | closed | 0 | 4 | 2024-04-03T22:19:10Z | 2024-04-28T19:54:08Z | 2024-04-28T19:54:08Z | MEMBER | Is your feature request related to a problem?
Describe the solution you'd like: Variable should also have this method, the only difference being that it wouldn't create any coordinates or indexes.
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8905/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
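A hedged note in the meantime: `Variable.set_dims` already behaves much like the requested `expand_dims`; it adds the requested dimensions (with given sizes) and never creates coordinates or indexes.

```python
import numpy as np
import xarray as xr

var = xr.Variable(dims=("x",), data=np.arange(3))
expanded = var.set_dims({"time": 1, "x": 3})  # new leading "time" dim of length 1
print(expanded.dims, expanded.shape)          # ('time', 'x') (1, 3)
```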
2254350395 | PR_kwDOAMm_X85tPTua | 8960 | Option to not auto-create index during expand_dims | TomNicholas 35968931 | closed | 0 | 2 | 2024-04-20T03:27:23Z | 2024-04-27T16:48:30Z | 2024-04-27T16:48:24Z | MEMBER | 0 | pydata/xarray/pulls/8960 |
TODO:
- [x] Add new kwarg to |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8960/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2261844699 | PR_kwDOAMm_X85toeXT | 8968 | Bump dependencies incl `pandas>=2` | dcherian 2448579 | closed | 0 | 0 | 2024-04-24T17:42:19Z | 2024-04-27T14:17:16Z | 2024-04-27T14:17:16Z | MEMBER | 0 | pydata/xarray/pulls/8968 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8968/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2260280862 | PR_kwDOAMm_X85tjH8m | 8967 | Migrate datatreee assertions/extensions/formatting | owenlittlejohns 7788154 | closed | 0 | 0 | 2024-04-24T04:23:03Z | 2024-04-26T17:38:59Z | 2024-04-26T17:29:18Z | CONTRIBUTOR | 0 | pydata/xarray/pulls/8967 | This PR continues the overall work of migrating DataTree into xarray.
I had also meant to get to
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8967/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
590630281 | MDU6SXNzdWU1OTA2MzAyODE= | 3921 | issues discovered by the all-but-dask CI | keewis 14808389 | closed | 0 | 4 | 2020-03-30T22:08:46Z | 2024-04-25T14:48:15Z | 2024-02-10T02:57:34Z | MEMBER | After adding the |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/3921/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
2261917442 | PR_kwDOAMm_X85touYl | 8971 | Delete pynio backend. | dcherian 2448579 | closed | 0 | 2 | 2024-04-24T18:25:26Z | 2024-04-25T14:38:23Z | 2024-04-25T14:23:59Z | MEMBER | 0 | pydata/xarray/pulls/8971 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8971/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2243685081 | I_kwDOAMm_X86Fu-rZ | 8945 | netCDF4 indexing: `reindex_like` is very slow if dataset not loaded into memory | brendan-m-murphy 11130776 | closed | 0 | 4 | 2024-04-15T13:26:08Z | 2024-04-23T21:49:28Z | 2024-04-23T15:33:36Z | NONE | What is your issue?Reindexing a dataset without loading it into memory seems to be very slow (about 1000x slower than reindexing after loading into memory). Here is a minimum working example:

```python
import numpy as np
import pandas as pd
import xarray as xr

times = 100
nlat = 200
nlon = 300

fp = xr.Dataset(
    {"fp": (["time", "lat", "lon"], np.arange(times * nlat * nlon).reshape(times, nlat, nlon))},
    coords={
        "time": pd.date_range(start="2019-01-01T02:00:00", periods=times, freq="1H"),
        "lat": np.arange(nlat),
        "lon": np.arange(nlon),
    },
)
flux = xr.Dataset(
    {"flux": (["time", "lat", "lon"], np.arange(nlat * nlon).reshape(1, nlat, nlon))},
    coords={
        "time": [pd.to_datetime("2019-01-01")],
        "lat": np.arange(nlat) + np.random.normal(0.0, 0.01, nlat),
        "lon": np.arange(nlon) + np.random.normal(0.0, 0.01, nlon),
    },
)

fp.to_netcdf("combine_datasets_tests/fp.nc")
flux.to_netcdf("combine_datasets_tests/flux.nc")

fp1 = xr.open_dataset("combine_datasets_tests/fp.nc")
flux1 = xr.open_dataset("combine_datasets_tests/flux.nc")
```

Then
Profiling the "reindex without load" cell: ``` 804936 function calls (804622 primitive calls) in 93.285 seconds Ordered by: internal time ncalls tottime percall cumtime percall filename:lineno(function) 1 92.211 92.211 93.191 93.191 {built-in method _operator.getitem} 1 0.289 0.289 0.980 0.980 utils.py:81(_StartCountStride) 6 0.239 0.040 0.613 0.102 shape_base.py:267(apply_along_axis) 72656 0.109 0.000 0.109 0.000 utils.py:429(<lambda>) 72656 0.085 0.000 0.136 0.000 utils.py:430(<lambda>) 72661 0.051 0.000 0.051 0.000 {built-in method numpy.arange} 145318 0.048 0.000 0.115 0.000 shape_base.py:370(<genexpr>) 2 0.045 0.023 0.046 0.023 indexing.py:1334(getitem) 6 0.044 0.007 0.044 0.007 numeric.py:136(ones) 145318 0.044 0.000 0.067 0.000 index_tricks.py:690(next) 14 0.033 0.002 0.033 0.002 {built-in method numpy.empty} 145333/145325 0.023 0.000 0.023 0.000 {built-in method builtins.next} 1 0.020 0.020 93.275 93.275 duck_array_ops.py:317(where) 21 0.018 0.001 0.018 0.001 {method 'astype' of 'numpy.ndarray' objects} 145330 0.013 0.000 0.013 0.000 {built-in method numpy.asanyarray} 1 0.002 0.002 0.002 0.002 {built-in method _functools.reduce} 1 0.002 0.002 93.279 93.279 variable.py:821(_getitem_with_mask) 18 0.001 0.000 0.001 0.000 {built-in method numpy.zeros} 1 0.000 0.000 0.000 0.000 file_manager.py:226(close) ``` The In my venv, netCDF4 was installed from a wheel with the following versions:
This is with xarray version 2023.12.0, numpy 1.26, and pandas 1.5.3. I will try to investigate more and hopefully simplify the example. (Can't quite justify spending more time on it at work because this is just to tag a version that was used in some experiments before we switch to zarr as a backend, so hopefully it won't be relevant at that point.) |
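A workaround consistent with the title of this report (a sketch, not taken from the issue itself): load both datasets into memory before reindexing, so the slow netCDF4 `getitem` path is avoided. The exact reindex call is truncated above, so the one below is an assumption based on the example datasets.

```python
import xarray as xr

# Load into memory first, then reindex; the reindex call itself is an assumed
# reconstruction, since the original one is truncated in the report.
fp1 = xr.open_dataset("combine_datasets_tests/fp.nc").load()
flux1 = xr.open_dataset("combine_datasets_tests/flux.nc").load()
result = fp1.reindex_like(flux1, method="nearest", tolerance=0.01)
```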
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8945/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
2141447815 | I_kwDOAMm_X85_o-aH | 8768 | `xarray/datatree_` missing in 2024.2.0 sdist | mgorny 110765 | closed | 0 | 15 | 2024-02-19T03:57:31Z | 2024-04-23T18:11:58Z | 2024-04-23T15:35:21Z | CONTRIBUTOR | What happened?Apparently What did you expect to happen?No response Minimal Complete Verifiable Example
MVCE confirmation
Relevant log outputNo response Anything else we need to know?No response Environmentn/a |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8768/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
2248692681 | PR_kwDOAMm_X85s8dDt | 8953 | stop pruning datatree_ directory from distribution | flamingbear 479480 | closed | 0 | 0 | 2024-04-17T16:14:13Z | 2024-04-23T15:39:06Z | 2024-04-23T15:35:20Z | CONTRIBUTOR | 0 | pydata/xarray/pulls/8953 | This PR removes the directive that strips out the datatree_ directory from the xarray distribution. It also cleans a few typing errors and removes exceptions for the datatree_ directory for mypy. It does NOT remove the exception for pre-commit config.
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8953/reactions", "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 1, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2255271332 | PR_kwDOAMm_X85tSKJs | 8961 | use `nan` instead of `NaN` | keewis 14808389 | closed | 0 | 0 | 2024-04-21T21:26:18Z | 2024-04-21T22:01:04Z | 2024-04-21T22:01:03Z | MEMBER | 0 | pydata/xarray/pulls/8961 | FYI @aulemahal,
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8961/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2100707586 | PR_kwDOAMm_X85lFQn3 | 8669 | Fix automatic broadcasting when wrapping array api class | TomNicholas 35968931 | closed | 0 | 0 | 2024-01-25T16:05:19Z | 2024-04-20T05:58:05Z | 2024-01-26T16:41:30Z | MEMBER | 0 | pydata/xarray/pulls/8669 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8669/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2238099300 | PR_kwDOAMm_X85sYXC0 | 8930 | Migrate formatting_html.py into xarray core | eni-awowale 51421921 | closed | 0 | 7 | 2024-04-11T16:15:28Z | 2024-04-18T21:59:47Z | 2024-04-18T21:59:44Z | CONTRIBUTOR | 0 | pydata/xarray/pulls/8930 | This PR migrates the One thing of note is that importing and setting the
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8930/reactions", "total_count": 3, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 2, "rocket": 1, "eyes": 0 } |
xarray 13221727 | pull | |||||
2125478394 | PR_kwDOAMm_X85mZIzr | 8723 | (feat): Support for `pandas` `ExtensionArray` | ilan-gold 43999641 | closed | 0 | 23 | 2024-02-08T15:38:18Z | 2024-04-18T12:52:06Z | 2024-04-18T12:52:03Z | CONTRIBUTOR | 0 | pydata/xarray/pulls/8723 | Some outstanding points/decisions brought up by this PR:
- [ ] Confirm type promotion rules and write them out. As it stands now, if everything is of the same extension array type, it is passed onwards and otherwise is converted to numpy. (related: https://github.com/pydata/xarray/pull/8714)
~- [ ] Acceptance of Possibly missing something else! Let me know!

Checklist:
- [x] Closes #8463 and Closes #5287
- [x] Tests added
- [x] User visible changes (including notable bug fixes) are documented in |
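To illustrate the kind of round-trip this PR targets, a rough usage sketch (not taken from the PR itself; exactly which extension dtypes survive depends on the behaviour that was finally merged):

```python
import pandas as pd
import xarray as xr

df = pd.DataFrame({"cat": pd.Categorical(["a", "b", "a", "b", "c"])})

ds = xr.Dataset.from_dataframe(df)
# With extension-array support the categorical column can be carried as a
# pandas extension array instead of being cast to an object-dtype numpy array.
print(ds["cat"].dtype)

# ...and it can survive the round trip back to pandas.
print(ds.to_dataframe()["cat"].dtype)
```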
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8723/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
884649380 | MDU6SXNzdWU4ODQ2NDkzODA= | 5287 | Support for pandas Extension Arrays | Hoeze 1200058 | closed | 0 | 8 | 2021-05-10T17:00:17Z | 2024-04-18T12:52:04Z | 2024-04-18T12:52:04Z | NONE | Is your feature request related to a problem? Please describe.
I started writing an ExtensionArray which is basically a This is working great in Pandas, I can read and write Parquet as well as csv with it.
However, as soon as I'm using any Describe the solution you'd like Would it be possible to support Pandas Extension Types on coordinates? It's not necessary to compute anything on them, I'd just like to use them for dimensions. Describe alternatives you've considered I was thinking over implementing a NumPy duck array, but I have never tried this and it looks quite complicated compared to the Pandas Extension types. |
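A minimal sketch of the kind of failure being described (the exact ExtensionArray used by the reporter is truncated above, so a pandas IntervalIndex stands in for it here):

```python
import pandas as pd

df = pd.DataFrame(
    {"value": [1.0, 2.0, 3.0]},
    index=pd.interval_range(0, 3, name="interval"),
)

ds = df.to_xarray()
# The extension dtype is not preserved on the coordinate: it comes back as an
# object-dtype array of pd.Interval objects rather than an interval dtype.
print(ds["interval"].dtype)
```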
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5287/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
1999657332 | I_kwDOAMm_X853MFl0 | 8463 | Categorical Array | ilan-gold 43999641 | closed | 0 | 19 | 2023-11-17T17:57:12Z | 2024-04-18T12:52:04Z | 2024-04-18T12:52:04Z | CONTRIBUTOR | Is your feature request related to a problem?We are looking to improve compatibility between Describe the solution you'd likeThe goal would be a standard-use categorical data type We have something functional here that inherits from Some issues:
1. I have no idea what a standard "return type" for an It seems you may want, in addition to the array container, some sort of i/o functionality for this feature (so maybe some on-disk specification?).

Describe alternatives you've considered

I think there is some route via

Additional context

So just for reference, the current behavior of

```python
import pandas as pd

df = pd.DataFrame({'cat': ['a', 'b', 'a', 'b', 'c']})
df['cat'] = df['cat'].astype('category')
df.to_xarray()['cat']

<xarray.DataArray 'cat' (index: 5)>
array(['a', 'b', 'a', 'b', 'c'], dtype=object)
Coordinates:
  * index    (index) int64 0 1 2 3 4
```

And as stated in the Apologies if I'm missing something here! Feedback welcome! Sorry if this is a bit chaotic, just trying to cover my bases. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8463/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
2246986030 | PR_kwDOAMm_X85s2plY | 8948 | Migrate datatree mapping.py | owenlittlejohns 7788154 | closed | 0 | 1 | 2024-04-16T22:36:48Z | 2024-04-17T20:44:29Z | 2024-04-17T19:59:34Z | CONTRIBUTOR | 0 | pydata/xarray/pulls/8948 | This PR continues the overall work of migrating DataTree into xarray.
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8948/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2244681150 | PR_kwDOAMm_X85suxIl | 8947 | Add mypy to dev dependencies | max-sixty 5635139 | closed | 0 | 0 | 2024-04-15T21:39:19Z | 2024-04-17T16:39:23Z | 2024-04-17T16:39:22Z | MEMBER | 0 | pydata/xarray/pulls/8947 | { "url": "https://api.github.com/repos/pydata/xarray/issues/8947/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
2075019328 | PR_kwDOAMm_X85juCQ- | 8603 | Convert 360_day calendars by choosing random dates to drop or add | aulemahal 20629530 | closed | 0 | 3 | 2024-01-10T19:13:31Z | 2024-04-16T14:53:42Z | 2024-04-16T14:53:42Z | CONTRIBUTOR | 0 | pydata/xarray/pulls/8603 |
Small PR to add a new "method" to convert to and from 360_day calendars. The current two methods (chosen with the This new option will randomly choose the days, one for each fifth of the year (72-day period). It emulates the method of the LOCA datasets (see web page and article ). February 29th is always removed/added when the source/target is a leap year. I copied the implementation from xclim (which I wrote), see code here . |
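A short usage sketch of what the new option looks like from the user side (an assumption based on the PR description and on xarray's existing `convert_calendar` signature, not copied from the PR):

```python
import numpy as np
import xarray as xr

times = xr.cftime_range("2000-01-01", periods=720, freq="D", calendar="standard")
ds = xr.Dataset({"tas": ("time", np.random.rand(times.size))}, coords={"time": times})

# The align_on value below is assumed from the PR text: days to drop (or
# insert) are chosen at random within each 72-day period, except for
# February 29th, which is always dropped/added for leap years.
ds_360 = ds.convert_calendar("360_day", align_on="random")
```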
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8603/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2243268327 | I_kwDOAMm_X86FtY7n | 8944 | When opening a zipped Dataset stored under Zarr on a s3 bucket, `botocore.exceptions.NoCredentialsError: Unable to locate credentials` | eschalkargans 119882363 | closed | 0 | 2 | 2024-04-15T10:13:58Z | 2024-04-15T19:51:44Z | 2024-04-15T19:51:43Z | NONE | What happened?A zipped Zarr store is available on s3 bucket that requires authentication. When using NoCredentialsError: Unable to locate credentials What did you expect to happen?I expected the dataset to be openable. Minimal Complete Verifiable ExampleIt is difficult for me to describe a MCVE as it requires a remote file on an s3 bucket requiring authentication. To reproduce fully, one must have access to a zipped zarr on an s3 bucket requiring authentication. ```Python import xarray as xr credentials_key = "key" credentials_secret = "secret" credentials_endpoint_url = "endpoint_url" credentials_region_name = "region" storage_options = dict( key=credentials_key, secret=credentials_secret, client_kwargs=dict( endpoint_url=credentials_endpoint_url, region_name=credentials_region_name, ), ) zip_s3_zarr_path = "zip::s3://path/to/my/dataset.zarr.zip" xds = xr.open_dataset( zip_s3_zarr_path, backend_kwargs={"storage_options": storage_options}, engine="zarr", group="/", consolidated=True, ) ``` MVCE confirmation
Relevant log output
```Python
---------------------------------------------------------------------------
NoCredentialsError Traceback (most recent call last)
Cell In[4], line 1
----> 1 xds = xr.open_dataset(
2 zip_s3_zarr_path,
3 backend_kwargs={"storage_options": storage_options},
4 engine="zarr",
5 group="/",
6 consolidated=True,
7 )
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/xarray/backends/api.py:573, in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, inline_array, chunked_array_type, from_array_kwargs, backend_kwargs, **kwargs)
561 decoders = _resolve_decoders_kwargs(
562 decode_cf,
563 open_backend_dataset_parameters=backend.open_dataset_parameters,
(...)
569 decode_coords=decode_coords,
570 )
572 overwrite_encoded_chunks = kwargs.pop("overwrite_encoded_chunks", None)
--> 573 backend_ds = backend.open_dataset(
574 filename_or_obj,
575 drop_variables=drop_variables,
576 **decoders,
577 **kwargs,
578 )
579 ds = _dataset_from_backend_dataset(
580 backend_ds,
581 filename_or_obj,
(...)
591 **kwargs,
592 )
593 return ds
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/xarray/backends/zarr.py:967, in ZarrBackendEntrypoint.open_dataset(self, filename_or_obj, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, use_cftime, decode_timedelta, group, mode, synchronizer, consolidated, chunk_store, storage_options, stacklevel, zarr_version)
946 def open_dataset( # type: ignore[override] # allow LSP violation, not supporting **kwargs
947 self,
948 filename_or_obj: str | os.PathLike[Any] | BufferedIOBase | AbstractDataStore,
(...)
964 zarr_version=None,
965 ) -> Dataset:
966 filename_or_obj = _normalize_path(filename_or_obj)
--> 967 store = ZarrStore.open_group(
968 filename_or_obj,
969 group=group,
970 mode=mode,
971 synchronizer=synchronizer,
972 consolidated=consolidated,
973 consolidate_on_close=False,
974 chunk_store=chunk_store,
975 storage_options=storage_options,
976 stacklevel=stacklevel + 1,
977 zarr_version=zarr_version,
978 )
980 store_entrypoint = StoreBackendEntrypoint()
981 with close_on_error(store):
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/xarray/backends/zarr.py:454, in ZarrStore.open_group(cls, store, mode, synchronizer, group, consolidated, consolidate_on_close, chunk_store, storage_options, append_dim, write_region, safe_chunks, stacklevel, zarr_version, write_empty)
451 raise FileNotFoundError(f"No such file or directory: '{store}'")
452 elif consolidated:
453 # TODO: an option to pass the metadata_key keyword
--> 454 zarr_group = zarr.open_consolidated(store, **open_kwargs)
455 else:
456 zarr_group = zarr.open_group(store, **open_kwargs)
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/zarr/convenience.py:1334, in open_consolidated(store, metadata_key, mode, **kwargs)
1332 # normalize parameters
1333 zarr_version = kwargs.get("zarr_version")
-> 1334 store = normalize_store_arg(
1335 store, storage_options=kwargs.get("storage_options"), mode=mode, zarr_version=zarr_version
1336 )
1337 if mode not in {"r", "r+"}:
1338 raise ValueError("invalid mode, expected either 'r' or 'r+'; found {!r}".format(mode))
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/zarr/storage.py:197, in normalize_store_arg(store, storage_options, mode, zarr_version)
195 else:
196 raise ValueError("zarr_version must be either 2 or 3")
--> 197 return normalize_store(store, storage_options, mode)
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/zarr/storage.py:167, in _normalize_store_arg_v2(store, storage_options, mode)
165 if isinstance(store, str):
166 if "://" in store or "::" in store:
--> 167 return FSStore(store, mode=mode, **(storage_options or {}))
168 elif storage_options:
169 raise ValueError("storage_options passed with non-fsspec path")
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/zarr/storage.py:1377, in FSStore.__init__(self, url, normalize_keys, key_separator, mode, exceptions, dimension_separator, fs, check, create, missing_exceptions, **storage_options)
1375 if protocol in (None, "file") and not storage_options.get("auto_mkdir"):
1376 storage_options["auto_mkdir"] = True
-> 1377 self.map = fsspec.get_mapper(url, **{**mapper_options, **storage_options})
1378 self.fs = self.map.fs # for direct operations
1379 self.path = self.fs._strip_protocol(url)
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/fsspec/mapping.py:245, in get_mapper(url, check, create, missing_exceptions, alternate_root, **kwargs)
214 """Create key-value interface for given URL and options
215
216 The URL will be of the form "protocol://location" and point to the root
(...)
242 ``FSMap`` instance, the dict-like key-value store.
243 """
244 # Removing protocol here - could defer to each open() on the backend
--> 245 fs, urlpath = url_to_fs(url, **kwargs)
246 root = alternate_root if alternate_root is not None else urlpath
247 return FSMap(root, fs, check, create, missing_exceptions=missing_exceptions)
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/fsspec/core.py:388, in url_to_fs(url, **kwargs)
386 inkwargs["fo"] = urls
387 urlpath, protocol, _ = chain[0]
--> 388 fs = filesystem(protocol, **inkwargs)
389 return fs, urlpath
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/fsspec/registry.py:290, in filesystem(protocol, **storage_options)
283 warnings.warn(
284 "The 'arrow_hdfs' protocol has been deprecated and will be "
285 "removed in the future. Specify it as 'hdfs'.",
286 DeprecationWarning,
287 )
289 cls = get_filesystem_class(protocol)
--> 290 return cls(**storage_options)
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/fsspec/spec.py:79, in _Cached.__call__(cls, *args, **kwargs)
77 return cls._cache[token]
78 else:
---> 79 obj = super().__call__(*args, **kwargs)
80 # Setting _fs_token here causes some static linters to complain.
81 obj._fs_token_ = token
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/fsspec/implementations/zip.py:56, in ZipFileSystem.__init__(self, fo, mode, target_protocol, target_options, compression, allowZip64, compresslevel, **kwargs)
52 fo = fsspec.open(
53 fo, mode=mode + "b", protocol=target_protocol, **(target_options or {}), # **kwargs
54 )
55 self.of = fo
---> 56 self.fo = fo.__enter__() # the whole instance is a context
57 self.zip = zipfile.ZipFile(
58 self.fo,
59 mode=mode,
(...)
62 compresslevel=compresslevel,
63 )
64 self.dir_cache = None
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/fsspec/core.py:100, in OpenFile.__enter__(self)
97 def __enter__(self):
98 mode = self.mode.replace("t", "").replace("b", "") + "b"
--> 100 f = self.fs.open(self.path, mode=mode)
102 self.fobjects = [f]
104 if self.compression is not None:
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/fsspec/spec.py:1307, in AbstractFileSystem.open(self, path, mode, block_size, cache_options, compression, **kwargs)
1305 else:
1306 ac = kwargs.pop("autocommit", not self._intrans)
-> 1307 f = self._open(
1308 path,
1309 mode=mode,
1310 block_size=block_size,
1311 autocommit=ac,
1312 cache_options=cache_options,
1313 **kwargs,
1314 )
1315 if compression is not None:
1316 from fsspec.compression import compr
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/s3fs/core.py:671, in S3FileSystem._open(self, path, mode, block_size, acl, version_id, fill_cache, cache_type, autocommit, size, requester_pays, cache_options, **kwargs)
668 if cache_type is None:
669 cache_type = self.default_cache_type
--> 671 return S3File(
672 self,
673 path,
674 mode,
675 block_size=block_size,
676 acl=acl,
677 version_id=version_id,
678 fill_cache=fill_cache,
679 s3_additional_kwargs=kw,
680 cache_type=cache_type,
681 autocommit=autocommit,
682 requester_pays=requester_pays,
683 cache_options=cache_options,
684 size=size,
685 )
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/s3fs/core.py:2099, in S3File.__init__(self, s3, path, mode, block_size, acl, version_id, fill_cache, s3_additional_kwargs, autocommit, cache_type, requester_pays, cache_options, size)
2097 self.details = s3.info(path)
2098 self.version_id = self.details.get("VersionId")
-> 2099 super().__init__(
2100 s3,
2101 path,
2102 mode,
2103 block_size,
2104 autocommit=autocommit,
2105 cache_type=cache_type,
2106 cache_options=cache_options,
2107 size=size,
2108 )
2109 self.s3 = self.fs # compatibility
2111 # when not using autocommit we want to have transactional state to manage
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/fsspec/spec.py:1663, in AbstractBufferedFile.__init__(self, fs, path, mode, block_size, autocommit, cache_type, cache_options, size, **kwargs)
1661 self.size = size
1662 else:
-> 1663 self.size = self.details["size"]
1664 self.cache = caches[cache_type](
1665 self.blocksize, self._fetch_range, self.size, **cache_options
1666 )
1667 else:
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/fsspec/spec.py:1676, in AbstractBufferedFile.details(self)
1673 @property
1674 def details(self):
1675 if self._details is None:
-> 1676 self._details = self.fs.info(self.path)
1677 return self._details
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/fsspec/asyn.py:118, in sync_wrapper.<locals>.wrapper(*args, **kwargs)
115 @functools.wraps(func)
116 def wrapper(*args, **kwargs):
117 self = obj or args[0]
--> 118 return sync(self.loop, func, *args, **kwargs)
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/fsspec/asyn.py:103, in sync(loop, func, timeout, *args, **kwargs)
101 raise FSTimeoutError from return_result
102 elif isinstance(return_result, BaseException):
--> 103 raise return_result
104 else:
105 return return_result
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/fsspec/asyn.py:56, in _runner(event, coro, result, timeout)
54 coro = asyncio.wait_for(coro, timeout=timeout)
55 try:
---> 56 result[0] = await coro
57 except Exception as ex:
58 result[0] = ex
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/s3fs/core.py:1302, in S3FileSystem._info(self, path, bucket, key, refresh, version_id)
1300 if key:
1301 try:
-> 1302 out = await self._call_s3(
1303 "head_object",
1304 self.kwargs,
1305 Bucket=bucket,
1306 Key=key,
1307 **version_id_kw(version_id),
1308 **self.req_kw,
1309 )
1310 return {
1311 "ETag": out.get("ETag", ""),
1312 "LastModified": out["LastModified"],
(...)
1318 "ContentType": out.get("ContentType"),
1319 }
1320 except FileNotFoundError:
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/s3fs/core.py:348, in S3FileSystem._call_s3(self, method, *akwarglist, **kwargs)
346 logger.debug("CALL: %s - %s - %s", method.__name__, akwarglist, kw2)
347 additional_kwargs = self._get_s3_method_kwargs(method, *akwarglist, **kwargs)
--> 348 return await _error_wrapper(
349 method, kwargs=additional_kwargs, retries=self.retries
350 )
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/s3fs/core.py:140, in _error_wrapper(func, args, kwargs, retries)
138 err = e
139 err = translate_boto_error(err)
--> 140 raise err
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/s3fs/core.py:113, in _error_wrapper(func, args, kwargs, retries)
111 for i in range(retries):
112 try:
--> 113 return await func(*args, **kwargs)
114 except S3_RETRYABLE_ERRORS as e:
115 err = e
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/aiobotocore/client.py:366, in AioBaseClient._make_api_call(self, operation_name, api_params)
362 maybe_compress_request(
363 self.meta.config, request_dict, operation_model
364 )
365 apply_request_checksum(request_dict)
--> 366 http, parsed_response = await self._make_request(
367 operation_model, request_dict, request_context
368 )
370 await self.meta.events.emit(
371 'after-call.{service_id}.{operation_name}'.format(
372 service_id=service_id, operation_name=operation_name
(...)
377 context=request_context,
378 )
380 if http.status_code >= 300:
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/aiobotocore/client.py:391, in AioBaseClient._make_request(self, operation_model, request_dict, request_context)
387 async def _make_request(
388 self, operation_model, request_dict, request_context
389 ):
390 try:
--> 391 return await self._endpoint.make_request(
392 operation_model, request_dict
393 )
394 except Exception as e:
395 await self.meta.events.emit(
396 'after-call-error.{service_id}.{operation_name}'.format(
397 service_id=self._service_model.service_id.hyphenize(),
(...)
401 context=request_context,
402 )
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/aiobotocore/endpoint.py:96, in AioEndpoint._send_request(self, request_dict, operation_model)
94 context = request_dict['context']
95 self._update_retries_context(context, attempts)
---> 96 request = await self.create_request(request_dict, operation_model)
97 success_response, exception = await self._get_response(
98 request, operation_model, context
99 )
100 while await self._needs_retry(
101 attempts,
102 operation_model,
(...)
105 exception,
106 ):
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/aiobotocore/endpoint.py:84, in AioEndpoint.create_request(self, params, operation_model)
80 service_id = operation_model.service_model.service_id.hyphenize()
81 event_name = 'request-created.{service_id}.{op_name}'.format(
82 service_id=service_id, op_name=operation_model.name
83 )
---> 84 await self._event_emitter.emit(
85 event_name,
86 request=request,
87 operation_name=operation_model.name,
88 )
89 prepared_request = self.prepare_request(request)
90 return prepared_request
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/aiobotocore/hooks.py:66, in AioHierarchicalEmitter._emit(self, event_name, kwargs, stop_on_response)
63 logger.debug('Event %s: calling handler %s', event_name, handler)
65 # Await the handler if its a coroutine.
---> 66 response = await resolve_awaitable(handler(**kwargs))
67 responses.append((handler, response))
68 if stop_on_response and response is not None:
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/aiobotocore/_helpers.py:15, in resolve_awaitable(obj)
13 async def resolve_awaitable(obj):
14 if inspect.isawaitable(obj):
---> 15 return await obj
17 return obj
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/aiobotocore/signers.py:24, in AioRequestSigner.handler(self, operation_name, request, **kwargs)
19 async def handler(self, operation_name=None, request=None, **kwargs):
20 # This is typically hooked up to the "request-created" event
21 # from a client's event emitter. When a new request is created
22 # this method is invoked to sign the request.
23 # Don't call this method directly.
---> 24 return await self.sign(operation_name, request)
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/aiobotocore/signers.py:82, in AioRequestSigner.sign(self, operation_name, request, region_name, signing_type, expires_in, signing_name)
79 else:
80 raise e
---> 82 auth.add_auth(request)
File ~/.pyenv/versions/3.11.6/envs/work-env/lib/python3.11/site-packages/botocore/auth.py:418, in SigV4Auth.add_auth(self, request)
416 def add_auth(self, request):
417 if self.credentials is None:
--> 418 raise NoCredentialsError()
419 datetime_now = datetime.datetime.utcnow()
420 request.context['timestamp'] = datetime_now.strftime(SIGV4_TIMESTAMP)
NoCredentialsError: Unable to locate credentials
```
### Anything else we need to know?
#### Summary
When debugging, I found a bugfix, to be made in the `fsspec` library.
I still wanted to create the issue in the xarray repo as the bug happened to me while using xarray, and another xarray users might have similar issues, so creating the issue here serves as a potential bridge for future users
#### Details
Bug in `fsspec: 2023.10.0`: it forgets to pass the `kwargs` to the `open` method in `ZipFileSystem.__init__`.
Current:
```python
fo = fsspec.open(
fo, mode=mode + "b", protocol=target_protocol, **(target_options or {})
)
```
Bugfix: (passing the kwargs)
```python
fo = fsspec.open(
fo, mode=mode + "b", protocol=target_protocol, **(target_options or {}), **kwargs
)
```
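Until that fsspec fix lands, a possible client-side workaround (an untested sketch, not taken from this report): with chained `zip::s3://` URLs, fsspec accepts options keyed by protocol name, and those are forwarded as `target_options`, which `ZipFileSystem` does pass through. Reusing the names from the example above:

```python
import xarray as xr

# Untested sketch: nest the s3 credentials under an "s3" key so fsspec routes
# them to the inner filesystem of the "zip::s3://" chain via target_options.
xds = xr.open_dataset(
    zip_s3_zarr_path,
    backend_kwargs={"storage_options": {"s3": storage_options}},
    engine="zarr",
    group="/",
    consolidated=True,
)
```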
Note: the missing kwargs passing is still present in the latest main branch at the time of writing this issue: https://github.com/fsspec/filesystem_spec/blob/37c1bc63b9c5a5b2b9a0d5161e89b4233f888b29/fsspec/implementations/zip.py#L56 Tested on my local environment by editing fsspec itself. The Zip Zarr store on the s3 bucket can then be opened successfully. Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.11.6 (main, Jan 10 2024, 20:45:04) [GCC 9.4.0]
python-bits: 64
OS: Linux
OS-release: 5.15.0-102-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.9.3-development
xarray: 2023.10.1
pandas: 2.1.4
numpy: 1.26.2
scipy: 1.11.3
netCDF4: 1.6.5
pydap: None
h5netcdf: None
h5py: 3.10.0
Nio: None
zarr: 2.16.1
cftime: 1.6.3
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: None
dask: 2023.11.0
distributed: 2023.11.0
matplotlib: 3.7.1
cartopy: 0.22.0
seaborn: None
numbagg: None
fsspec: 2023.10.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 68.2.2
pip: 23.2.1
conda: None
pytest: 7.4.3
mypy: 1.7.0
IPython: 8.20.0
sphinx: 6.2.1
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8944/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
2239835092 | PR_kwDOAMm_X85seXWW | 8932 | FIX: use str dtype without size information | kmuehlbauer 5821660 | closed | 0 | 11 | 2024-04-12T10:59:45Z | 2024-04-15T19:43:22Z | 2024-04-13T12:25:48Z | MEMBER | 0 | pydata/xarray/pulls/8932 | Aims to resolve parts of #8844.
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8932/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2242781767 | PR_kwDOAMm_X85soOln | 8943 | Bump codecov/codecov-action from 4.2.0 to 4.3.0 in the actions group | dependabot[bot] 49699333 | closed | 0 | 0 | 2024-04-15T06:04:28Z | 2024-04-15T19:16:38Z | 2024-04-15T19:16:38Z | CONTRIBUTOR | 0 | pydata/xarray/pulls/8943 | Bumps the actions group with 1 update: codecov/codecov-action. Updates Release notesSourced from codecov/codecov-action's releases.
Commits
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting Dependabot commands and optionsYou can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore <dependency name> major version` will close this group update PR and stop Dependabot creating any more for the specific dependency's major version (unless you unignore this specific dependency's major version or upgrade to it yourself) - `@dependabot ignore <dependency name> minor version` will close this group update PR and stop Dependabot creating any more for the specific dependency's minor version (unless you unignore this specific dependency's minor version or upgrade to it yourself) - `@dependabot ignore <dependency name>` will close this group update PR and stop Dependabot creating any more for the specific dependency (unless you unignore this specific dependency or upgrade to it yourself) - `@dependabot unignore <dependency name>` will remove all of the ignore conditions of the specified dependency - `@dependabot unignore <dependency name> <ignore condition>` will remove the ignore condition of the specified dependency and ignore conditions |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8943/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2208982027 | PR_kwDOAMm_X85q1Lns | 8879 | Migrate iterators.py for datatree. | owenlittlejohns 7788154 | closed | 0 | 2 | 2024-03-26T18:14:53Z | 2024-04-15T16:23:56Z | 2024-04-11T15:28:25Z | CONTRIBUTOR | 0 | pydata/xarray/pulls/8879 | This PR continues the overall work of migrating DataTree into xarray.
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8879/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2240895281 | PR_kwDOAMm_X85siDno | 8934 | Correct save_mfdataset docstring | TomNicholas 35968931 | closed | 0 | 0 | 2024-04-12T20:51:35Z | 2024-04-14T19:58:46Z | 2024-04-14T11:14:42Z | MEMBER | 0 | pydata/xarray/pulls/8934 | Noticed the
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8934/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2236408438 | PR_kwDOAMm_X85sSjdN | 8926 | no untyped tests | Illviljan 14371165 | closed | 0 | 2 | 2024-04-10T20:52:29Z | 2024-04-14T16:15:45Z | 2024-04-14T16:15:45Z | MEMBER | 1 | pydata/xarray/pulls/8926 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8926/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2241528898 | PR_kwDOAMm_X85skNON | 8940 | adapt more tests to the copy-on-write behavior of pandas | keewis 14808389 | closed | 0 | 1 | 2024-04-13T11:57:10Z | 2024-04-13T19:36:30Z | 2024-04-13T14:44:50Z | MEMBER | 0 | pydata/xarray/pulls/8940 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8940/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2241499231 | PR_kwDOAMm_X85skHW9 | 8938 | use `pd.to_timedelta` instead of `TimedeltaIndex` | keewis 14808389 | closed | 0 | 0 | 2024-04-13T10:38:12Z | 2024-04-13T12:32:14Z | 2024-04-13T12:32:13Z | MEMBER | 0 | pydata/xarray/pulls/8938 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8938/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2241228882 | PR_kwDOAMm_X85sjOh9 | 8936 | MAINT: use sphinxext-rediraffe conda install | raybellwaves 17162724 | closed | 0 | 1 | 2024-04-13T02:11:07Z | 2024-04-13T02:53:53Z | 2024-04-13T02:53:48Z | CONTRIBUTOR | 0 | pydata/xarray/pulls/8936 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8936/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2066807588 | I_kwDOAMm_X857MPsk | 8590 | ValueError: conflicting sizes for dimension in xr.open_dataset("reference://"...) VS. no error in xr.open_dataset(direct_file_path) for h5 | ksharonin 90292403 | closed | 0 | 3 | 2024-01-05T06:34:34Z | 2024-04-11T06:54:44Z | 2024-04-11T06:54:44Z | NONE | What is your issue?Hi all, on a project I am attempting a dataset read using the xarray JSON reference system. Metadata for this file (an ATL03 h5 file) can be found here: https://nsidc.org/sites/default/files/icesat2_atl03_data_dict_v005.pdf
File ~/opt/anaconda3/envs/kerchunkc/lib/python3.8/site-packages/xarray/backends/api.py:539, in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, inline_array, backend_kwargs, kwargs) 527 decoders = _resolve_decoders_kwargs( 528 decode_cf, 529 open_backend_dataset_parameters=backend.open_dataset_parameters, (...) 535 decode_coords=decode_coords, 536 ) 538 overwrite_encoded_chunks = kwargs.pop(\"overwrite_encoded_chunks\", None) --> 539 backend_ds = backend.open_dataset( 540 filename_or_obj, 541 drop_variables=drop_variables, 542 decoders, 543 kwargs, 544 ) 545 ds = _dataset_from_backend_dataset( 546 backend_ds, 547 filename_or_obj, (...) 555 kwargs, 556 ) 557 return ds File ~/opt/anaconda3/envs/kerchunkc/lib/python3.8/site-packages/xarray/backends/zarr.py:862, in ZarrBackendEntrypoint.open_dataset(self, filename_or_obj, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, use_cftime, decode_timedelta, group, mode, synchronizer, consolidated, chunk_store, storage_options, stacklevel) 860 store_entrypoint = StoreBackendEntrypoint() 861 with close_on_error(store): --> 862 ds = store_entrypoint.open_dataset( 863 store, 864 mask_and_scale=mask_and_scale, 865 decode_times=decode_times, 866 concat_characters=concat_characters, 867 decode_coords=decode_coords, 868 drop_variables=drop_variables, 869 use_cftime=use_cftime, 870 decode_timedelta=decode_timedelta, 871 ) 872 return ds File ~/opt/anaconda3/envs/kerchunkc/lib/python3.8/site-packages/xarray/backends/store.py:43, in StoreBackendEntrypoint.open_dataset(self, store, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, use_cftime, decode_timedelta) 29 encoding = store.get_encoding() 31 vars, attrs, coord_names = conventions.decode_cf_variables( 32 vars, 33 attrs, (...) 
40 decode_timedelta=decode_timedelta, 41 ) ---> 43 ds = Dataset(vars, attrs=attrs) 44 ds = ds.set_coords(coord_names.intersection(vars)) 45 ds.set_close(store.close) File ~/opt/anaconda3/envs/kerchunkc/lib/python3.8/site-packages/xarray/core/dataset.py:604, in Dataset.init(self, data_vars, coords, attrs) 601 if isinstance(coords, Dataset): 602 coords = coords.variables --> 604 variables, coord_names, dims, indexes, _ = merge_data_and_coords( 605 data_vars, coords, compat=\"broadcast_equals\" 606 ) 608 self._attrs = dict(attrs) if attrs is not None else None 609 self._close = None File ~/opt/anaconda3/envs/kerchunkc/lib/python3.8/site-packages/xarray/core/merge.py:575, in merge_data_and_coords(data_vars, coords, compat, join) 573 objects = [data_vars, coords] 574 explicit_coords = coords.keys() --> 575 return merge_core( 576 objects, 577 compat, 578 join, 579 explicit_coords=explicit_coords, 580 indexes=Indexes(indexes, coords), 581 ) File ~/opt/anaconda3/envs/kerchunkc/lib/python3.8/site-packages/xarray/core/merge.py:761, in merge_core(objects, compat, join, combine_attrs, priority_arg, explicit_coords, indexes, fill_value) 756 prioritized = _get_priority_vars_and_indexes(aligned, priority_arg, compat=compat) 757 variables, out_indexes = merge_collected( 758 collected, prioritized, compat=compat, combine_attrs=combine_attrs 759 ) --> 761 dims = calculate_dimensions(variables) 763 coord_names, noncoord_names = determine_coords(coerced) 764 if explicit_coords is not None: File ~/opt/anaconda3/envs/kerchunkc/lib/python3.8/site-packages/xarray/core/variable.py:3208, in calculate_dimensions(variables) 3206 last_used[dim] = k 3207 elif dims[dim] != size: -> 3208 raise ValueError( 3209 f\"conflicting sizes for dimension {dim!r}: \" 3210 f\"length {size} on {k!r} and length {dims[dim]} on {last_used!r}\" 3211 ) 3212 return dims ValueError: conflicting sizes for dimension 'phony_dim_1': length 498 on 'width' and length 160 on {'phony_dim_0': 'dead_time', 'phony_dim_1': 'rad_corr', 'phony_dim_2': 'rad_corr'}" } ```
The JSON reference file has been attached for reference ATL03_REF_NONUTM.json |
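For context, the usual pattern for opening a kerchunk reference file such as the one attached is shown below; this is an assumed reconstruction, since the reporter's exact `open_dataset("reference://", ...)` call is not included in the issue body:

```python
import xarray as xr

ds = xr.open_dataset(
    "reference://",
    engine="zarr",
    backend_kwargs={
        "consolidated": False,
        "storage_options": {
            "fo": "ATL03_REF_NONUTM.json",
            # Depending on where the referenced HDF5 bytes live, additional
            # options such as remote_protocol/remote_options may be needed.
        },
    },
)
```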
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8590/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
2117245042 | I_kwDOAMm_X85-Mphy | 8703 | calling to_zarr inside map_blocks function results in missing values | martinspetlik 23472459 | closed | 0 | 8 | 2024-02-04T18:21:40Z | 2024-04-11T06:53:45Z | 2024-04-11T06:53:45Z | NONE | What happened?I want to work with a huge dataset stored in hdf5 loaded in chunks. Each chunk contains part of my data that should be saved to a specific region of zarr files. I need to follow the original order of chunks.
I found it a convenient way to use a I used a simplified scenario for code documenting this behavior. The initial zarr file of zeros is filled with ones. There are always some parts where there are still zeros. What did you expect to happen?No response Minimal Complete Verifiable Example```Python import os import shutil import xarray as xr import numpy as np import dask.array as da xr.show_versions() zarr_file = "file.zarr" if os.path.exists(zarr_file): shutil.rmtree(zarr_file) chunk_size = 5 shape = (50, 32, 1000) ones_dataset = xr.Dataset({"data": xr.ones_like(xr.DataArray(np.empty(shape)))}) ones_dataset = ones_dataset.chunk({'dim_0': chunk_size}) chunk_indices = np.arange(len(ones_dataset.chunks['dim_0'])) chunk_ids = np.repeat(np.arange(ones_dataset.sizes["dim_0"] // chunk_size), chunk_size) chunk_ids_dask_array = da.from_array(chunk_ids, chunks=(chunk_size,)) Append the chunk IDs Dask array as a new variable to the existing datasetones_dataset['chunk_id'] = (('dim_0',), chunk_ids_dask_array) Create a new dataset filled with zeroszeros_dataset = xr.Dataset({"data": xr.zeros_like(xr.DataArray(np.empty(shape)))}) zeros_dataset.to_zarr(zarr_file, compute=False) def process_chunk(chunk_dataset): chunk_id = int(chunk_dataset["chunk_id"][0]) chunk_dataset_to_store = chunk_dataset.drop_vars("chunk_id")
ones_dataset.map_blocks(process_chunk, template=ones_dataset).compute() Load data stored in zarrzarr_data = xr.open_zarr(zarr_file, chunks={'dim_0': chunk_size}) Find differencesfor var_name in zarr_data.variables: try: xr.testing.assert_equal(zarr_data[var_name], ones_dataset[var_name]) except AssertionError: print(f"Differences in {var_name}:") print(zarr_data[var_name].values) print(ones_dataset[var_name].values) ``` MVCE confirmation
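One side-effect-free alternative (a sketch only, reusing the names from the example above; it is not taken from this issue or its resolution): write each chunk's slice into the pre-created store with `to_zarr(region=...)` instead of calling `to_zarr` from inside `map_blocks`.

```python
# Sketch: sequential region writes into the zeros store created above.
for i in range(shape[0] // chunk_size):
    region = {"dim_0": slice(i * chunk_size, (i + 1) * chunk_size)}
    ones_dataset.drop_vars("chunk_id").isel(region).to_zarr(zarr_file, region=region)
```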
Relevant log outputNo response Anything else we need to know?No response Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
python-bits: 64
OS: Linux
OS-release: 6.5.0-15-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.9.3-development
xarray: 2024.1.1
pandas: 2.1.4
numpy: 1.26.3
scipy: 1.11.4
netCDF4: 1.6.5
pydap: None
h5netcdf: 1.3.0
h5py: 3.10.0
Nio: None
zarr: 2.16.1
cftime: 1.6.3
nc_time_axis: 1.4.1
iris: None
bottleneck: 1.3.7
dask: 2024.1.1
distributed: 2024.1.0
matplotlib: 3.8.2
cartopy: 0.22.0
seaborn: 0.13.1
numbagg: 0.6.8
fsspec: 2023.12.2
cupy: None
pint: None
sparse: None
flox: 0.8.9
numpy_groupies: 0.10.2
setuptools: 69.0.2
pip: 23.3.1
conda: None
pytest: 7.4.4
mypy: None
IPython: None
sphinx: None
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8703/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
2189919172 | I_kwDOAMm_X86Ch4PE | 8848 | numpy where argument crashes the program | ShaiAvr 35605235 | closed | 0 | 3 | 2024-03-16T11:17:22Z | 2024-04-11T06:53:28Z | 2024-04-11T06:53:28Z | NONE | What happened?I was trying to divide two What did you expect to happen?The division would succeed and I'd get a Minimal Complete Verifiable Example

```Python
import numpy as np
import xarray as xr

a = xr.DataArray([-1, 0, 1, 2, 3], dims="x")
b = xr.DataArray([0, 0, 0, 2, 2], dims="x")

print(np.divide(a, b, out=np.zeros_like(a, dtype=float), where=b != 0))
```

MVCE confirmation
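As a side note, a workaround that sidesteps the crash (a sketch, not taken from this thread): keep the masking on the xarray side instead of using numpy's `out=`/`where=` arguments.

```python
import numpy as np
import xarray as xr

a = xr.DataArray([-1, 0, 1, 2, 3], dims="x")
b = xr.DataArray([0, 0, 0, 2, 2], dims="x")

# Mask the zero denominators first (NaN there), then fill those positions.
safe = a / b.where(b != 0)
result = xr.where(b != 0, safe, 0.0)
```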
Relevant log output
Anything else we need to know?No response Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.11.5 (tags/v3.11.5:cce6ba9, Aug 24 2023, 14:38:34) [MSC v.1936 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 158 Stepping 10, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('English_United States', '1252')
libhdf5: 1.14.0
libnetcdf: 4.9.2
xarray: 2024.2.0
pandas: 2.2.1
numpy: 1.26.4
scipy: 1.12.0
netCDF4: 1.6.5
pydap: None
h5netcdf: 1.3.0
h5py: 3.10.0
Nio: None
zarr: 2.17.1
cftime: 1.6.3
nc_time_axis: 1.4.1
iris: None
bottleneck: 1.3.8
dask: 2024.3.1
distributed: 2024.3.1
matplotlib: 3.8.2
cartopy: None
seaborn: 0.13.2
numbagg: 0.8.1
fsspec: 2024.2.0
cupy: None
pint: 0.23
sparse: None
flox: 0.9.3
numpy_groupies: 0.10.2
setuptools: 69.1.1
pip: 24.0
conda: None
pytest: 8.0.2
mypy: 1.8.0
IPython: 8.22.1
sphinx: 7.2.6
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8848/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
2231409978 | PR_kwDOAMm_X85sBYyR | 8920 | Enhance the ugly error in constructor when no data passed | aimtsou 2598924 | closed | 0 | 6 | 2024-04-08T14:42:57Z | 2024-04-10T22:46:57Z | 2024-04-10T22:46:53Z | CONTRIBUTOR | 0 | pydata/xarray/pulls/8920 | This fix addresses issue 8860 by improving the error message. I did not add any test, since I believe one is not needed in this case: the change does not add any new functionality.
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8920/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2198196326 | I_kwDOAMm_X86DBdBm | 8860 | Ugly error in constructor when no data passed | TomNicholas 35968931 | closed | 0 | 2 | 2024-03-20T17:55:52Z | 2024-04-10T22:46:55Z | 2024-04-10T22:46:54Z | MEMBER | What happened?Passing no data to the What did you expect to happen?An error more like "tuple must be of form (dims, data[, attrs])" Minimal Complete Verifiable Example
MVCE confirmation
Relevant log output```PythonIndexError Traceback (most recent call last) Cell In[2], line 1 ----> 1 xr.Dataset({"t": ()}) File ~/Documents/Work/Code/xarray/xarray/core/dataset.py:693, in Dataset.init(self, data_vars, coords, attrs) 690 if isinstance(coords, Dataset): 691 coords = coords._variables --> 693 variables, coord_names, dims, indexes, _ = merge_data_and_coords( 694 data_vars, coords 695 ) 697 self._attrs = dict(attrs) if attrs else None 698 self._close = None File ~/Documents/Work/Code/xarray/xarray/core/dataset.py:422, in merge_data_and_coords(data_vars, coords) 418 coords = create_coords_with_default_indexes(coords, data_vars) 420 # exclude coords from alignment (all variables in a Coordinates object should 421 # already be aligned together) and use coordinates' indexes to align data_vars --> 422 return merge_core( 423 [data_vars, coords], 424 compat="broadcast_equals", 425 join="outer", 426 explicit_coords=tuple(coords), 427 indexes=coords.xindexes, 428 priority_arg=1, 429 skip_align_args=[1], 430 ) File ~/Documents/Work/Code/xarray/xarray/core/merge.py:718, in merge_core(objects, compat, join, combine_attrs, priority_arg, explicit_coords, indexes, fill_value, skip_align_args) 715 for pos, obj in skip_align_objs: 716 aligned.insert(pos, obj) --> 718 collected = collect_variables_and_indexes(aligned, indexes=indexes) 719 prioritized = _get_priority_vars_and_indexes(aligned, priority_arg, compat=compat) 720 variables, out_indexes = merge_collected( 721 collected, prioritized, compat=compat, combine_attrs=combine_attrs 722 ) File ~/Documents/Work/Code/xarray/xarray/core/merge.py:358, in collect_variables_and_indexes(list_of_mappings, indexes) 355 indexes_.pop(name, None) 356 append_all(coords_, indexes_) --> 358 variable = as_variable(variable, name=name, auto_convert=False) 359 if name in indexes: 360 append(name, variable, indexes[name]) File ~/Documents/Work/Code/xarray/xarray/core/variable.py:126, in as_variable(obj, name, auto_convert) 124 obj = obj.copy(deep=False) 125 elif isinstance(obj, tuple): --> 126 if isinstance(obj[1], DataArray): 127 raise TypeError( 128 f"Variable {name!r}: Using a DataArray object to construct a variable is" 129 " ambiguous, please extract the data using the .data property." 130 ) 131 try: IndexError: tuple index out of range ``` Anything else we need to know?No response EnvironmentXarray |
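For reference, the failing call can be recovered from the traceback above:

```python
import xarray as xr

xr.Dataset({"t": ()})  # raises IndexError: tuple index out of range
```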
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8860/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
2232134629 | PR_kwDOAMm_X85sD5gg | 8922 | Add typing to some functions in indexing.py | Illviljan 14371165 | closed | 0 | 0 | 2024-04-08T21:45:30Z | 2024-04-10T18:05:52Z | 2024-04-10T18:05:52Z | MEMBER | 0 | pydata/xarray/pulls/8922 | A drive-by PR as I was trying to figure out how these functions works. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8922/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2230680765 | I_kwDOAMm_X86E9Xy9 | 8919 | Using the xarray.Dataset.where() function takes up a lot of memory | isLiYang 69391863 | closed | 0 | 4 | 2024-04-08T09:15:49Z | 2024-04-09T02:45:09Z | 2024-04-09T02:45:08Z | NONE | What is your issue?My python script was killed because it took up too much memory. After checking, I found that the problem is the ds.where() function. The original netcdf file opened from the hard disk takes up about 10 Mb of storage, but when I mask the data that doesn't match according to the latitude and longitude location, the variable ds takes up a dozen GB of memory. When I deleted this variable using del ds, the memory occupied by the script immediately returned to normal.

```
# Open this netcdf file.
ds = xr.open_dataset(track)

# If longitude range is [-180, 180], then convert to [0, 360].
if np.any(ds[var_lon] < 0):
    ds[var_lon] = ds[var_lon] % 360

# Extract data by longitude and latitude.
ds = ds.where((ds[var_lon] >= region[0]) & (ds[var_lon] <= region[1]) & (ds[var_lat] >= region[2]) & (ds[var_lat] <= region[3]))

# Select data by range and value of some variables.
for key, value in range_select.items():
    ds = ds.where((ds[key] >= value[0]) & (ds[key] <= value[1]))
for key, value in value_select.items():
    ds = ds.where(ds[key].isin(value))
```
|
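A possible lower-memory pattern (a sketch, not taken from this report, reusing the variable names above): build one boolean mask and apply `where` a single time, since every chained `ds.where(...)` call materialises a full-size, float-upcast copy of each variable.

```python
# Combine all conditions first, then mask once (and optionally drop).
mask = (
    (ds[var_lon] >= region[0]) & (ds[var_lon] <= region[1])
    & (ds[var_lat] >= region[2]) & (ds[var_lat] <= region[3])
)
for key, value in range_select.items():
    mask = mask & (ds[key] >= value[0]) & (ds[key] <= value[1])
for key, value in value_select.items():
    mask = mask & ds[key].isin(value)

ds_subset = ds.where(mask, drop=True)
```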
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8919/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
2230357492 | PR_kwDOAMm_X85r9wiV | 8918 | Bump codecov/codecov-action from 4.1.1 to 4.2.0 in the actions group | dependabot[bot] 49699333 | closed | 0 | 0 | 2024-04-08T06:21:47Z | 2024-04-08T16:31:12Z | 2024-04-08T16:31:11Z | CONTRIBUTOR | 0 | pydata/xarray/pulls/8918 | Bumps the actions group with 1 update: codecov/codecov-action. Updates Release notesSourced from codecov/codecov-action's releases.
CommitsDependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting Dependabot commands and optionsYou can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore <dependency name> major version` will close this group update PR and stop Dependabot creating any more for the specific dependency's major version (unless you unignore this specific dependency's major version or upgrade to it yourself) - `@dependabot ignore <dependency name> minor version` will close this group update PR and stop Dependabot creating any more for the specific dependency's minor version (unless you unignore this specific dependency's minor version or upgrade to it yourself) - `@dependabot ignore <dependency name>` will close this group update PR and stop Dependabot creating any more for the specific dependency (unless you unignore this specific dependency or upgrade to it yourself) - `@dependabot unignore <dependency name>` will remove all of the ignore conditions of the specified dependency - `@dependabot unignore <dependency name> <ignore condition>` will remove the ignore condition of the specified dependency and ignore conditions |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8918/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2228373305 | I_kwDOAMm_X86E0kc5 | 8915 | Weird behavior of DataSet.where(... , drop=True) | johannespletzer 22961670 | closed | 0 | 4 | 2024-04-05T16:03:05Z | 2024-04-08T09:32:48Z | 2024-04-08T09:32:48Z | NONE | What happened?I work with an aircraft emission dataset that is freely available online: emission dataset During my calculations I eventually convert the Example 1: Along some dimensions data points vanished if Example 2: For other dimensions (these?) data points appeared elsewhere if What did you expect to happen?I expect my calculations to return the same results, regardless of whether drop=True is active or not. Minimal Complete Verifiable Example

```Python
!wget "https://zenodo.org/records/10818082/files/Emission_Inventory_H2O_Optimized_v0.1_MR3_Fleet_BRU-MYA_2075.nc"

import matplotlib.pyplot as plt
import xarray as xr

nc_file = xr.open_dataset('Emission_Inventory_H2O_Optimized_v0.1_MR3_Fleet_BRU-MYA_2075.nc')

fig, axs = plt.subplots(1,2,figsize=(10,4))
nc_file.H2O.where(nc_file.H2O!=0, drop=True).sum(('lon','time')).plot.contour(x='lat',ax=axs[0])
axs[0].set_xlim(-50,90)
axs[0].set_title('With drop=True')
nc_file.H2O.where(nc_file.H2O!=0, drop=False).sum(('lon','time')).plot.contour(x='lat',ax=axs[1])
axs[1].set_xlim(-50,90)
axs[1].set_title('With drop=False')
plt.tight_layout()
plt.show()

fig, axs = plt.subplots(1,2,figsize=(10,4))
nc_file.H2O.where(nc_file.H2O!=0, drop=True).sum(('lat','time')).plot.contour(x='lon',ax=axs[0])
axs[0].set_title('With drop=True')
nc_file.H2O.where(nc_file.H2O!=0, drop=False).sum(('lat','time')).plot.contour(x='lon',ax=axs[1])
axs[1].set_title('With drop=False')
plt.tight_layout()
plt.show()
```

MVCE confirmation
Relevant log output

No response

Anything else we need to know?

No response

Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.10.9 | packaged by Anaconda, Inc. | (main, Mar 1 2023, 18:18:15) [MSC v.1916 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 165 Stepping 2, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: ('en_US', 'ISO8859-1')
libhdf5: 1.14.0
libnetcdf: 4.9.2
xarray: 2022.11.0
pandas: 1.5.3
numpy: 1.23.5
scipy: 1.13.0
netCDF4: 1.6.5
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.3
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.5
dask: None
distributed: None
matplotlib: 3.7.0
cartopy: 0.21.1
seaborn: 0.12.2
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 65.6.3
pip: 22.3.1
conda: None
pytest: None
IPython: 8.10.0
sphinx: None
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8915/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
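A minimal, hedged illustration of the behaviour reported above (constructed for this write-up, not taken from the issue): with `drop=True`, coordinate labels that are masked everywhere along the other dimensions are removed before any later reduction, so sums run over a smaller, re-labelled grid than with `drop=False`. The toy data below stands in for the emission dataset.

```python
import numpy as np
import xarray as xr

da = xr.DataArray(
    [[0.0, 2.0], [0.0, 0.0]],
    coords={"lat": [10, 20], "lon": [100, 110]},
    dims=["lat", "lon"],
)

kept = da.where(da != 0)                # same shape, zeros become NaN
dropped = da.where(da != 0, drop=True)  # lat=20 and lon=100 are removed entirely

print(kept.sum("lon"))     # one value per original lat label
print(dropped.sum("lon"))  # only lat=10 remains, so the result is re-aligned
```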
2227630565 | I_kwDOAMm_X86ExvHl | 8912 | xarray and pyinstaller | happymvd 135848029 | closed | 0 | 3 | 2024-04-05T10:14:44Z | 2024-04-08T07:10:46Z | 2024-04-05T20:26:48Z | NONE | What is your issue?

I am working on a Windows 11 computer with Python 3.11.9 installed as the only version of Python. I am working in Visual Studio Code. I have created and activated a venv virtual environment for my project (not a conda one) and used pip to install xarray into the virtual environment.

I have a test script called Testing.py; the only thing it does is import xarray as xr and then print a message to the terminal window. This works 100% in Python in the VSC terminal window.

I then issue the command pyinstaller --onefile Testing.py, which creates a file called Testing.exe for me. When I run Testing.exe I get the following error message:

Traceback (most recent call last):
File "importlib\metadata\__init__.py", line 563, in from_name
StopIteration

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "Testing.py", line 20, in <module>
File "PyInstaller\loader\pyimod02_importers.py", line 419, in exec_module
File "xarray\__init__.py", line 3, in <module>
File "PyInstaller\loader\pyimod02_importers.py", line 419, in exec_module
File "xarray\testing\__init__.py", line 1, in <module>
File "PyInstaller\loader\pyimod02_importers.py", line 419, in exec_module
File "xarray\testing\assertions.py", line 11, in <module>
File "PyInstaller\loader\pyimod02_importers.py", line 419, in exec_module
File "xarray\core\duck_array_ops.py", line 36, in <module>
File "PyInstaller\loader\pyimod02_importers.py", line 419, in exec_module
File "xarray\core\dask_array_ops.py", line 3, in <module>
File "PyInstaller\loader\pyimod02_importers.py", line 419, in exec_module
File "xarray\core\nputils.py", line 14, in <module>
File "xarray\namedarray\utils.py", line 60, in module_available
File "importlib\metadata\__init__.py", line 1009, in version
File "importlib\metadata\__init__.py", line 982, in distribution
File "importlib\metadata\__init__.py", line 565, in from_name
importlib.metadata.PackageNotFoundError: No package metadata was found for numpy
[15596] Failed to execute script 'Testing' due to unhandled exception!

Please, is there anyone who can suggest what I need to do to get around this problem? I need to distribute my project when it is complete. (A hedged build-options sketch follows this entry.) |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8912/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
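A hedged sketch related to the PyInstaller report above (an editorial suggestion, not from the issue thread): xarray queries package metadata through `importlib.metadata` at import time, and one-file PyInstaller builds can omit that metadata from the bundle. Copying the metadata for numpy (and xarray) is a commonly suggested workaround; the options used here are standard PyInstaller flags, invoked through its Python entry point, and `Testing.py` refers to the script from the report.

```python
# Equivalent to: pyinstaller --onefile --copy-metadata numpy --copy-metadata xarray
#                            --collect-all xarray Testing.py
import PyInstaller.__main__

PyInstaller.__main__.run([
    "Testing.py",
    "--onefile",
    "--copy-metadata=numpy",   # bundle numpy's dist-info so importlib.metadata can find it
    "--copy-metadata=xarray",
    "--collect-all=xarray",    # optionally also collect xarray's non-code resources
])
```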
2229440952 | I_kwDOAMm_X86E4pG4 | 8917 | from_array() got an unexpected keyword argument 'inline_array' | Shiviiii23 29259305 | closed | 0 | 6 | 2024-04-06T22:09:47Z | 2024-04-08T00:15:38Z | 2024-04-08T00:15:37Z | NONE | What happened?

I got the error: from_array() got an unexpected keyword argument 'inline_array' for this line in xarray (lines 1230 and 1231 in xarray/core/variable.py):

data = da.from_array(
    data, chunks, name=name, lock=lock, inline_array=inline_array, **kwargs
)

(A hedged version-check note follows this entry.)

What did you expect to happen?

No response

Minimal Complete Verifiable Example

No response

MVCE confirmation

Relevant log output

No response

Anything else we need to know?

No response

Environment |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8917/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
not_planned | xarray 13221727 | issue | ||||||
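A hedged note on the error above (an editorial reading, not confirmed in the thread): `inline_array` is an argument that newer xarray versions pass to `dask.array.from_array`, so the exception usually indicates that the installed dask predates that argument, i.e. the dask and xarray versions are out of sync. A quick way to check what is installed before upgrading:

```python
import importlib.metadata

# Print the installed versions; upgrading dask (or pinning a matching xarray) is the
# usual first step when from_array() rejects inline_array.
for pkg in ("xarray", "dask"):
    print(pkg, importlib.metadata.version(pkg))
```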
2228266052 | PR_kwDOAMm_X85r24hE | 8913 | Update hypothesis action to always save the cache | dcherian 2448579 | closed | 0 | 0 | 2024-04-05T15:09:35Z | 2024-04-05T16:51:05Z | 2024-04-05T16:51:03Z | MEMBER | 0 | pydata/xarray/pulls/8913 | Update the cache always. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8913/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2215113392 | PR_kwDOAMm_X85rJ3wR | 8889 | Add typing to test_plot.py | Illviljan 14371165 | closed | 0 | 0 | 2024-03-29T10:49:39Z | 2024-04-05T16:42:27Z | 2024-04-05T16:42:27Z | MEMBER | 0 | pydata/xarray/pulls/8889 | Enforce typing on all tests in |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8889/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2057651682 | PR_kwDOAMm_X85i2Byx | 8573 | ddof vs correction kwargs in std/var | TomNicholas 35968931 | closed | 0 | 0 | 2023-12-27T18:10:52Z | 2024-04-04T16:46:55Z | 2024-04-04T16:46:55Z | MEMBER | 0 | pydata/xarray/pulls/8573 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8573/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2136709010 | I_kwDOAMm_X85_W5eS | 8753 | Lazy Loading with `DataArray` vs. `Variable` | dcherian 2448579 | closed | 0 | 0 | 2024-02-15T14:42:24Z | 2024-04-04T16:46:54Z | 2024-04-04T16:46:54Z | MEMBER | Discussed in https://github.com/pydata/xarray/discussions/8751
<sup>Originally posted by **ilan-gold** February 15, 2024</sup>
My goal is to get a dataset from [custom io-zarr backend lazy-loaded](https://docs.xarray.dev/en/stable/internals/how-to-add-new-backend.html#how-to-support-lazy-loading). But when I declare a `DataArray` based on the `Variable` which uses `LazilyIndexedArray`, everything is read in. Is this expected? I specifically don't want to have to use dask if possible. I have seen https://github.com/aurghs/xarray-backend-tutorial/blob/main/2.Backend_with_Lazy_Loading.ipynb but it's a little bit different.
While I have a custom backend array inheriting from `ZarrArrayWrapper`, this example using `ZarrArrayWrapper` directly still highlights the same unexpected behavior of everything being read in.
```python
import zarr
import xarray as xr
from tempfile import mkdtemp
import numpy as np
from pathlib import Path
from collections import defaultdict


class AccessTrackingStore(zarr.DirectoryStore):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._access_count = {}
        self._accessed = defaultdict(set)

    def __getitem__(self, key):
        for tracked in self._access_count:
            if tracked in key:
                self._access_count[tracked] += 1
                self._accessed[tracked].add(key)
        return super().__getitem__(key)

    def get_access_count(self, key):
        return self._access_count[key]

    def set_key_trackers(self, keys_to_track):
        if isinstance(keys_to_track, str):
            keys_to_track = [keys_to_track]
        for k in keys_to_track:
            self._access_count[k] = 0

    def get_subkeys_accessed(self, key):
        return self._accessed[key]


orig_path = Path(mkdtemp())
z = zarr.group(orig_path / "foo.zarr")
z['array'] = np.random.randn(1000, 1000)
store = AccessTrackingStore(orig_path / "foo.zarr")
store.set_key_trackers(['array'])
z = zarr.group(store)

arr = xr.backends.zarr.ZarrArrayWrapper(z['array'])
lazy_arr = xr.core.indexing.LazilyIndexedArray(arr)

# just `.zarray`
var = xr.Variable(('x', 'y'), lazy_arr)
print('Variable read in ', store.get_subkeys_accessed('array'))

# now everything is read in
da = xr.DataArray(var)
print('DataArray read in ', store.get_subkeys_accessed('array'))
``` |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8753/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
2136724736 | PR_kwDOAMm_X85m_MtN | 8754 | Don't access data when creating DataArray from Variable. | dcherian 2448579 | closed | 0 | 2 | 2024-02-15T14:48:32Z | 2024-04-04T16:46:54Z | 2024-04-04T16:46:53Z | MEMBER | 0 | pydata/xarray/pulls/8754 |
This seems to have been around since 2016-ish, so presumably our backend code path is passing arrays around, not Variables. cc @ilan-gold |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8754/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2220470499 | I_kwDOAMm_X86EWbDj | 8902 | opening a zarr dataset taking so much time | DarshanSP19 93967637 | closed | 0 | 10 | 2024-04-02T13:01:52Z | 2024-04-04T07:49:51Z | 2024-04-04T07:49:51Z | NONE | What is your issue?

I have an ERA5 dataset stored in a GCS bucket as zarr. It contains 273 weather-related variables and 4 dimensions; the data is stored hourly from 1940 to 2023.
When I try to open it, it takes a very long time. (See the hedged sketch after this entry.) |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8902/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
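A hedged sketch for the report above (an editorial suggestion, not from the issue thread): with hundreds of variables, much of the time in opening a remote zarr store is often spent fetching per-variable metadata from object storage, and consolidated metadata collapses that into a single read. The bucket path below is hypothetical and stands in for the real store.

```python
import xarray as xr

store = "gs://my-bucket/era5.zarr"  # hypothetical path

# If the store has a consolidated .zmetadata file, this avoids one round-trip per variable.
ds = xr.open_zarr(store, consolidated=True)

# If it does not, the metadata can be consolidated once (requires write access), e.g.:
# import zarr
# zarr.consolidate_metadata(store)
```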
2224300175 | PR_kwDOAMm_X85rpG4S | 8907 | Trigger hypothesis stateful tests nightly | dcherian 2448579 | closed | 0 | 0 | 2024-04-04T02:16:59Z | 2024-04-04T02:17:49Z | 2024-04-04T02:17:47Z | MEMBER | 0 | pydata/xarray/pulls/8907 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8907/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2098659175 | PR_kwDOAMm_X85k-T6b | 8658 | Stateful tests with Dataset | dcherian 2448579 | closed | 0 | 8 | 2024-01-24T16:34:59Z | 2024-04-03T21:29:38Z | 2024-04-03T21:29:36Z | MEMBER | 0 | pydata/xarray/pulls/8658 | I was curious to see if the hypothesis stateful testing would catch an inconsistent sequence of index manipulation operations like #8646. Turns out PS: this blog post is amazing.
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8658/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2000205407 | PR_kwDOAMm_X85fzupc | 8467 | [skip-ci] dev whats-new | dcherian 2448579 | closed | 0 | 0 | 2023-11-18T03:59:29Z | 2024-04-03T21:08:45Z | 2023-11-18T15:20:37Z | MEMBER | 0 | pydata/xarray/pulls/8467 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8467/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
1989233637 | PR_kwDOAMm_X85fOdAk | 8446 | Remove PseudoNetCDF | dcherian 2448579 | closed | 0 | 0 | 2023-11-12T04:29:50Z | 2024-04-03T21:08:44Z | 2023-11-13T21:53:56Z | MEMBER | 0 | pydata/xarray/pulls/8446 | joining the party
- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8446/reactions", "total_count": 1, "+1": 0, "-1": 0, "laugh": 1, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2064698904 | PR_kwDOAMm_X85jLHsQ | 8584 | Silence a bunch of CachingFileManager warnings | dcherian 2448579 | closed | 0 | 1 | 2024-01-03T21:57:07Z | 2024-04-03T21:08:27Z | 2024-01-03T22:52:58Z | MEMBER | 0 | pydata/xarray/pulls/8584 | { "url": "https://api.github.com/repos/pydata/xarray/issues/8584/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
2102850331 | PR_kwDOAMm_X85lMW8k | 8674 | Fix negative slicing of Zarr arrays | dcherian 2448579 | closed | 0 | 0 | 2024-01-26T20:22:21Z | 2024-04-03T21:08:26Z | 2024-02-10T02:57:32Z | MEMBER | 0 | pydata/xarray/pulls/8674 | Closes #8252 Closes #3921
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8674/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2148245262 | PR_kwDOAMm_X85nmmqX | 8777 | Return a dataclass from Grouper.factorize | dcherian 2448579 | closed | 0 | 0 | 2024-02-22T05:41:29Z | 2024-04-03T21:08:25Z | 2024-03-15T04:47:30Z | MEMBER | 0 | pydata/xarray/pulls/8777 | Toward #8510, builds on #8776 |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8777/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2148164557 | PR_kwDOAMm_X85nmU5w | 8775 | [skip-ci] NamedArray: Add lazy indexing array refactoring plan | dcherian 2448579 | closed | 0 | 0 | 2024-02-22T04:25:49Z | 2024-04-03T21:08:21Z | 2024-02-23T22:20:09Z | MEMBER | 0 | pydata/xarray/pulls/8775 | This adds a proposal for decoupling the lazy indexing array machinery, indexing adapter machinery, and Variable's setitem and getitem methods, so that the latter can be migrated to NamedArray. cc @andersy005 |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8775/reactions", "total_count": 2, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 2, "eyes": 0 } |
xarray 13221727 | pull | |||||
2198991054 | PR_kwDOAMm_X85qTNFP | 8861 | upstream-dev CI: Fix interp and cumtrapz | dcherian 2448579 | closed | 0 | 0 | 2024-03-21T02:49:40Z | 2024-04-03T21:08:17Z | 2024-03-21T04:16:45Z | MEMBER | 0 | pydata/xarray/pulls/8861 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8861/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2206243581 | I_kwDOAMm_X86DgJr9 | 8876 | Possible race condition when appending to an existing zarr | rsemlal-murmuration 157591329 | closed | 0 | 4 | 2024-03-25T16:59:52Z | 2024-04-03T15:23:14Z | 2024-03-29T14:35:52Z | NONE | What happened?

When appending to an existing zarr along a dimension (`append_dim`), some of the appended values occasionally end up as NaN in the store.

What did you expect to happen?

We would expect the zarr append to have the same behaviour as if we concatenated the datasets in memory (using `xr.concat`, for example).

Minimal Complete Verifiable Example

```Python
from distributed import Client, LocalCluster
import xarray as xr
import tempfile

ds1 = xr.Dataset({"a": ("x", [1., 1.])}, coords={'x': [1, 2]}).chunk({"x": 3})
ds2 = xr.Dataset({"a": ("x", [1., 1., 1., 1.])}, coords={'x': [3, 4, 5, 6]}).chunk({"x": 3})

with Client(LocalCluster(processes=False, n_workers=1, threads_per_worker=2)):  # The issue happens only when: threads_per_worker > 1
    for i in range(0, 100):
        with tempfile.TemporaryDirectory() as store:
            print(store)
            ds1.to_zarr(store, mode="w")  # write first dataset
            ds2.to_zarr(store, mode="a", append_dim="x")  # append second dataset
``` MVCE confirmation
Relevant log output
Anything else we need to know?

The example code snippet provided here reproduces the issue. Since the issue occurs randomly, we loop in the example a few times and stop when the issue occurs.

In the example, when `ds1` (2 values with a requested dask chunk size of 3) is written first, the store is created with a zarr chunk size of 2 along `x`. Side note: This behaviour in itself is not problematic in this case, but the fact that the chunking is silently changed made this issue harder to spot. However, when we try to append the second dataset `ds2` (4 values with a dask chunk size of 3), the dask chunks no longer line up with the zarr chunks.

Zarr chunks:
+ chunk1 : x = [1, 2]
+ chunk2 : x = [3, 4]
+ chunk3 : x = [5, 6]

Dask chunks for `ds2`:
+ A : x = [3, 4, 5]
+ B : x = [6]

Both dask chunks A and B are supposed to write on zarr chunk3. And depending on who writes first, we can end up with NaN on `x=5` or `x=6`. The issue obviously happens only when dask tasks are run in parallel. Using `threads_per_worker=1` makes it disappear.

We couldn't figure out from the documentation how to detect this kind of issue, and how to prevent it from happening (maybe using a synchronizer?). (See the hedged chunk-alignment sketch after this row.)

Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.11.0rc1 (main, Aug 12 2022, 10:02:14) [GCC 11.2.0]
python-bits: 64
OS: Linux
OS-release: 5.15.133.1-microsoft-standard-WSL2
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.9.3-development
xarray: 2024.2.0
pandas: 2.2.1
numpy: 1.26.4
scipy: 1.12.0
netCDF4: 1.6.5
pydap: None
h5netcdf: 1.3.0
h5py: 3.10.0
Nio: None
zarr: 2.17.1
cftime: 1.6.3
nc_time_axis: 1.4.1
iris: None
bottleneck: 1.3.8
dask: 2024.3.1
distributed: 2024.3.1
matplotlib: 3.8.3
cartopy: None
seaborn: 0.13.2
numbagg: 0.8.1
fsspec: 2024.3.1
cupy: None
pint: None
sparse: None
flox: 0.9.5
numpy_groupies: 0.10.2
setuptools: 69.2.0
pip: 24.0
conda: None
pytest: 8.1.1
mypy: None
IPython: 8.22.2
sphinx: None
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8876/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
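A hedged sketch of one way to avoid the overlap described above (an editorial workaround, not a maintainer-endorsed fix): rechunk the appended dataset so its dask chunks line up with the zarr chunks already on disk, so that no two dask tasks write to the same zarr chunk. The store path is hypothetical and the data mirrors the MVCE from the report.

```python
import xarray as xr
import zarr

store = "example_store.zarr"  # hypothetical local path for illustration

ds1 = xr.Dataset({"a": ("x", [1.0, 1.0])}, coords={"x": [1, 2]}).chunk({"x": 3})
ds2 = xr.Dataset({"a": ("x", [1.0, 1.0, 1.0, 1.0])}, coords={"x": [3, 4, 5, 6]}).chunk({"x": 3})

ds1.to_zarr(store, mode="w")

# Read back the zarr chunk length that was actually written for "a" (it can differ
# from the requested dask chunks, as noted in the report above).
zarr_chunk = zarr.open_group(store, mode="r")["a"].chunks[0]

# Rechunk so every dask chunk covers whole zarr chunks, then append.
ds2.chunk({"x": zarr_chunk}).to_zarr(store, mode="a", append_dim="x")
```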
2171912634 | PR_kwDOAMm_X85o3Ify | 8809 | Pass variable name to `encode_zarr_variable` | slevang 39069044 | closed | 0 | 6 | 2024-03-06T16:21:53Z | 2024-04-03T14:26:49Z | 2024-04-03T14:26:48Z | CONTRIBUTOR | 0 | pydata/xarray/pulls/8809 |
The change from https://github.com/pydata/xarray/pull/8672 mostly fixed the issue of serializing a reset multiindex in the backends, but there was an additional niche issue that turned up in xeofs that was causing serialization to still fail on the zarr backend. The issue is that zarr is the only backend that uses a custom version of `encode_cf_variable` (`encode_zarr_variable`). As a minimal fix, this PR just passes the variable `name` through to `encode_zarr_variable`.

The exact workflow this turned up in involves DataTree and looks like this:

```python
import numpy as np
import xarray as xr
from datatree import DataTree

# ND DataArray that gets stacked along a multiindex
da = xr.DataArray(np.ones((3, 3)), coords={"dim1": [1, 2, 3], "dim2": [4, 5, 6]})
da = da.stack(feature=["dim1", "dim2"])

# Extract just the stacked coordinates for saving in a dataset
ds = xr.Dataset(data_vars={"feature": da.feature})

# Reset the multiindex, which should make things serializable
ds = ds.reset_index("feature")

dt1 = DataTree()
dt2 = DataTree(name="feature", data=ds)
dt1["foo"] = dt2

# Somehow in this step, dt1.foo.feature.dim1.variable becomes an IndexVariable again
print(type(dt1.foo.feature.dim1.variable))

# Works
dt1.to_netcdf("test.nc", mode="w")

# Fails
dt1.to_zarr("test.zarr", mode="w")
```

But we can reproduce in xarray with the test added here. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8809/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2220487961 | PR_kwDOAMm_X85rb6ea | 8903 | Update docstring for compute and persist | saschahofmann 24508496 | closed | 0 | 2 | 2024-04-02T13:10:02Z | 2024-04-03T07:45:10Z | 2024-04-02T23:52:32Z | CONTRIBUTOR | 0 | pydata/xarray/pulls/8903 |
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8903/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2220228856 | I_kwDOAMm_X86EVgD4 | 8901 | Is .persist in place or like .compute? | saschahofmann 24508496 | closed | 0 | 3 | 2024-04-02T11:09:59Z | 2024-04-02T23:52:33Z | 2024-04-02T23:52:33Z | CONTRIBUTOR | What is your issue?

I am playing around with using `.persist()` and wondering whether it operates in place or returns a new object like `.compute()` does (see the sketch after this entry). In either case, I would make a PR to clarify in the docs whether persist leaves the original data untouched or not. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8901/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
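A hedged sketch of the distinction being asked about above (an editorial understanding of the behaviour the docstring update in #8903 was meant to clarify): like `compute()`, `persist()` returns a new object rather than modifying the original in place; the difference is that `persist()` keeps the result as dask arrays whose chunks have already been evaluated.

```python
import numpy as np
import xarray as xr

ds = xr.Dataset({"a": ("x", np.arange(6))}).chunk({"x": 3})

computed = ds.compute()    # new Dataset backed by numpy arrays
persisted = ds.persist()   # new Dataset, still dask-backed, but with materialized chunks

print(type(ds["a"].data))         # the original stays lazy (dask array)
print(type(computed["a"].data))   # numpy.ndarray
print(type(persisted["a"].data))  # dask array holding already-computed chunks
```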
2218195713 | PR_kwDOAMm_X85rUCVG | 8898 | Update reference to 'Weighted quantile estimators' | AndreyAkinshin 2259237 | closed | 0 | 3 | 2024-04-01T12:49:36Z | 2024-04-02T12:51:28Z | 2024-04-01T15:42:19Z | CONTRIBUTOR | 0 | pydata/xarray/pulls/8898 | { "url": "https://api.github.com/repos/pydata/xarray/issues/8898/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
2215890029 | I_kwDOAMm_X86EE8xt | 8894 | Rolling reduction with a custom function generates an excessive use of memory that kills the workers | josephnowak 25071375 | closed | 0 | 8 | 2024-03-29T19:15:28Z | 2024-04-01T20:57:59Z | 2024-03-30T01:49:17Z | CONTRIBUTOR | What happened?

Hi, I have been trying to use a custom function with the rolling reduction method. The original function tries to filter the nan values (any numpy function that I have used that handles nans generates the same problem) and then apply some simple aggregate functions, but it is killing all my workers even when the data is very small (I have 7 workers and all of them have 3 GB of RAM).

What did you expect to happen?

I would expect less use of memory, taking into account the size of the rolling window, the simplicity of the function and the amount of data used in the example. (A hedged sketch of a lighter-weight alternative follows this entry.)

Minimal Complete Verifiable Example

```Python
import numpy as np
import dask.array as da
import xarray as xr
import dask


def f(x, axis):
    # If I replace np.nansum by np.sum everything works perfectly and the amount of memory used is very small
    return np.nansum(x, axis=axis)


arr = xr.DataArray(
    dask.array.zeros(
        shape=(300, 30000),
        dtype=float,
        chunks=(30, 6000)
    ),
    dims=["a", "b"],
    coords={"a": list(range(300)), "b": list(range(30000))}
)

arr.rolling(a=252).reduce(f).chunk({"a": 252}).to_zarr("/data/test/test_write", mode="w")
```

MVCE confirmation
Relevant log output
Anything else we need to know?

No response

Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.11.7 | packaged by conda-forge | (main, Dec 23 2023, 14:43:09) [GCC 12.3.0]
python-bits: 64
OS: Linux
OS-release: 4.14.275-207.503.amzn2.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.3
libnetcdf: None
xarray: 2024.1.0
pandas: 2.2.1
numpy: 1.26.3
scipy: 1.11.4
netCDF4: None
pydap: None
h5netcdf: None
h5py: 3.10.0
Nio: None
zarr: 2.16.1
cftime: None
nc_time_axis: None
iris: None
bottleneck: 1.3.7
dask: 2024.1.0
distributed: 2024.1.0
matplotlib: 3.8.2
cartopy: None
seaborn: 0.13.1
numbagg: 0.7.0
fsspec: 2023.12.2
cupy: None
pint: None
sparse: 0.15.1
flox: 0.8.9
numpy_groupies: 0.10.2
setuptools: 69.0.3
pip: 23.3.2
conda: 23.11.0
pytest: 7.4.4
mypy: None
IPython: 8.20.0
sphinx: None
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8894/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
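A hedged alternative to the custom `np.nansum` reduction above (an editorial suggestion, not from the issue thread): the built-in rolling aggregations usually need far less memory than `.reduce()` with a NaN-aware numpy function, because they avoid materializing a full windowed copy of each block; NaN handling is then controlled with `min_periods` rather than `nansum`. The output path below is illustrative only.

```python
import dask.array
import xarray as xr

arr = xr.DataArray(
    dask.array.zeros(shape=(300, 30000), dtype=float, chunks=(30, 6000)),
    dims=["a", "b"],
    coords={"a": list(range(300)), "b": list(range(30000))},
)

# Built-in rolling sum; min_periods=1 keeps partially filled windows instead of NaN.
rolled = arr.rolling(a=252, min_periods=1).sum()

# Another option to experiment with: a lazily constructed window dimension.
# rolled = arr.rolling(a=252).construct("window").sum("window", skipna=True)

rolled.chunk({"a": 252}).to_zarr("rolled_output.zarr", mode="w")
```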
2218712684 | PR_kwDOAMm_X85rV1i2 | 8900 | [pre-commit.ci] pre-commit autoupdate | pre-commit-ci[bot] 66853113 | closed | 0 | 0 | 2024-04-01T17:27:26Z | 2024-04-01T18:57:43Z | 2024-04-01T18:57:42Z | CONTRIBUTOR | 0 | pydata/xarray/pulls/8900 | { "url": "https://api.github.com/repos/pydata/xarray/issues/8900/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
2217657228 | PR_kwDOAMm_X85rSLKy | 8896 | Bump the actions group with 1 update | dependabot[bot] 49699333 | closed | 0 | 0 | 2024-04-01T06:50:24Z | 2024-04-01T18:02:56Z | 2024-04-01T18:02:56Z | CONTRIBUTOR | 0 | pydata/xarray/pulls/8896 | Bumps the actions group with 1 update: codecov/codecov-action. Updates codecov/codecov-action.

Release notes
Sourced from codecov/codecov-action's releases.

Changelog
Sourced from codecov/codecov-action's changelog.
... (truncated) Commits
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- `@dependabot ignore <dependency name> major version` will close this group update PR and stop Dependabot creating any more for the specific dependency's major version (unless you unignore this specific dependency's major version or upgrade to it yourself)
- `@dependabot ignore <dependency name> minor version` will close this group update PR and stop Dependabot creating any more for the specific dependency's minor version (unless you unignore this specific dependency's minor version or upgrade to it yourself)
- `@dependabot ignore <dependency name>` will close this group update PR and stop Dependabot creating any more for the specific dependency (unless you unignore this specific dependency or upgrade to it yourself)
- `@dependabot unignore <dependency name>` will remove all of the ignore conditions of the specified dependency
- `@dependabot unignore <dependency name> <ignore condition>` will remove the ignore condition of the specified dependency and ignore conditions |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8896/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2218574880 | PR_kwDOAMm_X85rVXJC | 8899 | New empty whatsnew entry | TomNicholas 35968931 | closed | 0 | 0 | 2024-04-01T16:04:27Z | 2024-04-01T17:49:09Z | 2024-04-01T17:49:06Z | MEMBER | 0 | pydata/xarray/pulls/8899 | Should have been done as part of the last release https://github.com/pydata/xarray/releases/tag/v2024.03.0 |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8899/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2215539648 | PR_kwDOAMm_X85rLW_p | 8891 | 2024.03.0: Add whats-new | dcherian 2448579 | closed | 0 | 0 | 2024-03-29T15:01:35Z | 2024-03-29T17:07:19Z | 2024-03-29T17:07:17Z | MEMBER | 0 | pydata/xarray/pulls/8891 | { "url": "https://api.github.com/repos/pydata/xarray/issues/8891/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | ||||||
2215324218 | PR_kwDOAMm_X85rKmW7 | 8890 | Add typing to test_groupby.py | Illviljan 14371165 | closed | 0 | 1 | 2024-03-29T13:13:59Z | 2024-03-29T16:38:17Z | 2024-03-29T16:38:16Z | MEMBER | 0 | pydata/xarray/pulls/8890 | Enforce typing on all tests in |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8890/reactions", "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 1, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
2203250238 | PR_kwDOAMm_X85qh2s8 | 8867 | Avoid in-place multiplication of a large value to an array with small integer dtype | Illviljan 14371165 | closed | 0 | 3 | 2024-03-22T20:22:22Z | 2024-03-29T15:26:38Z | 2024-03-29T15:26:38Z | MEMBER | 0 | pydata/xarray/pulls/8867 | Upstream numpy has become a bit more particular with which types you can use for inplace operations. This PR fixes:

```
__________ TestImshow.test_imshow_rgb_values_in_valid_range __________

self = <xarray.tests.test_plot.TestImshow object at 0x7f88320c2780>

/home/runner/work/xarray/xarray/xarray/tests/test_plot.py:2034:
/home/runner/work/xarray/xarray/xarray/plot/accessor.py:421: in imshow
    return dataarray_plot.imshow(self._da, *args, **kwargs)
/home/runner/work/xarray/xarray/xarray/plot/dataarray_plot.py:1601: in newplotfunc
    primitive = plotfunc(
/home/runner/work/xarray/xarray/xarray/plot/dataarray_plot.py:1853: in imshow
    alpha *= 255

self = masked_array(
  data=[[[1], [1], [1], [1], [1]],
  mask=False, fill_value=np.int64(999999), dtype=uint8)
other = 255

/home/runner/micromamba/envs/xarray-tests/lib/python3.12/site-packages/numpy/ma/core.py:4415: UFuncTypeError
```

Some curious behaviors seen while debugging:

```python
import numpy as np

alpha = np.array([1], dtype=np.int8)
alpha *= 255
repr(alpha)  # 'array([-1], dtype=int8)'

alpha = np.array([1], dtype=np.int16)
alpha *= 255
repr(alpha)  # 'array([255], dtype=int16)'
```

xref: #8844 |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8867/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull |
CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo] ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone] ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee] ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user] ON [issues] ([user]);