issues

4 rows where repo = 13221727, type = "issue" and user = 34257249 sorted by updated_at descending

xarray#7726 · open_zarr: PermissionError with multiple processes despite use of ProcessSynchronizer
opened by templiert · open · 0 comments · created 2023-04-05T18:55:12Z · updated 2023-04-06T01:37:32Z · CONTRIBUTOR

What happened?

Several processes read from and write to an xarray dataset stored in .zarr format on a network share. The write operations target existing regions, and these regions are not aligned to chunk boundaries, so I use a ProcessSynchronizer. The ProcessSynchronizer points to a local folder on an SSD, separate from the stored array itself.
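For illustration, a minimal sketch of the setup described above, written against the plain zarr API (paths, shape, and chunk sizes are placeholders, not from the report): the synchronizer directory lives outside the store and serializes concurrent writes that touch the same chunk.

```Python
import zarr
from zarr.sync import ProcessSynchronizer

# Placeholder paths: the synchronizer lives on a local SSD, the store on a network share.
synchronizer = ProcessSynchronizer("local_ssd/example.sync")
z = zarr.open_array(
    "network_share/example.zarr",
    mode="a",
    shape=(100, 100),
    chunks=(10, 10),
    synchronizer=synchronizer,  # guards concurrent writes to the same chunk
)
z[0:5, 0:5] = 1.0  # a region write that is not aligned to chunk boundaries
```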

After several hundred read/write operations I get permission errors like the one below. So far I have failed to reproduce the error with an MCVE.

The file `0` that gave the permission error is the chunk holding the coordinates of one dimension, inside that dimension's folder `dim_yyy`:

```
dim_yyy
|-- .zarray
|-- .zattrs
`-- 0
```

What did you expect to happen?

No permission error.

Minimal Complete Verifiable Example

I have so far failed to reproduce the error with an MVCE; here is my attempt.

```Python
from pathlib import Path

import dask.array as da
import pandas as pd
import xarray as xr
from dask.distributed import Client
from zarr.sync import ProcessSynchronizer

if __name__ == "__main__":
    path_store = Path(aaa)  # placeholder for the actual store path
    path_synchronizer = Path(bbb)  # placeholder; must exist, and not the same location as the store

    # create and save a dataset to zarr
    s0, s1, s2 = 10, 10, 10
    temperature = da.random.random((s0, s1, s2), chunks=[s0, s1, s2])
    precipitation = da.random.random((s0, s1, s2), chunks=[s0, s1, s2])
    lon = da.random.random((s0, s1))
    lat = da.random.random((s0, s1))
    time = pd.date_range("2014-09-06", periods=s2)
    reference_time = pd.Timestamp("2014-09-05")
    ds = xr.Dataset(
        data_vars=dict(
            temperature=(["x", "y", "time"], temperature),
            precipitation=(["x", "y", "time"], precipitation),
        ),
        coords=dict(
            lon=(["x", "y"], lon),
            lat=(["x", "y"], lat),
            time=time,
            reference_time=reference_time,
        ),
        attrs=dict(description="Weather related data."),
    )
    print(f"{ds=}")
    ds.to_zarr(path_store, mode="w")

    def read_write(path_store: Path):
        """Lazily open the dataset, then write into a region. Comment/uncomment to use the synchronizer."""
        synchronizer = ProcessSynchronizer(path_synchronizer)
        for b in range(100):
            # open the saved dataset
            # ds = xr.open_zarr(path_store, synchronizer=synchronizer)
            ds = xr.open_zarr(path_store)

            # process a region
            dst = (
                ds.temperature.isel(x=slice(0, 5), y=slice(0, 5), time=slice(0, 5))
                .to_dataset()
                .load()
            )
            dst["temperature"] = -dst["temperature"]
            dst = dst.drop_vars(["time", "reference_time"])

            # save the region to the zarr store
            dst.to_zarr(
                path_store,
                region={
                    "x": slice(0, 5),
                    "y": slice(0, 5),
                    "time": slice(0, 5),
                },
                # synchronizer=synchronizer,
            )

    # independent processes that perform read and write operations
    with Client(processes=True) as client:
        futures = [client.submit(read_write, path_store) for a in range(1000)]
        client.gather(futures)
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [ ] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [ ] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

```Python
    return xr.open_zarr(path, synchronizer=synchronizer)
  File "C:\anaconda3\lib\site-packages\xarray\backends\zarr.py", line 787, in open_zarr
    ds = open_dataset(
  File "C:\anaconda3\lib\site-packages\xarray\backends\api.py", line 539, in open_dataset
    backend_ds = backend.open_dataset(
  File "C:\anaconda3\lib\site-packages\xarray\backends\zarr.py", line 862, in open_dataset
    ds = store_entrypoint.open_dataset(
  File "C:\anaconda3\lib\site-packages\xarray\backends\store.py", line 43, in open_dataset
    ds = Dataset(vars, attrs=attrs)
  File "C:\anaconda3\lib\site-packages\xarray\core\dataset.py", line 604, in __init__
    variables, coord_names, dims, indexes, _ = merge_data_and_coords(
  File "C:\anaconda3\lib\site-packages\xarray\core\merge.py", line 575, in merge_data_and_coords
    return merge_core(
  File "C:\anaconda3\lib\site-packages\xarray\core\merge.py", line 755, in merge_core
    collected = collect_variables_and_indexes(aligned, indexes=indexes)
  File "C:\anaconda3\lib\site-packages\xarray\core\merge.py", line 365, in collect_variables_and_indexes
    variable = as_variable(variable, name=name)
  File "C:\anaconda3\lib\site-packages\xarray\core\variable.py", line 168, in as_variable
    obj = obj.to_index_variable()
  File "C:\anaconda3\lib\site-packages\xarray\core\variable.py", line 624, in to_index_variable
    return IndexVariable(
  File "C:\anaconda3\lib\site-packages\xarray\core\variable.py", line 2844, in __init__
    self._data = PandasIndexingAdapter(self._data)
  File "C:\anaconda3\lib\site-packages\xarray\core\indexing.py", line 1420, in __init__
    self.array = safe_cast_to_index(array)
  File "C:\anaconda3\lib\site-packages\xarray\core\indexes.py", line 177, in safe_cast_to_index
    index = pd.Index(np.asarray(array), **kwargs)
  File "C:\anaconda3\lib\site-packages\xarray\core\indexing.py", line 524, in __array__
    return np.asarray(array[self.key], dtype=None)
  File "C:\anaconda3\lib\site-packages\xarray\backends\zarr.py", line 68, in __getitem__
    return array[key.tuple]
  File "C:\anaconda3\lib\site-packages\zarr\core.py", line 821, in __getitem__
    result = self.get_basic_selection(pure_selection, fields=fields)
  File "C:\anaconda3\lib\site-packages\zarr\core.py", line 947, in get_basic_selection
    return self._get_basic_selection_nd(selection=selection, out=out,
  File "C:\anaconda3\lib\site-packages\zarr\core.py", line 990, in _get_basic_selection_nd
    return self._get_selection(indexer=indexer, out=out, fields=fields)
  File "C:\anaconda3\lib\site-packages\zarr\core.py", line 1285, in _get_selection
    self._chunk_getitem(chunk_coords, chunk_selection, out, out_selection,
  File "C:\anaconda3\lib\site-packages\zarr\core.py", line 1994, in _chunk_getitem
    cdata = self.chunk_store[ckey]
  File "C:\anaconda3\lib\site-packages\zarr\storage.py", line 1085, in __getitem__
    return self._fromfile(filepath)
  File "C:\anaconda3\lib\site-packages\zarr\storage.py", line 1059, in _fromfile
    with open(fn, 'rb') as f:
PermissionError: [Errno 13] Permission denied: 'xxx.zarr\\dim_yyy/0'
```
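Not part of the original report, but one possible mitigation sketch while the root cause is unclear: on Windows, a PermissionError on a chunk file can be transient (for example when another process still holds the file open), so the read can be wrapped in a simple retry. The helper name and retry parameters below are illustrative.

```Python
import time

import xarray as xr


def open_zarr_with_retry(path_store, retries=5, delay=0.5, **kwargs):
    """Retry xr.open_zarr when a chunk file is transiently inaccessible (illustrative only)."""
    for attempt in range(retries):
        try:
            return xr.open_zarr(path_store, **kwargs)
        except PermissionError:
            if attempt == retries - 1:
                raise
            time.sleep(delay)
```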

Anything else we need to know?

No response

Environment

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.10.9 | packaged by Anaconda, Inc. | (main, Mar 1 2023, 18:18:15) [MSC v.1916 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 85 Stepping 7, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: ('English_United States', '1252')
libhdf5: 1.10.6
libnetcdf: None

xarray: 2022.11.0
pandas: 1.5.3
numpy: 1.23.5
scipy: 1.10.0
netCDF4: None
pydap: None
h5netcdf: None
h5py: 3.7.0
Nio: None
zarr: 2.14.2
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.5
dask: 2022.7.0
distributed: None
matplotlib: 3.7.0
cartopy: None
seaborn: 0.12.2
numbagg: None
fsspec: 2022.11.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 65.6.3
pip: 23.0.1
conda: 23.1.0
pytest: 7.1.2
IPython: 8.10.0
sphinx: 5.0.2
```
xarray#7527 · DataArray.idxmax converts coordinates into float64 by default
opened by templiert · open · 0 comments · created 2023-02-14T17:45:07Z · updated 2023-02-14T17:51:33Z · CONTRIBUTOR

What happened?

This is the same example as in the DataArray.idxmax documentation, but looking at the "y" dimension instead. The starting "y" coordinates are of integer type: [-1, 0, 1].

The return values of argmax are of type int64: good. The return values of idxmax are of type float64: bad.

What did you expect to happen?

If no fillna operation needs to occur, then the return values of idxmax should have the same dtype as the input coordinate.

Otherwise, the return dtype may change, depending on the dtype of the fill value.

Minimal Complete Verifiable Example

```Python
import numpy as np
import xarray as xr

array = xr.DataArray(
    [
        [2.0, 1.0, 2.0, 0.0, -2.0],
        [-4.0, np.NaN, 2.0, np.NaN, -2.0],
        [np.NaN, np.NaN, 1.0, np.NaN, np.NaN],
    ],
    dims=["y", "x"],
    coords={"y": [-1, 0, 1], "x": np.arange(5.0) ** 2},
)
print(array.argmax(dim="y").dtype)
print(array.idxmax(dim="y").dtype)
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

```Python
In [41]: print(array.argmax(dim="y").dtype)
int64

In [42]: print(array.idxmax(dim="y").dtype)
float64
```

Anything else we need to know?

Suggestions:

  • change these two lines:

    ```Python
    if skipna or (skipna is None and array.dtype.kind in na_dtypes):
        # Put the NaN values back in after removing them
    ```

    into:

    ```Python
    if (skipna or (skipna is None and array.dtype.kind in na_dtypes)) and allna.any():
        # Put the NaN values back in after removing them, if any
    ```

  • or maybe instead it is a bug in DataArray.where: `res = res.where(~allna, fill_value)` should not change the array dtype if `not allna.any()`? Actually, it is a known limitation of where: #3570
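Beyond the suggested change, a possible workaround sketch (not from the original report, and assuming there is no all-NaN slice along the reduced dimension): selecting the coordinate through argmax and isel avoids the fillna path entirely, so the coordinate keeps its integer dtype.

```Python
import numpy as np
import xarray as xr

array = xr.DataArray(
    [[2.0, 1.0], [-4.0, np.nan], [np.nan, 3.0]],
    dims=["y", "x"],
    coords={"y": [-1, 0, 1], "x": [0.0, 1.0]},
)

print(array.idxmax(dim="y").dtype)  # float64: the "y" coordinate gets cast
# Hypothetical workaround: select the coordinate through argmax/isel instead.
y_at_max = array["y"].isel(y=array.argmax(dim="y"))
print(y_at_max.dtype)  # the integer dtype of the "y" coordinate is preserved
```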

Environment

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.9.13 (main, Aug 25 2022, 23:51:50) [MSC v.1916 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 158 Stepping 13, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: ('English_United States', '1252')
libhdf5: 1.10.6
libnetcdf: None

xarray: 0.20.1
pandas: 1.4.4
numpy: 1.24.2
scipy: 1.9.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: 3.7.0
Nio: None
zarr: 2.13.3
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.5
dask: 2022.7.0
distributed: 2022.7.0
matplotlib: 3.5.2
cartopy: None
seaborn: 0.11.2
numbagg: None
fsspec: 2022.7.1
cupy: None
pint: None
sparse: None
setuptools: 63.4.1
pip: 23.0
conda: 22.9.0
pytest: 7.1.2
IPython: 7.31.1
sphinx: 5.0.2
```
xarray#7294 · DataArray.transpose with transpose_coords=True does not change coords order
opened by templiert · open · 6 comments · created 2022-11-16T19:02:27Z · updated 2022-11-24T20:40:32Z · CONTRIBUTOR

What happened?

I used DataArray.transpose with transpose_coords=True to change the coords order from starting_dims = "dim_0", "dim_1", "dim_2" to reordered_dims = "dim_2", "dim_1", "dim_0".

The order of dims was correctly transposed but the order of coords remained unchanged.

What did you expect to happen?

I expected the transposed coords to be in the new order:

reordered_dims = "dim_2", "dim_1", "dim_0"

Minimal Complete Verifiable Example

```Python
import numpy as np
import pandas as pd
import xarray as xr

np.random.seed(0)
temperature = np.random.randn(4, 4, 3)
dim_0_values = [1, 2, 3, 4]
dim_1_values = [5, 6, 7, 8]
dim_2_values = pd.date_range("2014-09-06", periods=3)
starting_dims = "dim_0", "dim_1", "dim_2"

da = xr.DataArray(
    data=temperature,
    dims=starting_dims,
    coords=dict(
        dim_0=dim_0_values,
        dim_1=dim_1_values,
        dim_2=dim_2_values,
    ),
    attrs=dict(
        description="Ambient temperature.",
        units="degC",
    ),
)

print(f"{da.dims=}")
print(f"{da.coords.keys()=}")

reordered_dims = "dim_2", "dim_1", "dim_0"
print(f"{da.transpose(*reordered_dims).dims=}")
print(f"{da.transpose(*reordered_dims).coords.keys()=}")
print(f"{da.transpose(*reordered_dims, transpose_coords=True).coords.keys()=}")
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

```Python
da.dims=('dim_0', 'dim_1', 'dim_2')
da.coords.keys()=KeysView(Coordinates:
  * dim_0    (dim_0) int32 1 2 3 4
  * dim_1    (dim_1) int32 5 6 7 8
  * dim_2    (dim_2) datetime64[ns] 2014-09-06 2014-09-07 2014-09-08)
da.transpose(*reordered_dims).dims=('dim_2', 'dim_1', 'dim_0')
da.transpose(*reordered_dims).coords.keys()=KeysView(Coordinates:
  * dim_0    (dim_0) int32 1 2 3 4
  * dim_1    (dim_1) int32 5 6 7 8
  * dim_2    (dim_2) datetime64[ns] 2014-09-06 2014-09-07 2014-09-08)
da.transpose(*reordered_dims, transpose_coords=True).coords.keys()=KeysView(Coordinates:
  * dim_0    (dim_0) int32 1 2 3 4
  * dim_1    (dim_1) int32 5 6 7 8
  * dim_2    (dim_2) datetime64[ns] 2014-09-06 2014-09-07 2014-09-08)
```
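Not part of the original report, but a possible workaround sketch: rebuilding the DataArray with the coordinates inserted in the transposed dimension order should make coords.keys() follow the new order, since coords are listed in insertion order.

```Python
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.zeros((2, 3, 4)),
    dims=("dim_0", "dim_1", "dim_2"),
    coords={"dim_0": [0, 1], "dim_1": [0, 1, 2], "dim_2": [0, 1, 2, 3]},
)

da_t = da.transpose("dim_2", "dim_1", "dim_0")
# Re-insert the coordinates following the transposed dim order (illustrative only).
da_reordered = xr.DataArray(
    da_t.data,
    dims=da_t.dims,
    coords={dim: da_t.coords[dim] for dim in da_t.dims},
    attrs=da_t.attrs,
)
print(da_reordered.coords.keys())  # expected: dim_2, dim_1, dim_0
```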

Anything else we need to know?

No response

Environment

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.9.12 (main, Apr 4 2022, 05:22:27) [MSC v.1916 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 85 Stepping 7, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: ('English_United States', '1252')
libhdf5: 1.10.6
libnetcdf: None

xarray: 2022.6.0
pandas: 1.4.2
numpy: 1.21.5
scipy: 1.9.3
netCDF4: None
pydap: None
h5netcdf: None
h5py: 3.6.0
Nio: None
zarr: 2.13.2
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.4
dask: 2022.02.1
distributed: 2022.2.1
matplotlib: 3.5.1
cartopy: None
seaborn: 0.11.2
numbagg: None
fsspec: 2022.02.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 61.2.0
pip: 22.3.1
conda: 4.12.0
pytest: 7.1.1
IPython: 8.2.0
sphinx: 4.4.0
```
xarray#7289 · typo in xarray/doc/user-guide/reshaping.rst
opened by templiert · closed as completed · 2 comments · created 2022-11-15T02:47:56Z · updated 2022-11-15T15:20:44Z · closed 2022-11-15T15:20:44Z · CONTRIBUTOR

What is your issue?

Line 23: "An ellipsis (...) can be use" -> "An ellipsis (...) can be used"

