# Tasks hang when operating on writing Zarr-backed Dataset

pydata/xarray issue #7952 · opened 2023-06-30 by ljstrnadiii · closed 2023-11-03 · 5 comments

### What happened?

When writing a dataset to zarr, we sometimes see the last few tasks hang indefinitely, with no CPU activity or data-transfer activity visible in the dask UI. Inspecting the dask UI always shows we are waiting on a task with a name like `('store-map-34659153bd4dc964b4e5f380dacebdbe', 0, 1)`.

### What did you expect to happen?

All tasks to finish, each taking approximately the same amount of time to complete. I would also expect worker saturation to have some effect and kick in a queue with xarray and `map_blocks` to zarr, but I don't see this behavior. I do see it with the dask counterpart (example in the extra section below).

### Minimal Complete Verifiable Example

```python
import os

import dask.array as da
import xarray as xr
from distributed import Client


def main():
    client = Client("...")


if __name__ == "__main__":
    main()
```

### MVCE confirmation
### Relevant log output

No response

### Anything else we need to know?

My dask cluster is deployed with helm, with 32 workers, each with 1 CPU and 4 GB of memory. We often mitigate this issue (poorly) by using environment variables to trigger worker restarts, so that long-hanging tasks get rescheduled and the program can complete, at the cost of the expected restart time, which can slow down performance.
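The restart-based mitigation described above can be expressed through dask's configuration environment variables. A hedged sketch, with illustrative durations (the actual values the reporter used are not in the issue), plus the scheduler-side queuing knob related to the worker-saturation expectation mentioned earlier:

```shell
# Worker lifetime settings (distributed.worker.lifetime.*): restart each
# worker after roughly an hour so any hung task gets rescheduled.
# The duration values here are illustrative, not the reporter's config.
export DASK_DISTRIBUTED__WORKER__LIFETIME__DURATION="1 hour"
export DASK_DISTRIBUTED__WORKER__LIFETIME__STAGGER="5 minutes"
export DASK_DISTRIBUTED__WORKER__LIFETIME__RESTART="true"

# Scheduler-side task queuing (distributed.scheduler.worker-saturation);
# assumes a distributed version that supports this key.
export DASK_DISTRIBUTED__SCHEDULER__WORKER_SATURATION="1.1"
```

Dask maps each `__`-separated segment back onto the dotted config key, so these can equally go in a `distributed.yaml` or the helm chart's worker environment.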
In the mre above, we write

Possibly relevant links:

- https://github.com/dask/distributed/issues/391
- https://github.com/fsspec/gcsfs/issues/379
- https://github.com/pydata/xarray/issues/4406

Note: I have tried to minimize the example even more with
### Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.10.11 | packaged by conda-forge | (main, May 10 2023, 18:58:44) [GCC 11.3.0]
python-bits: 64
OS: Linux
OS-release: 5.15.0-1036-gcp
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: None
xarray: 2023.5.0
pandas: 1.5.3
numpy: 1.23.5
scipy: 1.10.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: 3.8.0
Nio: None
zarr: 2.15.0
cftime: None
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: None
dask: 2023.6.0
distributed: 2023.6.0
matplotlib: 3.7.1
cartopy: None
seaborn: 0.12.2
numbagg: None
fsspec: 2023.6.0 # and gcsfs.__version__ == '2023.6.0'
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 67.7.2
pip: 22.3.1
conda: None
pytest: 7.3.2
mypy: 1.3.0
IPython: 8.14.0
sphinx: None
Closed as completed.