
issues: 1697705761


id node_id number title user state locked assignee milestone comments created_at updated_at closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1697705761 I_kwDOAMm_X85lMO8h 7818 Warning on distributed lock on dask cluster 43635101 closed 0     2 2023-05-05T14:19:22Z 2024-02-26T06:08:14Z 2024-02-26T06:08:14Z NONE      

What is your issue?

I'm using xarray to store datasets that are computed on a distributed dask cluster. I use `to_netcdf` to write the data in chunks.

I get some warnings from distributed about a coroutine that was never awaited. Is this something to worry about, for example locks not being released?

Log:

INFO:distributed.scheduler:State start
INFO:distributed.scheduler: Scheduler at: tcp://127.0.0.1:8786
INFO:distributed.scheduler: dashboard at: http://127.0.0.1:8787/status
INFO:distributed.worker: Start worker at: tcp://127.0.0.1:44425
INFO:distributed.worker: Listening to: tcp://127.0.0.1:44425
INFO:distributed.worker: Worker name: 0
INFO:distributed.worker: dashboard at: 127.0.0.1:42587
INFO:distributed.worker:Waiting to connect to: tcp://127.0.0.1:8786
INFO:distributed.worker:-------------------------------------------------
INFO:distributed.worker: Threads: 8
INFO:distributed.worker: Memory: 15.38 GiB
INFO:distributed.worker: Local Directory: /tmp/dask-worker-space/worker-l2i3et3y
INFO:distributed.worker:-------------------------------------------------
INFO:distributed.scheduler:Register worker <WorkerState 'tcp://127.0.0.1:44425', name: 0, status: init, memory: 0, processing: 0>
INFO:distributed.scheduler:Starting worker compute stream, tcp://127.0.0.1:44425
INFO:distributed.core:Starting established connection to tcp://127.0.0.1:57776
INFO:distributed.worker: Registered to: tcp://127.0.0.1:8786
INFO:distributed.worker:-------------------------------------------------
INFO:distributed.core:Starting established connection to tcp://127.0.0.1:8786
INFO:distributed.scheduler:Receive client connection: Client-15a7ea83-eb4e-11ed-ab82-5b728e3afd73
INFO:distributed.core:Starting established connection to tcp://127.0.0.1:57790
/home/jules/Code/xarraystoreexample/.venv/lib/python3.9/site-packages/distributed/lock.py:174: RuntimeWarning: coroutine 'PooledRPCCall.__getattr__.<locals>.send_recv_from_rpc' was never awaited
  self.acquire()
/home/jules/Code/xarraystoreexample/.venv/lib/python3.9/site-packages/distributed/lock.py:178: RuntimeWarning: coroutine 'PooledRPCCall.__getattr__.<locals>.send_recv_from_rpc' was never awaited
  self.release()
INFO:distributed.scheduler:Remove client Client-15a7ea83-eb4e-11ed-ab82-5b728e3afd73
INFO:distributed.core:Received 'close-stream' from tcp://127.0.0.1:57790; closing.
INFO:distributed.scheduler:Remove client Client-15a7ea83-eb4e-11ed-ab82-5b728e3afd73
INFO:distributed.scheduler:Close client connection: Client-15a7ea83-eb4e-11ed-ab82-5b728e3afd73
INFO:distributed.worker:Stopping worker at tcp://127.0.0.1:44425. Reason: worker-close
INFO:distributed.core:Received 'close-stream' from tcp://127.0.0.1:57776; closing.
INFO:distributed.scheduler:Remove worker <WorkerState 'tcp://127.0.0.1:44425', name: 0, status: closing, memory: 0, processing: 0>
INFO:distributed.core:Removing comms to tcp://127.0.0.1:44425
INFO:distributed.scheduler:Lost all workers
INFO:distributed.core:Connection to tcp://127.0.0.1:8786 has been closed.
INFO:distributed.scheduler:Scheduler closing...
INFO:distributed.scheduler:Scheduler closing all comms
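
For context on the two RuntimeWarning lines in the log: Python emits "coroutine ... was never awaited" when a coroutine function is called from synchronous code and the resulting coroutine object is discarded. The sketch below reproduces that mechanism with the standard library only. `FakeLock` and `run_sync_block` are hypothetical names made up for illustration, and this is not distributed's lock implementation. It only assumes, based on the `self.acquire()` and `self.release()` lines shown under the warnings, that something similar may be happening there.

```python
import asyncio
import gc
import warnings


class FakeLock:
    """Hypothetical stand-in for a lock whose acquire/release are coroutines."""

    def __init__(self):
        self.held = False

    async def acquire(self):
        self.held = True

    async def release(self):
        self.held = False

    def __enter__(self):
        # Calling a coroutine function without awaiting it only creates a
        # coroutine object; the body of acquire() never runs.
        self.acquire()
        return self

    def __exit__(self, *exc):
        # Same problem on the way out: release() is never executed.
        self.release()


def run_sync_block():
    """Use the lock synchronously and collect the warnings this triggers."""
    lock = FakeLock()
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        with lock:
            held_inside = lock.held
        gc.collect()  # make sure discarded coroutine objects are finalized
    return held_inside, [str(w.message) for w in caught]


async def run_async_block():
    """For contrast: awaiting the coroutines actually runs them."""
    lock = FakeLock()
    await lock.acquire()
    held_inside = lock.held
    await lock.release()
    return held_inside, lock.held


if __name__ == "__main__":
    print(run_sync_block())
    print(asyncio.run(run_async_block()))
```

In this sketch the synchronous `with` block never holds the lock at all, so there is nothing left unreleased, but the protected section also runs without any mutual exclusion. Whether the same holds for the real lock in this issue is an open question that the log alone does not settle.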

Here is a minimal example :

```python
import asyncio
import logging
import os

import dask.array as da
import dask.distributed as dd
import numpy as np
import pandas as pd
import xarray as xr


async def main():
    # In-process cluster (threads only) started in asynchronous mode.
    cluster = dd.LocalCluster(scheduler_port=8786, processes=False, asynchronous=True)
    await cluster

    time = pd.date_range("1990-01-01", "1995-01-01", freq="h")

    ds = xr.Dataset(
        coords={"time": time, "x": np.linspace(0, 100), "y": np.linspace(0, 100)}
    )

    ds["data"] = (("time", "y", "x"), da.ones((len(ds.time), 50, 50)))

    async with dd.Client(address="tcp://127.0.0.1:8786", asynchronous=True) as client:
        # compute=False returns a delayed write that is then run on the cluster.
        await client.compute(
            ds.data.to_netcdf(
                "data.nc",
                encoding={"data": {"zlib": True, "complevel": 6}},
                engine="h5netcdf",
                compute=False,
            )
        )
    await cluster.close()
    return True


if __name__ == "__main__":
    logging.basicConfig()
    os.environ["HDF5_USE_FILE_LOCKING"] = "False"
    asyncio.run(main())
```

```python
import xarray as xr

xr.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.9.16 (main, Dec 7 2022, 01:12:08) [GCC 11.3.0]
python-bits: 64
OS: Linux
OS-release: 5.19.0-41-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.9.1

xarray: 2022.12.0
pandas: 1.5.3
numpy: 1.24.3
scipy: 1.10.1
netCDF4: 1.6.3
pydap: None
h5netcdf: 1.1.0
h5py: 3.8.0
Nio: None
zarr: None
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2023.4.0
distributed: 2023.4.0
matplotlib: 3.7.1
cartopy: None
seaborn: 0.11.2
numbagg: None
fsspec: 2023.4.0
cupy: None
pint: 0.20.1
sparse: None
flox: None
numpy_groupies: None
setuptools: 67.7.2
pip: 23.1.2
conda: None
pytest: 6.2.5
mypy: 0.931
IPython: 8.12.1
sphinx: None
```

Documentation on this feature:

https://docs.xarray.dev/en/stable/user-guide/dask.html#reading-and-writing-data

Feel free to close this issue if it is unrelated to xarray.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7818/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  not_planned 13221727 issue
