issues: 757660307

id: 757660307
node_id: MDU6SXNzdWU3NTc2NjAzMDc=
number: 4650
title: Ability to Pass Dask Arrays as `data` in DataArray Creation
user: 29051639
state: closed
locked: 0
comments: 4
created_at: 2020-12-05T11:33:03Z
updated_at: 2020-12-05T18:23:11Z
closed_at: 2020-12-05T13:13:10Z
author_association: CONTRIBUTOR

Is your feature request related to a problem? Please describe.

I'm trying to convert a Dask dataframe into a Dask-backed xarray object without loading the data fully into memory.

I was hoping to pass `df.values`, which is a Dask array, as the `data` parameter of `xr.DataArray`:

```python
idx_dim = 'datetime'
col_dim = 'fueltypes'

xr.DataArray(df.values, [df.index, df.columns], [idx_dim, col_dim])
```

However, this raises the following error: `ValueError: conflicting sizes for dimension 'datetime': length nan on the data but length 90386 on coordinate 'datetime'`


Describe the solution you'd like

The ability to create DataArrays from Dask dataframes, mirroring the existing reverse method for converting Datasets to Dask dataframes, `Dataset.to_dask_dataframe`.


Describe alternatives you've considered

I tried `xr.Dataset.from_dataframe(df)`, but it requires the dataframe to be fully loaded into memory.

Additionally, unlike the standard pandas dataframe, the Dask dataframe does not have a `.to_xarray` method.


Additional context

This is made necessary in part by the Zarr developers' decision not to support saving Dask dataframes to Zarr; they suggest converting to an xarray object and saving that to Zarr instead.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4650/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
repo: 13221727
type: issue
