issue_comments
7 rows where issue = 593029940 sorted by updated_at descending
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
843725376 | https://github.com/pydata/xarray/issues/3929#issuecomment-843725376 | https://api.github.com/repos/pydata/xarray/issues/3929 | MDEyOklzc3VlQ29tbWVudDg0MzcyNTM3Ng== | N4321D 35295509 | 2021-05-19T03:52:00Z | 2021-05-19T03:52:00Z | NONE | I created this function, which works pretty well; I don't know if it is of any help:
```
import xarray as xr
import dask.dataframe as dd  # only needed to build the input Dask DataFrame

def dask_2_xarray(ddf, indexname='index'):
    ds = xr.Dataset()
    # Use the DataFrame index as the shared dimension coordinate.
    ds[indexname] = ddf.index
    for key in ddf.columns:
        # compute_chunk_sizes() resolves the unknown (NaN) chunk lengths.
        ds[key] = (indexname, ddf[key].to_dask_array().compute_chunk_sizes())
    return ds

# Usage: ds = dask_2_xarray(ddf)
```
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature request xarray.Dataset.from_dask_dataframe 593029940 | |
739991914 | https://github.com/pydata/xarray/issues/3929#issuecomment-739991914 | https://api.github.com/repos/pydata/xarray/issues/3929 | MDEyOklzc3VlQ29tbWVudDczOTk5MTkxNA== | AyrtonB 29051639 | 2020-12-07T15:32:01Z | 2020-12-07T15:32:01Z | CONTRIBUTOR | I've added a PR for the new feature but it's currently failing tests as the test-suite doesn't seem to have Dask installed. Any advice on how to get this PR prepared for merging would be appreciated. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature request xarray.Dataset.from_dask_dataframe 593029940 | |
739904265 | https://github.com/pydata/xarray/issues/3929#issuecomment-739904265 | https://api.github.com/repos/pydata/xarray/issues/3929 | MDEyOklzc3VlQ29tbWVudDczOTkwNDI2NQ== | AyrtonB 29051639 | 2020-12-07T13:01:57Z | 2020-12-07T13:02:20Z | CONTRIBUTOR | One of the things I was hoping to include in my approach is the preservation of the column dimension names, however if I was to use Thanks for the advice @shoyer, I reached a similar opinion and so have been working on the dim compute route. The issue is that a Dask array's shape uses np.nan for uncomputed dimensions, rather than leaving a delayed object like the Dask dataframe's shape. I looked into returning the dask dataframe rather than dask array but this didn't feel like it fit with the rest of the code and produced another issue as dask dataframes don't have a dtype attribute. I'll continue to look into alternatives. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature request xarray.Dataset.from_dask_dataframe 593029940 | |
739576721 | https://github.com/pydata/xarray/issues/3929#issuecomment-739576721 | https://api.github.com/repos/pydata/xarray/issues/3929 | MDEyOklzc3VlQ29tbWVudDczOTU3NjcyMQ== | shoyer 1217238 | 2020-12-06T22:36:32Z | 2020-12-06T22:36:32Z | MEMBER | It sounds like making this work well would require xarray to support "unknown" dimension sizes throughout the codebase. This would be a nice feature to have, but indeed would likely require pervasive changes. The other option would be to explicitly compute the shape when converting from dask dataframes, by calling |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature request xarray.Dataset.from_dask_dataframe 593029940 | |
739395190 | https://github.com/pydata/xarray/issues/3929#issuecomment-739395190 | https://api.github.com/repos/pydata/xarray/issues/3929 | MDEyOklzc3VlQ29tbWVudDczOTM5NTE5MA== | keewis 14808389 | 2020-12-05T20:13:58Z | 2020-12-05T20:44:42Z | MEMBER | Thanks for investigating and working on this, @AyrtonB. I indeed think this is the correct place to discuss this: your use case can probably be implemented by converting to a |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature request xarray.Dataset.from_dask_dataframe 593029940 | |
739334281 | https://github.com/pydata/xarray/issues/3929#issuecomment-739334281 | https://api.github.com/repos/pydata/xarray/issues/3929 | MDEyOklzc3VlQ29tbWVudDczOTMzNDI4MQ== | AyrtonB 29051639 | 2020-12-05T18:52:49Z | 2020-12-05T18:52:49Z | CONTRIBUTOR | For context this is the function I'm using to convert the Dask DataFrame to a DataArray.
```python
def from_dask_dataframe(df, index_name=None, columns_name=None):
    def extract_dim_name(df, dim='index'):
        if getattr(df, dim).name is None:
            getattr(df, dim).name = dim

df.index.name = 'datetime'
df.columns.name = 'fueltypes'
da = from_dask_dataframe(df)
```
I'm also conscious that my question is different to @raybellwaves' as they were asking about Dataset creation and I'm interested in creating a DataArray which requires different functionality. I'm assuming this is the correct place to post though as @keewis closed my issue and linked to this one. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature request xarray.Dataset.from_dask_dataframe 593029940 | |
739330558 | https://github.com/pydata/xarray/issues/3929#issuecomment-739330558 | https://api.github.com/repos/pydata/xarray/issues/3929 | MDEyOklzc3VlQ29tbWVudDczOTMzMDU1OA== | AyrtonB 29051639 | 2020-12-05T18:20:33Z | 2020-12-05T18:20:33Z | CONTRIBUTOR | I've been trying to implement this and have managed to create a The modifications I've made so far are adding the following above line 400 in dataarray.py:
```python
shape = tuple([
    dim_size.compute() if hasattr(dim_size, 'compute') else dim_size
    for dim_size in data.shape
])
coords = tuple([
    coord.compute() if hasattr(coord, 'compute') else coord
    for coord in coords
])
```
and on line 403 by replacing The issue I have is that when I then want to use the DataArray and do something like
```
ValueError                                Traceback (most recent call last)
<ipython-input-23-5d739a721388> in <module>
----> 1 da.sel(datetime='2020')

~\anaconda3\envs\DataHub\lib\site-packages\xarray\core\dataarray.py in sel(self, indexers, method, tolerance, drop, **indexers_kwargs)
   1219
   1220         """
-> 1221         ds = self._to_temp_dataset().sel(
   1222             indexers=indexers,
   1223             drop=drop,

~\anaconda3\envs\DataHub\lib\site-packages\xarray\core\dataarray.py in _to_temp_dataset(self)
    499
    500     def _to_temp_dataset(self) -> Dataset:
--> 501         return self._to_dataset_whole(name=_THIS_ARRAY, shallow_copy=False)
    502
    503     def _from_temp_dataset(

~\anaconda3\envs\DataHub\lib\site-packages\xarray\core\dataarray.py in _to_dataset_whole(self, name, shallow_copy)
    551
    552         coord_names = set(self._coords)
--> 553         dataset = Dataset._construct_direct(variables, coord_names, indexes=indexes)
    554         return dataset
    555

~\anaconda3\envs\DataHub\lib\site-packages\xarray\core\dataset.py in _construct_direct(cls, variables, coord_names, dims, attrs, indexes, encoding, file_obj)
    959         """
    960         if dims is None:
--> 961             dims = calculate_dimensions(variables)
    962         obj = object.__new__(cls)
    963         obj._variables = variables

~\anaconda3\envs\DataHub\lib\site-packages\xarray\core\dataset.py in calculate_dimensions(variables)
    207                     "conflicting sizes for dimension %r: "
    208                     "length %s on %r and length %s on %r"
--> 209                     % (dim, size, k, dims[dim], last_used[dim])
    210                 )
    211     return dims

ValueError: conflicting sizes for dimension 'datetime': length nan on <this-array> and length 90386 on 'datetime'
```
This occurs due to the construction of I'm assuming there's an alternative way to construct |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature request xarray.Dataset.from_dask_dataframe 593029940 |
CREATE TABLE [issue_comments] ( [html_url] TEXT, [issue_url] TEXT, [id] INTEGER PRIMARY KEY, [node_id] TEXT, [user] INTEGER REFERENCES [users]([id]), [created_at] TEXT, [updated_at] TEXT, [author_association] TEXT, [body] TEXT, [reactions] TEXT, [performed_via_github_app] TEXT, [issue] INTEGER REFERENCES [issues]([id]) ); CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]); CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
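
As a rough illustration of how the schema above supports the view at the top of this page (rows for one issue, sorted by `updated_at` descending), here is a minimal sketch using Python's built-in `sqlite3` module. It assumes an in-memory database and loads only three of the comment rows shown above, with a subset of columns; the referenced `users` and `issues` tables are not created, which SQLite tolerates because foreign-key enforcement is off by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE [issue_comments] (
        [html_url] TEXT, [issue_url] TEXT, [id] INTEGER PRIMARY KEY,
        [node_id] TEXT, [user] INTEGER REFERENCES [users]([id]),
        [created_at] TEXT, [updated_at] TEXT, [author_association] TEXT,
        [body] TEXT, [reactions] TEXT, [performed_via_github_app] TEXT,
        [issue] INTEGER REFERENCES [issues]([id])
    );
    CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
    CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
    """
)

# A subset of the rows listed above: (id, user, created_at, updated_at, issue).
conn.executemany(
    "INSERT INTO issue_comments (id, user, created_at, updated_at, issue) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (739904265, 29051639, "2020-12-07T13:01:57Z", "2020-12-07T13:02:20Z", 593029940),
        (843725376, 35295509, "2021-05-19T03:52:00Z", "2021-05-19T03:52:00Z", 593029940),
        (739991914, 29051639, "2020-12-07T15:32:01Z", "2020-12-07T15:32:01Z", 593029940),
    ],
)

# ISO 8601 timestamps stored as TEXT sort correctly as plain strings.
rows = conn.execute(
    "SELECT id, updated_at FROM issue_comments "
    "WHERE issue = ? ORDER BY updated_at DESC",
    (593029940,),
).fetchall()
```

The `idx_issue_comments_issue` index is what keeps the `WHERE issue = ?` filter cheap on a larger table.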