pull_requests: 1496182200

id: 1496182200
node_id: PR_kwDOAMm_X85ZLe24
number: 8124
state: open
locked: 0
title: More flexible index variables
user: 4160723
created_at: 2023-08-30T21:45:12Z
updated_at: 2023-08-31T16:02:20Z
closed_at:
merged_at:
merge_commit_sha: 8b84dc392e5443f9ada245cb6a6f31d8f19327df
assignee:
milestone:
draft: 1
head: 09f3ed0acd119fcefa07652bbc40dff96db2f66c
base: 0f9f790c7e887bbfd13f4026fd1d37e4cd599ff1
author_association: MEMBER
auto_merge:
repo: 13221727
url: https://github.com/pydata/xarray/pull/8124
merged_by:
body:

<!-- Feel free to remove check-list items that aren't relevant to your change -->

- [ ] Closes #xxxx
- [ ] Tests added
- [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
- [ ] New functions/methods are listed in `api.rst`

The goal of this PR is to provide a more general solution to indexed coordinate variables, i.e., support arbitrary dimensions and/or duck arrays for those variables while at the same time preventing them from being updated in a way that would invalidate their index.

This would solve problems like the one mentioned here: https://github.com/pydata/xarray/issues/1650#issuecomment-1697237429

@shoyer I've tried to implement what you have suggested in https://github.com/pydata/xarray/pull/4979#discussion_r589798510. It would be nice indeed if eventually we could get rid of `IndexVariable`. It won't be easy to deprecate it until we finish the index refactor (i.e., all methods listed in #6293), though. Also, I didn't find an easy way to refactor that class, as it has been designed too closely around a 1-d variable backed by a `pandas.Index`.

So the approach implemented in this PR is to keep using `IndexVariable` for `PandasIndex` until we can deprecate / remove it later, and for the other cases use `Variable` with data wrapped in a custom `IndexedCoordinateArray` object.

The latter solution (wrapper) doesn't always work nicely, though. For example, several methods of `Variable` expect that `self._data` directly returns a duck array (e.g., a dask array or a chunked duck array). A wrapped duck array will result in unexpected behavior there. We could probably add some checks / indirection or extend the wrapper API... But I wonder if there wouldn't be a more elegant approach?

More generally, which operations should we allow / forbid / skip for an indexed coordinate variable?

- Set array items in-place? Do not allow.
- Replace data? Do not allow.
- (Re)Chunk?
- Load lazy data?
- ... ?

(Note: we could add `Index.chunk()` and `Index.load()` methods in order to allow an Xarray index to implement custom logic for the two latter cases, e.g., convert a DaskIndex to a PandasIndex during load, see #8128.)

cc @andersy005 (some changes made here may conflict with what you are refactoring in #8075).
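
To make the wrapper idea above more concrete, here is a minimal sketch of a read-only array wrapper. It is not the `IndexedCoordinateArray` implemented in this PR: the class name `ReadOnlyIndexedArray` and its methods are hypothetical, and it assumes NumPy-backed data only (no dask or other duck arrays). It only illustrates the "do not allow in-place updates" rule and why code that expects a bare array may trip over a wrapper.

```python
import numpy as np


class ReadOnlyIndexedArray:
    """Hypothetical read-only wrapper around index-backed coordinate data."""

    def __init__(self, array):
        # Copy on entry so later changes to the caller's array cannot leak in.
        self._array = np.array(array)

    @property
    def shape(self):
        return self._array.shape

    @property
    def dtype(self):
        return self._array.dtype

    @property
    def ndim(self):
        return self._array.ndim

    def __array__(self, dtype=None, copy=None):
        # Always hand out a copy so callers cannot mutate the wrapped values.
        return np.array(self._array, dtype=dtype, copy=True)

    def __getitem__(self, key):
        # Reading is allowed; the result is detached from the wrapped data.
        return np.array(self._array[key])

    def __setitem__(self, key, value):
        # In-place updates could invalidate the associated index.
        raise TypeError("indexed coordinate data cannot be modified in-place")


wrapped = ReadOnlyIndexedArray([[1.0, 2.0], [3.0, 4.0]])
values = np.asarray(wrapped)  # converts through __array__
is_plain_array = isinstance(wrapped, np.ndarray)  # False: it is a wrapper
```

The last line hints at the difficulty raised in the PR body: code that inspects the wrapped object directly, rather than converting it first, does not see an array type, so it needs either extra indirection or a richer wrapper API.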