issues: 2060883540
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2060883540 | PR_kwDOAMm_X85i-ZWI | 8577 | Interpolate na: Fix #7665 and introduce arguments similar to pandas | 42680748 | open | 0 |  |  | 0 | 2023-12-30T23:28:47Z | 2023-12-30T23:28:47Z |  | CONTRIBUTOR |  | 0 | pydata/xarray/pulls/8577 |
This is an attempt to close #7665 and to combine the existing options from xarray (max_gap) and pandas (limit_direction, limit_area) for interpolating NaN values. Please see also my comments in #7665 for the motivation. This PR already includes a full implementation, documentation, and corresponding tests, but before any final polishing I want to hear your thoughts. Specifically, I think the API and the default options need to be discussed. (See the proposed documentation of DataArray.interpolate_na() / Dataset.interpolate_na() for the current state.)

Implementation: Basically, I use ffill and bfill to calculate the coordinate of the left/right edge of every gap in the data. All masks (limit, limit_area, max_gap) are then created from these edge coordinates. In the long term, it might be interesting to provide those arguments to other na-filling methods as well (ffill, bfill, fillna).

Things to consider

limit_direction=forward
Pros:
- Backward compatible: If limit is not None, this is the current behaviour (see #7665)
- Pandas compatible: Forward is the pandas default.
Cons:
-

limit_use_coordinates=False
Pros:
- Backward compatible
- Pandas compatible
-> Neither xarray nor pandas supports coordinate-based limits so far.
Cons:
- Inconsistent with the current default of

Generally, one might discuss if this separate argument is necessary or only one argument

use_coordinates=True
So far, if there is no coordinate for

Performance
On my machine, the new limit implementation based on ffill/bfill seems to be about 10% slower than the old one (based on rolling). There might be potential for improvements. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8577/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
13221727 | pull |
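
The ffill/bfill idea from the "Implementation" paragraph of the PR body can be illustrated with a small, dependency-free sketch. This is not the code from the PR: the helper names `gap_edges` and `fill_mask` are hypothetical, and the exact semantics of each mask (in particular how leading and trailing gaps are treated) are assumptions made only for illustration. The sketch forward-fills and backward-fills the coordinate of the nearest valid point, then derives the limit, limit_area, and max_gap masks from those two edge coordinates.

```python
import math


def gap_edges(coord, values):
    """Return, per position, the coordinate of the nearest valid point
    to the left (forward fill) and to the right (backward fill)."""
    n = len(values)
    left = [math.nan] * n
    right = [math.nan] * n
    last = math.nan
    for i in range(n):  # forward fill of valid coordinates
        if not math.isnan(values[i]):
            last = coord[i]
        left[i] = last
    nxt = math.nan
    for i in range(n - 1, -1, -1):  # backward fill of valid coordinates
        if not math.isnan(values[i]):
            nxt = coord[i]
        right[i] = nxt
    return left, right


def fill_mask(coord, values, limit=None, limit_direction="forward",
              limit_area=None, max_gap=None):
    """Return True where a NaN may be filled (assumed semantics, see lead-in)."""
    left, right = gap_edges(coord, values)
    mask = []
    for x, v, lo, hi in zip(coord, values, left, right):
        if not math.isnan(v):
            mask.append(False)  # valid points are never filled
            continue
        # A gap is "inside" when it has a valid point on both sides.
        inside = not math.isnan(lo) and not math.isnan(hi)
        ok = True
        if limit_area == "inside":
            ok = inside
        elif limit_area == "outside":
            ok = not inside
        if ok and max_gap is not None and inside:
            # Gap length: distance between the valid points around the gap.
            ok = (hi - lo) <= max_gap
        if ok and limit is not None:
            fwd = not math.isnan(lo) and (x - lo) <= limit
            bwd = not math.isnan(hi) and (hi - x) <= limit
            ok = {"forward": fwd, "backward": bwd, "both": fwd or bwd}[limit_direction]
        mask.append(ok)
    return mask


nan = math.nan
coord = [0, 1, 2, 3, 4, 5, 6, 7]
values = [1.0, nan, nan, nan, 5.0, nan, 7.0, nan]
print(fill_mask(coord, values, limit=1))
print(fill_mask(coord, values, max_gap=2))
```

Because every mask is computed from the same pair of edge coordinates, the options combine by plain boolean logic, and a coordinate-based limit needs no extra machinery: passing positions `0, 1, 2, ...` as `coord` gives count-based limits, while passing real coordinate values gives distance-based ones.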