issue_comments: 141853078

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/525#issuecomment-141853078 https://api.github.com/repos/pydata/xarray/issues/525 141853078 MDEyOklzc3VlQ29tbWVudDE0MTg1MzA3OA== 1217238 2015-09-21T01:28:18Z 2016-02-09T16:16:38Z MEMBER

@mhvk It would certainly be possible to extend dask.array to handle units, in either of the ways you suggest.

Although you could allow Quantity objects inside dask.arrays, I don't like that approach, because static checks like units really should be done only once when arrays are constructed (akin to dtype checks) rather than at evaluation time, and for every chunk. This suggests that tagging on the outside is the better approach.
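To make the "tagging on the outside" idea concrete, here is a minimal pure-Python sketch. Everything in it is hypothetical (`LazyChunked`, `Tagged`, `from_lists`); it is a toy stand-in for a chunked lazy array, not dask's API. The point it illustrates is only that the unit check runs once, when the lazy expression is built, and never inside the per-chunk work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LazyChunked:
    """Toy stand-in for a chunked lazy array (hypothetical, not dask).

    Each chunk is a zero-argument callable that produces a list of values
    only when it is called.
    """
    chunks: tuple

    def compute(self):
        # Evaluate every chunk and flatten the results into one list.
        return [x for chunk in self.chunks for x in chunk()]


@dataclass(frozen=True)
class Tagged:
    """Hypothetical wrapper: the unit is stored once, outside the chunks."""
    data: LazyChunked
    unit: str

    def __add__(self, other):
        # Static check: done once, at construction of the lazy result.
        if self.unit != other.unit:
            raise ValueError(f"cannot add {self.unit!r} and {other.unit!r}")
        # The per-chunk work is unit-free and is deferred until compute().
        new_chunks = tuple(
            (lambda a=a, b=b: [x + y for x, y in zip(a(), b())])
            for a, b in zip(self.data.chunks, other.data.chunks)
        )
        return Tagged(LazyChunked(new_chunks), self.unit)


def from_lists(chunks, unit):
    """Build a Tagged array from a list of lists (one inner list per chunk)."""
    lazy = tuple((lambda c=c: list(c)) for c in chunks)
    return Tagged(LazyChunked(lazy), unit)


a = from_lists([[1.0, 2.0], [3.0]], "m")
b = from_lists([[10.0, 20.0], [30.0]], "m")
c = a + b                 # unit check happens here, nothing is evaluated yet
total = c.data.compute()  # per-chunk arithmetic happens here, with no unit logic
```

Adding arrays with mismatched units fails at `a + b` time, before any chunk is evaluated, which is the analogue of a dtype check at construction.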

So far, so good -- but with the current state of duck array typing in NumPy, it's really hard to be happy with this. Until __numpy_ufunc__ lands, we can't override operations like np.sqrt in a way that is remotely feasible for dask.arrays (we can't afford to load big arrays into memory). Likewise, we need overrides for standard numpy array utility functions like concatenate. But the worst part is that the lack of standard interfaces means that we lose the possibility of composing different array backends with your Quantity type -- it will only be able to wrap dask or numpy arrays, not sparse matrices or bolt arrays or some other type yet to be invented.

Once we have all that duck-array stuff, then yes, you certainly could write a duck-array Quantity type that can wrap generic duck-arrays. But something like Quantity really only needs to override compute operations so that they can propagate dtypes -- there shouldn't be a need to override methods like concatenate. If you had an actual (parametric) dtype for units (e.g., Quantity[float64, 'meters']), then you would get all those dtype-agnostic methods for free, which would make your life as an implementer much easier. This is why I think custom dtypes would really be the ideal solution.
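A rough pure-Python sketch of why a parametric dtype would give the dtype-agnostic methods for free. All names here (`QuantityDType`, `ToyArray`, `concatenate`, `multiply`) are made up for illustration and are not NumPy or dask APIs; the assumption is simply that units live in the dtype, so structural functions only compare dtypes for equality while compute operations are the one place that derives a new dtype.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuantityDType:
    """Hypothetical parametric dtype, in the spirit of Quantity[float64, 'meters']."""
    base: str  # underlying storage type, e.g. "float64"
    unit: str  # unit label, e.g. "meters"


@dataclass(frozen=True)
class ToyArray:
    """Toy array: a tuple of values plus a dtype object of any kind."""
    values: tuple
    dtype: object


def concatenate(arrays):
    """Dtype-agnostic: it only checks that dtypes are equal.

    Nothing in here knows what a unit is, so no Quantity-specific
    override is needed.
    """
    first = arrays[0].dtype
    if any(a.dtype != first for a in arrays):
        raise TypeError("all inputs must share one dtype")
    return ToyArray(tuple(v for a in arrays for v in a.values), first)


def multiply(x, y):
    """Compute operation: the only place where the result dtype is derived."""
    dtype = QuantityDType(x.dtype.base, f"{x.dtype.unit}*{y.dtype.unit}")
    return ToyArray(tuple(a * b for a, b in zip(x.values, y.values)), dtype)


meters = QuantityDType("float64", "meters")
seconds = QuantityDType("float64", "seconds")

d1 = ToyArray((1.0, 2.0), meters)
d2 = ToyArray((3.0,), meters)
joined = concatenate([d1, d2])  # dtype carried through unchanged
product = multiply(d1, ToyArray((2.0, 4.0), seconds))
```

Concatenating arrays whose dtypes differ (say, meters and seconds) is rejected by the ordinary dtype equality check, again without any unit-aware code in `concatenate`.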

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  100295585