issues: 340069538

id node_id number title user state locked assignee milestone comments created_at updated_at closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
340069538 MDExOlB1bGxSZXF1ZXN0MjAwNTcxMjI1 2277 ENH: Scatter plots of one variable vs another 6164157 closed 0     45 2018-07-11T02:31:01Z 2019-08-08T18:05:00Z 2019-08-08T15:57:17Z CONTRIBUTOR   0 pydata/xarray/pulls/2277
  • [x] Closes #470
  • [x] Tests added (for all bug fixes or enhancements)
  • [x] Tests passed (for all non-documentation changes)
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API
  • [x] Add support for size?
  • [x] Revert hue=datetime support bits

Say you have two variables in a Dataset and you want to make a scatter plot of one against the other, possibly with different hues and/or faceting. This is useful for inspecting the data to see whether two variables have an underlying relationship that you might have missed. I have written this code by hand quite a few times, so I thought it would be better to have it as a feature. I'm not sure whether it is useful to other people, but I suspect it is.

First, set up a dataset with two variables:

```python
import xarray as xr
import numpy as np
import matplotlib
from matplotlib import pyplot as plt

A = xr.DataArray(np.zeros([3, 11, 4, 4]), dims=['x', 'y', 'z', 'w'],
                 coords=[np.arange(3), np.linspace(0, 1, 11), np.arange(4), 0.1*np.random.randn(4)])
B = 0.1*A.x**2 + A.y**2.5 + 0.1*A.z*A.w
A = -0.1*A.x + A.y/(5 + A.z) + A.w
ds = xr.Dataset({'A': A, 'B': B})
ds['w'] = ['one', 'two', 'three', 'five']
```

Now, we can plot all values of `A` vs all values of `B`:

```python
plt.plot(A.values.flat, B.values.flat, '.')
```

What a mess. Wouldn't it be nice if you could color each point according to the value of some coordinate, say `w`?

```python
ds.scatter(x='A', y='B', hue='w')
```

Huh! There seems to be some underlying structure there. Can we also facet over a different coordinate?

```python
ds.scatter(x='A', y='B', col='x', hue='w')
```

or two coordinates?

```python
ds.scatter(x='A', y='B', col='x', row='z', hue='w')
```

The logic is that dimensions that are not used for faceting or hue are stacked (using xarray's `stack`) and plotted. Only variables that have exactly the same dimensions are allowed.
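As a rough illustration of that stacking logic, here is a minimal sketch. It is not the code in this PR: the helper name `stacked_xy_by_hue` and the stacked dimension name `points` are hypothetical, and the example data is random rather than the dataset above. It collapses every dimension except the hue dimension into one, then collects the x/y values for each hue value.

```python
import numpy as np
import xarray as xr


def stacked_xy_by_hue(ds, x, y, hue):
    """Return {hue_value: (x_values, y_values)} with all other dims flattened.

    Hypothetical helper for illustration only; not part of xarray.
    """
    # Only variables with exactly the same dimensions are allowed.
    if set(ds[x].dims) != set(ds[y].dims):
        raise ValueError("x and y must have exactly the same dimensions")

    # Collapse every dimension except the hue dimension into one.
    # "points" is an arbitrary name chosen for the new stacked dimension.
    other_dims = [d for d in ds[x].dims if d != hue]
    stacked = ds[[x, y]].stack(points=other_dims)

    groups = {}
    for i, value in enumerate(stacked[hue].values.tolist()):
        sel = stacked.isel({hue: i})
        groups[value] = (sel[x].values, sel[y].values)
    return groups


# Example data (random values, assumed for illustration).
rng = np.random.default_rng(0)
dims = ("x", "y", "z", "w")
shape = (3, 11, 4, 4)
ds = xr.Dataset(
    {"A": (dims, rng.normal(size=shape)), "B": (dims, rng.normal(size=shape))},
    coords={
        "x": np.arange(3),
        "y": np.linspace(0, 1, 11),
        "z": np.arange(4),
        "w": ["one", "two", "three", "five"],
    },
)
groups = stacked_xy_by_hue(ds, "A", "B", "w")
```

Each entry of `groups` can then be drawn with one call such as `plt.plot(xs, ys, '.', label=value)`, so that every hue value gets its own color. Faceting would work the same way, with the `col` and `row` dimensions also left out of the stack.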

Regarding implementation: I am not sure about the API, and I probably haven't thought through edge cases such as missing data or NaNs, so any input would be welcome. There might also be a simpler implementation that first uses to_array and then the existing line plot functions, but I couldn't find it.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2277/reactions",
    "total_count": 3,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 3,
    "rocket": 0,
    "eyes": 0
}
    13221727 pull