pull_requests: 200571225

id: 200571225
node_id: MDExOlB1bGxSZXF1ZXN0MjAwNTcxMjI1
number: 2277
state: closed
locked: 0
title: ENH: Scatter plots of one variable vs another
user: 6164157
created_at: 2018-07-11T02:31:01Z
updated_at: 2019-08-08T18:05:00Z
closed_at: 2019-08-08T15:57:17Z
merged_at: 2019-08-08T15:57:17Z
merge_commit_sha: f172c6738ae4bc9802e08d355ea05ea6c47527ab
assignee:
milestone:
draft: 0
head: d56f7d13c9b82afbbe63734448e3594bfd06c940
base: 8a9c4710b2ee389a41e08a665108aca05ef02544
author_association: CONTRIBUTOR
auto_merge:
repo: 13221727
url: https://github.com/pydata/xarray/pull/2277
merged_by:
body:

- [x] Closes #470
- [x] Tests added (for all bug fixes or enhancements)
- [x] Tests passed (for all non-documentation changes)
- [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API
- [x] Add support for `size`?
- [x] Revert hue=datetime support bits

Say you have two variables in a `Dataset` and you want to make a scatter plot of one vs the other, possibly using different hues and/or faceting. This is useful if you want to inspect the data to see whether two variables have some underlying relationship that you might have missed. It's something that I found myself writing code for by hand quite a few times, so I thought it would be better to have it as a feature. I'm not sure whether this is useful for other people, but I have the feeling that it probably is.

First, set up a dataset with two variables:

```python
import xarray as xr
import numpy as np
import matplotlib
from matplotlib import pyplot as plt

A = xr.DataArray(np.zeros([3, 11, 4, 4]), dims=['x', 'y', 'z', 'w'],
                 coords=[np.arange(3), np.linspace(0,1,11), np.arange(4), 0.1*np.random.randn(4)])
B = 0.1*A.x**2+A.y**2.5+0.1*A.z*A.w
A = -0.1*A.x+A.y/(5+A.z)+A.w
ds = xr.Dataset({'A':A, 'B':B})
ds['w'] = ['one', 'two', 'three', 'five']
```

Now, we can plot all values of `A` vs all values of `B`:

```python
plt.plot(A.values.flat,B.values.flat,'.')
```

![a](https://user-images.githubusercontent.com/6164157/42546655-4cce84d8-848c-11e8-8ae0-9249e04137bd.png)

What a mess. Wouldn't it be nice if you could color each point according to the value of some coordinate, say `w`?

```python
ds.scatter(x='A',y='B', hue='w')
```

![a](https://user-images.githubusercontent.com/6164157/42546725-9810e4c2-848c-11e8-827d-ae98c57f0ad0.png)

Huh! There seems to be some underlying structure there. Can we also facet over a different coordinate?

```python
ds.scatter(x='A',y='B',col='x', hue='w')
```

![a](https://user-images.githubusercontent.com/6164157/42546773-d590caf6-848c-11e8-8095-06216f138a95.png)

or two coordinates?

```python
ds.scatter(x='A',y='B',col='x', row='z', hue='w')
```

![a](https://user-images.githubusercontent.com/6164157/42546775-e07a8b96-848c-11e8-98f2-8a05d543b54a.png)

The logic is that dimensions that are not used for faceting or hue are stacked into a single dimension (using `Dataset.stack`) and plotted. Only variables that have exactly the same dimensions are allowed.

Regarding implementation: I am not sure about the API, and I probably haven't thought about edge cases with missing data, NaNs, or whatnot, so any input would be welcome. Also, there might be a simpler implementation that first uses `to_array` and then the existing line plot functions, but I couldn't find it.
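
The stacking logic described in the PR body can be illustrated with a minimal sketch. This is not the code from the pull request: the helper name `stack_for_scatter`, the stacked dimension name `points`, and the small `demo` dataset are all hypothetical, and the sketch assumes `hue` names a dimension coordinate. It only shows how every dimension other than the hue dimension might be collapsed with `Dataset.stack`, leaving one set of points per hue value.

```python
import numpy as np
import xarray as xr


def stack_for_scatter(ds, x, y, hue):
    """Group the points of ds[x] vs ds[y] by the values of the `hue` coordinate.

    Hypothetical helper (not the PR's implementation): every dimension
    except `hue` is stacked into a single "points" dimension.
    """
    # The PR body states that both variables must share the same dimensions.
    if set(ds[x].dims) != set(ds[y].dims):
        raise ValueError("x and y must have exactly the same dimensions")

    # Collapse all non-hue dimensions into one stacked dimension.
    other_dims = [d for d in ds[x].dims if d != hue]
    stacked = ds[[x, y]].stack(points=other_dims)

    # One (x-values, y-values) pair of 1-D arrays per hue label.
    groups = {}
    for label in stacked[hue].values.tolist():
        sub = stacked.sel({hue: label})
        groups[label] = (sub[x].values, sub[y].values)
    return groups


# Small illustrative dataset (assumed values, unrelated to the PR's figures).
a = xr.DataArray(
    np.arange(24.0).reshape(2, 3, 4),
    dims=["x", "y", "w"],
    coords={"x": [0, 1], "y": [0.0, 0.5, 1.0], "w": ["one", "two", "three", "five"]},
)
demo = xr.Dataset({"A": a, "B": 2 * a})
groups = stack_for_scatter(demo, "A", "B", hue="w")
```

Each entry of `groups` could then be drawn as its own series, for example with `plt.plot(xs, ys, '.', label=label)` in a loop, which gives one color per value of `w`. Faceting would apply the same idea once per `col`/`row` value.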
