issues: 453576041

id: 453576041
node_id: MDU6SXNzdWU0NTM1NzYwNDE=
number: 3004
title: assign values from `xr.groupby_bins` to new `variable`
user: 21049064
state: closed
locked: 0
comments: 8
created_at: 2019-06-07T15:38:01Z
updated_at: 2019-07-07T12:17:46Z
closed_at: 2019-07-07T12:17:45Z
author_association: NONE

Code Sample, a copy-pastable example if possible

A "Minimal, Complete and Verifiable Example" will make it much easier for maintainers to help you: http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports

```python
import pandas as pd
import numpy as np
import xarray as xr

time = pd.date_range('2010-01-01', '2011-12-31', freq='M')
lat = np.linspace(-5.175003, -4.7250023, 10)
lon = np.linspace(33.524994, 33.97499, 10)
precip = np.random.normal(0, 1, size=(len(time), len(lat), len(lon)))

ds = xr.Dataset(
    {'precip': (['time', 'lat', 'lon'], precip)},
    coords={'lon': lon, 'lat': lat, 'time': time},
)

variable = 'precip'

# calculate a cumsum over some window size
rolling_window = 3
ds_window = (
    ds.rolling(time=rolling_window, center=True)
    .sum()
    .dropna(dim='time', how='all')
)

# construct a cumulative frequency distribution ranking the precip values
# per month
rank_norm_list = []
for mth in range(1, 13):
    ds_mth = (
        ds_window
        .where(ds_window['time.month'] == mth)
        .dropna(dim='time', how='all')
    )
    rank_norm_mth = (
        (ds_mth.rank(dim='time') - 1) / (ds_mth.time.size - 1.0) * 100.0
    )
    rank_norm_mth = rank_norm_mth.rename({variable: 'rank_norm'})
    rank_norm_list.append(rank_norm_mth)

rank_norm = xr.merge(rank_norm_list).sortby('time')

# assign bins to variable xarray
bins = [20., 40., 60., 80., np.inf]
decile_index_gpby = rank_norm.groupby_bins('rank_norm', bins=bins)
out = decile_index_gpby.assign()  # assign_coords()
```
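A note on what `groupby_bins` produces in the last step: each group is labelled with a `pandas.Interval` by default, and integer labels can be requested through the `labels` argument. A minimal sketch with made-up values (only the variable name `rank_norm` and the bin edges mirror the report; the data and the explicit lower edge of 0.0 are illustrative):

```python
import numpy as np
import xarray as xr

# Toy dataset standing in for the per-month percentile ranks above.
ds = xr.Dataset({'rank_norm': ('x', [5.0, 25.0, 55.0, 95.0])})

# Same upper bin edges as the report, plus an explicit lower edge,
# with integer labels instead of the default pandas.Interval labels.
bins = [0.0, 20.0, 40.0, 60.0, 80.0, np.inf]
gb = ds.groupby_bins('rank_norm', bins=bins, labels=[0, 1, 2, 3, 4])
group_labels = sorted(int(k) for k in gb.groups)
print(group_labels)  # bin 3, i.e. (60, 80], has no members here
```

Only non-empty bins appear in `gb.groups`, which is why attaching the labels back onto the original grid is the awkward part.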

Problem description


I want to calculate the Decile Index; see the notebook `ex1-Calculate Decile Index (DI) with Python.ipynb`.

The pandas implementation is simple enough, but I need help applying the bin labels from `groupby_bins` to a new variable / coordinate.
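One possible workaround, sketched here with made-up data rather than the full reproduction above: skip `groupby_bins` entirely and compute the bin index with `np.digitize`, then attach the result as a new data variable. The variable names `rank_norm` and `rank_bin` follow the report; everything else is illustrative.

```python
import numpy as np
import xarray as xr

# Stand-in for the percentile-rank dataset from the report.
rank_norm = xr.Dataset(
    {'rank_norm': (('lat', 'lon'), np.array([[5.0, 25.0], [55.0, 95.0]]))}
)

# Interior bin edges; values below 20 map to 0, values >= 80 map to 4.
bin_edges = np.array([20.0, 40.0, 60.0, 80.0])
rank_bin = xr.apply_ufunc(np.digitize, rank_norm['rank_norm'],
                          kwargs={'bins': bin_edges})
out = rank_norm.assign(rank_bin=rank_bin)
print(out['rank_bin'].values)
```

Because `np.digitize` is applied elementwise, every grid cell gets an integer bin, which matches the shape of the `rank_bin` variable in the expected output below.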

Expected Output

```
<xarray.Dataset>
Dimensions:   (lat: 10, lon: 10, time: 24)
Coordinates:
  * time      (time) datetime64[ns] 2010-01-31 2010-02-28 ... 2011-12-31
  * lat       (lat) float32 -5.175003 -5.125 -5.075001 ... -4.7750015 -4.7250023
  * lon       (lon) float32 33.524994 33.574997 33.625 ... 33.925003 33.97499
Data variables:
    precip    (time, lat, lon) float32 4.6461554 4.790813 ... 7.3063064 7.535994
    rank_bin  (lat, lon, time) int64 1 3 3 0 1 4 2 3 0 1 ... 0 4 0 1 3 1 2 2 3 1
```

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.0 | packaged by conda-forge | (default, Nov 12 2018, 12:34:36) [Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 18.2.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2

xarray: 0.12.1
pandas: 0.24.2
numpy: 1.16.4
scipy: 1.3.0
netCDF4: 1.5.1.2
pydap: None
h5netcdf: None
h5py: 2.9.0
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudonetCDF: None
rasterio: 1.0.17
cfgrib: 0.9.7
iris: None
bottleneck: 1.2.1
dask: 1.2.2
distributed: 1.28.1
matplotlib: 3.1.0
cartopy: 0.17.0
seaborn: 0.9.0
setuptools: 41.0.1
pip: 19.1
conda: None
pytest: 4.5.0
IPython: 7.1.1
sphinx: 2.0.1
```