issues: 152061016

id: 152061016
node_id: MDU6SXNzdWUxNTIwNjEwMTY=
number: 839
title: Feature request: Assign coords for new axis in xr.concat
user: 9061708
state: closed
locked: 0
comments: 3
created_at: 2016-05-01T00:34:51Z
updated_at: 2019-02-25T20:28:23Z
closed_at: 2019-02-25T20:28:23Z
author_association: NONE
assignee, milestone, active_lock_reason, draft, pull_request, performed_via_github_app: (empty)
body:

It would be awesome to be able to assign coords for the new axis while concatenating. Basically, this would combine the following two statements into one call:

`DA_data = xr.concat(list(D_patient_DA.values()), dim="Patients"); DA_data.coords["Patients"] = list(D_patient_DA.keys())`

For this made-up dataset, imagine 100 patients, 12 months, and 10000 attributes, which would be a typical 3D dataset. I end up with a collection of 2D DataArrays (rows = months, columns = attributes). Each DataArray is a value in my dictionary, and the patient it came from is the key (i.e. `patient_x: DataArray_x`).

I'm trying to do `DA_data = xr.concat(list(D_patient_DA.values()), coords = list(D_patient_DA.keys()), dim="Patients")`, but it doesn't work, so I have to split it up like this: `DA_data = xr.concat(list(D_patient_DA.values()), dim="Patients"); DA_data.coords["Patients"] = list(D_patient_DA.keys())`

Am I writing the one-liner in the wrong format? The docs say `coords : {‘minimal’, ‘different’, ‘all’ or list of str}`, so it seems like it should work.
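For context, the `coords` argument seems to select which existing coordinate variables get concatenated, rather than supplying labels for the new axis (the error further down complains that the names are "not coordinates on the first dataset"). A minimal sketch of one way to label the new axis in a single call, assuming, as the `xr.concat` documentation describes, that `dim` may be a `pandas.Index` whose name becomes the dimension and whose values become the coordinate. The tiny 3 x 2 x 4 dictionary here is an illustrative stand-in, not the data from this report:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Small stand-in for the dictionary in the issue: 3 patients, 2 months, 4 attributes.
months = [0, 1]
attributes = ["attr_%d" % k for k in range(4)]
D_patient_DA = {
    "patient_%d" % i: xr.DataArray(
        np.full((2, 4), float(i)),  # every cell holds the patient number
        coords=[months, attributes],
        dims=["Months", "Attributes"],
    )
    for i in range(3)
}

# The index name is used as the new dimension name;
# the index values are attached as that dimension's coordinate.
patient_index = pd.Index(list(D_patient_DA.keys()), name="Patients")
DA_data = xr.concat(list(D_patient_DA.values()), dim=patient_index)

print(DA_data.dims)
print(DA_data.coords["Patients"].values)
```

With the labels in place, a single patient can be selected by name, for example `DA_data.sel(Patients="patient_2")`.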

Here is my code for generating fake data for this problem:

```
import xarray as xr
import numpy as np
from collections import OrderedDict

np.random.seed(1618033)

# Set dimensions
a,b,c = 100,12,10000 #100 patients, 12 months, 10000 attributes

# Create labels
patients = ["patient_%d" % i for i in range(a)]
months = [j for j in range(b)]
attributes = ["attr_%d" % k for k in range(c)]

# Dict of DataFrames
D_patient_DA = OrderedDict()

for i, patient in enumerate(patients):
    A_placeholder = np.zeros((b,c))
    for j, month in enumerate(months):
        #Genes x Replicates
        V_attrExp = np.random.random(c)
        #Fill array with row
        A_placeholder[j,:] = V_attrExp
    #Assign dataframe for every patient
    D_patient_DA[patient] = xr.DataArray(A_placeholder, coords = [months, attributes], dims = ["Months","Attributes"])

# I'd like to do this:
DA_data = xr.concat(list(D_patient_DA.values()), coords = list(D_patient_DA.keys()), dim="Patients")

# Traceback (most recent call last):
#   File "Untitled.py", line 29, in <module>
#     DA_data = xr.concat(list(D_patient_DA.values()), coords = list(D_patient_DA.keys()), dim="Patients")
#   File "/Users/Mu/Dropbox/anaconda/lib/python3.5/site-packages/xarray/core/combine.py", line 114, in concat
#     return f(objs, dim, data_vars, coords, compat, positions)
#   File "/Users/Mu/Dropbox/anaconda/lib/python3.5/site-packages/xarray/core/combine.py", line 301, in _dataarray_concat
#     positions)
#   File "/Users/Mu/Dropbox/anaconda/lib/python3.5/site-packages/xarray/core/combine.py", line 207, in _dataset_concat
#     concat_over = _calc_concat_over(datasets, dim, data_vars, coords)
#   File "/Users/Mu/Dropbox/anaconda/lib/python3.5/site-packages/xarray/core/combine.py", line 186, in _calc_concat_over
#     concat_over.update(process_subset_opt(coords, 'coords'))
#   File "/Users/Mu/Dropbox/anaconda/lib/python3.5/site-packages/xarray/core/combine.py", line 177, in process_subset_opt
#     % (subset, subset_long_name, invalid_vars))
# ValueError: some variables in coords are not coordinates on the first dataset: ['patient_0', 'patient_1', 'patient_2', 'patient_3', 'patient_4', 'patient_5', 'patient_6', 'patient_7', 'patient_8', 'patient_9', 'patient_10', 'patient_11', 'patient_12', 'patient_13', 'patient_14', 'patient_15', 'patient_16', 'patient_17', 'patient_18', 'patient_19', 'patient_20', 'patient_21', 'patient_22', 'patient_23', 'patient_24', 'patient_25', 'patient_26', 'patient_27', 'patient_28', 'patient_29', 'patient_30', 'patient_31', 'patient_32', 'patient_33', 'patient_34', 'patient_35', 'patient_36', 'patient_37', 'patient_38', 'patient_39', 'patient_40', 'patient_41', 'patient_42', 'patient_43', 'patient_44', 'patient_45', 'patient_46', 'patient_47', 'patient_48', 'patient_49', 'patient_50', 'patient_51', 'patient_52', 'patient_53', 'patient_54', 'patient_55', 'patient_56', 'patient_57', 'patient_58', 'patient_59', 'patient_60', 'patient_61', 'patient_62', 'patient_63', 'patient_64', 'patient_65', 'patient_66', 'patient_67', 'patient_68', 'patient_69', 'patient_70', 'patient_71', 'patient_72', 'patient_73', 'patient_74', 'patient_75', 'patient_76', 'patient_77', 'patient_78', 'patient_79', 'patient_80', 'patient_81', 'patient_82', 'patient_83', 'patient_84', 'patient_85', 'patient_86', 'patient_87', 'patient_88', 'patient_89', 'patient_90', 'patient_91', 'patient_92', 'patient_93', 'patient_94', 'patient_95', 'patient_96', 'patient_97', 'patient_98', 'patient_99']

# But I have to do this instead
DA_data = xr.concat(list(D_patient_DA.values()), dim="Patients")
DA_data.coords["Patients"] = list(D_patient_DA.keys())
```
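If the goal is mainly to avoid the second statement, the two steps can also be chained into one expression. A minimal sketch, assuming `DataArray.assign_coords` accepts the new labels as a keyword argument named after the dimension. The dictionary `arrays` and its keys are illustrative stand-ins, not the data generated above:

```python
import numpy as np
import xarray as xr

# Two tiny 2D arrays keyed by a hypothetical patient label.
arrays = {
    "patient_a": xr.DataArray(np.zeros((2, 3)), dims=["Months", "Attributes"]),
    "patient_b": xr.DataArray(np.ones((2, 3)), dims=["Months", "Attributes"]),
}

# Concatenate along a new, unlabeled "Patients" dimension,
# then attach the dictionary keys as its coordinate in the same expression.
stacked = xr.concat(list(arrays.values()), dim="Patients").assign_coords(
    Patients=list(arrays.keys())
)

print(stacked.coords["Patients"].values)
```

This keeps `dim` as a plain string, at the cost of building the unlabeled array first and relabeling it afterwards.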

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/839/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
repo: 13221727
type: issue
