issue_comments

4 rows where author_association = "CONTRIBUTOR" and issue = 621078539 sorted by updated_at descending

user 2

  • alimanfoo 3
  • hmaarrfk 1

issue 1

  • Unnamed dimensions · 4 ✖

author_association 1

  • CONTRIBUTOR · 4 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1434780029 https://github.com/pydata/xarray/issues/4079#issuecomment-1434780029 https://api.github.com/repos/pydata/xarray/issues/4079 IC_kwDOAMm_X85VhQF9 hmaarrfk 90008 2023-02-17T15:08:50Z 2023-02-17T15:08:50Z CONTRIBUTOR

I know it is "stale" but aligning to these "surprise dimensions" creates "late stage" bugs that are hard to pinpoint.

I'm not sure if it is possible to mark these dimensions as "unnamed" and as such, they should be "merged" into new "unnamed" dimensions that the user isn't tracking at this point in time.

Our workarounds have included naming these dimensions after the DataArray they belong to (d1_i), or replacing small "arrays" with a countable number of scalar variables (d1_min, d1_max) instead of a single array containing two values (d1_limits).

```python
import xarray as xr
d1 = xr.DataArray(data=[1, 2])
assert 'dim_0' in d1.dims
d2 = xr.DataArray(data=[1, 2, 3])
assert 'dim_0' in d2.dims

xr.Dataset({'d1': d1, 'd2': d2})
```

Stack trace

```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[2], line 7
      4 d2 = xr.DataArray(data=[1, 2, 3])
      5 assert 'dim_0' in d2.dims
----> 7 xr.Dataset({'d1': d1, 'd2': d2})

File ~/mambaforge/envs/dev/lib/python3.9/site-packages/xarray/core/dataset.py:612, in Dataset.__init__(self, data_vars, coords, attrs)
    609 if isinstance(coords, Dataset):
    610     coords = coords.variables
--> 612 variables, coord_names, dims, indexes, _ = merge_data_and_coords(
    613     data_vars, coords, compat="broadcast_equals"
    614 )
    616 self._attrs = dict(attrs) if attrs is not None else None
    617 self._close = None

File ~/mambaforge/envs/dev/lib/python3.9/site-packages/xarray/core/merge.py:564, in merge_data_and_coords(data_vars, coords, compat, join)
    562 objects = [data_vars, coords]
    563 explicit_coords = coords.keys()
--> 564 return merge_core(
    565     objects,
    566     compat,
    567     join,
    568     explicit_coords=explicit_coords,
    569     indexes=Indexes(indexes, coords),
    570 )

File ~/mambaforge/envs/dev/lib/python3.9/site-packages/xarray/core/merge.py:741, in merge_core(objects, compat, join, combine_attrs, priority_arg, explicit_coords, indexes, fill_value)
    738 _assert_compat_valid(compat)
    740 coerced = coerce_pandas_values(objects)
--> 741 aligned = deep_align(
    742     coerced, join=join, copy=False, indexes=indexes, fill_value=fill_value
    743 )
    744 collected = collect_variables_and_indexes(aligned, indexes=indexes)
    745 prioritized = _get_priority_vars_and_indexes(aligned, priority_arg, compat=compat)

File ~/mambaforge/envs/dev/lib/python3.9/site-packages/xarray/core/alignment.py:848, in deep_align(objects, join, copy, indexes, exclude, raise_on_invalid, fill_value)
    845     else:
    846         out.append(variables)
--> 848 aligned = align(
    849     *targets,
    850     join=join,
    851     copy=copy,
    852     indexes=indexes,
    853     exclude=exclude,
    854     fill_value=fill_value,
    855 )
    857 for position, key, aligned_obj in zip(positions, keys, aligned):
    858     if key is no_key:

File ~/mambaforge/envs/dev/lib/python3.9/site-packages/xarray/core/alignment.py:785, in align(join, copy, indexes, exclude, fill_value, *objects)
    589 """
    590 Given any number of Dataset and/or DataArray objects, returns new
    591 objects with aligned indexes and dimension sizes.
   (...)
    775
    776 """
    777 aligner = Aligner(
    778     objects,
    779     join=join,
   (...)
    783     fill_value=fill_value,
    784 )
--> 785 aligner.align()
    786 return aligner.results

File ~/mambaforge/envs/dev/lib/python3.9/site-packages/xarray/core/alignment.py:573, in Aligner.align(self)
    571 self.assert_no_index_conflict()
    572 self.align_indexes()
--> 573 self.assert_unindexed_dim_sizes_equal()
    575 if self.join == "override":
    576     self.override_indexes()

File ~/mambaforge/envs/dev/lib/python3.9/site-packages/xarray/core/alignment.py:472, in Aligner.assert_unindexed_dim_sizes_equal(self)
    470 add_err_msg = ""
    471 if len(sizes) > 1:
--> 472     raise ValueError(
    473         f"cannot reindex or align along dimension {dim!r} "
    474         f"because of conflicting dimension sizes: {sizes!r}" + add_err_msg
    475     )

ValueError: cannot reindex or align along dimension 'dim_0' because of conflicting dimension sizes: {2, 3}
```
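For comparison, the two workarounds described above might look roughly like the sketch below. The dimension name `d2_i` and the intermediate variable `d1_limits` are illustrative assumptions that follow the naming pattern mentioned in the comment; they are not taken from the original code.

```python
import xarray as xr

# Workaround 1: give each otherwise-unnamed dimension a name tied to its
# own array, so xarray never tries to align 'dim_0' across both arrays.
d1 = xr.DataArray(data=[1, 2], dims=["d1_i"])
d2 = xr.DataArray(data=[1, 2, 3], dims=["d2_i"])  # 'd2_i' is an assumed name
ds_named = xr.Dataset({"d1": d1, "d2": d2})

# Workaround 2: replace the small two-element array with scalar variables,
# which carry no dimension at all.
d1_limits = [1, 2]
ds_scalars = xr.Dataset(
    {
        "d1_min": xr.DataArray(d1_limits[0]),
        "d1_max": xr.DataArray(d1_limits[1]),
        "d2": d2,
    }
)
```

Both datasets build without the `ValueError`, at the cost of either inventing dimension names nobody uses or splitting one logical array into several variables.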

cc: @claydugo

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Unnamed dimensions 621078539
631075010 https://github.com/pydata/xarray/issues/4079#issuecomment-631075010 https://api.github.com/repos/pydata/xarray/issues/4079 MDEyOklzc3VlQ29tbWVudDYzMTA3NTAxMA== alimanfoo 703554 2020-05-19T20:50:26Z 2020-05-19T20:50:51Z CONTRIBUTOR

> In the specific example from your notebook, where do the dimensions lengths __variants/BaseCounts_dim1, __variants/MLEAC_dim1 and __variants/MLEAF_dim1 come from?

> BaseCounts_dim1 is length 4, so maybe that corresponds to DNA bases ATGC?

In this specific example, I do actually know where these dimension lengths come from. In fact I should've used the shared dimension alt_alleles instead of __variants/MLEAC_dim1 and __variants/MLEAF_dim1. And yes BaseCounts_dim1 does correspond to DNA bases.

But two points.

First, I don't care about these dimensions. The only dimensions I care about and will use are variants, samples and ploidy.

Second, more important, this kind of data can come from a number of different sources, each of which includes a different set of arrays with different names and semantics. While there are some common arrays and naming conventions where I can guess what the dimensions mean, in general I can't know all of those up front and bake them in as special cases.
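One way to handle arrays whose trailing dimensions are unknown up front is to name only the dimensions that matter and generate a per-array name for the rest, in the spirit of the `BaseCounts_dim1` names mentioned above. The following is a minimal sketch: the helper `to_dataset`, the array shapes, and the `genotype` variable are hypothetical and only illustrate the idea.

```python
import numpy as np
import xarray as xr


def to_dataset(arrays, known_dims):
    """Hypothetical helper: name the known leading axes, auto-name the rest."""
    data_vars = {}
    for name, values in arrays.items():
        values = np.asarray(values)
        named = list(known_dims.get(name, []))
        # Every axis without a meaningful name gets one unique to this array,
        # so it cannot collide with an unnamed axis of another array.
        auto = [f"{name}_dim{i}" for i in range(len(named), values.ndim)]
        data_vars[name] = (named + auto, values)
    return xr.Dataset(data_vars)


# Made-up shapes: 5 variants, 3 samples, ploidy 2.
arrays = {
    "genotype": np.zeros((5, 3, 2), dtype="i1"),
    "BaseCounts": np.zeros((5, 4), dtype="i4"),
    "MLEAC": np.zeros((5, 3), dtype="i4"),
}
known_dims = {
    "genotype": ["variants", "samples", "ploidy"],
    "BaseCounts": ["variants"],
    "MLEAC": ["variants"],
}
ds = to_dataset(arrays, known_dims)
```

This keeps `variants`, `samples` and `ploidy` shared across arrays, while the dimensions nobody is tracking stay out of each other's way. It does not make them truly unnamed; it only avoids accidental alignment.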

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Unnamed dimensions 621078539
630924754 https://github.com/pydata/xarray/issues/4079#issuecomment-630924754 https://api.github.com/repos/pydata/xarray/issues/4079 MDEyOklzc3VlQ29tbWVudDYzMDkyNDc1NA== alimanfoo 703554 2020-05-19T16:14:27Z 2020-05-19T16:14:27Z CONTRIBUTOR

Thanks @shoyer.

For reference, I'm exploring putting some genome variation data into xarray, here's an initial experiment and discussion here.

In general I will have some arrays where I won't know what some of the dimensions mean, and so cannot give them a meaningful name.

No worries if this is hard, was just wondering if it was supported already.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Unnamed dimensions 621078539
630913851 https://github.com/pydata/xarray/issues/4079#issuecomment-630913851 https://api.github.com/repos/pydata/xarray/issues/4079 MDEyOklzc3VlQ29tbWVudDYzMDkxMzg1MQ== alimanfoo 703554 2020-05-19T15:55:54Z 2020-05-19T15:55:54Z CONTRIBUTOR

Thanks so much @rabernat for the quick response.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Unnamed dimensions 621078539

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
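
The filter shown at the top of this page (author_association = "CONTRIBUTOR", issue = 621078539, sorted by updated_at descending) can be reproduced against this schema with Python's built-in `sqlite3` module. This is a sketch only: it uses an in-memory database, fills in just a few columns from the rows above, and adds one made-up non-matching row to show the filter at work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE [issue_comments] (
       [html_url] TEXT,
       [issue_url] TEXT,
       [id] INTEGER PRIMARY KEY,
       [node_id] TEXT,
       [user] INTEGER REFERENCES [users]([id]),
       [created_at] TEXT,
       [updated_at] TEXT,
       [author_association] TEXT,
       [body] TEXT,
       [reactions] TEXT,
       [performed_via_github_app] TEXT,
       [issue] INTEGER REFERENCES [issues]([id])
    )
    """
)
conn.execute("CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue])")

rows = [
    # (id, user, updated_at, author_association, issue)
    (630913851, 703554, "2020-05-19T15:55:54Z", "CONTRIBUTOR", 621078539),
    (630924754, 703554, "2020-05-19T16:14:27Z", "CONTRIBUTOR", 621078539),
    (631075010, 703554, "2020-05-19T20:50:51Z", "CONTRIBUTOR", 621078539),
    (1434780029, 90008, "2023-02-17T15:08:50Z", "CONTRIBUTOR", 621078539),
    # Made-up row that the filter should exclude.
    (1, 1, "2021-01-01T00:00:00Z", "MEMBER", 621078539),
]
conn.executemany(
    "INSERT INTO [issue_comments] "
    "([id], [user], [updated_at], [author_association], [issue]) "
    "VALUES (?, ?, ?, ?, ?)",
    rows,
)

# ISO 8601 timestamps stored as TEXT sort correctly as plain strings.
ids = [
    row[0]
    for row in conn.execute(
        "SELECT [id] FROM [issue_comments] "
        "WHERE [author_association] = ? AND [issue] = ? "
        "ORDER BY [updated_at] DESC",
        ("CONTRIBUTOR", 621078539),
    )
]
```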
Powered by Datasette · Queries took 13.54ms · About: xarray-datasette