home / github

Menu
  • Search all tables
  • GraphQL API

issue_comments

Table actions
  • GraphQL API for issue_comments

5 rows where issue = 350899839 and user = 1197350 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: reactions, created_at (date), updated_at (date)

user 1

  • rabernat · 5 ✖

issue 1

  • Let's list all the netCDF files that xarray can't open · 5 ✖

author_association 1

  • MEMBER 5
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
443305634 https://github.com/pydata/xarray/issues/2368#issuecomment-443305634 https://api.github.com/repos/pydata/xarray/issues/2368 MDEyOklzc3VlQ29tbWVudDQ0MzMwNTYzNA== rabernat 1197350 2018-11-30T19:03:07Z 2018-11-30T19:03:07Z MEMBER

We are working on fixing this in #2405. That PR (mine) has most of the basic functionality there, but it still needs more testing. Unfortunately, I don't have bandwidth right now to complete the required work.

If anyone here needs this fixed urgently and actually has time to work on it, I encourage you to pick up that PR and try to finish it off. We will be happy to provide help and support along the way.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Let's list all the netCDF files that xarray can't open 350899839
419207959 https://github.com/pydata/xarray/issues/2368#issuecomment-419207959 https://api.github.com/repos/pydata/xarray/issues/2368 MDEyOklzc3VlQ29tbWVudDQxOTIwNzk1OQ== rabernat 1197350 2018-09-06T19:08:06Z 2018-09-06T19:08:06Z MEMBER

It seems like this relaxation is compatible with the refactoring of indexes. Right now, we automatically create 1D indexes for all coordinate variables. The problem with 2D dimensions is that such indexes don't make sense: data.sel(y=3.14)

But maybe we could turn multi-dimensional coordinate variables into multi-indexes? Or no index at all? In any case, we could still do data.isel(y=3) i.e. logical indexing on the dimension axis.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Let's list all the netCDF files that xarray can't open 350899839
419188538 https://github.com/pydata/xarray/issues/2368#issuecomment-419188538 https://api.github.com/repos/pydata/xarray/issues/2368 MDEyOklzc3VlQ29tbWVudDQxOTE4ODUzOA== rabernat 1197350 2018-09-06T18:05:00Z 2018-09-06T18:05:00Z MEMBER

Perhaps part of the confusion is simply that y has different meanings in different contexts. When used as a dimension (e.g. to "define the array shape of a Variable" in CDM terms), it is indeed 1D. When used as a variable (or "CoordinateAxis"), it is 2D. XArray doesn't have a separate namespace for dimensions and variables.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Let's list all the netCDF files that xarray can't open 350899839
419187692 https://github.com/pydata/xarray/issues/2368#issuecomment-419187692 https://api.github.com/repos/pydata/xarray/issues/2368 MDEyOklzc3VlQ29tbWVudDQxOTE4NzY5Mg== rabernat 1197350 2018-09-06T18:02:19Z 2018-09-06T18:02:19Z MEMBER

@dopplershift - thanks for the clarifications! I agree that it's good for netCDF to be as open-ended as possible.

So I guess my quarrel is with the CDM. This is what it says about variables and dimensions:

A Variable is a container for data. It has a DataType, a set of Dimensions that define its array shape, and optionally a set of Attributes. Any shared Dimension it uses must be in the same Group or a parent Group.

A Dimension is used to define the array shape of a Variable. It may be shared among Variables, which provides a simple yet powerful way of associating Variables. When a Dimension is shared, it has a unique name within the Group. If unlimited, a Dimension's length may increase. If variableLength, then the actual length is data dependent, and can only be found by reading the data. A variableLength Dimension cannot be shared or unlimited.

then later

A Variable can have zero or more Coordinate Systems containing one or more CoordinateAxis. A CoordinateAxis can only be part of a Variable's CoordinateSystem if the CoordinateAxis' set of Dimensions is a subset of the Variable's set of Dimensions. This ensures that every data point in the Variable has a corresponding coordinate value for each of the CoordinateAxis in the CoordinateSystem.

A Coordinate System has one or more CoordinateAxis, and zero or more CoordinateTransforms.

A CoordinateAxis is a subtype of Variable, and is optionally classified according to the types in AxisType.

These are the rules which restrict which Variables can be used as Coordinate Axes:

Shared Dimensions: All dimensions used by a Coordinate Axis must be shared with the data variable. When a variable is part of a Structure, the dimensions used by the parent Structure(s) are considered to be part of the nested Variable.

I have a very hard time understanding what all of this means. Can the same variable be a "Dimension" and a "CoordinateAxis" in CDM?

It seems much simpler to me to use the CF approach to describe the physical coordinates of the data using "auxiliary coordinate variables" and to keep the dimensions as purely 1D "coordinate variables".

IMO, xarray is being overly pedantic here.

What would you like xarray to do with these datasets, given the fact that orthogonality of dimensions is central to its data model?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Let's list all the netCDF files that xarray can't open 350899839
419160841 https://github.com/pydata/xarray/issues/2368#issuecomment-419160841 https://api.github.com/repos/pydata/xarray/issues/2368 MDEyOklzc3VlQ29tbWVudDQxOTE2MDg0MQ== rabernat 1197350 2018-09-06T16:37:51Z 2018-09-06T16:37:51Z MEMBER

@djhoese - it would be great if you could track down a more specific example of the issue you are referring to.

Excluding this possible problem with groups, my assessment of the feedback above is that, actually, the only problem is #2233: we can't have multidimensional variables that are also their own dimensions. This is a good thing. It means we have a specific problem to fix.

Right now this is ok: dimensions: x = 4 y = 3 variables: int x(x); int y(y); float data(y, x) But this is not dimensions: x = 4 y = 3 variables: int x(x); float y(y, x); float data(y, x)

Personally I find this to be an incredibly confusing, recursive use of the concept of "dimensions". For me, dimensions should be orthogonal. In the second example, y is a [non-dimension] coordinate, not a dimension! The actual dimension is implicit, some sort of logical y_index. I wish that CF / netCDF had never chosen to accept this as a valid schema. But I admit that perhaps my internal mental model is too wrapped up with xarray!

So the question is: what can we do about it?

I propose the following general outline: - Create a new decoding function to effectively "fix" the recursively defined dimension by renaming y(y, x) into something like y_coordinate(y, x) - Add a new option to open_dataset called decode_recursive_dimension which defaults to False - Raise a more informative error when these types of datasets are encountered which suggests calling open_dataset with decode_recursive_dimension=True

Finally, we might want to raise this upstream with netCDF or CF conventions to try to understand better why this sort of schema is being encouraged.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Let's list all the netCDF files that xarray can't open 350899839

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 3356.569ms · About: xarray-datasette