home / github

Menu
  • GraphQL API
  • Search all tables

issue_comments

Table actions
  • GraphQL API for issue_comments

9 rows where issue = 520110180 sorted by updated_at descending

✖
✖

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: reactions, created_at (date), updated_at (date)

user 3

  • juseg 4
  • TomNicholas 4
  • dcherian 1

author_association 2

  • MEMBER 5
  • CONTRIBUTOR 4

issue 1

  • Add option to choose mfdataset attributes source. · 9 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
573325829 https://github.com/pydata/xarray/pull/3498#issuecomment-573325829 https://api.github.com/repos/pydata/xarray/issues/3498 MDEyOklzc3VlQ29tbWVudDU3MzMyNTgyOQ== TomNicholas 35968931 2020-01-11T15:21:51Z 2020-01-11T15:21:51Z MEMBER

I think there's nothing left to do here, thanks @juseg!

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add option to choose mfdataset attributes source. 520110180
565725336 https://github.com/pydata/xarray/pull/3498#issuecomment-565725336 https://api.github.com/repos/pydata/xarray/issues/3498 MDEyOklzc3VlQ29tbWVudDU2NTcyNTMzNg== TomNicholas 35968931 2019-12-14T15:04:09Z 2019-12-14T15:04:09Z MEMBER

Thanks @juseg .

in Python 3.6- the order is not guaranteed preserved.

I think for python 3.6 and above the order is preserved isn't it?

the current default 0 is not well defined in case of nested lists.

Yes, this is what I was thinking of.

the ids are essentially ND indexes that could perhaps be used...

We could do this, and that's how we would solve it in general, but I don't really think it's worth the effort/complexity.

Or should we just stick to file paths as you suggest? And leave the default as is

I think so - if we do this then users can still easily pick the attributes from the file of their choosing (solving the original issue), and if someone wants to be able to choose the attrs_file in another way later then we can worry about that then.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add option to choose mfdataset attributes source. 520110180
565714909 https://github.com/pydata/xarray/pull/3498#issuecomment-565714909 https://api.github.com/repos/pydata/xarray/issues/3498 MDEyOklzc3VlQ29tbWVudDU2NTcxNDkwOQ== juseg 1186928 2019-12-14T13:00:28Z 2019-12-14T13:00:28Z CONTRIBUTOR

@TomNicholas I've had a closer look at the code. Nested lists of file paths are processed by:

https://github.com/pydata/xarray/blob/f2b2f9f62ea0f1020262a7ff563bfe74258ffaa1/xarray/backends/api.py#L881-L882

Using the method defined in:

https://github.com/pydata/xarray/blob/f2b2f9f62ea0f1020262a7ff563bfe74258ffaa1/xarray/core/combine.py#L15-L46

In Python 3.7+ the list of paths is essentially flattened list e.g.:

```

import xarray as xr paths = [list('abc'), list('def')] list(xr.core.combine._infer_concat_order_from_positions(paths).values()) ['a', 'b', 'c', 'd', 'e', 'f'] `` Unfortunately the current code uses a dictionary which means that in Python 3.6- the order is not guaranteed preserved. This also implies that the current default0` is not well defined in case of nested lists.

https://github.com/pydata/xarray/blob/f2b2f9f62ea0f1020262a7ff563bfe74258ffaa1/xarray/backends/api.py#L900 https://github.com/pydata/xarray/blob/f2b2f9f62ea0f1020262a7ff563bfe74258ffaa1/xarray/backends/api.py#L964

On the other hands the ids are essentially ND indexes that could perhaps be used...

```

list(xr.core.combine._infer_concat_order_from_positions(paths).keys()) [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)] ```

Or should we just stick to file paths as you suggest? And leave the default as is (e.g. ambiguous for Python 3.6-)?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add option to choose mfdataset attributes source. 520110180
565622102 https://github.com/pydata/xarray/pull/3498#issuecomment-565622102 https://api.github.com/repos/pydata/xarray/issues/3498 MDEyOklzc3VlQ29tbWVudDU2NTYyMjEwMg== TomNicholas 35968931 2019-12-13T21:53:08Z 2019-12-13T21:53:08Z MEMBER

I'm not sure we should merge changes if we're unsure how they will behave in certain circumstances.

On the other hand I am eager to keep the file number option because (1) attrs_file=-1 is the behaviour that I need

If we kept just the string specifier, you could still solve the problem of preserving the history: ```python files_to_open = ['filepath1', 'filepath2']

ds = open_mfdataset(files_to_open, attrs_file=files_to_open[-1]) `` But then the option would always have clear and well-defined behaviour, even in more complex cases likecombine_by_coordsorcombine='nested'` with a >1D input file list.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add option to choose mfdataset attributes source. 520110180
565413630 https://github.com/pydata/xarray/pull/3498#issuecomment-565413630 https://api.github.com/repos/pydata/xarray/issues/3498 MDEyOklzc3VlQ29tbWVudDU2NTQxMzYzMA== TomNicholas 35968931 2019-12-13T11:48:58Z 2019-12-13T21:48:24Z MEMBER

Thanks for this @juseg. The only problem I see is that a scalar number to specify the file only makes sense if it's a 1D list, but open_mfdataset can also accept nested list-of-lists (with combine='nested'), or ignore the order of the input entirely (with combine='by_coords'). What happens if you pass a list-of-lists of datasets?

On the other hand specifying the particular filepath or object makes sense in all cases, so perhaps the easiest way to avoid ambiguity would be to restrict to that option? (The default would just be left as-is.)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add option to choose mfdataset attributes source. 520110180
565555318 https://github.com/pydata/xarray/pull/3498#issuecomment-565555318 https://api.github.com/repos/pydata/xarray/issues/3498 MDEyOklzc3VlQ29tbWVudDU2NTU1NTMxOA== juseg 1186928 2019-12-13T18:31:09Z 2019-12-13T18:31:09Z CONTRIBUTOR

@TomNicholas Thanks for bringing the discussion live again! I'm not sure what happens in those cases, but I'm confident the default behaviour is unchanged, i.e. the attributes file is 0, whatever that 0 means (see my first commit).

If this is an issue I would suggest to discuss in a separate thread, as I think it is independent from my changes. On the other hand I am eager to keep the file number option because (1) attrs_file=-1 is the behaviour that I need (to ensure that history is always preserved) and (2) attrs_file=0 is the current behaviour (again, whatever that means for nested lists).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add option to choose mfdataset attributes source. 520110180
554001471 https://github.com/pydata/xarray/pull/3498#issuecomment-554001471 https://api.github.com/repos/pydata/xarray/issues/3498 MDEyOklzc3VlQ29tbWVudDU1NDAwMTQ3MQ== juseg 1186928 2019-11-14T17:49:29Z 2019-11-14T17:49:29Z CONTRIBUTOR

This will add a new kwarg in open_mfdataset. The current name is attrs_file, defaulting to 0 (current behaviour). Or? Suggestions welcome.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add option to choose mfdataset attributes source. 520110180
552480743 https://github.com/pydata/xarray/pull/3498#issuecomment-552480743 https://api.github.com/repos/pydata/xarray/issues/3498 MDEyOklzc3VlQ29tbWVudDU1MjQ4MDc0Mw== dcherian 2448579 2019-11-11T15:04:01Z 2019-11-11T15:04:01Z MEMBER

Thanks @juseg . I've left a few comments.

I see that this is your first PR. Welcome to xarray! and thanks for contributing :clap:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add option to choose mfdataset attributes source. 520110180
552398274 https://github.com/pydata/xarray/pull/3498#issuecomment-552398274 https://api.github.com/repos/pydata/xarray/issues/3498 MDEyOklzc3VlQ29tbWVudDU1MjM5ODI3NA== juseg 1186928 2019-11-11T11:05:52Z 2019-11-11T11:05:52Z CONTRIBUTOR

I think I'm done. Can someone look at it? The master_file keyword argument is borrowed from NetCDF (see #2382 and Unidata/netcdf4-python#835), although xarray's mechanism is independant.

The default is 0, which is consistent with current xarray behaviour. When concatenating model output in time it would be more logical to use -1, i.e. preserve history of the last file.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add option to choose mfdataset attributes source. 520110180

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 32.856ms · About: xarray-datasette
  • Sort ascending
  • Sort descending
  • Facet by this
  • Hide this column
  • Show all columns
  • Show not-blank rows