
issue_comments


7 rows where author_association = "CONTRIBUTOR" and issue = 378898407 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
440002135 https://github.com/pydata/xarray/issues/2550#issuecomment-440002135 https://api.github.com/repos/pydata/xarray/issues/2550 MDEyOklzc3VlQ29tbWVudDQ0MDAwMjEzNQ== jsignell 4806877 2018-11-19T18:53:27Z 2018-11-19T18:53:27Z CONTRIBUTOR

Having started writing a test, I now think that `encoding['source']` is backend-specific. Here it is implemented in netCDF4: https://github.com/pydata/xarray/blob/70e9eb8fc834e4aeff42c221c04c9713eb465b8a/xarray/backends/netCDF4_.py#L386 but I don't see it for pynio, for instance: https://github.com/pydata/xarray/blob/70e9eb8fc834e4aeff42c221c04c9713eb465b8a/xarray/backends/pynio_.py#L77-L81

Is this something that we want to mandate that backends provide?
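Until that question is settled, downstream code can treat `encoding['source']` as optional. A minimal sketch of that guard (the helper name and fallback value are hypothetical, not xarray API):

```python
def get_source(encoding, default=None):
    # Only some backends (e.g. netCDF4) record the file path under
    # 'source'; others (e.g. pynio) may not, so avoid a KeyError.
    return encoding.get('source', default)

# Encoding as a netCDF4-backed variable might carry it:
print(get_source({'source': './air_1.nc'}))        # → ./air_1.nc
# Encoding from a backend that never sets it:
print(get_source({}, default='<unknown source>'))  # → <unknown source>
```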

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Include filename or path in open_mfdataset 378898407
439913493 https://github.com/pydata/xarray/issues/2550#issuecomment-439913493 https://api.github.com/repos/pydata/xarray/issues/2550 MDEyOklzc3VlQ29tbWVudDQzOTkxMzQ5Mw== jsignell 4806877 2018-11-19T14:36:37Z 2018-11-19T14:36:37Z CONTRIBUTOR

Should I add a test that expects `.encoding['source']` to ensure its continued presence?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Include filename or path in open_mfdataset 378898407
439742167 https://github.com/pydata/xarray/issues/2550#issuecomment-439742167 https://api.github.com/repos/pydata/xarray/issues/2550 MDEyOklzc3VlQ29tbWVudDQzOTc0MjE2Nw== jsignell 4806877 2018-11-19T00:52:03Z 2018-11-19T00:52:03Z CONTRIBUTOR

Ah, I don't think I understood that adding `source` to `encoding` was a new addition. In the latest master (`0.11.0+3.g70e9eb8`) this works fine:

```python
def func(ds):
    var = next(var for var in ds)
    return ds.assign(path=ds[var].encoding['source'])

ds = xr.open_mfdataset(['./air_1.nc', './air_2.nc'],
                       concat_dim='path', preprocess=func)
```

I do think it is misleading, though, that after you've concatenated the data, `encoding['source']` on a concatenated variable seems to be the first path:

```python
>>> ds['air'].encoding['source']
'~/air_1.nc'
```

I'll close this one though since there is a clear way to access the filename. Thanks for the tip @jhamman!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Include filename or path in open_mfdataset 378898407
437464067 https://github.com/pydata/xarray/issues/2550#issuecomment-437464067 https://api.github.com/repos/pydata/xarray/issues/2550 MDEyOklzc3VlQ29tbWVudDQzNzQ2NDA2Nw== jsignell 4806877 2018-11-09T19:11:38Z 2018-11-09T19:11:38Z CONTRIBUTOR

A dirty fix would be to add an attribute to each dataset.

I thought @jhamman was suggesting that already exists, but I couldn't find it: https://github.com/pydata/xarray/issues/2550#issuecomment-437157299
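That "dirty fix" amounts to stamping each dataset with its origin before concatenation. A sketch using a plain dict to stand in for a dataset's `.attrs` (in xarray this would run inside the function passed to `open_mfdataset(..., preprocess=...)`; the `source_file` key name is just an illustration):

```python
def tag_source(attrs, path):
    # Equivalent in spirit to ds.attrs['source_file'] = path inside a
    # preprocess function, but operating on a plain mapping.
    tagged = dict(attrs)
    tagged['source_file'] = path
    return tagged

tagged = tag_source({'title': 'air temperature'}, './air_1.nc')
print(tagged['source_file'])  # → ./air_1.nc
```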

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Include filename or path in open_mfdataset 378898407
437433736 https://github.com/pydata/xarray/issues/2550#issuecomment-437433736 https://api.github.com/repos/pydata/xarray/issues/2550 MDEyOklzc3VlQ29tbWVudDQzNzQzMzczNg== jsignell 4806877 2018-11-09T17:29:05Z 2018-11-09T17:29:05Z CONTRIBUTOR

Maybe we can inspect the preprocess function like this:

```python
>>> preprocess = lambda a, b: print(a, b)
>>> preprocess.__code__.co_varnames
('a', 'b')
```

`co_varnames` is ordered, so the first one can always be the ds regardless of its name, and then we can look for special names (like `filename`) in the rest.

From this answer: https://stackoverflow.com/a/4051447/4021797
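The dispatch that idea implies can be sketched with the standard library's `inspect` module; `call_preprocess` and the `filename` parameter name are hypothetical here, not part of xarray's API:

```python
import inspect

def call_preprocess(preprocess, ds, filename):
    """Call `preprocess`, passing the file path only if it asks for it.

    The first parameter is always the dataset, whatever it is named;
    a later parameter literally named 'filename' opts in to the path.
    """
    params = list(inspect.signature(preprocess).parameters)
    if 'filename' in params[1:]:
        return preprocess(ds, filename=filename)
    return preprocess(ds)

# Existing one-argument preprocess functions keep working:
print(call_preprocess(lambda ds: ds.upper(), 'ds', './air_1.nc'))         # → DS
# Ones that declare `filename` receive the path as well:
print(call_preprocess(lambda d, filename: filename, 'ds', './air_1.nc'))  # → ./air_1.nc
```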

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Include filename or path in open_mfdataset 378898407
437161279 https://github.com/pydata/xarray/issues/2550#issuecomment-437161279 https://api.github.com/repos/pydata/xarray/issues/2550 MDEyOklzc3VlQ29tbWVudDQzNzE2MTI3OQ== jsignell 4806877 2018-11-08T21:24:45Z 2018-11-08T21:24:45Z CONTRIBUTOR

@jhamman that looks pretty good, but I'm not seeing the source in the encoding dict. Is this what you were expecting?

```python
def func(ds):
    var = next(var for var in ds)
    return ds.assign(path=ds[var].encoding['source'])

xr.open_mfdataset(['./ST4.2018092500.01h', './ST4.2018092501.01h'],
                  engine='pynio', concat_dim='path', preprocess=func)
```

```python-traceback
KeyError                                  Traceback (most recent call last)
<ipython-input-49-184da62ce353> in <module>()
----> 1 ds = xr.open_mfdataset(['./ST4.2018092500.01h', './ST4.2018092501.01h'], engine='pynio', concat_dim='path', preprocess=func)

/opt/conda/lib/python3.6/site-packages/xarray/backends/api.py in open_mfdataset(paths, chunks, concat_dim, compat, preprocess, engine, lock, data_vars, coords, autoclose, parallel, **kwargs)
    612     file_objs = [getattr_(ds, '_file_obj') for ds in datasets]
    613     if preprocess is not None:
--> 614         datasets = [preprocess(ds) for ds in datasets]
    615
    616     if parallel:

/opt/conda/lib/python3.6/site-packages/xarray/backends/api.py in <listcomp>(.0)
    612     file_objs = [getattr_(ds, '_file_obj') for ds in datasets]
    613     if preprocess is not None:
--> 614         datasets = [preprocess(ds) for ds in datasets]
    615
    616     if parallel:

<ipython-input-48-fd450fa1393a> in func(ds)
      1 def func(ds):
      2     var = next(var for var in ds)
----> 3     return ds.assign(path=ds[var].encoding['source'])

KeyError: 'source'
```

xarray version: '0.11.0+1.g575e97ae'

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Include filename or path in open_mfdataset 378898407
437156317 https://github.com/pydata/xarray/issues/2550#issuecomment-437156317 https://api.github.com/repos/pydata/xarray/issues/2550 MDEyOklzc3VlQ29tbWVudDQzNzE1NjMxNw== jsignell 4806877 2018-11-08T21:07:48Z 2018-11-08T21:07:48Z CONTRIBUTOR

> There is a preprocess argument. You provide a function and it is run on every file.

Yes, but the input to that function is just the ds; I couldn't figure out a way to get the filename from within a preprocess function. This is what I was doing to poke around in there:

```python
def func(ds):
    import pdb; pdb.set_trace()

xr.open_mfdataset(['./ST4.2018092500.01h', './ST4.2018092501.01h'],
                  engine='pynio', concat_dim='path', preprocess=func)
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Include filename or path in open_mfdataset 378898407

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);