html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/2550#issuecomment-440002135,https://api.github.com/repos/pydata/xarray/issues/2550,440002135,MDEyOklzc3VlQ29tbWVudDQ0MDAwMjEzNQ==,4806877,2018-11-19T18:53:27Z,2018-11-19T18:53:27Z,CONTRIBUTOR,"Having started writing a test, I now think that `encoding['source']` is backend specific. Here it is implemented in netcdf4: https://github.com/pydata/xarray/blob/70e9eb8fc834e4aeff42c221c04c9713eb465b8a/xarray/backends/netCDF4_.py#L386 but I don't see it for pynio for instance: https://github.com/pydata/xarray/blob/70e9eb8fc834e4aeff42c221c04c9713eb465b8a/xarray/backends/pynio_.py#L77-L81
Is this something that we want to mandate that backends provide? ","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,378898407
https://github.com/pydata/xarray/issues/2550#issuecomment-439913493,https://api.github.com/repos/pydata/xarray/issues/2550,439913493,MDEyOklzc3VlQ29tbWVudDQzOTkxMzQ5Mw==,4806877,2018-11-19T14:36:37Z,2018-11-19T14:36:37Z,CONTRIBUTOR,Should I add a test that expects `.encoding['source']` to ensure its continued presence?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,378898407
https://github.com/pydata/xarray/issues/2550#issuecomment-439742167,https://api.github.com/repos/pydata/xarray/issues/2550,439742167,MDEyOklzc3VlQ29tbWVudDQzOTc0MjE2Nw==,4806877,2018-11-19T00:52:03Z,2018-11-19T00:52:03Z,CONTRIBUTOR,"Ah, I don't think I understood that adding `source` to encoding was a new addition. In the latest master (`'0.11.0+3.g70e9eb8'`) this works fine:
```python
import xarray as xr

def func(ds):
    # Take the first data variable and read the file path from its encoding.
    var = next(var for var in ds)
    return ds.assign(path=ds[var].encoding['source'])

ds = xr.open_mfdataset(['./air_1.nc', './air_2.nc'], concat_dim='path', preprocess=func)
```
I do think it is misleading though that after you've concatenated the data, the `encoding['source']` on a concatenated var seems to be the first path.
```python
>>> ds['air'].encoding['source']
'~/air_1.nc'
```
I'll close this one though since there is a clear way to access the filename. Thanks for the tip @jhamman!
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,378898407
https://github.com/pydata/xarray/issues/2550#issuecomment-437464067,https://api.github.com/repos/pydata/xarray/issues/2550,437464067,MDEyOklzc3VlQ29tbWVudDQzNzQ2NDA2Nw==,4806877,2018-11-09T19:11:38Z,2018-11-09T19:11:38Z,CONTRIBUTOR,">A dirty fix would be to add an attribute to each dataset.
I thought @jhamman was suggesting that already exists, but I couldn't find it: https://github.com/pydata/xarray/issues/2550#issuecomment-437157299","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,378898407
https://github.com/pydata/xarray/issues/2550#issuecomment-437433736,https://api.github.com/repos/pydata/xarray/issues/2550,437433736,MDEyOklzc3VlQ29tbWVudDQzNzQzMzczNg==,4806877,2018-11-09T17:29:05Z,2018-11-09T17:29:05Z,CONTRIBUTOR,"Maybe we can inspect the `preprocess` function like this:
```python
>>> preprocess = lambda a, b: print(a, b)
>>> preprocess.__code__.co_varnames
('a', 'b')
```
This response is ordered, so the first one can always be `ds` regardless of its name and then we can look for special names (like `filename`) in the rest.
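A rough sketch of that idea, using `inspect.signature` rather than `__code__.co_varnames` (the latter also lists local variables, not just arguments). The helper name `call_preprocess` and the special `filename` keyword are hypothetical here, not anything xarray provides:

```python
import inspect


def call_preprocess(preprocess, ds, filename):
    # Hypothetical helper, not part of xarray: decide how to call the
    # user-supplied function based on the names of its parameters.
    params = list(inspect.signature(preprocess).parameters)
    extra = {}
    # The first parameter always receives the dataset, whatever it is called.
    # Any later parameter named 'filename' is treated as a special name.
    if 'filename' in params[1:]:
        extra['filename'] = filename
    return preprocess(ds, **extra)


def with_name(ds, filename):
    return (ds, filename)


def without_name(ds):
    return ds


# Strings stand in for real datasets to keep the sketch dependency-free.
print(call_preprocess(with_name, 'dataset', './air_1.nc'))
print(call_preprocess(without_name, 'dataset', './air_1.nc'))
```

Existing single-argument `preprocess` functions would keep working unchanged under this scheme, since the extra keyword is only passed when the function asks for it by name.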
From this answer: https://stackoverflow.com/a/4051447/4021797","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,378898407
https://github.com/pydata/xarray/issues/2550#issuecomment-437161279,https://api.github.com/repos/pydata/xarray/issues/2550,437161279,MDEyOklzc3VlQ29tbWVudDQzNzE2MTI3OQ==,4806877,2018-11-08T21:24:45Z,2018-11-08T21:24:45Z,CONTRIBUTOR,"@jhamman that looks pretty good, but I'm not seeing the source in the encoding dict. Is this what you were expecting?
```python
import xarray as xr

def func(ds):
    # Take the first data variable and read the file path from its encoding.
    var = next(var for var in ds)
    return ds.assign(path=ds[var].encoding['source'])

xr.open_mfdataset(['./ST4.2018092500.01h', './ST4.2018092501.01h'],
                  engine='pynio', concat_dim='path', preprocess=func)
```
```python-traceback
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
in ()
----> 1 ds = xr.open_mfdataset(['./ST4.2018092500.01h', './ST4.2018092501.01h'], engine='pynio', concat_dim='path', preprocess=func)
/opt/conda/lib/python3.6/site-packages/xarray/backends/api.py in open_mfdataset(paths, chunks, concat_dim, compat, preprocess, engine, lock, data_vars, coords, autoclose, parallel, **kwargs)
612 file_objs = [getattr_(ds, '_file_obj') for ds in datasets]
613 if preprocess is not None:
--> 614 datasets = [preprocess(ds) for ds in datasets]
615
616 if parallel:
/opt/conda/lib/python3.6/site-packages/xarray/backends/api.py in (.0)
612 file_objs = [getattr_(ds, '_file_obj') for ds in datasets]
613 if preprocess is not None:
--> 614 datasets = [preprocess(ds) for ds in datasets]
615
616 if parallel:
in func(ds)
1 def func(ds):
2 var = next(var for var in ds)
----> 3 return ds.assign(path=ds[var].encoding['source'])
KeyError: 'source'
```
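In case `source` simply isn't set for this backend, a defensive variant of `func` could fall back to a placeholder instead of raising. This is only a sketch: `add_path` and its `fallback` argument are made-up names, and the in-memory dataset below just stands in for a file opened by a backend.

```python
import numpy as np
import xarray as xr


def add_path(ds, fallback='unknown'):
    # Hypothetical variant of func: same idea, but tolerant of a missing key.
    var = next(iter(ds.data_vars))
    # encoding is a plain dict, so .get avoids the KeyError seen above.
    source = ds[var].encoding.get('source', fallback)
    return ds.assign(path=source)


# In-memory stand-in for a dataset whose backend did not record a source.
ds = xr.Dataset({'air': ('x', np.arange(3))})
print(add_path(ds)['path'].item())

# Simulate a backend that does record the source path.
ds['air'].encoding['source'] = './air_1.nc'
print(add_path(ds)['path'].item())
```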
xarray version: '0.11.0+1.g575e97ae'","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,378898407
https://github.com/pydata/xarray/issues/2550#issuecomment-437156317,https://api.github.com/repos/pydata/xarray/issues/2550,437156317,MDEyOklzc3VlQ29tbWVudDQzNzE1NjMxNw==,4806877,2018-11-08T21:07:48Z,2018-11-08T21:07:48Z,CONTRIBUTOR,"> There is a preprocess argument. You provide a function and it is run on every file.
Yes, but the input to that function is just the ds; I couldn't figure out a way to get the filename from within a preprocess function. This is what I was doing to poke around in there:
```python
import xarray as xr

def func(ds):
    # Drop into the debugger to inspect what each per-file dataset carries.
    import pdb; pdb.set_trace()

xr.open_mfdataset(['./ST4.2018092500.01h', './ST4.2018092501.01h'],
                  engine='pynio', concat_dim='path', preprocess=func)
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,378898407