html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/2550#issuecomment-440002135,https://api.github.com/repos/pydata/xarray/issues/2550,440002135,MDEyOklzc3VlQ29tbWVudDQ0MDAwMjEzNQ==,4806877,2018-11-19T18:53:27Z,2018-11-19T18:53:27Z,CONTRIBUTOR,"Having started writing a test, I now think that `encoding['source']` is backend specific. Here it is implemented in netcdf4:
https://github.com/pydata/xarray/blob/70e9eb8fc834e4aeff42c221c04c9713eb465b8a/xarray/backends/netCDF4_.py#L386

but I don't see it for pynio for instance:
https://github.com/pydata/xarray/blob/70e9eb8fc834e4aeff42c221c04c9713eb465b8a/xarray/backends/pynio_.py#L77-L81

Is this something that we want to mandate that backends provide?","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,378898407
https://github.com/pydata/xarray/issues/2550#issuecomment-439913493,https://api.github.com/repos/pydata/xarray/issues/2550,439913493,MDEyOklzc3VlQ29tbWVudDQzOTkxMzQ5Mw==,4806877,2018-11-19T14:36:37Z,2018-11-19T14:36:37Z,CONTRIBUTOR,Should I add a test that expects `.encoding['source']` to ensure its continued presence?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,378898407
https://github.com/pydata/xarray/issues/2550#issuecomment-439742167,https://api.github.com/repos/pydata/xarray/issues/2550,439742167,MDEyOklzc3VlQ29tbWVudDQzOTc0MjE2Nw==,4806877,2018-11-19T00:52:03Z,2018-11-19T00:52:03Z,CONTRIBUTOR,"Ah I don't think I understood that adding `source` to encoding was a new addition.
In latest master (`'0.11.0+3.g70e9eb8'`) this works fine:

```python
import xarray as xr

def func(ds):
    # Take the first data variable and read the file path from its encoding
    var = next(var for var in ds)
    return ds.assign(path=ds[var].encoding['source'])

ds = xr.open_mfdataset(['./air_1.nc', './air_2.nc'], concat_dim='path', preprocess=func)
```

I do think it is misleading though that after you've concatenated the data, the `encoding['source']` on a concatenated var seems to be the first path.

```python
>>> ds['air'].encoding['source']
'~/air_1.nc'
```

I'll close this one though since there is a clear way to access the filename. Thanks for the tip @jhamman!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,378898407
https://github.com/pydata/xarray/issues/2550#issuecomment-437464067,https://api.github.com/repos/pydata/xarray/issues/2550,437464067,MDEyOklzc3VlQ29tbWVudDQzNzQ2NDA2Nw==,4806877,2018-11-09T19:11:38Z,2018-11-09T19:11:38Z,CONTRIBUTOR,">A dirty fix would be to add an attribute to each dataset.

I thought @jhamman was suggesting that already exists, but I couldn't find it: https://github.com/pydata/xarray/issues/2550#issuecomment-437157299","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,378898407
https://github.com/pydata/xarray/issues/2550#issuecomment-437433736,https://api.github.com/repos/pydata/xarray/issues/2550,437433736,MDEyOklzc3VlQ29tbWVudDQzNzQzMzczNg==,4806877,2018-11-09T17:29:05Z,2018-11-09T17:29:05Z,CONTRIBUTOR,"Maybe we can inspect the `preprocess` function like this:

```python
>>> preprocess = lambda a, b: print(a, b)
>>> preprocess.__code__.co_varnames
('a', 'b')
```

This response is ordered, so the first one can always be `ds` regardless of its name and then we can look for special names (like `filename`) in the rest.
From this answer: https://stackoverflow.com/a/4051447/4021797","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,378898407
https://github.com/pydata/xarray/issues/2550#issuecomment-437161279,https://api.github.com/repos/pydata/xarray/issues/2550,437161279,MDEyOklzc3VlQ29tbWVudDQzNzE2MTI3OQ==,4806877,2018-11-08T21:24:45Z,2018-11-08T21:24:45Z,CONTRIBUTOR,"@jhamman that looks pretty good, but I'm not seeing the source in the encoding dict. Is this what you were expecting?

```python
def func(ds):
    var = next(var for var in ds)
    return ds.assign(path=ds[var].encoding['source'])

xr.open_mfdataset(['./ST4.2018092500.01h', './ST4.2018092501.01h'], engine='pynio', concat_dim='path', preprocess=func)
```

```python-traceback
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
 in ()
----> 1 ds = xr.open_mfdataset(['./ST4.2018092500.01h', './ST4.2018092501.01h'], engine='pynio', concat_dim='path', preprocess=func)

/opt/conda/lib/python3.6/site-packages/xarray/backends/api.py in open_mfdataset(paths, chunks, concat_dim, compat, preprocess, engine, lock, data_vars, coords, autoclose, parallel, **kwargs)
    612     file_objs = [getattr_(ds, '_file_obj') for ds in datasets]
    613     if preprocess is not None:
--> 614         datasets = [preprocess(ds) for ds in datasets]
    615
    616     if parallel:

/opt/conda/lib/python3.6/site-packages/xarray/backends/api.py in (.0)
    612     file_objs = [getattr_(ds, '_file_obj') for ds in datasets]
    613     if preprocess is not None:
--> 614         datasets = [preprocess(ds) for ds in datasets]
    615
    616     if parallel:

 in func(ds)
      1 def func(ds):
      2     var = next(var for var in ds)
----> 3     return ds.assign(path=ds[var].encoding['source'])

KeyError: 'source'
```

xarray version: '0.11.0+1.g575e97ae'","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,378898407
https://github.com/pydata/xarray/issues/2550#issuecomment-437156317,https://api.github.com/repos/pydata/xarray/issues/2550,437156317,MDEyOklzc3VlQ29tbWVudDQzNzE1NjMxNw==,4806877,2018-11-08T21:07:48Z,2018-11-08T21:07:48Z,CONTRIBUTOR,"> There is a preprocess argument. You provide a function and it is run on every file.

Yes, but the input to that function is just the ds; I couldn't figure out a way to get the filename from within a preprocess function. This is what I was doing to poke around in there:

```python
import xarray as xr

def func(ds):
    # Drop into the debugger to inspect each dataset as it is opened
    import pdb; pdb.set_trace()

xr.open_mfdataset(['./ST4.2018092500.01h', './ST4.2018092501.01h'], engine='pynio', concat_dim='path', preprocess=func)
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,378898407