html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/2313#issuecomment-1468024753,https://api.github.com/repos/pydata/xarray/issues/2313,1468024753,IC_kwDOAMm_X85XgEex,61923007,2023-03-14T12:35:00Z,2023-03-14T12:35:00Z,NONE,"I'd like to work on this, @TomNicholas. Where do I start?
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,344614881
https://github.com/pydata/xarray/issues/2313#issuecomment-1135302642,https://api.github.com/repos/pydata/xarray/issues/2313,1135302642,IC_kwDOAMm_X85Dq1fy,54370222,2022-05-24T01:31:22Z,2022-05-24T01:31:22Z,NONE,"Hello:

I need to find the maximum precipitation for each year (for example, 2007 and 2008; the datasets are [2007](https://downloads.psl.noaa.gov/Datasets/cpc_us_precip/RT/precip.V1.0.2007.nc) and [2008](https://downloads.psl.noaa.gov/Datasets/cpc_us_precip/RT/precip.V1.0.2008.nc)). I have done this with the resample method (i.e. `.resample(time='Y').max()`) after concatenating the files along the time dimension.

Following this [SO question](https://stackoverflow.com/questions/51709266/using-xarray-to-open-a-multi-file-dataset-when-both-the-files-and-dataset-have-a), I am wondering whether I can use `preprocess` to compute the maximum (or minimum or average) of each file first and then concatenate the results along the time dimension. I tried the following code without success. Can someone help me with this?

```
import dask.array as da
import numpy as np
import xarray as xr

from dask.distributed import Client
client = Client()
client

def preprocess_func(ds):
    '''Get maximum (or minimum or average) from each file and concatenate along time'''
    return ds.precip.max('time')

prec_ds = xr.open_mfdataset([prec_2007, prec_2008],
                            chunks={""lat"": 25, ""lon"": 25, ""time"": -1},
                            preprocess=preprocess_func,
                            concat_dim='time')
```
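
One possible direction, sketched under assumptions: `preprocess` is applied to each file's Dataset before the pieces are combined, and `ds.precip.max('time')` returns a DataArray that no longer has a `time` dimension to concatenate along. The sketch below keeps a Dataset and restores a length-1 `time` axis after the reduction. It uses small in-memory datasets built by a hypothetical `fake_year` helper in place of the real files, and `xr.concat` in place of the combine step; `yearly_max` is an illustrative name, and labelling each year by its first timestamp is an arbitrary choice.

```python
import numpy as np
import pandas as pd
import xarray as xr


def yearly_max(ds):
    '''Reduce one single-year Dataset over time, keeping a length-1 time axis.'''
    # Assumption: label the reduced slice with the first timestamp of the file.
    first_stamp = ds['time'].values[:1]
    return ds.max('time').expand_dims(time=first_stamp)


def fake_year(year, seed):
    '''Hypothetical stand-in for one yearly file (daily data on a 3 x 4 grid).'''
    rng = np.random.default_rng(seed)
    time = pd.date_range(f'{year}-01-01', f'{year}-12-31', freq='D')
    data = rng.random((time.size, 3, 4))
    return xr.Dataset(
        {'precip': (('time', 'lat', 'lon'), data)},
        coords={'time': time, 'lat': np.arange(3), 'lon': np.arange(4)},
    )


# Apply the per-file reduction, then concatenate the results along time.
years = [fake_year(2007, 0), fake_year(2008, 1)]
per_file = [yearly_max(ds) for ds in years]
combined = xr.concat(per_file, dim='time')
print(combined)
```

With the real files, the same function could presumably be passed as `preprocess=yearly_max` to `xr.open_mfdataset`, together with `combine='nested'` and `concat_dim='time'`; this has not been checked against the CPC files.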
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,344614881
https://github.com/pydata/xarray/issues/2313#issuecomment-1062761948,https://api.github.com/repos/pydata/xarray/issues/2313,1062761948,IC_kwDOAMm_X84_WHXc,30007270,2022-03-09T10:13:09Z,2022-03-09T10:13:09Z,NONE,"Seconding @dcherian's comment in #4901 on an example for `.encoding['source']`. Working off @raybellwaves' example, something like this would have been useful to me:

```
>>> import xarray as xr
>>> import numpy as np
>>> model1 = xr.DataArray(np.arange(2), coords=[np.arange(2)], name=""f"")
>>> model1.to_dataset().to_netcdf(""model1.nc"")
>>> model2 = xr.DataArray(np.arange(2), coords=[np.arange(2)], name=""f"")
>>> model2.to_dataset().to_netcdf(""model2.nc"")
>>> ds = xr.open_mfdataset(
...     [""model1.nc"", ""model2.nc""],
...     preprocess=lambda ds: ds.expand_dims(
...         {""model_name"": [ds.encoding[""source""].split(""/"")[-1].split(""."")[0]]}
...     ),
... )
>>> ds
<xarray.Dataset>
Dimensions:     (dim_0: 2, model_name: 2)
Coordinates:
  * dim_0       (dim_0) int64 0 1
  * model_name  (model_name) object 'model1' 'model2'
Data variables:
    f           (model_name, dim_0) int64 dask.array<chunksize=(1, 2), meta=np.ndarray>
```

On that note, the example above seems to work with some slight changes:
```
>>> import numpy as np
>>> import xarray as xr
>>> 
>>> f1 = xr.DataArray(np.arange(2), coords=[np.arange(2)], dims=[""a""], name=""f1"")
>>> f1 = f1.assign_coords(t='t0')
>>> f1.to_dataset().to_netcdf(""f1.nc"")
>>> 
>>> f2 = xr.DataArray(np.arange(2), coords=[np.arange(2)], dims=[""a""], name=""f2"")
>>> f2 = f2.assign_coords(t='t1')
>>> f2.to_dataset().to_netcdf(""f2.nc"")
>>> 
>>> # Concat along t
>>> def preprocess(ds):
...     return ds.expand_dims(""t"")
... 
>>> 
>>> ds = xr.open_mfdataset([""f1.nc"", ""f2.nc""], concat_dim=""t"", preprocess=preprocess)
>>> ds
<xarray.Dataset>
Dimensions:  (a: 2, t: 2)
Coordinates:
  * t        (t) object 't0' 't1'
  * a        (a) int64 0 1
Data variables:
    f1       (t, a) float64 dask.array<chunksize=(2, 2), meta=np.ndarray>
    f2       (t, a) float64 dask.array<chunksize=(2, 2), meta=np.ndarray>
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,344614881