html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/2436#issuecomment-610466323,https://api.github.com/repos/pydata/xarray/issues/2436,610466323,MDEyOklzc3VlQ29tbWVudDYxMDQ2NjMyMw==,16655388,2020-04-07T15:49:03Z,2020-04-07T15:49:03Z,NONE,"> unfortunately, `numpy` does not allow us to put `cftime` object into dtypes (yet!), so `ds.time.values` is a `numpy.ndarray` with dtype `object`, containing `cftime` objects. To make the code work, use it with `ds.time.values[0]`. Of course, that won't help if the array contains objects of more than one type.
>
> ```python
> >>> import cftime
> >>> isinstance(ds.time.values[0], cftime.DatetimeNoLeap)
> True
> >>> type(ds.time.values[0])
>
> ```
I use the following, which seems to work for me but I thought something shorter and more elegant could be done ...
```
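from typing import Union

import numpy as np
import xarray as xr


def get_time_date_type(ds: Union[xr.Dataset, xr.DataArray]):
    # Object dtype means the time axis holds Python objects such as cftime dates.
    if ds.time.dtype == ""O"":
        if len(ds.time.shape) == 0:
            # 0-d time coordinate: take the single value
            time0 = ds.time.item()
        else:
            time0 = ds.time[0].item()
        return type(time0)
    else:
        return np.datetime64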
```
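A shorter variant might look like the sketch below. It is only an illustration and not something provided by xarray: `first_element_type` is a hypothetical helper name, and it assumes the time values can be passed through `np.asarray` and that an object array is not empty.

```python
import numpy as np


def first_element_type(values):
    # Hypothetical helper: report the element type of an array-like.
    arr = np.asarray(values)
    if arr.dtype == object:
        # Object arrays (e.g. cftime dates) do not record the element type
        # in the dtype, so look at the first element. `.flat` also covers
        # 0-d arrays. Only the first element is checked, so arrays holding
        # more than one type are not detected.
        return type(arr.flat[0])
    # For native dtypes the scalar type is available directly,
    # e.g. np.datetime64 for datetime64[ns] arrays.
    return arr.dtype.type
```

It would be called as `first_element_type(ds.time.values)`.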
>
> In #3498, the original proposal was to name the new kwarg `master_file`, but later it was renamed to `attrs_file`. If `l_f` is a list of file paths, you used it correctly.
Yes, `l_f` is a list of file paths.
>
> Before trying to help with debugging your issue: could you post the output of `xr.show_versions()`? That would help narrowing down on whether it's a dependency issue or a bug in `xarray`.
Here is the output:
```
In [2]: xr.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.9 |Anaconda, Inc.| (default, Jul 30 2019, 19:07:31)
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-514.2.2.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: fr_CA.UTF-8
LOCALE: fr_CA.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.1
xarray: 0.15.2.dev29+g6048356
pandas: 1.0.1
numpy: 1.18.1
scipy: 1.4.1
netCDF4: 1.4.2
pydap: None
h5netcdf: None
h5py: 2.9.0
Nio: None
zarr: None
cftime: 1.0.4.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.1
dask: 2.10.1
distributed: 2.10.0
matplotlib: 3.0.2
cartopy: 0.16.0
seaborn: 0.9.0
numbagg: None
pint: 0.9
setuptools: 45.2.0.post20200210
pip: 20.0.2
conda: None
pytest: 5.3.4
IPython: 7.8.0
sphinx: 2.4.0
```
>
> Also, could you try to demonstrate your issue using a synthetic example? I've been trying to reproduce it with:
>
> ```python
> In [14]: units = 'days since 2000-02-25'
>     ...: times = cftime.num2date(np.arange(7), units=units, calendar='365_day')
>     ...: for x in range(5):
>     ...:     ds = xr.DataArray(
>     ...:         np.arange(x, 7 + x).reshape(7, 1),
>     ...:         coords={""time"": times, ""x"": [x]},
>     ...:         dims=['time', ""x""],
>     ...:         name='a',
>     ...:     ).to_dataset()
>     ...:     ds.to_netcdf(f'data-noleap{x}.nc')
>     ...: paths = sorted(glob.glob(""data-noleap*.nc""))
>     ...: with xr.open_mfdataset(paths, combine=""by_coords"") as ds:
>     ...:     print(ds.time.encoding)
>     ...:
> {'zlib': False, 'shuffle': False, 'complevel': 0, 'fletcher32': False, 'contiguous': True, 'chunksizes': None, 'source': '.../data-noleap0.nc', 'original_shape': (7,), 'dtype': dtype('int64'), 'units': 'days since 2000-02-25 00:00:00.000000', 'calendar': 'noleap'}
> ```
I used your code and it works for me too. I noticed that the synthetic file is `NETCDF4` and has an `int` time variable, while my files are `NETCDF4_CLASSIC` and their time variable is `double`. I modified the synthetic code to produce `NETCDF4_CLASSIC` files with a `double` time variable, but it does not change the results: encoding does not have any values related to the calendar.
Here is the output of `ncdump -hs` for one file, in case it helps.
```
11:41 neree ~/travail/xarray_open_mfdataset_perd_time_attributes :ncdump -hs /expl6/climato/arch/bbw/series/200001/snw_bbw_200001_se.nc
netcdf snw_bbw_200001_se {
dimensions:
height = 1 ;
rlat = 300 ;
rlon = 340 ;
time = UNLIMITED ; // (248 currently)
variables:
double height(height) ;
height:units = ""m"" ;
height:long_name = ""height"" ;
height:standard_name = ""height"" ;
height:axis = ""Z"" ;
height:positive = ""up"" ;
height:coordinate_defines = ""point"" ;
height:actual_range = 0., 0. ;
height:_Storage = ""chunked"" ;
height:_ChunkSizes = 1 ;
height:_DeflateLevel = 6 ;
height:_Endianness = ""little"" ;
double lat(rlat, rlon) ;
lat:units = ""degrees_north"" ;
lat:long_name = ""latitude"" ;
lat:standard_name = ""latitude"" ;
lat:actual_range = 7.83627367019653, 82.5695037841797 ;
lat:_Storage = ""chunked"" ;
lat:_ChunkSizes = 50, 50 ;
lat:_DeflateLevel = 6 ;
lat:_Endianness = ""little"" ;
double lon(rlat, rlon) ;
lon:units = ""degrees_east"" ;
lon:long_name = ""longitude"" ;
lon:standard_name = ""longitude"" ;
lon:actual_range = -179.972747802734, 179.975296020508 ;
lon:_Storage = ""chunked"" ;
lon:_ChunkSizes = 50, 50 ;
lon:_DeflateLevel = 6 ;
lon:_Endianness = ""little"" ;
double rlat(rlat) ;
rlat:long_name = ""latitude in rotated pole grid"" ;
rlat:units = ""degrees"" ;
rlat:standard_name = ""grid_latitude"" ;
rlat:axis = ""Y"" ;
rlat:coordinate_defines = ""point"" ;
rlat:actual_range = -30.7100009918213, 35.0699996948242 ;
rlat:_Storage = ""chunked"" ;
rlat:_ChunkSizes = 50 ;
rlat:_DeflateLevel = 6 ;
rlat:_Endianness = ""little"" ;
double rlon(rlon) ;
rlon:long_name = ""longitude in rotated pole grid"" ;
rlon:units = ""degrees"" ;
rlon:standard_name = ""grid_longitude"" ;
rlon:axis = ""X"" ;
rlon:coordinate_defines = ""point"" ;
rlon:actual_range = -33.9900054931641, 40.5899810791016 ;
rlon:_Storage = ""chunked"" ;
rlon:_ChunkSizes = 50 ;
rlon:_DeflateLevel = 6 ;
rlon:_Endianness = ""little"" ;
char rotated_pole ;
rotated_pole:grid_mapping_name = ""rotated_latitude_longitude"" ;
rotated_pole:grid_north_pole_latitude = 42.5f ;
rotated_pole:grid_north_pole_longitude = 83.f ;
rotated_pole:north_pole_grid_longitude = 0.f ;
float snw(time, rlat, rlon) ;
snw:units = ""kg m-2"" ;
snw:long_name = ""Surface Snow Amount"" ;
snw:standard_name = ""surface_snow_amount"" ;
snw:realm = ""landIce land"" ;
snw:cell_measures = ""area: areacella"" ;
snw:coordinates = ""lon lat"" ;
snw:grid_mapping = ""rotated_pole"" ;
snw:level_desc = ""Height"" ;
snw:cell_methods = ""time: point"" ;
snw:_Storage = ""chunked"" ;
snw:_ChunkSizes = 250, 50, 50 ;
snw:_DeflateLevel = 6 ;
snw:_Endianness = ""little"" ;
double time(time) ;
time:long_name = ""time"" ;
time:standard_name = ""time"" ;
time:axis = ""T"" ;
time:calendar = ""gregorian"" ;
time:units = ""days since 2000-01-01 00:00:00"" ;
time:coordinate_defines = ""point"" ;
time:_Storage = ""chunked"" ;
time:_ChunkSizes = 250 ;
time:_DeflateLevel = 6 ;
time:_Endianness = ""little"" ;
// global attributes:
:Conventions = ""CF-1.6"" ;
:contact = ""paquin.dominique@ouranos.ca"" ;
:comment = ""CRCM5 v3331 0.22 deg AMNO22d2 L56 S17-15m ERA-INTERIM 0,75d PILSPEC PS3"" ;
:creation_date = ""2016-08-15 "" ;
:experiment = ""simulation de reference "" ;
:experiment_id = ""bbw"" ;
:driving_experiment = ""ERA-INTERIM "" ;
:driving_model_id = ""ECMWF-ERAINT "" ;
:driving_model_ensemble_member = ""r1i1p1 "" ;
:driving_experiment_name = ""evaluation "" ;
:institution = ""Ouranos "" ;
:institute_id = ""Our. "" ;
:model_id = ""OURANOS-CRCM5"" ;
:rcm_version_id = ""v3331"" ;
:project_id = """" ;
:ouranos_domain_name = ""AMNO22d2 "" ;
:ouranos_run_id = ""bbw OURALIB 1.3"" ;
:product = ""output"" ;
:reference = ""http://www.ouranos.ca"" ;
:history = ""Mon Nov 7 10:13:55 2016: ncks -O --chunk_policy g3d --cnk_dmn plev,1 --cnk_dmn rlon,50 --cnk_dmn rlat,50 --cnk_dmn time,250 /localscratch/72194520.gm-1r16-n04.guillimin.clumeq.ca/bbw/bbw/200001/nc4c_snw_bbw_200001_se.nc /localscratch/72194520.gm-1r16-n04.guillimin.clumeq.ca/bbw/bbw/200001/snw_bbw_200001_se.nc\n"",
""Mon Nov 7 10:13:50 2016: ncks -O --fl_fmt=netcdf4_classic -L 6 /localscratch/72194520.gm-1r16-n04.guillimin.clumeq.ca/bbw/bbw/200001/trim_snw_bbw_200001_se.nc /localscratch/72194520.gm-1r16-n04.guillimin.clumeq.ca/bbw/bbw/200001/nc4c_snw_bbw_200001_se.nc\n"",
""Mon Nov 7 10:13:48 2016: ncks -d time,2000-01-01 00:00:00,2000-01-31 23:59:59 /home/dpaquin1/postprod/bbw/transit2/200001/snw_bbw_200001_se.nc /localscratch/72194520.gm-1r16-n04.guillimin.clumeq.ca/bbw/bbw/200001/trim_snw_bbw_200001_se.nc\n"",
""Fri Nov 4 12:49:33 2016: ncks -4 -L 1 --no_tmp_fl -u -d time,2000-01-01 00:00,2000-02-01 00:00 /localscratch/72001487.gm-1r16-n04.guillimin.clumeq.ca/I5/snw_bbw_2000_se.nc /home/dpaquin1/postprod/bbw/work/200001/snw_bbw_200001_se.nc\n"",
""Fri Nov 4 12:48:52 2016: ncks -4 -L 1 /localscratch/72001487.gm-1r16-n04.guillimin.clumeq.ca/I5/snw_bbw_2000_se.nc /home/dpaquin1/postprod/bbw/work/2000/snw_bbw_2000_se.nc\n"",
""Fri Nov 4 12:48:44 2016: ncatted -O -a cell_measures,snw,o,c,area: areacella /localscratch/72001487.gm-1r16-n04.guillimin.clumeq.ca/I5/snw_bbw_2000_se.nc 25554_bbb"" ;
:NCO = ""4.4.4"" ;
:_SuperblockVersion = 2 ;
:_IsNetcdf4 = 1 ;
:_Format = ""netCDF-4 classic model"" ;
}
```
I guess the next option could be to go into xarray code to try to find what the problem is but I would need some direction for doing this.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,363299007
https://github.com/pydata/xarray/issues/2436#issuecomment-610020749,https://api.github.com/repos/pydata/xarray/issues/2436,610020749,MDEyOklzc3VlQ29tbWVudDYxMDAyMDc0OQ==,16655388,2020-04-06T20:31:10Z,2020-04-06T20:31:10Z,NONE,"
> #3498 added a new keyword argument to `open_mfdataset`, to choose which file to load to attributes from, can you try using that?
#3498 mentions a `master_file` keyword, but `xr.open_mfdataset` does not accept it. The only similar option I see in the documentation is `attrs_file`, which defaults to the first file, and the calendar was still not returned even when I specified `attrs_file=l_f[0]`.
> If this is the case, then to solve your original problem, you could also try using the `preprocess` argument to `open_mfdataset` to store the encoding somewhere where it won't be lost? i.e.
>
> ```python
> def store_encoding(ds):
>     encoding = ds['time'].encoding
>     ds.time.attrs['calendar_encoding'] = encoding
>     return ds
>
> snw = xr.open_mfdataset(l_f, combine='nested', concat_dim='time',
>                         master_file=lf[0], preprocess=store_encoding)['snw']
> ```
I tried and it did not work ...
```
ipdb> ds = xr.open_mfdataset(l_f, combine='nested', concat_dim='time', preprocess=store_encoding)
ipdb> ds.time
array([cftime.DatetimeNoLeap(2006-01-01 00:00:00),
cftime.DatetimeNoLeap(2006-01-01 03:00:00),
cftime.DatetimeNoLeap(2006-01-01 06:00:00), ...,
cftime.DatetimeNoLeap(2006-12-31 15:00:00),
cftime.DatetimeNoLeap(2006-12-31 18:00:00),
cftime.DatetimeNoLeap(2006-12-31 21:00:00)], dtype=object)
Coordinates:
* time (time) object 2006-01-01 00:00:00 ... 2006-12-31 21:00:00
Attributes:
long_name: time
standard_name: time
axis: T
coordinate_defines: point
ipdb> ds.time.attrs
{'long_name': 'time', 'standard_name': 'time', 'axis': 'T', 'coordinate_defines': 'point'}
```
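Since `time.encoding` is populated when a single file is opened with `open_dataset`, one possible workaround is to read it from the first file and copy the relevant keys onto the combined dataset. The following is only a minimal sketch under assumptions: `copy_encoding_keys` is a hypothetical helper (not part of xarray), the default keys are the CF `units` and `calendar` entries, and it assumes that `ds['time'].encoding` can be updated in place, which I have not verified across xarray versions.

```python
def copy_encoding_keys(source_var, target_var, keys=('units', 'calendar')):
    # Hypothetical helper: copy selected encoding entries from one variable
    # to another. Both arguments only need an `.encoding` dict attribute.
    for key in keys:
        if key in source_var.encoding:
            target_var.encoding[key] = source_var.encoding[key]
    return target_var


# Intended use with xarray (not run here; l_f is the list of file paths):
# with xr.open_dataset(l_f[0]) as first:
#     ds = xr.open_mfdataset(l_f, combine='nested', concat_dim='time')
#     copy_encoding_keys(first['time'], ds['time'])
```

Keys that are missing in the source are skipped, so a file without a `calendar` entry leaves the target untouched for that key.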
> > Related question but maybe out of line, is there any way to know that the snw.time type is cftime.DatetimeNoLeap (as it is visible in the overview of snw.time)?
>
> I'm not familiar with these classes, but presumably you mean more than just checking with `isinstance()`? e.g.
>
Yes, I was thinking more of something like `type(ds.time)`, which would return `cftime.DatetimeNoLeap`.
> ```python
> from cftime import DatetimeNoLeap
> print(isinstance(snw.time.values, cftime.DatetimeNoLeap))
> ```
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,363299007
https://github.com/pydata/xarray/issues/2436#issuecomment-609998713,https://api.github.com/repos/pydata/xarray/issues/2436,609998713,MDEyOklzc3VlQ29tbWVudDYwOTk5ODcxMw==,16655388,2020-04-06T19:43:55Z,2020-04-06T19:43:55Z,NONE,"@TomNicholas Sorry, I forgot about this. I just ran a quick check with the latest xarray master and I still have the problem; see the code below.
A related question, though maybe off topic: is there any way to know that the `snw.time` type is `cftime.DatetimeNoLeap` (as is visible in the overview of `snw.time`)?
```
snw = xr.open_mfdataset(l_f, combine='nested', concat_dim='time')['snw']
ipdb> xr.__version__
'0.15.2.dev29+g6048356'
ipdb> snw.time
array([cftime.DatetimeNoLeap(2006-01-01 00:00:00),
cftime.DatetimeNoLeap(2006-01-01 03:00:00),
cftime.DatetimeNoLeap(2006-01-01 06:00:00), ...,
cftime.DatetimeNoLeap(2100-12-30 18:00:00),
cftime.DatetimeNoLeap(2100-12-30 21:00:00),
cftime.DatetimeNoLeap(2100-12-31 00:00:00)], dtype=object)
Coordinates:
* time (time) object 2006-01-01 00:00:00 ... 2100-12-31 00:00:00
Attributes:
long_name: time
standard_name: time
axis: T
coordinate_defines: point
ipdb> snw.time.encoding
{}
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,363299007
https://github.com/pydata/xarray/issues/2436#issuecomment-424436617,https://api.github.com/repos/pydata/xarray/issues/2436,424436617,MDEyOklzc3VlQ29tbWVudDQyNDQzNjYxNw==,16655388,2018-09-25T17:43:35Z,2018-09-25T17:43:35Z,NONE,@spencerkclark Yes I was looking at time.encoding. Following your example I did some tests and the problem is related to the fact that I am opening multiple netCDF files with open_mfdataset. Doing so time.encoding is empty while it is as expected when opening any of the files with open_dataset instead. ,"{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,363299007
https://github.com/pydata/xarray/issues/2436#issuecomment-424117789,https://api.github.com/repos/pydata/xarray/issues/2436,424117789,MDEyOklzc3VlQ29tbWVudDQyNDExNzc4OQ==,16655388,2018-09-24T20:42:20Z,2018-09-24T20:42:20Z,NONE,It would be ok but it is (or looks) empty when I use open_dataset() ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,363299007