issue_comments

6 rows where author_association = "NONE" and issue = 363299007 sorted by updated_at descending




id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
888204512 https://github.com/pydata/xarray/issues/2436#issuecomment-888204512 https://api.github.com/repos/pydata/xarray/issues/2436 IC_kwDOAMm_X8408Ozg corentincarton 15659891 2021-07-28T10:35:42Z 2021-07-28T10:35:42Z NONE

Any update on this issue? I'm working on code where I want to make sure I have consistent calendars across all my inputs. Couldn't we add an option to use the encoding from the first file in the list, or something similar?
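The proposal above can be sketched in-memory (a hedged sketch, not an xarray feature; `concat_time_keep_encoding` is a hypothetical helper, and it assumes all inputs share one calendar):

```python
import xarray as xr

def concat_time_keep_encoding(datasets):
    """Concatenate datasets along 'time', reattaching the first dataset's
    time encoding (units/calendar), which concatenation otherwise drops."""
    time_encoding = dict(datasets[0]["time"].encoding)
    combined = xr.concat(datasets, dim="time")
    combined["time"].encoding = time_encoding
    return combined
```

With files on disk, the same idea works by reading the first file's `ds['time'].encoding` via `open_dataset` and assigning it onto the `open_mfdataset` result.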

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  save "encoding" when using open_mfdataset 363299007
610466323 https://github.com/pydata/xarray/issues/2436#issuecomment-610466323 https://api.github.com/repos/pydata/xarray/issues/2436 MDEyOklzc3VlQ29tbWVudDYxMDQ2NjMyMw== sbiner 16655388 2020-04-07T15:49:03Z 2020-04-07T15:49:03Z NONE

Unfortunately, numpy does not allow us to put cftime objects into dtypes (yet!), so ds.time.values is a numpy.ndarray with dtype object, containing cftime objects. To make the code work, use ds.time.values[0]. Of course, that won't help if the array contains objects of more than one type.

```python
>>> import cftime
>>> isinstance(ds.time.values[0], cftime.DatetimeNoLeap)
True
>>> type(ds.time.values[0])
<class 'cftime._cftime.DatetimeNoLeap'>
```

I use the following, which seems to work for me, but I thought something shorter and more elegant could be done ...

```python
def get_time_date_type(ds: Union[xr.Dataset, xr.DataArray]):
    if ds.time.dtype == "O":
        if len(ds.time.shape) == 0:
            time0 = ds.time.item()
        else:
            time0 = ds.time[0].item()
        return type(time0)
    else:
        return np.datetime64
```
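The dtype=object behaviour described above can be demonstrated without cftime at all (`DummyTime` below is a stand-in class, purely for illustration):

```python
import numpy as np

# DummyTime stands in for a cftime datetime class (illustration only).
class DummyTime:
    pass

# An object-dtype array: the array itself is an ndarray, so isinstance()
# checks must be made on an element, not on the array.
arr = np.array([DummyTime(), DummyTime()], dtype=object)
elem_type = type(arr[0])
```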

In #3498, the original proposal was to name the new kwarg master_file, but it was later renamed to attrs_file. If l_f is a list of file paths, you used it correctly.

Yes, l_f is a list of file paths.

Before trying to help with debugging your issue: could you post the output of xr.show_versions()? That would help narrow down whether it's a dependency issue or a bug in xarray.

Here is the output:

```
In [2]: xr.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.9 |Anaconda, Inc.| (default, Jul 30 2019, 19:07:31) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-514.2.2.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: fr_CA.UTF-8
LOCALE: fr_CA.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.1

xarray: 0.15.2.dev29+g6048356
pandas: 1.0.1
numpy: 1.18.1
scipy: 1.4.1
netCDF4: 1.4.2
pydap: None
h5netcdf: None
h5py: 2.9.0
Nio: None
zarr: None
cftime: 1.0.4.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.1
dask: 2.10.1
distributed: 2.10.0
matplotlib: 3.0.2
cartopy: 0.16.0
seaborn: 0.9.0
numbagg: None
pint: 0.9
setuptools: 45.2.0.post20200210
pip: 20.0.2
conda: None
pytest: 5.3.4
IPython: 7.8.0
sphinx: 2.4.0
```

Also, could you try to demonstrate your issue using a synthetic example? I've been trying to reproduce it with:

```python
In [14]: units = 'days since 2000-02-25'
    ...: times = cftime.num2date(np.arange(7), units=units, calendar='365_day')
    ...: for x in range(5):
    ...:     ds = xr.DataArray(
    ...:         np.arange(x, 7 + x).reshape(7, 1),
    ...:         coords={"time": times, "x": [x]},
    ...:         dims=['time', "x"],
    ...:         name='a',
    ...:     ).to_dataset()
    ...:     ds.to_netcdf(f'data-noleap{x}.nc')
    ...: paths = sorted(glob.glob("data-noleap*.nc"))
    ...: with xr.open_mfdataset(paths, combine="by_coords") as ds:
    ...:     print(ds.time.encoding)
{'zlib': False, 'shuffle': False, 'complevel': 0, 'fletcher32': False, 'contiguous': True, 'chunksizes': None, 'source': '.../data-noleap0.nc', 'original_shape': (7,), 'dtype': dtype('int64'), 'units': 'days since 2000-02-25 00:00:00.000000', 'calendar': 'noleap'}
```

I used your code and it works for me also. I noticed the synthetic file is NETCDF4 and has an int time variable, while my files are NETCDF4_CLASSIC and the time variable is double. I modified the synthetic code to produce NETCDF4_CLASSIC files with a double time variable, but it does not change the results: the encoding does not have any values related to the calendar.
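The modification described above (NETCDF4_CLASSIC files with a double time variable) can be expressed through to_netcdf arguments; a sketch, assuming the netCDF4 backend is available (`netcdf_kwargs` is just an illustrative name):

```python
# Illustrative kwargs only; writing the file itself requires the netCDF4 backend.
netcdf_kwargs = dict(
    format="NETCDF4_CLASSIC",                 # classic-model netCDF-4 file
    encoding={"time": {"dtype": "float64"}},  # store time as double
)
# ds.to_netcdf("data-noleap0.nc", **netcdf_kwargs)
```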

Here is the output of ncdump -hs for one file; maybe it could help.

```
11:41 neree ~/travail/xarray_open_mfdataset_perd_time_attributes :ncdump -hs /expl6/climato/arch/bbw/series/200001/snw_bbw_200001_se.nc
netcdf snw_bbw_200001_se {
dimensions:
	height = 1 ;
	rlat = 300 ;
	rlon = 340 ;
	time = UNLIMITED ; // (248 currently)
variables:
	double height(height) ;
		height:units = "m" ;
		height:long_name = "height" ;
		height:standard_name = "height" ;
		height:axis = "Z" ;
		height:positive = "up" ;
		height:coordinate_defines = "point" ;
		height:actual_range = 0., 0. ;
		height:_Storage = "chunked" ;
		height:_ChunkSizes = 1 ;
		height:_DeflateLevel = 6 ;
		height:_Endianness = "little" ;
	double lat(rlat, rlon) ;
		lat:units = "degrees_north" ;
		lat:long_name = "latitude" ;
		lat:standard_name = "latitude" ;
		lat:actual_range = 7.83627367019653, 82.5695037841797 ;
		lat:_Storage = "chunked" ;
		lat:_ChunkSizes = 50, 50 ;
		lat:_DeflateLevel = 6 ;
		lat:_Endianness = "little" ;
	double lon(rlat, rlon) ;
		lon:units = "degrees_east" ;
		lon:long_name = "longitude" ;
		lon:standard_name = "longitude" ;
		lon:actual_range = -179.972747802734, 179.975296020508 ;
		lon:_Storage = "chunked" ;
		lon:_ChunkSizes = 50, 50 ;
		lon:_DeflateLevel = 6 ;
		lon:_Endianness = "little" ;
	double rlat(rlat) ;
		rlat:long_name = "latitude in rotated pole grid" ;
		rlat:units = "degrees" ;
		rlat:standard_name = "grid_latitude" ;
		rlat:axis = "Y" ;
		rlat:coordinate_defines = "point" ;
		rlat:actual_range = -30.7100009918213, 35.0699996948242 ;
		rlat:_Storage = "chunked" ;
		rlat:_ChunkSizes = 50 ;
		rlat:_DeflateLevel = 6 ;
		rlat:_Endianness = "little" ;
	double rlon(rlon) ;
		rlon:long_name = "longitude in rotated pole grid" ;
		rlon:units = "degrees" ;
		rlon:standard_name = "grid_longitude" ;
		rlon:axis = "X" ;
		rlon:coordinate_defines = "point" ;
		rlon:actual_range = -33.9900054931641, 40.5899810791016 ;
		rlon:_Storage = "chunked" ;
		rlon:_ChunkSizes = 50 ;
		rlon:_DeflateLevel = 6 ;
		rlon:_Endianness = "little" ;
	char rotated_pole ;
		rotated_pole:grid_mapping_name = "rotated_latitude_longitude" ;
		rotated_pole:grid_north_pole_latitude = 42.5f ;
		rotated_pole:grid_north_pole_longitude = 83.f ;
		rotated_pole:north_pole_grid_longitude = 0.f ;
	float snw(time, rlat, rlon) ;
		snw:units = "kg m-2" ;
		snw:long_name = "Surface Snow Amount" ;
		snw:standard_name = "surface_snow_amount" ;
		snw:realm = "landIce land" ;
		snw:cell_measures = "area: areacella" ;
		snw:coordinates = "lon lat" ;
		snw:grid_mapping = "rotated_pole" ;
		snw:level_desc = "Height" ;
		snw:cell_methods = "time: point" ;
		snw:_Storage = "chunked" ;
		snw:_ChunkSizes = 250, 50, 50 ;
		snw:_DeflateLevel = 6 ;
		snw:_Endianness = "little" ;
	double time(time) ;
		time:long_name = "time" ;
		time:standard_name = "time" ;
		time:axis = "T" ;
		time:calendar = "gregorian" ;
		time:units = "days since 2000-01-01 00:00:00" ;
		time:coordinate_defines = "point" ;
		time:_Storage = "chunked" ;
		time:_ChunkSizes = 250 ;
		time:_DeflateLevel = 6 ;
		time:_Endianness = "little" ;

// global attributes:
		:Conventions = "CF-1.6" ;
		:contact = "paquin.dominique@ouranos.ca" ;
		:comment = "CRCM5 v3331 0.22 deg AMNO22d2 L56 S17-15m ERA-INTERIM 0,75d PILSPEC PS3" ;
		:creation_date = "2016-08-15 " ;
		:experiment = "simulation de reference " ;
		:experiment_id = "bbw" ;
		:driving_experiment = "ERA-INTERIM " ;
		:driving_model_id = "ECMWF-ERAINT " ;
		:driving_model_ensemble_member = "r1i1p1 " ;
		:driving_experiment_name = "evaluation " ;
		:institution = "Ouranos " ;
		:institute_id = "Our. " ;
		:model_id = "OURANOS-CRCM5" ;
		:rcm_version_id = "v3331" ;
		:project_id = "" ;
		:ouranos_domain_name = "AMNO22d2 " ;
		:ouranos_run_id = "bbw OURALIB 1.3" ;
		:product = "output" ;
		:reference = "http://www.ouranos.ca" ;
		:history = "Mon Nov 7 10:13:55 2016: ncks -O --chunk_policy g3d --cnk_dmn plev,1 --cnk_dmn rlon,50 --cnk_dmn rlat,50 --cnk_dmn time,250 /localscratch/72194520.gm-1r16-n04.guillimin.clumeq.ca/bbw/bbw/200001/nc4c_snw_bbw_200001_se.nc /localscratch/72194520.gm-1r16-n04.guillimin.clumeq.ca/bbw/bbw/200001/snw_bbw_200001_se.nc\n",
			"Mon Nov 7 10:13:50 2016: ncks -O --fl_fmt=netcdf4_classic -L 6 /localscratch/72194520.gm-1r16-n04.guillimin.clumeq.ca/bbw/bbw/200001/trim_snw_bbw_200001_se.nc /localscratch/72194520.gm-1r16-n04.guillimin.clumeq.ca/bbw/bbw/200001/nc4c_snw_bbw_200001_se.nc\n",
			"Mon Nov 7 10:13:48 2016: ncks -d time,2000-01-01 00:00:00,2000-01-31 23:59:59 /home/dpaquin1/postprod/bbw/transit2/200001/snw_bbw_200001_se.nc /localscratch/72194520.gm-1r16-n04.guillimin.clumeq.ca/bbw/bbw/200001/trim_snw_bbw_200001_se.nc\n",
			"Fri Nov 4 12:49:33 2016: ncks -4 -L 1 --no_tmp_fl -u -d time,2000-01-01 00:00,2000-02-01 00:00 /localscratch/72001487.gm-1r16-n04.guillimin.clumeq.ca/I5/snw_bbw_2000_se.nc /home/dpaquin1/postprod/bbw/work/200001/snw_bbw_200001_se.nc\n",
			"Fri Nov 4 12:48:52 2016: ncks -4 -L 1 /localscratch/72001487.gm-1r16-n04.guillimin.clumeq.ca/I5/snw_bbw_2000_se.nc /home/dpaquin1/postprod/bbw/work/2000/snw_bbw_2000_se.nc\n",
			"Fri Nov 4 12:48:44 2016: ncatted -O -a cell_measures,snw,o,c,area: areacella /localscratch/72001487.gm-1r16-n04.guillimin.clumeq.ca/I5/snw_bbw_2000_se.nc 25554_bbb" ;
		:NCO = "4.4.4" ;
		:_SuperblockVersion = 2 ;
		:_IsNetcdf4 = 1 ;
		:_Format = "netCDF-4 classic model" ;
}
```

I guess the next option could be to go into the xarray code to try to find what the problem is, but I would need some direction for doing this.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
610020749 https://github.com/pydata/xarray/issues/2436#issuecomment-610020749 https://api.github.com/repos/pydata/xarray/issues/2436 MDEyOklzc3VlQ29tbWVudDYxMDAyMDc0OQ== sbiner 16655388 2020-04-06T20:31:10Z 2020-04-06T20:31:10Z NONE

#3498 added a new keyword argument to open_mfdataset to choose which file to load the attributes from; can you try using that?

#3498 says something about a master_file keyword, but xr.open_mfdataset does not accept it, and I do not see anything else similar in the documentation except attrs_file. That one is the first file by default, and it did not return the calendar even when I specified attrs_file=l_f[0].

If this is the case, then to solve your original problem you could also try using the preprocess argument to open_mfdataset to store the encoding somewhere it won't be lost, e.g.

```python
def store_encoding(ds):
    encoding = ds['time'].encoding
    ds.time.attrs['calendar_encoding'] = encoding
    return ds

snw = xr.open_mfdataset(l_f, combine='nested', concat_dim='time',
                        attrs_file=l_f[0], preprocess=store_encoding)['snw']
```
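A hedged variant of the same idea, in case the encoding stored in attrs gets dropped during combination (attrs that differ between files can be discarded): stash it in a plain dict keyed by source path instead. `time_encodings` and the `"<in-memory>"` fallback are illustrative choices, not xarray API:

```python
import xarray as xr

# Module-level side channel for per-file time encodings (illustrative name).
time_encodings = {}

def store_encoding(ds):
    # xarray records the originating file path in ds.encoding["source"]
    # when the dataset was opened from disk.
    source = ds.encoding.get("source", "<in-memory>")
    time_encodings[source] = dict(ds["time"].encoding)
    return ds
```

Passing this as `preprocess=store_encoding` to open_mfdataset would leave each file's time encoding available in `time_encodings` after the combined dataset is built.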

I tried and it did not work ...

```
ipdb> ds = xr.open_mfdataset(l_f, combine='nested', concat_dim='time', preprocess=store_encoding)
ipdb> ds.time
<xarray.DataArray 'time' (time: 2920)>
array([cftime.DatetimeNoLeap(2006-01-01 00:00:00),
       cftime.DatetimeNoLeap(2006-01-01 03:00:00),
       cftime.DatetimeNoLeap(2006-01-01 06:00:00), ...,
       cftime.DatetimeNoLeap(2006-12-31 15:00:00),
       cftime.DatetimeNoLeap(2006-12-31 18:00:00),
       cftime.DatetimeNoLeap(2006-12-31 21:00:00)], dtype=object)
Coordinates:
  * time     (time) object 2006-01-01 00:00:00 ... 2006-12-31 21:00:00
Attributes:
    long_name:           time
    standard_name:       time
    axis:                T
    coordinate_defines:  point
ipdb> ds.time.attrs
{'long_name': 'time', 'standard_name': 'time', 'axis': 'T', 'coordinate_defines': 'point'}
```

Related question but maybe out of line, is there any way to know that the snw.time type is cftime.DatetimeNoLeap (as it is visible in the overview of snw.time)?

I'm not familiar with these classes, but presumably you mean more than just checking with isinstance()? e.g.

Yes, I was thinking more of something like type(ds.time), which would return cftime.DatetimeNoLeap

```python
import cftime

# snw.time.values is an object-dtype ndarray, so check an element:
print(isinstance(snw.time.values[0], cftime.DatetimeNoLeap))
```
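A hedged helper along these lines (`time_value_type` is a hypothetical name): since cftime-backed time coordinates have dtype=object, the element type must be read from an element rather than from the array itself.

```python
import numpy as np

def time_value_type(time_values):
    """Return the element type of a time coordinate's values array."""
    arr = np.asarray(time_values)
    if arr.dtype == object:
        return type(arr.flat[0])   # e.g. cftime.DatetimeNoLeap
    return arr.dtype.type          # e.g. numpy.datetime64
```

For the dataset in this thread, `time_value_type(snw.time.values)` would presumably return cftime.DatetimeNoLeap.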

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
609998713 https://github.com/pydata/xarray/issues/2436#issuecomment-609998713 https://api.github.com/repos/pydata/xarray/issues/2436 MDEyOklzc3VlQ29tbWVudDYwOTk5ODcxMw== sbiner 16655388 2020-04-06T19:43:55Z 2020-04-06T19:43:55Z NONE

@TomNicholas I forgot about this, sorry. I just made a quick check with the latest xarray master and I still have the problem ... see the code below.

Related question but maybe out of line, is there any way to know that the snw.time type is cftime.DatetimeNoLeap (as it is visible in the overview of snw.time)?

```
snw = xr.open_mfdataset(l_f, combine='nested', concat_dim='time')['snw']
ipdb> xr.__version__
'0.15.2.dev29+g6048356'
ipdb> snw.time
<xarray.DataArray 'time' (time: 277393)>
array([cftime.DatetimeNoLeap(2006-01-01 00:00:00),
       cftime.DatetimeNoLeap(2006-01-01 03:00:00),
       cftime.DatetimeNoLeap(2006-01-01 06:00:00), ...,
       cftime.DatetimeNoLeap(2100-12-30 18:00:00),
       cftime.DatetimeNoLeap(2100-12-30 21:00:00),
       cftime.DatetimeNoLeap(2100-12-31 00:00:00)], dtype=object)
Coordinates:
  * time     (time) object 2006-01-01 00:00:00 ... 2100-12-31 00:00:00
Attributes:
    long_name:           time
    standard_name:       time
    axis:                T
    coordinate_defines:  point
ipdb> snw.time.encoding
{}
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
424436617 https://github.com/pydata/xarray/issues/2436#issuecomment-424436617 https://api.github.com/repos/pydata/xarray/issues/2436 MDEyOklzc3VlQ29tbWVudDQyNDQzNjYxNw== sbiner 16655388 2018-09-25T17:43:35Z 2018-09-25T17:43:35Z NONE

@spencerkclark Yes, I was looking at time.encoding. Following your example I did some tests, and the problem is related to the fact that I am opening multiple netCDF files with open_mfdataset: when I do, time.encoding is empty, while it is as expected when opening any of the files with open_dataset instead.

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
424117789 https://github.com/pydata/xarray/issues/2436#issuecomment-424117789 https://api.github.com/repos/pydata/xarray/issues/2436 MDEyOklzc3VlQ29tbWVudDQyNDExNzc4OQ== sbiner 16655388 2018-09-24T20:42:20Z 2018-09-24T20:42:20Z NONE

It would be OK, but it is (or looks) empty when I use open_dataset()

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 14.082ms · About: xarray-datasette