issue_comments

32 rows where user = 6063709 sorted by updated_at descending

issue 17

  • decode_cf called on mfdataset throws error: 'Array' object has no attribute 'tolist' 4
  • xindexes set incorrectly for mfdataset with dask client and parallel=True 4
  • Query about concat 3
  • holoviews / bokeh doesn't like cftime coords 3
  • BUG: Fixes GH3215 3
  • Adding resample functionality to CFTimeIndex 2
  • Implement shift for CFTimeIndex 2
  • cftime_range does not support default cftime.datetime formatted output strings 2
  • add scatter plot method to dataset 1
  • Wall time much greater than CPU time 1
  • Support for netcdf4/hdf5 compression 1
  • Concatenate across multiple dimensions with open_mfdataset 1
  • Time bounds returned after an operation with resample-method 1
  • reduce on groupby auto-adds axis argument and complains when axis argument is specified 1
  • cftime_range fails for base cftime.datetime object 1
  • sel slice fails with cftime index when using dask.distributed client 1
  • open_mfdataset fails with cftime index when using parallel and dask delayed client 1

user 1

  • aidanheerdegen · 32 ✖

author_association 1

  • CONTRIBUTOR 32
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1026580368 https://github.com/pydata/xarray/issues/6226#issuecomment-1026580368 https://api.github.com/repos/pydata/xarray/issues/6226 IC_kwDOAMm_X849MF-Q aidanheerdegen 6063709 2022-02-01T08:19:46Z 2022-02-01T08:31:17Z CONTRIBUTOR

Update: It is pandas that is the critical package. Pinning distributed<2022.01.0, xarray<0.21.0 and cftime<1.5.2 didn't fix it, but adding pandas<1.4.0 makes the above test pass. Will now try unpinning other packages and confirm it is pandas that is the issue.

Edit: Confirmed it is pandas==1.4.0 that causes this issue. The following version combination does not produce this error:

```
INSTALLED VERSIONS

commit: None
python: 3.9.10 | packaged by conda-forge | (main, Jan 30 2022, 18:04:04) [GCC 9.4.0]
python-bits: 64
OS: Linux
OS-release: 4.18.0-348.2.1.el8.nci.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_AU.utf8
LANG: en_AU.ISO8859-1
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.10.6
libnetcdf: 4.7.4

xarray: 0.21.0
pandas: 1.3.5
numpy: 1.22.1
scipy: 1.7.3
netCDF4: 1.5.6
pydap: installed
h5netcdf: 0.13.1
h5py: 3.6.0
Nio: None
zarr: 2.10.3
cftime: 1.5.2
nc_time_axis: 1.4.0
PseudoNetCDF: None
rasterio: 1.2.6
cfgrib: 0.9.10.0
iris: 3.1.0
bottleneck: 1.3.2
dask: 2022.01.1
distributed: 2022.01.1
matplotlib: 3.5.1
cartopy: 0.19.0.post1
seaborn: 0.11.2
numbagg: None
fsspec: 2022.01.0
cupy: 10.1.0
pint: 0.18
sparse: 0.13.0
setuptools: 59.8.0
pip: 21.3.1
conda: 4.11.0
pytest: 6.2.5
IPython: 8.0.1
sphinx: 4.4.0
```
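The `pandas<1.4.0` pin can also be checked at runtime rather than only in an environment file. The following is a minimal sketch using only the standard library. The helper names `release_tuple` and `pandas_pin_satisfied` are hypothetical, and the only facts taken from the testing above are that pandas 1.3.5 works and pandas 1.4.0 does not.

```python
import re


def release_tuple(version: str) -> tuple:
    """Return the leading numeric release of a version string as a tuple of ints.

    Anything after the numeric release (e.g. 'rc0', '.post1') is ignored,
    so this is only a rough comparison, not a full PEP 440 parser.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    # A missing third component (e.g. "1.4") is treated as 0.
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def pandas_pin_satisfied(version: str) -> bool:
    """True if `version` satisfies the pin pandas<1.4.0 used in this comment."""
    return release_tuple(version) < (1, 4, 0)
```

In a real environment the installed version string could be obtained with `importlib.metadata.version("pandas")` (Python 3.8 or later) and passed to `pandas_pin_satisfied`.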

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset fails with cftime index when using parallel and dask delayed client 1120276279
895624119 https://github.com/pydata/xarray/issues/5686#issuecomment-895624119 https://api.github.com/repos/pydata/xarray/issues/5686 IC_kwDOAMm_X841YiO3 aidanheerdegen 6063709 2021-08-09T23:44:10Z 2021-08-09T23:44:10Z CONTRIBUTOR

Thanks for the super fast fix. I have confirmed this fixes #5677

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xindexes set incorrectly for mfdataset with dask client and parallel=True 963688125
895123758 https://github.com/pydata/xarray/issues/5686#issuecomment-895123758 https://api.github.com/repos/pydata/xarray/issues/5686 IC_kwDOAMm_X841WoEu aidanheerdegen 6063709 2021-08-09T10:46:57Z 2021-08-09T10:46:57Z CONTRIBUTOR

A colleague suggested it might be some sort of pickling issue, passing the generated object back to the main thread, but it was just speculation and I had no idea how to test that.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xindexes set incorrectly for mfdataset with dask client and parallel=True 963688125
895122645 https://github.com/pydata/xarray/issues/5686#issuecomment-895122645 https://api.github.com/repos/pydata/xarray/issues/5686 IC_kwDOAMm_X841WnzV aidanheerdegen 6063709 2021-08-09T10:44:42Z 2021-08-09T10:44:42Z CONTRIBUTOR

Thanks again for the prompt response @spencerkclark. Yes your MCVE is more (less?) M than mine. Thanks.

Perhaps I shouldn't have started a new issue, but it seemed the specific problem with .sel was just a knock on effect from this cftime issue.

I should have said in #5677 that as far as I could tell I was using cftime=1.5.0.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xindexes set incorrectly for mfdataset with dask client and parallel=True 963688125
895120951 https://github.com/pydata/xarray/issues/5686#issuecomment-895120951 https://api.github.com/repos/pydata/xarray/issues/5686 IC_kwDOAMm_X841WnY3 aidanheerdegen 6063709 2021-08-09T10:41:18Z 2021-08-09T10:41:18Z CONTRIBUTOR

> Thanks for the updated report! Could you kindly share the full error traceback?

Sorry, see below:

    Traceback (most recent call last):
      File "/g/data/v45/aph502/helpdesk/fromgithub/20210804-Navid/mcve.py", line 28, in <module>
        assert (index_microseconds == xr.open_mfdataset('2???.nc', parallel=True).xindexes['time'].array.asi8).all()
      File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/xarray/coding/cftimeindex.py", line 683, in asi8
        [
      File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/xarray/coding/cftimeindex.py", line 684, in <listcomp>
        _total_microseconds(exact_cftime_datetime_difference(epoch, date))
      File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/xarray/core/resample_cftime.py", line 358, in exact_cftime_datetime_difference
        seconds = b.replace(microsecond=0) - a.replace(microsecond=0)
      File "src/cftime/_cftime.pyx", line 1369, in cftime._cftime.datetime.__sub__
    TypeError: cannot compute the time difference between dates with different calendars

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xindexes set incorrectly for mfdataset with dask client and parallel=True 963688125
894981650 https://github.com/pydata/xarray/issues/5677#issuecomment-894981650 https://api.github.com/repos/pydata/xarray/issues/5677 IC_kwDOAMm_X841WFYS aidanheerdegen 6063709 2021-08-09T06:30:26Z 2021-08-09T06:30:26Z CONTRIBUTOR

Thanks for the very prompt response @spencerkclark, the weekend intervened but I have since narrowed it down further so have submitted a new issue

https://github.com/pydata/xarray/issues/5686

Will close this one.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  sel slice fails with cftime index when using dask.distributed client 962467654
673840314 https://github.com/pydata/xarray/issues/4337#issuecomment-673840314 https://api.github.com/repos/pydata/xarray/issues/4337 MDEyOklzc3VlQ29tbWVudDY3Mzg0MDMxNA== aidanheerdegen 6063709 2020-08-14T01:56:49Z 2020-08-14T01:56:49Z CONTRIBUTOR

Turns out zero-padding years is platform dependent.

https://bugs.python.org/issue13305
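Since `strftime("%Y")` padding depends on the platform, one portable option is to build the string from the integer fields directly. This is a minimal sketch with the standard library `datetime`, not `cftime`; the helper name `iso_seconds` is hypothetical.

```python
import datetime


def iso_seconds(dt: datetime.datetime) -> str:
    """Format a datetime as YYYY-MM-DDTHH:MM:SS with a zero-padded 4-digit year.

    Builds the string from the integer fields instead of strftime("%Y"),
    whose padding of years below 1000 depends on the platform's C library.
    """
    return (
        f"{dt.year:04d}-{dt.month:02d}-{dt.day:02d}"
        f"T{dt.hour:02d}:{dt.minute:02d}:{dt.second:02d}"
    )


stamp = iso_seconds(datetime.datetime(1, 1, 1))
# datetime.isoformat() also zero-pads the year, so the two agree here
# (this datetime has no microseconds or tzinfo).
assert stamp == datetime.datetime(1, 1, 1).isoformat()
```

For plain `datetime` objects `isoformat()` alone is enough; the explicit formatting is only useful when a fixed layout is needed regardless of microseconds or time zone.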

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  cftime_range does not support default cftime.datetime formatted output strings 677307460
673783177 https://github.com/pydata/xarray/issues/4336#issuecomment-673783177 https://api.github.com/repos/pydata/xarray/issues/4336 MDEyOklzc3VlQ29tbWVudDY3Mzc4MzE3Nw== aidanheerdegen 6063709 2020-08-14T01:05:33Z 2020-08-14T01:05:33Z CONTRIBUTOR

Thanks for the link to the tests. Your pytest-fu is strong! You're right, I didn't spot those.

I guess my philosophical point was that this throws an error:

    import cftime
    import xarray
    date = cftime.datetime(10,1,1)
    xarray.cftime_range(date, periods=3, freq='Y')

but this doesn't:

    import cftime
    import xarray
    date = cftime.datetime(10,1,1).isoformat()
    xarray.cftime_range(date, periods=3, freq='Y')

due to the latter being transformed to cftime.DatetimeGregorian as there is a default calendar attribute for cftime_range:

https://github.com/pydata/xarray/blob/cafab46aac8f7a073a32ec5aa47e213a9810ed54/xarray/coding/cftime_offsets.py#L788

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  cftime_range fails for base cftime.datetime object 677296128
673769082 https://github.com/pydata/xarray/issues/4337#issuecomment-673769082 https://api.github.com/repos/pydata/xarray/issues/4337 MDEyOklzc3VlQ29tbWVudDY3Mzc2OTA4Mg== aidanheerdegen 6063709 2020-08-14T00:08:22Z 2020-08-14T00:08:22Z CONTRIBUTOR

Thanks for the detailed response @spencerkclark

Seems that we're using different versions of python as my datetime implementation doesn't produce a zero padded year. I guess that was a bug that has also been fixed:

```python
In [1]: import datetime

In [2]: datetime.datetime(1, 1, 1).strftime("%Y-%m-%dT%H:%M:%S")
Out[2]: '1-01-01T00:00:00'
```

I agree that the `isoformat` is a good solution. I didn't know it was available in newer versions. I will upgrade my `cftime` version to `>=1.1.3` to access it.

Thanks for being open to supporting the default output. My only goal is to remove barriers to productivity and remove sources of confusion as I want these tools to be used, and embraced, as widely as possible.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  cftime_range does not support default cftime.datetime formatted output strings 677307460
577936734 https://github.com/pydata/xarray/issues/3717#issuecomment-577936734 https://api.github.com/repos/pydata/xarray/issues/3717 MDEyOklzc3VlQ29tbWVudDU3NzkzNjczNA== aidanheerdegen 6063709 2020-01-24T00:08:50Z 2020-01-24T00:08:50Z CONTRIBUTOR

Thanks @ScottWales.

So using dim and not axis works, as you might expect given Scott's explanation.

So for anyone having this issue, a work-around is to specify dim if possible.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  reduce on groupby auto-adds axis argument and complains when axis argument is specified 553930127
525588616 https://github.com/pydata/xarray/pull/3220#issuecomment-525588616 https://api.github.com/repos/pydata/xarray/issues/3220 MDEyOklzc3VlQ29tbWVudDUyNTU4ODYxNg== aidanheerdegen 6063709 2019-08-28T05:20:40Z 2019-08-28T05:20:40Z CONTRIBUTOR

Thanks @shoyer. I have added a test. It contains no assertion, but does fail with AttributeError: 'Array' object has no attribute 'tolist' without the code update. Is that sufficient?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  BUG: Fixes GH3215 481005183
525543137 https://github.com/pydata/xarray/pull/3220#issuecomment-525543137 https://api.github.com/repos/pydata/xarray/issues/3220 MDEyOklzc3VlQ29tbWVudDUyNTU0MzEzNw== aidanheerdegen 6063709 2019-08-28T01:15:25Z 2019-08-28T01:15:25Z CONTRIBUTOR

I can save the decoded version to a file and read it back in and it throws the error. I suppose this is traversing a different code path.

```
>>> xarray.decode_cf(xarray.Dataset.from_dict(ds.to_dict()))
<xarray.Dataset>
Dimensions:     (time: 5)
Coordinates:
  * time        (time) object 2198-07-02 12:00:00 ... 2202-07-02 12:00:00
Data variables:
    average_T1  (time) datetime64[ns] ...
>>> xarray.decode_cf(xarray.Dataset.from_dict(ds.to_dict())).to_netcdf('tmp.nc')
>>> xarray.decode_cf(xarray.open_mfdataset('tmp.nc',decode_cf=False))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-19.07/lib/python3.6/site-packages/xarray/conventions.py", line 479, in decode_cf
    decode_coords, drop_variables=drop_variables, use_cftime=use_cftime)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-19.07/lib/python3.6/site-packages/xarray/conventions.py", line 401, in decode_cf_variables
    stack_char_dim=stack_char_dim, use_cftime=use_cftime)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-19.07/lib/python3.6/site-packages/xarray/conventions.py", line 306, in decode_cf_variable
    var = coder.decode(var, name=name)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-19.07/lib/python3.6/site-packages/xarray/coding/times.py", line 419, in decode
    self.use_cftime)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-19.07/lib/python3.6/site-packages/xarray/coding/times.py", line 90, in _decode_cf_datetime_dtype
    last_item(values) or [0]])
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-19.07/lib/python3.6/site-packages/xarray/core/formatting.py", line 99, in last_item
    return np.ravel(array[indexer]).tolist()
AttributeError: 'Array' object has no attribute 'tolist'
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  BUG: Fixes GH3215 481005183
525538937 https://github.com/pydata/xarray/pull/3220#issuecomment-525538937 https://api.github.com/repos/pydata/xarray/issues/3220 MDEyOklzc3VlQ29tbWVudDUyNTUzODkzNw== aidanheerdegen 6063709 2019-08-28T00:53:53Z 2019-08-28T00:53:53Z CONTRIBUTOR

Hi @max-sixty. I am working on making a test, but when I serialise my test file so it is suitable for inclusion in a test it doesn't throw an error!

```
>>> ds = xarray.open_mfdataset('temp_049.nc', decode_cf=False)
>>> xarray.decode_cf(ds)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-19.07/lib/python3.6/site-packages/xarray/conventions.py", line 479, in decode_cf
    decode_coords, drop_variables=drop_variables, use_cftime=use_cftime)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-19.07/lib/python3.6/site-packages/xarray/conventions.py", line 401, in decode_cf_variables
    stack_char_dim=stack_char_dim, use_cftime=use_cftime)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-19.07/lib/python3.6/site-packages/xarray/conventions.py", line 306, in decode_cf_variable
    var = coder.decode(var, name=name)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-19.07/lib/python3.6/site-packages/xarray/coding/times.py", line 419, in decode
    self.use_cftime)
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-19.07/lib/python3.6/site-packages/xarray/coding/times.py", line 90, in _decode_cf_datetime_dtype
    last_item(values) or [0]])
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-19.07/lib/python3.6/site-packages/xarray/core/formatting.py", line 99, in last_item
    return np.ravel(array[indexer]).tolist()
AttributeError: 'Array' object has no attribute 'tolist'
>>> xarray.decode_cf(xarray.Dataset.from_dict(ds.to_dict()))
<xarray.Dataset>
Dimensions:     (time: 5)
Coordinates:
  * time        (time) object 2198-07-02 12:00:00 ... 2202-07-02 12:00:00
Data variables:
    average_T1  (time) datetime64[ns] ...
>>> ds.identical(xarray.Dataset.from_dict(ds.to_dict()))
True
```

Seems `Dataset.identical` is failing to find something that traversing `decode_cf` does. Odd.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  BUG: Fixes GH3215 481005183
521524233 https://github.com/pydata/xarray/issues/3215#issuecomment-521524233 https://api.github.com/repos/pydata/xarray/issues/3215 MDEyOklzc3VlQ29tbWVudDUyMTUyNDIzMw== aidanheerdegen 6063709 2019-08-15T05:56:18Z 2019-08-15T05:57:40Z CONTRIBUTOR

Confirmed that NUMPY_EXPERIMENTAL_ARRAY_FUNCTION=0 fixes this issue for me.

Have submitted a PR with your suggested fix https://github.com/pydata/xarray/pull/3220

Confirmed the submitted code fixes my issue.
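When applying the `NUMPY_EXPERIMENTAL_ARRAY_FUNCTION=0` workaround from inside a script instead of the shell, the ordering matters. The sketch below assumes NumPy reads the variable once, at import time, so it must be set before the first `import numpy`, including indirect imports through xarray or dask. The helper name is hypothetical, and newer NumPy releases may ignore the variable altogether.

```python
import os
import sys


def disable_array_function_protocol() -> bool:
    """Set NUMPY_EXPERIMENTAL_ARRAY_FUNCTION=0 for this process.

    Returns False if numpy is already imported, in which case the setting
    comes too late to have any effect on the current process.
    """
    os.environ["NUMPY_EXPERIMENTAL_ARRAY_FUNCTION"] = "0"
    return "numpy" not in sys.modules


effective = disable_array_function_protocol()
# Only import numpy / xarray / dask after this point.
```

If `effective` is `False`, the variable has to be exported in the shell (or set earlier in the program) instead.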

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  decode_cf called on mfdataset throws error: 'Array' object has no attribute 'tolist' 480512400
521504317 https://github.com/pydata/xarray/issues/3215#issuecomment-521504317 https://api.github.com/repos/pydata/xarray/issues/3215 MDEyOklzc3VlQ29tbWVudDUyMTUwNDMxNw== aidanheerdegen 6063709 2019-08-15T03:55:00Z 2019-08-15T03:55:00Z CONTRIBUTOR

Thanks for the explanation.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  decode_cf called on mfdataset throws error: 'Array' object has no attribute 'tolist' 480512400
521502195 https://github.com/pydata/xarray/issues/3215#issuecomment-521502195 https://api.github.com/repos/pydata/xarray/issues/3215 MDEyOklzc3VlQ29tbWVudDUyMTUwMjE5NQ== aidanheerdegen 6063709 2019-08-15T03:40:27Z 2019-08-15T03:40:27Z CONTRIBUTOR

Thanks @shoyer.

I still don't understand the different code paths between decode_cf=True and decode_cf=False + explicit call to xarray.decode_cf()

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  decode_cf called on mfdataset throws error: 'Array' object has no attribute 'tolist' 480512400
521456728 https://github.com/pydata/xarray/issues/3215#issuecomment-521456728 https://api.github.com/repos/pydata/xarray/issues/3215 MDEyOklzc3VlQ29tbWVudDUyMTQ1NjcyOA== aidanheerdegen 6063709 2019-08-14T23:28:35Z 2019-08-14T23:28:35Z CONTRIBUTOR

Seems that this is by design, from here:

> Dask Array doesn’t implement operations like tolist that would be very inefficient for larger datasets. Likewise, it is very inefficient to iterate over a Dask array with for loops

So dask has never had a tolist method, so in one case the object is a dask array, but not in the other case.

I still don't understand why it fails when decode_cf is called separately. Suggests there is a different code path as all underlying packages are identical.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  decode_cf called on mfdataset throws error: 'Array' object has no attribute 'tolist' 480512400
441486969 https://github.com/pydata/xarray/issues/470#issuecomment-441486969 https://api.github.com/repos/pydata/xarray/issues/470 MDEyOklzc3VlQ29tbWVudDQ0MTQ4Njk2OQ== aidanheerdegen 6063709 2018-11-26T00:14:46Z 2018-11-26T00:14:46Z CONTRIBUTOR

In the absence of a dedicated method, it is possible to obtain a scatterplot with the keyword options to plot.line(): rho_so_remap.plot.line(marker='o',linewidth=0.)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  add scatter plot method to dataset 94787306
410950987 https://github.com/pydata/xarray/issues/2231#issuecomment-410950987 https://api.github.com/repos/pydata/xarray/issues/2231 MDEyOklzc3VlQ29tbWVudDQxMDk1MDk4Nw== aidanheerdegen 6063709 2018-08-07T06:43:26Z 2018-08-07T06:59:22Z CONTRIBUTOR

> Sorry, xarray doesn’t handle time bounds directly, nor does it update metadata according to cfconventions. These were intentional design choices to keep xarray simple, but in principle you could layer cf convention handling on top of xarray.

Nor does it bring along bounds variables when extracting variables from a dataset, e.g.

    double time(time) ;
        time:long_name = "time" ;
        time:cartesian_axis = "T" ;
        time:calendar_type = "NOLEAP" ;
        time:bounds = "time_bounds" ;
        time:units = "days since 0001-01-01" ;
        time:calendar = "NOLEAP" ;

When a variable using the time dimension is extracted from a Dataset, the time_bounds variable is missing.

Is this also an intentional choice or something that xarray could/should support? Or does it already, and I've missed how to invoke this?

Edit: I've just realised, how is xarray supposed to "bring along" another variable in a DataArray object? I'll leave this query as maybe there is a solution? Have a bounds attribute similar to the coords attribute?

Is this just a dupe of https://github.com/pydata/xarray/issues/1475 ?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Time bounds returned after an operation with resample-method 332018176
404500806 https://github.com/pydata/xarray/issues/2164#issuecomment-404500806 https://api.github.com/repos/pydata/xarray/issues/2164 MDEyOklzc3VlQ29tbWVudDQwNDUwMDgwNg== aidanheerdegen 6063709 2018-07-12T12:50:09Z 2018-07-12T12:50:09Z CONTRIBUTOR

Sounds like a great idea.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  holoviews / bokeh doesn't like cftime coords 324740017
404468603 https://github.com/pydata/xarray/issues/2164#issuecomment-404468603 https://api.github.com/repos/pydata/xarray/issues/2164 MDEyOklzc3VlQ29tbWVudDQwNDQ2ODYwMw== aidanheerdegen 6063709 2018-07-12T10:35:04Z 2018-07-12T10:35:04Z CONTRIBUTOR

Hi @rabernat,

I apologise if what I said is discouraging. I didn't intend it that way. It was the result of exasperation, as I think xarray is a fantastic tool, and I thought with the development of cftime support all barriers to widespread adoption had pretty much been overcome. When I said "another" it was in reference to the previous barrier of not supporting long time, or old time, indices, which had since been overcome.

As far as recommending, it is to the researchers at the centre of excellence where I am one of the people who is paid to support climate models and the support infrastructure to run and analyse their outputs. I guess I've outed myself as one of those paid computational support staff you referred to.

My initial comment above was a clumsy attempt to highlight what I thought was an important feature to support to further increase xarray adoption. From my perspective as someone who has to support users I'm often having to decide what I think the majority of users will be able to use efficiently, taking into account very wide levels of expertise and motivation. Before the cftime upgrades I did not wholeheartedly evangelise for xarray adoption because I knew there were many cases where it was not simple and easy to use. For every edge and corner case I have to support users when they encounter them. In some ways, having a tool that can do such amazing things as xarray, but which doesn't work in some circumstances for some datasets, is very frustrating for users. It can take a lot of work to find out what doesn't work.

Having said which, we're currently halfway through a 2 hour training session on xarray for researchers in the CoE who are interested, but not being able to easily plot cftime datasets will harm adoption, and all those who are volunteering their time developing xarray want it to be adopted as widely as possible, right?

Thanks for the pointer to the contributor guide, I did read it, and I will try and find some time to make a positive contribution to xarray. I had started down that path already (https://github.com/pydata/xarray/issues/2244)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  holoviews / bokeh doesn't like cftime coords 324740017
404404401 https://github.com/pydata/xarray/issues/2164#issuecomment-404404401 https://api.github.com/repos/pydata/xarray/issues/2164 MDEyOklzc3VlQ29tbWVudDQwNDQwNDQwMQ== aidanheerdegen 6063709 2018-07-12T06:32:20Z 2018-07-12T06:32:20Z CONTRIBUTOR

Darn. Just when I thought the time stuff was sorted. This is (yet another) deal breaker as far as recommending mass adoption goes.

Is there an estimate when, or if, cftime indexes will be supported by xarray's .plot() method?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  holoviews / bokeh doesn't like cftime coords 324740017
399421816 https://github.com/pydata/xarray/issues/2244#issuecomment-399421816 https://api.github.com/repos/pydata/xarray/issues/2244 MDEyOklzc3VlQ29tbWVudDM5OTQyMTgxNg== aidanheerdegen 6063709 2018-06-22T12:11:44Z 2018-06-22T12:11:44Z CONTRIBUTOR

Great! Thanks @spencerkclark

I agree this is an excellent work around.

One of the most frustrating aspects of using an otherwise brilliant tool like xarray (and other python packages) is knowing what NOT to try and do. It is otherwise almost magical in the things it does well, so there is something of an expectation that it can and should do everything. If this work-around was part of the official documentation (until shift is implemented) that would be very useful.

I am often in the position of recommending software solutions to scientists who need to analyse their data, and the date issue with xarray (which CFTime has mostly alleviated) always made me hesitant to recommend it whole-heartedly. So thanks for your part in adding this to xarray.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement shift for CFTimeIndex  334778045
399352855 https://github.com/pydata/xarray/issues/2244#issuecomment-399352855 https://api.github.com/repos/pydata/xarray/issues/2244 MDEyOklzc3VlQ29tbWVudDM5OTM1Mjg1NQ== aidanheerdegen 6063709 2018-06-22T07:44:03Z 2018-06-22T07:44:03Z CONTRIBUTOR

Submitted issue brought up in this thread

https://github.com/pydata/xarray/issues/2191#issuecomment-399337976

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement shift for CFTimeIndex  334778045
399320016 https://github.com/pydata/xarray/issues/2191#issuecomment-399320016 https://api.github.com/repos/pydata/xarray/issues/2191 MDEyOklzc3VlQ29tbWVudDM5OTMyMDAxNg== aidanheerdegen 6063709 2018-06-22T04:51:16Z 2018-06-22T04:51:16Z CONTRIBUTOR

Does this need its own issue then, so it doesn't get lost?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Adding resample functionality to CFTimeIndex 327089588
399319465 https://github.com/pydata/xarray/issues/2159#issuecomment-399319465 https://api.github.com/repos/pydata/xarray/issues/2159 MDEyOklzc3VlQ29tbWVudDM5OTMxOTQ2NQ== aidanheerdegen 6063709 2018-06-22T04:47:36Z 2018-06-22T04:47:36Z CONTRIBUTOR

👍 for this feature

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Concatenate across multiple dimensions with open_mfdataset 324350248
399315302 https://github.com/pydata/xarray/issues/2191#issuecomment-399315302 https://api.github.com/repos/pydata/xarray/issues/2191 MDEyOklzc3VlQ29tbWVudDM5OTMxNTMwMg== aidanheerdegen 6063709 2018-06-22T04:12:11Z 2018-06-22T04:45:03Z CONTRIBUTOR

I'm not sure if my issue belongs in here, but I didn't want to create a new Issue (there are already 455 open ones).

I am experimenting with the new CFTimeIndex functionality (thanks heaps BTW! That was a mammoth effort if the PR thread is anything to go by).

I am trying to shift a time index as I need to align datasets to a common start point. So using the example code above,

```python
da.time.get_index('time').shift(1,'D')

NotImplementedError                       Traceback (most recent call last)
<ipython-input-71-db48b2fbb340> in <module>()
----> 1 da.time.get_index('time').shift(1,'D')

/g/data3/hh5/public/apps/miniconda3/envs/analysis27-18.04/lib/python2.7/site-packages/pandas/core/indexes/base.pyc in shift(self, periods, freq)
   2627         """
   2628         raise NotImplementedError("Not supported for type %s" %
-> 2629                                   type(self).__name__)
   2630
   2631     def argsort(self, *args, **kwargs):

NotImplementedError: Not supported for type CFTimeIndex
```

Is this not implemented because it might require resampling?

I ask because this works:

    times[0] + pd.Timedelta('365 days')
    cftime.DatetimeNoLeap(2, 1, 1, 0, 0, 0, 0, -1, 1)

I guess I am asking: if I want to shift a time index, is the best (only?) way currently to loop over all the individual elements of the index and add a time offset to each?
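The element-by-element approach described in the question can be sketched as below. This uses standard library `datetime` objects as stand-ins for cftime dates, which is an assumption made so the example runs without extra packages; the helper name `shift_times` is hypothetical. It relies only on each element supporting `+` with an offset, as in the `times[0] + pd.Timedelta('365 days')` case above.

```python
import datetime


def shift_times(times, offset):
    """Return a new list with `offset` added to every element of `times`."""
    return [t + offset for t in times]


# Stand-in data: three consecutive days (stdlib datetimes, not cftime objects).
times = [datetime.datetime(2000, 1, day) for day in (1, 2, 3)]
shifted = shift_times(times, datetime.timedelta(days=1))
```

The original list is left untouched, so the shifted values would still need to be assigned back as a new index.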

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Adding resample functionality to CFTimeIndex 327089588
134420461 https://github.com/pydata/xarray/issues/548#issuecomment-134420461 https://api.github.com/repos/pydata/xarray/issues/548 MDEyOklzc3VlQ29tbWVudDEzNDQyMDQ2MQ== aidanheerdegen 6063709 2015-08-25T00:05:40Z 2015-08-25T04:09:18Z CONTRIBUTOR

Brilliant. Thanks. I looked into the code but thought the encoding information was being stripped out.

So I've confirmed xray will round-trip fine. Shallow copies also round trip. Similarly making a new dataset from a variable with encoding information preserves that information and will output properly.

```python
% tmp = xray.open_dataset('saved_on_disk_compressed.nc')
% tmp
<xray.Dataset>
Dimensions:         (time: 3, x: 2, y: 2)
Coordinates:
    reference_time  datetime64[ns] 2014-09-05
    lon             (x, y) float64 -99.83 -99.32 -99.79 -99.23
  * time            (time) datetime64[ns] 2014-09-06 2014-09-07 2014-09-08
    lat             (x, y) float64 42.25 42.21 42.63 42.59
  * x               (x) int64 0 1
  * y               (y) int64 0 1
Data variables:
    temperature     (x, y, time) float64 10.66 8.539 6.713 8.519 29.07 27.86 ...
    precipitation   (x, y, time) float64 0.3385 6.773 8.985 0.9651 0.1359 ...

% tmp.temperature.encoding
{'chunksizes': (2, 2, 3),
 'complevel': 5,
 'contiguous': False,
 'dtype': dtype('float64'),
 'fletcher32': False,
 'shuffle': True,
 'source': 'saved_on_disk_compressed.nc',
 'zlib': True}

% tmp2 = tmp
% tmp2.to_netcdf('saved_on_disk_comp_tmp2.nc')
% tmp3 = xray.open_dataset('saved_on_disk_comp_tmp2.nc')
% tmp3.temperature.encoding
{'chunksizes': (2, 2, 3),
 'complevel': 5,
 'contiguous': False,
 'dtype': dtype('float64'),
 'fletcher32': False,
 'shuffle': True,
 'source': 'saved_on_disk_comp_xray.nc',
 'zlib': True}
```

Setting encoding dictionary works fine too (in this case copying from an existing variable):

    % tmp4 = xray.DataArray(tmp.temperature.values).to_dataset(name='temperature')
    % tmp4.temperature.encoding
    {}
    % tmp4.temperature.encoding = tmp.temperature.encoding
    % tmp4.temperature.encoding
    {'chunksizes': (2, 2, 3),
     'complevel': 5,
     'contiguous': False,
     'dtype': dtype('float64'),
     'fletcher32': False,
     'shuffle': True,
     'source': 'saved_on_disk_compressed.nc',
     'zlib': True}
    % tmp4.to_netcdf('saved_on_disk_comp_tmp4.nc')
    % tmp5 = xray.open_dataset('saved_on_disk_comp_tmp4.nc')
    % tmp.temperature.encoding
    {'chunksizes': (2, 2, 3),
     'complevel': 5,
     'contiguous': False,
     'dtype': dtype('float64'),
     'fletcher32': False,
     'shuffle': True,
     'source': 'saved_on_disk_compressed.nc',
     'zlib': True}

That will do nicely. Thanks.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support for netcdf4/hdf5 compression 102703065
133992153 https://github.com/pydata/xarray/issues/516#issuecomment-133992153 https://api.github.com/repos/pydata/xarray/issues/516 MDEyOklzc3VlQ29tbWVudDEzMzk5MjE1Mw== aidanheerdegen 6063709 2015-08-24T02:21:43Z 2015-08-24T02:21:43Z CONTRIBUTOR

What is the netCDF4 chunking scheme for your compressed data? (use 'ncdump -hs' to reveal the per variable chunking scheme).

Very large datasets can have very long load times depending on the access pattern.

This can be overcome with an appropriately chosen chunking scheme, but if the chunk sizes are not well chosen (and the default library chunking is pretty terrible) then certain access patterns might still be very slow.
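To make the access-pattern point concrete, the sketch below counts how many chunks a read has to touch for a given chunk shape. The dataset dimensions, chunk shapes and the helper name `chunks_touched` are all hypothetical, chosen only to illustrate the arithmetic. Because each touched chunk of compressed data has to be read and decompressed in full, the count is a rough proxy for read cost.

```python
def chunks_touched(selection, chunk_shape):
    """Count the chunks that overlap a hyper-rectangular selection.

    `selection` is a sequence of (start, stop) index ranges, one per
    dimension, with stop exclusive; `chunk_shape` is the chunk length
    along each dimension.
    """
    total = 1
    for (start, stop), chunk in zip(selection, chunk_shape):
        first = start // chunk        # index of the first chunk touched
        last = -(-stop // chunk)      # ceiling division: one past the last chunk
        total *= last - first
    return total


# Hypothetical variable with dimensions (time=1000, y=100, x=100).
point_series = [(0, 1000), (50, 51), (50, 51)]   # full time series at one point
single_map = [(0, 1), (0, 100), (0, 100)]        # one full horizontal slice

map_chunks = (1, 100, 100)      # one chunk per time step
series_chunks = (1000, 10, 10)  # long in time, small in space
```

With these numbers, the time series at one point touches 1000 chunks under `map_chunks` but only 1 under `series_chunks`, while a single horizontal slice touches 1 chunk under `map_chunks` and 100 under `series_chunks`. Neither layout is best for both access patterns.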

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Wall time much greater than CPU time 99026442
91437792 https://github.com/pydata/xarray/issues/349#issuecomment-91437792 https://api.github.com/repos/pydata/xarray/issues/349 MDEyOklzc3VlQ29tbWVudDkxNDM3Nzky aidanheerdegen 6063709 2015-04-10T05:49:17Z 2015-04-10T05:49:17Z CONTRIBUTOR

Great to see open_mfdataset implemented! Awesome. The links on the documentation pages seem borked though:

https://github.com/xray/xray/blob/0cd100effc3866ed083c366723da0b502afa5a96/doc/io.rst

e.g. ":py:func:~xray.auto_combine" https://github.com/xray/xray/blob/0cd100effc3866ed083c366723da0b502afa5a96/doc/io.rst#id30

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Query about concat 59467251
81422044 https://github.com/pydata/xarray/issues/349#issuecomment-81422044 https://api.github.com/repos/pydata/xarray/issues/349 MDEyOklzc3VlQ29tbWVudDgxNDIyMDQ0 aidanheerdegen 6063709 2015-03-16T05:16:08Z 2015-03-16T05:16:08Z CONTRIBUTOR

Sorry, I thought I had read the docs (which are very good BTW). Thanks.

I have some large files and only want to pick out a single variable from each, and was hoping for some lazy-loading goodness.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Query about concat 59467251
81399209 https://github.com/pydata/xarray/issues/349#issuecomment-81399209 https://api.github.com/repos/pydata/xarray/issues/349 MDEyOklzc3VlQ29tbWVudDgxMzk5MjA5 aidanheerdegen 6063709 2015-03-16T04:11:37Z 2015-03-16T04:11:37Z CONTRIBUTOR

Is there support for an MFDataset-like multiple file open in xray?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Query about concat 59467251

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 18.013ms · About: xarray-datasette