
issues


104 rows where user = 6628425 sorted by updated_at descending


Facets:
  • type: pull 84, issue 20
  • state: closed 99, open 5
  • repo: xarray 104
id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type

2279042264 PR_kwDOAMm_X85ui13E 8999 Port negative frequency fix for `pandas.date_range` to `cftime_range` spencerkclark 6628425 open 0     0 2024-05-04T14:48:08Z 2024-05-04T14:51:26Z   MEMBER   0 pydata/xarray/pulls/8999
2279042264 PR_kwDOAMm_X85ui13E 8999 Port negative frequency fix for `pandas.date_range` to `cftime_range` spencerkclark 6628425 open 0     0 2024-05-04T14:48:08Z 2024-05-04T14:51:26Z   MEMBER   0 pydata/xarray/pulls/8999

Like `pandas.date_range`, `cftime_range` would previously return dates outside the range of the specified start and end dates if provided a negative frequency:

```python
>>> start = cftime.DatetimeGregorian(2023, 10, 31)
>>> end = cftime.DatetimeGregorian(2021, 11, 1)
>>> xr.cftime_range(start, end, freq="-1YE")
CFTimeIndex([2023-12-31 00:00:00, 2022-12-31 00:00:00, 2021-12-31 00:00:00],
            dtype='object', length=3, calendar='standard', freq='-1YE-DEC')
```

This PR ports a bug fix from pandas (https://github.com/pandas-dev/pandas/issues/56147) to prevent this from happening. The above example now produces:

```python
>>> start = cftime.DatetimeGregorian(2023, 10, 31)
>>> end = cftime.DatetimeGregorian(2021, 11, 1)
>>> xr.cftime_range(start, end, freq="-1YE")
CFTimeIndex([2022-12-31 00:00:00, 2021-12-31 00:00:00],
            dtype='object', length=2, calendar='standard', freq=None)
```

Since this is a bug fix, we do not make any attempt to preserve the old behavior if an earlier version of pandas is installed. In the testing context this means we skip some tests for pandas versions less than 3.0.
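The version-gated test skipping described above might be sketched like this (the marker name and the way the version is checked are illustrative, not the actual xarray test-suite code):

```python
import pandas as pd
import pytest

# Hypothetical marker: skip tests that rely on the ported bug fix when the
# installed pandas predates the corresponding upstream fix.
pd_major = int(pd.__version__.split(".")[0])
requires_pandas_3 = pytest.mark.skipif(
    pd_major < 3,
    reason="pandas >= 3.0 is required for the negative-frequency fix",
)
```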

  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8999/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2276732187 PR_kwDOAMm_X85ubH0P 8996 Mark `test_use_cftime_false_standard_calendar_in_range` as an expected failure spencerkclark 6628425 closed 0     0 2024-05-03T01:05:21Z 2024-05-03T15:21:48Z 2024-05-03T15:21:48Z MEMBER   0 pydata/xarray/pulls/8996

Per https://github.com/pydata/xarray/issues/8844#issuecomment-2089427222, for the time being this marks test_use_cftime_false_standard_calendar_in_range as an expected failure under NumPy 2. Hopefully we'll be able to fix the upstream issue in pandas eventually.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8996/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2242197433 PR_kwDOAMm_X85smT4G 8942 WIP: Support calendar-specific `cftime.datetime` instances spencerkclark 6628425 open 0     0 2024-04-14T14:33:06Z 2024-04-14T15:41:08Z   MEMBER   1 pydata/xarray/pulls/8942

Since cftime version 1.3.0, the base cftime.datetime object can be calendar-aware, obviating the need for calendar-specific subclasses like cftime.DatetimeNoLeap. This PR aims to finally enable the use of these objects in xarray. We can also use this moment to remove cruft around accommodating inexact cftime datetime arithmetic, since that has been fixed since cftime version 1.2.0.

The idea will be to support both for a period of time and eventually drop support for the calendar-specific subclasses. I do not think too much should need to change within xarray—the main challenge will be to see if we can maintain adequate test coverage without multiplying the number of cftime tests by two. This draft PR is at least a start towards that.

  • [ ] Closes #4336
  • [ ] Closes #4853
  • [ ] Closes #5551
  • [ ] Closes #8298
  • [ ] Closes #8941
  • [ ] Tests added
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [ ] New functions/methods are listed in api.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8942/reactions",
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 1,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2152367065 PR_kwDOAMm_X85n0pyX 8782 Fix non-nanosecond casting behavior for `expand_dims` spencerkclark 6628425 closed 0     0 2024-02-24T15:38:41Z 2024-02-27T18:52:58Z 2024-02-27T18:51:49Z MEMBER   0 pydata/xarray/pulls/8782

This PR fixes the issue noted in https://github.com/pydata/xarray/issues/7493#issuecomment-1953091000 that non-nanosecond precision datetime or timedelta values passed to expand_dims would not be cast to nanosecond precision. The underlying issue was that the _possibly_convert_datetime_or_timedelta_index function did not appropriately handle being passed PandasIndexingAdapter objects.

  • [x] Fixes https://github.com/pydata/xarray/issues/7493#issuecomment-1953091000
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8782/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2060490766 PR_kwDOAMm_X85i9R7z 8575 Add chunk-friendly code path to `encode_cf_datetime` and `encode_cf_timedelta` spencerkclark 6628425 closed 0     6 2023-12-30T01:25:17Z 2024-01-30T02:17:58Z 2024-01-29T19:12:30Z MEMBER   0 pydata/xarray/pulls/8575

I finally had a moment to think about this some more following discussion in https://github.com/pydata/xarray/pull/8253. This PR adds a chunk-friendly code path to encode_cf_datetime and encode_cf_timedelta, which enables lazy encoding of time-like values, and by extension, preservation of chunks when writing time-like values to zarr. With these changes, the test added by @malmans2 in #8253 passes.

Though it largely reuses existing code, the lazy encoding implemented in this PR is stricter than eager encoding in a couple of ways:

1. It requires that either both the encoding units and dtype be prescribed, or that neither be prescribed; prescribing only one is not supported, since it requires inferring the other from the data. In the case that neither is specified, the dtype is set to np.int64 and the units are either "nanoseconds since 1970-01-01" or "microseconds since 1970-01-01" depending on whether we are encoding np.datetime64[ns] values or cftime.datetime objects. In the case of timedelta64[ns] values, the units are set to "nanoseconds".
2. If an integer dtype is prescribed, but the units are set such that floating point values would be required, it raises instead of modifying the units to enable integer encoding. This is a requirement since the data units may differ between chunks, so overriding could result in inconsistent units.
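The default unit and dtype choice described in the first point can be illustrated with plain NumPy (a sketch of the encoding arithmetic only, not xarray's actual implementation):

```python
import numpy as np

times = np.array(["2000-01-01", "2000-01-02"], dtype="datetime64[ns]")

# Default choice for np.datetime64[ns] data when neither units nor dtype are
# prescribed: int64 values counting nanoseconds since 1970-01-01.
epoch = np.datetime64("1970-01-01", "ns")
encoded = (times - epoch).astype("int64")

print(encoded)  # [946684800000000000 946771200000000000]
```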

As part of this PR, since dask requires we know the dtype of the array returned by the function passed to map_blocks, I also added logic to handle casting to the specified encoding dtype in an overflow-and-integer safe manner. This means an informative error message would be raised in the situation described in #8542:

OverflowError: Not possible to cast encoded times from dtype('int64') to dtype('int16') without overflow. Consider removing the dtype encoding, at which point xarray will make an appropriate choice, or explicitly switching to a larger integer dtype.

I eventually want to think about this on the decoding side as well, but that can wait for another PR.

  • [x] Closes #7132
  • [x] Closes #8230
  • [x] Closes #8432
  • [x] Closes #8253
  • [x] Addresses #8542
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8575/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1977766748 PR_kwDOAMm_X85eneDK 8415 Deprecate certain cftime frequency strings following pandas spencerkclark 6628425 closed 0     3 2023-11-05T12:27:59Z 2023-11-16T15:37:27Z 2023-11-16T15:19:40Z MEMBER   0 pydata/xarray/pulls/8415

Following several upstream PRs in pandas, this PR deprecates cftime frequency strings "A", "AS", "Q", "M", "H", "T", "S", "L", and "U" in favor of "Y", "YS", "QE", "ME", "h", "min", "s", "ms", and "us". Similarly following pandas, it makes a breaking change to have infer_freq return the latter frequencies instead of the former.

There are a few places in the tests and one place in the code where we need some version-specific logic to retain support for older pandas versions. @aulemahal it would be great if you could take a look to make sure that I handled this breaking change properly / fully in the date_range_like case.

I also took the liberty to transition to using "Y", "YS", "h", "min", "s", "ms", "us", and "ns" within our code, tests, and documentation to reduce the amount of warnings emitted. I have held off on switching to "QE", "ME", and anchored offsets involving "Y" or "YS" in pandas-related code since those are not supported in older versions of pandas.

The deprecation warning looks like this:

```python
>>> xr.cftime_range("2000", periods=5, freq="M")
<stdin>:1: FutureWarning: 'M' is deprecated and will be removed in a future version. Please use 'ME' instead of 'M'.
CFTimeIndex([2000-01-31 00:00:00, 2000-02-29 00:00:00, 2000-03-31 00:00:00,
             2000-04-30 00:00:00, 2000-05-31 00:00:00],
            dtype='object', length=5, calendar='standard', freq='ME')
```

  • [x] Closes #8394
  • [x] Addresses the convert_calendar and date_range_like test failures in #8091
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8415/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1970241789 I_kwDOAMm_X851b4D9 8394 Update cftime frequency strings in line with recent updates in pandas spencerkclark 6628425 closed 0     1 2023-10-31T11:24:15Z 2023-11-16T15:19:42Z 2023-11-16T15:19:42Z MEMBER      

What is your issue?

Pandas has introduced some deprecations in how frequency strings are specified:

  • Deprecating "A", "A-JAN", etc. in favor of "Y", "Y-JAN", etc. (https://github.com/pandas-dev/pandas/pull/55252)
  • Deprecating "AS", "AS-JAN", etc. in favor of "YS", "YS-JAN", etc. (https://github.com/pandas-dev/pandas/pull/55479)
  • Deprecating "Q", "Q-JAN", etc. in favor of "QE", "QE-JAN", etc. (https://github.com/pandas-dev/pandas/pull/55553)
  • Deprecating "M" in favor of "ME" (https://github.com/pandas-dev/pandas/pull/54061)
  • Deprecating "H" in favor of "h" (https://github.com/pandas-dev/pandas/pull/54939)
  • Deprecating "T", "S", "L", and "U" in favor of "min", "s", "ms", and "us" (https://github.com/pandas-dev/pandas/pull/54061).

It would be good to carry these deprecations out for cftime frequency specifications to remain consistent.
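The deprecations listed above amount to the following old-to-new alias mapping (a summary in code form, not xarray's implementation; the anchored variants like "A-JAN" follow the same pattern):

```python
# Deprecated pandas/cftime frequency aliases and their replacements.
DEPRECATED_FREQUENCIES = {
    "A": "Y",    # annual, year end
    "AS": "YS",  # annual, year start
    "Q": "QE",   # quarter end
    "M": "ME",   # month end
    "H": "h",    # hourly
    "T": "min",  # minutely
    "S": "s",    # secondly
    "L": "ms",   # milliseconds
    "U": "us",   # microseconds
}
```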

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8394/reactions",
    "total_count": 2,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 1,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1970215879 PR_kwDOAMm_X85eOCcn 8393 Port fix from pandas-dev/pandas#55283 to cftime resample spencerkclark 6628425 closed 0     1 2023-10-31T11:12:09Z 2023-11-02T09:40:46Z 2023-11-02T04:12:51Z MEMBER   0 pydata/xarray/pulls/8393

The remaining failing cftime resample tests in https://github.com/pydata/xarray/issues/8091 happen to be related to a bug that was fixed in the pandas implementation, https://github.com/pandas-dev/pandas/pull/55283, leading answers to change in some circumstances. This PR ports that bug fix to xarray's implementation of resample for data indexed by a CFTimeIndex.

  • [x] Fixes remaining failing cftime resample tests in https://github.com/pydata/xarray/issues/8091
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst

A simple example where answers change in pandas is the following:

Previously

```python
>>> import numpy as np
>>> import pandas as pd
>>> index = pd.date_range("2000", periods=5, freq="5D")
>>> series = pd.Series(np.arange(index.size), index=index)
>>> series.resample("2D", closed="right", label="right", offset="1s").mean()
2000-01-01 00:00:01    0.0
2000-01-03 00:00:01    NaN
2000-01-05 00:00:01    1.0
2000-01-07 00:00:01    NaN
2000-01-09 00:00:01    NaN
2000-01-11 00:00:01    2.0
2000-01-13 00:00:01    NaN
2000-01-15 00:00:01    3.0
2000-01-17 00:00:01    NaN
2000-01-19 00:00:01    NaN
2000-01-21 00:00:01    4.0
Freq: 2D, dtype: float64
```

Currently

```python
>>> import numpy as np
>>> import pandas as pd
>>> index = pd.date_range("2000", periods=5, freq="5D")
>>> series = pd.Series(np.arange(index.size), index=index)
>>> series.resample("2D", closed="right", label="right", offset="1s").mean()
2000-01-01 00:00:01    0.0
2000-01-03 00:00:01    NaN
2000-01-05 00:00:01    NaN
2000-01-07 00:00:01    1.0
2000-01-09 00:00:01    NaN
2000-01-11 00:00:01    2.0
2000-01-13 00:00:01    NaN
2000-01-15 00:00:01    NaN
2000-01-17 00:00:01    3.0
2000-01-19 00:00:01    NaN
2000-01-21 00:00:01    4.0
Freq: 2D, dtype: float64
```

This PR allows us to reproduce this change in xarray for data indexed by a CFTimeIndex. The bin edges were incorrect in the previous case; see https://github.com/pandas-dev/pandas/pull/52064#issuecomment-1785893752 for @MarcoGorelli's nice explanation as to why.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8393/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1925977158 PR_kwDOAMm_X85b41i3 8272 Fix datetime encoding precision loss regression for units requiring floating point values spencerkclark 6628425 closed 0     1 2023-10-04T11:12:59Z 2023-10-06T14:09:34Z 2023-10-06T14:08:51Z MEMBER   0 pydata/xarray/pulls/8272

This PR proposes a fix to #8271. I think the basic issue is that the only time we need to update the `needed_units` is when the `data_delta` does not evenly divide the `ref_delta`. If it does evenly divide it, as it does in the example in #8271, and we try to update the `needed_units` solely according to the value of the `ref_delta`, we run the risk of resetting them to something coarser than the data requires. If it does not evenly divide it, we are safe to reset the `needed_units`, because they are guaranteed to be finer-grained than the data requires.
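The divisibility condition described above can be sketched as follows (a hypothetical helper operating on integer deltas in a common unit; not the actual code in xarray):

```python
def should_update_needed_units(data_delta: int, ref_delta: int) -> bool:
    """Only update the needed units when data_delta does not evenly divide
    ref_delta; otherwise the units risk being reset to something coarser
    than the data requires."""
    return ref_delta % data_delta != 0

print(should_update_needed_units(30, 60))  # evenly divides -> False
print(should_update_needed_units(45, 60))  # does not -> True
```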

I modified test_roundtrip_float_times to reflect the example given by @larsbuntemeyer in #8271. @kmuehlbauer let me know if this fix makes sense to you.

  • [x] Closes #8271
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8272/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1657396474 PR_kwDOAMm_X85NxJ61 7731 Continue to use nanosecond-precision Timestamps in precision-sensitive areas spencerkclark 6628425 closed 0     9 2023-04-06T13:06:50Z 2023-04-13T15:17:14Z 2023-04-13T14:58:34Z MEMBER   0 pydata/xarray/pulls/7731

This addresses the remaining cftime-related test failures in #7707 by introducing a function that always returns a nanosecond-precision Timestamp object. Despite no corresponding test failures, for safety I grepped and went ahead and replaced the pd.Timestamp constructor with this function in a few other areas. I also updated our documentation to replace any mentions of the "Timestamp-valid range" with "nanosecond-precision range" since Timestamps are now more flexible, and included a note that we have an issue open for relaxing this nanosecond-precision assumption in xarray eventually.
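A function along the lines described, always returning a nanosecond-precision Timestamp, might look like this (a sketch assuming pandas >= 2.0 provides `Timestamp.as_unit`; the actual xarray helper may differ):

```python
import pandas as pd

def nanosecond_precision_timestamp(*args, **kwargs):
    """Wrap the pd.Timestamp constructor so the result always has
    nanosecond precision, regardless of the pandas version."""
    ts = pd.Timestamp(*args, **kwargs)
    # pandas >= 2.0 exposes as_unit; earlier versions are always "ns".
    if hasattr(ts, "as_unit"):
        ts = ts.as_unit("ns")
    return ts

print(nanosecond_precision_timestamp("2000-01-01"))
```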

While in principle I think it would be fine if `CFTimeIndex.to_datetimeindex` returned a DatetimeIndex with non-nanosecond-precision values, since we don't use `to_datetimeindex` anywhere outside of tests, in its current state it was returning nonsense values:

```python
>>> import pandas as pd
>>> import xarray as xr
>>> times = xr.cftime_range("0001", periods=5)
>>> times.to_datetimeindex()
DatetimeIndex(['1754-08-30 22:43:41.128654848', '1754-08-31 22:43:41.128654848',
               '1754-09-01 22:43:41.128654848', '1754-09-02 22:43:41.128654848',
               '1754-09-03 22:43:41.128654848'],
              dtype='datetime64[ns]', freq=None)
```

This is due to the assumption in `cftime_to_nptime` that the resulting array will have nanosecond-precision values. We can (and should) address this eventually, but for the sake of quickly supporting pandas version two I decided to be conservative and punt this off to be part of #7493. `cftime_to_nptime` is used in places other than `to_datetimeindex`, so modifying it has other impacts downstream.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7731/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1533980729 PR_kwDOAMm_X85Hao6Z 7441 Preserve formatting of reference time units under pandas 2.0.0 spencerkclark 6628425 closed 0     9 2023-01-15T20:09:24Z 2023-04-01T12:41:44Z 2023-04-01T12:36:56Z MEMBER   0 pydata/xarray/pulls/7441

As suggested by @keewis, to preserve existing behavior in xarray, this PR forces any object passed to format_timestamp to be converted to a string using strftime with a constant format. This addresses the failing tests related to the units encoding in #7420.

  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7441/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1535387692 PR_kwDOAMm_X85HfYz8 7444 Preserve `base` and `loffset` arguments in `resample` spencerkclark 6628425 closed 0     7 2023-01-16T19:16:39Z 2023-03-08T18:16:12Z 2023-03-08T16:55:22Z MEMBER   0 pydata/xarray/pulls/7444

While pandas is getting set to remove the base and loffset arguments in resample, we have not had a chance to emit a deprecation warning for them yet in xarray (https://github.com/pydata/xarray/issues/7420). This PR preserves their functionality in xarray and should hopefully give users some extra time to adapt. Deprecation warnings for each are added so that we can eventually remove them.

I've taken the liberty to define a TimeResampleGrouper object, since we need some way to carry the loffset argument through the resample chain, even though it will no longer be allowed on the pd.Grouper object. Currently it is not particularly complicated, so hopefully it would be straightforward to adapt to what is envisioned in https://github.com/pydata/xarray/issues/6610#issuecomment-1341296800.

  • [x] closes #7266
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7444/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1489128898 PR_kwDOAMm_X85FCdPL 7373 Add `inclusive` argument to `cftime_range` and `date_range` and deprecate `closed` argument spencerkclark 6628425 closed 0     4 2022-12-10T23:40:47Z 2023-02-06T17:51:47Z 2023-02-06T17:51:46Z MEMBER   0 pydata/xarray/pulls/7373

Following pandas, this PR adds an inclusive argument to xarray.cftime_range and xarray.date_range and deprecates the closed argument. Pandas will be removing the closed argument soon in their date_range implementation, but we will continue supporting it to allow for our own deprecation cycle.

I think we may also need to update our minimum pandas version to 1.4 for this, since earlier versions of pandas do not support the inclusive argument.
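The pandas behavior being mirrored (available since pandas 1.4) can be shown directly with `pd.date_range` (plain pandas, not the `cftime_range` implementation):

```python
import pandas as pd

# inclusive="both" (the default) keeps both endpoints; "left" drops the end.
both = pd.date_range("2000-01-01", "2000-01-03", freq="D", inclusive="both")
left = pd.date_range("2000-01-01", "2000-01-03", freq="D", inclusive="left")

print(len(both), len(left))  # 3 2
```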

  • [x] Closes #6985
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7373/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1467346810 PR_kwDOAMm_X85D2by_ 7331 Fix PR number in what’s new spencerkclark 6628425 closed 0     0 2022-11-29T02:20:18Z 2022-11-29T07:37:06Z 2022-11-29T07:37:05Z MEMBER   0 pydata/xarray/pulls/7331

I noticed the PR number was off in my what’s new entry in #7284. This fixes that.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7331/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1446967076 PR_kwDOAMm_X85Cx_0x 7284 Enable `origin` and `offset` arguments in `resample` spencerkclark 6628425 closed 0     0 2022-11-13T15:23:01Z 2022-11-29T00:06:46Z 2022-11-28T23:38:52Z MEMBER   0 pydata/xarray/pulls/7284

This PR enables the origin and offset arguments in resample. This was simple to do in the case of data indexed by a DatetimeIndex, but naturally required changes to our internal implementation of resample for data indexed by a CFTimeIndex. Fortunately those changes were fairly straightforward to port over from pandas.

This does not do anything to address the deprecation of base noted in #7266, but is an important first step toward getting up to speed with the latest version of pandas, both on the DatetimeIndex side and the CFTimeIndex side. This way we will at least be able to handle that deprecation in the same way for each.

  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst

I think things are fairly comprehensively implemented and tested here, but I'm marking this as a draft for now as I want to see if I can reduce the number of cftime resampling tests some, which have multiplied with the addition of these new arguments.
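The pandas behavior being ported can be illustrated directly (plain pandas on a DatetimeIndex, not the CFTimeIndex implementation; the specific frequency and offset here are illustrative):

```python
import numpy as np
import pandas as pd

index = pd.date_range("2000-01-01", periods=6, freq="12h")
series = pd.Series(np.arange(6), index=index)

# offset shifts the resampling bin edges away from midnight by a fixed amount;
# origin (not shown) instead anchors them at a chosen timestamp.
shifted = series.resample("D", offset="6h").sum()
print(shifted)
```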

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7284/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1428748922 PR_kwDOAMm_X85B07x6 7238 Improve non-nanosecond warning spencerkclark 6628425 closed 0     7 2022-10-30T11:44:56Z 2022-11-04T20:37:27Z 2022-11-04T20:13:19Z MEMBER   0 pydata/xarray/pulls/7238

Thanks for the feedback @hmaarrfk. Is this what you had in mind?

  • [x] Closes #7237

For example, running this script:

```python
import numpy as np
import xarray as xr

times = [np.datetime64("2000-01-01", "us")]
var = xr.Variable(["time"], times)
da = xr.DataArray(times)
```

leads to the following warnings:

```
$ python test_warning.py
test_warning.py:6: UserWarning: Converting non-nanosecond precision datetime values to nanosecond precision. This behavior can eventually be relaxed in xarray, as it is an artifact from pandas which is now beginning to support non-nanosecond precision values. This warning is caused by passing non-nanosecond np.datetime64 or np.timedelta64 values to the DataArray or Variable constructor; it can be silenced by converting the values to nanosecond precision ahead of time.
  var = xr.Variable(["time"], times)
test_warning.py:7: UserWarning: Converting non-nanosecond precision datetime values to nanosecond precision. This behavior can eventually be relaxed in xarray, as it is an artifact from pandas which is now beginning to support non-nanosecond precision values. This warning is caused by passing non-nanosecond np.datetime64 or np.timedelta64 values to the DataArray or Variable constructor; it can be silenced by converting the values to nanosecond precision ahead of time.
  da = xr.DataArray(times)
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7238/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1419972576 PR_kwDOAMm_X85BXqCR 7201 Emit a warning when converting datetime or timedelta values to nanosecond precision spencerkclark 6628425 closed 0     3 2022-10-23T23:17:07Z 2022-10-26T16:07:16Z 2022-10-26T16:00:33Z MEMBER   0 pydata/xarray/pulls/7201

This PR addresses #7175 by converting datetime or timedelta values to nanosecond precision even when pandas does not. For the time being we emit a warning whenever we perform a conversion that pandas would not (right now that only happens with the development version of pandas). When things stabilize in pandas we can consider relaxing this constraint in xarray as well.

This got a little bit more complicated due to the presence of timezone-aware datetimes in pandas, but hopefully the tests cover those cases now.
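The conversion being performed (and warned about) amounts to a precision cast, shown here with plain NumPy (a sketch of the cast, not xarray's internal code path):

```python
import numpy as np

# Microsecond-precision values, as might be passed to the Variable constructor.
times = np.array(["2000-01-01T00:00:00.000001"], dtype="datetime64[us]")

# What xarray does on ingest: cast to nanosecond precision.
times_ns = times.astype("datetime64[ns]")

print(times_ns.dtype)  # datetime64[ns]
```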

  • [x] Closes #7175
  • [x] Closes #7197
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7201/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1410574596 PR_kwDOAMm_X85A4K3f 7171 Set `longdouble=False` in `cftime.date2num` within the date encoding context spencerkclark 6628425 closed 0     2 2022-10-16T18:20:58Z 2022-10-18T16:38:24Z 2022-10-18T16:37:57Z MEMBER   0 pydata/xarray/pulls/7171

Currently, the default behavior of cftime.date2num is to return integer values when possible (i.e. when the encoding units allow), and fall back to returning float64 values when that is not possible. Recently, cftime added the option to use float128 as the fallback dtype, which enables greater potential roundtrip precision. This is through the longdouble flag to cftime.date2num, which currently defaults to False. It was intentionally set to False by default, because netCDF does not support storing float128 values in files, and so, without any changes, would otherwise break xarray's encoding procedure.

The desire in cftime, however, is to eventually set this flag to True by default (https://github.com/Unidata/cftime/issues/297). This PR makes the necessary changes in xarray to adapt to this eventual new default. Essentially if the longdouble argument is allowed in the user's version of cftime.date2num, we explicitly set it to False to preserve the current float64 fallback behavior within the context of encoding times. There are a few more places where date2num is used (some additional places in the tests, and in calendar_ops.py), but in those places using float128 values would not present a problem.

At some point we might consider relaxing this behavior in xarray, since it is possible to store float128 values in zarr stores for example, but for the time being the simplest approach seems to be to stick with float64 for all backends (it would be complicated to have backend-specific defaults).

cc: @jswhit

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7171/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1413075015 I_kwDOAMm_X85UOdBH 7184 Potentially add option to encode times using `longdouble` values spencerkclark 6628425 open 0     0 2022-10-18T11:46:30Z 2022-10-18T11:47:00Z   MEMBER      

By default xarray will exactly roundtrip times saved to disk by encoding them using int64 values. However, if a user specifies time encoding units that prevent this, float64 values will be used, and this has the potential to cause roundtripping differences due to roundoff error. Recently, cftime added the ability to encode times using longdouble values (https://github.com/Unidata/cftime/pull/284). On some platforms this offers greater precision than float64 values (though typically not full quad precision). Nevertheless some users might be interested in encoding their times using such values.

The main thing that longdouble values have going for them is that they enable greater precision when using arbitrary units to encode the dates (with int64 we are constrained to using units that allow for time intervals to be expressed with integers). That said, the more I think about this, the more I feel it may not be the best idea:

  • Since the meaning of longdouble can vary from platform to platform, I wonder what happens if you encode times using longdouble values on one machine and decode them on another?
  • longdouble values cannot be stored with all backends; for example zarr supports it, but netCDF does not.
  • We already provide a robust way to exactly roundtrip any dates--i.e. encode them with int64 values--so adding a less robust (if slightly more flexible in terms of units) option might just cause confusion.

It's perhaps still worth opening this issue for discussion in case others have thoughts that might allay those concerns.
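The exact int64 roundtrip mentioned in the last bullet can be demonstrated with NumPy alone (a sketch; xarray's encoder chooses the units, but the arithmetic is the same in spirit):

```python
import numpy as np

times = np.array(["2001-01-01T00:00:00.123456789"], dtype="datetime64[ns]")

# Encode as int64 nanoseconds since the epoch, then decode: exact roundtrip.
encoded = times.astype("int64")
decoded = encoded.astype("datetime64[ns]")

print((decoded == times).all())  # True
```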

cc: @jswhit @dcherian

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7184/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1401909544 I_kwDOAMm_X85Tj3Eo 7145 Time decoding error message does not include the problematic variable's name spencerkclark 6628425 closed 0     5 2022-10-08T10:59:17Z 2022-10-13T23:21:55Z 2022-10-12T15:25:42Z MEMBER      

What is your issue?

If any variable in a Dataset has times that cannot be represented as cftime.datetime objects, an error message will be raised. However, this error message will not indicate the problematic variable's name. It would be nice if it did, because it would make it easier for users to determine the source of the error.

cc: @durack1 xref: Unidata/cftime#295

Example

This is a minimal example of the issue. The error message gives no indication that "invalid_times" is the problem:

```python
>>> import xarray as xr
>>> TIME_ATTRS = {"units": "days since 0001-01-01", "calendar": "noleap"}
>>> valid_times = xr.DataArray([0, 1], dims=["time"], attrs=TIME_ATTRS, name="valid_times")
>>> invalid_times = xr.DataArray([1e36, 2e36], dims=["time"], attrs=TIME_ATTRS, name="invalid_times")
>>> ds = xr.merge([valid_times, invalid_times])
>>> xr.decode_cf(ds)
Traceback (most recent call last):
  File "/Users/spencer/software/xarray/xarray/coding/times.py", line 275, in decode_cf_datetime
    dates = _decode_datetime_with_pandas(flat_num_dates, units, calendar)
  File "/Users/spencer/software/xarray/xarray/coding/times.py", line 210, in _decode_datetime_with_pandas
    raise OutOfBoundsDatetime(
pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: Cannot decode times from a non-standard calendar, 'noleap', using pandas.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/spencer/software/xarray/xarray/coding/times.py", line 180, in _decode_cf_datetime_dtype
    result = decode_cf_datetime(example_value, units, calendar, use_cftime)
  File "/Users/spencer/software/xarray/xarray/coding/times.py", line 277, in decode_cf_datetime
    dates = _decode_datetime_with_cftime(
  File "/Users/spencer/software/xarray/xarray/coding/times.py", line 202, in _decode_datetime_with_cftime
    cftime.num2date(num_dates, units, calendar, only_use_cftime_datetimes=True)
  File "src/cftime/_cftime.pyx", line 605, in cftime._cftime.num2date
  File "src/cftime/_cftime.pyx", line 404, in cftime._cftime.cast_to_int
OverflowError: time values outside range of 64 bit signed integers

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/spencer/software/xarray/xarray/conventions.py", line 655, in decode_cf
    vars, attrs, coord_names = decode_cf_variables(
  File "/Users/spencer/software/xarray/xarray/conventions.py", line 521, in decode_cf_variables
    new_vars[k] = decode_cf_variable(
  File "/Users/spencer/software/xarray/xarray/conventions.py", line 369, in decode_cf_variable
    var = times.CFDatetimeCoder(use_cftime=use_cftime).decode(var, name=name)
  File "/Users/spencer/software/xarray/xarray/coding/times.py", line 687, in decode
    dtype = _decode_cf_datetime_dtype(data, units, calendar, self.use_cftime)
  File "/Users/spencer/software/xarray/xarray/coding/times.py", line 190, in _decode_cf_datetime_dtype
    raise ValueError(msg)
ValueError: unable to decode time units 'days since 0001-01-01' with "calendar 'noleap'". Try opening your dataset with decode_times=False or installing cftime if it is not installed.
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7145/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1402017668 PR_kwDOAMm_X85Ab9DL 7147 Include variable name in message if `decode_cf_variable` raises an error spencerkclark 6628425 closed 0     1 2022-10-08T17:53:23Z 2022-10-12T16:24:45Z 2022-10-12T15:25:42Z MEMBER   0 pydata/xarray/pulls/7147
  • [x] Closes #7145
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst

I'm not sure if there is a better way to do this, but this is one way to address #7145. The error message for the example now looks like:
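
The mechanism is simple enough to sketch in isolation: catch whatever the per-variable decoding raises and re-raise the same exception type with the variable's name prepended. This is a minimal sketch of that idea; `toy_decode` is an illustrative stand-in, not xarray's real `decode_cf_variable`:

```python
def decode_all(variables, decode):
    """Decode each variable, attaching the offending name to any error."""
    decoded = {}
    for name, value in variables.items():
        try:
            decoded[name] = decode(value)
        except Exception as e:
            # Re-raise the same exception type so callers that catch, e.g.,
            # ValueError still work, but with the variable name included.
            raise type(e)(f"Failed to decode variable {name!r}: {e}") from e
    return decoded


def toy_decode(value):
    # Illustrative stand-in for decode_cf_variable: reject out-of-range times.
    if value > 1e35:
        raise ValueError("unable to decode time units 'days since 0001-01-01'")
    return value
```

Calling `decode_all({"valid_times": 1.0, "invalid_times": 2e36}, toy_decode)` then raises a `ValueError` whose message names `'invalid_times'`.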

```
>>> xr.decode_cf(ds)
Traceback (most recent call last):
  File "/Users/spencer/software/xarray/xarray/coding/times.py", line 275, in decode_cf_datetime
    dates = _decode_datetime_with_pandas(flat_num_dates, units, calendar)
  File "/Users/spencer/software/xarray/xarray/coding/times.py", line 210, in _decode_datetime_with_pandas
    raise OutOfBoundsDatetime(
pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: Cannot decode times from a non-standard calendar, 'noleap', using pandas.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/spencer/software/xarray/xarray/coding/times.py", line 180, in _decode_cf_datetime_dtype
    result = decode_cf_datetime(example_value, units, calendar, use_cftime)
  File "/Users/spencer/software/xarray/xarray/coding/times.py", line 277, in decode_cf_datetime
    dates = _decode_datetime_with_cftime(
  File "/Users/spencer/software/xarray/xarray/coding/times.py", line 202, in _decode_datetime_with_cftime
    cftime.num2date(num_dates, units, calendar, only_use_cftime_datetimes=True)
  File "src/cftime/_cftime.pyx", line 605, in cftime._cftime.num2date
  File "src/cftime/_cftime.pyx", line 404, in cftime._cftime.cast_to_int
OverflowError: time values outside range of 64 bit signed integers

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/spencer/software/xarray/xarray/conventions.py", line 523, in decode_cf_variables
    new_vars[k] = decode_cf_variable(
  File "/Users/spencer/software/xarray/xarray/conventions.py", line 369, in decode_cf_variable
    var = times.CFDatetimeCoder(use_cftime=use_cftime).decode(var, name=name)
  File "/Users/spencer/software/xarray/xarray/coding/times.py", line 688, in decode
    dtype = _decode_cf_datetime_dtype(data, units, calendar, self.use_cftime)
  File "/Users/spencer/software/xarray/xarray/coding/times.py", line 190, in _decode_cf_datetime_dtype
    raise ValueError(msg)
ValueError: unable to decode time units 'days since 0001-01-01' with "calendar 'noleap'". Try opening your dataset with decode_times=False or installing cftime if it is not installed.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/spencer/software/xarray/xarray/conventions.py", line 659, in decode_cf
    vars, attrs, coord_names = decode_cf_variables(
  File "/Users/spencer/software/xarray/xarray/conventions.py", line 534, in decode_cf_variables
    raise type(e)(f"Failed to decode variable {k!r}: {e}")
ValueError: Failed to decode variable 'invalid_times': unable to decode time units 'days since 0001-01-01' with "calendar 'noleap'". Try opening your dataset with decode_times=False or installing cftime if it is not installed.
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7147/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1345573285 PR_kwDOAMm_X849hMS4 6940 Enable taking the mean of dask-backed cftime arrays spencerkclark 6628425 closed 0     4 2022-08-21T19:24:37Z 2022-09-10T12:28:16Z 2022-09-09T16:48:19Z MEMBER   0 pydata/xarray/pulls/6940

This was essentially enabled by @dcherian in #6556, but we did not remove the error that prevented computing the mean of a dask-backed cftime array. This PR removes that error, and adds some tests. One minor modification in _timedelta_to_seconds was needed for compatibility with scalar cftime arrays.

This happens to address the second part of #5897, so I added a regression test for that. It seems like we decided to simply document the behavior in the first part (https://github.com/pydata/xarray/issues/5898, https://github.com/dcherian/xarray/commit/99bfe128066ec3ef1b297650a47e2dd0a45801a8), but I'm not sure if we intend to change that behavior eventually or not.
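
The underlying trick (from #6556) is the usual one for averaging datetimes: subtract an offset, average the resulting timedeltas, and add the offset back. A minimal numpy sketch of that idea, using datetime64 rather than cftime since the mechanics are the same:

```python
import numpy as np

times = np.array(["2000-01-01", "2000-01-03"], dtype="datetime64[ns]")

# datetime64 arrays have no .mean(), but timedelta64 arrays do, so
# average the offsets from a reference time and add the reference back.
offset = times.min()
mean_time = offset + (times - offset).mean()
```

This keeps all arithmetic in integer nanoseconds and maps cleanly onto lazy dask operations.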

  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6940/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1361886569 PR_kwDOAMm_X84-Xawt 6988 Simplify datetime64 `dt.calendar` tests spencerkclark 6628425 closed 0     2 2022-09-05T12:39:30Z 2022-09-09T09:50:55Z 2022-09-08T23:34:44Z MEMBER   0 pydata/xarray/pulls/6988

This PR simplifies the tests for the calendar attribute on the dt accessor when using a datetime64[ns]-dtype DataArray. Instead of creating random-valued datetime arrays, we can use arrays of zeros (i.e. 1970-01-01), since the values of the datetimes should not be relevant to these tests (only their type matters).

I suspect this should address #6906, because it eliminates the need to convert to datetime64[ns], though I still feel as though there is a more fundamental pandas issue lurking there.
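
The simplification relies on the fact that a zero-filled `datetime64[ns]` array is already a valid array of timestamps (all 1970-01-01), so no random values or roundtrip conversions are needed; a quick check:

```python
import numpy as np

# An array of zeros with datetime64[ns] dtype is the Unix epoch repeated,
# which is all these tests need: the dtype, not the values, matters.
times = np.zeros(3, dtype="datetime64[ns]")
```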

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6988/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1282211455 PR_kwDOAMm_X846O5KE 6717 Accommodate `OutOfBoundsTimedelta` error when decoding times spencerkclark 6628425 closed 0     1 2022-06-23T10:53:22Z 2022-06-24T18:48:54Z 2022-06-24T18:48:18Z MEMBER   0 pydata/xarray/pulls/6717

The development version of pandas raises an OutOfBoundsTimedelta error instead of an OverflowError in pd.to_timedelta if the timedelta cannot be represented with nanosecond precision. Therefore we must also be ready to catch that when decoding times.

The OutOfBoundsTimedelta exception was added in pandas version 1.1, which is prior to our current minimum version (1.2), so it should be safe to import without a version check.
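
A sketch of the compatibility pattern (the helper name and trigger value are illustrative): catching both exception types lets the same code run against older pandas, which raises `OverflowError`, and newer pandas, which raises `OutOfBoundsTimedelta`:

```python
from datetime import timedelta

import pandas as pd
from pandas.errors import OutOfBoundsTimedelta


def to_timedelta_or_none(value):
    # datetime.timedelta.max overflows int64 nanoseconds (and microseconds),
    # so every pandas version rejects it -- only the exception type differs.
    try:
        return pd.to_timedelta(value)
    except (OutOfBoundsTimedelta, OverflowError):
        return None
```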

  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6717/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1233828213 PR_kwDOAMm_X843tleA 6598 Fix overflow issue in decode_cf_datetime for dtypes <= np.uint32 spencerkclark 6628425 closed 0     0 2022-05-12T11:14:15Z 2022-05-15T15:00:44Z 2022-05-15T14:42:32Z MEMBER   0 pydata/xarray/pulls/6598
  • [x] Closes #6589
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6598/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1206481819 PR_kwDOAMm_X842VR5o 6489 Ensure datetime-like variables are left unmodified by `decode_cf_variable` spencerkclark 6628425 closed 0     1 2022-04-17T20:45:53Z 2022-04-18T18:00:49Z 2022-04-18T15:29:19Z MEMBER   0 pydata/xarray/pulls/6489

It seems rare that decode_cf_variable would be called on variables that contain datetime-like objects already, but in the case that it is, it seems best to let those variables pass through unmodified.
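
The pass-through check boils down to a dtype test; a minimal sketch (the function name is illustrative, not xarray's internal helper):

```python
import numpy as np


def already_decoded(dtype):
    # Data that is already datetime64 or timedelta64 needs no further
    # CF decoding and should be returned unmodified.
    return np.issubdtype(dtype, np.datetime64) or np.issubdtype(dtype, np.timedelta64)
```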

  • [x] Closes #6453
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6489/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1117563249 I_kwDOAMm_X85CnKlx 6204 [Bug]: cannot chunk a DataArray that originated as a coordinate spencerkclark 6628425 open 0     1 2022-01-28T15:56:44Z 2022-03-16T04:18:46Z   MEMBER      

What happened?

If I construct the following DataArray, and try to chunk its "x" coordinate, I get back a NumPy-backed DataArray:

```
In [2]: a = xr.DataArray([1, 2, 3], dims=["x"], coords=[[4, 5, 6]])

In [3]: a.x.chunk()
Out[3]:
<xarray.DataArray 'x' (x: 3)>
array([4, 5, 6])
Coordinates:
  * x        (x) int64 4 5 6
```

If I construct a copy of the `"x"` coordinate, things work as I would expect:

```
In [4]: x = xr.DataArray(a.x, dims=a.x.dims, coords=a.x.coords, name="x")

In [5]: x.chunk()
Out[5]:
<xarray.DataArray 'x' (x: 3)>
dask.array<xarray-<this-array>, shape=(3,), dtype=int64, chunksize=(3,), chunktype=numpy.ndarray>
Coordinates:
  * x        (x) int64 4 5 6
```

What did you expect to happen?

I would expect the following to happen:

```
In [2]: a = xr.DataArray([1, 2, 3], dims=["x"], coords=[[4, 5, 6]])

In [3]: a.x.chunk()
Out[3]:
<xarray.DataArray 'x' (x: 3)>
dask.array<xarray-<this-array>, shape=(3,), dtype=int64, chunksize=(3,), chunktype=numpy.ndarray>
Coordinates:
  * x        (x) int64 4 5 6
```

Minimal Complete Verifiable Example

No response

Relevant log output

No response

Anything else we need to know?

No response

Environment

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.10 | packaged by conda-forge | (default, Feb 19 2021, 15:59:12) [Clang 11.0.1 ]
python-bits: 64
OS: Darwin
OS-release: 21.2.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.10.5
libnetcdf: 4.6.3

xarray: 0.20.1
pandas: 1.3.5
numpy: 1.19.4
scipy: 1.5.4
netCDF4: 1.5.5
pydap: None
h5netcdf: 0.8.1
h5py: 2.10.0
Nio: None
zarr: 2.7.0
cftime: 1.2.1
nc_time_axis: 1.2.0
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.22.0
distributed: None
matplotlib: 3.2.2
cartopy: 0.19.0.post1
seaborn: None
numbagg: None
fsspec: 2021.06.0
cupy: None
pint: 0.15
sparse: None
setuptools: 49.6.0.post20210108
pip: 20.2.4
conda: 4.10.1
pytest: 6.0.1
IPython: 7.27.0
sphinx: 3.2.1
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6204/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
976108741 MDExOlB1bGxSZXF1ZXN0NzE3MTM0MTcw 5723 Remove use of deprecated `kind` argument in `CFTimeIndex` tests spencerkclark 6628425 closed 0     6 2021-08-21T10:49:53Z 2021-10-24T11:37:02Z 2021-10-24T09:55:33Z MEMBER   0 pydata/xarray/pulls/5723

On the topic of FutureWarnings related to indexing in pandas (#5721), I noticed another kind of warning in the CFTimeIndex tests:

```
/Users/spencer/software/xarray/xarray/tests/test_cftimeindex.py:350: FutureWarning: 'kind' argument in get_slice_bound is deprecated and will be removed in a future version. Do not pass it.
    result = index.get_slice_bound("0001", "left", kind)
```

I think it's safe to silence these by removing the kind argument from these tests; we never used it in CFTimeIndex anyway. This is sort of a follow-up to #5359.

  • [x] Passes pre-commit run --all-files
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5723/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
849771808 MDU6SXNzdWU4NDk3NzE4MDg= 5107 Converting `cftime.datetime` objects to `np.datetime64` values through `astype` spencerkclark 6628425 open 0     0 2021-04-04T01:02:55Z 2021-10-05T00:00:36Z   MEMBER      

The discussion of the use of the indexes property in #5102 got me thinking about this StackOverflow answer. For a while I have thought that my answer there isn't very satisfying, not only because it relies on this somewhat obscure indexes property, but also because it only works on dimension coordinates -- i.e. something that would be backed by an index.

Describe the solution you'd like

It would be better if we could do this conversion with astype, e.g. da.astype("datetime64[ns]"). This would allow conversion to datetime64 values for all cftime.datetime DataArrays -- dask-backed or NumPy-backed, 1D or ND -- through a fairly standard and well-known method. To my surprise, while you do not get the nice calendar-switching warning that CFTimeIndex.to_datetimeindex provides, this actually already kind of seems to work (?!):

```
In [1]: import xarray as xr

In [2]: times = xr.cftime_range("2000", periods=6, calendar="noleap")

In [3]: da = xr.DataArray(times.values.reshape((2, 3)), dims=["a", "b"])

In [4]: da.astype("datetime64[ns]")
Out[4]:
<xarray.DataArray (a: 2, b: 3)>
array([['2000-01-01T00:00:00.000000000', '2000-01-02T00:00:00.000000000',
        '2000-01-03T00:00:00.000000000'],
       ['2000-01-04T00:00:00.000000000', '2000-01-05T00:00:00.000000000',
        '2000-01-06T00:00:00.000000000']], dtype='datetime64[ns]')
Dimensions without coordinates: a, b
```

NumPy obviously does not officially support this -- nor would I expect it to -- so I would be wary of simply documenting this behavior as is. Would it be reasonable for us to modify xarray.core.duck_array_ops.astype to explicitly implement this conversion ourselves for cftime.datetime arrays? This way we could ensure this was always supported, and we could include appropriate errors for out-of-bounds times (the NumPy method currently overflows in that case) and warnings for switching from non-standard calendars.
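
A hedged sketch of what an explicit conversion inside `duck_array_ops.astype` might look like -- the helper name is hypothetical, and plain `datetime.datetime` stands in for `cftime.datetime` here (both expose `isoformat()`):

```python
from datetime import datetime

import numpy as np


def datetimes_to_datetime64ns(values):
    # Hypothetical helper: element-wise conversion via ISO strings,
    # which works for any object exposing .isoformat().
    out = np.empty(values.shape, dtype="datetime64[ns]")
    for idx, value in np.ndenumerate(values):
        out[idx] = np.datetime64(value.isoformat(), "ns")
    return out
```

Owning the loop like this is what would make it possible to raise a clear error for out-of-bounds times, instead of NumPy's silent overflow.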

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5107/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
980761192 MDExOlB1bGxSZXF1ZXN0NzIwOTM5ODE4 5744 Install development version of nc-time-axis in upstream build spencerkclark 6628425 closed 0     2 2021-08-27T00:49:55Z 2021-08-27T13:16:25Z 2021-08-27T12:49:33Z MEMBER   0 pydata/xarray/pulls/5744

I think this would be good to do anyway, but I'm also curious to see if it fixes the cftime plotting tests in #5743.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5744/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
919537074 MDExOlB1bGxSZXF1ZXN0NjY4ODUwNDM2 5463 Explicitly state datetime units in array constructors in `test_datetime_mean` spencerkclark 6628425 closed 0     1 2021-06-12T11:48:22Z 2021-06-12T13:20:33Z 2021-06-12T12:58:43Z MEMBER   0 pydata/xarray/pulls/5463

This addresses the test_datetime_mean failures reported in #5366. Pandas now requires that we make sure the units of datetime arrays are specified explicitly in array constructors: https://github.com/pandas-dev/pandas/issues/36615#issuecomment-860040013.

  • [x] Passes pre-commit run --all-files
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5463/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
919277708 MDExOlB1bGxSZXF1ZXN0NjY4NjE5Mjkx 5461 Remove `xfail` decorator from tests that depend on nc-time-axis spencerkclark 6628425 closed 0     0 2021-06-11T22:44:46Z 2021-06-12T12:57:55Z 2021-06-12T12:57:53Z MEMBER   0 pydata/xarray/pulls/5461

nc-time-axis version 1.3.0 was released today (thanks @bjlittle!), which includes various fixes for incompatibilities with the latest version of cftime. This means that our tests that depend on nc-time-axis should now pass.

  • [x] Closes #5344
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5461/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
897953514 MDExOlB1bGxSZXF1ZXN0NjQ5ODg3Mzgw 5359 Make `kind` argument in `CFTimeIndex._maybe_cast_slice_bound` optional spencerkclark 6628425 closed 0     2 2021-05-21T11:25:46Z 2021-05-23T09:47:03Z 2021-05-23T00:13:20Z MEMBER   0 pydata/xarray/pulls/5359

Pandas recently deprecated the kind argument in Index._maybe_cast_slice_bound, and removed its use in several internal calls: https://github.com/pandas-dev/pandas/pull/41378. This led to some errors in the CFTimeIndex tests in our upstream build. We never made use of it in CFTimeIndex._maybe_cast_slice_bound so the simplest fix for backwards compatibility seems to be to make it optional for now -- in previous versions of pandas it was required -- and remove it when our minimum version of pandas is at least 1.3.0.
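
The compatibility fix amounts to giving the previously required parameter a default, sketched here with a toy class (the real signature lives on `CFTimeIndex`):

```python
class ToyIndex:
    def _maybe_cast_slice_bound(self, label, side, kind=None):
        # `kind` is accepted because older pandas still passes it
        # positionally, but it is ignored -- it was never used here.
        return label
```

Both the old call style (with `kind`) and the new one (without) then work against the same method.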

  • [x] Closes #5356
  • [x] Passes pre-commit run --all-files
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5359/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
860511053 MDExOlB1bGxSZXF1ZXN0NjE3Mzc4MjUz 5180 Convert calendar to lowercase in standard calendar checks spencerkclark 6628425 closed 0     0 2021-04-17T20:44:57Z 2021-04-18T10:17:11Z 2021-04-18T10:17:08Z MEMBER   0 pydata/xarray/pulls/5180

This fixes the issue in #5093, by ensuring that we always convert the calendar to lowercase before checking if it is one of the standard calendars in the decoding and encoding process. I've been careful to test that the calendar attribute is faithfully roundtripped despite this, uppercase letters and all.

~~I think part of the reason this went unnoticed for a while was that we could still decode times like this if cftime was installed; it is only in the case when cftime was not installed that our logic failed. This is because cftime.num2date already converts the calendar to lowercase internally.~~

Upon re-reading @pont-us's issue description, while it didn't cause an error, the behavior was incorrect with cftime installed too. I updated the test to check the dtype is np.datetime64 as well.
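
The fix boils down to normalizing case before the membership test; a minimal sketch (the set below mirrors xarray's `_STANDARD_CALENDARS` constant):

```python
_STANDARD_CALENDARS = {"standard", "gregorian", "proleptic_gregorian"}


def is_standard_calendar(calendar):
    # CF calendar attributes are matched case-insensitively, so
    # "Gregorian" must be treated the same as "gregorian".
    return calendar.lower() in _STANDARD_CALENDARS
```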

  • [x] Closes #5093
  • [x] Tests added
  • [x] Passes pre-commit run --all-files
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5180/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
857414221 MDExOlB1bGxSZXF1ZXN0NjE0ODM5MDAz 5154 Catch either OutOfBoundsTimedelta or OverflowError in CFTimeIndex.__sub__ and CFTimeIndex.__rsub__ spencerkclark 6628425 closed 0     2 2021-04-14T00:24:34Z 2021-04-14T15:44:17Z 2021-04-14T13:27:10Z MEMBER   0 pydata/xarray/pulls/5154

It seems that pandas did not include the change that led to #5006 in their latest release. Perhaps it is safer to just catch either error regardless of the pandas version.

  • [x] Closes #5147
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5154/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
823907100 MDExOlB1bGxSZXF1ZXN0NTg2MjQ0MjQ5 5006 Adapt exception handling logic in CFTimeIndex.__sub__ and __rsub__ spencerkclark 6628425 closed 0     0 2021-03-07T12:28:25Z 2021-03-07T13:22:06Z 2021-03-07T13:22:03Z MEMBER   0 pydata/xarray/pulls/5006

The exception that was raised in pandas when a datetime.timedelta object outside the range that could be expressed in units of nanoseconds was passed to the pandas.TimedeltaIndex constructor changed from an OverflowError to an OutOfBoundsTimedelta error in the development version of pandas. This PR adjusts our exception handling logic in CFTimeIndex.__sub__ and CFTimeIndex.__rsub__ to account for this.

  • [x] closes #4947
Previous versions of pandas: ```python >>> import pandas as pd; from datetime import timedelta >>> pd.TimedeltaIndex([timedelta(days=300 * 365)]) Traceback (most recent call last): File "pandas/_libs/tslibs/timedeltas.pyx", line 263, in pandas._libs.tslibs.timedeltas.array_to_timedelta64 TypeError: Expected unicode, got datetime.timedelta During handling of the above exception, another exception occurred: Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/Users/spencer/Software/miniconda3/envs/xarray-tests/lib/python3.7/site-packages/pandas/core/indexes/timedeltas.py", line 157, in __new__ data, freq=freq, unit=unit, dtype=dtype, copy=copy File "/Users/spencer/Software/miniconda3/envs/xarray-tests/lib/python3.7/site-packages/pandas/core/arrays/timedeltas.py", line 216, in _from_sequence data, inferred_freq = sequence_to_td64ns(data, copy=copy, unit=unit) File "/Users/spencer/Software/miniconda3/envs/xarray-tests/lib/python3.7/site-packages/pandas/core/arrays/timedeltas.py", line 926, in sequence_to_td64ns data = objects_to_td64ns(data, unit=unit, errors=errors) File "/Users/spencer/Software/miniconda3/envs/xarray-tests/lib/python3.7/site-packages/pandas/core/arrays/timedeltas.py", line 1036, in objects_to_td64ns result = array_to_timedelta64(values, unit=unit, errors=errors) File "pandas/_libs/tslibs/timedeltas.pyx", line 268, in pandas._libs.tslibs.timedeltas.array_to_timedelta64 File "pandas/_libs/tslibs/timedeltas.pyx", line 221, in pandas._libs.tslibs.timedeltas.convert_to_timedelta64 File "pandas/_libs/tslibs/timedeltas.pyx", line 166, in pandas._libs.tslibs.timedeltas.delta_to_nanoseconds OverflowError: Python int too large to convert to C long ``` Development version of pandas: ```python >>> import pandas as pd; from datetime import timedelta >>> pd.TimedeltaIndex([timedelta(days=300 * 365)]) Traceback (most recent call last): File "pandas/_libs/tslibs/timedeltas.pyx", line 348, in 
pandas._libs.tslibs.timedeltas.array_to_timedelta64 TypeError: Expected unicode, got datetime.timedelta During handling of the above exception, another exception occurred: Traceback (most recent call last): File "pandas/_libs/tslibs/timedeltas.pyx", line 186, in pandas._libs.tslibs.timedeltas.delta_to_nanoseconds OverflowError: Python int too large to convert to C long The above exception was the direct cause of the following exception: Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/Users/spencer/software/pandas/pandas/core/indexes/timedeltas.py", line 161, in __new__ tdarr = TimedeltaArray._from_sequence_not_strict( File "/Users/spencer/software/pandas/pandas/core/arrays/timedeltas.py", line 270, in _from_sequence_not_strict data, inferred_freq = sequence_to_td64ns(data, copy=copy, unit=unit) File "/Users/spencer/software/pandas/pandas/core/arrays/timedeltas.py", line 970, in sequence_to_td64ns data = objects_to_td64ns(data, unit=unit, errors=errors) File "/Users/spencer/software/pandas/pandas/core/arrays/timedeltas.py", line 1079, in objects_to_td64ns result = array_to_timedelta64(values, unit=unit, errors=errors) File "pandas/_libs/tslibs/timedeltas.pyx", line 362, in pandas._libs.tslibs.timedeltas.array_to_timedelta64 File "pandas/_libs/tslibs/timedeltas.pyx", line 353, in pandas._libs.tslibs.timedeltas.array_to_timedelta64 File "pandas/_libs/tslibs/timedeltas.pyx", line 306, in pandas._libs.tslibs.timedeltas.convert_to_timedelta64 File "pandas/_libs/tslibs/timedeltas.pyx", line 189, in pandas._libs.tslibs.timedeltas.delta_to_nanoseconds pandas._libs.tslibs.conversion.OutOfBoundsTimedelta: Python int too large to convert to C long ```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5006/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
812922830 MDExOlB1bGxSZXF1ZXN0NTc3MTU5OTY3 4939 Add DataArrayCoarsen.reduce and DatasetCoarsen.reduce methods spencerkclark 6628425 closed 0     2 2021-02-21T18:49:47Z 2021-02-23T16:01:30Z 2021-02-23T16:01:27Z MEMBER   0 pydata/xarray/pulls/4939

As suggested by @dcherian, this was quite similar to rolling; it was useful in particular to follow how the tests were implemented there.

  • [x] Closes #3741
  • [x] Tests added
  • [x] Passes pre-commit run --all-files
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [x] New functions/methods are listed in api.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4939/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
777751284 MDExOlB1bGxSZXF1ZXN0NTQ3OTYxNzk1 4758 Ensure maximum accuracy when encoding and decoding cftime.datetime values spencerkclark 6628425 closed 0     1 2021-01-04T00:47:32Z 2021-02-10T21:52:16Z 2021-02-10T21:44:26Z MEMBER   0 pydata/xarray/pulls/4758
  • [x] Closes #4097
  • [x] Tests added
  • [x] Passes isort . && black . && mypy . && flake8
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst

Following up on #4684, this PR makes changes to our encoding / decoding process such that cftime.datetime objects can be roundtripped exactly. In the process, because it made the tests cleaner to define, I added cftime offsets for millisecond and microsecond frequency as well.

As I note in the what's new, exact roundtripping requires cftime of at least version 1.4.1, which included improvements to cftime.num2date (https://github.com/Unidata/cftime/pull/176, https://github.com/Unidata/cftime/pull/188) and cftime.date2num (https://github.com/Unidata/cftime/pull/178, https://github.com/Unidata/cftime/pull/225).

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4758/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
764440458 MDExOlB1bGxSZXF1ZXN0NTM4NTAzNDk3 4684 Ensure maximum accuracy when encoding and decoding np.datetime64[ns] values spencerkclark 6628425 closed 0     3 2020-12-12T21:43:57Z 2021-02-07T23:30:41Z 2021-01-03T23:39:04Z MEMBER   0 pydata/xarray/pulls/4684
  • [x] Closes #4045
  • [x] Tests added
  • [x] Passes isort . && black . && mypy . && flake8
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst

This PR cleans up the logic used to encode and decode times with pandas so that by default we use int64 values in both directions for all precisions down to nanosecond. If a user specifies an encoding (or a file is read in) such that float values would be required, things still work as they did before. I do this mainly by following the approach I described here: https://github.com/pydata/xarray/issues/4045#issuecomment-626257580.

In the process of doing this I made a few changes to coding.times._decode_datetime_with_pandas: - I removed the checks on the minimum and maximum dates to decode, as the issue those checks were imposed for (#975) was fixed in pandas way back in 2016 (https://github.com/pandas-dev/pandas/issues/14068). - I used an alternate approach for fixing #2002, which allows us to continue to use the optimization made in #1414 without having to cast the input array to a float dtype first.

Note this will change the default units that are chosen for encoding times in some instances -- previously we would never default to anything more precise than seconds -- but I think this change is for the better.
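
The payoff of staying in int64 is exactness down to the nanosecond; a quick numpy illustration of the encode/decode roundtrip:

```python
import numpy as np

times = np.array(["2000-01-01T00:00:00.000000001"], dtype="datetime64[ns]")
reference = np.datetime64("2000-01-01", "ns")

# Integer nanoseconds since the reference roundtrip exactly; a float
# encoding in, say, days would round this 1 ns away.
encoded = (times - reference).astype(np.int64)
decoded = reference + encoded.astype("timedelta64[ns]")
```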

cc: @aldanor

@hmaarrfk this overlaps a little with your work in #4400, so I'm giving you credit here too (I hope you don't mind!).

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4684/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
802737682 MDExOlB1bGxSZXF1ZXN0NTY4ODE0NDY5 4871 Modify _encode_datetime_with_cftime for compatibility with cftime > 1.4.0 spencerkclark 6628425 closed 0     2 2021-02-06T16:34:02Z 2021-02-07T23:12:33Z 2021-02-07T23:12:30Z MEMBER   0 pydata/xarray/pulls/4871
  • [x] Closes #4870
  • [x] Tests added
  • [x] Passes pre-commit run --all-files
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4871/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
802734042 MDU6SXNzdWU4MDI3MzQwNDI= 4870 Time encoding error associated with cftime > 1.4.0 spencerkclark 6628425 closed 0     0 2021-02-06T16:15:20Z 2021-02-07T23:12:30Z 2021-02-07T23:12:30Z MEMBER      

As of cftime > 1.4.0, the return type of cftime.date2num can either be an integer or float. An integer dtype is used if the times can all be encoded exactly with the provided units; otherwise a float dtype is used. This causes problems in our current encoding pipeline, because we call cftime.date2num on dates one at a time through np.vectorize, and np.vectorize infers the type of the full returned array based on the result of the first function evaluation. If the first result is an integer, then the full array will be assumed to have an integer dtype, and any values that should be floats are cast as integers.

What happened:

```
In [1]: import cftime; import numpy as np; import xarray as xr

In [2]: times = np.array([cftime.DatetimeGregorian(2000, 1, 1), cftime.DatetimeGregorian(2000, 1, 1, 1)])

In [3]: xr.coding.times._encode_datetime_with_cftime(times, "days since 2000-01-01", calendar="gregorian")
Out[3]: array([0, 0])
```

What you expected to happen:

```
In [3]: xr.coding.times._encode_datetime_with_cftime(times, "days since 2000-01-01", calendar="gregorian")
Out[3]: array([0. , 0.04166667])
```

A solution here would be to encode the times with a list comprehension instead, and cast the final result to an array, in which case NumPy infers the dtype in a more sensible way.
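
The dtype-inference pitfall and the proposed fix are easy to reproduce without cftime at all, since they are purely `np.vectorize` behavior (`encode` below is an illustrative stand-in for `cftime.date2num`):

```python
import numpy as np


def encode(hours):
    # Stand-in for cftime.date2num: exact encodings come back as ints,
    # inexact ones as floats.
    days = hours / 24
    return int(days) if days == int(days) else days


values = np.array([0, 1])  # midnight and 1:00 on the reference day

# np.vectorize infers the output dtype from the first call (an int here),
# so the float 1/24 for the second element is silently truncated to 0.
vectorized = np.vectorize(encode)(values)

# A list comprehension lets NumPy see all results before picking a dtype.
comprehension = np.array([encode(v) for v in values])
```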

Environment:

Output of <tt>xr.show_versions()</tt> ``` INSTALLED VERSIONS ------------------ commit: None python: 3.7.3 | packaged by conda-forge | (default, Jul 1 2019, 14:38:56) [Clang 4.0.1 (tags/RELEASE_401/final)] python-bits: 64 OS: Darwin OS-release: 20.2.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.10.5 libnetcdf: 4.6.2 xarray: 0.16.2.dev175+g8cc34cb4.d20210201 pandas: 1.1.3 numpy: 1.19.1 scipy: 1.2.1 netCDF4: 1.5.1.2 pydap: installed h5netcdf: 0.7.4 h5py: 2.9.0 Nio: None zarr: 2.3.2 cftime: 1.4.1 nc_time_axis: 1.1.1.dev5+g531dd0d PseudoNetCDF: None rasterio: 1.0.25 cfgrib: 0.9.7.1 iris: None bottleneck: 1.2.1 dask: 2.11.0 distributed: 2.11.0 matplotlib: 3.3.2 cartopy: None seaborn: 0.9.0 numbagg: installed pint: None setuptools: 51.0.0.post20201207 pip: 19.2.2 conda: None pytest: 5.0.1 IPython: 7.10.1 sphinx: 3.0.4 ```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4870/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
360420464 MDU6SXNzdWUzNjA0MjA0NjQ= 2416 Indicate calendar type in CFTimeIndex repr spencerkclark 6628425 closed 0     5 2018-09-14T19:07:04Z 2020-11-20T01:00:41Z 2020-07-23T10:42:29Z MEMBER      

Currently CFTimeIndex uses the default repr it inherits from pandas.Index. This just displays a potentially-truncated version of the values in the index, along with the index's data type and length, e.g.:

```
CFTimeIndex([2000-01-01 00:00:00, 2000-01-02 00:00:00, 2000-01-03 00:00:00,
             2000-01-04 00:00:00, 2000-01-05 00:00:00, 2000-01-06 00:00:00,
             2000-01-07 00:00:00, 2000-01-08 00:00:00, 2000-01-09 00:00:00,
             2000-01-10 00:00:00,
             ...
             2000-12-22 00:00:00, 2000-12-23 00:00:00, 2000-12-24 00:00:00,
             2000-12-25 00:00:00, 2000-12-26 00:00:00, 2000-12-27 00:00:00,
             2000-12-28 00:00:00, 2000-12-29 00:00:00, 2000-12-30 00:00:00,
             2000-12-31 00:00:00],
            dtype='object', length=366)
```

It would be nice if the repr also included an indication of the calendar type of the index, since different indexes could have different calendar types. For example:

```
CFTimeIndex([2000-01-01 00:00:00, 2000-01-02 00:00:00, 2000-01-03 00:00:00,
             2000-01-04 00:00:00, 2000-01-05 00:00:00, 2000-01-06 00:00:00,
             2000-01-07 00:00:00, 2000-01-08 00:00:00, 2000-01-09 00:00:00,
             2000-01-10 00:00:00,
             ...
             2000-12-22 00:00:00, 2000-12-23 00:00:00, 2000-12-24 00:00:00,
             2000-12-25 00:00:00, 2000-12-26 00:00:00, 2000-12-27 00:00:00,
             2000-12-28 00:00:00, 2000-12-29 00:00:00, 2000-12-30 00:00:00,
             2000-12-31 00:00:00],
            dtype='object', length=366, calendar='proleptic_gregorian')
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2416/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
723874489 MDExOlB1bGxSZXF1ZXN0NTA1MzY3NDMy 4517 Eliminate use of calendar-naive cftime objects spencerkclark 6628425 closed 0     1 2020-10-18T00:24:22Z 2020-10-19T15:21:12Z 2020-10-19T15:20:37Z MEMBER   0 pydata/xarray/pulls/4517

This is a minor cleanup to remove our use of calendar-naive cftime datetime objects (it just occurs in one test). The behavior of the cftime.datetime constructor is set to change in Unidata/cftime#202. By default it will create a calendar-aware datetime with a Gregorian calendar, instead of a calendar-naive datetime. In xarray we don't have a real need to use calendar-naive datetimes, so I think it's just best to remove our use of them.

  • [x] Passes isort . && black . && mypy . && flake8
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4517/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
700590739 MDExOlB1bGxSZXF1ZXN0NDg2MTk3NTI0 4418 Add try/except logic to handle renaming of cftime datetime base class spencerkclark 6628425 closed 0     1 2020-09-13T15:25:53Z 2020-09-19T13:30:02Z 2020-09-19T13:29:14Z MEMBER   0 pydata/xarray/pulls/4418

cftime is planning on renaming the base class for its datetime objects from cftime.datetime to cftime.datetime_base. See discussion in https://github.com/Unidata/cftime/issues/198 and https://github.com/Unidata/cftime/pull/199. This PR adds the appropriate logic in xarray to handle this in a backwards-compatible way.
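The backwards-compatible lookup amounts to a try/except over the two attribute names. This sketch uses a SimpleNamespace stand-in for the cftime module so it runs without cftime installed; only the lookup pattern matters:

```python
from types import SimpleNamespace

# Stand-in for the cftime module; in real code this would be
# ``import cftime``.
cftime = SimpleNamespace(datetime=type("datetime", (), {}))

try:
    # Newer cftime (per the rename proposed in Unidata/cftime#199) would
    # expose the base class as ``datetime_base``.
    CFTimeBase = cftime.datetime_base
except AttributeError:
    # Older cftime: the base class is ``cftime.datetime`` itself.
    CFTimeBase = cftime.datetime
```
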

In the documentation in places where we refer to :py:class:`cftime.datetime` objects, I have modified things to read ``cftime`` datetime. Being more generic is probably better in any case, as in most instances we do not explicitly mean that the base class can be used, only subclasses of the base class.

cc: @jswhit

  • [x] Passes isort . && black . && mypy . && flake8
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4418/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
431970156 MDU6SXNzdWU0MzE5NzAxNTY= 2886 Expose use_cftime option in open_zarr spencerkclark 6628425 closed 0     7 2019-04-11T11:24:48Z 2020-09-02T15:19:32Z 2020-09-02T15:19:32Z MEMBER      

use_cftime was recently added as an option to decode_cf and open_dataset to give users a little more control over how times are decoded (#2759). It would be good if it was also available for open_zarr. This perhaps doesn't have quite the importance, because open_zarr only works for single data stores, so there is no risk of decoding times to different types (e.g. as there was for open_mfdataset, #1263); however, it would still be nice to be able to silence serialization warnings that result from decoding times to cftime objects in some instances, e.g. #2754.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2886/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
679552135 MDExOlB1bGxSZXF1ZXN0NDY4Mjk3Nzc3 4343 Allow for datetime strings formatted following the default cftime format in cftime_range and partial datetime string indexing spencerkclark 6628425 closed 0     1 2020-08-15T11:55:11Z 2020-08-17T23:27:10Z 2020-08-17T23:27:07Z MEMBER   0 pydata/xarray/pulls/4343

This PR adds support for datetime strings formatted following the default cftime format (YYYY-MM-DD hh:mm:ss) in cftime_range and partial datetime string indexing.

  • [x] Closes #4337
  • [x] Tests added
  • [x] Passes isort . && black . && mypy . && flake8
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4343/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
679568161 MDExOlB1bGxSZXF1ZXN0NDY4MzExMDk3 4344 Fix overflow-related bug in computing means of cftime.datetime arrays spencerkclark 6628425 closed 0     1 2020-08-15T13:08:33Z 2020-08-15T20:05:29Z 2020-08-15T20:05:23Z MEMBER   0 pydata/xarray/pulls/4344

Going through pandas.TimedeltaIndex within duck_array_ops._to_pytimedelta leads to overflow problems (presumably it casts to a "timedelta64[ns]" type internally). This PR updates the logic to directly use NumPy to do the casting, first to "timedelta64[us]", then to datetime.timedelta.
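The overflow boundary can be seen with plain NumPy. The cast below mirrors the approach of the fix (object array to "timedelta64[us]", then back to datetime.timedelta); the variable names are illustrative only:

```python
import datetime
import numpy as np

# ~600 years exceeds the ±292-year range of "timedelta64[ns]" but is far
# inside the ~±2.9e5-year range of "timedelta64[us]".
deltas = np.array([datetime.timedelta(days=600 * 365)], dtype=object)

as_us = deltas.astype("timedelta64[us]")  # exact, no overflow
back = as_us.astype(object)               # datetime.timedelta objects again
```
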

  • [x] Closes #4341
  • [x] Tests added
  • [x] Passes isort . && black . && mypy . && flake8
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4344/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
665779368 MDExOlB1bGxSZXF1ZXN0NDU2NzQ5NTYw 4272 Un-xfail cftime plotting tests spencerkclark 6628425 closed 0     1 2020-07-26T13:23:07Z 2020-07-27T19:19:38Z 2020-07-26T19:04:55Z MEMBER   0 pydata/xarray/pulls/4272

Closes #4265

The change that broke these tests in NumPy master has now been relaxed to trigger a DeprecationWarning (https://github.com/numpy/numpy/pull/16943).

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4272/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
637348647 MDExOlB1bGxSZXF1ZXN0NDMzMzU1NTM1 4148 Remove outdated note from DatetimeAccessor docstring spencerkclark 6628425 closed 0     1 2020-06-11T22:02:43Z 2020-06-11T23:24:03Z 2020-06-11T23:23:28Z MEMBER   0 pydata/xarray/pulls/4148

Noticed this today. This note in the DatetimeAccessor docstring is no longer relevant; these fields have been calendar-aware for some time.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4148/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
594436316 MDExOlB1bGxSZXF1ZXN0Mzk5MDYzNDQ1 3935 Add a days_in_month accessor to CFTimeIndex spencerkclark 6628425 closed 0     3 2020-04-05T12:38:50Z 2020-04-06T14:02:58Z 2020-04-06T14:02:11Z MEMBER   0 pydata/xarray/pulls/3935
  • [x] Tests added
  • [x] Passes isort -rc . && black . && mypy . && flake8
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API

This adds a days_in_month accessor to CFTimeIndex, which allows for easy computation of monthly time weights for non-standard calendars: ``` In [1]: import xarray as xr

In [2]: times = xr.cftime_range("2000", periods=24, freq="MS", calendar="noleap")

In [3]: da = xr.DataArray(times, dims=["time"])

In [4]: da.dt.days_in_month Out[4]: <xarray.DataArray 'days_in_month' (time: 24)> array([31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]) Coordinates: * time (time) object 2000-01-01 00:00:00 ... 2001-12-01 00:00:00 ```

This simplifies the "Calculating Seasonal Averages from Timeseries of Monthly Means" example @jhamman wrote for the docs a while back, which I've taken the liberty of updating.

The ability to add this feature to xarray is thanks in large part to @huard, who added a daysinmonth attribute to cftime.datetime objects late last year: https://github.com/Unidata/cftime/pull/138.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3935/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
593353601 MDExOlB1bGxSZXF1ZXN0Mzk4MTQ5ODY5 3930 Only fail certain use_cftime backend tests if a specific warning occurs spencerkclark 6628425 closed 0     1 2020-04-03T12:39:47Z 2020-04-03T23:22:29Z 2020-04-03T19:35:18Z MEMBER   0 pydata/xarray/pulls/3930
  • [x] Closes #3928
  • [x] Passes isort -rc . && black . && mypy . && flake8

The warning we want to avoid in these tests is:

```
SerializationWarning: Unable to decode time axis into full numpy.datetime64 objects, continuing using cftime.datetime objects instead, reason: dates out of range
  dtype = _decode_cf_datetime_dtype(data, units, calendar, self.use_cftime)
```

Other warnings could occur, but shouldn't cause the tests to fail. This modifies these tests to only fail if a warning with this message occurs.
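The message-based filtering can be sketched with the stdlib warnings machinery (hypothetical helper; not the actual test code):

```python
import warnings

TARGET = "Unable to decode time axis"

def has_target_warning(record):
    # True only if a recorded warning matches the targeted message;
    # unrelated warnings (e.g. DeprecationWarning) are ignored.
    return any(TARGET in str(w.message) for w in record)

with warnings.catch_warnings(record=True) as record:
    warnings.simplefilter("always")
    # An unrelated warning, like the one netcdf4-python emits:
    warnings.warn("tostring() is deprecated. Use tobytes() instead.",
                  DeprecationWarning)

# Only the targeted SerializationWarning message would fail the test.
unrelated_only = not has_target_warning(record)
```
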

The warning that is occurring seems to be stemming from within the netcdf4-python library:

```
DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
  attributes = {k: var.getncattr(k) for k in var.ncattrs()}
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3930/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
589581955 MDExOlB1bGxSZXF1ZXN0Mzk1MDk5MTE2 3907 Un-xfail test_dayofyear_after_cftime_range spencerkclark 6628425 closed 0     1 2020-03-28T13:55:50Z 2020-03-28T14:26:49Z 2020-03-28T14:26:46Z MEMBER   0 pydata/xarray/pulls/3907

With Unidata/cftime#163 merged, this test, which we temporarily xfailed in #3885, should pass with cftime master.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3907/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
585494486 MDExOlB1bGxSZXF1ZXN0MzkxODU1ODAz 3874 Re-enable tests xfailed in #3808 and fix new CFTimeIndex failures due to upstream changes spencerkclark 6628425 closed 0     4 2020-03-21T12:57:49Z 2020-03-23T00:29:58Z 2020-03-22T22:19:42Z MEMBER   0 pydata/xarray/pulls/3874

xref: #3869

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3874/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
563202971 MDExOlB1bGxSZXF1ZXN0MzczNjU1Mzg4 3764 Fix CFTimeIndex-related errors stemming from updates in pandas spencerkclark 6628425 closed 0     10 2020-02-11T13:22:04Z 2020-03-15T14:58:26Z 2020-03-13T06:14:41Z MEMBER   0 pydata/xarray/pulls/3764
  • [x] Closes #3751
  • [x] Tests added
  • [x] Passes isort -rc . && black . && mypy . && flake8
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API

This fixes the errors identified when #3751 was created by allowing one to subtract a pd.Index of cftime.datetime objects from a CFTimeIndex.

Some new errors have come up too (not associated with any updates I made here), which I still need to work on identifying the source of:

```
____ test_indexing_in_series_getitem[365_day] _____

series = 0001-01-01 00:00:00 1 0001-02-01 00:00:00 2 0002-01-01 00:00:00 3 0002-02-01 00:00:00 4 dtype: int64 index = CFTimeIndex([0001-01-01 00:00:00, 0001-02-01 00:00:00, 0002-01-01 00:00:00, 0002-02-01 00:00:00], dtype='object') scalar_args = [cftime.DatetimeNoLeap(0001-01-01 00:00:00)] range_args = ['0001', slice('0001-01-01', '0001-12-30', None), slice(None, '0001-12-30', None), slice(cftime.DatetimeNoLeap(0001-01...:00), cftime.DatetimeNoLeap(0001-12-30 00:00:00), None), slice(None, cftime.DatetimeNoLeap(0001-12-30 00:00:00), None)]

@requires_cftime
def test_indexing_in_series_getitem(series, index, scalar_args, range_args):
    for arg in scalar_args:
      assert series[arg] == 1

test_cftimeindex.py:597:


../../../pandas/pandas/core/series.py:884: in __getitem__
    return self._get_with(key)


self = 0001-01-01 00:00:00 1 0001-02-01 00:00:00 2 0002-01-01 00:00:00 3 0002-02-01 00:00:00 4 dtype: int64 key = cftime.DatetimeNoLeap(0001-01-01 00:00:00)

def _get_with(self, key):
    # other: fancy integer or otherwise
    if isinstance(key, slice):
        # _convert_slice_indexer to determing if this slice is positional
        #  or label based, and if the latter, convert to positional
        slobj = self.index._convert_slice_indexer(key, kind="getitem")
        return self._slice(slobj)
    elif isinstance(key, ABCDataFrame):
        raise TypeError(
            "Indexing a Series with DataFrame is not "
            "supported, use the appropriate DataFrame column"
        )
    elif isinstance(key, tuple):
        try:
            return self._get_values_tuple(key)
        except ValueError:
            # if we don't have a MultiIndex, we may still be able to handle
            #  a 1-tuple.  see test_1tuple_without_multiindex
            if len(key) == 1:
                key = key[0]
                if isinstance(key, slice):
                    return self._get_values(key)
            raise

    if not isinstance(key, (list, np.ndarray, ExtensionArray, Series, Index)):
      key = list(key)

E TypeError: 'cftime._cftime.DatetimeNoLeap' object is not iterable

../../../pandas/pandas/core/series.py:911: TypeError
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3764/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
569408125 MDExOlB1bGxSZXF1ZXN0Mzc4NjQyMTE5 3792 Enable pandas-style rounding of cftime.datetime objects spencerkclark 6628425 closed 0     1 2020-02-22T23:26:50Z 2020-03-02T12:03:47Z 2020-03-02T09:41:20Z MEMBER   0 pydata/xarray/pulls/3792
  • [x] Tests added
  • [x] Passes isort -rc . && black . && mypy . && flake8
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API

This is particularly useful for removing microsecond noise that can sometimes be added from decoding times via cftime.num2date, though also applies more generally. The methods used here for rounding dates in the integer domain are copied from pandas.

On a somewhat more internal note, this adds an asi8 property to CFTimeIndex, which encodes the dates as integer values representing microseconds since 1970-01-01; this encoding is made exact via the exact_cftime_datetime_difference function. It's possible this could be useful in other contexts.
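Once dates are in the integer domain (asi8-style microseconds since 1970-01-01), flooring and ceiling reduce to modular arithmetic. This is a minimal sketch with hypothetical helper names; the actual methods in this PR are copied from pandas:

```python
import numpy as np

US_PER_DAY = 24 * 60 * 60 * 10**6  # microseconds per day

def floor_int(values, unit):
    # Round integers down to the nearest multiple of ``unit``.
    return values - values % unit

def ceil_int(values, unit):
    # Round integers up to the nearest multiple of ``unit``.
    return values + (-values) % unit

# e.g. asi8-style values floored/ceiled to an 11-unit frequency:
values = np.array([0, 25, 50])
floored = floor_int(values, 11)   # multiples of 11 at or below each value
ceiled = ceil_int(values, 11)     # multiples of 11 at or above each value
```
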

Some examples:

```
In [1]: import xarray as xr

In [2]: times = xr.cftime_range("2000", periods=5, freq="17D")

In [3]: time = xr.DataArray(times, dims=["time"], name="time")

In [4]: time.dt.floor("11D")
Out[4]:
<xarray.DataArray 'floor' (time: 5)>
array([cftime.DatetimeGregorian(1999-12-31 00:00:00),
       cftime.DatetimeGregorian(2000-01-11 00:00:00),
       cftime.DatetimeGregorian(2000-02-02 00:00:00),
       cftime.DatetimeGregorian(2000-02-13 00:00:00),
       cftime.DatetimeGregorian(2000-03-06 00:00:00)], dtype=object)
Coordinates:
  * time     (time) object 2000-01-01 00:00:00 ... 2000-03-09 00:00:00

In [5]: time.dt.ceil("11D")
Out[5]:
<xarray.DataArray 'ceil' (time: 5)>
array([cftime.DatetimeGregorian(2000-01-11 00:00:00),
       cftime.DatetimeGregorian(2000-01-22 00:00:00),
       cftime.DatetimeGregorian(2000-02-13 00:00:00),
       cftime.DatetimeGregorian(2000-02-24 00:00:00),
       cftime.DatetimeGregorian(2000-03-17 00:00:00)], dtype=object)
Coordinates:
  * time     (time) object 2000-01-01 00:00:00 ... 2000-03-09 00:00:00

In [6]: time.dt.round("11D")
Out[6]:
<xarray.DataArray 'round' (time: 5)>
array([cftime.DatetimeGregorian(1999-12-31 00:00:00),
       cftime.DatetimeGregorian(2000-01-22 00:00:00),
       cftime.DatetimeGregorian(2000-02-02 00:00:00),
       cftime.DatetimeGregorian(2000-02-24 00:00:00),
       cftime.DatetimeGregorian(2000-03-06 00:00:00)], dtype=object)
Coordinates:
  * time     (time) object 2000-01-01 00:00:00 ... 2000-03-09 00:00:00
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3792/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
572709020 MDExOlB1bGxSZXF1ZXN0MzgxMzUxODE5 3808 xfail tests due to #3751 spencerkclark 6628425 closed 0     1 2020-02-28T11:52:15Z 2020-02-28T13:45:33Z 2020-02-28T13:39:58Z MEMBER   0 pydata/xarray/pulls/3808

@max-sixty @shoyer -- I agree we've let these linger far too long. This should hopefully get things back to being green.

  • [x] Passes isort -rc . && black . && mypy . && flake8
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3808/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
539648897 MDU6SXNzdWU1Mzk2NDg4OTc= 3641 interp with long cftime coordinates raises an error spencerkclark 6628425 closed 0     8 2019-12-18T12:23:16Z 2020-01-26T14:10:37Z 2020-01-26T14:10:37Z MEMBER      

MCVE Code Sample

```
In [1]: import xarray as xr

In [2]: times = xr.cftime_range('0001', periods=3, freq='500Y')

In [3]: da = xr.DataArray(range(3), dims=['time'], coords=[times])

In [4]: da.interp(time=['0002-05-01'])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-f781cb4d500e> in <module>
----> 1 da.interp(time=['0002-05-01'])

~/Software/miniconda3/envs/xarray-tests/lib/python3.7/site-packages/xarray/core/dataarray.py in interp(self, coords, method, assume_sorted, kwargs, coords_kwargs)
   1353             kwargs=kwargs,
   1354             assume_sorted=assume_sorted,
-> 1355             coords_kwargs,
   1356         )
   1357         return self._from_temp_dataset(ds)

~/Software/miniconda3/envs/xarray-tests/lib/python3.7/site-packages/xarray/core/dataset.py in interp(self, coords, method, assume_sorted, kwargs, coords_kwargs)
   2565                 if k in var.dims
   2566             }
-> 2567             variables[name] = missing.interp(var, var_indexers, method, kwargs)
   2568         elif all(d not in indexers for d in var.dims):
   2569             # keep unrelated object array

~/Software/miniconda3/envs/xarray-tests/lib/python3.7/site-packages/xarray/core/missing.py in interp(var, indexes_coords, method, *kwargs)
    607     new_dims = broadcast_dims + list(destination[0].dims)
    608     interped = interp_func(
--> 609         var.transpose(original_dims).data, x, destination, method, kwargs
    610     )
    611

~/Software/miniconda3/envs/xarray-tests/lib/python3.7/site-packages/xarray/core/missing.py in interp_func(var, x, new_x, method, kwargs)
    683     )
    684
--> 685     return _interpnd(var, x, new_x, func, kwargs)
    686
    687

~/Software/miniconda3/envs/xarray-tests/lib/python3.7/site-packages/xarray/core/missing.py in _interpnd(var, x, new_x, func, kwargs)
    698
    699 def _interpnd(var, x, new_x, func, kwargs):
--> 700     x, new_x = _floatize_x(x, new_x)
    701
    702     if len(x) == 1:

~/Software/miniconda3/envs/xarray-tests/lib/python3.7/site-packages/xarray/core/missing.py in _floatize_x(x, new_x)
    556         # represented by float.
    557         xmin = x[i].values.min()
--> 558         x[i] = x[i]._to_numeric(offset=xmin, dtype=np.float64)
    559         new_x[i] = new_x[i]._to_numeric(offset=xmin, dtype=np.float64)
    560     return x, new_x

~/Software/miniconda3/envs/xarray-tests/lib/python3.7/site-packages/xarray/core/variable.py in _to_numeric(self, offset, datetime_unit, dtype)
   2001         """
   2002         numeric_array = duck_array_ops.datetime_to_numeric(
-> 2003             self.data, offset, datetime_unit, dtype
   2004         )
   2005         return type(self)(self.dims, numeric_array, self._attrs)

~/Software/miniconda3/envs/xarray-tests/lib/python3.7/site-packages/xarray/core/duck_array_ops.py in datetime_to_numeric(array, offset, datetime_unit, dtype)
    410     if array.dtype.kind in "mM":
    411         return np.where(isnull(array), np.nan, array.astype(dtype))
--> 412     return array.astype(dtype)
    413
    414

TypeError: float() argument must be a string or a number, not 'datetime.timedelta'
```

Problem Description

In principle we should be able to get this to work. The issue stems from the following logic in datetime_to_numeric: https://github.com/pydata/xarray/blob/45fd0e63f43cf313b022a33aeec7f0f982e1908b/xarray/core/duck_array_ops.py#L402-L404 Here we are relying on pandas to convert an array of datetime.timedelta objects to an array with dtype timedelta64[ns]. If the array of datetime.timedelta objects cannot be safely converted to timedelta64[ns] (e.g. due to an integer overflow) then this line is silently a no-op which leads to the error downstream at the dtype conversion step. This is my fault originally for suggesting this approach, https://github.com/pydata/xarray/pull/2668#discussion_r247271576.

~~To solve this I think we'll need to write our own logic to convert datetime.timedelta objects to numeric values instead of relying on pandas/NumPy.~~ (as @huard notes we should be able to use NumPy directly here for the conversion). We should not consider ourselves beholden to using nanosecond resolution for a couple of reasons: 1. datetime.timedelta objects do not natively support nanosecond resolution; they have microsecond resolution natively, which corresponds with a NumPy timedelta range of +/- 2.9e5 years. 2. One motivation/use-case for cftime dates is that they can represent long time periods that cannot be represented using a standard DatetimeIndex. We should do everything we can to support this with a CFTimeIndex.
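For instance, datetime.timedelta objects can be converted to numeric microseconds with pure-Python arithmetic and no nanosecond intermediate; this is an illustrative helper, not xarray's actual code:

```python
import datetime
import numpy as np

def timedelta_to_microseconds(deltas, dtype=np.float64):
    # Dividing two datetime.timedelta objects yields a float number of
    # microseconds exactly, even for spans far beyond the ±292-year
    # range of "timedelta64[ns]".
    micros = [td / datetime.timedelta(microseconds=1) for td in deltas]
    return np.array(micros, dtype=dtype)

spans = [datetime.timedelta(days=1), datetime.timedelta(days=500 * 365)]
result = timedelta_to_microseconds(spans)
```
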

@huard @dcherian this is an important issue we'll need to solve to be able to use a fixed offset for cftime dates for an application like polyfit/polyval.

xref: #3349 and #3631.

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.7.3 | packaged by conda-forge | (default, Jul 1 2019, 14:38:56) [Clang 4.0.1 (tags/RELEASE_401/final)] python-bits: 64 OS: Darwin OS-release: 19.0.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.10.5 libnetcdf: None xarray: 0.14.1 pandas: 0.25.0 numpy: 1.17.0 scipy: 1.3.1 netCDF4: None pydap: installed h5netcdf: 0.7.4 h5py: 2.9.0 Nio: None zarr: 2.3.2 cftime: 1.0.4.2 nc_time_axis: None PseudoNetCDF: None rasterio: 1.0.25 cfgrib: 0.9.7.1 iris: None bottleneck: 1.2.1 dask: 2.9.0+2.gd0daa5bc distributed: 2.9.0 matplotlib: 3.1.1 cartopy: None seaborn: 0.9.0 numbagg: installed setuptools: 42.0.2.post20191201 pip: 19.2.2 conda: None pytest: 5.0.1 IPython: 7.10.1 sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3641/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
541448514 MDExOlB1bGxSZXF1ZXN0MzU2MDg0NDg3 3652 Use encoding['dtype'] over data.dtype when possible within CFMaskCoder.encode spencerkclark 6628425 closed 0     0 2019-12-22T13:05:18Z 2020-01-15T15:23:41Z 2020-01-15T15:22:30Z MEMBER   0 pydata/xarray/pulls/3652

This uses encoding['dtype'] over data.dtype when possible within CFMaskCoder.encode to decide what type to cast encoding['missing_value'] or encoding['_FillValue'] to; this is one way to fix #3624. Another possible way would be to ensure the times have the proper dtype coming from CFDatetimeCoder.encode. I'm not sure what is the preferred solution.

cc: @andersy005, @spencerahill

  • [x] Closes #3624
  • [x] Tests added
  • [x] Passes black . && mypy . && flake8
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3652/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
538068264 MDU6SXNzdWU1MzgwNjgyNjQ= 3624 Issue serializing arrays of times with certain dtype and _FillValue encodings spencerkclark 6628425 closed 0     0 2019-12-15T15:44:08Z 2020-01-15T15:22:30Z 2020-01-15T15:22:30Z MEMBER      

MCVE Code Sample

```
In [1]: import numpy as np; import pandas as pd; import xarray as xr

In [2]: times = pd.date_range('2000', periods=3)

In [3]: da = xr.DataArray(times, dims=['a'], coords=[[1, 2, 3]], name='foo')

In [4]: da.encoding['_FillValue'] = 1.0e20

In [5]: da.encoding['dtype'] = np.dtype('float64')

In [6]: da.to_dataset().to_netcdf('test.nc')
---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
<ipython-input-6-cbc6b2cfdf9a> in <module>
----> 1 da.to_dataset().to_netcdf('test.nc')

~/Software/xarray/xarray/core/dataset.py in to_netcdf(self, path, mode, format, group, engine, encoding, unlimited_dims, compute, invalid_netcdf)
   1548             unlimited_dims=unlimited_dims,
   1549             compute=compute,
-> 1550             invalid_netcdf=invalid_netcdf,
   1551         )
   1552

~/Software/xarray/xarray/backends/api.py in to_netcdf(dataset, path_or_file, mode, format, group, engine, encoding, unlimited_dims, compute, multifile, invalid_netcdf)
   1071     # to be parallelized with dask
   1072     dump_to_store(
-> 1073         dataset, store, writer, encoding=encoding, unlimited_dims=unlimited_dims
   1074     )
   1075     if autoclose:

~/Software/xarray/xarray/backends/api.py in dump_to_store(dataset, store, writer, encoder, encoding, unlimited_dims)
   1117         variables, attrs = encoder(variables, attrs)
   1118
-> 1119     store.store(variables, attrs, check_encoding, writer, unlimited_dims=unlimited_dims)
   1120
   1121

~/Software/xarray/xarray/backends/common.py in store(self, variables, attributes, check_encoding_set, writer, unlimited_dims)
    291             writer = ArrayWriter()
    292
--> 293         variables, attributes = self.encode(variables, attributes)
    294
    295         self.set_attributes(attributes)

~/Software/xarray/xarray/backends/common.py in encode(self, variables, attributes)
    380         # All NetCDF files get CF encoded by default, without this attempting
    381         # to write times, for example, would fail.
--> 382         variables, attributes = cf_encoder(variables, attributes)
    383         variables = {k: self.encode_variable(v) for k, v in variables.items()}
    384         attributes = {k: self.encode_attribute(v) for k, v in attributes.items()}

~/Software/xarray/xarray/conventions.py in cf_encoder(variables, attributes)
    758     _update_bounds_encoding(variables)
    759
--> 760     new_vars = {k: encode_cf_variable(v, name=k) for k, v in variables.items()}
    761
    762     # Remove attrs from bounds variables (issue #2921)

~/Software/xarray/xarray/conventions.py in <dictcomp>(.0)
    758     _update_bounds_encoding(variables)
    759
--> 760     new_vars = {k: encode_cf_variable(v, name=k) for k, v in variables.items()}
    761
    762     # Remove attrs from bounds variables (issue #2921)

~/Software/xarray/xarray/conventions.py in encode_cf_variable(var, needs_copy, name)
    248         variables.UnsignedIntegerCoder(),
    249     ]:
--> 250         var = coder.encode(var, name=name)
    251
    252     # TODO(shoyer): convert all of these to use coders, too:

~/Software/xarray/xarray/coding/variables.py in encode(self, variable, name)
    163         if fv is not None:
    164             # Ensure _FillValue is cast to same dtype as data's
--> 165             encoding["_FillValue"] = data.dtype.type(fv)
    166             fill_value = pop_to(encoding, attrs, "_FillValue", name=name)
    167             if not pd.isnull(fill_value):

OverflowError: Python int too large to convert to C long
```

Expected Output

I think this should succeed in writing to a netCDF file (it worked in xarray 0.14.0 and earlier).

Problem Description

I think this (admittedly very subtle) issue was introduced in https://github.com/pydata/xarray/pull/3502. Essentially at the time data enters CFMaskCoder.encode it does not necessarily have the dtype it will ultimately be encoded with. In the case of this example, data has type int64, but when it will be stored in the netCDF file it will be a double-precision float.

A possible solution here might be to rely on encoding['dtype'] (if it exists) to determine the type to cast the encoding values for '_FillValue' and 'missing_value' to, instead of relying solely on data.dtype (maybe use that as a fallback).
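The suggested fix can be sketched as follows (hypothetical helper; xarray's actual CFMaskCoder differs):

```python
import numpy as np

def fill_value_for(data_dtype, encoding):
    # Prefer the on-disk dtype from ``encoding`` when casting _FillValue;
    # fall back to the in-memory dtype of the data.
    target = np.dtype(encoding.get("dtype", data_dtype))
    return target.type(encoding["_FillValue"])

encoding = {"_FillValue": 1.0e20, "dtype": np.dtype("float64")}
# In memory the times are int64, but on disk they will be float64, so the
# cast below succeeds where ``np.int64(1.0e20)`` would overflow.
fv = fill_value_for(np.dtype("int64"), encoding)
```
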

cc: @spencerahill

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.7.3 | packaged by conda-forge | (default, Dec 6 2019, 08:36:57) [Clang 9.0.0 (tags/RELEASE_900/final)] python-bits: 64 OS: Darwin OS-release: 19.0.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.10.5 libnetcdf: 4.7.1 xarray: master pandas: 0.25.3 numpy: 1.17.3 scipy: 1.3.2 netCDF4: 1.5.3 pydap: None h5netcdf: None h5py: None Nio: None zarr: None cftime: 1.0.4.2 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2.9.0 distributed: 2.9.0 matplotlib: 3.1.2 cartopy: None seaborn: None numbagg: None setuptools: 42.0.2.post20191203 pip: 19.3.1 conda: None pytest: 5.3.2 IPython: 7.10.1 sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3624/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
548574558 MDExOlB1bGxSZXF1ZXN0MzYxODMyOTcw 3688 Fix test_cf_datetime_nan under pandas master spencerkclark 6628425 closed 0     1 2020-01-12T14:01:50Z 2020-01-13T16:36:33Z 2020-01-13T16:31:38Z MEMBER   0 pydata/xarray/pulls/3688

This fixes test_cf_datetime_nan for upcoming releases of pandas. See failure class (2) reported in #3673.

  • [x] Tests added
  • [x] Passes black . && mypy . && flake8
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3688/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
534404865 MDU6SXNzdWU1MzQ0MDQ4NjU= 3603 Test failure with dask master spencerkclark 6628425 closed 0     2 2019-12-07T13:55:54Z 2019-12-30T17:46:44Z 2019-12-30T17:46:44Z MEMBER      

It looks like https://github.com/dask/dask/pull/5684, which adds nanmedian to dask (nice!), changed the error raised when one tries to reduce an array over all axes via median: the message no longer contains 'dask', because xarray now dispatches to the newly added dask function instead of failing before trying it.

@dcherian do you have thoughts on how to best address this? Should we just remove that check in test_reduce?

```
=================================== FAILURES ===================================
__________________________ TestVariable.test_reduce ___________________________

error = <class 'NotImplementedError'>, pattern = 'dask'

@contextmanager
def raises_regex(error, pattern):
    __tracebackhide__ = True
    with pytest.raises(error) as excinfo:
      yield

xarray/tests/__init__.py:104:


self = <xarray.tests.test_dask.TestVariable object at 0x7fd14f8e9c88>

def test_reduce(self):
    u = self.eager_var
    v = self.lazy_var
    self.assertLazyAndAllClose(u.mean(), v.mean())
    self.assertLazyAndAllClose(u.std(), v.std())
    with raise_if_dask_computes():
        actual = v.argmax(dim="x")
    self.assertLazyAndAllClose(u.argmax(dim="x"), actual)
    with raise_if_dask_computes():
        actual = v.argmin(dim="x")
    self.assertLazyAndAllClose(u.argmin(dim="x"), actual)
    self.assertLazyAndAllClose((u > 1).any(), (v > 1).any())
    self.assertLazyAndAllClose((u < 1).all("x"), (v < 1).all("x"))
    with raises_regex(NotImplementedError, "dask"):
      v.median()

xarray/tests/test_dask.py:220:


self = <xarray.Variable (x: 4, y: 6)> dask.array<array, shape=(4, 6), dtype=float64, chunksize=(2, 2), chunktype=numpy.ndarray> dim = None, axis = None, skipna = None, kwargs = {}

def wrapped_func(self, dim=None, axis=None, skipna=None, **kwargs):
  return self.reduce(func, dim, axis, skipna=skipna, **kwargs)

xarray/core/common.py:46:


self = <xarray.Variable (x: 4, y: 6)> dask.array<array, shape=(4, 6), dtype=float64, chunksize=(2, 2), chunktype=numpy.ndarray> func = <function _create_nan_agg_method.<locals>.f at 0x7fd16c228378> dim = None, axis = None, keep_attrs = None, keepdims = False, allow_lazy = True kwargs = {'skipna': None} input_data = dask.array<array, shape=(4, 6), dtype=float64, chunksize=(2, 2), chunktype=numpy.ndarray>

def reduce(
    self,
    func,
    dim=None,
    axis=None,
    keep_attrs=None,
    keepdims=False,
    allow_lazy=None,
    **kwargs,
):
    """Reduce this array by applying `func` along some dimension(s).

    Parameters
    ----------
    func : function
        Function which can be called in the form
        `func(x, axis=axis, **kwargs)` to return the result of reducing an
        np.ndarray over an integer valued axis.
    dim : str or sequence of str, optional
        Dimension(s) over which to apply `func`.
    axis : int or sequence of int, optional
        Axis(es) over which to apply `func`. Only one of the 'dim'
        and 'axis' arguments can be supplied. If neither are supplied, then
        the reduction is calculated over the flattened array (by calling
        `func(x)` without an axis argument).
    keep_attrs : bool, optional
        If True, the variable's attributes (`attrs`) will be copied from
        the original object to the new one.  If False (default), the new
        object will be returned without attributes.
    keepdims : bool, default False
        If True, the dimensions which are reduced are left in the result
        as dimensions of size one
    **kwargs : dict
        Additional keyword arguments passed on to `func`.

    Returns
    -------
    reduced : Array
        Array with summarized data and the indicated dimension(s)
        removed.
    """
    if dim == ...:
        dim = None
    if dim is not None and axis is not None:
        raise ValueError("cannot supply both 'axis' and 'dim' arguments")

    if dim is not None:
        axis = self.get_axis_num(dim)

    if allow_lazy is not None:
        warnings.warn(
            "allow_lazy is deprecated and will be removed in version 0.16.0. It is now True by default.",
            DeprecationWarning,
        )
    else:
        allow_lazy = True

    input_data = self.data if allow_lazy else self.values

    if axis is not None:
        data = func(input_data, axis=axis, **kwargs)
    else:
      data = func(input_data, **kwargs)

xarray/core/variable.py:1534:


values = dask.array<array, shape=(4, 6), dtype=float64, chunksize=(2, 2), chunktype=numpy.ndarray> axis = None, skipna = None, kwargs = {} func = <function nanmedian at 0x7fd16c226bf8>, nanname = 'nanmedian'

def f(values, axis=None, skipna=None, **kwargs):
    if kwargs.pop("out", None) is not None:
        raise TypeError(f"`out` is not valid for {name}")

    values = asarray(values)

    if coerce_strings and values.dtype.kind in "SU":
        values = values.astype(object)

    func = None
    if skipna or (skipna is None and values.dtype.kind in "cfO"):
        nanname = "nan" + name
        func = getattr(nanops, nanname)
    else:
        func = _dask_or_eager_func(name)

    try:
      return func(values, axis=axis, **kwargs)

xarray/core/duck_array_ops.py:307:


a = dask.array<array, shape=(4, 6), dtype=float64, chunksize=(2, 2), chunktype=numpy.ndarray> axis = None, out = None

def nanmedian(a, axis=None, out=None):
  return _dask_or_eager_func("nanmedian", eager_module=nputils)(a, axis=axis)

xarray/core/nanops.py:144:


args = (dask.array<array, shape=(4, 6), dtype=float64, chunksize=(2, 2), chunktype=numpy.ndarray>,) kwargs = {'axis': None} dispatch_args = (dask.array<array, shape=(4, 6), dtype=float64, chunksize=(2, 2), chunktype=numpy.ndarray>,) wrapped = <function nanmedian at 0x7fd1737bcea0>

def f(*args, **kwargs):
    if list_of_args:
        dispatch_args = args[0]
    else:
        dispatch_args = args[array_args]
    if any(isinstance(a, dask_array.Array) for a in dispatch_args):
        try:
            wrapped = getattr(dask_module, name)
        except AttributeError as e:
            raise AttributeError(f"{e}: requires dask >={requires_dask}")
    else:
        wrapped = getattr(eager_module, name)
  return wrapped(*args, **kwargs)

xarray/core/duck_array_ops.py:47:


a = dask.array<array, shape=(4, 6), dtype=float64, chunksize=(2, 2), chunktype=numpy.ndarray> axis = None, keepdims = False, out = None

@derived_from(np)
def nanmedian(a, axis=None, keepdims=False, out=None):
    """
    This works by automatically chunking the reduced axes to a single chunk
    and then calling ``numpy.nanmedian`` function across the remaining dimensions
    """
    if axis is None:
        raise NotImplementedError(
          "The da.nanmedian function only works along an axis or a subset of axes.  "
            "The full algorithm is difficult to do in parallel"
        )

E NotImplementedError: The da.nanmedian function only works along an axis or a subset of axes. The full algorithm is difficult to do in parallel

/usr/share/miniconda/envs/xarray-tests/lib/python3.7/site-packages/dask/array/reductions.py:1299: NotImplementedError

During handling of the above exception, another exception occurred:

self = <xarray.tests.test_dask.TestVariable object at 0x7fd14f8e9c88>

def test_reduce(self):
    u = self.eager_var
    v = self.lazy_var
    self.assertLazyAndAllClose(u.mean(), v.mean())
    self.assertLazyAndAllClose(u.std(), v.std())
    with raise_if_dask_computes():
        actual = v.argmax(dim="x")
    self.assertLazyAndAllClose(u.argmax(dim="x"), actual)
    with raise_if_dask_computes():
        actual = v.argmin(dim="x")
    self.assertLazyAndAllClose(u.argmin(dim="x"), actual)
    self.assertLazyAndAllClose((u > 1).any(), (v > 1).any())
    self.assertLazyAndAllClose((u < 1).all("x"), (v < 1).all("x"))
    with raises_regex(NotImplementedError, "dask"):
      v.median()

xarray/tests/test_dask.py:220:


self = <contextlib._GeneratorContextManager object at 0x7fd14f8bcc50> type = <class 'NotImplementedError'> value = NotImplementedError('The da.nanmedian function only works along an axis or a subset of axes. The full algorithm is difficult to do in parallel') traceback = <traceback object at 0x7fd154597bc8>

def __exit__(self, type, value, traceback):
    if type is None:
        try:
            next(self.gen)
        except StopIteration:
            return False
        else:
            raise RuntimeError("generator didn't stop")
    else:
        if value is None:
            # Need to force instantiation so we can reliably
            # tell if we get the same exception back
            value = type()
        try:
          self.gen.throw(type, value, traceback)

E AssertionError: exception NotImplementedError('The da.nanmedian function only works along an axis or a subset of axes. The full algorithm is difficult to do in parallel') did not match pattern 'dask'
```
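For context, the behavior change can be sketched as follows (assumes dask is installed; at the time of this report the axis=None case raised the NotImplementedError shown above, whose message does not match the pattern 'dask'):

```python
import dask.array as da
import numpy as np

# Per-axis reduction is what dask's new nanmedian supports: it rechunks
# the reduced axis to a single chunk and applies numpy.nanmedian there.
x = da.from_array(np.arange(24, dtype=float).reshape(4, 6), chunks=(2, 2))
result = da.nanmedian(x, axis=0).compute()
```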

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3603/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
523878167 MDExOlB1bGxSZXF1ZXN0MzQxNzc5MTEw 3543 Minor fix to combine_by_coords to allow for the combination of CFTimeIndexes separated by large time intervals spencerkclark 6628425 closed 0     4 2019-11-16T18:20:57Z 2019-12-07T20:38:01Z 2019-12-07T20:38:00Z MEMBER   0 pydata/xarray/pulls/3543

This is a possible fix for the issue @mathause described in https://github.com/pydata/xarray/issues/3535#issuecomment-554317768. @TomNicholas does this seem like a safe change to make in combine_by_coords?

  • [x] Closes #3535
  • [x] Tests added
  • [x] Passes black . && mypy . && flake8
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3543/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
512835214 MDExOlB1bGxSZXF1ZXN0MzMyNzY2NzUx 3450 Remove outdated code related to compatibility with netcdftime spencerkclark 6628425 closed 0     1 2019-10-26T13:24:38Z 2019-10-29T15:30:55Z 2019-10-29T15:30:55Z MEMBER   0 pydata/xarray/pulls/3450

Per https://github.com/pydata/xarray/pull/3431#discussion_r337620810, this removes outdated code leftover from the netcdftime -> cftime transition. Currently the minimum version of netCDF4 that xarray tests against is 1.4, which does not include netcdftime, and instead specifies cftime as a required dependency.

  • [x] Passes black . && mypy . && flake8
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3450/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
430460404 MDExOlB1bGxSZXF1ZXN0MjY4MzQzMTM1 2879 Reduce length of cftime resample tests spencerkclark 6628425 closed 0     5 2019-04-08T13:44:50Z 2019-04-11T11:42:16Z 2019-04-11T11:42:09Z MEMBER   0 pydata/xarray/pulls/2879

The main issue is that we were resampling the same time indexes across a large range of frequencies, in some cases producing very long results, e.g. resampling an index that spans 27 years to a frequency of 12 hours.

This modifies the primary test so that it constructs time indexes whose ranges are based on the frequencies we resample to. Now in total the tests in test_cftimeindex_resample.py take around 6 seconds.

@jwenfai I did some coverage analysis offline, and these tests produce the same coverage that we had before (I found it necessary to be sure to test cases where the reference index had either a shorter or longer frequency than the resample frequency). Do you think what I have here is sufficient? I think we could potentially shorten things even more, but I'm not sure if it's worth the effort.

  • [x] Closes #2874

See below for the new profiling results; now the longest cftime tests are no longer associated with resample.

```
$ pytest -k cftime --durations=50
...
0.18s call xarray/tests/test_backends.py::TestScipyInMemoryData::test_roundtrip_cftime_datetime_data
0.11s call xarray/tests/test_backends.py::TestScipyFilePath::test_roundtrip_cftime_datetime_data
0.10s call xarray/tests/test_backends.py::TestNetCDF4Data::test_roundtrip_cftime_datetime_data
0.09s call xarray/tests/test_backends.py::TestNetCDF4ClassicViaNetCDF4Data::test_roundtrip_cftime_datetime_data
0.09s call xarray/tests/test_backends.py::TestNetCDF4ViaDaskData::test_roundtrip_cftime_datetime_data
0.08s teardown xarray/tests/test_cftime_offsets.py::test_add_year_end_onOffset[julian-(2, 12)-()-<YearEnd: n=-1, month=12>-(1, 12)-()]
0.06s call xarray/tests/test_backends.py::TestNetCDF3ViaNetCDF4Data::test_roundtrip_cftime_datetime_data
0.06s call xarray/tests/test_backends.py::TestGenericNetCDFData::test_roundtrip_cftime_datetime_data
0.05s call xarray/tests/test_backends.py::TestScipyFileObject::test_roundtrip_cftime_datetime_data
0.04s call xarray/tests/test_conventions.py::TestCFEncodedDataStore::test_roundtrip_cftime_datetime_data
0.03s call xarray/tests/test_dataset.py::test_differentiate_cftime[True]
0.03s call xarray/tests/test_dataset.py::test_trapz_datetime[cftime-True]
0.02s call xarray/tests/test_coding_times.py::test_contains_cftime_datetimes_dask_3d[standard]
0.02s call xarray/tests/test_backends.py::test_use_cftime_standard_calendar_default_out_of_range[2500-gregorian]
0.02s call xarray/tests/test_dataset.py::test_differentiate_cftime[False]
0.02s call xarray/tests/test_cftimeindex_resample.py::test_resample[longer_da_freq-24-right-None-4A-MAY]
0.02s call xarray/tests/test_backends.py::test_use_cftime_standard_calendar_default_in_range[gregorian]
0.02s call xarray/tests/test_cftimeindex_resample.py::test_resample[longer_da_freq-24-None-right-11Q-JUN]
0.02s call xarray/tests/test_backends.py::test_use_cftime_standard_calendar_default_out_of_range[2500-proleptic_gregorian]
0.02s call xarray/tests/test_backends.py::test_use_cftime_true[1500-gregorian]
0.02s call xarray/tests/test_backends.py::test_use_cftime_true[2500-proleptic_gregorian]
0.01s call xarray/tests/test_backends.py::test_use_cftime_true[2000-gregorian]
0.01s call xarray/tests/test_cftimeindex_resample.py::test_resample[longer_da_freq-31-None-right-4A-MAY]
0.01s call xarray/tests/test_backends.py::test_use_cftime_standard_calendar_default_out_of_range[2500-standard]
0.01s call xarray/tests/test_backends.py::test_use_cftime_true[1500-julian]
0.01s call xarray/tests/test_cftimeindex_resample.py::test_resample[longer_da_freq-31-left-right-4A-MAY]
0.01s call xarray/tests/test_cftimeindex_resample.py::test_resample[longer_da_freq-24-right-None-11Q-JUN]
0.01s call xarray/tests/test_cftimeindex_resample.py::test_resample[longer_da_freq-24-None-right-4A-MAY]
0.01s call xarray/tests/test_cftimeindex_resample.py::test_resample[longer_da_freq-31-left-None-7M]
0.01s call xarray/tests/test_cftimeindex_resample.py::test_resample[longer_da_freq-24-left-None-4A-MAY]
0.01s call xarray/tests/test_cftimeindex_resample.py::test_resample[longer_da_freq-24-left-right-4A-MAY]
0.01s call xarray/tests/test_cftimeindex_resample.py::test_resample[longer_da_freq-31-None-None-11Q-JUN]
0.01s call xarray/tests/test_cftimeindex_resample.py::test_resample[longer_da_freq-24-right-right-4A-MAY]
0.01s call xarray/tests/test_backends.py::test_use_cftime_standard_calendar_default_out_of_range[1500-proleptic_gregorian]
0.01s call xarray/tests/test_cftimeindex_resample.py::test_resample[longer_da_freq-24-left-None-11Q-JUN]
0.01s call xarray/tests/test_backends.py::test_use_cftime_true[2500-julian]
0.01s call xarray/tests/test_backends.py::test_use_cftime_standard_calendar_default_out_of_range[1500-gregorian]
0.01s call xarray/tests/test_backends.py::test_use_cftime_true[1500-proleptic_gregorian]
0.01s call xarray/tests/test_cftimeindex_resample.py::test_resample[longer_da_freq-31-left-right-11Q-JUN]
0.01s call xarray/tests/test_backends.py::test_use_cftime_true[2000-standard]
0.01s call xarray/tests/test_backends.py::test_use_cftime_true[2500-standard]
0.01s call xarray/tests/test_cftimeindex_resample.py::test_resample[longer_da_freq-31-None-None-4A-MAY]
0.01s call xarray/tests/test_cftimeindex_resample.py::test_resample[longer_da_freq-31-right-right-11Q-JUN]
0.01s call xarray/tests/test_cftimeindex_resample.py::test_resample[longer_da_freq-24-right-right-7M]
0.01s call xarray/tests/test_backends.py::test_use_cftime_true[2500-gregorian]
0.01s call xarray/tests/test_cftimeindex_resample.py::test_resample[longer_da_freq-24-left-right-7M]
0.01s call xarray/tests/test_backends.py::test_use_cftime_true[2000-proleptic_gregorian]
0.01s call xarray/tests/test_cftimeindex_resample.py::test_resample[longer_da_freq-31-right-None-7M]
0.01s call xarray/tests/test_cftimeindex_resample.py::test_resample[longer_da_freq-31-None-right-7M]
0.01s call xarray/tests/test_cftimeindex_resample.py::test_resample[longer_da_freq-31-left-None-11Q-JUN]
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2879/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
427398236 MDU6SXNzdWU0MjczOTgyMzY= 2856 Roundtripping between a dimension coordinate and scalar coordinate on a Dataset spencerkclark 6628425 closed 0     4 2019-03-31T13:42:39Z 2019-04-04T21:58:24Z 2019-04-04T21:58:24Z MEMBER      

Code Sample, a copy-pastable example if possible

In xarray 0.12.0 the following example produces a Dataset with no indexes:

```
In [1]: import xarray as xr

In [2]: da = xr.DataArray([1], [('x', [0])], name='a')

In [3]: da.to_dataset().isel(x=0).expand_dims('x').indexes
Out[3]:
```

Expected Output

In xarray 0.11.3 the roundtrip sequence above properly recovers the initial index along the 'x' dimension:

```
In [1]: import xarray as xr

In [2]: da = xr.DataArray([1], [('x', [0])], name='a')

In [3]: da.to_dataset().isel(x=0).expand_dims('x').indexes
Out[3]:
x: Int64Index([0], dtype='int64', name='x')
```

Output of xr.show_versions()

``` INSTALLED VERSIONS ------------------ commit: None python: 3.6.7 | packaged by conda-forge | (default, Feb 28 2019, 02:16:08) [GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] python-bits: 64 OS: Darwin OS-release: 18.2.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.10.1 libnetcdf: 4.6.1 xarray: 0.12.0 pandas: 0.24.2 numpy: 1.13.1 scipy: 0.19.1 netCDF4: 1.4.0 pydap: None h5netcdf: 0.5.1 h5py: 2.8.0 Nio: None zarr: None cftime: 1.0.0 nc_time_axis: None PseudonetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.2.0 dask: 0.17.5 distributed: 1.21.8 matplotlib: 2.0.2 cartopy: None seaborn: None setuptools: 40.5.0 pip: 9.0.1 conda: None pytest: 3.10.0 IPython: 6.4.0 sphinx: 1.7.4 ```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2856/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
412078232 MDExOlB1bGxSZXF1ZXN0MjU0MzcyNzE4 2778 Add support for cftime.datetime coordinates with coarsen spencerkclark 6628425 closed 0     4 2019-02-19T19:06:17Z 2019-03-06T19:48:10Z 2019-03-06T19:47:47Z MEMBER   0 pydata/xarray/pulls/2778
  • [x] Tests added
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API

For now I've held off on making these changes dask-compatible (I could do it, but I'm not sure it is worth the extra complexity).

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2778/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
408612553 MDExOlB1bGxSZXF1ZXN0MjUxNzg2Mjk1 2759 Add use_cftime option to open_dataset spencerkclark 6628425 closed 0     5 2019-02-11T02:05:18Z 2019-02-19T20:47:30Z 2019-02-19T20:47:26Z MEMBER   0 pydata/xarray/pulls/2759

Based on @shoyer's suggestion in https://github.com/pydata/xarray/issues/2754#issuecomment-461983092.

  • [x] Closes #1263; Closes #2754
  • [x] Tests added
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2759/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
205473898 MDExOlB1bGxSZXF1ZXN0MTA0NzI2NzIz 1252 CFTimeIndex spencerkclark 6628425 closed 0     70 2017-02-06T02:10:47Z 2019-02-18T20:54:03Z 2018-05-13T05:19:11Z MEMBER   0 pydata/xarray/pulls/1252
  • [x] closes #1084
  • [x] passes git diff upstream/master | flake8 --diff
  • [x] tests added / passed
  • [x] whatsnew entry

This work in progress PR is a start on implementing a NetCDFTimeIndex, a subclass of pandas.Index, which closely mimics pandas.DatetimeIndex, but uses netcdftime._netcdftime.datetime objects. Currently implemented in the new index are:

  • Partial datetime-string indexing (using strictly ISO8601-format strings, using a date parser implemented by @shoyer in https://github.com/pydata/xarray/issues/1084#issuecomment-274372547)
  • Field accessors for year, month, day, hour, minute, second, and microsecond, to enable groupby operations on attributes of date objects

This index is meant as a step towards improving the handling of non-standard calendars and dates outside the range Timestamp('1677-09-21 00:12:43.145225') to Timestamp('2262-04-11 23:47:16.854775807').
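The idea behind strictly-ISO8601 partial string indexing can be sketched with the standard library alone (a hypothetical parser for illustration only, not the one in this PR, which handles cftime date types and sub-daily resolutions):

```python
import datetime
import re

# Match 'YYYY', 'YYYY-MM', or 'YYYY-MM-DD' only.
ISO8601_PARTIAL = re.compile(r"^(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?$")

def partial_date_bounds(label):
    """Return the half-open [start, end) range a partial date string denotes."""
    match = ISO8601_PARTIAL.match(label)
    if match is None:
        raise ValueError(f"not a partial ISO8601 date: {label!r}")
    year, month, day = (int(g) if g else None for g in match.groups())
    if day is not None:
        start = datetime.datetime(year, month, day)
        end = start + datetime.timedelta(days=1)
    elif month is not None:
        start = datetime.datetime(year, month, 1)
        end = (datetime.datetime(year + 1, 1, 1) if month == 12
               else datetime.datetime(year, month + 1, 1))
    else:
        start = datetime.datetime(year, 1, 1)
        end = datetime.datetime(year + 1, 1, 1)
    return start, end
```

Selecting with a label like '2000-01' then reduces to keeping every date d in the index with start <= d < end.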


For now I have pushed only the code and some tests for the new index; I want to make sure the index is solid and well-tested before we consider integrating it into any of xarray's existing logic or writing any documentation.

Regarding the index, there are a couple remaining outstanding issues (that at least I'm aware of):

  1. Currently one can create non-sensical datetimes using netcdftime._netcdftime.datetime objects. This means one can attempt to index with an out-of-bounds string or datetime without raising an error. Could this possibly be addressed upstream? For example:

```
In [1]: from netcdftime import DatetimeNoLeap

In [2]: DatetimeNoLeap(2000, 45, 45)
Out[2]: netcdftime._netcdftime.DatetimeNoLeap(2000, 45, 45, 0, 0, 0, 0, -1, 1)
```

  2. I am looking to enable this index to be used in pandas.Series and pandas.DataFrame objects as well; this requires implementing a get_value method. I have taken @shoyer's suggested simplified approach from https://github.com/pydata/xarray/issues/1084#issuecomment-275963433, and tweaked it to also allow for slice indexing, so I think this is most of the way there. A remaining to-do for me, however, is to implement something to allow for integer-indexing outside of iloc, e.g. if you have a pandas.Series named series, indexing with the syntax series[1] or series[1:3].

Hopefully this is a decent start; in particular I'm not an expert in writing tests so please let me know if there are improvements I can make to the structure and / or style I've used so far. I'm happy to make changes. I appreciate your help.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1252/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
405702347 MDExOlB1bGxSZXF1ZXN0MjQ5NjAzNzAy 2734 dropna() for a Series indexed by a CFTimeIndex spencerkclark 6628425 closed 0     0 2019-02-01T13:29:40Z 2019-02-16T02:17:05Z 2019-02-02T06:56:12Z MEMBER   0 pydata/xarray/pulls/2734

Thanks for the suggestion, @shoyer. - [x] Closes #2688 - [x] Tests added - [x] Fully documented, including whats-new.rst for all changes and api.rst for new API

cc: @jwenfai

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2734/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
410036612 MDExOlB1bGxSZXF1ZXN0MjUyODczMzMz 2771 Use DatetimeGregorian when calendar='standard' in cftime_range instead of DatetimeProlepticGregorian spencerkclark 6628425 closed 0     0 2019-02-13T22:37:55Z 2019-02-15T21:58:56Z 2019-02-15T21:58:16Z MEMBER   0 pydata/xarray/pulls/2771
  • [x] Closes #2761
  • [x] Tests added
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2771/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
408772665 MDU6SXNzdWU0MDg3NzI2NjU= 2761 'standard' calendar refers to 'proleptic_gregorian' in cftime_range rather than 'gregorian' spencerkclark 6628425 closed 0     2 2019-02-11T13:06:05Z 2019-02-15T21:58:16Z 2019-02-15T21:58:16Z MEMBER      

Code Sample, a copy-pastable example if possible

```python
In [1]: import xarray

In [2]: xarray.cftime_range('2000', periods=3, calendar='standard').values
Out[2]:
array([cftime.DatetimeProlepticGregorian(2000, 1, 1, 0, 0, 0, 0, -1, 1),
       cftime.DatetimeProlepticGregorian(2000, 1, 2, 0, 0, 0, 0, -1, 1),
       cftime.DatetimeProlepticGregorian(2000, 1, 3, 0, 0, 0, 0, -1, 1)],
      dtype=object)
```

Problem description

When writing cftime_range I used dates from a proleptic Gregorian calendar when the calendar type was specified as 'standard'. While this is consistent with Python's built-in datetime.datetime (which uses a proleptic Gregorian calendar), this differs from the behavior in cftime.num2date and ultimately the CF conventions, which state that 'standard' should refer to the true Gregorian calendar. My inclination is that considering "cf" is in the name of cftime_range, we should adhere to those conventions as closely as possible (and hence the way I initially coded things was a mistake).

Expected Output

```python
In [2]: xarray.cftime_range('2000', periods=3, calendar='standard').values
Out[2]:
array([cftime.DatetimeGregorian(2000, 1, 1, 0, 0, 0, 0, -1, 1),
       cftime.DatetimeGregorian(2000, 1, 2, 0, 0, 0, 0, -1, 1),
       cftime.DatetimeGregorian(2000, 1, 3, 0, 0, 0, 0, -1, 1)],
      dtype=object)
```

Do others agree that we should fix this? If we were to make this change, would it be appropriate to consider it a bug and simply make the breaking change immediately, or might we need a deprecation cycle?
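The distinction only matters for dates before the 1582 calendar reform, e.g. whether 1500 counts as a leap year. A plain-Python sketch of the two leap-year rules, independent of cftime:

```python
def is_leap_proleptic_gregorian(year):
    # Gregorian rule extended to all years, as datetime.datetime assumes.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

def is_leap_standard(year):
    # 'standard' (mixed Julian/Gregorian) calendar: Julian rule before
    # the 1582 reform, Gregorian rule afterwards.
    return year % 4 == 0 if year < 1582 else is_leap_proleptic_gregorian(year)

# 1500 is a leap year in the standard calendar (Julian rule) but not in
# the proleptic Gregorian calendar (divisible by 100 but not by 400).
```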

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2761/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
400504690 MDU6SXNzdWU0MDA1MDQ2OTA= 2688 dropna() for a Series indexed by a CFTimeIndex spencerkclark 6628425 closed 0     3 2019-01-17T23:15:29Z 2019-02-02T06:56:12Z 2019-02-02T06:56:12Z MEMBER      

Code Sample, a copy-pastable example if possible

Currently something like the following raises an error:

```
In [1]: import xarray as xr

In [2]: import pandas as pd

In [3]: import numpy as np

In [4]: times = xr.cftime_range('2000', periods=3)

In [5]: series = pd.Series(np.array([0., np.nan, 1.]), index=times)

In [6]: series
Out[6]:
2000-01-01 00:00:00    0.0
2000-01-02 00:00:00    NaN
2000-01-03 00:00:00    1.0
dtype: float64

In [7]: series.dropna()

TypeError                                 Traceback (most recent call last)
<ipython-input-7-45eb0c023203> in <module>
----> 1 series.dropna()

~/pandas/pandas/core/series.py in dropna(self, axis, inplace, **kwargs)
   4169
   4170         if self._can_hold_na:
-> 4171             result = remove_na_arraylike(self)
   4172             if inplace:
   4173                 self._update_inplace(result)

~/pandas/pandas/core/dtypes/missing.py in remove_na_arraylike(arr)
    539         return arr[notna(arr)]
    540     else:
--> 541         return arr[notna(lib.values_from_object(arr))]

~/pandas/pandas/core/series.py in __getitem__(self, key)
    801         key = com.apply_if_callable(key, self)
    802         try:
--> 803             result = self.index.get_value(self, key)
    804
    805             if not is_scalar(result):

~/xarray-dev/xarray/xarray/coding/cftimeindex.py in get_value(self, series, key)
    321         """Adapted from pandas.tseries.index.DatetimeIndex.get_value"""
    322         if not isinstance(key, slice):
--> 323             return series.iloc[self.get_loc(key)]
    324         else:
    325             return series.iloc[self.slice_indexer(

~/xarray-dev/xarray/xarray/coding/cftimeindex.py in get_loc(self, key, method, tolerance)
    300         else:
    301             return pd.Index.get_loc(self, key, method=method,
--> 302                                     tolerance=tolerance)
    303
    304     def _maybe_cast_slice_bound(self, label, side, kind):

~/pandas/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2595                                  'backfill or nearest lookups')
   2596         try:
-> 2597             return self._engine.get_loc(key)
   2598         except KeyError:
   2599             return self._engine.get_loc(self._maybe_cast_indexer(key))

~/pandas/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

~/pandas/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

TypeError: '[ True False True]' is an invalid key
```

Problem description

We currently rely on this in the resampling logic within xarray for a Series indexed by a DatetimeIndex:
https://github.com/pydata/xarray/blob/dc87dea52351835af472d131f70a7f7603b3100e/xarray/core/groupby.py#L268
It would be nice if we could do the same with a Series indexed by a CFTimeIndex, e.g. in #2593.

Expected Output

In [7]: series.dropna()
Out[7]:
2000-01-01 00:00:00    0.0
2000-01-03 00:00:00    1.0
dtype: float64

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.7.1 | packaged by conda-forge | (default, Nov 13 2018, 09:50:42) [Clang 9.0.0 (clang-900.0.37)] python-bits: 64 OS: Darwin OS-release: 18.2.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.10.4 libnetcdf: 4.6.2 xarray: 0.10.9+117.g80914e0.dirty pandas: 0.24.0.dev0+1332.g5d134ec numpy: 1.15.4 scipy: 1.1.0 netCDF4: 1.4.2 pydap: None h5netcdf: None h5py: None Nio: None zarr: None cftime: 1.0.3.4 PseudonetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None cyordereddict: None dask: 1.0.0 distributed: 1.25.2 matplotlib: 3.0.2 cartopy: None seaborn: 0.9.0 setuptools: 40.6.3 pip: 18.1 conda: None pytest: 3.10.1 IPython: 7.2.0 sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2688/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
398918281 MDU6SXNzdWUzOTg5MTgyODE= 2671 Enable subtracting a scalar cftime.datetime object from a CFTimeIndex spencerkclark 6628425 closed 0     0 2019-01-14T14:42:12Z 2019-01-30T16:45:10Z 2019-01-30T16:45:10Z MEMBER      

Code Sample, a copy-pastable example if possible

```
In [1]: import xarray

In [2]: times = xarray.cftime_range('2000', periods=3)

In [3]: times - times[0]

TypeError                                 Traceback (most recent call last)
<ipython-input-3-97cbca76a8af> in <module>
----> 1 times - times[0]

~/xarray-dev/xarray/xarray/coding/cftimeindex.py in __sub__(self, other)
    417             return CFTimeIndex(np.array(self) - other.to_pytimedelta())
    418         else:
--> 419             return CFTimeIndex(np.array(self) - other)
    420
    421     def _add_delta(self, deltas):

~/xarray-dev/xarray/xarray/coding/cftimeindex.py in new(cls, data, name) 238 result = object.new(cls) 239 result._data = np.array(data, dtype='O') --> 240 assert_all_valid_date_type(result._data) 241 result.name = name 242 return result

~/xarray-dev/xarray/xarray/coding/cftimeindex.py in assert_all_valid_date_type(data) 194 raise TypeError( 195 'CFTimeIndex requires cftime.datetime ' --> 196 'objects. Got object of {}.'.format(date_type)) 197 if not all(isinstance(value, date_type) for value in data): 198 raise TypeError(

TypeError: CFTimeIndex requires cftime.datetime objects. Got object of <class 'datetime.timedelta'>. ```

Problem description

This should result in a `pandas.TimedeltaIndex`, as is the case for a `pandas.DatetimeIndex`:

```
In [4]: import pandas

In [5]: times = pandas.date_range('2000', periods=3)

In [6]: times - times[0]
Out[6]: TimedeltaIndex(['0 days', '1 days', '2 days'], dtype='timedelta64[ns]', freq=None)
```

Expected Output

```
In [1]: import xarray

In [2]: times = xarray.cftime_range('2000', periods=3)

In [3]: times - times[0]
Out[3]: TimedeltaIndex(['0 days', '1 days', '2 days'], dtype='timedelta64[ns]', freq=None)
```

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.6.6 | packaged by conda-forge | (default, Jul 26 2018, 09:55:02) [GCC 4.2.1 Compatible Apple LLVM 6.1.0 (clang-602.0.53)] python-bits: 64 OS: Darwin OS-release: 18.2.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.10.4 libnetcdf: 4.6.2 xarray: 0.10.9+127.ga7129d1 pandas: 0.24.0.dev0+1332.g5d134ec numpy: 1.15.4 scipy: 1.1.0 netCDF4: 1.4.2 pydap: None h5netcdf: None h5py: None Nio: None zarr: None cftime: 1.0.3.4 PseudonetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None cyordereddict: None dask: 1.0.0 distributed: 1.25.1 matplotlib: 3.0.2 cartopy: None seaborn: 0.9.0 setuptools: 40.6.3 pip: 18.1 conda: None pytest: 3.10.1 IPython: 7.2.0 sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2671/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
398920871 MDExOlB1bGxSZXF1ZXN0MjQ0NDY5MjI3 2672 Enable subtracting a scalar cftime.datetime object from a CFTimeIndex spencerkclark 6628425 closed 0     1 2019-01-14T14:47:50Z 2019-01-30T16:45:10Z 2019-01-30T16:45:10Z MEMBER   0 pydata/xarray/pulls/2672
  • [x] Closes #2671
  • [x] Tests added
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2672/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
396084601 MDExOlB1bGxSZXF1ZXN0MjQyMzg2MDIw 2651 Convert ref_date to UTC in encode_cf_datetime spencerkclark 6628425 closed 0     2 2019-01-04T22:10:21Z 2019-01-15T18:55:50Z 2019-01-05T19:06:55Z MEMBER   0 pydata/xarray/pulls/2651
  • [x] Closes #2649
  • [x] Tests added
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API

I think this should be an appropriate fix for #2649, but I'd appreciate input from those who are more experienced dealing with timezones in NumPy/pandas.

My understanding is that NumPy dates are stored as UTC and do not carry any timezone information. Therefore converting the ref_date with tz_convert(None) here, which converts it to UTC and removes the timezone information, should be appropriate for encoding.
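A quick sanity check of that claim (a minimal sketch, independent of xarray's encoding code; the units string in the comment is illustrative only):

```python
import pandas as pd

# A timezone-aware reference timestamp, as might be parsed from units
# like "days since 2000-01-01 00:00:00-05:00" (illustrative only).
ref_date = pd.Timestamp("2000-01-01 00:00:00", tz="US/Eastern")

# tz_convert(None) converts to UTC and then drops the timezone info,
# matching NumPy's timezone-naive (implicitly UTC) representation.
naive_utc = ref_date.tz_convert(None)
print(naive_utc)  # 2000-01-01 05:00:00
```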

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2651/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
396195979 MDExOlB1bGxSZXF1ZXN0MjQyNDU4MTA0 2654 Improve test for #2649 spencerkclark 6628425 closed 0     0 2019-01-05T20:07:36Z 2019-01-06T00:56:00Z 2019-01-06T00:55:22Z MEMBER   0 pydata/xarray/pulls/2654

While we do indeed always decode to UTC, I'm not sure how well we currently test that. In addition, this tests both the np.datetime64 and cftime.datetime pathways.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2654/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
395115822 MDExOlB1bGxSZXF1ZXN0MjQxNjUzMTYx 2640 Use built-in interp for interpolation with resample spencerkclark 6628425 closed 0     3 2019-01-01T22:09:44Z 2019-01-03T01:18:06Z 2019-01-03T01:18:06Z MEMBER   0 pydata/xarray/pulls/2640
  • [x] Closes #2197
  • [x] Tests added
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API

My main goal with this was to help out with #2593 (xarray's built-in interpolation method is compatible with cftime coordinates, so this refactor would simplify things there). While doing this I realized that I could also add the simple bug-fix for #2197.

cc: @jwenfai

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2640/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
394017899 MDExOlB1bGxSZXF1ZXN0MjQwODcyMDM1 2633 Fix dayofweek and dayofyear attributes from dates generated by cftime_range spencerkclark 6628425 closed 0     3 2018-12-25T12:57:13Z 2018-12-28T22:55:55Z 2018-12-28T19:04:50Z MEMBER   0 pydata/xarray/pulls/2633
  • [x] Tests added

It turns out there was a remaining bug in cftime (https://github.com/Unidata/cftime/issues/106) that impacted the results of the dayofwk and dayofyr attributes of cftime objects generated by their replace method, which we use when parsing dates from strings, and in some offset arithmetic.

A workaround is to add a dayofwk=-1 argument to each replace call where the dayofwk or dayofyr would be expected to change. I've fixed this bug upstream in cftime (https://github.com/Unidata/cftime/pull/108), but it will only be available in a future version. Would it be appropriate to use this workaround in xarray?

This would fix this doc page for instance:

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2633/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
393878051 MDExOlB1bGxSZXF1ZXN0MjQwNzcyNDM0 2630 Fix failure in time encoding for pandas < 0.21.1 spencerkclark 6628425 closed 0     1 2018-12-24T13:03:42Z 2018-12-24T15:58:21Z 2018-12-24T15:58:03Z MEMBER   0 pydata/xarray/pulls/2630
  • [x] Closes #2623
  • [x] Tests added
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API

This is related to a bug fixed in https://github.com/pandas-dev/pandas/pull/18020#issuecomment-340477318 (this should return a TimedeltaIndex):

```
In [2]: times = pd.date_range('2000', periods=3)

In [3]: times - np.datetime64('2000-01-01')
Out[3]: DatetimeIndex(['1970-01-01', '1970-01-02', '1970-01-03'], dtype='datetime64[ns]', freq='D')
```

Subtracting a `Timestamp` object seems to work in all versions:

```
In [4]: times - pd.Timestamp('2000-01-01')
Out[4]: TimedeltaIndex(['0 days', '1 days', '2 days'], dtype='timedelta64[ns]', freq=None)
```
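On a pandas version that includes the upstream fix (0.21.1 or later), both forms of the subtraction return a TimedeltaIndex; a minimal check:

```python
import numpy as np
import pandas as pd

times = pd.date_range("2000", periods=3)

# On fixed pandas, both subtractions produce a TimedeltaIndex.
from_np = times - np.datetime64("2000-01-01")
from_ts = times - pd.Timestamp("2000-01-01")
print(type(from_np).__name__, type(from_ts).__name__)
```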

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2630/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
391670868 MDExOlB1bGxSZXF1ZXN0MjM5MTEyNTUz 2613 Remove tz argument in cftime_range spencerkclark 6628425 closed 0     2 2018-12-17T11:32:10Z 2018-12-18T19:21:57Z 2018-12-18T17:21:36Z MEMBER   0 pydata/xarray/pulls/2613

This was caught by @jwenfai in #2593. I hope no one was inadvertently trying to use this argument before.

Does this need a what's new entry?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2613/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
390644093 MDExOlB1bGxSZXF1ZXN0MjM4MzYxOTAz 2604 Update cftime version in doc environment spencerkclark 6628425 closed 0     0 2018-12-13T11:52:02Z 2018-12-13T17:12:38Z 2018-12-13T17:12:38Z MEMBER   0 pydata/xarray/pulls/2604

As mentioned in https://github.com/pydata/xarray/issues/2597#issuecomment-446151329, the dayofyr and dayofwk attributes of cftime.datetime objects do not always work in versions of cftime prior to 1.0.2. This issue comes up in the latest doc build:

This updates the documentation environment to use the most recent version (1.0.3.4), which should fix things.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2604/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
389688621 MDExOlB1bGxSZXF1ZXN0MjM3NjIwODI4 2599 Add dayofyear and dayofweek accessors to CFTimeIndex spencerkclark 6628425 closed 0     3 2018-12-11T10:17:04Z 2018-12-11T19:29:13Z 2018-12-11T19:28:31Z MEMBER   0 pydata/xarray/pulls/2599
  • [x] Closes #2597
  • [x] Tests added
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2599/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
377361760 MDExOlB1bGxSZXF1ZXN0MjI4MzIyODcz 2543 Remove old-style resample example in documentation spencerkclark 6628425 closed 0     0 2018-11-05T11:37:47Z 2018-11-05T17:22:52Z 2018-11-05T16:46:30Z MEMBER   0 pydata/xarray/pulls/2543

Minor follow-up to #2541

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2543/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
374434077 MDExOlB1bGxSZXF1ZXN0MjI2MTM3ODc1 2516 Switch enable_cftimeindex to True by default spencerkclark 6628425 closed 0     5 2018-10-26T15:26:31Z 2018-11-01T17:52:45Z 2018-11-01T05:04:25Z MEMBER   0 pydata/xarray/pulls/2516

As discussed in #2437 and #2505, this sets the option enable_cftimeindex to True by default.

  • [x] Fully documented, including whats-new.rst for all changes.
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2516/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
374748365 MDExOlB1bGxSZXF1ZXN0MjI2MzYzMTc5 2522 Remove tests where results change in cftime 1.0.2.1 spencerkclark 6628425 closed 0     1 2018-10-28T12:25:38Z 2018-10-30T01:58:15Z 2018-10-30T01:00:43Z MEMBER   0 pydata/xarray/pulls/2522
  • [x] Closes #2521 (remove if there is no corresponding issue, which should only be the case for minor changes)

cftime version 1.0.2.1 (currently only installed on Windows, because it hasn't appeared on conda-forge yet) includes some changes that improve the precision of datetime arithmetic, which causes some results of infer_datetime_units to change. These changes aren't really a concern, because it doesn't impact our ability to round-trip dates; it just changes the units dates are encoded with in some cases. For that reason I've just deleted the tests where the answers change across versions.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2522/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
374695591 MDExOlB1bGxSZXF1ZXN0MjI2MzI4NjQ5 2519 Fix bug in encode_cf_datetime spencerkclark 6628425 closed 0     1 2018-10-27T22:28:37Z 2018-10-28T01:30:00Z 2018-10-28T00:39:00Z MEMBER   0 pydata/xarray/pulls/2519
  • [x] Closes #2272
  • [x] Tests added (for all bug fixes or enhancements)
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API (remove if this change should not be visible to users, e.g., if it is an internal clean-up, or if this is part of a larger project that will be documented later)
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2519/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
374384608 MDExOlB1bGxSZXF1ZXN0MjI2MDk4NzM2 2515 Remove Dataset.T from api-hidden.rst spencerkclark 6628425 closed 0     1 2018-10-26T13:29:46Z 2018-10-26T14:52:22Z 2018-10-26T14:50:35Z MEMBER   0 pydata/xarray/pulls/2515

Just a minor followup to #2509 to remove Dataset.T from the documentation.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2515/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
369803291 MDExOlB1bGxSZXF1ZXN0MjIyNjUwODE1 2485 Improve arithmetic operations involving CFTimeIndexes and TimedeltaIndexes spencerkclark 6628425 closed 0     2 2018-10-13T13:41:18Z 2018-10-18T18:22:44Z 2018-10-17T04:00:57Z MEMBER   0 pydata/xarray/pulls/2485
  • [x] Closes #2484
  • [x] Tests added (for all bug fixes or enhancements)
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2485/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
369751771 MDU6SXNzdWUzNjk3NTE3NzE= 2484 Enable add/sub operations involving a CFTimeIndex and a TimedeltaIndex spencerkclark 6628425 closed 0     1 2018-10-13T01:00:28Z 2018-10-17T04:00:57Z 2018-10-17T04:00:57Z MEMBER      

```
In [1]: import xarray as xr

In [2]: start_dates = xr.cftime_range('1999-12', periods=12, freq='M')

In [3]: end_dates = start_dates.shift(1, 'M')

In [4]: end_dates - start_dates
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-43c24409020b> in <module>()
----> 1 end_dates - start_dates

/Users/spencerclark/xarray-dev/xarray/xarray/coding/cftimeindex.pyc in __sub__(self, other)
    365
    366     def __sub__(self, other):
--> 367         return CFTimeIndex(np.array(self) - other)
    368
    369

TypeError: unsupported operand type(s) for -: 'numpy.ndarray' and 'CFTimeIndex'
```

Problem description

Subtracting one DatetimeIndex from another produces a TimedeltaIndex:

```
In [5]: import pandas as pd

In [6]: start_dates = pd.date_range('1999-12', periods=12, freq='M')

In [7]: end_dates = start_dates.shift(1, 'M')

In [8]: end_dates - start_dates
Out[8]:
TimedeltaIndex(['31 days', '29 days', '31 days', '30 days', '31 days',
                '30 days', '31 days', '31 days', '30 days', '31 days',
                '30 days', '31 days'],
               dtype='timedelta64[ns]', freq=None)
```

This should also be straightforward to enable for CFTimeIndexes and would be useful, for example, in the problem described in https://github.com/pydata/xarray/issues/2481#issue-369639339.

Expected Output

```
In [1]: import xarray as xr

In [2]: start_dates = xr.cftime_range('1999-12', periods=12, freq='M')

In [3]: end_dates = start_dates.shift(1, 'M')

In [4]: end_dates - start_dates
Out[4]:
TimedeltaIndex(['31 days', '29 days', '31 days', '30 days', '31 days',
                '30 days', '31 days', '31 days', '30 days', '31 days',
                '30 days', '31 days'],
               dtype='timedelta64[ns]', freq=None)
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2484/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
366969121 MDExOlB1bGxSZXF1ZXN0MjIwNTE2Nzcz 2464 Clean up _parse_array_of_cftime_strings spencerkclark 6628425 closed 0     2 2018-10-04T21:02:29Z 2018-10-05T11:10:46Z 2018-10-05T08:02:18Z MEMBER   0 pydata/xarray/pulls/2464

Per @shoyer's comment, https://github.com/pydata/xarray/pull/2431#discussion_r221976257, this cleans up _parse_array_of_cftime_strings, making it robust to multi-dimensional arrays in the process.
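The shape-preserving behavior can be illustrated with a generic sketch (plain `datetime` parsing via `np.vectorize`, not xarray's actual `_parse_array_of_cftime_strings` implementation):

```python
from datetime import datetime

import numpy as np

# Parse a multi-dimensional array of date strings elementwise while
# preserving its shape -- the property this cleanup aimed to guarantee.
strings = np.array([["2000-01-01", "2000-02-01"],
                    ["2000-03-01", "2000-04-01"]])
parse = np.vectorize(lambda s: datetime.strptime(s, "%Y-%m-%d"),
                     otypes=[object])
dates = parse(strings)
print(dates.shape)  # (2, 2)
```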

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2464/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
362905229 MDExOlB1bGxSZXF1ZXN0MjE3NDg3NjIz 2431 Add CFTimeIndex.shift spencerkclark 6628425 closed 0     0 2018-09-23T01:42:25Z 2018-10-02T15:34:49Z 2018-10-02T14:44:30Z MEMBER   0 pydata/xarray/pulls/2431
  • [x] Closes #2244
  • [x] Tests added (for all bug fixes or enhancements)
  • [x] Tests passed (for all non-documentation changes)
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API (remove if this change should not be visible to users, e.g., if it is an internal clean-up, or if this is part of a larger project that will be documented later)
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2431/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
365143140 MDExOlB1bGxSZXF1ZXN0MjE5MTU0NDk5 2448 Fix FutureWarning resulting from CFTimeIndex.date_type spencerkclark 6628425 closed 0     1 2018-09-29T15:48:16Z 2018-09-30T13:17:11Z 2018-09-30T13:16:49Z MEMBER   0 pydata/xarray/pulls/2448

With the latest version of pandas, checking the date_type of a CFTimeIndex produces a FutureWarning:

```
In [1]: import xarray as xr

In [2]: times = xr.cftime_range('2000', periods=5)

In [3]: times.date_type
/Users/spencerclark/xarray-dev/xarray/xarray/coding/cftimeindex.py:161: FutureWarning: CFTimeIndex.data is deprecated and will be removed in a future version
  if self.data:
Out[3]: cftime._cftime.DatetimeProlepticGregorian
```

I think it was a typo to begin with to use `self.data` in `cftimeindex.get_date_type` (my mistake). Here I switch to using `self._data`, which is used elsewhere when internally referencing values of the index.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2448/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
362977117 MDExOlB1bGxSZXF1ZXN0MjE3NTMxMDQ5 2434 Enable use of cftime.datetime coordinates with differentiate and interp spencerkclark 6628425 closed 0     0 2018-09-23T21:02:36Z 2018-09-28T13:45:44Z 2018-09-28T13:44:55Z MEMBER   0 pydata/xarray/pulls/2434
  • [x] Tests added (for all bug fixes or enhancements)
  • [x] Tests passed (for all non-documentation changes)
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API (remove if this change should not be visible to users, e.g., if it is an internal clean-up, or if this is part of a larger project that will be documented later)

As discussed in https://github.com/pydata/xarray/pull/2398#pullrequestreview-156804917, this enables the use of differentiate and interp on DataArrays/Datasets with cftime.datetime coordinates.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2434/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
342793201 MDExOlB1bGxSZXF1ZXN0MjAyNjA5OTEz 2301 WIP Add a CFTimeIndex-enabled xr.cftime_range function spencerkclark 6628425 closed 0     9 2018-07-19T16:04:10Z 2018-09-19T20:24:51Z 2018-09-19T20:24:40Z MEMBER   0 pydata/xarray/pulls/2301
  • [x] Closes #2142
  • [x] Tests added (for all bug fixes or enhancements)
  • [x] Tests passed (for all non-documentation changes)
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API (remove if this change should not be visible to users, e.g., if it is an internal clean-up, or if this is part of a larger project that will be documented later)

I took the approach first discussed here by @shoyer and followed pandas by creating simplified offset classes for use with cftime objects to implement a CFTimeIndex-enabled cftime_range function. I still may clean things up a bit and add a few more tests, but I wanted to post this in its current state to show some progress, as I think it is more or less working. I will try to ping folks when it is ready for a more detailed review.

Here are a few examples:

```
In [1]: import xarray as xr

In [2]: xr.cftime_range('2000-02-01', '2002-05-05', freq='3M', calendar='noleap')
Out[2]:
CFTimeIndex([2000-02-28 00:00:00, 2000-05-31 00:00:00, 2000-08-31 00:00:00,
             2000-11-30 00:00:00, 2001-02-28 00:00:00, 2001-05-31 00:00:00,
             2001-08-31 00:00:00, 2001-11-30 00:00:00, 2002-02-28 00:00:00],
            dtype='object')

In [3]: xr.cftime_range('2000-02-01', periods=4, freq='3A-JUN', calendar='noleap')
Out[3]:
CFTimeIndex([2000-06-30 00:00:00, 2003-06-30 00:00:00, 2006-06-30 00:00:00,
             2009-06-30 00:00:00],
            dtype='object')

In [4]: xr.cftime_range(end='2000-02-01', periods=4, freq='3A-JUN')
Out[4]:
CFTimeIndex([1990-06-30 00:00:00, 1993-06-30 00:00:00, 1996-06-30 00:00:00,
             1999-06-30 00:00:00],
            dtype='object')
```

Hopefully the offset classes defined here would also be useful for implementing things like resample for CFTimeIndex objects (#2191) and CFTimeIndex.shift (#2244).

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2301/reactions",
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 1,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
324759261 MDExOlB1bGxSZXF1ZXN0MTg5MjYzMjc3 2166 Fix string slice indexing for a length-1 CFTimeIndex spencerkclark 6628425 closed 0     1 2018-05-21T01:04:01Z 2018-05-21T10:51:16Z 2018-05-21T08:02:35Z MEMBER   0 pydata/xarray/pulls/2166
  • [x] Closes #2165
  • [x] Tests added (for all bug fixes or enhancements)
  • [x] Tests passed (for all non-documentation changes)
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API (remove if this change should not be visible to users, e.g., if it is an internal clean-up, or if this is part of a larger project that will be documented later)

The issue is that both is_monotonic_decreasing and is_monotonic_increasing return True for a length-1 index; therefore an additional check is needed to make sure the length of the index is greater than 1 in CFTimeIndex._maybe_cast_slice_bound. This is similar to how things are done in DatetimeIndex._maybe_cast_slice_bound in pandas.
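The pandas behavior that motivates the extra length check can be seen directly (a standalone illustration):

```python
import pandas as pd

# For a length-1 index, pandas reports it as both monotonically
# increasing and monotonically decreasing, so any slice-bound logic
# that branches on the ordering needs an explicit len(index) > 1 check.
index = pd.DatetimeIndex(["2000-01-01"])
print(index.is_monotonic_increasing, index.is_monotonic_decreasing)
```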

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2166/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
324758225 MDU6SXNzdWUzMjQ3NTgyMjU= 2165 CFTimeIndex improperly handles string slice for length-1 indexes spencerkclark 6628425 closed 0     0 2018-05-21T00:51:55Z 2018-05-21T08:02:35Z 2018-05-21T08:02:35Z MEMBER      

Code Sample, a copy-pastable example if possible

```
In [1]: import xarray as xr

In [2]: import cftime

In [3]: index = xr.CFTimeIndex([cftime.DatetimeNoLeap(1, 1, 1)])

In [4]: da = xr.DataArray([1], coords=[index], dims=['time'])

In [5]: da.sel(time=slice('0001', '0001'))
Out[5]:
<xarray.DataArray (time: 0)>
array([], dtype=int64)
Coordinates:
  * time     (time) object
```

Problem description

When a CFTimeIndex is created with a single element, slicing using strings does not work; the example above should behave analogously to how it does when using a DatetimeIndex:

```
In [9]: import pandas as pd

In [10]: index = pd.DatetimeIndex(['2000-01-01'])

In [11]: da = xr.DataArray([1], coords=[index], dims=['time'])

In [12]: da.sel(time=slice('2000', '2000'))
Out[12]:
<xarray.DataArray (time: 1)>
array([1])
Coordinates:
  * time     (time) datetime64[ns] 2000-01-01
```

I have a fix for this, which I will push shortly.

Expected Output

```
In [5]: da.sel(time=slice('0001', '0001'))
Out[5]:
<xarray.DataArray (time: 1)>
array([1])
Coordinates:
  * time     (time) object 0001-01-01 00:00:00
```

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.6.1.final.0 python-bits: 64 OS: Darwin OS-release: 17.4.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 xarray: 0.10.4 pandas: 0.20.2 numpy: 1.13.1 scipy: 0.19.1 netCDF4: 1.4.0 h5netcdf: 0.5.1 h5py: 2.8.0 Nio: None zarr: None bottleneck: 1.2.0 cyordereddict: None dask: 0.15.0 distributed: 1.17.1 matplotlib: 2.0.2 cartopy: None seaborn: None setuptools: 33.1.1.post20170320 pip: 9.0.1 conda: None pytest: 3.1.2 IPython: 6.1.0 sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2165/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
322813547 MDExOlB1bGxSZXF1ZXN0MTg3ODI0NjUw 2128 Fix datetime.timedelta casting bug in coding.times.infer_datetime_units spencerkclark 6628425 closed 0     1 2018-05-14T13:20:03Z 2018-05-14T19:18:05Z 2018-05-14T19:17:37Z MEMBER   0 pydata/xarray/pulls/2128
  • [x] Closes #2127
  • [x] Tests added
  • [x] Tests passed

I can confirm the docs now build properly locally:

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2128/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
322591813 MDU6SXNzdWUzMjI1OTE4MTM= 2127 cftime.datetime serialization example failing in latest doc build spencerkclark 6628425 closed 0     9 2018-05-13T12:58:15Z 2018-05-14T19:17:37Z 2018-05-14T19:17:37Z MEMBER      

Code Sample, a copy-pastable example if possible

```
In [1]: from itertools import product

In [2]: import numpy as np

In [3]: import xarray as xr

In [4]: from cftime import DatetimeNoLeap

In [5]: dates = [DatetimeNoLeap(year, month, 1) for year, month in
   ...:          product(range(1, 3), range(1, 13))]

In [6]: with xr.set_options(enable_cftimeindex=True):
   ...:     da = xr.DataArray(np.arange(24), coords=[dates], dims=['time'], name='foo')
   ...:

In [7]: da.to_netcdf('test.nc')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-7-306dbf0ba669> in <module>()
----> 1 da.to_netcdf('test.nc')

/Users/spencerclark/xarray-dev/xarray/xarray/core/dataarray.pyc in to_netcdf(self, *args, **kwargs)
   1514         dataset = self.to_dataset()
   1515
-> 1516         return dataset.to_netcdf(*args, **kwargs)
   1517
   1518     def to_dict(self):

/Users/spencerclark/xarray-dev/xarray/xarray/core/dataset.pyc in to_netcdf(self, path, mode, format, group, engine, encoding, unlimited_dims)
   1143         return to_netcdf(self, path, mode, format=format, group=group,
   1144                          engine=engine, encoding=encoding,
-> 1145                          unlimited_dims=unlimited_dims)
   1146
   1147     def to_zarr(self, store=None, mode='w-', synchronizer=None, group=None,

/Users/spencerclark/xarray-dev/xarray/xarray/backends/api.pyc in to_netcdf(dataset, path_or_file, mode, format, group, engine, writer, encoding, unlimited_dims)
    681     try:
    682         dataset.dump_to_store(store, sync=sync, encoding=encoding,
--> 683                               unlimited_dims=unlimited_dims)
    684         if path_or_file is None:
    685             return target.getvalue()

/Users/spencerclark/xarray-dev/xarray/xarray/core/dataset.pyc in dump_to_store(self, store, encoder, sync, encoding, unlimited_dims)
   1073
   1074         store.store(variables, attrs, check_encoding,
-> 1075                     unlimited_dims=unlimited_dims)
   1076         if sync:
   1077             store.sync()

/Users/spencerclark/xarray-dev/xarray/xarray/backends/common.pyc in store(self, variables, attributes, check_encoding_set, unlimited_dims)
    356         """
    357
--> 358         variables, attributes = self.encode(variables, attributes)
    359
    360         self.set_attributes(attributes)

/Users/spencerclark/xarray-dev/xarray/xarray/backends/common.pyc in encode(self, variables, attributes)
    441         # All NetCDF files get CF encoded by default, without this attempting
    442         # to write times, for example, would fail.
--> 443         variables, attributes = cf_encoder(variables, attributes)
    444         variables = OrderedDict([(k, self.encode_variable(v))
    445                                  for k, v in variables.items()])

/Users/spencerclark/xarray-dev/xarray/xarray/conventions.pyc in cf_encoder(variables, attributes)
    575     """
    576     new_vars = OrderedDict((k, encode_cf_variable(v, name=k))
--> 577                            for k, v in iteritems(variables))
    578     return new_vars, attributes

python2/cyordereddict/_cyordereddict.pyx in cyordereddict._cyordereddict.OrderedDict.__init__ (python2/cyordereddict/_cyordereddict.c:1225)()

//anaconda/envs/xarray-dev/lib/python2.7/_abcoll.pyc in update(*args, **kwds)
    569                     self[key] = other[key]
    570             else:
--> 571                 for key, value in other:
    572                     self[key] = value
    573         for key, value in kwds.items():

/Users/spencerclark/xarray-dev/xarray/xarray/conventions.pyc in <genexpr>((k, v))
    575     """
    576     new_vars = OrderedDict((k, encode_cf_variable(v, name=k))
--> 577                            for k, v in iteritems(variables))
    578     return new_vars, attributes

/Users/spencerclark/xarray-dev/xarray/xarray/conventions.pyc in encode_cf_variable(var, needs_copy, name)
    232                   variables.CFMaskCoder(),
    233                   variables.UnsignedIntegerCoder()]:
--> 234         var = coder.encode(var, name=name)
    235
    236     # TODO(shoyer): convert all of these to use coders, too:

/Users/spencerclark/xarray-dev/xarray/xarray/coding/times.pyc in encode(self, variable, name)
    384                 data,
    385                 encoding.pop('units', None),
--> 386                 encoding.pop('calendar', None))
    387             safe_setitem(attrs, 'units', units, name=name)
    388             safe_setitem(attrs, 'calendar', calendar, name=name)

/Users/spencerclark/xarray-dev/xarray/xarray/coding/times.pyc in encode_cf_datetime(dates, units, calendar)
    338
    339     if units is None:
--> 340         units = infer_datetime_units(dates)
    341     else:
    342         units = _cleanup_netcdf_time_units(units)

/Users/spencerclark/xarray-dev/xarray/xarray/coding/times.pyc in infer_datetime_units(dates)
    254         reference_date = dates[0] if len(dates) > 0 else '1970-01-01'
    255         reference_date = format_cftime_datetime(reference_date)
--> 256     unique_timedeltas = np.unique(np.diff(dates)).astype('timedelta64[ns]')
    257     units = _infer_time_units_from_diff(unique_timedeltas)
    258     return '%s since %s' % (units, reference_date)

TypeError: Cannot cast datetime.timedelta object from metadata [Y] to [ns] according to the rule 'same_kind'
```

Problem description

This seems to be an edge case that was not covered in the tests I added in #1252. Strangely, if I re-wrap the result of np.unique(np.diff(dates)) in np.array before converting to 'timedelta64[ns]', things work:

```
In [9]: np.unique(np.diff(dates)).astype('timedelta64[ns]')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-9-5d53452b676f> in <module>()
----> 1 np.unique(np.diff(dates)).astype('timedelta64[ns]')

TypeError: Cannot cast datetime.timedelta object from metadata [Y] to [ns] according to the rule 'same_kind'

In [10]: np.array(np.unique(np.diff(dates))).astype('timedelta64[ns]')
Out[10]: array([2419200000000000, 2592000000000000, 2678400000000000], dtype='timedelta64[ns]')
```

Might anyone have any ideas as to what the underlying issue is? The fix could be as simple as that, but I don't understand why it makes a difference.
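The working path can be exercised directly with plain datetime.timedelta objects (a minimal sketch; it does not reproduce the cftime-specific failure above, it only shows the object-array route that succeeds):

```python
import datetime

import numpy as np

# Going through an explicit object-dtype array and then casting
# converts each datetime.timedelta element to timedelta64[ns].
deltas = [datetime.timedelta(days=31), datetime.timedelta(days=28)]
converted = np.array(deltas, dtype=object).astype("timedelta64[ns]")
print(converted.dtype)
```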

Expected Output

da.to_netcdf('test.nc') should succeed without an error.

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.14.final.0
python-bits: 64
OS: Darwin
OS-release: 17.4.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None

xarray: 0.8.2+dev641.g7302d7e
pandas: 0.22.0
numpy: 1.13.1
scipy: 0.19.1
netCDF4: 1.3.1
h5netcdf: None
h5py: 2.7.1
Nio: None
zarr: 2.2.0
bottleneck: None
cyordereddict: 1.0.0
dask: 0.17.1
distributed: 1.21.3
matplotlib: 2.2.2
cartopy: None
seaborn: 0.8.1
setuptools: 38.4.0
pip: 9.0.1
conda: None
pytest: 3.3.2
IPython: 5.5.0
sphinx: 1.7.1
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2127/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
322586832 MDExOlB1bGxSZXF1ZXN0MTg3NjY1NjIy 2126 Add cftime to doc/environment.yml spencerkclark 6628425 closed 0     1 2018-05-13T11:44:09Z 2018-05-13T13:10:15Z 2018-05-13T11:56:54Z MEMBER   0 pydata/xarray/pulls/2126

cftime is now needed to build the documentation: http://xarray.pydata.org/en/latest/time-series.html#non-standard-calendars-and-dates-outside-the-timestamp-valid-range

Sorry I neglected this in #1252!

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2126/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
314151496 MDExOlB1bGxSZXF1ZXN0MTgxNTIxMjY2 2054 Updates for the renaming of netcdftime to cftime spencerkclark 6628425 closed 0     2 2018-04-13T15:25:50Z 2018-04-16T01:21:54Z 2018-04-16T01:07:59Z MEMBER   0 pydata/xarray/pulls/2054

Addresses https://github.com/pydata/xarray/pull/1252#issuecomment-381131366

Perhaps I should have waited until cftime was up on conda-forge; once it is, I can update this PR to install it from conda-forge rather than via pip when setting up the CI environments.

I made updates to the installing and time series pages of the docs. Does this need a what's new entry? I'm not sure which heading I would classify it under.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2054/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull


CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
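The schema above can be queried directly from a local SQLite copy of this data. A minimal sketch using Python's `sqlite3` module, reproducing this page's filter (user 6628425, xarray repo 13221727, newest first); the in-memory database and the single inserted row are illustrative stand-ins for a real export file:

```python
import sqlite3

# Stand-in for a local export file: create the relevant columns of the
# issues table in memory and insert one row from this page.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE issues (
        id INTEGER PRIMARY KEY, number INTEGER, title TEXT, [user] INTEGER,
        state TEXT, updated_at TEXT, repo INTEGER, type TEXT)"""
)
conn.execute(
    "INSERT INTO issues VALUES (322586832, 2126, "
    "'Add cftime to doc/environment.yml', 6628425, 'closed', "
    "'2018-05-13T13:10:15Z', 13221727, 'pull')"
)

# The query this page runs: all issues by user 6628425 in the xarray repo
# (id 13221727), sorted by updated_at descending.
rows = conn.execute(
    """
    SELECT number, title, state, type
    FROM issues
    WHERE [user] = ? AND repo = ?
    ORDER BY updated_at DESC
    """,
    (6628425, 13221727),
).fetchall()
print(rows[0][0])  # 2126
```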