issues


35 rows where user = 35919497 sorted by updated_at descending


type 2

  • pull 26
  • issue 9

state 1

  • closed 35

repo 1

  • xarray 35
id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
324272267 MDU6SXNzdWUzMjQyNzIyNjc= 2157 groupby should not squeeze out dimensions aurghs 35919497 closed 0     1 2018-05-18T05:10:57Z 2024-01-08T01:05:24Z 2024-01-08T01:05:24Z COLLABORATOR      

Code Sample

```python
import numpy as np
import xarray as xr

arr = xr.DataArray(
    np.ones(3),
    dims=('x',),
    coords={'x': ('x', np.array([1, 3, 6]))},
)
list(arr.groupby('x'))

[(1, <xarray.DataArray ()> array(1.) Coordinates: x int64 1),
 (3, <xarray.DataArray ()> array(1.) Coordinates: x int64 3),
 (6, <xarray.DataArray ()> array(1.) Coordinates: x int64 6)]
```

Problem description

The dimension x disappears. I have done some tests, and it seems that this problem arises only with strictly ascending coordinates. For example, in this case it works correctly:

```python
arr = xr.DataArray(
    np.ones(3),
    dims=('x',),
    coords={'x': ('x', np.array([2, 1, 0]))},
)
list(arr.groupby('x'))

[(0, <xarray.DataArray (x: 1)> array([1.]) Coordinates: * x (x) int64 0),
 (1, <xarray.DataArray (x: 1)> array([1.]) Coordinates: * x (x) int64 1),
 (2, <xarray.DataArray (x: 1)> array([1.]) Coordinates: * x (x) int64 2)]
```

Expected Output

```python
arr = xr.DataArray(
    np.ones(3),
    dims=('x',),
    coords={'x': ('x', np.array([1, 3, 6]))},
)
list(arr.groupby('x'))

[(1, <xarray.DataArray (x: 1)> array([1.]) Coordinates: * x (x) int64 1),
 (3, <xarray.DataArray (x: 1)> array([1.]) Coordinates: * x (x) int64 3),
 (6, <xarray.DataArray (x: 1)> array([1.]) Coordinates: * x (x) int64 6)]
```
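For context, the behaviour asked for here is what groupby(..., squeeze=False) produced on releases that still accepted the squeeze argument (newer releases no longer squeeze by default). A minimal sketch, assuming a version where the argument is available:

```python
import numpy as np
import xarray as xr

arr = xr.DataArray(
    np.ones(3),
    dims=('x',),
    coords={'x': ('x', np.array([1, 3, 6]))},
)

# Keep the grouped dimension instead of squeezing it away; each group is
# then a length-1 DataArray along x, as in the expected output above.
for label, group in arr.groupby('x', squeeze=False):
    print(label, group.dims, group.shape)
```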

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.6.0.final.0 python-bits: 64 OS: Linux OS-release: 4.13.0-41-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 xarray: 0.10.4 pandas: 0.22.0 numpy: 1.14.3 scipy: 1.1.0 netCDF4: 1.3.1 h5netcdf: None h5py: 2.7.1 Nio: None zarr: None bottleneck: None cyordereddict: None dask: 0.17.4 distributed: 1.21.8 matplotlib: 2.2.2 cartopy: 0.16.0 seaborn: None setuptools: 38.4.1 pip: 10.0.1 conda: None pytest: 3.5.1 IPython: 6.2.1 sphinx: 1.7.4
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2157/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
916333567 MDExOlB1bGxSZXF1ZXN0NjY2MDE4MTYx 5455 Improve error message for guess engine aurghs 35919497 closed 0     1 2021-06-09T15:22:24Z 2021-06-23T16:36:16Z 2021-06-23T08:18:08Z COLLABORATOR   0 pydata/xarray/pulls/5455

When open_dataset() fails because no working engines are found, it now suggests installing the dependencies of the compatible internal backends, explicitly providing the list.

  • [x] Closes #5302
  • [x] Tests added
  • [x] Passes pre-commit run --all-files

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5455/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
853644364 MDExOlB1bGxSZXF1ZXN0NjExNzAwOTQw 5135 Fix open_dataset regression aurghs 35919497 closed 0     15 2021-04-08T16:26:15Z 2021-04-15T12:11:34Z 2021-04-15T12:11:34Z COLLABORATOR   0 pydata/xarray/pulls/5135

Fix an open_dataset regression: expand ~ in filename_or_obj when necessary. I have checked the behaviour of the engines; it seems that pynio already expands ~.
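A minimal sketch of the kind of normalisation described (the helper name is illustrative, not the one used in the PR): expand ~ only for path-like inputs and leave file objects and stores untouched.

```python
import os


def expand_user_path(filename_or_obj):
    # Illustrative helper: expand "~" only for str / os.PathLike inputs,
    # leaving file-like objects and store instances unchanged.
    if isinstance(filename_or_obj, (str, os.PathLike)):
        return os.path.expanduser(os.fspath(filename_or_obj))
    return filename_or_obj


print(expand_user_path("~/data/example.nc"))  # e.g. /home/<user>/data/example.nc
```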

  • [x] Closes #5098
  • [x] Passes pre-commit run --all-files
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5135/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
853705865 MDExOlB1bGxSZXF1ZXN0NjExNzUyNjU3 5136 Fix broken engine breakes xarray.open_dataset aurghs 35919497 closed 0     2 2021-04-08T17:47:12Z 2021-04-10T23:55:04Z 2021-04-10T23:55:01Z COLLABORATOR   0 pydata/xarray/pulls/5136

Currently, a broken engine breaks xarray.open_dataset. I have added a try/except to avoid this problem (a sketch follows the examples below).

Old behaviour:

```python
ds = xr.open_dataset('example.nc')
Traceback (most recent call last):

  File "/usr/local/Caskroom/miniconda/base/envs/xarray/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3331, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)

  File "<ipython-input-3-0c694cae8262>", line 1, in <module>
    arr = xr.open_dataset("example.nc")

  File "/Users/barghini/devel/xarray/xarray/backends/api.py", line 495, in open_dataset
    backend = plugins.get_backend(engine)

  File "/Users/barghini/devel/xarray/xarray/backends/plugins.py", line 115, in get_backend
    engines = list_engines()

  File "/Users/barghini/devel/xarray/xarray/backends/plugins.py", line 97, in list_engines
    return build_engines(pkg_entrypoints)

  File "/Users/barghini/devel/xarray/xarray/backends/plugins.py", line 84, in build_engines
    external_backend_entrypoints = backends_dict_from_pkg(pkg_entrypoints)

  File "/Users/barghini/devel/xarray/xarray/backends/plugins.py", line 58, in backends_dict_from_pkg
    backend = pkg_ep.load()

  File "/usr/local/Caskroom/miniconda/base/envs/xarray/lib/python3.8/site-packages/pkg_resources/__init__.py", line 2450, in load
    return self.resolve()

  File "/usr/local/Caskroom/miniconda/base/envs/xarray/lib/python3.8/site-packages/pkg_resources/__init__.py", line 2456, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)

  File "/Users/barghini/devel/xarray-sentinel/xarray_sentinel/sentinel1.py", line 13
    ERROR
        ^
SyntaxError: invalid syntax
```

New behaviour:

```python
ds = xr.open_dataset('example.nc')
/Users/barghini/devel/xarray/xarray/backends/plugins.py:61: RuntimeWarning: Engine sentinel-1 loading failed:
name 'ERROR' is not defined
  warnings.warn(f"Engine {name} loading failed:\n{ex}", RuntimeWarning)
```
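A minimal sketch of the guarded loading described above (the entrypoint objects follow the pkg_resources-style API from the traceback; details are illustrative):

```python
import warnings


def backends_dict_from_pkg(pkg_entrypoints):
    # Skip entrypoints whose import fails, emitting a RuntimeWarning
    # instead of letting the broken engine crash open_dataset.
    backends = {}
    for pkg_ep in pkg_entrypoints:
        name = pkg_ep.name
        try:
            backends[name] = pkg_ep.load()
        except Exception as ex:
            warnings.warn(f"Engine {name} loading failed:\n{ex}", RuntimeWarning)
    return backends
```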

  • [x] Tests added
  • [x] Passes pre-commit run --all-files
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5136/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
786093946 MDExOlB1bGxSZXF1ZXN0NTU1MDE3MjY5 4810 add new backend api documentation aurghs 35919497 closed 0     2 2021-01-14T15:41:50Z 2021-03-25T14:01:25Z 2021-03-08T19:16:57Z COLLABORATOR   0 pydata/xarray/pulls/4810
  • add backend documentation
  • rename store_spec to filename_or_obj in the backend entrypoint method guess_can_open

  • [x] Related #4803

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4810/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
672912921 MDU6SXNzdWU2NzI5MTI5MjE= 4309 Flexible Backend - AbstractDataStore definition aurghs 35919497 closed 0     6 2020-08-04T16:14:16Z 2021-03-09T01:04:00Z 2021-03-09T01:04:00Z COLLABORATOR      

I just want to do a small recap of the current proposals for the class AbstractDataStore refactor discussed with @shoyer, @jhamman, and @alexamici.

Proposal 1: the store returns:
  • xr.Variables with the list of filters to apply to every variable
  • dataset attributes
  • encodings

Xarray applies to every variable only the filters selected by the backend before building the xr.Dataset.

Proposal 2: the store returns:
  • xr.Variables with all the needed filters applied (configured by xarray)
  • dataset attributes
  • encodings

Xarray builds the xr.Dataset.

Proposal 3: the store returns:
  • an xr.Dataset

Before going on I'd like to collect pros and cons. For my understanding:

Proposal 1

pros:
  • the backend is free to decide which representation to provide.
  • more control on the backend (not necessarily true: the backend can decide to apply all the filters internally and provide xarray an empty list of filters to apply).
  • the enable/disable filters logic would be in xarray.
  • all the filters (applied by xarray) should have a similar interface.
  • maybe registered filters could be used by other backends.

cons:
  • confusing backend-xarray interface.
  • more difficult to define interfaces; more conflicts (registered filters with the same name...).
  • needs more structure to define this interface, more code to maintain.

Proposal 2

pros:
  • the backend-xarray interface is clearer: backend and xarray have well-defined, distinct tasks.
  • the interface would be minimal and easier to implement.
  • no intermediate representations.
  • less code to maintain.

cons:
  • less control on the filters.
  • a more complex explicit definition of the interface (every filter must understand what decode_times means in its case).
  • more complexity inside the filters.

The minimal interface would be something like this:

```python
class AbstractBackEnd:
    def __init__(self, path, encode_times=True, ..., **kwargs):  # signature of open_dataset
        raise NotImplementedError

    def get_variables(self):
        """Return a dictionary of variable name and xr.Variable"""
        raise NotImplementedError

    def get_attrs(self):
        raise NotImplementedError

    def get_encoding(self):
        raise NotImplementedError

    def close(self):
        pass
```
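For illustration, a toy in-memory store that follows the minimal interface above, read in the Proposal 2 spirit (all names and data here are made up):

```python
import numpy as np
import xarray as xr


class InMemoryBackend:
    """Toy store implementing the sketched interface (illustrative only)."""

    def __init__(self, path, encode_times=True, **kwargs):
        self._variables = {"temperature": xr.Variable(("x",), np.arange(3.0))}
        self._attrs = {"source": path}

    def get_variables(self):
        return self._variables

    def get_attrs(self):
        return self._attrs

    def get_encoding(self):
        return {}

    def close(self):
        pass


store = InMemoryBackend("dummy.nc")
ds = xr.Dataset(store.get_variables(), attrs=store.get_attrs())
print(ds)
```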

Proposal 3

pros w.r.t. proposal 2:
  • decode_coordinates is done by the backend, like the other filters.

cons?

Any suggestions?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4309/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
785233324 MDU6SXNzdWU3ODUyMzMzMjQ= 4803 Update Documentation for backend Implementation aurghs 35919497 closed 0     1 2021-01-13T16:04:47Z 2021-03-08T20:58:02Z 2021-03-08T20:58:02Z COLLABORATOR      

The backend read-support refactor is drawing to a close and we should start to add the documentation to explain how to implement new backends.

We should:
  • decide where to put the documentation
  • decide a title
  • define a brief list of the main points to discuss in the documentation.

For the first point, I suggest putting the documentation in "Internal". For the second one, I suggest: "How to add a new backend"

Concerning the third point, here is a list of the topics that I suggest (see the sketch after this list):
  • BackendEntrypoint description (BackendEntrypoint is the main interface with xarray; it is a container of functions to be implemented and attributes: guess_can_open, open_dataset, open_dataset_parameters, [guess_can_write], [dataset_writer]).
  • How to add the backend as an external entrypoint.
  • Description of the functions contained in BackendEntrypoint to be implemented. In particular, for open_dataset we have two options to describe:
    • Not lazy: it returns a dataset containing numpy arrays.
    • Lazy: it returns a dataset containing BackendArrays. BackendArray description:
      • thread-safe __getitem__
      • picklable (use CachingFileManager)
      • indexing.IndexingSupport
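A minimal sketch of the kind of entrypoint such documentation would describe, based on the interface that later landed in xarray (the file extension, loader and variable names are illustrative):

```python
import os

import numpy as np
import xarray as xr
from xarray.backends import BackendEntrypoint


class MyBackendEntrypoint(BackendEntrypoint):
    # "Not lazy" variant: open_dataset returns numpy-backed variables.

    def guess_can_open(self, filename_or_obj):
        try:
            ext = os.path.splitext(filename_or_obj)[1]
        except TypeError:
            return False
        return ext == ".my"

    def open_dataset(self, filename_or_obj, *, drop_variables=None, **kwargs):
        data = np.loadtxt(filename_or_obj)  # illustrative loader
        return xr.Dataset({"data": (("x",), data)})
```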

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4803/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
806076928 MDExOlB1bGxSZXF1ZXN0NTcxNTY1NzYy 4886 Sort backends aurghs 35919497 closed 0     0 2021-02-11T04:53:51Z 2021-02-12T17:48:24Z 2021-02-12T17:48:24Z COLLABORATOR   0 pydata/xarray/pulls/4886

Ensure that backend lists are always sorted in the same way. In particular:
  • the standard backends always come first, in the following order: "netcdf4", "h5netcdf", "scipy"
  • all the other backends are sorted in lexicographic order.

The changes involve two files (plugins.py and test_plugins.py) and include:
  • add a utility function for sorting backends, sort_backends (see the sketch below)
  • update tests
  • small changes in variable/function names.
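A minimal sketch of the ordering described above (the function body is illustrative, not the exact code in the PR):

```python
def sort_backends(backend_entrypoints):
    # Standard backends first, in a fixed order, then all the other
    # backends in lexicographic order.
    standard_backends_order = ["netcdf4", "h5netcdf", "scipy"]
    ordered = {}
    for name in standard_backends_order:
        if name in backend_entrypoints:
            ordered[name] = backend_entrypoints[name]
    for name in sorted(backend_entrypoints):
        if name not in ordered:
            ordered[name] = backend_entrypoints[name]
    return ordered


print(list(sort_backends({"zarr": 1, "scipy": 2, "cfgrib": 3, "netcdf4": 4})))
# ['netcdf4', 'scipy', 'cfgrib', 'zarr']
```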

  • [x] Tests added
  • [x] Passes pre-commit run --all-files
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4886/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
761337459 MDExOlB1bGxSZXF1ZXN0NTM2MDE3ODY0 4673 Port all the engines to apiv2 aurghs 35919497 closed 0     1 2020-12-10T15:27:01Z 2021-02-11T01:56:48Z 2020-12-17T16:21:58Z COLLABORATOR   0 pydata/xarray/pulls/4673

Port all the engines to the new API, apiv2. Notes:
  • test_autoclose_future_warning has been removed because autoclose has been removed in apiv2.py
  • open_backend_dataset_pseudonetcdf currently still uses **format_kwargs, and the signature is defined explicitly

  • [x] Related to https://github.com/pydata/xarray/issues/4309
  • [x] Tests updated
  • [x] Passes isort . && black . && mypy . && flake8
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4673/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
772346218 MDExOlB1bGxSZXF1ZXN0NTQzNjI4Nzkw 4719 Remove close_on_error store.py aurghs 35919497 closed 0     1 2020-12-21T17:34:23Z 2021-02-11T01:56:13Z 2020-12-22T14:31:05Z COLLABORATOR   0 pydata/xarray/pulls/4719

Remove close_on_error in store.py. This change involves only apiv2. Currently, apiv2.open_dataset can take a store instead of a file as input; in case of error, xarray closes the store. Xarray should not manage the closure of a store that has been instantiated externally. This PR corrects this behaviour in apiv2.

  • [x] Related https://github.com/pydata/xarray/pull/4673
  • [x] Passes isort . && black . && mypy . && flake8
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4719/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
773481918 MDExOlB1bGxSZXF1ZXN0NTQ0NTY1ODY2 4724 Remove entrypoints in setup for internal backends aurghs 35919497 closed 0     1 2020-12-23T04:45:40Z 2021-02-11T01:56:03Z 2020-12-24T16:29:44Z COLLABORATOR   0 pydata/xarray/pulls/4724

This PR aims to avoid conflicts during the transition period between the old backend implementation and the new plugins. During the transition period, both external backend plugins and internal ones will coexist. Currently, if two plugins with the same name are detected, we just pick one randomly. It would be better to be sure to use the external one.

Main changes:
  • Remove the internal backend entrypoints from setup.cfg and store them in a dictionary in plugins.py; the dictionary is updated with the external plugins detected by pkg_resources.
  • Move the class BackendEntrypoints to common.py to resolve a circular import.
  • Add a test

  • [x] Related to https://github.com/pydata/xarray/issues/4309
  • [x] Tests added
  • [x] Passes isort . && black . && mypy . && flake8
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4724/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
773717776 MDExOlB1bGxSZXF1ZXN0NTQ0NzUwNDgw 4726 Fix warning on chunks compatibility aurghs 35919497 closed 0     4 2020-12-23T12:25:42Z 2021-02-11T01:55:56Z 2020-12-24T11:32:43Z COLLABORATOR   0 pydata/xarray/pulls/4726

This PR fixes https://github.com/pydata/xarray/issues/4708. It's a very small change:
  • dataset._check_chunks_compatibility no longer raises a warning if last_chunk % preferred_chunk != 0 (see the sketch below).
  • Update tests.
  • Style: rename a variable inside dataset._check_chunks_compatibility.
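A minimal sketch of the relaxed check described in the first bullet (signature and message are illustrative): only the chunks before the last one have to be multiples of the backend's preferred chunk size.

```python
import warnings


def check_chunks_compatibility(dim, chunks, preferred_chunk):
    # Warn only when an interior chunk breaks the preferred chunking;
    # a smaller trailing chunk is always acceptable.
    for chunk in chunks[:-1]:
        if chunk % preferred_chunk:
            warnings.warn(
                f"Chunks along {dim!r} ({chunks}) do not align with the "
                f"preferred chunk size {preferred_chunk}.",
                UserWarning,
            )
            return


check_chunks_compatibility("x", (4, 4, 4, 4, 4, 2), preferred_chunk=4)  # no warning
```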

  • [x] Closes https://github.com/pydata/xarray/issues/4708
  • [x] Tests added
  • [x] Passes isort . && black . && mypy . && flake8
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4726/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
773803598 MDExOlB1bGxSZXF1ZXN0NTQ0ODI3NTM0 4728 Remove unexpected warnings in tests aurghs 35919497 closed 0     0 2020-12-23T14:01:49Z 2021-02-11T01:55:54Z 2020-12-24T13:12:41Z COLLABORATOR   0 pydata/xarray/pulls/4728
  • #4646 added tests on chunking without using a with statement, causing unexpected warnings.

  • Add filterwarnings in test_plugins.test_remove_duplicates and backend_tests.test_chunking_consistency
  • [x] Tests fixed
  • [x] Passes isort . && black . && mypy . && flake8
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4728/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
773497334 MDExOlB1bGxSZXF1ZXN0NTQ0NTc4Mjc4 4725 remove autoclose in open_dataset and related warning test aurghs 35919497 closed 0     3 2020-12-23T05:28:59Z 2021-02-11T01:55:45Z 2020-12-24T16:25:26Z COLLABORATOR   0 pydata/xarray/pulls/4725

This PR removes the autoclose option from open_dataset (both api.py and apiv2.py) and the corresponding test test_autoclose_future_warning from test.py. The autoclose=True option was deprecated in https://github.com/pydata/xarray/pull/2261, since xarray now uses an LRU cache to manage open file handles.

  • [x] Related to https://github.com/pydata/xarray/issues/4309 and https://github.com/pydata/xarray/pull/2261
  • [x] Tests updated
  • [x] Passes isort . && black . && mypy . && flake8
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4725/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
760460622 MDExOlB1bGxSZXF1ZXN0NTM1Mjg4MzQ2 4669 add encodings["preferred_chunks"], used in open_dataset instead of en… aurghs 35919497 closed 0     0 2020-12-09T16:06:58Z 2021-02-11T01:52:11Z 2020-12-17T16:05:57Z COLLABORATOR   0 pydata/xarray/pulls/4669

Related to https://github.com/pydata/xarray/issues/4496. Add encodings["preferred_chunks"] in zarr, used in open_dataset instead of encodings["chunks"].

  • [x] Related to https://github.com/pydata/xarray/issues/4496
  • [x] Passes isort . && black . && mypy . && flake8
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4669/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
760333256 MDExOlB1bGxSZXF1ZXN0NTM1MTgyNTQy 4667 unify zarr chunking with other chunking in apiv2.open_dataset aurghs 35919497 closed 0     1 2020-12-09T13:32:41Z 2021-02-11T01:51:59Z 2020-12-10T10:18:47Z COLLABORATOR   0 pydata/xarray/pulls/4667

This is the last part of, and closes, #4595. Here we unify the code for chunking in apiv2.open_dataset. Note that the code unification is only a refactor; there are no functional changes, since the zarr chunking has already been aligned with the others.

  • [x] Related to https://github.com/pydata/xarray/issues/4496
  • [x] Passes isort . && black . && mypy . && flake8
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4667/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
756325149 MDExOlB1bGxSZXF1ZXN0NTMxODg3OTk5 4646 Modify zarr chunking as suggested in #4496 aurghs 35919497 closed 0     0 2020-12-03T15:56:28Z 2021-02-11T01:51:55Z 2020-12-09T12:26:45Z COLLABORATOR   0 pydata/xarray/pulls/4646

Part of https://github.com/pydata/xarray/pull/4595. The changes involve only open_dataset(..., engine='zarr') (and marginally open_zarr); in particular, _get_chunks has been modified to fit the option 1 chunking behaviour of #4496 (comment) and to align open_dataset chunking with dataset.chunk:

  • with auto it uses dask auto-chunking (if a preferred_chunking is defined, dask will take it into account as done in dataset.chunk)
  • with -1 it uses dask but no chunking.
  • with {} it uses the backend encoded chunks (when available) for on-disk data (xr.open_dataset) and the current chunking for already opened datasets (ds.chunk)

Some tests were added.

  • [x] Related to pydata#4496
  • [x] Tests added
  • [x] Passes isort . && black . && mypy . && flake8
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4646/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
754261760 MDExOlB1bGxSZXF1ZXN0NTMwMTkxOTkw 4632 Move get_chunks from zarr.py to dataset.py aurghs 35919497 closed 0     0 2020-12-01T10:19:51Z 2021-02-11T01:51:40Z 2020-12-02T09:25:01Z COLLABORATOR   0 pydata/xarray/pulls/4632

The aim is to split the PR https://github.com/pydata/xarray/pull/4595 into small PRs. This smaller PR doesn't change any xarray interfaces; it's only a small code refactor:
  • Move get_chunks from zarr.py to dataset.py.
  • Align apiv2 with apiv1: in apiv2, replace zarr.ZarrStore.maybe_chunk with dataset._maybe_chunk and zarr.ZarrStore.get_chunk with dataset._get_chunks.
  • Remove zarr.ZarrStore.maybe_chunk and zarr.ZarrStore.get_chunks (no longer used).

  • [x] Related #4496
  • [x] Passes isort . && black . && mypy . && flake8
  • No user visible changes (including notable bug fixes) are documented in whats-new.rst
  • No new functions/methods are listed in api.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4632/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
731226031 MDExOlB1bGxSZXF1ZXN0NTExMzczNjI5 4547 Update signature open_dataset for API v2 aurghs 35919497 closed 0     2 2020-10-28T08:35:54Z 2021-02-11T01:50:09Z 2020-11-06T14:43:10Z COLLABORATOR   0 pydata/xarray/pulls/4547

Proposal for the new API of open_dataset(). It is implemented in apiv2.py and it doesn't modify the current behavior of api.open_dataset(). It is something in between the first and second alternative suggested at https://github.com/pydata/xarray/issues/4490#issue-715374721, see the related quoted text:

Describe alternatives you've considered

For the overall approach:

  1. We could keep the current design, with separate keyword arguments for decoding options, and just be very careful about passing around these arguments. This seems pretty painful for the backend refactor, though.
  2. We could keep the current design only for the user facing open_dataset() interface, and then internally convert into the DecodingOptions() struct for passing to backend constructors. This would provide much needed flexibility for backend authors, but most users wouldn't benefit from the new interface. Perhaps this would make sense as an intermediate step?

Instead of a class for the decoders, I have added a function: resolve_decoders_kwargs. It performs two tasks:
  • If decode_cf is False, it sets to False all the decoders supported by the backend (using inspect).
  • It filters out the None decoder keywords.

So xarray manages the keyword decode_cf and passes on only the non-default decoders to the backend. If the user sets a decoder not supported by the backend to a non-None value, the backend will raise an error (a sketch follows).
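A minimal sketch of what such a function might look like (the signature is illustrative):

```python
import inspect


def resolve_decoders_kwargs(decode_cf, open_backend_dataset, **decoders):
    if decode_cf is False:
        # Force off every decoder that the backend's open function accepts.
        signature = inspect.signature(open_backend_dataset)
        for name in decoders:
            if name in signature.parameters:
                decoders[name] = False
    # Drop decoders left at None so the backend defaults apply.
    return {name: value for name, value in decoders.items() if value is not None}
```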

With this implementation, drop_variables should always be supported by the backend. But I think this could be implemented easily by all the backends. I wouldn't group it with the decoders: to me, it seems to be more a filter than a decoder.

The behavior of decode_cf is unchanged.

PROs:
  • the user doesn't need to import and instantiate a class.
  • users get argument completion on open_dataset.
  • the backend defines directly in the open_backend_dataset_${engine} API the accepted decoders.
  • xarray manages decode_cf, not the backends.

Missing points:
  • decode_cf should be renamed decode. Probably, the behavior of decode should be modified, for two reasons:
    • currently, if decode_cf is False, it sets the decoders to False, but there is no check on the other values. The accepted values should be: None (keep the decoders' default values), True (set all the decoders to True), False (set all the decoders to False).
    • currently we can set both a decoder and decode_cf without any warning.
  • Deprecate backend_kwargs (or kwargs).
  • Separate mask_and_scale?

I think that we need a different PR for the three of them.

  • [x] related to https://github.com/pydata/xarray/issues/4490#
  • [x] Passes isort . && black . && mypy . && flake8
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4547/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
791035470 MDExOlB1bGxSZXF1ZXN0NTU5MTU5MDYy 4836 backend interface, now it uses subclassing aurghs 35919497 closed 0     0 2021-01-21T12:38:58Z 2021-01-28T15:22:45Z 2021-01-28T15:21:00Z COLLABORATOR   0 pydata/xarray/pulls/4836

Currently, the interface between the backend and xarray is the class/container BackendEntrypoint, which must be instantiated by the backend. With this pull request, BackendEntrypoint is replaced by AbstractBackendEntrypoint, and the backend will inherit from this class.

Reason for these changes: this type of interface is more standard.

  • [x] Tests updated
  • [x] Passes pre-commit run --all-files
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4836/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
786107421 MDExOlB1bGxSZXF1ZXN0NTU1MDI4MTU3 4811 Bugfix in list_engine aurghs 35919497 closed 0     3 2021-01-14T15:58:38Z 2021-01-19T10:10:26Z 2021-01-19T10:10:26Z COLLABORATOR   0 pydata/xarray/pulls/4811

Currently, list_engines returns the list of all installed backends plus the list of the internal ones; for the internal ones, there is no check on the installed dependencies. Now the registration of the internal backends is done by the backends themselves, and only if the needed dependencies are installed (see the sketch below).
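A minimal sketch of the guarded registration described above (the backend name and dictionary are illustrative):

```python
def netcdf4_available():
    # The internal netCDF4 backend registers itself only when its
    # dependency can actually be imported.
    try:
        import netCDF4  # noqa: F401
    except ImportError:
        return False
    return True


BACKEND_ENTRYPOINTS = {}
if netcdf4_available():
    BACKEND_ENTRYPOINTS["netcdf4"] = "NetCDF4BackendEntrypoint"  # placeholder
```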

  • [x] Passes pre-commit run --all-files
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4811/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
324032926 MDU6SXNzdWUzMjQwMzI5MjY= 2148 groupby beahaviour w.r.t. non principal coordinates aurghs 35919497 closed 0     4 2018-05-17T13:52:43Z 2020-12-17T11:47:47Z 2020-12-17T11:47:47Z COLLABORATOR      

Code Sample

```python
import numpy as np
import pandas as pd
import xarray as xr

arr = xr.DataArray(
    np.ones(5),
    dims=('x',),
    coords={
        'x': ('x', np.array([1, 1, 1, 2, 2])),
        'x2': ('x', np.array([1, 2, 3, 4, 5])),
    },
)
arr
<xarray.DataArray (x: 5)>
array([1., 1., 1., 1., 1.])
Coordinates:
  * x    (x) int64 1 1 1 2 2
    x2   (x) int64 1 2 3 4 5

out = arr.groupby('x').mean('x')
out
<xarray.DataArray (x: 2)>
array([1., 1.])
Coordinates:
  * x    (x) int64 1 2
    x2   (x) int64 1 2 3 4 5
```

Problem description

Inconsistency between:
  • the shape of dimension x: (2,)
  • the shape of the coordinate x2, which depends on dimension x: (5,)

Expected Output

The coordinate x2 should be dropped.

```python
<xarray.DataArray (x: 2)>
array([1., 1.])
Coordinates:
  * x    (x) int64 1 2
```

Output of xr.show_versions()

```python INSTALLED VERSIONS


commit: None python: 3.6.0.final.0 python-bits: 64 OS: Linux OS-release: 4.13.0-41-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8

xarray: 0.10.4 pandas: 0.22.0 numpy: 1.14.3 scipy: 1.1.0 netCDF4: 1.3.1 h5netcdf: None h5py: 2.7.1 Nio: None zarr: None bottleneck: None cyordereddict: None dask: 0.17.4 distributed: 1.21.8 matplotlib: 2.2.2 cartopy: 0.16.0 seaborn: None setuptools: 38.4.1 pip: 10.0.1 conda: None pytest: 3.5.1 IPython: 6.2.1 sphinx: 1.7.4 ```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2148/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
710071238 MDU6SXNzdWU3MTAwNzEyMzg= 4468 Backend read support: dynamic import in xarray namespace of backend open functions aurghs 35919497 closed 0     0 2020-09-28T08:47:09Z 2020-12-10T14:29:56Z 2020-12-10T14:29:56Z COLLABORATOR      

@jhamman, @shoyer @alexamici, we discussed last time the possibility of importing the backends' open functions, open_dataset_${engine}, directly into the xarray namespace. I just want to recap some pros and cons of this proposal (a small sketch follows the lists):

Pros:
  • Expert users can directly use the open function of the backend (without using engine=).
  • They can use the Tab key to autocomplete the backend kwargs.
  • They can easily access the backend open function signature (that's really useful!).

Cons:
  • Users might also expect the other corresponding functions in the namespace, open_mfdataset_${engine}, open_dataarray_${engine}, etc., and we are not going to do that because it is too confusing.
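A small sketch of the idea, assuming the per-engine functions are just thin wrappers around open_dataset with engine pre-filled (names are illustrative):

```python
import functools

import xarray as xr

# Expose one opener per engine in the current namespace.
for _engine in ("netcdf4", "h5netcdf", "scipy"):
    globals()[f"open_dataset_{_engine}"] = functools.partial(
        xr.open_dataset, engine=_engine
    )

# open_dataset_netcdf4("example.nc") would then be equivalent to
# xr.open_dataset("example.nc", engine="netcdf4")
```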

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4468/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
741714847 MDExOlB1bGxSZXF1ZXN0NTE5OTc5ODE5 4577 Backends entrypoints aurghs 35919497 closed 0     5 2020-11-12T15:53:00Z 2020-12-10T13:30:42Z 2020-12-10T09:56:13Z COLLABORATOR   0 pydata/xarray/pulls/4577
  • It's an update of @jhamman pull request https://github.com/pydata/xarray/pull/3166
  • It uses the entrypoints module to detect the installed engines (see the sketch after this list). The detection is done at the open_dataset function call and is cached. It raises a warning in case of conflicts.
  • Add a class, BackendEntrypoint, for the backend interface instead of a function.

Modified files:
  • add plugins.py, containing the detect_engines function and BackendEntrypoint.
  • dependencies file, to add entrypoints.
  • backends.__init__, to add detect_engines.
  • apiv2.py and api.py, to use detect_engines.
  • zarr.py, h5netcdf_.py, cfgrib.py, to instantiate the BackendEntrypoint.
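A minimal sketch of the detection logic described above, using importlib.metadata (Python 3.10+) instead of the entrypoints package used in the PR; the group name is the one xarray backends register under.

```python
import functools
import warnings
from importlib.metadata import entry_points


@functools.lru_cache(maxsize=None)
def detect_engines():
    # Detection happens lazily at the first open_dataset call and is cached;
    # duplicate names only emit a warning instead of failing.
    engines = {}
    for entrypoint in entry_points(group="xarray.backends"):
        if entrypoint.name in engines:
            warnings.warn(
                f"Found multiple backends named {entrypoint.name!r}; "
                "keeping the first one detected.",
                RuntimeWarning,
            )
            continue
        engines[entrypoint.name] = entrypoint.load()
    return engines
```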

  • [x] Related to #3166
  • [x] Tests added
  • [x] Passes isort . && black . && mypy . && flake8
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4577/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
717410970 MDU6SXNzdWU3MTc0MTA5NzA= 4496 Flexible backends - Harmonise zarr chunking with other backends chunking aurghs 35919497 closed 0 aurghs 35919497   7 2020-10-08T14:43:23Z 2020-12-10T10:51:09Z 2020-12-10T10:51:09Z COLLABORATOR      

Is your feature request related to a problem? Please describe.
In #4309 we proposed to separate xarray and backend tasks, more or less in this way:
  • the backend returns a dataset
  • xarray manages chunks and cache.

With the changes in open_dataset to also support zarr (#4187), we introduced a slightly different behavior for zarr chunking with respect to the other backends.

Behavior of all the backends except zarr:
  • if chunks == {} or 'auto': it uses dask and only one chunk per variable
  • if the user defines chunks for only some of the dimensions, along the remaining dimensions it uses only one chunk:

```python
ds = xr.open_dataset('test.nc', chunks={'x': 4})
print(ds['foo'].chunks)
((4, 4, 4, 4, 4), (4,))
```

Zarr chunking behavior is very similar, but it has a different default when the user doesn't choose the size of the chunk along some dimensions, i.e.:
  • if chunks == {} or 'auto': it uses in both cases the on-disk chunks
  • if the user defines chunks for only some of the dimensions, along the remaining dimensions it uses the on-disk chunks:

```python
ds = xr.open_dataset('test.zarr', engine='zarr', chunks={'x': 4})
print(ds['foo'].encoding['chunks'])
(5, 2)
print(ds['foo'].chunks)
((4, 4, 4, 4, 4), (2, 2))
```

Describe the solution you'd like

We could easily extend the zarr behavior to all the backends (which, for now, don't use the field variable.encodings['chunks']): if no chunks are defined in encoding, we use the dimension size as default; otherwise, we use the encoded chunks. So for now we are not going to change any external behavior, but if needed the other backends can use this interface (see the sketch after the notes below). I have some additional notes:

  • The key value auto is redundant because it has the same behavior as {}, we could remove one of them.
  • I would separate the concepts "on disk chunk" and "preferred chunking". We can use a different key in encodings or ask the backend to expose a function to compute the preferred chunking.
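For illustration, a sketch of how a backend-produced variable could advertise both its on-disk chunks and a separate preferred chunking, and how that hint would be consumed (key names follow the proposal above; requires dask):

```python
import numpy as np
import xarray as xr

var = xr.Variable(("x", "y"), np.zeros((10, 4)))
var.encoding["chunks"] = (5, 2)                        # on-disk chunks
var.encoding["preferred_chunks"] = {"x": 5, "y": 2}    # chunking hint

ds = xr.Dataset({"foo": var})
# open_dataset-style logic would fall back to the preferred chunking when
# the user passes chunks={}; here we apply the hint explicitly:
chunked = ds.chunk(var.encoding["preferred_chunks"])
print(chunked["foo"].chunks)  # ((5, 5), (2, 2))
```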

One last question: in the new interface of open_dataset there is a new key, imported from open_zarr: overwrite_encoded_chunks. Is it really needed? Why do we support overwriting the encoded chunks at read time? This operation can easily be done afterwards or at write time.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4496/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
754283496 MDExOlB1bGxSZXF1ZXN0NTMwMjA5NzE4 4633 change default in ds.chunk and datarray.chunk variable.chunk aurghs 35919497 closed 0     2 2020-12-01T10:48:11Z 2020-12-10T10:38:06Z 2020-12-10T10:38:06Z COLLABORATOR   0 pydata/xarray/pulls/4633

The aim is to split PR #4595 into small PRs. The scope of this smaller PR is to modify the default of chunks in dataset.chunk to align its behaviour with xr.open_dataset. The main changes are:
  • Modify the default of chunks in dataset.chunk, dataarray.chunk and variable.chunk from None to {}.
  • If the user passes chunks=None, it is internally set to {}.
  • Add a FutureWarning to advise that the usage of None will raise an error in the future.

Note that the changes currently don't modify the behaviour of dataset.chunk (a sketch of the new default handling follows).
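A minimal sketch of the new default handling (the function name is illustrative):

```python
import warnings


def normalize_chunks_argument(chunks={}):
    # New default is {}; None is still accepted for backwards compatibility
    # but is mapped to {} with a FutureWarning.
    if chunks is None:
        warnings.warn(
            "None value for 'chunks' is deprecated; it will raise an error "
            "in the future. Use dict() instead.",
            FutureWarning,
        )
        chunks = {}
    return chunks
```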

  • [x] Related #4496
  • [x] Passes isort . && black . && mypy . && flake8
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4633/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
755163626 MDExOlB1bGxSZXF1ZXN0NTMwOTI4MTc2 4642 Refactor apiv2.open_dataset aurghs 35919497 closed 0     1 2020-12-02T10:51:31Z 2020-12-10T10:29:24Z 2020-12-02T13:17:26Z COLLABORATOR   0 pydata/xarray/pulls/4642

Related to PR https://github.com/pydata/xarray/pull/4595. In this smaller PR there are no functional changes; it's only a small code refactor needed to simplify pydata#4595. Changes in apiv2.dataset_from_backend_dataset:
  • rename ds to backend_ds and ds2 to ds
  • simplify the if in the chunking code and split it out into a new function, _chunks_ds
  • add a specific _get_mtime function

Make resolve_decoders_kwargs and dataset_from_backend_dataset private.

  • [x] Related to https://github.com/pydata/xarray/pull/4595
  • [x] Passes isort . && black . && mypy . && flake8

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4642/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
746628598 MDExOlB1bGxSZXF1ZXN0NTIzOTg5NDA4 4595 WIP: Chunking refactor aurghs 35919497 closed 0     0 2020-11-19T14:22:45Z 2020-12-10T10:28:25Z 2020-12-10T10:18:47Z COLLABORATOR   0 pydata/xarray/pulls/4595

This work aims to harmonize the way zarr deals with chunking to have similar behavior for all other backends and unify the code. Most of the changes involve the new API, apiv2.py, except for some changes in the code that has been added with the merge of https://github.com/pydata/xarray/pull/4187.

Main changes:
  • refactor the apiv2.dataset_from_backend_dataset function
  • move _get_chunks from zarr to dataset
  • modify _get_chunks to fit the option 1 chunking behaviour of https://github.com/pydata/xarray/issues/4496#issuecomment-720785384
  • add a warning when ds.chunk(..., chunks=None) is used
  • add some tests

Separate pull requests are needed for the following missing points:
  • standardize the key in encodings used to define the on-disk chunks: chunksizes
  • add a specific key in encodings for the preferred chunking (currently chunks is used)

  • [x] Related https://github.com/pydata/xarray/issues/4496
  • [x] Tests added
  • [x] Passes isort . && black . && mypy . && flake8
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [ ] New functions/methods are listed in api.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4595/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
732378121 MDExOlB1bGxSZXF1ZXN0NTEyMzI4MDY4 4550 WIP: Zarr chunks refactor aurghs 35919497 closed 0     1 2020-10-29T14:44:31Z 2020-12-10T10:28:06Z 2020-11-10T16:08:53Z COLLABORATOR   0 pydata/xarray/pulls/4550

This work aims to harmonize the way zarr deals with chunking to have similar behavior for all other backends and unify the code. Most of the changes involve the new API, apiv2.py, except for some changes in the code that has been added with the merge of https://github.com/pydata/xarray/pull/4187.

Main changes:
  • refactor the apiv2.dataset_from_backend_dataset function
  • move get_chunks from zarr to dataset

Current status:
  • in apiv2.open_dataset, chunks='auto' and chunks={} now have the same behaviour
  • in apiv2.open_dataset, for all the backends, the default chunking is now provided by the backend; if it is not available, one big chunk is used.

Missing points:
  • standardize the key in encodings used to define the on-disk chunks: chunksizes
  • add a specific key in encodings for the preferred chunking (currently chunks is used)

There is one open point still to be discussed: dataset.chunk and open_dataset(..., chunks=...) have different behaviors. dataset.chunk(chunks={}) chunks the dataset with only one chunk per variable, while open_dataset(..., chunks={}) uses encodings['chunks'], when available.

Note that chunks=None also behaves differently: open_dataset(..., chunks=None) (or open_dataset(...), the default) returns variables without chunks, while dataset.chunk(chunks=None) (or dataset.chunk(), the default) has the same behavior as dataset.chunk(chunks={}). Probably it's not worth changing this.

  • [x] related to https://github.com/pydata/xarray/issues/4496
  • [ ] Tests added
  • [x] Passes isort . && black . && mypy . && flake8
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [ ] New functions/methods are listed in api.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4550/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
716644871 MDExOlB1bGxSZXF1ZXN0NDk5MzQ1NjQw 4494 Remove maybe chunck duplicated function aurghs 35919497 closed 0     1 2020-10-07T15:42:35Z 2020-12-10T10:27:34Z 2020-10-08T15:10:46Z COLLABORATOR   0 pydata/xarray/pulls/4494

I propose this small change with a view to unifying, in open_dataset, the logic of zarr chunking with that of the other backends. Currently, the function maybe_chunk is duplicated: it is defined inside the function dataset.chunk and as a method of zarr.ZarrStore. This last function has been added with the recent merge of #4187. I merged the two functions into a private function _maybe_chunk inside the dataset module.

  • [x] Addresses #4309
  • [ ] Tests added
  • [x] Passes isort . && black . && mypy . && flake8
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4494/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
712247862 MDExOlB1bGxSZXF1ZXN0NDk1NzYwOTI1 4477 WIP: Proposed refactor of read API for backends aurghs 35919497 closed 0     3 2020-09-30T20:12:36Z 2020-10-22T15:07:33Z 2020-10-22T15:06:39Z COLLABORATOR   0 pydata/xarray/pulls/4477

The first draft of the new backend API:
  • Move decoding inside the backends.
  • Backends return a Dataset with BackendArrays.
  • Xarray manages chunking and caching.
  • Some code is duplicated; it will be simplified later.
  • Zarr chunking is still inside the backend for now.

cc @jhamman @shoyer

  • [x] Addresses #4309
  • [ ] Tests added
  • [ ] Passes isort . && black . && mypy . && flake8
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [ ] New functions/methods are listed in api.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4477/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
619492955 MDExOlB1bGxSZXF1ZXN0NDE4OTc5MTQ0 4071 #1621 optional decode timedelta aurghs 35919497 closed 0     1 2020-05-16T14:57:39Z 2020-05-19T15:44:21Z 2020-05-19T15:43:54Z COLLABORATOR   0 pydata/xarray/pulls/4071

Related to ticket #1621. Add a decode_timedelta kwarg to open_dataset, xr.open_dataarray, xr.open_zarr and xr.decode_cf. If not passed explicitly, the behaviour is unchanged (see the usage sketch below).

  • [x] Tests added for xr.decode_cf.
  • [x] Passes isort -rc . && black . && mypy . && flake8
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API
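A short usage sketch of the new keyword with xr.decode_cf (self-contained; open_dataset and open_zarr accept the same argument):

```python
import numpy as np
import xarray as xr

# A variable whose units mark it as a timedelta quantity.
encoded = xr.Dataset({"dt": ("x", np.array([1, 2, 3]), {"units": "days"})})

decoded = xr.decode_cf(encoded, decode_timedelta=True)
print(decoded["dt"].dtype)   # timedelta64[ns]

kept_raw = xr.decode_cf(encoded, decode_timedelta=False)
print(kept_raw["dt"].dtype)  # int64: raw values are kept
```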

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4071/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
327392061 MDU6SXNzdWUzMjczOTIwNjE= 2196 inconsistent time coordinates types aurghs 35919497 closed 0     1 2018-05-29T16:14:27Z 2020-03-29T14:09:26Z 2020-03-29T14:09:26Z COLLABORATOR      

Code Sample, a copy-pastable example if possible

```python
import numpy as np
import pandas as pd
import xarray as xr

time = np.arange('2005-02-01', '2007-03-01', dtype='datetime64')
arr = xr.DataArray(
    np.arange(time.size), coords=[time,], dims=('time',), name='data'
)
arr.resample(time='M').interpolate('linear')

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-7-6a92b6afe08e> in <module>()
      7     np.arange(time.size), coords=[time,], dims=('time',), name='data'
      8 )
----> 9 arr.resample(time='M').interpolate('linear')

~/devel/c3s-cns/venv_op/lib/python3.6/site-packages/xarray/core/resample.py in interpolate(self, kind)
    108
    109         """
--> 110         return self._interpolate(kind=kind)
    111
    112     def _interpolate(self, kind='linear'):

~/devel/c3s-cns/venv_op/lib/python3.6/site-packages/xarray/core/resample.py in _interpolate(self, kind)
    218             elif self._dim not in v.dims:
    219                 coords[k] = v
--> 220         return DataArray(f(new_x), coords, dims, name=dummy.name,
    221                          attrs=dummy.attrs)
    222

~/devel/c3s-cns/venv_op/lib/python3.6/site-packages/scipy/interpolate/polyint.py in __call__(self, x)
     77         """
     78         x, x_shape = self._prepare_x(x)
---> 79         y = self._evaluate(x)
     80         return self._finish_y(y, x_shape)
     81

~/devel/c3s-cns/venv_op/lib/python3.6/site-packages/scipy/interpolate/interpolate.py in _evaluate(self, x_new)
    632         y_new = self._call(self, x_new)
    633         if not self._extrapolate:
--> 634             below_bounds, above_bounds = self._check_bounds(x_new)
    635             if len(y_new) > 0:
    636                 # Note fill_value must be broadcast up to the proper size

~/devel/c3s-cns/venv_op/lib/python3.6/site-packages/scipy/interpolate/interpolate.py in _check_bounds(self, x_new)
    664                              "range.")
    665         if self.bounds_error and above_bounds.any():
--> 666             raise ValueError("A value in x_new is above the interpolation "
    667                              "range.")
    668

ValueError: A value in x_new is above the interpolation range.
```

Problem description

The internal format of arr.time is datetime64[D]

```python
arr.time

<xarray.DataArray 'time' (time: 758)>
array(['2005-02-01', '2005-02-02', '2005-02-03', ..., '2007-02-26',
       '2007-02-27', '2007-02-28'], dtype='datetime64[D]')
Coordinates:
  * time     (time) datetime64[D] 2005-02-01 2005-02-02 2005-02-03 ...
```

Internally there is a cast to float, for both the old time indices x and the new time indices new_x, but the new time indices are in datetime64[ns], so they don't match.

DataArrayResample._interpolate

```python
x = self._obj[self._dim].astype('float')
y = self._obj.data

axis = self._obj.get_axis_num(self._dim)

f = interp1d(x, y, kind=kind, axis=axis, bounds_error=True,
             assume_sorted=True)
new_x = self._full_index.values.astype('float')
```

With a cast to datetime64[ns] it works:

```python
import numpy as np
import pandas as pd
import xarray as xr

time = np.arange('2005-02-01', '2007-03-01', dtype='datetime64').astype('datetime64[ns]')
arr = xr.DataArray(
    np.arange(time.size), coords=[time,], dims=('time',), name='data'
)
arr.resample(time='M').interpolate('linear')

<xarray.DataArray 'data' (time: 25)>
array([ 27.,  58.,  88., 119., 149., 180., 211., 241., 272., 302., 333.,
       364., 392., 423., 453., 484., 514., 545., 576., 606., 637., 667.,
       698., 729., 757.])
Coordinates:
  * time     (time) datetime64[ns] 2005-02-28 2005-03-31 2005-04-30 ...
```

Expected Output

```python
<xarray.DataArray 'data' (time: 25)>
array([ 27.,  58.,  88., 119., 149., 180., 211., 241., 272., 302., 333.,
       364., 392., 423., 453., 484., 514., 545., 576., 606., 637., 667.,
       698., 729., 757.])
Coordinates:
  * time     (time) datetime64[ns] 2005-02-28 2005-03-31 2005-04-30 ...
```

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.6.0.final.0 python-bits: 64 OS: Linux OS-release: 4.13.0-43-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_GB.UTF-8 LOCALE: en_GB.UTF-8 xarray: 0.10.4 pandas: 0.20.3 numpy: 1.13.1 scipy: 1.0.0 netCDF4: 1.3.1 h5netcdf: None h5py: None Nio: None zarr: None bottleneck: None cyordereddict: None dask: 0.16.1 distributed: None matplotlib: 2.0.2 cartopy: None seaborn: None setuptools: 38.4.0 pip: 10.0.1 conda: None pytest: 3.4.0 IPython: 6.1.0 sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2196/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
327591169 MDU6SXNzdWUzMjc1OTExNjk= 2197 DataArrayResample.interpolate coordinates out of bound. aurghs 35919497 closed 0     2 2018-05-30T06:33:58Z 2019-01-03T01:18:06Z 2019-01-03T01:18:06Z COLLABORATOR      

Code Sample, a copy-pastable example if possible

```python
import numpy as np
import pandas as pd
import xarray as xr

time = np.arange('2007-02-01', '2007-03-02', dtype='datetime64').astype('datetime64[ns]')
arr = xr.DataArray(
    np.arange(time.size), coords=[time,], dims=('time',), name='data'
)
arr.resample(time='M').interpolate('linear')

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-20-ff65c4d138e7> in <module>()
      7     np.arange(time.size), coords=[time,], dims=('time',), name='data'
      8 )
----> 9 arr.resample(time='M').interpolate('linear')

~/devel/c3s-cns/venv_op/lib/python3.6/site-packages/xarray/core/resample.py in interpolate(self, kind)
    108
    109         """
--> 110         return self._interpolate(kind=kind)
    111
    112     def _interpolate(self, kind='linear'):

~/devel/c3s-cns/venv_op/lib/python3.6/site-packages/xarray/core/resample.py in _interpolate(self, kind)
    218             elif self._dim not in v.dims:
    219                 coords[k] = v
--> 220         return DataArray(f(new_x), coords, dims, name=dummy.name,
    221                          attrs=dummy.attrs)
    222

~/devel/c3s-cns/venv_op/lib/python3.6/site-packages/scipy/interpolate/polyint.py in __call__(self, x)
     77         """
     78         x, x_shape = self._prepare_x(x)
---> 79         y = self._evaluate(x)
     80         return self._finish_y(y, x_shape)
     81

~/devel/c3s-cns/venv_op/lib/python3.6/site-packages/scipy/interpolate/interpolate.py in _evaluate(self, x_new)
    632         y_new = self._call(self, x_new)
    633         if not self._extrapolate:
--> 634             below_bounds, above_bounds = self._check_bounds(x_new)
    635             if len(y_new) > 0:
    636                 # Note fill_value must be broadcast up to the proper size

~/devel/c3s-cns/venv_op/lib/python3.6/site-packages/scipy/interpolate/interpolate.py in _check_bounds(self, x_new)
    664                              "range.")
    665         if self.bounds_error and above_bounds.any():
--> 666             raise ValueError("A value in x_new is above the interpolation "
    667                              "range.")
    668

ValueError: A value in x_new is above the interpolation range.
```

Problem description

It raises an error if I try to interpolate. If the time range is exactly a month, then it works:

```python
time = np.arange('2007-02-01', '2007-03-01', dtype='datetime64').astype('datetime64[ns]')
arr = xr.DataArray(
    np.arange(time.size), coords=[time,], dims=('time',), name='data'
)
arr.resample(time='M').interpolate('linear')

<xarray.DataArray 'data' (time: 1)>
array([27.])
Coordinates:
  * time     (time) datetime64[ns] 2007-02-28
```

The problem for the interpolation seems to be that the resampler contains indices out of bounds ('2007-03-31'). This is fine for the aggregations, but it doesn't work with the interpolation.

```python
resampler = arr.resample(time='M')

resampler._full_index
DatetimeIndex(['2007-02-28', '2007-03-31'], dtype='datetime64[ns]', name='time', freq='M')
```

Expected Output

```python
<xarray.DataArray 'data' (time: 1)>
array([27.])
Coordinates:
  * time     (time) datetime64[ns] 2007-02-28
```

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.6.0.final.0 python-bits: 64 OS: Linux OS-release: 4.13.0-43-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_GB.UTF-8 LOCALE: en_GB.UTF-8 xarray: 0.10.3 pandas: 0.22.0 numpy: 1.14.3 scipy: 1.1.0 netCDF4: 1.3.1 h5netcdf: None h5py: None Nio: None zarr: None bottleneck: None cyordereddict: None dask: 0.17.4 distributed: None matplotlib: 2.2.2 cartopy: 0.16.0 seaborn: None setuptools: 39.2.0 pip: 10.0.1 conda: None pytest: 3.5.1 IPython: 6.4.0 sphinx: 1.7.4
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2197/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
324121124 MDU6SXNzdWUzMjQxMjExMjQ= 2153 Bug: side effect on method GroupBy.first aurghs 35919497 closed 0     1 2018-05-17T17:43:25Z 2018-05-29T03:15:08Z 2018-05-29T03:15:08Z COLLABORATOR      

Code Sample, a copy-pastable example if possible

```python
import numpy as np
import xarray as xr

arr = xr.DataArray(
    np.arange(5),
    dims=('x',),
    coords={'x': ('x', np.array([1, 1, 1, 2, 2]))},
)

gr = arr.groupby('x')
gr.first()

arr

<xarray.DataArray (x: 5)>
array([0, 1, 2, 3, 4])
Coordinates:
  * x    (x) int64 1 2
```

Problem description

A side effect of the GroupBy.first method call is that it replaces the original array coordinates with the grouped ones.

Expected Output

```python
arr

<xarray.DataArray (x: 5)>
array([0, 1, 2, 3, 4])
Coordinates:
  * x    (x) int64 1 1 1 2 2
```

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.6.0.final.0 python-bits: 64 OS: Linux OS-release: 4.13.0-41-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 xarray: 0.10.4 pandas: 0.22.0 numpy: 1.14.3 scipy: 1.1.0 netCDF4: 1.3.1 h5netcdf: None h5py: 2.7.1 Nio: None zarr: None bottleneck: None cyordereddict: None dask: 0.17.4 distributed: 1.21.8 matplotlib: 2.2.2 cartopy: 0.16.0 seaborn: None setuptools: 38.4.1 pip: 10.0.1 conda: None pytest: 3.5.1 IPython: 6.2.1 sphinx: 1.7.4
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2153/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);