
issues


53 rows where comments = 3, repo = 13221727 and user = 1217238 sorted by updated_at descending




type 2

  • issue 33
  • pull 20

state 2

  • closed 49
  • open 4

repo 1

  • xarray · 53
id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
2266174558 I_kwDOAMm_X86HExRe 8975 Xarray sponsorship guidelines shoyer 1217238 open 0     3 2024-04-26T17:05:01Z 2024-04-30T20:52:33Z   MEMBER      

At what level of support should Xarray acknowledge sponsors on our website?

I would like to surface this for open discussion because there are potential sponsoring organizations with conflicts of interest with members of Xarray's leadership team (e.g., Earthmover, which employs @jhamman, @rabernat and @dcherian).

My suggestion is to use NumPy's guidelines, with an adjustment down to 1/3 of the thresholds to account for the smaller size of the project:

  • $10,000/yr for unrestricted financial contributions (e.g., donations)
  • $20,000/yr for financial contributions for a particular purpose (e.g., grants)
  • $30,000/yr for in-kind contributions (e.g., time for employees to contribute)
  • 2 person-months/yr of paid work time for one or more Xarray maintainers or regular contributors to any Xarray team or activity

The NumPy guidelines also include a grace period of a minimum of 6 months for acknowledging support. I would suggest increasing this to a minimum of 1 year for Xarray.

I would greatly appreciate any feedback from members of the community, either in this issue or at the next team meeting.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8975/reactions",
    "total_count": 6,
    "+1": 5,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
267542085 MDU6SXNzdWUyNjc1NDIwODU= 1647 Representing missing values in string arrays on disk shoyer 1217238 closed 0     3 2017-10-23T05:01:10Z 2024-02-06T13:03:40Z 2024-02-06T13:03:40Z MEMBER      

This came up as part of my clean-up of serializing unicode strings in https://github.com/pydata/xarray/pull/1648.

There are two ways to represent strings in netCDF files.

  • As character arrays (NC_CHAR), supported by both netCDF3 and netCDF4
  • As variable length unicode strings (NC_STRING), only supported by netCDF4/HDF5.

Currently, by default (if no _FillValue is set) we replace missing values (NaN) with an empty string when writing data to disk.

For character arrays, we could use the normal _FillValue mechanism to set a fill value and decode when data is read back from disk. In fact, this already works for dtype=bytes (though it isn't documented):

```
In [10]: ds = xr.Dataset({'foo': ('x', np.array([b'bar', np.nan], dtype=object), {}, {'_FillValue': b''})})

In [11]: ds
Out[11]:
<xarray.Dataset>
Dimensions:  (x: 2)
Dimensions without coordinates: x
Data variables:
    foo      (x) object b'bar' nan

In [12]: ds.to_netcdf('foobar.nc')

In [13]: xr.open_dataset('foobar.nc').load()
Out[13]:
<xarray.Dataset>
Dimensions:  (x: 2)
Dimensions without coordinates: x
Data variables:
    foo      (x) object b'bar' nan
```

For variable length strings, it currently isn't possible to set a fill value, so there's no good way to indicate missing values. This may change in the future, depending on the resolution of the netCDF4-python issue.

It would obviously be nice to always automatically round-trip missing values, both for strings and bytes. I see two possible ways to do this:

  1. Require setting an explicit _FillValue when a string contains missing values, raising an error if this isn't done. We need an explicit choice because there aren't any extra unused characters left over, at least for character arrays. (NetCDF explicitly allows arbitrary bytes to be stored in NC_CHAR, even though this maps to an HDF5 fixed-width string with ASCII encoding.) For variable length strings, we could potentially set a non-character unicode symbol like U+FFFF, but again that isn't supported yet.

  2. Treat empty strings as equivalent to a missing value (NaN). This has the advantage of not requiring an explicit choice of _FillValue, so we don't need to wait for any netCDF4 issues to be resolved. However, this does mean that empty strings would not round-trip. Still, given the relative prevalence of missing values vs. empty strings in xarray/pandas, not preserving empty strings is probably the lesser evil.

The default option is to adopt neither of these, and keep the current behavior where missing values are written as empty strings and not decoded at all.

Any opinions? I am leaning towards option (2).
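For concreteness, a minimal sketch of what option (2) could look like at decode time (the helper here is hypothetical, not part of xarray):

```python
import numpy as np

def decode_empty_strings_as_missing(values: np.ndarray) -> np.ndarray:
    # Option (2): treat empty strings read from disk as missing values (NaN).
    decoded = values.astype(object)
    decoded[decoded == ''] = np.nan
    return decoded

print(decode_empty_strings_as_missing(np.array(['bar', ''], dtype=object)))
# ['bar' nan]
```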

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1647/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
197939448 MDU6SXNzdWUxOTc5Mzk0NDg= 1189 Document using a spawning multiprocessing pool for multiprocessing with dask shoyer 1217238 closed 0     3 2016-12-29T01:21:50Z 2023-12-05T21:51:04Z 2023-12-05T21:51:04Z MEMBER      

This is a nice option for working with in-file HDF5/netCDF4 compression: https://github.com/pydata/xarray/pull/1128#issuecomment-261936849

Mixed multi-threading/multi-processing could also be interesting, if anyone wants to revive that: https://github.com/dask/dask/pull/457 (I think it would work now that xarray data stores are pickle-able)
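For reference, a minimal sketch of the pattern with current dask (the `multiprocessing.context` config key is dask's; treat the exact settings as an illustration rather than the recipe from #1128):

```python
import dask
import xarray as xr

# Use a spawned (rather than forked) process pool, which avoids inheriting
# HDF5/netCDF4 library state into the worker processes.
with dask.config.set({'scheduler': 'processes', 'multiprocessing.context': 'spawn'}):
    ds = xr.open_dataset('compressed.nc', chunks={'time': 10})  # hypothetical file
    result = ds.mean().compute()
```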

CC @mrocklin

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1189/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
430188626 MDU6SXNzdWU0MzAxODg2MjY= 2873 Dask distributed tests fail locally shoyer 1217238 closed 0     3 2019-04-07T20:26:53Z 2023-12-05T21:43:02Z 2023-12-05T21:43:02Z MEMBER      

I'm not sure why, but when I run the integration tests with dask-distributed locally (on my MacBook Pro), they fail:

```
$ pytest xarray/tests/test_distributed.py --maxfail 1
================================================ test session starts =================================================
platform darwin -- Python 3.7.2, pytest-4.0.1, py-1.7.0, pluggy-0.8.0
rootdir: /Users/shoyer/dev/xarray, inifile: setup.cfg
plugins: repeat-0.7.0
collected 19 items

xarray/tests/test_distributed.py F

====================================================== FAILURES ======================================================
_________________________ test_dask_distributed_netcdf_roundtrip[netcdf4-NETCDF3_CLASSIC] __________________________

loop = <tornado.platform.asyncio.AsyncIOLoop object at 0x1c182da1d0>
tmp_netcdf_filename = '/private/var/folders/15/qdcz0wqj1t9dg40m_ld0fjkh00b4kd/T/pytest-of-shoyer/pytest-3/test_dask_distributed_netcdf_r0/testfile.nc'
engine = 'netcdf4', nc_format = 'NETCDF3_CLASSIC'

@pytest.mark.parametrize('engine,nc_format', ENGINES_AND_FORMATS)  # noqa
def test_dask_distributed_netcdf_roundtrip(
        loop, tmp_netcdf_filename, engine, nc_format):

    if engine not in ENGINES:
        pytest.skip('engine not available')

    chunks = {'dim1': 4, 'dim2': 3, 'dim3': 6}

    with cluster() as (s, [a, b]):
        with Client(s['address'], loop=loop):

            original = create_test_data().chunk(chunks)

            if engine == 'scipy':
                with pytest.raises(NotImplementedError):
                    original.to_netcdf(tmp_netcdf_filename,
                                       engine=engine, format=nc_format)
                return

            original.to_netcdf(tmp_netcdf_filename,
                               engine=engine, format=nc_format)

            with xr.open_dataset(tmp_netcdf_filename,
                                 chunks=chunks, engine=engine) as restored:
                assert isinstance(restored.var1.data, da.Array)
                computed = restored.compute()
              assert_allclose(original, computed)

xarray/tests/test_distributed.py:87:


../../miniconda3/envs/xarray-py37/lib/python3.7/contextlib.py:119: in __exit__
    next(self.gen)


nworkers = 2, nanny = False, worker_kwargs = {}, active_rpc_timeout = 1, scheduler_kwargs = {}

@contextmanager
def cluster(nworkers=2, nanny=False, worker_kwargs={}, active_rpc_timeout=1,
            scheduler_kwargs={}):
    ...  # trimmed
    start = time()
    while list(ws):
        sleep(0.01)
      assert time() < start + 1, 'Workers still around after one second'

E AssertionError: Workers still around after one second

../../miniconda3/envs/xarray-py37/lib/python3.7/site-packages/distributed/utils_test.py:721: AssertionError
------------------------------------------------ Captured stderr call ------------------------------------------------
distributed.scheduler - INFO - Clear task state
distributed.scheduler - INFO - Scheduler at: tcp://127.0.0.1:51715
distributed.worker - INFO - Start worker at: tcp://127.0.0.1:51718
distributed.worker - INFO - Listening to: tcp://127.0.0.1:51718
distributed.worker - INFO - Waiting to connect to: tcp://127.0.0.1:51715
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Threads: 1
distributed.worker - INFO - Memory: 17.18 GB
distributed.worker - INFO - Local Directory: /Users/shoyer/dev/xarray/_test_worker-5cabd1b7-4d9c-49eb-a79e-205c588f5dae/worker-n8uv72yx
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Start worker at: tcp://127.0.0.1:51720
distributed.worker - INFO - Listening to: tcp://127.0.0.1:51720
distributed.worker - INFO - Waiting to connect to: tcp://127.0.0.1:51715
distributed.scheduler - INFO - Register tcp://127.0.0.1:51718
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Threads: 1
distributed.worker - INFO - Memory: 17.18 GB
distributed.worker - INFO - Local Directory: /Users/shoyer/dev/xarray/_test_worker-71a426d4-bd34-4808-9d33-79cac2bb4801/worker-a70rlf4r
distributed.worker - INFO - -------------------------------------------------
distributed.scheduler - INFO - Starting worker compute stream, tcp://127.0.0.1:51718
distributed.core - INFO - Starting established connection
distributed.worker - INFO - Registered to: tcp://127.0.0.1:51715
distributed.worker - INFO - -------------------------------------------------
distributed.core - INFO - Starting established connection
distributed.scheduler - INFO - Register tcp://127.0.0.1:51720
distributed.scheduler - INFO - Starting worker compute stream, tcp://127.0.0.1:51720
distributed.core - INFO - Starting established connection
distributed.worker - INFO - Registered to: tcp://127.0.0.1:51715
distributed.worker - INFO - -------------------------------------------------
distributed.core - INFO - Starting established connection
distributed.scheduler - INFO - Receive client connection: Client-59a7918c-5972-11e9-912a-8c85907bce57
distributed.core - INFO - Starting established connection
distributed.core - INFO - Event loop was unresponsive in Worker for 1.05s. This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
distributed.scheduler - INFO - Receive client connection: Client-worker-5a5c81de-5972-11e9-9136-8c85907bce57
distributed.core - INFO - Starting established connection
distributed.core - INFO - Event loop was unresponsive in Worker for 1.33s. This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
distributed.scheduler - INFO - Receive client connection: Client-worker-5b2496d8-5972-11e9-9137-8c85907bce57
distributed.core - INFO - Starting established connection
distributed.scheduler - INFO - Remove client Client-59a7918c-5972-11e9-912a-8c85907bce57
distributed.scheduler - INFO - Remove client Client-59a7918c-5972-11e9-912a-8c85907bce57
distributed.scheduler - INFO - Close client connection: Client-59a7918c-5972-11e9-912a-8c85907bce57
distributed.worker - INFO - Stopping worker at tcp://127.0.0.1:51720
distributed.worker - INFO - Stopping worker at tcp://127.0.0.1:51718
distributed.scheduler - INFO - Remove worker tcp://127.0.0.1:51720
distributed.core - INFO - Removing comms to tcp://127.0.0.1:51720
distributed.scheduler - INFO - Remove worker tcp://127.0.0.1:51718
distributed.core - INFO - Removing comms to tcp://127.0.0.1:51718
distributed.scheduler - INFO - Lost all workers
distributed.scheduler - INFO - Remove client Client-worker-5b2496d8-5972-11e9-9137-8c85907bce57
distributed.scheduler - INFO - Remove client Client-worker-5a5c81de-5972-11e9-9136-8c85907bce57
distributed.scheduler - INFO - Close client connection: Client-worker-5b2496d8-5972-11e9-9137-8c85907bce57
distributed.scheduler - INFO - Close client connection: Client-worker-5a5c81de-5972-11e9-9136-8c85907bce57
distributed.scheduler - INFO - Scheduler closing...
distributed.scheduler - INFO - Scheduler closing all comms
```

Version info:

```
In [2]: xarray.show_versions()

INSTALLED VERSIONS
------------------
commit: 2ce0639ee2ba9c7b1503356965f77d847d6cfcdf
python: 3.7.2 (default, Dec 29 2018, 00:00:04) [Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 18.2.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2

xarray: 0.12.1+4.g2ce0639e
pandas: 0.24.0
numpy: 1.15.4
scipy: 1.1.0
netCDF4: 1.4.3.2
pydap: None
h5netcdf: 0.7.0
h5py: 2.9.0
Nio: None
zarr: 2.2.0
cftime: 1.0.3.4
nc_time_axis: None
PseudonetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 1.1.5
distributed: 1.26.1
matplotlib: 3.0.2
cartopy: 0.17.0
seaborn: 0.9.0
setuptools: 40.0.0
pip: 18.0
conda: None
pytest: 4.0.1
IPython: 6.5.0
sphinx: 1.8.2
```

@mrocklin does this sort of error look familiar to you?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2873/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  not_planned xarray 13221727 issue
707647715 MDExOlB1bGxSZXF1ZXN0NDkyMDEzODg4 4453 Simplify and restore old behavior for deep-copies shoyer 1217238 closed 0     3 2020-09-23T20:10:33Z 2023-09-14T03:06:34Z 2023-09-14T03:06:33Z MEMBER   1 pydata/xarray/pulls/4453

Intended to fix https://github.com/pydata/xarray/issues/4449

The goal is to restore behavior to match what we had prior to https://github.com/pydata/xarray/pull/4379 for all types of data other than np.ndarray objects.

Needs tests!

  • [ ] Closes #xxxx
  • [ ] Tests added
  • [ ] Passes isort . && black . && mypy . && flake8
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [ ] New functions/methods are listed in api.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4453/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
588105641 MDU6SXNzdWU1ODgxMDU2NDE= 3893 HTML repr in the online docs shoyer 1217238 open 0     3 2020-03-26T02:17:51Z 2023-09-11T17:41:59Z   MEMBER      

I noticed two minor issues in our online docs, now that we've switched to the hip new HTML repr by default.

  1. Most doc pages still show text, not HTML. I suspect this is a limitation of the IPython sphinx directive we use for our snippets. We might be able to fix that by switching to jupyter-sphinx?

  2. The "attributes" part of the HTML repr in our notebook examples looks a little funny, with strange blue formatting around each attribute name. It looks like part of the outer style of our docs is leaking into the HTML repr:

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3893/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
176805500 MDU6SXNzdWUxNzY4MDU1MDA= 1004 Remove IndexVariable.name shoyer 1217238 open 0     3 2016-09-14T03:27:43Z 2023-03-11T19:57:40Z   MEMBER      

As discussed in #947, we should remove the IndexVariable.name attribute. It should be fine to use an IndexVariable anywhere, regardless of whether or not it labels ticks along a dimension.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1004/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1210267320 I_kwDOAMm_X85IIza4 6505 Dropping a MultiIndex variable raises an error after explicit indexes refactor shoyer 1217238 closed 0     3 2022-04-20T22:07:26Z 2022-07-21T14:46:58Z 2022-07-21T14:46:58Z MEMBER      

What happened?

With the latest released version of Xarray, it is possible to delete all variables corresponding to a MultiIndex by simply deleting the name of the MultiIndex.

After the explicit indexes refactor (i.e., using the "main" development branch), this now raises an error about how this would "corrupt" index state. This comes up when using drop() and assign_coords(), and possibly some other methods.

This is not hard to work around, but we may want to consider this bug a blocker for the next Xarray release. I found the issue surfaced in several projects when attempting to use the new version of Xarray inside Google's codebase.

CC @benbovy in case you have any thoughts to share.

What did you expect to happen?

For now, we should preserve the behavior of deleting the variables corresponding to MultiIndex levels, but should issue a deprecation warning encouraging users to explicitly delete everything.

Minimal Complete Verifiable Example

```Python
import xarray

array = xarray.DataArray(
    [[1, 2], [3, 4]],
    dims=['x', 'y'],
    coords={'x': ['a', 'b']},
)
stacked = array.stack(z=['x', 'y'])
print(stacked.drop('z'))
print()
print(stacked.assign_coords(z=[1, 2, 3, 4]))
```

Relevant log output

```Python
ValueError                                Traceback (most recent call last)
Input In [1], in <cell line: 9>()
      3 array = xarray.DataArray(
      4     [[1, 2], [3, 4]],
      5     dims=['x', 'y'],
      6     coords={'x': ['a', 'b']},
      7 )
      8 stacked = array.stack(z=['x', 'y'])
----> 9 print(stacked.drop('z'))
     10 print()
     11 print(stacked.assign_coords(z=[1, 2, 3, 4]))

File ~/dev/xarray/xarray/core/dataarray.py:2425, in DataArray.drop(self, labels, dim, errors, **labels_kwargs)
   2408 def drop(
   2409     self,
   2410     labels: Mapping = None,
   (...)
   2414     **labels_kwargs,
   2415 ) -> DataArray:
   2416     """Backward compatible method based on drop_vars and drop_sel
   2417
   2418     Using either drop_vars or drop_sel is encouraged
   (...)
   2423     DataArray.drop_sel
   2424     """
-> 2425     ds = self._to_temp_dataset().drop(labels, dim, errors=errors)
   2426     return self._from_temp_dataset(ds)

File ~/dev/xarray/xarray/core/dataset.py:4590, in Dataset.drop(self, labels, dim, errors, **labels_kwargs)
   4584 if dim is None and (is_scalar(labels) or isinstance(labels, Iterable)):
   4585     warnings.warn(
   4586         "dropping variables using drop will be deprecated; using drop_vars is encouraged.",
   4587         PendingDeprecationWarning,
   4588         stacklevel=2,
   4589     )
-> 4590     return self.drop_vars(labels, errors=errors)
   4591 if dim is not None:
   4592     warnings.warn(
   4593         "dropping labels using list-like labels is deprecated; using "
   4594         "dict-like arguments with drop_sel, e.g. `ds.drop_sel(dim=[labels]).",
   4595         DeprecationWarning,
   4596         stacklevel=2,
   4597     )

File ~/dev/xarray/xarray/core/dataset.py:4549, in Dataset.drop_vars(self, names, errors)
   4546 if errors == "raise":
   4547     self._assert_all_in_dataset(names)
-> 4549 assert_no_index_corrupted(self.xindexes, names)
   4551 variables = {k: v for k, v in self._variables.items() if k not in names}
   4552 coord_names = {k for k in self._coord_names if k in variables}

File ~/dev/xarray/xarray/core/indexes.py:1394, in assert_no_index_corrupted(indexes, coord_names)
   1392 common_names_str = ", ".join(f"{k!r}" for k in common_names)
   1393 index_names_str = ", ".join(f"{k!r}" for k in index_coords)
-> 1394 raise ValueError(
   1395     f"cannot remove coordinate(s) {common_names_str}, which would corrupt "
   1396     f"the following index built from coordinates {index_names_str}:\n"
   1397     f"{index}"
   1398 )

ValueError: cannot remove coordinate(s) 'z', which would corrupt the following index built from coordinates 'z', 'x', 'y':
<xarray.core.indexes.PandasMultiIndex object at 0x148c95150>
```

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS
------------------
commit: 33cdabd261b5725ac357c2823bd0f33684d3a954
python: 3.10.4 | packaged by conda-forge | (main, Mar 24 2022, 17:42:03) [Clang 12.0.1 ]
python-bits: 64
OS: Darwin
OS-release: 21.4.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1

xarray: 0.18.3.dev137+g96c56836
pandas: 1.4.2
numpy: 1.22.3
scipy: 1.8.0
netCDF4: 1.5.8
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.11.3
cftime: 1.6.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2022.04.1
distributed: 2022.4.1
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: 2022.3.0
cupy: None
pint: None
sparse: None
setuptools: 62.1.0
pip: 22.0.4
conda: None
pytest: 7.1.1
IPython: 8.2.0
sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6505/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
327166000 MDExOlB1bGxSZXF1ZXN0MTkxMDMwMjA4 2195 WIP: explicit indexes shoyer 1217238 closed 0     3 2018-05-29T04:25:15Z 2022-03-21T14:59:52Z 2022-03-21T14:59:52Z MEMBER   0 pydata/xarray/pulls/2195

Some utility functions that should be useful for https://github.com/pydata/xarray/issues/1603

Still very much a work in progress -- it would be great if someone has time to finish writing any of these in another PR!

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2195/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
891281614 MDU6SXNzdWU4OTEyODE2MTQ= 5302 Suggesting specific IO backends to install when open_dataset() fails shoyer 1217238 closed 0     3 2021-05-13T18:45:28Z 2021-06-23T08:18:07Z 2021-06-23T08:18:07Z MEMBER      

Currently, Xarray's internal backends don't get registered unless the necessary dependencies are installed: https://github.com/pydata/xarray/blob/1305d9b624723b86050ca5b2d854e5326bbaa8e6/xarray/backends/netCDF4_.py#L567-L568

In order to facilitate suggesting a specific backend to install (e.g., to improve error messages from opening tutorial datasets https://github.com/pydata/xarray/issues/5291), I would suggest that Xarray always registers its own backend entrypoints. Then we make the following changes to the plugin protocol:

  • guess_can_open() should work regardless of whether the underlying backend is installed
  • installed() returns a boolean reporting whether the backend is installed. The default method in the base class would return True, for backwards compatibility.
  • open_dataset() of course should error if the backend is not installed.

This will let us leverage the existing guess_can_open() functionality to suggest specific optional dependencies to install. E.g., if you supply a netCDF3 file:

    Xarray cannot find a matching installed backend for this file in the installed backends ["h5netcdf"]. Consider installing one of the following backends which reports a match: ["scipy", "netcdf4"]
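For concreteness, a rough sketch of what the protocol change might look like (xarray.backends.BackendEntrypoint is real; the installed() hook is the proposal here, and guess_can_open's body is purely illustrative):

```python
from xarray.backends import BackendEntrypoint

class ScipyBackendEntrypoint(BackendEntrypoint):
    def guess_can_open(self, filename_or_obj):
        # Must work even when scipy is not installed: only inspect the file.
        return str(filename_or_obj).endswith('.nc')

    def installed(self) -> bool:
        # Proposed new hook; the base class would return True for
        # backwards compatibility.
        try:
            import scipy.io  # noqa: F401
            return True
        except ImportError:
            return False

    def open_dataset(self, filename_or_obj, *, drop_variables=None):
        import scipy.io  # errors here if the backend is not installed
        raise NotImplementedError("illustrative sketch only")
```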

Does this seem reasonable and worthwhile?

CC @aurghs @alexamici

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5302/reactions",
    "total_count": 4,
    "+1": 4,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
291405750 MDU6SXNzdWUyOTE0MDU3NTA= 1855 swap_dims should support dimension names that are not existing variables shoyer 1217238 closed 0     3 2018-01-25T00:08:26Z 2020-01-08T18:27:29Z 2020-01-08T18:27:29Z MEMBER      

Code Sample, a copy-pastable example if possible

```python
input_ds = xarray.Dataset({'foo': ('x', [1, 2])}, {'x': [0, 1]})
input_ds.swap_dims({'x': 'z'})
```

Problem description

Currently this results in the error KeyError: 'z'

Expected Output

We now support dimensions without associated coordinate variables. So swap_dims() should be able to create new dimensions (e.g., z in this example) even if there isn't already a coordinate variable.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1855/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
494943997 MDExOlB1bGxSZXF1ZXN0MzE4NTk1NDE3 3316 Clarify that "scatter" is a plotting method in what's new. shoyer 1217238 closed 0     3 2019-09-18T02:02:22Z 2019-09-18T03:47:46Z 2019-09-18T03:46:35Z MEMBER   0 pydata/xarray/pulls/3316

When I read this, I thought it was referring to scattering data somehow :).

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3316/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
464793626 MDU6SXNzdWU0NjQ3OTM2MjY= 3083 test_rasterio_vrt_network is failing in continuous integration tests shoyer 1217238 closed 0     3 2019-07-05T23:13:25Z 2019-07-31T00:28:46Z 2019-07-31T00:28:46Z MEMBER      

```
@network
def test_rasterio_vrt_network(self):
    import rasterio

    url = 'https://storage.googleapis.com/\
    gcp-public-data-landsat/LC08/01/047/027/\
    LC08_L1TP_047027_20130421_20170310_01_T1/\
    LC08_L1TP_047027_20130421_20170310_01_T1_B4.TIF'
    env = rasterio.Env(GDAL_DISABLE_READDIR_ON_OPEN='EMPTY_DIR',
                       CPL_VSIL_CURL_USE_HEAD=False,
                       CPL_VSIL_CURL_ALLOWED_EXTENSIONS='TIF')
    with env:
      with rasterio.open(url) as src:

xarray/tests/test_backends.py:3734:


/usr/share/miniconda/envs/test_env/lib/python3.6/site-packages/rasterio/env.py:430: in wrapper
    return f(*args, **kwds)
/usr/share/miniconda/envs/test_env/lib/python3.6/site-packages/rasterio/__init__.py:216: in open
    s = DatasetReader(path, driver=driver, sharing=sharing, **kwargs)


???
E   rasterio.errors.RasterioIOError: HTTP response code: 400 - Failed writing header
```

https://dev.azure.com/xarray/xarray/_build/results?buildId=150&view=ms.vss-test-web.build-test-results-tab&runId=2358&resultId=101228&paneView=debug

I'm not sure what's going on here -- the tiff file is still available at the given URL.

@scottyhq any idea?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3083/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
456963929 MDExOlB1bGxSZXF1ZXN0Mjg4ODcwMDQ0 3027 Ensure explicitly indexed arrays are preserved shoyer 1217238 closed 0     3 2019-06-17T14:21:18Z 2019-06-23T16:53:11Z 2019-06-23T16:49:23Z MEMBER   0 pydata/xarray/pulls/3027

Fixes https://github.com/pydata/xarray/issues/3009

Previously, indexing an ImplicitToExplicitIndexingAdapter object could directly return an ExplicitlyIndexed object, i.e., x[index] could produce an object that could not be indexed normally again. This resulted in broken behavior with dask's new _meta attribute.

I'm pretty sure this fix is appropriate, but it does introduce two failing tests with xarray on dask master. In particular, there are now errors raised inside two tests from dask's blockwise_meta helper function:

```
>       return meta.astype(dtype)
E       AttributeError: 'ImplicitToExplicitIndexingAdapter' object has no attribute 'astype'
```

cc @mrocklin @pentschev

  • [x] Tests added
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3027/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
325436508 MDU6SXNzdWUzMjU0MzY1MDg= 2170 keepdims=True for xarray reductions shoyer 1217238 closed 0     3 2018-05-22T19:44:17Z 2019-06-23T09:18:33Z 2019-06-23T09:18:33Z MEMBER      

For operations where arrays are aggregated but then combined, the keepdims=True option for NumPy aggregations is convenient.

We should consider supporting this in xarray as well. Aggregating a DataArray/Dataset with keepdims=True (or maybe keep_dims=True) would remove all original coordinates along aggregated dimensions and return a result with a dimension of size 1 without any coordinates, e.g.,

```
>>> array = xr.DataArray([1, 2, 3], dims='x', coords={'x': ['a', 'b', 'c']})
>>> array.mean(keepdims=True)
<xarray.DataArray (x: 1)>
array([2.])
Dimensions without coordinates: x
```

In this case, array.mean(keepdims=True) is equivalent to array.mean().expand_dims('x'), but in general this equivalence does not hold, because the location of the original dimension is lost.

Implementation-wise, we have two options:

  1. Pass on keepdims=True to NumPy functions like numpy.mean(), or
  2. Implement keepdims=True ourselves, in Variable.reduce().

I think I like option 2 a little better, because it places fewer requirements on aggregation functions. For example, functions like bottleneck.nanmean() don't accept a keepdims argument.
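A minimal sketch of what option 2 amounts to (reduce_keepdims is a hypothetical helper, not xarray API):

```python
import numpy as np

def reduce_keepdims(data: np.ndarray, func, axis: int) -> np.ndarray:
    # Apply an aggregation that need not support keepdims itself...
    result = func(data, axis=axis)
    # ...then restore the reduced axis with size 1.
    return np.expand_dims(result, axis)

x = np.arange(6).reshape(2, 3)
print(reduce_keepdims(x, np.mean, axis=1).shape)  # (2, 1)
```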

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2170/reactions",
    "total_count": 10,
    "+1": 9,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 1
}
  completed xarray 13221727 issue
313040371 MDU6SXNzdWUzMTMwNDAzNzE= 2050 test_cross_engine_read_write_netcdf3 is now failing on master shoyer 1217238 closed 0     3 2018-04-10T18:31:58Z 2019-02-04T04:42:17Z 2019-02-04T04:42:17Z MEMBER      

Only on Python 3.5 and 3.6 for now:

```
=================================== FAILURES ===================================
_________ GenericNetCDFDataTest.test_cross_engine_read_write_netcdf3 __________
self = <xarray.tests.test_backends.GenericNetCDFDataTest testMethod=test_cross_engine_read_write_netcdf3>

def test_cross_engine_read_write_netcdf3(self):
    data = create_test_data()
    valid_engines = set()
    if has_netCDF4:
        valid_engines.add('netcdf4')
    if has_scipy:
        valid_engines.add('scipy')

    for write_engine in valid_engines:
        for format in ['NETCDF3_CLASSIC', 'NETCDF3_64BIT']:
            with create_tmp_file() as tmp_file:
                data.to_netcdf(tmp_file, format=format,
                               engine=write_engine)
                for read_engine in valid_engines:
                    with open_dataset(tmp_file,
                                    engine=read_engine) as actual:

xarray/tests/test_backends.py:1596:


xarray/backends/api.py:299: in open_dataset
    autoclose=autoclose)
xarray/backends/netCDF4_.py:280: in open
    ds = opener()
xarray/backends/netCDF4_.py:204: in _open_netcdf4_group
    ds = nc4.Dataset(filename, mode=mode, **kwargs)
netCDF4/_netCDF4.pyx:2015: in netCDF4._netCDF4.Dataset.__init__
    ???


???
E   OSError: [Errno -36] NetCDF: Invalid argument: b'/tmp/tmpu5no_wbf/temp-1157.nc'

netCDF4/_netCDF4.pyx:1636: OSError
___ GenericNetCDFDataTestAutocloseTrue.test_cross_engine_read_write_netcdf3 ____
self = <xarray.tests.test_backends.GenericNetCDFDataTestAutocloseTrue testMethod=test_cross_engine_read_write_netcdf3>

def test_cross_engine_read_write_netcdf3(self):
    data = create_test_data()
    valid_engines = set()
    if has_netCDF4:
        valid_engines.add('netcdf4')
    if has_scipy:
        valid_engines.add('scipy')

    for write_engine in valid_engines:
        for format in ['NETCDF3_CLASSIC', 'NETCDF3_64BIT']:
            with create_tmp_file() as tmp_file:
                data.to_netcdf(tmp_file, format=format,
                               engine=write_engine)
                for read_engine in valid_engines:
                    with open_dataset(tmp_file,
                                    engine=read_engine) as actual:

xarray/tests/test_backends.py:1596:


xarray/backends/api.py:299: in open_dataset
    autoclose=autoclose)
xarray/backends/netCDF4_.py:280: in open
    ds = opener()
xarray/backends/netCDF4_.py:204: in _open_netcdf4_group
    ds = nc4.Dataset(filename, mode=mode, **kwargs)
netCDF4/_netCDF4.pyx:2015: in netCDF4._netCDF4.Dataset.__init__
    ???


???
E   OSError: [Errno -36] NetCDF: Invalid argument: b'/tmp/tmp9ak1v4wj/temp-1238.nc'

netCDF4/_netCDF4.pyx:1636: OSError
```

Here's the diff of conda packages:

```diff
--- broken.txt  2018-04-10 11:22:39.400835307 -0700
+++ works.txt   2018-04-10 11:23:12.840755416 -0700
@@ -9,2 +9,2 @@
-boto3 1.7.2 py_0 conda-forge
-botocore 1.10.2 py_0 conda-forge
+boto3 1.7.0 py_0 conda-forge
+botocore 1.10.1 py_0 conda-forge
@@ -23 +23 @@
-curl 7.59.0 1 conda-forge
+curl 7.59.0 0 conda-forge
@@ -29 +29 @@
-distributed 1.21.6 py36_0 conda-forge
+distributed 1.21.5 py36_0 conda-forge
@@ -62 +62 @@
-libgdal 2.2.4 1 conda-forge
+libgdal 2.2.4 0 conda-forge
@@ -66 +66 @@
-libnetcdf 4.5.0 3 conda-forge
+libnetcdf 4.4.1.1 10 conda-forge
@@ -83 +83 @@
-netcdf4 1.3.1 py36_2 conda-forge
+netcdf4 1.3.1 py36_1 conda-forge
@@ -85 +85 @@
-numcodecs 0.5.5 py36_0 conda-forge
+numcodecs 0.5.4 py36_0 conda-forge
@@ -131 +131 @@
-tornado 5.0.2 py36_0 conda-forge
+tornado 5.0.1 py36_1 conda-forge
```

The culprit is almost certainly libnetcdf 4.4.1.1 -> 4.5.0

It looks like it's basically this issue again: https://github.com/Unidata/netcdf-c/issues/657

We could fix this either by skipping the tests in xarray's CI or upgrading netCDF-C on conda forge to 4.6.0 or 4.6.1.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2050/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
35633124 MDU6SXNzdWUzNTYzMzEyNA== 155 Expose a public interface for CF encoding/decoding functions shoyer 1217238 open 0     3 2014-06-12T23:33:42Z 2019-02-04T04:17:40Z   MEMBER      

Relevant discussion: #153

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/155/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
396989505 MDExOlB1bGxSZXF1ZXN0MjQzMDM5NzQ4 2661 Remove broken Travis-CI builds shoyer 1217238 closed 0     3 2019-01-08T16:40:24Z 2019-01-08T18:34:04Z 2019-01-08T18:34:00Z MEMBER   0 pydata/xarray/pulls/2661

Remove the optional condaforge-rc, netcdf4-dev and pynio-dev builds. These have been continuously failing (due to broken installs), so we shouldn't waste time/energy running them.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2661/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
393903950 MDU6SXNzdWUzOTM5MDM5NTA= 2631 Last call for v0.11.1 shoyer 1217238 closed 0     3 2018-12-24T16:01:22Z 2018-12-31T16:07:49Z 2018-12-31T16:07:48Z MEMBER      

@pydata/xarray I'm going to issue v0.11.1 in a day or two, unless there's anything else we really want to squeeze in. This is the last release with planned Python 2.7 support (but we could conceivably still do backports for nasty bugs).

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2631/reactions",
    "total_count": 4,
    "+1": 4,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
346313546 MDU6SXNzdWUzNDYzMTM1NDY= 2332 Test failures on master with DataArray.to_cdms2 shoyer 1217238 closed 0     3 2018-07-31T18:49:21Z 2018-09-05T15:18:45Z 2018-09-05T15:18:45Z MEMBER      

See https://travis-ci.org/pydata/xarray/jobs/410459646

Example failure:

```
=================================== FAILURES ===================================
______________ TestDataArray.test_to_and_from_cdms2_classic _______________
self = <xarray.tests.test_dataarray.TestDataArray testMethod=test_to_and_from_cdms2_classic>

def test_to_and_from_cdms2_classic(self):
    """Classic with 1D axes"""
    pytest.importorskip('cdms2')

    original = DataArray(
        np.arange(6).reshape(2, 3),
        [('distance', [-2, 2], {'units': 'meters'}),
         ('time', pd.date_range('2000-01-01', periods=3))],
        name='foo', attrs={'baz': 123})
    expected_coords = [IndexVariable('distance', [-2, 2]),
                       IndexVariable('time', [0, 1, 2])]
    actual = original.to_cdms2()
  assert_array_equal(actual, original)

E   ValueError:
E   error during assertion:
E
E   Traceback (most recent call last):
E     File "/home/travis/miniconda/envs/test_env/lib/python2.7/site-packages/numpy/testing/_private/utils.py", line 752, in assert_array_compare
E       x, y = x[~flagged], y[~flagged]
E     File "/home/travis/miniconda/envs/test_env/lib/python2.7/site-packages/cdms2/avariable.py", line 1177, in __getitem__
E       speclist = self._process_specs([key], {})
E     File "/home/travis/miniconda/envs/test_env/lib/python2.7/site-packages/cdms2/avariable.py", line 938, in _process_specs
E       if Ellipsis in specs:
E   ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
E
E
E   Arrays are not equal
E    x: TransientVariable([[0, 1, 2],
E           [3, 4, 5]])
E    y: array([[0, 1, 2],
E          [3, 4, 5]])
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2332/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
324286237 MDExOlB1bGxSZXF1ZXN0MTg4OTMyODgx 2158 Fix dtype=S1 encoding in to_netcdf() shoyer 1217238 closed 0     3 2018-05-18T06:30:55Z 2018-06-01T01:09:45Z 2018-06-01T01:09:38Z MEMBER   0 pydata/xarray/pulls/2158
  • [x] Closes #2149 (remove if there is no corresponding issue, which should only be the case for minor changes)
  • [x] Tests added (for all bug fixes or enhancements)
  • [x] Tests passed (for all non-documentation changes)
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API (remove if this change should not be visible to users, e.g., if it is an internal clean-up, or if this is part of a larger project that will be documented later)

@crusaderky please take a look. Testing here is not as thorough as in https://github.com/pydata/xarray/pull/2150 yet, but it does include a regression test.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2158/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
36238963 MDU6SXNzdWUzNjIzODk2Mw== 170 Add DataArray.insert_dim shoyer 1217238 closed 0     3 2014-06-22T07:06:08Z 2018-04-26T17:18:01Z 2018-04-26T17:18:01Z MEMBER      

Signature: something like array = array.insert_dim(name, coord, axis=-1)

If index has size > 1, tile the array values along the new dimension, possibly using Variable.expand_dims to avoid copies.
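The method was never added under this name, but here is a sketch of the same semantics using expand_dims (which did eventually land):

```python
import xarray as xr

arr = xr.DataArray([1.0, 2.0], dims='x')
# Insert a new dimension 'z' of size 2, tiling the values along it.
expanded = arr.expand_dims({'z': [10, 20]}, axis=-1)
print(expanded.dims)   # ('x', 'z')
print(expanded.shape)  # (2, 2)
```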

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/170/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
303725606 MDU6SXNzdWUzMDM3MjU2MDY= 1975 0.10.2 release shoyer 1217238 closed 0     3 2018-03-09T05:04:44Z 2018-03-15T00:06:40Z 2018-03-15T00:06:40Z MEMBER      

In the spirit of our goal for a more rapid release (https://github.com/pydata/xarray/issues/1821), let's aim to issue the 0.10.2 release in the next few days, ideally after the following PRs are merged (all of which are nearly ready):

  • [x] fix distributed writes #1793
  • [x] einsum for xarray #1968
  • [x] Support array_ufunc for xarray objects. #1962

CC @pydata/xarray

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1975/reactions",
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
300499762 MDExOlB1bGxSZXF1ZXN0MTcxNTY4MjAw 1945 Add seaborn import to toy weather data example. shoyer 1217238 closed 0     3 2018-02-27T05:37:09Z 2018-02-27T19:12:53Z 2018-02-27T19:12:53Z MEMBER   0 pydata/xarray/pulls/1945

It looks like this got inadvertently removed with the flake8 fix in #1925.

  • [x] Closes #1944
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1945/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
278335633 MDExOlB1bGxSZXF1ZXN0MTU1NzY3NDc2 1752 Refactor xarray.conventions into VariableCoder shoyer 1217238 closed 0     3 2017-12-01T02:15:01Z 2017-12-23T13:38:41Z 2017-12-14T17:43:04Z MEMBER   0 pydata/xarray/pulls/1752

Building off of discussion in #1087, I would like to propose refactoring xarray.conventions to use an interface based on VariableCoder objects with encode() and decode() methods.

The idea is to make it easier to write new backends, by making decoding variables according to CF conventions as simple as calling decode() on each coder in a list of coders, with encoding defined by calling the same list of encoders in the opposite order.
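Roughly, the interface being proposed looks like this (a sketch; xarray's eventual xarray.coding.variables.VariableCoder has this shape, while the driver function here is hypothetical):

```python
class VariableCoder:
    """Transforms a Variable between its decoded (in-memory) and
    encoded (on-disk) forms."""

    def encode(self, variable, name=None):
        raise NotImplementedError()

    def decode(self, variable, name=None):
        raise NotImplementedError()


def apply_decoders(variable, coders):
    # Decoding calls decode() on each coder in turn; encoding would walk
    # the same list in reverse, calling encode().
    for coder in coders:
        variable = coder.decode(variable)
    return variable
```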

As a proof of concept, here I implement only a single Coder. In addition to making use of xarray's existing lazy indexing behavior, I have written it so that dask arrays are decoded using dask (which would solve #1372)

Eventually, we should port all the coders in xarray.conventions to this new format. This is probably best saved for future PRs -- help would be appreciated!

  • [x] Tests added / passed
  • [x] Passes git diff upstream/master **/*py | flake8 --diff
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1752/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
267339197 MDExOlB1bGxSZXF1ZXN0MTQ3OTIwOTE1 1643 Deprecate Dataset.T as an alias for Dataset.transpose() shoyer 1217238 closed 0     3 2017-10-21T01:04:33Z 2017-11-02T19:01:59Z 2017-10-22T01:04:17Z MEMBER   0 pydata/xarray/pulls/1643
  • [x] Closes #1232
  • [x] Tests added / passed
  • [x] Passes git diff upstream/master | flake8 --diff
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1643/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
233708091 MDU6SXNzdWUyMzM3MDgwOTE= 1441 v0.9.6 release shoyer 1217238 closed 0     3 2017-06-05T20:55:18Z 2017-06-09T16:43:59Z 2017-06-09T15:57:09Z MEMBER      

I plan to issue this in within the next few days, after merging #1260 (Rasterio support) and #1439 (pydap fix). Let me know if there's anything else critical to get in.

CC @pydata/xarray

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1441/reactions",
    "total_count": 7,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 4,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
227512378 MDU6SXNzdWUyMjc1MTIzNzg= 1401 Verify xarray works with bottleneck 1.2 shoyer 1217238 closed 0     3 2017-05-09T22:06:36Z 2017-05-10T23:10:58Z 2017-05-10T23:10:58Z MEMBER      

This is somewhat time sensitive: https://github.com/kwgoodman/bottleneck/issues/168

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1401/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
217347106 MDU6SXNzdWUyMTczNDcxMDY= 1331 Convert an existing xarray dimension into a MultiIndex shoyer 1217238 closed 0     3 2017-03-27T19:17:22Z 2017-03-28T18:11:02Z 2017-03-28T18:11:02Z MEMBER      

Suppose I have two xarray Datasets, each defined along the 'x' axis:

```python
ds1 = xarray.Dataset({'foo': (('x',), [1, 2, 3])}, {'x': [1, 2, 3], 'y': 'a'})
ds2 = xarray.Dataset({'foo': (('x',), [4, 5, 6])}, {'x': [1, 2, 3], 'y': 'b'})
```

```
<xarray.Dataset>
Dimensions:  (x: 3)
Coordinates:
    y        |S1 'a'
  * x        (x) int64 1 2 3
Data variables:
    foo      (x) int64 1 2 3
```

Now I want to stack them along a new MultiIndex 'yx' that consists of y and x levels:

```python
desired = xarray.Dataset(
    {'foo': (('yx',), [1, 2, 3, 4, 5, 6])},
    {'yx': pandas.MultiIndex.from_product([['a', 'b'], [1, 2, 3]], names=['y', 'x'])})
```

```
<xarray.Dataset>
Dimensions:  (yx: 6)
Coordinates:
  * yx       (yx) MultiIndex
  - y        (yx) object 'a' 'a' 'a' 'b' 'b' 'b'
  - x        (yx) int64 1 2 3 1 2 3
Data variables:
    foo      (yx) int64 1 2 3 4 5 6
```

How can this be achieved with the minimum effort? What is the missing utility function that we need? I attempted to use set_index and swap_dims but so far have been unsuccessful.
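One candidate answer, sketched with concat plus stack (assuming ds1/ds2 from above; whether this is the minimal-effort utility the issue asks for is exactly the open question):

```python
combined = xarray.concat([ds1, ds2], dim='y')  # promotes the scalar 'y' coord to a dimension
stacked = combined.stack(yx=['y', 'x'])        # builds the ('y', 'x') MultiIndex
```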

@benbovy any ideas? I think something similar may have come up when we were discussing your set_index PR.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1331/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
213265588 MDExOlB1bGxSZXF1ZXN0MTEwMDc1Mjk0 1305 Clarify licenses for bundled code shoyer 1217238 closed 0     3 2017-03-10T07:30:29Z 2017-03-11T23:28:38Z 2017-03-11T23:28:38Z MEMBER   0 pydata/xarray/pulls/1305

They are all now called out explicitly in the README as well.

  • [x] closes #1254
  • [x] passes git diff upstream/master | flake8 --diff
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1305/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
149678642 MDU6SXNzdWUxNDk2Nzg2NDI= 835 Merging variables with overlapping but not conflicting values shoyer 1217238 closed 0     3 2016-04-20T07:05:43Z 2017-01-23T22:41:31Z 2017-01-23T22:41:31Z MEMBER      

It should be possible to safely merge together variables with values [NaN, 1, 2] and [0, 1, NaN] by using methods such as combine_first (which should also be OK with conflicting values, like the pandas method) and merge (which should raise if values conflict).

See this stackoverflow post for a merge example: http://stackoverflow.com/questions/36731870/how-to-merge-xarray-datasets-with-conflicting-coordinates
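For reference, the combine_first half of the request, sketched with the method xarray later gained:

```python
import numpy as np
import xarray as xr

a = xr.DataArray([np.nan, 1, 2], dims='x')
b = xr.DataArray([0, 1, np.nan], dims='x')
print(a.combine_first(b).values)  # [0. 1. 2.] -- overlapping but not conflicting
```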

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/835/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
44594982 MDU6SXNzdWU0NDU5NDk4Mg== 242 Add a "drop" option to squeeze shoyer 1217238 closed 0     3 2014-10-01T17:54:50Z 2016-12-16T03:43:58Z 2016-12-16T03:27:11Z MEMBER      

If True, squeezed dimensions should be dropped from the resulting object (instead of being retained as scalar)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/242/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
193467825 MDExOlB1bGxSZXF1ZXN0OTY1MDgyNjU= 1153 Add drop=True argument to isel, sel and squeeze shoyer 1217238 closed 0     3 2016-12-05T11:02:14Z 2016-12-16T03:27:11Z 2016-12-16T03:27:11Z MEMBER   0 pydata/xarray/pulls/1153

Fixes #242

This is useful for getting rid of extraneous scalar variables that arise from indexing, and in particular will resolve an issue for optional indexes: https://github.com/pydata/xarray/pull/1017#issuecomment-260777664

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1153/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
169271691 MDU6SXNzdWUxNjkyNzE2OTE= 938 Update examples to use xarray.tutorial.load_dataset() shoyer 1217238 closed 0     3 2016-08-04T01:33:59Z 2016-08-27T19:08:19Z 2016-08-27T19:08:19Z MEMBER      

This is cleaner than requiring users to separately download data, and it already works for everything in the xarray-data repository!

ds = xr.tutorial.load_dataset('RASM_example_data')

We might want to rename the file to simply "rasm" to keep things shorter.

CC @rabernat @jhamman

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/938/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
171800968 MDU6SXNzdWUxNzE4MDA5Njg= 973 Release v0.8.2 shoyer 1217238 closed 0     3 2016-08-18T01:54:52Z 2016-08-20T03:23:36Z 2016-08-20T02:05:38Z MEMBER      

Once we merge #972, I'd like to release v0.8.2.

It fixes several bugs likely to impact users and is almost completely backwards compatible (except for now automatically aligning in broadcast when we previously raised an error).

CC @jhamman in case he has time to try doing the release process sometime in the next few days.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/973/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
170259709 MDU6SXNzdWUxNzAyNTk3MDk= 956 Skip identical indexes with non-unique values in align? shoyer 1217238 closed 0     3 2016-08-09T20:14:38Z 2016-08-19T01:19:47Z 2016-08-19T01:19:47Z MEMBER      

Currently, when objects with non-unique (duplicated) values in one of their indexes are passed to align, an error surfaces from pandas: InvalidIndexError: Reindexing only valid with uniquely valued Index objects

We could certainly give a more informative error here (see this complaint on StackOverflow), but a bigger issue is that this probably isn't strictly necessary. Instead, we could skip indexes for alignment if they are already equal. This is slightly less principled (a non-unique index may indicate something has gone wrong), but certainly more convenient and more in line with how pandas works (e.g., pandas even allows arithmetic between objects with non-unique indexes, which I believe does not currently work in xarray).

Currently, we do this as a special case when merging arrays and exactly one has labels (see _align_for_merge in https://github.com/pydata/xarray/pull/950). But we could probably do this in general, either by default or with a flag to enable it (or turn it off). This would then propagate to every xarray operation that uses align under the covers.
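A sketch of the proposed shortcut (a hypothetical helper; Index.equals is pandas API):

```python
def _maybe_skip_realign(indexes):
    # If every object already carries an identical index, skip reindexing
    # entirely -- this sidesteps pandas' uniqueness requirement.
    first = indexes[0]
    if all(first.equals(other) for other in indexes[1:]):
        return first
    return None  # fall through to normal (unique-index) alignment
```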

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/956/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
168936861 MDU6SXNzdWUxNjg5MzY4NjE= 936 PyNIO backend doesn't play well with open_mfdataset shoyer 1217238 closed 0     3 2016-08-02T17:01:53Z 2016-08-14T20:02:17Z 2016-08-14T20:02:17Z MEMBER      

As reported on StackOverflow: http://stackoverflow.com/questions/38711915/segmentation-fault-writing-xarray-datset-to-netcdf-or-dataframe/

It appears that we can only open a single file at a time with pynio?

Adding a thread lock via lock=True didn't solve the issue.

cc @david-ian-brown

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/936/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
124566831 MDExOlB1bGxSZXF1ZXN0NTQ4OTgwNjY= 695 Build docs on RTD using conda shoyer 1217238 closed 0     3 2016-01-01T23:23:01Z 2016-01-02T01:31:20Z 2016-01-02T01:31:17Z MEMBER   0 pydata/xarray/pulls/695

It works!

To preview, see http://xray.readthedocs.org/en/rtd-conda/

Fixes #602

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/695/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
120508785 MDExOlB1bGxSZXF1ZXN0NTI3MzYwOTc= 670 Add broadcast function to the API shoyer 1217238 closed 0     3 2015-12-04T23:41:56Z 2016-01-01T22:13:17Z 2016-01-01T22:13:05Z MEMBER   0 pydata/xarray/pulls/670

This is a renaming and update of the existing xray.broadcast_arrays function, which now works properly in the light of #648.

xref #649 cc @rabernat

Examples

Broadcast two data arrays against one another to fill out their dimensions:

```
>>> a = xray.DataArray([1, 2, 3], dims='x')
>>> b = xray.DataArray([5, 6], dims='y')
>>> a
<xray.DataArray (x: 3)>
array([1, 2, 3])
Coordinates:
  * x        (x) int64 0 1 2
>>> b
<xray.DataArray (y: 2)>
array([5, 6])
Coordinates:
  * y        (y) int64 0 1
>>> a2, b2 = xray.broadcast(a, b)
>>> a2
<xray.DataArray (x: 3, y: 2)>
array([[1, 1],
       [2, 2],
       [3, 3]])
Coordinates:
  * x        (x) int64 0 1 2
  * y        (y) int64 0 1
>>> b2
<xray.DataArray (x: 3, y: 2)>
array([[5, 6],
       [5, 6],
       [5, 6]])
Coordinates:
  * y        (y) int64 0 1
  * x        (x) int64 0 1 2
```

Fill out the dimensions of all data variables in a dataset:

```
>>> ds = xray.Dataset({'a': a, 'b': b})
>>> ds2, = xray.broadcast(ds)  # use tuple unpacking to extract one dataset
>>> ds2
<xray.Dataset>
Dimensions:  (x: 3, y: 2)
Coordinates:
  * x        (x) int64 0 1 2
  * y        (y) int64 0 1
Data variables:
    a        (x, y) int64 1 1 2 2 3 3
    b        (x, y) int64 5 6 5 6 5 6
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/670/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
105927249 MDExOlB1bGxSZXF1ZXN0NDQ3MzczODA= 569 BUG: ensure xray works with pandas 0.17.0 shoyer 1217238 closed 0   0.6.1 1307323 3 2015-09-11T01:12:55Z 2015-10-21T07:05:48Z 2015-09-11T06:23:56Z MEMBER   0 pydata/xarray/pulls/569

We were using some internal routines in pandas to convert arrays of datetime objects to datetime64. Predictably, these internal routines have now changed, breaking xray.

This is definitely my fault but also bad luck -- I had a guard against the internal function disappearing, but not against its keyword arguments changing.

In any case, this fix ensures forwards compatibility with the next release of pandas, which will be coming out next week.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/569/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
103966239 MDExOlB1bGxSZXF1ZXN0NDM3MjQxODQ= 554 Fixes for complex numbers shoyer 1217238 closed 0   0.6.1 1307323 3 2015-08-31T00:36:57Z 2015-10-21T07:05:47Z 2015-09-01T20:28:51Z MEMBER   0 pydata/xarray/pulls/554

Fixes #553

Also ensures we skip NaN when aggregating with complex dtypes.

~~Still needs release notes.~~

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/554/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
29921033 MDU6SXNzdWUyOTkyMTAzMw== 79 Better support for batched/out-of-core computation shoyer 1217238 closed 0     3 2014-03-21T17:55:46Z 2015-09-20T23:28:22Z 2015-09-20T23:28:22Z MEMBER      

One option: add a batch_apply method:

This would be a shortcut for split-apply-combine with groupby/apply if the grouping over a dimension is only being done for efficiency reasons.

This function should take several parameters:

  • The dimension to group over.
  • The batchsize to group over on this dimension (defaulting to 1).
  • The func to apply to each group.

At first, this function would be useful just to avoid memory issues. Eventually, it would be nice to add a n_jobs parameter which would automatically dispatch to multiprocessing/joblib. We would need to get pickling (issue #24) working first to be able to do this.
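A rough standalone sketch of what such a method could look like, written against the modern xarray API rather than xray's internals (any groupby-based dispatch is elided):

```python
import numpy as np
import xarray as xr

def batch_apply(obj, dim, func, batchsize=1):
    """Apply func to consecutive batches along dim, then recombine."""
    pieces = []
    for start in range(0, obj.sizes[dim], batchsize):
        batch = obj.isel({dim: slice(start, start + batchsize)})
        pieces.append(func(batch))
    return xr.concat(pieces, dim=dim)

# e.g., square a large array one 2-row batch at a time to cap peak memory
da = xr.DataArray(np.arange(12).reshape(4, 3), dims=['time', 'x'])
result = batch_apply(da, 'time', lambda b: b ** 2, batchsize=2)
```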

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/79/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
69216911 MDU6SXNzdWU2OTIxNjkxMQ== 394 Checklist for releasing a version of xray with dask support shoyer 1217238 closed 0   0.5 987654 3 2015-04-17T21:02:10Z 2015-06-01T18:27:49Z 2015-06-01T18:27:49Z MEMBER      

For dask:

- [x] default threadpool for dask.array
- [x] fix indexing bugs for dask.array
- [x] make a decision on (and if necessary implement) renaming "block" to "chunk"
- [x] fix repeated use of da.insert

For xray:

- [x] update xray for the updated dask (https://github.com/xray/xray/pull/395)
- [x] figure out how to handle caching with the .load() method on dask arrays
- [x] clean up the xray documentation on dask array
- [x] write an introductory blog post

Things we can add in an incremental release:

- make non-aggregating grouped operations more usable
- automatic lazy apply for grouped operations on xray objects

CC @mrocklin

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/394/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
71994342 MDExOlB1bGxSZXF1ZXN0MzQ0MDc4MDE= 405 Add robust retry logic when accessing remote datasets shoyer 1217238 closed 0   0.5 987654 3 2015-04-29T21:25:47Z 2015-05-01T20:33:46Z 2015-05-01T20:33:45Z MEMBER   0 pydata/xarray/pulls/405

Accessing data from remote datasets now uses retry logic (with exponential backoff), which should make it robust to occasional bad responses from DAP servers.
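A minimal sketch of the pattern (names and defaults here are illustrative, not the exact implementation):

```python
import time

def robust_getitem(array, key, catch=Exception, max_retries=5, initial_delay=0.5):
    """Index into a remote array, retrying with exponential backoff."""
    for n in range(max_retries + 1):
        try:
            return array[key]
        except catch:
            if n == max_retries:
                raise
            time.sleep(initial_delay * 2 ** n)  # 0.5s, 1s, 2s, ...
```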

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/405/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
59323055 MDU6SXNzdWU1OTMyMzA1NQ== 345 Alternatives to virtual variables in the form "time.season"? shoyer 1217238 closed 0     3 2015-02-28T03:32:55Z 2015-03-03T01:10:12Z 2015-03-02T23:20:07Z MEMBER      

@jhamman writes in #337:

Since ds.groupby('time.season').mean('time') returns a Dataset with a Coordinates variable named time.season, ds.sel(time.season='JJA') doesn't work for Python syntax reasons. So, I don't know if I would change the syntax used in my example (selecting my position). I'm not keen on this constructor: ds.sel(**{'time.season':'JJA'}). I'm wondering if it would be better to name the coordinates returned from "virtual variable" operations without the "time." portion. Just a thought.

I agree, this is awkward. This has been on my to-do list in the back of my head for some time.

My hesitation with just using 'season' for the name of ds['time.season'] is that it would be a little weird to have indexing return something with a different name than what you asked for.

ds['season'] is another option that initially looks even more appealing, but what if we have more than one datetime variable? This is not unheard of (e.g., 'time' and 'forecast_reference_time').

Another option would be to simply support ds['time_season']. Then at least you can do indexing without using ** (e.g., ds.sel(time_season='JJA')).
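For reference, a small reproduction of the problem (using the modern xarray package name); the final line shows selection under the unprefixed coordinate name that xarray ultimately adopted:

```python
import pandas as pd
import xarray as xr

ds = xr.Dataset(
    {'t2m': ('time', [10.0, 25.0, 24.0, 9.0])},
    coords={'time': pd.date_range('2000-01-01', periods=4, freq='91D')},
)
seasonal = ds.groupby('time.season').mean('time')
# With a coordinate literally named 'time.season', keyword selection fails
# for syntax reasons and requires: seasonal.sel(**{'time.season': 'JJA'})
# With the unprefixed name, a plain keyword works:
jja = seasonal.sel(season='JJA')
```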

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/345/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
59032389 MDExOlB1bGxSZXF1ZXN0MzAwNTY0MDk= 337 Cleanup (mostly documentation) shoyer 1217238 closed 0   0.4 799013 3 2015-02-26T07:40:01Z 2015-02-27T22:22:47Z 2015-02-26T07:43:37Z MEMBER   0 pydata/xarray/pulls/337
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/337/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
49755539 MDU6SXNzdWU0OTc1NTUzOQ== 280 Proposal: allow tuples instead of slice objects in sel or isel shoyer 1217238 closed 0     3 2014-11-21T22:21:10Z 2015-02-24T01:22:13Z 2015-02-24T01:22:13Z MEMBER      

e.g., we should be able to write ds.sel(time=('2000', '2010')) as an alias for ds.sel(time=slice('2000', '2010'))
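A sketch of the normalization this would require; `_normalize_indexers` is a hypothetical helper, not part of xarray's API:

```python
def _normalize_indexers(indexers):
    """Expand tuple values into slice objects, per the proposal above."""
    return {
        dim: slice(*value) if isinstance(value, tuple) else value
        for dim, value in indexers.items()
    }

# _normalize_indexers({'time': ('2000', '2010')})
# -> {'time': slice('2000', '2010', None)}
```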

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/280/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
39264845 MDU6SXNzdWUzOTI2NDg0NQ== 197 We need some way to identify non-index coordinates shoyer 1217238 closed 0   0.3 740776 3 2014-08-01T06:36:13Z 2014-12-19T07:16:14Z 2014-09-10T06:07:15Z MEMBER      

I am currently working with station data. In order to keep around latitude and longitude (I use station_id as the coordinate variable), I need to resort to some ridiculous contortions:

```python
residuals = results['y'] - observations['y']
residuals.dataset.update(results.select_vars('longitude', 'latitude'))
```

There has got to be an easier way to handle this.


I don't want to revert to some primitive guessing strategy (e.g., looking at attrs['coordinates']) to figure out which extra variables can be safely kept after mathematical operations.

Another approach would be to try to preserve everything in the dataset linked to a DataArray when doing math. But I don't really like this option, either, because it would lead to serious propagation of "linked dataset variables", which are rather surprising and can have unexpected performance consequences (though at least they appear in repr as of #128).


This leaves me to a final alternative: restructuring xray's internals to provide first-class support for coordinates that are not indexes. For example, this would mean promoting ds.coordinates to an actual dictionary stored on a dataset, and allowing it to hold objects that aren't an xray.Coordinate.

Making this change transparent to users would likely require changing the Dataset signature to something like Dataset(variables, coords, attrs). We might (yet again) want to rename Coordinate, to something like IndexVar, to emphasize the notion of "index" and "non-index" coordinates. And we could get rid of the terrible "linked dataset variable".

Once we have non-index coordinates, we need a policy for what to do when adding two DataArrays for which they differ. I think my preferred approach is to not enforce that they be found on both arrays, but to raise an exception if there are any conflicting values -- unless they are scalar valued, in which case they are dropped, turned into a tuple, or given different names. (Otherwise there would be cases where you couldn't calculate x[1] - x[0].)

We might even be able to keep around multi-dimensional coordinates this way (e.g., 2D lat/lon arrays for projected data).... I'll need to think about that one some more.
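A toy sketch of the conflict policy described above (plain dicts of NumPy values stand in for real coordinate objects):

```python
import numpy as np

def merge_nonindex_coords(coords1, coords2):
    """Keep coordinates found on either operand; raise on conflicting
    non-scalar values; drop conflicting scalars so x[1] - x[0] works."""
    merged = dict(coords1)
    for name, value in coords2.items():
        if name not in merged:
            merged[name] = value
        elif not np.array_equal(merged[name], value):
            if np.ndim(value) == 0:
                del merged[name]  # conflicting scalars are dropped
            else:
                raise ValueError('conflicting values for coordinate %r' % name)
    return merged
```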

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/197/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
38848839 MDU6SXNzdWUzODg0ODgzOQ== 190 Consistent use of abbreviations: attrs, dims, coords shoyer 1217238 closed 0   0.2 650893 3 2014-07-27T19:38:35Z 2014-08-14T07:24:29Z 2014-08-14T07:24:29Z MEMBER      

Right now, we use ds.attrs but the keyword argument is still attributes. We should resolve this inconsistency.

We also use dimensions and coordinates instead of the natural abbreviations dims and coords (although dims is used in the Variable constructor). Should we switch to the abbreviated versions for consistency with attrs?

Note that I switched to attrs in part because of its use in other packages (h5py, pytables and blz). There is not as clear a precedent for what to call dimensions and coordinates.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/190/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
29879804 MDExOlB1bGxSZXF1ZXN0MTM4MjM3NzI= 77 ENH: Dataset.reindex_like and DatasetArray.reindex_like shoyer 1217238 closed 0     3 2014-03-21T05:12:53Z 2014-06-12T17:30:21Z 2014-04-09T03:05:43Z MEMBER   0 pydata/xarray/pulls/77

This provides an interface for re-indexing a dataset or dataset array using the coordinates from another object. Missing values along any coordinate are replaced by NaN.

This method is directly based on the pandas method DataFrame.reindex_like (and the related series and panel variants). Eventually, I would like to build upon this functionality to add a join method to xray.align with the possible values {'outer', 'inner', 'left', 'right'}, just like DataFrame.align.

This PR depends on PR #71, since I use its improved copy method for datasets.
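A minimal usage example, written against the modern xarray API (the package was still called xray when this PR landed):

```python
import xarray as xr

a = xr.DataArray([1.0, 2.0, 3.0], coords={'x': [0, 1, 2]}, dims='x')
b = xr.DataArray([10.0, 20.0], coords={'x': [1, 3]}, dims='x')

# conform b onto a's coordinates; labels absent from b are filled with NaN
print(b.reindex_like(a))  # values: [nan, 10., nan]
```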

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/77/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
28123550 MDExOlB1bGxSZXF1ZXN0MTI4MzQwMDU= 15 Version now contains git commit ID shoyer 1217238 closed 0     3 2014-02-23T18:17:04Z 2014-06-12T17:29:51Z 2014-02-23T20:22:49Z MEMBER   0 pydata/xarray/pulls/15

Thanks to some code borrowed from pandas, setup.py now reports the development version of xray as something like "0.1.0.dev-de28cd6".

I also took this opportunity to add xray.__version__.
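A rough sketch of the technique (the actual code borrowed from pandas handles more edge cases):

```python
import subprocess

def get_version(base='0.1.0'):
    """Append the short git commit ID for development builds."""
    try:
        rev = subprocess.check_output(
            ['git', 'rev-parse', '--short', 'HEAD']).decode().strip()
        return '%s.dev-%s' % (base, rev)
    except (OSError, subprocess.CalledProcessError):
        return base  # not a git checkout, e.g. installing from an sdist
```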

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/15/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
28297980 MDExOlB1bGxSZXF1ZXN0MTI5MzIxMDU= 20 Handle mask_and_scale ourselves instead of using netCDF4 shoyer 1217238 closed 0     3 2014-02-26T00:19:15Z 2014-06-12T17:29:32Z 2014-02-28T22:33:16Z MEMBER   0 pydata/xarray/pulls/20

This lets us use NaNs instead of masked arrays to indicate missing values.
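The essence of the change, as a rough sketch rather than xray's actual implementation: replace fill values with NaN in a float array, then apply CF-style scale/offset.

```python
import numpy as np

def mask_and_scale(data, fill_value=None, scale_factor=1.0, add_offset=0.0):
    """Decode CF packed data, using NaN rather than numpy masked arrays."""
    data = np.asarray(data, dtype=float)
    if fill_value is not None:
        data = np.where(data == fill_value, np.nan, data)
    return data * scale_factor + add_offset
```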

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/20/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
30339447 MDU6SXNzdWUzMDMzOTQ0Nw== 85 Rename `DatasetArray` to `DataArray`? shoyer 1217238 closed 0     3 2014-03-27T20:33:38Z 2014-05-06T20:10:19Z 2014-03-31T07:12:52Z MEMBER      

This would make it less ambiguous that this is the preferred way to access and manipulate data in xray.

On a related note, I would like to make XArray more of an internal implementation detail that we only expose to advanced users.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/85/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
