
issues


20 rows where comments = 5 and user = 2448579 sorted by updated_at descending


type (2 values)

  • issue 12
  • pull 8

state (2 values)

  • closed 17
  • open 3

repo (1 value)

  • xarray 20
Row columns: id, node_id, number, title, user, state, locked, assignee, milestone, comments, created_at, updated_at ▲, closed_at, author_association, active_lock_reason, draft, pull_request, body, reactions, performed_via_github_app, state_reason, repo, type
1574694462 I_kwDOAMm_X85d2-4- 7513 intermittent failures with h5netcdf, h5py on macos dcherian 2448579 closed 0     5 2023-02-07T16:58:43Z 2024-04-28T23:35:21Z 2024-04-28T23:35:21Z MEMBER      

What is your issue?

cc @hmaarrfk @kmuehlbauer

Passed: https://github.com/pydata/xarray/actions/runs/4115923717/jobs/7105298426
Failed: https://github.com/pydata/xarray/actions/runs/4115946392/jobs/7105345290

Versions:

```
h5netcdf   1.1.0   pyhd8ed1ab_0              conda-forge
h5py       3.8.0   nompi_py310h5555e59_100   conda-forge
hdf4       4.2.15  h7aa5921_5                conda-forge
hdf5       1.12.2  nompi_h48135f9_101        conda-forge
```

```
=================================== FAILURES ===================================
__________ test_open_mfdataset_manyfiles[h5netcdf-20-True-5-5] __________
[gw1] darwin -- Python 3.10.9 /Users/runner/micromamba-root/envs/xarray-tests/bin/python

readengine = 'h5netcdf', nfiles = 20, parallel = True, chunks = 5
file_cache_maxsize = 5

@requires_dask
@pytest.mark.filterwarnings("ignore:use make_scale(name) instead")
def test_open_mfdataset_manyfiles(
    readengine, nfiles, parallel, chunks, file_cache_maxsize
):
    # skip certain combinations
    skip_if_not_engine(readengine)

    if ON_WINDOWS:
        pytest.skip("Skipping on Windows")

    randdata = np.random.randn(nfiles)
    original = Dataset({"foo": ("x", randdata)})
    # test standard open_mfdataset approach with too many files
    with create_tmp_files(nfiles) as tmpfiles:
        writeengine = readengine if readengine != "pynio" else "netcdf4"
        # split into multiple sets of temp files
        for ii in original.x.values:
            subds = original.isel(x=slice(ii, ii + 1))
            if writeengine != "zarr":
                subds.to_netcdf(tmpfiles[ii], engine=writeengine)
            else:  # if writeengine == "zarr":
                subds.to_zarr(store=tmpfiles[ii])

        # check that calculation on opened datasets works properly
      with open_mfdataset(
            tmpfiles,
            combine="nested",
            concat_dim="x",
            engine=readengine,
            parallel=parallel,
            chunks=chunks if (not chunks and readengine != "zarr") else "auto",
        ) as actual:

/Users/runner/work/xarray/xarray/xarray/tests/test_backends.py:3267:


/Users/runner/work/xarray/xarray/xarray/backends/api.py:991: in open_mfdataset
    datasets, closers = dask.compute(datasets, closers)
/Users/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/base.py:599: in compute
    results = schedule(dsk, keys, **kwargs)
/Users/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/threaded.py:89: in get
    results = get_async(
/Users/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/local.py:511: in get_async
    raise_exception(exc, tb)
/Users/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/local.py:319: in reraise
    raise exc
/Users/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/local.py:224: in execute_task
    result = _execute_task(task, data)
/Users/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/core.py:119: in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
/Users/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/utils.py:72: in apply
    return func(*args, **kwargs)
/Users/runner/work/xarray/xarray/xarray/backends/api.py:526: in open_dataset
    backend_ds = backend.open_dataset(
/Users/runner/work/xarray/xarray/xarray/backends/h5netcdf_.py:417: in open_dataset
    ds = store_entrypoint.open_dataset(
/Users/runner/work/xarray/xarray/xarray/backends/store.py:32: in open_dataset
    vars, attrs = store.load()
/Users/runner/work/xarray/xarray/xarray/backends/common.py:129: in load
    (decode_variable_name(k), v) for k, v in self.get_variables().items()
/Users/runner/work/xarray/xarray/xarray/backends/h5netcdf_.py:220: in get_variables
    return FrozenDict(
/Users/runner/work/xarray/xarray/xarray/core/utils.py:471: in FrozenDict
    return Frozen(dict(*args, **kwargs))
/Users/runner/work/xarray/xarray/xarray/backends/h5netcdf_.py:221: in <genexpr>
    (k, self.open_store_variable(k, v)) for k, v in self.ds.variables.items()
/Users/runner/work/xarray/xarray/xarray/backends/h5netcdf_.py:200: in open_store_variable
    elif var.compression is not None:
/Users/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/h5netcdf/core.py:394: in compression
    return self._h5ds.compression


self = <[AttributeError("'NoneType' object has no attribute '_root'") raised in repr()] Variable object at 0x151378970>

@property
def _h5ds(self):
    # Always refer to the root file and store not h5py object
    # subclasses:
  return self._root._h5file[self._h5path]

E AttributeError: 'NoneType' object has no attribute '_h5file'

```
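The traceback bottoms out in h5netcdf's `_h5ds` property with `self._root` set to None, i.e. the parent file object was already closed when a worker thread touched the variable, which suggests a file-handle race between the threaded opens and the file cache. A mitigation sketch under that assumption (not a confirmed fix; the file pattern is hypothetical):

```python
# Assumption: serializing the opens (parallel=False) avoids the suspected
# handle-eviction race, at the cost of the concurrent dask.delayed opens.
import xarray as xr

ds = xr.open_mfdataset(
    "file_*.nc",        # hypothetical file pattern
    combine="nested",
    concat_dim="x",
    engine="h5netcdf",
    parallel=False,
)
```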

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7513/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1954445639 I_kwDOAMm_X850fnlH 8350 optimize align for scalars at least dcherian 2448579 open 0     5 2023-10-20T14:48:25Z 2023-10-20T19:17:39Z   MEMBER      

What happened?

Here's a simple rescaling calculation:

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {"a": (("x", "y"), np.ones((300, 400))), "b": (("x", "y"), np.ones((300, 400)))}
)
mean = ds.mean()  # scalar
std = ds.std()  # scalar
rescaled = (ds - mean) / std
```

The profile for the last line shows 30% (!!!) of the time spent in align (really reindex_like), even though there's nothing to reindex when only scalars are involved!

This is a small example inspired by an ML pipeline where this normalization happens many times in a tight loop.

cc @benbovy

What did you expect to happen?

A fast path for when no reindexing needs to happen.
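Until such a fast path exists, a hedged user-side workaround (not xarray internals) is to pull the scalar statistics out of xarray before the arithmetic, so no align/reindex_like call happens at all:

```python
# Extract plain Python scalars from the scalar Datasets; arithmetic between
# a DataArray and a float bypasses xarray's alignment machinery entirely.
mean_vals = {k: v.item() for k, v in mean.items()}
std_vals = {k: v.item() for k, v in std.items()}
rescaled = xr.Dataset(
    {k: (ds[k] - mean_vals[k]) / std_vals[k] for k in ds.data_vars}
)
```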

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8350/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1217566173 I_kwDOAMm_X85IkpXd 6528 cumsum drops index coordinates dcherian 2448579 open 0     5 2022-04-27T16:04:08Z 2023-09-22T07:55:56Z   MEMBER      

What happened?

cumsum drops index coordinates. Seen in #6525 and #3417.

What did you expect to happen?

Preserve index coordinates

Minimal Complete Verifiable Example

```python
import xarray as xr

ds = xr.Dataset(
    {"foo": (("x",), [7, 3, 1, 1, 1, 1, 1])},
    coords={"x": [0, 1, 2, 3, 4, 5, 6]},
)
ds.cumsum("x")
```

```
<xarray.Dataset>
Dimensions:  (x: 7)
Dimensions without coordinates: x
Data variables:
    foo      (x) int64 7 10 11 12 13 14 15
```
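A hedged workaround until this is fixed: copy the index coordinate back onto the result.

```python
# Restore the "x" coordinate that cumsum dropped (workaround, not the fix).
result = ds.cumsum("x").assign_coords(x=ds.x)
```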

Relevant log output

No response

Anything else we need to know?

No response

Environment

xarray main
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6528/reactions",
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1175093771 I_kwDOAMm_X85GCoIL 6391 apply_ufunc and Datasets with variables without the core dimension dcherian 2448579 closed 0     5 2022-03-21T09:13:02Z 2023-09-17T08:20:15Z 2023-09-17T08:20:14Z MEMBER      

Is your feature request related to a problem?

Consider this example

```python
import numpy as np
import xarray as xr

ds = xr.Dataset({"a": ("x", [1, 2, 3]), "b": ("y", [1, 2, 3])})
xr.apply_ufunc(np.mean, ds, input_core_dims=[["x"]])
```

This raises `ValueError: operand to apply_ufunc has required core dimensions ['x'], but some of these dimensions are absent on an input variable: ['x']`

because core dimension x is missing on variable b. This behaviour makes it annoying to use apply_ufunc on Datasets.

Describe the solution you'd like

Add a new kwarg to apply_ufunc called missing_core_dim that controls how to handle variables without all input core dimensions. This kwarg could take one of three values (a sketch of the "copy" behaviour follows below):

1. "raise" - raise an error, current behaviour
2. "copy" - skip applying the function and copy the variable from input to output.
3. "drop" - skip applying the function and drop the variable.
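Under the current behaviour, the "copy" option can be emulated by splitting the Dataset ourselves. A minimal sketch, assuming the example `ds` above:

```python
import numpy as np
import xarray as xr

ds = xr.Dataset({"a": ("x", [1, 2, 3]), "b": ("y", [1, 2, 3])})

# Partition variables by whether they carry the core dimension "x".
with_dim = ds[[name for name, v in ds.items() if "x" in v.dims]]
without_dim = ds[[name for name, v in ds.items() if "x" not in v.dims]]

reduced = xr.apply_ufunc(np.mean, with_dim, input_core_dims=[["x"]])
result = xr.merge([reduced, without_dim])  # "copy": untouched variables pass through
```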

Describe alternatives you've considered

No response

Additional context

No response

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6391/reactions",
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1812301185 I_kwDOAMm_X85sBYWB 8005 Design for IntervalIndex dcherian 2448579 open 0     5 2023-07-19T16:30:50Z 2023-09-09T06:30:20Z   MEMBER      

Is your feature request related to a problem?

We should add a wrapper for pandas.IntervalIndex. This would solve a long-standing problem around propagating "bounds" variables (CF conventions, https://github.com/pydata/xarray/issues/1475).

The CF design

CF "encoding" for intervals is to use bounds variables. There is an attribute "bounds" on the dimension coordinate, that refers to a second variable (at least 2D). Example: x has an attribute bounds that refers to x_bounds.

```python
import numpy as np
import xarray as xr

left = np.arange(0.5, 3.6, 1)
right = np.arange(1.5, 4.6, 1)
bounds = np.stack([left, right])

ds = xr.Dataset(
    {"data": ("x", [1, 2, 3, 4])},
    coords={
        "x": ("x", [1, 2, 3, 4], {"bounds": "x_bounds"}),
        "x_bounds": (("bnds", "x"), bounds),
    },
)
ds
```

A fundamental problem with our current data model is that we lose x_bounds when we extract ds.data because there is a dimension bnds that is not shared with ds.data. Very important metadata is now lost!

We would also like to use the "bounds" to enable interval based indexing. ds.sel(x=1.1) should give you the value from the appropriate interval.

Pandas IntervalIndex

All the indexing is easy to implement by wrapping pandas.IntervalIndex, but there is one limitation: pd.IntervalIndex saves two pieces of information for each interval (left bound, right bound), while CF saves three: left bound, right bound (see x_bounds) and a "central" value (see x). This should be OK to work around in our wrapper.
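A minimal demonstration of both points, using plain pandas (nothing xarray-specific assumed):

```python
import pandas as pd

# Interval-based lookup works out of the box...
idx = pd.IntervalIndex.from_arrays([0.5, 1.5, 2.5, 3.5], [1.5, 2.5, 3.5, 4.5])
idx.get_indexer([1.1])  # array([0]): 1.1 falls in (0.5, 1.5]

# ...but pandas only stores the two bounds; its midpoints need not equal
# the CF "central" values, which must be carried separately.
idx.mid  # Index([1.0, 2.0, 3.0, 4.0])
```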

Fundamental Question

To me, a core question is whether x_bounds needs to be preserved after creating an IntervalIndex.

1. If so, we need a better rule around coordinate variable propagation. In this case, the IntervalIndex would be associated with x and x_bounds. So the rule could be:

   > "Propagate all variables necessary to propagate an index associated with any of the dimensions on the extracted variable."

   So when extracting `ds.data` we propagate all variables necessary to propagate indexes associated with `ds.data.dims`, that is `x`, which would say "propagate `x`, `x_bounds`, and the IntervalIndex".
2. Alternatively, we could choose to drop x_bounds entirely. I interpret this approach as "decoding" the bounds variable to an interval index object. When saving to disk, we would encode the interval index in two variables. (See below.)

Describe the solution you'd like

I've prototyped (2) (approach 1 in this notebook) following @benbovy's suggestion:

```python
import numpy as np
import pandas as pd
import xarray as xr
from xarray import Variable
from xarray.indexes import PandasIndex


class XarrayIntervalIndex(PandasIndex):
    def __init__(self, index, dim, coord_dtype):
        assert isinstance(index, pd.IntervalIndex)
        # attributes required by PandasIndex
        self.index = index
        self.dim = dim
        self.coord_dtype = coord_dtype

    @classmethod
    def from_variables(cls, variables, options):
        assert len(variables) == 1
        (dim,) = tuple(variables)
        bounds = options["bounds"]
        assert isinstance(bounds, (xr.DataArray, xr.Variable))

        (axis,) = bounds.get_axis_num(set(bounds.dims) - {dim})
        left, right = np.split(bounds.data, 2, axis=axis)
        index = pd.IntervalIndex.from_arrays(left.squeeze(), right.squeeze())
        coord_dtype = bounds.dtype
        return cls(index, dim, coord_dtype)

    def create_variables(self, variables):
        from xarray.core.indexing import PandasIndexingAdapter

        newvars = {self.dim: xr.Variable(self.dim, PandasIndexingAdapter(self.index))}
        return newvars

    def __repr__(self):
        string = f"Xarray{self.index!r}"
        return string

    def to_pandas_index(self):
        return self.index

    @property
    def mid(self):
        return PandasIndex(self.index.mid, self.dim, self.coord_dtype)

    @property
    def left(self):
        return PandasIndex(self.index.left, self.dim, self.coord_dtype)

    @property
    def right(self):
        return PandasIndex(self.index.right, self.dim, self.coord_dtype)
```

```python
ds1 = (
    ds.drop_indexes("x")
    .set_xindex("x", XarrayIntervalIndex, bounds=ds.x_bounds)
    .drop_vars("x_bounds")
)
ds1
```

```python
ds1.sel(x=1.1)
```

Describe alternatives you've considered

I've tried some approaches in this notebook

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8005/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1806239984 PR_kwDOAMm_X85Vl5Ch 7989 Allow opening datasets with nD dimenson coordinate variables. dcherian 2448579 closed 0     5 2023-07-15T17:33:18Z 2023-07-19T19:06:25Z 2023-07-19T18:25:33Z MEMBER   0 pydata/xarray/pulls/7989
  • [x] Closes #2233
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst

Avoid automatically creating an Index variable when an nD variable shares a name with one of its dimensions.

Closes #2233

```python
import xarray as xr

url = "http://www.smast.umassd.edu:8080/thredds/dodsC/FVCOM/NECOFS/Forecasts/NECOFS_GOM3_FORECAST.nc"
ds = xr.open_dataset(url, engine="netcdf4")
display(ds)

xr.testing._assert_internal_invariants(ds, check_default_indexes=False)  # no raise on #7368
```

~The internal invariants assert fails on main but succeeds on #7368~. EDIT: now fixed the invariants check.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7989/reactions",
    "total_count": 2,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 1,
    "eyes": 0
}
    xarray 13221727 pull
1639732867 PR_kwDOAMm_X85M2fjy 7670 Delete built-in cfgrib backend dcherian 2448579 closed 0     5 2023-03-24T16:53:56Z 2023-06-01T15:22:33Z 2023-03-29T15:19:51Z MEMBER   0 pydata/xarray/pulls/7670
  • [x] Closes #7199
  • [x] Tests ~added~ deleted
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7670/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1284094480 I_kwDOAMm_X85MiboQ 6722 Avoid loading any data for reprs dcherian 2448579 closed 0     5 2022-06-24T19:04:30Z 2022-10-28T16:23:20Z 2022-10-28T16:23:20Z MEMBER      

What happened?

For "small" datasets, we load in to memory when displaying the repr. For cloud backed datasets with large number of "small" variables, this can use a lot of time sequentially loading O(100) variables just for a repr.

https://github.com/pydata/xarray/blob/6c8db5ed005e000b35ad8b6ea9080105e608e976/xarray/core/formatting.py#L548-L549

What did you expect to happen?

Fast reprs!

Minimal Complete Verifiable Example

This dataset has 48 "small" variables.

```python
import xarray as xr

dc1 = xr.open_dataset(
    "s3://its-live-data/datacubes/v02/N40E080/ITS_LIVE_vel_EPSG32645_G0120_X250000_Y4750000.zarr",
    engine="zarr",
    storage_options={"anon": True},
)
dc1._repr_html_()
```

On 2022.03.0 this repr takes 36.4 s. If I comment out the `array.size` condition, I get 6 μs.
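For reference, the timing above can be reproduced with a simple harness around the repr call (assuming the `dc1` dataset opened above):

```python
import time

t0 = time.perf_counter()
html = dc1._repr_html_()  # triggers the sequential loads described above
print(f"repr took {time.perf_counter() - t0:.1f}s")
```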

MVCE confirmation

  • [x] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [x] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [x] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [x] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

No response

Environment

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.10.4 | packaged by conda-forge | (main, Mar 24 2022, 17:43:32) [Clang 12.0.1 ]
python-bits: 64
OS: Darwin
OS-release: 21.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None
xarray: 2022.3.0
pandas: 1.4.2
numpy: 1.22.4
scipy: 1.8.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.11.3
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.2.10
cfgrib: None
iris: None
bottleneck: None
dask: 2022.05.2
distributed: None
matplotlib: 3.5.2
cartopy: 0.20.2
seaborn: 0.11.2
numbagg: None
fsspec: 2022.5.0
cupy: None
pint: None
sparse: None
setuptools: 62.3.2
pip: 22.1.2
conda: None
pytest: None
IPython: 8.4.0
sphinx: 4.5.0
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6722/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1382753751 I_kwDOAMm_X85SayXX 7069 release? dcherian 2448579 closed 0     5 2022-09-22T17:00:58Z 2022-10-01T18:25:13Z 2022-10-01T18:25:13Z MEMBER      

What is your issue?

It's been 3 months since our last release.

We still have quite a few regressions from the last release, but @benbovy does have open PRs for a number of them. However, we do have some nice bugfixes and other commits in the meantime.

I propose we issue a new release, perhaps after @benbovy merges the PRs he thinks are ready.

I'll be out of town for the next few days, so if someone else could volunteer to be release manager that would be great!

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7069/reactions",
    "total_count": 4,
    "+1": 4,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1284071791 PR_kwDOAMm_X846VIdv 6721 Fix .chunks loading lazy backed array data dcherian 2448579 closed 0     5 2022-06-24T18:45:45Z 2022-06-29T20:15:16Z 2022-06-29T20:06:36Z MEMBER   0 pydata/xarray/pulls/6721
  • [x] Closes #6538
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst

@shoyer is there a way to test this?
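One hedged possibility (a sketch, not necessarily the test that landed): open a dataset lazily, touch `.chunks`, and assert the data was not loaded.

```python
# Sketch: .chunks should be derivable from metadata alone, so reading it
# must not pull array values into memory. `_in_memory` is a private xarray
# attribute, used here purely for illustration.
import xarray as xr

ds = xr.open_dataset("some_file.nc", chunks={})  # hypothetical file
_ = ds["foo"].chunks                             # hypothetical variable name
assert not ds["foo"].variable._in_memory
```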

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6721/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
654889988 MDU6SXNzdWU2NTQ4ODk5ODg= 4215 setting variables named in CF attributes as coordinate variables dcherian 2448579 closed 0     5 2020-07-10T16:17:08Z 2021-04-19T03:32:02Z 2021-04-19T03:32:02Z MEMBER      

This came up in #2844 by @DWesl (see also #3689)

Currently we have decode_coords which sets variables named in attrs["coordinates"] as coordinate variables.

There are a number of other CF attributes that can contain variable names:

1. bounds
2. grid_mapping
3. ancillary_variables
4. cell_measures
5. maybe more?

As in #3689 it's hard to see why a lot of these variables named in these attributes would be useful as "data variables".

Question: Should we allow decode_coords to control whether variables mentioned in these attributes are set as coordinate variables?

cc @jthielen
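For what the proposal would mean in practice, here is a hedged sketch (the helper name and the attribute parsing are assumptions, not xarray API) that promotes every variable referenced by these attributes:

```python
# Hypothetical helper: set_coords() every variable that a CF reference
# attribute points at. cell_measures values look like "area: cell_area",
# so we split on both ":" and whitespace.
CF_REFERENCE_ATTRS = ("bounds", "grid_mapping", "ancillary_variables", "cell_measures")

def promote_cf_referenced(ds):
    referenced = set()
    for var in ds.variables.values():
        for attr in CF_REFERENCE_ATTRS:
            for token in var.attrs.get(attr, "").replace(":", " ").split():
                if token in ds.variables:
                    referenced.add(token)
    return ds.set_coords(list(referenced & set(ds.data_vars)))
```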

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4215/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
589583977 MDU6SXNzdWU1ODk1ODM5Nzc= 3908 check index invariants in test_dataset, test_dataarray dcherian 2448579 closed 0     5 2020-03-28T14:07:17Z 2020-10-13T04:03:58Z 2020-10-13T04:03:58Z MEMBER      

There are a large number of tests in test_dataarray.py and test_dataset.py that use the assert actual.equals(expected) pattern.

We should switch these to be assert_equal(actual, expected) so that we run the invariant checks:

https://github.com/pydata/xarray/blob/acf7d4157ca44f05c85a92d1b914b68738988773/xarray/tests/__init__.py#L150-L154

Seems like an easy use of regexes for someone that knows them (i.e. not me ;))
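The regex in question could be as simple as the following sketch (untested against the real test files):

```python
import re

# Rewrite `assert a.equals(b)` into `assert_equal(a, b)`.
pattern = re.compile(r"assert (\w+)\.equals\((\w+)\)")

print(pattern.sub(r"assert_equal(\1, \2)", "assert actual.equals(expected)"))
# -> assert_equal(actual, expected)
```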

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3908/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
550355524 MDU6SXNzdWU1NTAzNTU1MjQ= 3698 dask.optimize on xarray objects dcherian 2448579 closed 0     5 2020-01-15T18:29:18Z 2020-09-20T05:21:57Z 2020-09-20T05:21:57Z MEMBER      

I am trying to call dask.optimize on an xarray object before the graph gets too big, but I get weird errors. Simple examples below. All examples work if I remove the dask.optimize step.

cc @mrocklin @shoyer

This works with dask arrays:

```python
import dask
import dask.array

a = dask.array.ones((10, 5), chunks=(1, 3))
a = dask.optimize(a)[0]
a.compute()
```

It works when a DataArray is constructed using a dask array:

```python
da = xr.DataArray(a)
da = dask.optimize(da)[0]
da.compute()
```

but fails when creating a DataArray with a numpy array and then chunking it

:man_shrugging:

```python
da = xr.DataArray(a.compute()).chunk({"dim_0": 5})
da = dask.optimize(da)[0]
da.compute()
```

fails with error

```python
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-50-1f16efa19800> in <module>
      1 da = xr.DataArray(a.compute()).chunk({"dim_0": 5})
      2 da = dask.optimize(da)[0]
----> 3 da.compute()

~/python/xarray/xarray/core/dataarray.py in compute(self, **kwargs)
    838         """
    839         new = self.copy(deep=False)
--> 840         return new.load(**kwargs)
    841 
    842     def persist(self, **kwargs) -> "DataArray":

~/python/xarray/xarray/core/dataarray.py in load(self, **kwargs)
    812         dask.array.compute
    813         """
--> 814         ds = self._to_temp_dataset().load(**kwargs)
    815         new = self._from_temp_dataset(ds)
    816         self._variable = new._variable

~/python/xarray/xarray/core/dataset.py in load(self, **kwargs)
    659 
    660         # evaluate all the dask arrays simultaneously
--> 661         evaluated_data = da.compute(*lazy_data.values(), **kwargs)
    662 
    663         for k, data in zip(lazy_data, evaluated_data):

~/miniconda3/envs/dcpy_updated/lib/python3.7/site-packages/dask/base.py in compute(*args, **kwargs)
    434     keys = [x.__dask_keys__() for x in collections]
    435     postcomputes = [x.__dask_postcompute__() for x in collections]
--> 436     results = schedule(dsk, keys, **kwargs)
    437     return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
    438 

~/miniconda3/envs/dcpy_updated/lib/python3.7/site-packages/dask/threaded.py in get(dsk, result, cache, num_workers, pool, **kwargs)
     79         get_id=_thread_get_id,
     80         pack_exception=pack_exception,
---> 81         **kwargs
     82     )
     83 

~/miniconda3/envs/dcpy_updated/lib/python3.7/site-packages/dask/local.py in get_async(apply_async, num_workers, dsk, result, cache, get_id, rerun_exceptions_locally, pack_exception, raise_exception, callbacks, dumps, loads, **kwargs)
    484                         _execute_task(task, data)  # Re-execute locally
    485                     else:
--> 486                         raise_exception(exc, tb)
    487                 res, worker_id = loads(res_info)
    488                 state["cache"][key] = res

~/miniconda3/envs/dcpy_updated/lib/python3.7/site-packages/dask/local.py in reraise(exc, tb)
    314     if exc.__traceback__ is not tb:
    315         raise exc.with_traceback(tb)
--> 316     raise exc
    317 
    318 

~/miniconda3/envs/dcpy_updated/lib/python3.7/site-packages/dask/local.py in execute_task(key, task_info, dumps, loads, get_id, pack_exception)
    220     try:
    221         task, data = loads(task_info)
--> 222         result = _execute_task(task, data)
    223         id = get_id()
    224         result = dumps((result, id))

~/miniconda3/envs/dcpy_updated/lib/python3.7/site-packages/dask/core.py in _execute_task(arg, cache, dsk)
    117         func, args = arg[0], arg[1:]
    118         args2 = [_execute_task(a, cache) for a in args]
--> 119         return func(*args2)
    120     elif not ishashable(arg):
    121         return arg

TypeError: string indices must be integers
```

And a different error when rechunking a dask-backed DataArray

```python
da = xr.DataArray(a).chunk({"dim_0": 5})
da = dask.optimize(da)[0]
da.compute()
```

```python
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-55-d978bbb9e38d> in <module>
      1 da = xr.DataArray(a).chunk({"dim_0": 5})
      2 da = dask.optimize(da)[0]
----> 3 da.compute()

~/python/xarray/xarray/core/dataarray.py in compute(self, **kwargs)
    838         """
    839         new = self.copy(deep=False)
--> 840         return new.load(**kwargs)
    841 
    842     def persist(self, **kwargs) -> "DataArray":

~/python/xarray/xarray/core/dataarray.py in load(self, **kwargs)
    812         dask.array.compute
    813         """
--> 814         ds = self._to_temp_dataset().load(**kwargs)
    815         new = self._from_temp_dataset(ds)
    816         self._variable = new._variable

~/python/xarray/xarray/core/dataset.py in load(self, **kwargs)
    659 
    660         # evaluate all the dask arrays simultaneously
--> 661         evaluated_data = da.compute(*lazy_data.values(), **kwargs)
    662 
    663         for k, data in zip(lazy_data, evaluated_data):

~/miniconda3/envs/dcpy_updated/lib/python3.7/site-packages/dask/base.py in compute(*args, **kwargs)
    434     keys = [x.__dask_keys__() for x in collections]
    435     postcomputes = [x.__dask_postcompute__() for x in collections]
--> 436     results = schedule(dsk, keys, **kwargs)
    437     return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
    438 

~/miniconda3/envs/dcpy_updated/lib/python3.7/site-packages/dask/threaded.py in get(dsk, result, cache, num_workers, pool, **kwargs)
     79         get_id=_thread_get_id,
     80         pack_exception=pack_exception,
---> 81         **kwargs
     82     )
     83 

~/miniconda3/envs/dcpy_updated/lib/python3.7/site-packages/dask/local.py in get_async(apply_async, num_workers, dsk, result, cache, get_id, rerun_exceptions_locally, pack_exception, raise_exception, callbacks, dumps, loads, **kwargs)
    484                         _execute_task(task, data)  # Re-execute locally
    485                     else:
--> 486                         raise_exception(exc, tb)
    487                 res, worker_id = loads(res_info)
    488                 state["cache"][key] = res

~/miniconda3/envs/dcpy_updated/lib/python3.7/site-packages/dask/local.py in reraise(exc, tb)
    314     if exc.__traceback__ is not tb:
    315         raise exc.with_traceback(tb)
--> 316     raise exc
    317 
    318 

~/miniconda3/envs/dcpy_updated/lib/python3.7/site-packages/dask/local.py in execute_task(key, task_info, dumps, loads, get_id, pack_exception)
    220     try:
    221         task, data = loads(task_info)
--> 222         result = _execute_task(task, data)
    223         id = get_id()
    224         result = dumps((result, id))

~/miniconda3/envs/dcpy_updated/lib/python3.7/site-packages/dask/core.py in _execute_task(arg, cache, dsk)
    117         func, args = arg[0], arg[1:]
    118         args2 = [_execute_task(a, cache) for a in args]
--> 119         return func(*args2)
    120     elif not ishashable(arg):
    121         return arg

~/miniconda3/envs/dcpy_updated/lib/python3.7/site-packages/dask/array/core.py in concatenate3(arrays)
   4305     if not ndim:
   4306         return arrays
-> 4307     chunks = chunks_from_arrays(arrays)
   4308     shape = tuple(map(sum, chunks))
   4309 

~/miniconda3/envs/dcpy_updated/lib/python3.7/site-packages/dask/array/core.py in chunks_from_arrays(arrays)
   4085 
   4086     while isinstance(arrays, (list, tuple)):
-> 4087         result.append(tuple([shape(deepfirst(a))[dim] for a in arrays]))
   4088         arrays = arrays[0]
   4089         dim += 1

~/miniconda3/envs/dcpy_updated/lib/python3.7/site-packages/dask/array/core.py in <listcomp>(.0)
   4085 
   4086     while isinstance(arrays, (list, tuple)):
-> 4087         result.append(tuple([shape(deepfirst(a))[dim] for a in arrays]))
   4088         arrays = arrays[0]
   4089         dim += 1

IndexError: tuple index out of range
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3698/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
665232266 MDU6SXNzdWU2NjUyMzIyNjY= 4265 cftime plotting fails on upstream-dev dcherian 2448579 closed 0     5 2020-07-24T15:07:44Z 2020-07-27T13:13:48Z 2020-07-26T19:04:55Z MEMBER      

seen in https://dev.azure.com/xarray/xarray/_build/results?buildId=3365&view=logs&jobId=2280efed-fda1-53bd-9213-1fa8ec9b4fa8&j=2280efed-fda1-53bd-9213-1fa8ec9b4fa8&t=175181ee-1928-5a6b-f537-168f7a8b7c2d

```
=========================== short test summary info ============================
FAILED xarray/tests/test_plot.py::TestCFDatetimePlot::test_cfdatetime_line_plot
FAILED xarray/tests/test_plot.py::TestCFDatetimePlot::test_cfdatetime_pcolormesh_plot
FAILED xarray/tests/test_plot.py::TestCFDatetimePlot::test_cfdatetime_contour_plot
```

e.g.

```
=================================== FAILURES ===================================
_______________ TestCFDatetimePlot.test_cfdatetime_line_plot _______________

self = <xarray.tests.test_plot.TestCFDatetimePlot object at 0x7f71d66219d0>

    def test_cfdatetime_line_plot(self):

E       ValueError: setting an array element with a sequence. The requested array would exceed the maximum number of dimension of 1.

/usr/share/miniconda/envs/xarray-tests/lib/python3.8/site-packages/matplotlib/transforms.py:943: ValueError
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4265/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
517855271 MDExOlB1bGxSZXF1ZXN0MzM2ODQ0OTE1 3487 Respect user-specified coordinates attribute. dcherian 2448579 closed 0     5 2019-11-05T15:46:00Z 2019-12-10T16:02:20Z 2019-12-10T16:02:01Z MEMBER   0 pydata/xarray/pulls/3487

A minimally invasive solution to #3351. If variable.encoding["coordinates"] is specified, we write that attribute to disk and warn the user that roundtripping may not work.
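A hedged usage sketch of the behaviour (the variable and coordinate names are hypothetical, and `ds` is an open Dataset):

```python
# If the user sets encoding["coordinates"] explicitly, this PR writes it
# to disk as-is and emits a warning that roundtripping may not work.
ds["foo"].encoding["coordinates"] = "lat lon"  # hypothetical names
ds.to_netcdf("out.nc")
```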

  • [x] Closes #3351
  • [x] Tests added
  • [x] Passes black . && mypy . && flake8
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3487/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
511668854 MDExOlB1bGxSZXF1ZXN0MzMxODIzODAw 3441 fix plotting with transposed nondim coords. dcherian 2448579 closed 0     5 2019-10-24T02:42:18Z 2019-12-04T21:37:03Z 2019-12-04T16:45:13Z MEMBER   0 pydata/xarray/pulls/3441
  • [x] Closes #3138
  • [x] Tests added
  • [x] Passes black . && mypy . && flake8
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3441/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
520252049 MDExOlB1bGxSZXF1ZXN0MzM4ODczODg3 3500 make coarsen reductions consistent with reductions on other classes dcherian 2448579 closed 0     5 2019-11-08T21:56:03Z 2019-12-04T16:11:21Z 2019-12-04T16:11:16Z MEMBER   0 pydata/xarray/pulls/3500
  • [x] Tests added
  • [x] Passes black . && mypy . && flake8
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API

This PR uses inject_reduce_methods to inject reduction methods into the Coarsen classes. So now we can do coarsen.count() and pass skipna down to the reduction methods.
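A hedged usage sketch of what this enables:

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(8.0), dims="x")
da[3] = np.nan

da.coarsen(x=2).count()            # reduction methods now injected on Coarsen
da.coarsen(x=2).sum(skipna=False)  # skipna reaches the underlying reduction
```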

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3500/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
431595147 MDU6SXNzdWU0MzE1OTUxNDc= 2885 Add broadcast_like? dcherian 2448579 closed 0     5 2019-04-10T16:20:37Z 2019-07-14T20:24:32Z 2019-07-14T20:24:32Z MEMBER      

What do you think of adding da.broadcast_like(a)?

This would be equivalent to xr.broadcast(a, da)[1]
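A quick sketch with hypothetical arrays:

```python
import xarray as xr

a = xr.DataArray([[1, 2], [3, 4]], dims=("x", "y"))
da = xr.DataArray([10, 20], dims="x")

xr.broadcast(a, da)[1]  # today's spelling: da broadcast to dims ("x", "y")
da.broadcast_like(a)    # the proposed method (since added to xarray)
```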

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2885/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
439418506 MDExOlB1bGxSZXF1ZXN0Mjc1MjI4MzM3 2935 plot: If provided with colormap do not modify it. dcherian 2448579 closed 0     5 2019-05-02T04:00:43Z 2019-05-09T16:19:57Z 2019-05-09T16:19:53Z MEMBER   0 pydata/xarray/pulls/2935
  • [x] Closes #2932
  • [x] Tests added
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API

If provided with a Colormap, we no longer override it. Facetgrid determines a colormap first and passes it down, so this prevents overwriting.
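A hedged illustration of the fixed behaviour (data and names are made up):

```python
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

cmap = plt.cm.viridis
da = xr.DataArray(np.random.rand(4, 5), dims=("y", "x"))
da.plot(cmap=cmap)  # with this PR the Colormap instance is used as-is
```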

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2935/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
279595497 MDExOlB1bGxSZXF1ZXN0MTU2NjIwNjM4 1762 ENH: Add dt.date accessor. dcherian 2448579 closed 0     5 2017-12-06T01:43:51Z 2018-05-10T05:12:15Z 2018-01-14T00:28:11Z MEMBER   0 pydata/xarray/pulls/1762

This PR lets us access dt.date, thereby removing all higher-frequency time information.

Use case: Just like dayofyear but easier to interpret when only looking at 1 year of data.

Example: Start with da.time

```
<xarray.DataArray 'time' (time: 8737)>
array(['2014-01-01T00:00:00.000000000', '2014-01-01T01:00:00.000000000',
       '2014-01-01T02:00:00.000000000', ..., '2014-12-30T22:00:00.000000000',
       '2014-12-30T23:00:00.000000000', '2014-12-31T00:00:00.000000000'],
      dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] 2014-01-01 2014-01-01T01:00:00 ...
```

then da.time.dt.date yields

```
<xarray.DataArray 'date' (time: 8737)>
array(['2014-01-01T00:00:00.000000000', '2014-01-01T00:00:00.000000000',
       '2014-01-01T00:00:00.000000000', ..., '2014-12-30T00:00:00.000000000',
       '2014-12-30T00:00:00.000000000', '2014-12-31T00:00:00.000000000'],
      dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] 2014-01-01 2014-01-01T01:00:00 ...
```

  • [x] Tests added (for all bug fixes or enhancements)
  • [x] Tests passed (for all non-documentation changes)
  • [x] Passes git diff upstream/master **/*py | flake8 --diff (remove if you did not edit any Python files)
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API (remove if this change should not be visible to users, e.g., if it is an internal clean-up, or if this is part of a larger project that will be documented later)
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1762/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);