issue_comments

16 rows where issue = 647804004 sorted by updated_at descending


user (6 distinct values)

  • weiji14: 6
  • shoyer: 5
  • dcherian: 2
  • Carreau: 1
  • martindurant: 1
  • pep8speaks: 1

author_association (3 distinct values)

  • CONTRIBUTOR: 8
  • MEMBER: 7
  • NONE: 1

issue (1 distinct value)

  • Xarray open_mfdataset with engine Zarr: 16
Comment 696783734 · dcherian · MEMBER · 2020-09-22T15:08:36Z
https://github.com/pydata/xarray/pull/4187#issuecomment-696783734

Thanks @weiji14 and @Mikejmnez. This is a great contribution.

Comment 696766963 · martindurant · CONTRIBUTOR · 2020-09-22T14:41:41Z
https://github.com/pydata/xarray/pull/4187#issuecomment-696766963

Note that zarr.open* now works with fsspec URLs (in master)
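
For illustration, here is how that combines with xarray's zarr engine (a minimal sketch; the bucket path is hypothetical, and reading from S3 also assumes s3fs is installed and the store was written with consolidated metadata):

```python
import fsspec
import xarray as xr

# fsspec.get_mapper turns a URL into the Mapping interface that zarr expects
mapper = fsspec.get_mapper("s3://hypothetical-bucket/dataset.zarr")
ds = xr.open_dataset(mapper, engine="zarr", backend_kwargs={"consolidated": True})
```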

Reactions: +1 × 2

Comment 696518519 · shoyer · MEMBER · 2020-09-22T05:40:57Z
https://github.com/pydata/xarray/pull/4187#issuecomment-696518519

Thanks @weiji14 and @Mikejmnez for your contribution!

Reactions: hooray × 3

Comment 651483660 · pep8speaks · NONE · created 2020-06-30T02:32:18Z · updated 2020-09-21T23:44:07Z
https://github.com/pydata/xarray/pull/4187#issuecomment-651483660

Hello @weiji14! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! :beers:

Comment last updated at 2020-09-21 23:44:07 UTC
Comment 690885702 · shoyer · MEMBER · 2020-09-11T05:35:29Z
https://github.com/pydata/xarray/pull/4187#issuecomment-690885702

> Just to note down a few things:
>
>   1. The deprecated "auto_chunk" kwarg was removed
>   2. open_zarr uses chunks="auto" by default, whereas open_dataset uses chunks=None (see my comment inline)
>
> The different default chunk behaviour (point 2) is worth raising, and it might be best to postpone the deprecation of open_zarr until the next release, so that there's time to discuss what we want the default setting to be (None or auto).

These all sound good to me!

I agree that we shouldn't change the default behavior for open_dataset, and should keep open_zarr around for now -- there is no urgent need to deprecate it.
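
To make the difference in point 2 concrete, a sketch (the store path is hypothetical):

```python
import xarray as xr

# open_zarr defaults to chunks="auto": variables come back dask-backed
ds_lazy = xr.open_zarr("example.zarr")

# open_dataset defaults to chunks=None: variables come back as numpy arrays
ds_eager = xr.open_dataset("example.zarr", engine="zarr")

# passing chunks="auto" explicitly restores the open_zarr behaviour
ds_lazy_too = xr.open_dataset("example.zarr", engine="zarr", chunks="auto")
```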

Comment 652702644 · weiji14 · CONTRIBUTOR · created 2020-07-01T23:59:32Z · updated 2020-07-03T04:23:34Z
https://github.com/pydata/xarray/pull/4187#issuecomment-652702644

> I agree. I think we should keep open_zarr around.

Just wanted to mention that two of the reviewers in the last PR (see https://github.com/pydata/xarray/pull/4003#issuecomment-619644606 and https://github.com/pydata/xarray/pull/4003#issuecomment-620169860) seemed in favour of deprecating open_zarr. If I'm counting the votes correctly (did I miss anyone?), that's 2 for and 2 against. We'll need a tiebreaker :laughing:

> As a reminder (because it took me a while to remember!), one goal with this refactor is to have open_mfdataset work with all backends (including zarr and rasterio) by specifying the engine kwarg.

Yes exactly, time does fly (half a year has gone by already!).

Currently I'm trying to piggyback Zarr into test_openmfdataset_manyfiles from #1983, ~~and am having trouble finding out why opening Zarr stores via open_mfdataset doesn't return a dask-backed array like the other engines (Edit: it only happens when chunks is None, see https://github.com/pydata/xarray/pull/4187#discussion_r448734418). Might need to spend another day digging through the code to see if this is expected behaviour.~~ Edit: got a workaround in b3d6a6a46f8ead25b6f7f593f7b46f43a4de650c by using chunks="auto", as was the default in open_zarr.

> As a note I'm working on implementing zarr spec v3 in zarr-python, still deciding how we want to handle the new spec/API.
>
> If there are any changes that you would like or dislike in an API, feedback is welcome.

Thanks for chipping in @Carreau! I'm sure the community will have some useful suggestions. Just cross-referencing https://zarr-developers.github.io/zarr/specs/2019/06/19/zarr-v3-update.html so others can get a better feel for where things are at.

Comment 652260859 · weiji14 · CONTRIBUTOR · created 2020-07-01T08:02:14Z · updated 2020-07-01T22:23:27Z
https://github.com/pydata/xarray/pull/4187#issuecomment-652260859

> I wonder if it's really worth deprecating open_zarr(). open_dataset(..., engine='zarr') is a bit more verbose, especially with backend_kwargs to pass optional arguments. It seems pretty harmless to keep open_zarr() around, especially if it's just an alias for open_dataset(engine='zarr').

Depends on which line in the Zen of Python you want to follow - "Simple is better than complex", or "There should be one-- and preferably only one --obvious way to do it". From a maintenance perspective, it's balancing the cost of a deprecation cycle versus the cost of writing code that tests both entry points, I guess.

> We could also automatically detect zarr stores in open_dataset without requiring engine='zarr' if:
>
>   1. the argument inherits from collections.abc.Mapping, and
>   2. it contains a key '.zgroup', corresponding to zarr metadata.

> As for the annoyance of needing to write backend_kwargs={"consolidated": True}, I wonder if we could detect this automatically by checking for the existence of a .zmetadata key? This would add a small amount of overhead (one file access) but this probably would not be prohibitively expensive.

These are some pretty good ideas. I also wonder if there's a way to mimic the dataset identifiers like in rasterio, something like xr.open_dataset("zarr:some_zarrfile.zarr"). Feels a lot more like fsspec's url chaining too.

A counter-argument would be that the cyclomatic complexity of open_dataset is already too high, and it really should be refactored before adding more 'magic', especially if new backend engines come online (e.g. #4142).

Comment 652625539 · Carreau · CONTRIBUTOR · 2020-07-01T20:17:36Z
https://github.com/pydata/xarray/pull/4187#issuecomment-652625539

As a note I'm working on implementing zarr spec v3 in zarr-python, still deciding how we want to handle the new spec/API.

If there are any changes that you would like or dislike in an API, feedback is welcome.

Comment 652482831 · dcherian · MEMBER · 2020-07-01T15:20:39Z
https://github.com/pydata/xarray/pull/4187#issuecomment-652482831

> I wonder if it's really worth deprecating open_zarr(). open_dataset(..., engine='zarr') is a bit more verbose, especially with backend_kwargs to pass optional arguments. It seems pretty harmless to keep open_zarr() around, especially if it's just an alias for open_dataset(engine='zarr').

I agree. I think we should keep open_zarr around.

As a reminder (because it took me a while to remember!), one goal with this refactor is to have open_mfdataset work with all backends (including zarr and rasterio) by specifying the engine kwarg (see the sketch below).
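
In other words, after this refactor something like the following should work (a sketch: the store names and concat dimension are hypothetical and depend on how the stores are laid out):

```python
import xarray as xr

ds = xr.open_mfdataset(
    ["part1.zarr", "part2.zarr"],  # hypothetical zarr stores
    engine="zarr",                 # select the backend explicitly
    combine="nested",
    concat_dim="time",
)
```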

Comment 652206442 · shoyer · MEMBER · created 2020-07-01T05:49:51Z · updated 2020-07-01T05:50:31Z
https://github.com/pydata/xarray/pull/4187#issuecomment-652206442

As for the annoyance of needing to write backend_kwargs={"consolidated": True}, I wonder if we could detect this automatically by checking for the existence of a .zmetadata key? This would add a small amount of overhead (one file access) but this probably would not be prohibitively expensive.
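
A sketch of that check, assuming a Mapping-style store (hypothetical helper, not xarray's actual code):

```python
def has_consolidated_metadata(store) -> bool:
    # consolidated zarr stores keep all metadata under a single ".zmetadata"
    # key; for remote stores this membership test costs one object/file access
    return ".zmetadata" in store
```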

Comment 652204634 · shoyer · MEMBER · created 2020-07-01T05:44:20Z · updated 2020-07-01T05:44:32Z
https://github.com/pydata/xarray/pull/4187#issuecomment-652204634

We could also automatically detect zarr stores in open_dataset without requiring engine='zarr' (see the sketch after this list) if:

  1. the argument inherits from collections.abc.Mapping, and
  2. it contains a key '.zgroup', corresponding to zarr metadata.
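
A minimal sketch of that heuristic (hypothetical helper, not xarray's actual code):

```python
from collections.abc import Mapping

def looks_like_zarr_store(obj) -> bool:
    # zarr group metadata lives under the ".zgroup" key of the store mapping
    return isinstance(obj, Mapping) and ".zgroup" in obj
```
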
Comment 652202093 · shoyer · MEMBER · 2020-07-01T05:36:00Z
https://github.com/pydata/xarray/pull/4187#issuecomment-652202093

I wonder if it's really worth deprecating open_zarr(). open_dataset(..., engine='zarr') is a bit more verbose, especially with backend_kwargs to pass optional arguments. It seems pretty harmless to keep open_zarr() around, especially if it's just an alias for open_dataset(engine='zarr').

Comment 652104356 · weiji14 · CONTRIBUTOR · created 2020-06-30T23:42:58Z · updated 2020-07-01T03:04:11Z
https://github.com/pydata/xarray/pull/4187#issuecomment-652104356

Four more failures, something to do with dask? Seems related to #3919 and #3921.

  • [ ] TestZarrDictStore.test_vectorized_indexing - IndexError: only slices with step >= 1 are supported
  • [x] TestZarrDictStore.test_manual_chunk - ZeroDivisionError: integer division or modulo by zero
  • [ ] TestZarrDirectoryStore.test_vectorized_indexing - IndexError: only slices with step >= 1 are supported
  • [x] TestZarrDirectoryStore.test_manual_chunk - ZeroDivisionError: integer division or modulo by zero

Edit: Fixed the ZeroDivisionError in 6fbeadf41a1a547383da0c8f4499c99099dbdf97. The IndexError was fixed in a hacky way though, see https://github.com/pydata/xarray/pull/4187#discussion_r448077275.
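
The IndexError can be reproduced against zarr directly, outside of xarray (a minimal sketch; the full pytest output follows below):

```python
import numpy as np
import zarr

z = zarr.array(np.arange(10), chunks=5)
z[2:8]              # fine: basic selection with a positive step
z[slice(8, 2, -1)]  # raises IndexError: only slices with step >= 1 are supported
```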

```python-traceback
=================================== FAILURES ===================================
__________________ TestZarrDictStore.test_vectorized_indexing __________________

self = <xarray.tests.test_backends.TestZarrDictStore object at 0x7f5832433940>

    @pytest.mark.xfail(
        not has_dask,
        reason="the code for indexing without dask handles negative steps in slices incorrectly",
    )
    def test_vectorized_indexing(self):
        in_memory = create_test_data()
        with self.roundtrip(in_memory) as on_disk:
            indexers = {
                "dim1": DataArray([0, 2, 0], dims="a"),
                "dim2": DataArray([0, 2, 3], dims="a"),
            }
            expected = in_memory.isel(**indexers)
            actual = on_disk.isel(**indexers)
            # make sure the array is not yet loaded into memory
            assert not actual["var1"].variable._in_memory
            assert_identical(expected, actual.load())
            # do it twice, to make sure we're switched from
            # vectorized -> numpy when we cached the values
            actual = on_disk.isel(**indexers)
            assert_identical(expected, actual)

        def multiple_indexing(indexers):
            # make sure a sequence of lazy indexings certainly works.
            with self.roundtrip(in_memory) as on_disk:
                actual = on_disk["var3"]
                expected = in_memory["var3"]
                for ind in indexers:
                    actual = actual.isel(**ind)
                    expected = expected.isel(**ind)
                # make sure the array is not yet loaded into memory
                assert not actual.variable._in_memory
                assert_identical(expected, actual.load())

        # two-staged vectorized-indexing
        indexers = [
            {
                "dim1": DataArray([[0, 7], [2, 6], [3, 5]], dims=["a", "b"]),
                "dim3": DataArray([[0, 4], [1, 3], [2, 2]], dims=["a", "b"]),
            },
            {"a": DataArray([0, 1], dims=["c"]), "b": DataArray([0, 1], dims=["c"])},
        ]
        multiple_indexing(indexers)

        # vectorized-slice mixed
        indexers = [
            {
                "dim1": DataArray([[0, 7], [2, 6], [3, 5]], dims=["a", "b"]),
                "dim3": slice(None, 10),
            }
        ]
        multiple_indexing(indexers)

        # vectorized-integer mixed
        indexers = [
            {"dim3": 0},
            {"dim1": DataArray([[0, 7], [2, 6], [3, 5]], dims=["a", "b"])},
            {"a": slice(None, None, 2)},
        ]
        multiple_indexing(indexers)

        # vectorized-integer mixed
        indexers = [
            {"dim3": 0},
            {"dim1": DataArray([[0, 7], [2, 6], [3, 5]], dims=["a", "b"])},
            {"a": 1, "b": 0},
        ]
        multiple_indexing(indexers)

        # with negative step slice.
        indexers = [
            {
                "dim1": DataArray([[0, 7], [2, 6], [3, 5]], dims=["a", "b"]),
                "dim3": slice(-1, 1, -1),
            }
        ]
>       multiple_indexing(indexers)

xarray/tests/test_backends.py:686:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
xarray/tests/test_backends.py:642: in multiple_indexing
    assert_identical(expected, actual.load())
xarray/core/dataarray.py:814: in load
    ds = self._to_temp_dataset().load(**kwargs)
xarray/core/dataset.py:666: in load
    v.load()
xarray/core/variable.py:381: in load
    self._data = np.asarray(self._data)
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/numpy/core/numeric.py:501: in asarray
    return array(a, dtype, copy=False, order=order)
xarray/core/indexing.py:677: in __array__
    self._ensure_cached()
xarray/core/indexing.py:674: in _ensure_cached
    self.array = NumpyIndexingAdapter(np.asarray(self.array))
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/numpy/core/numeric.py:501: in asarray
    return array(a, dtype, copy=False, order=order)
xarray/core/indexing.py:653: in __array__
    return np.asarray(self.array, dtype=dtype)
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/numpy/core/numeric.py:501: in asarray
    return array(a, dtype, copy=False, order=order)
xarray/core/indexing.py:557: in __array__
    return np.asarray(array[self.key], dtype=None)
xarray/backends/zarr.py:57: in __getitem__
    return array[key.tuple]
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/zarr/core.py:572: in __getitem__
    return self.get_basic_selection(selection, fields=fields)
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/zarr/core.py:698: in get_basic_selection
    fields=fields)
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/zarr/core.py:738: in _get_basic_selection_nd
    indexer = BasicIndexer(selection, self)
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/zarr/indexing.py:279: in __init__
    dim_indexer = SliceDimIndexer(dim_sel, dim_len, dim_chunk_len)
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/zarr/indexing.py:107: in __init__
    err_negative_step()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    def err_negative_step():
>       raise IndexError('only slices with step >= 1 are supported')
E       IndexError: only slices with step >= 1 are supported

/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/zarr/errors.py:55: IndexError
_____________________ TestZarrDictStore.test_manual_chunk ______________________

self = <xarray.tests.test_backends.TestZarrDictStore object at 0x7f5832b80cf8>

    @requires_dask
    @pytest.mark.filterwarnings("ignore:Specified Dask chunks")
    def test_manual_chunk(self):
        original = create_test_data().chunk({"dim1": 3, "dim2": 4, "dim3": 3})

        # All of these should return non-chunked arrays
        NO_CHUNKS = (None, 0, {})
        for no_chunk in NO_CHUNKS:
            open_kwargs = {"chunks": no_chunk}
>           with self.roundtrip(original, open_kwargs=open_kwargs) as actual:

xarray/tests/test_backends.py:1594:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/contextlib.py:81: in __enter__
    return next(self.gen)
xarray/tests/test_backends.py:1553: in roundtrip
    with self.open(store_target, **open_kwargs) as ds:
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/contextlib.py:81: in __enter__
    return next(self.gen)
xarray/tests/test_backends.py:1540: in open
    with xr.open_dataset(store_target, engine="zarr", **kwargs) as ds:
xarray/backends/api.py:587: in open_dataset
    ds = maybe_decode_store(store, chunks)
xarray/backends/api.py:511: in maybe_decode_store
    for k, v in ds.variables.items()
xarray/backends/api.py:511: in <dictcomp>
    for k, v in ds.variables.items()
xarray/backends/zarr.py:398: in maybe_chunk
    var = var.chunk(chunk_spec, name=name2, lock=None)
xarray/core/variable.py:1007: in chunk
    data = da.from_array(data, chunks, name=name, lock=lock, **kwargs)
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/dask/array/core.py:2712: in from_array
    chunks, x.shape, dtype=x.dtype, previous_chunks=previous_chunks
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/dask/array/core.py:2447: in normalize_chunks
    (),
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/dask/array/core.py:2445: in <genexpr>
    for s, c in zip(shape, chunks)
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/dask/array/core.py:954: in blockdims_from_blockshape
    for d, bd in zip(shape, chunks)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

.0 = <zip object at 0x7f58332d9d48>

        ((bd,) * (d // bd) + ((d % bd,) if d % bd else ()) if d else (0,))
>       for d, bd in zip(shape, chunks)
    )
E   ZeroDivisionError: integer division or modulo by zero

/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/dask/array/core.py:954: ZeroDivisionError
_______________ TestZarrDirectoryStore.test_vectorized_indexing ________________

self = <xarray.tests.test_backends.TestZarrDirectoryStore object at 0x7f5832a08a20>

    @pytest.mark.xfail(
        not has_dask,
        reason="the code for indexing without dask handles negative steps in slices incorrectly",
    )
    def test_vectorized_indexing(self):
        in_memory = create_test_data()
        with self.roundtrip(in_memory) as on_disk:
            indexers = {
                "dim1": DataArray([0, 2, 0], dims="a"),
                "dim2": DataArray([0, 2, 3], dims="a"),
            }
            expected = in_memory.isel(**indexers)
            actual = on_disk.isel(**indexers)
            # make sure the array is not yet loaded into memory
            assert not actual["var1"].variable._in_memory
            assert_identical(expected, actual.load())
            # do it twice, to make sure we're switched from
            # vectorized -> numpy when we cached the values
            actual = on_disk.isel(**indexers)
            assert_identical(expected, actual)

        def multiple_indexing(indexers):
            # make sure a sequence of lazy indexings certainly works.
            with self.roundtrip(in_memory) as on_disk:
                actual = on_disk["var3"]
                expected = in_memory["var3"]
                for ind in indexers:
                    actual = actual.isel(**ind)
                    expected = expected.isel(**ind)
                # make sure the array is not yet loaded into memory
                assert not actual.variable._in_memory
                assert_identical(expected, actual.load())

        # two-staged vectorized-indexing
        indexers = [
            {
                "dim1": DataArray([[0, 7], [2, 6], [3, 5]], dims=["a", "b"]),
                "dim3": DataArray([[0, 4], [1, 3], [2, 2]], dims=["a", "b"]),
            },
            {"a": DataArray([0, 1], dims=["c"]), "b": DataArray([0, 1], dims=["c"])},
        ]
        multiple_indexing(indexers)

        # vectorized-slice mixed
        indexers = [
            {
                "dim1": DataArray([[0, 7], [2, 6], [3, 5]], dims=["a", "b"]),
                "dim3": slice(None, 10),
            }
        ]
        multiple_indexing(indexers)

        # vectorized-integer mixed
        indexers = [
            {"dim3": 0},
            {"dim1": DataArray([[0, 7], [2, 6], [3, 5]], dims=["a", "b"])},
            {"a": slice(None, None, 2)},
        ]
        multiple_indexing(indexers)

        # vectorized-integer mixed
        indexers = [
            {"dim3": 0},
            {"dim1": DataArray([[0, 7], [2, 6], [3, 5]], dims=["a", "b"])},
            {"a": 1, "b": 0},
        ]
        multiple_indexing(indexers)

        # with negative step slice.
        indexers = [
            {
                "dim1": DataArray([[0, 7], [2, 6], [3, 5]], dims=["a", "b"]),
                "dim3": slice(-1, 1, -1),
            }
        ]
>       multiple_indexing(indexers)

xarray/tests/test_backends.py:686:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
xarray/tests/test_backends.py:642: in multiple_indexing
    assert_identical(expected, actual.load())
xarray/core/dataarray.py:814: in load
    ds = self._to_temp_dataset().load(**kwargs)
xarray/core/dataset.py:666: in load
    v.load()
xarray/core/variable.py:381: in load
    self._data = np.asarray(self._data)
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/numpy/core/numeric.py:501: in asarray
    return array(a, dtype, copy=False, order=order)
xarray/core/indexing.py:677: in __array__
    self._ensure_cached()
xarray/core/indexing.py:674: in _ensure_cached
    self.array = NumpyIndexingAdapter(np.asarray(self.array))
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/numpy/core/numeric.py:501: in asarray
    return array(a, dtype, copy=False, order=order)
xarray/core/indexing.py:653: in __array__
    return np.asarray(self.array, dtype=dtype)
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/numpy/core/numeric.py:501: in asarray
    return array(a, dtype, copy=False, order=order)
xarray/core/indexing.py:557: in __array__
    return np.asarray(array[self.key], dtype=None)
xarray/backends/zarr.py:57: in __getitem__
    return array[key.tuple]
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/zarr/core.py:572: in __getitem__
    return self.get_basic_selection(selection, fields=fields)
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/zarr/core.py:698: in get_basic_selection
    fields=fields)
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/zarr/core.py:738: in _get_basic_selection_nd
    indexer = BasicIndexer(selection, self)
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/zarr/indexing.py:279: in __init__
    dim_indexer = SliceDimIndexer(dim_sel, dim_len, dim_chunk_len)
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/zarr/indexing.py:107: in __init__
    err_negative_step()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    def err_negative_step():
>       raise IndexError('only slices with step >= 1 are supported')
E       IndexError: only slices with step >= 1 are supported

/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/zarr/errors.py:55: IndexError
___________________ TestZarrDirectoryStore.test_manual_chunk ___________________

self = <xarray.tests.test_backends.TestZarrDirectoryStore object at 0x7f5831763ef0>

    @requires_dask
    @pytest.mark.filterwarnings("ignore:Specified Dask chunks")
    def test_manual_chunk(self):
        original = create_test_data().chunk({"dim1": 3, "dim2": 4, "dim3": 3})

        # All of these should return non-chunked arrays
        NO_CHUNKS = (None, 0, {})
        for no_chunk in NO_CHUNKS:
            open_kwargs = {"chunks": no_chunk}
>           with self.roundtrip(original, open_kwargs=open_kwargs) as actual:

xarray/tests/test_backends.py:1594:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/contextlib.py:81: in __enter__
    return next(self.gen)
xarray/tests/test_backends.py:1553: in roundtrip
    with self.open(store_target, **open_kwargs) as ds:
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/contextlib.py:81: in __enter__
    return next(self.gen)
xarray/tests/test_backends.py:1540: in open
    with xr.open_dataset(store_target, engine="zarr", **kwargs) as ds:
xarray/backends/api.py:587: in open_dataset
    ds = maybe_decode_store(store, chunks)
xarray/backends/api.py:511: in maybe_decode_store
    for k, v in ds.variables.items()
xarray/backends/api.py:511: in <dictcomp>
    for k, v in ds.variables.items()
xarray/backends/zarr.py:398: in maybe_chunk
    var = var.chunk(chunk_spec, name=name2, lock=None)
xarray/core/variable.py:1007: in chunk
    data = da.from_array(data, chunks, name=name, lock=lock, **kwargs)
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/dask/array/core.py:2712: in from_array
    chunks, x.shape, dtype=x.dtype, previous_chunks=previous_chunks
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/dask/array/core.py:2447: in normalize_chunks
    (),
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/dask/array/core.py:2445: in <genexpr>
    for s, c in zip(shape, chunks)
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/dask/array/core.py:954: in blockdims_from_blockshape
    for d, bd in zip(shape, chunks)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

.0 = <zip object at 0x7f58324b7f48>

        ((bd,) * (d // bd) + ((d % bd,) if d % bd else ()) if d else (0,))
>       for d, bd in zip(shape, chunks)
    )
E   ZeroDivisionError: integer division or modulo by zero

/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/dask/array/core.py:954: ZeroDivisionError
```
Comment 651772649 · weiji14 · CONTRIBUTOR · 2020-06-30T12:56:00Z
https://github.com/pydata/xarray/pull/4187#issuecomment-651772649

Is it ok to drop the deprecated auto_chunk tests here in this PR (or leave it to another PR)? The deprecation warning was first added in https://github.com/pydata/xarray/pull/2530/commits/ae4cf0ab19b3e563bde90a48b3e6ee615930d4a1, and I see that auto_chunk was used back in v0.12.1 at http://xarray.pydata.org/en/v0.12.1/generated/xarray.open_zarr.html.

Comment 651662701 · weiji14 · CONTRIBUTOR · created 2020-06-30T09:03:53Z · updated 2020-06-30T09:24:12Z
https://github.com/pydata/xarray/pull/4187#issuecomment-651662701

Nevermind, I found it. There was an if that should have been an elif. Onward to the next error - UnboundLocalError. Edit: Also fixed!

```python-traceback
=================================== FAILURES ===================================
__________________________ TestDataset.test_lazy_load __________________________

self = <xarray.tests.test_dataset.TestDataset object at 0x7f4aed5df940>

    def test_lazy_load(self):
        store = InaccessibleVariableDataStore()
        create_test_data().dump_to_store(store)
        for decode_cf in [True, False]:
>           ds = open_dataset(store, decode_cf=decode_cf)

xarray/tests/test_dataset.py:4188:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
xarray/backends/api.py:587: in open_dataset
    ds = maybe_decode_store(store)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

store = <xarray.tests.test_dataset.InaccessibleVariableDataStore object at 0x7f4aed5dfb38>
lock = False

    def maybe_decode_store(store, lock=False):
        ds = conventions.decode_cf(
            store,
            mask_and_scale=mask_and_scale,
            decode_times=decode_times,
            concat_characters=concat_characters,
            decode_coords=decode_coords,
            drop_variables=drop_variables,
            use_cftime=use_cftime,
            decode_timedelta=decode_timedelta,
        )

        _protect_dataset_variables_inplace(ds, cache)

>       if chunks is not None:
E       UnboundLocalError: local variable 'chunks' referenced before assignment

xarray/backends/api.py:466: UnboundLocalError
```
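
Reduced to a toy example, the UnboundLocalError pattern in the traceback above looks like this (hypothetical code, not xarray's actual api.py):

```python
def maybe_decode_store(store, engine=None):
    if engine == "zarr":
        chunks = "auto"     # 'chunks' is bound only on the zarr branch
    # ... decoding happens here ...
    if chunks is not None:  # raises UnboundLocalError when engine != "zarr"
        return store, chunks
    return store, None

maybe_decode_store({}, engine="zarr")  # fine
maybe_decode_store({})                 # UnboundLocalError: local variable 'chunks' referenced before assignment
```
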
Comment 651624166 · weiji14 · CONTRIBUTOR · created 2020-06-30T08:02:03Z · updated 2020-06-30T09:23:32Z
https://github.com/pydata/xarray/pull/4187#issuecomment-651624166

This is the one test failure (AttributeError) on Linux py36-bare-minimum:

```python-traceback
=================================== FAILURES ===================================
__________________________ TestDataset.test_lazy_load __________________________

self = <xarray.tests.test_dataset.TestDataset object at 0x7fa80b2b7be0>

    def test_lazy_load(self):
        store = InaccessibleVariableDataStore()
        create_test_data().dump_to_store(store)
        for decode_cf in [True, False]:
>           ds = open_dataset(store, decode_cf=decode_cf)

xarray/tests/test_dataset.py:4188:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
xarray/backends/api.py:578: in open_dataset
    engine = _get_engine_from_magic_number(filename_or_obj)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

filename_or_obj = <xarray.tests.test_dataset.InaccessibleVariableDataStore object at 0x7fa80b2b7d30>

    def _get_engine_from_magic_number(filename_or_obj):
        # check byte header to determine file type
        if isinstance(filename_or_obj, bytes):
            magic_number = filename_or_obj[:8]
        else:
>           if filename_or_obj.tell() != 0:
E           AttributeError: 'InaccessibleVariableDataStore' object has no attribute 'tell'

xarray/backends/api.py:116: AttributeError
```

Been scratching my head debugging this one. There doesn't seem to be an obvious reason why this test is failing, since 1) this test isn't for Zarr and 2) this test shouldn't be affected by the new if blocks checking if engine=="zarr". Will need to double-check the logic here.
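
One way to see the failure: data-store objects are not file-like, so any byte-sniffing path has to be guarded before calling tell(). A defensive variant might look like this (a sketch under that assumption, not the fix that actually landed):

```python
def read_magic_number(filename_or_obj):
    # hypothetical defensive version of the check in the traceback above
    if isinstance(filename_or_obj, bytes):
        return filename_or_obj[:8]
    if hasattr(filename_or_obj, "tell") and hasattr(filename_or_obj, "read"):
        if filename_or_obj.tell() != 0:
            raise ValueError("cannot guess engine: file has already been read")
        magic_number = filename_or_obj.read(8)
        filename_or_obj.seek(0)  # rewind so the backend can read from the start
        return magic_number
    raise TypeError(f"cannot sniff a magic number from {type(filename_or_obj)}")
```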



CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);