commits: 5654aee927586c2dcbc3f34d674ed5c9646326c1

sha: 5654aee927586c2dcbc3f34d674ed5c9646326c1
author_date: 2020-09-22T05:40:30Z
committer_date: 2020-09-22T05:40:30Z
raw_author: 8cff2dcfaf8d786a961d5c90749c62ed7daaa008
raw_committer: cd792325681cbad9f663f2879d8b69f1edbb678f
repo: 13221727
author: 23487320
committer: 19864447

message:

Xarray open_mfdataset with engine Zarr (#4187)

* create def for multiple zarr files and added commentary/definition, which matches almost exactly that of ``xr.open_mfdataset``, but without ``engine``
* just as with ``xr.open_mfdataset``, identify the paths as local directory paths/strings
* added error if no path
* finished copying similar code from ``xr.open_mfdataset``
* remove blank lines
* fixed typo
* added ``xr.open_mzarr()`` to the list of available modules to call
* imported missing function
* imported missing glob
* imported function from backend.api
* imported function to facilitate mzarr
* correctly imported functions from core to mzarr
* imported to use on open_mzarr
* removed lock and autoclose since not taken by ``open_zarr``
* fixed typo
* class is not needed since zarr stores don't remain open
* removed old behavior
* set default
* listed open_mzarr
* removed unused imported function
* imported Path - hadn't before
* remove unnecessary comments
* modified comments
* isorted zarr
* isorted
* erased open_mzarr; added capability to open_dataset to open zarr files
* removed imported but unused
* comment to `zarr` engine
* added chunking code from `open_zarr`
* remove import `open_mzarr`
* removed `open_mzarr` from top-level function
* missing return in nested function
* moved outside of nested function; had trouble with reading before assignment
* added missing argument associated with zarr stores to the definition of open_dataset
* isort zarr.py
* removed blank lines, fixed typo on `chunks`
* removed imported but unused
* restored conditional for `auto`
* removed imported but unused `dask.array`
* added capability for file_or_obj to be a MutableMapping such as `fsspec.get_mapper`, and thus compatible with `intake-xarray`
* moved to a different conditional since file_or_obj is a MutableMapping, not a str, Path, or AbstractDataStore
* isort api.py
* restored the option for when file_or_obj is a str, such as a URL
* fixed relabel
* update open_dataset for zarr files
* remove open_zarr from tests, now open_dataset(engine="zarr")
* remove extra file, and raise deprecation warning on open_zarr
* added internal call to open_dataset from deprecated open_zarr
* defined engine="zarr"
* correct argument for open_dataset
* pass arguments as backend_kwargs
* pass backend_kwargs as argument
* typo
* set `overwrite_encoded_chunks` as backend_kwargs
* do not pass as backend, use for chunking
* removed commented code
* moved definitions to zarr backends
* Ensure class functions have necessary variables. Was missing some 'self' and other kwarg variables. Also linted using black.
* Combine MutableMapping and Zarr engine condition, as per https://github.com/pydata/xarray/pull/4003#discussion_r441978720.
* Pop out overwrite_encoded_chunks after shallow copy of backend_kwargs dict. Don't pop the backend_kwargs dict, as per https://github.com/pydata/xarray/pull/4003#discussion_r441979810; make a shallow copy of the backend_kwargs dictionary first. Also removed `overwrite_encoded_chunks` as a top-level kwarg of `open_dataset`; instead, pass it to `backend_kwargs` when using engine="zarr".
* Fix some errors noticed by PEP8
* Reorganize code in backends api.py and actually test using engine zarr. The merge at 1977ba16147f6c0dfaac8f9f720698b622a5acfd wasn't done very well. Reorganized the logic of the code to reduce the diff with xarray master, and ensured that the zarr backend tests actually have engine="zarr" in them.
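The commits above settle on routing Zarr-only options through `backend_kwargs` rather than as top-level kwargs of `open_dataset`. A minimal sketch of that call shape, assuming a hypothetical local store at `./example.zarr` (any directory path, URL string, or MutableMapping such as `fsspec.get_mapper(...)` should be accepted per the commits):

```python
import xarray as xr

# Hypothetical local Zarr store; a MutableMapping like
# fsspec.get_mapper("s3://bucket/data.zarr") should also work.
store = "./example.zarr"

# Zarr-specific options travel inside backend_kwargs;
# overwrite_encoded_chunks is one such option moved there by this PR.
ds = xr.open_dataset(
    store,
    engine="zarr",
    chunks="auto",  # needs dask; the commits turn "auto" into None when dask is absent
    backend_kwargs={"overwrite_encoded_chunks": True},
)
```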
* Add back missing decode_timedelta kwarg
* Add back a missing engine="zarr" to test_distributed.py
* Ensure conditional statements make sense
* Fix UnboundLocalError on 'chunks' referenced before assignment. Need to pass chunks in to maybe_decode_store to resolve UnboundLocalError: local variable 'chunks' referenced before assignment.
* Run isort to fix import order
* Fix tests where kwargs need to be inside the backend_kwargs dict now. Also temporarily silence deprecate_auto_chunk tests using pytest.raises(TypeError); may remove those fully later.
* Change open_zarr to open_dataset with engine="zarr" in io.rst
* Fix test_distributed by wrapping consolidated in backend_kwargs dict. Patches cb6d06606a9f5a9418da57006c8e976d3d362def.
* Ensure read-only mode when using open_dataset with engine="zarr"
* Turn chunks from "auto" to None if dask is not available
* Add back a missing else statement in maybe_chunk
* Allow xfail test_vectorized_indexing when has_dask, instead of when not has_dask
* Typo on chunks arg in open_dataset
* Fix ZeroDivisionError by adding back check that chunks is not False. Yet another if-statement that wasn't properly transferred from zarr.py to api.py.
* Fix a typo that was causing TypeError: 'method' object is not iterable
* Move the `if not chunks` block to after auto detect. Patches logic of 6fbeadf41a1a547383da0c8f4499c99099dbdf97 to fix errors when dask is not installed.
* Revert "Allow xfail test_vectorized_indexing when has_dask". This reverts commit aca2012fb5f46e839c980781b50e8bf8b0562ed0.
* Temporarily xfail test_vectorized_indexing with or without dask
* Put zarr in open_mfdataset engine list
* Test open_mfdataset_manyfiles with engine zarr. Zarr objects are folders, which seem to cause issues with closing, so added a try-except to api.py to catch failures in f.close(). Some tests fail when chunks=None because a numpy array is returned instead of a dask array.
* Remember to set a ._file_obj when using Zarr. Yet another logic error fixed; resolves the try-except hack in b9a239eff23378015896191c5ad237733a4795bd.
* Expect np.ndarray when using open_mfdataset on Zarr with chunks None
* Add an entry to what's new for open_mfdataset with Zarr engine, plus a small formatting fix in the open_mfdataset docstring
* Make zarr engine's custom chunk mechanism more in line with ds.chunk. Slightly edited the token name string to start with 'xarray' and to include chunks in tokenize. Also replaced the deprecated `_replace_vars_and_dims` method with just `_replace`.
* Workaround problem where dask arrays aren't returned when chunks is None. Reverts 827e546155a157f64dfe1585bf09ad733bc52543 and works around it to get dask arrays by fixing some if-then logic in the code when `engine="zarr"` is involved. Things work fine when using chunks="auto", perhaps because the `import dask.array` try-block is needed to trigger loading into dask arrays. Also removed chunks="auto" in some Zarr tests to simplify.
* Default to chunks="auto" for Zarr tests to fix test_vectorized_indexing. Reverts the hack in 6b99225fc17fe7c51423b30c66914709e5239a05, as test_vectorized_indexing now works on dask, specifically the negative slices test. It will still fail without dask, as was the behaviour before. The solution was to set chunks="auto" as the default when testing using `open_dataset` with `engine="zarr"`, similar to the default for `open_zarr`. Reverted some aspects of dce4e7cd1fcf35fb7d3293bf6cc410646b588c64 to ensure this chunks="auto" setting is visible throughout the Zarr test suite.
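With zarr added to the `open_mfdataset` engine list and `consolidated` wrapped in `backend_kwargs` (per the test_distributed fix above), a multi-store open might look like the following sketch; the paths and the `combine`/`concat_dim` values are illustrative assumptions, not from the PR:

```python
import xarray as xr

# Hypothetical per-month Zarr stores to be concatenated along "time".
paths = ["./part-2020-01.zarr", "./part-2020-02.zarr"]

ds = xr.open_mfdataset(
    paths,
    engine="zarr",                           # zarr is now a valid open_mfdataset engine
    combine="nested",
    concat_dim="time",
    backend_kwargs={"consolidated": True},   # Zarr-only option, wrapped in backend_kwargs
)
```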
* Fix test by passing chunk_store in to backend_kwargs
* Revert "Change open_zarr to open_dataset with engine="zarr" in io.rst". This reverts commit cd0b9efe5dd573b4234493a1c491dc11b13574cf.
* Remove open_zarr DeprecationWarning. Partially reverts b488363b32705e6bd0b174b927cb129d247f5d69.
* Update open_dataset docstring to specify chunk options for zarr engine
* Let only chunks=None return non-chunked arrays
* Remove for-loop in test_manual_chunk since testing only one no_chunk
* Update open_dataset docstring to remove mention of chunks=None with Zarr

Co-authored-by: Miguel Jimenez-Urias <mjimen17@jh.edu>
Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>
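The chunking behaviour the final commits arrive at (only `chunks=None` returns non-chunked arrays) can be checked with a short sketch, again assuming a hypothetical `./example.zarr` store with at least one data variable, and dask installed:

```python
import numpy as np
import xarray as xr

# Only chunks=None yields eager, NumPy-backed variables; chunks="auto"
# (or explicit chunk sizes) yields dask-backed variables when dask is present.
eager = xr.open_dataset("./example.zarr", engine="zarr", chunks=None)
lazy = xr.open_dataset("./example.zarr", engine="zarr", chunks="auto")

first_eager = next(iter(eager.data_vars.values()))
first_lazy = next(iter(lazy.data_vars.values()))
assert isinstance(first_eager.data, np.ndarray)      # no chunking requested
assert not isinstance(first_lazy.data, np.ndarray)   # dask array instead
```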