html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/pull/4003#issuecomment-651530759,https://api.github.com/repos/pydata/xarray/issues/4003,651530759,MDEyOklzc3VlQ29tbWVudDY1MTUzMDc1OQ==,8241481,2020-06-30T04:45:42Z,2020-06-30T04:45:42Z,CONTRIBUTOR,"@weiji14 @shoyer Thank you, guys! Sorry it has taken me so long to come back to this PR. I really meant to come back to it, but I got stuck with another, bigger PR that is actually part of my main research project. Anyway, the help is much appreciated, cheers!!
- Since I am a novice at this, on my end, should I close this PR?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,606683601
https://github.com/pydata/xarray/pull/4003#issuecomment-620943840,https://api.github.com/repos/pydata/xarray/issues/4003,620943840,MDEyOklzc3VlQ29tbWVudDYyMDk0Mzg0MA==,8241481,2020-04-29T01:43:43Z,2020-04-29T01:44:46Z,CONTRIBUTOR,"Following your advice, ```open_dataset``` can now open ``zarr`` files. This is done with:
```python
ds = xarray.open_dataset(store, engine=""zarr"", chunks=""auto"")
```
NOTE: ``xr.open_dataset`` has ``chunks=None`` by default, whereas it used to be ``chunks=""auto""`` on ``xarray.open_zarr``.
**Additional feature**: As a result of these changes, ``open_mfdataset`` can now automatically open multiple zarr stores (e.g. in parallel) when given a glob. That is,
```python
paths='directory_name/*/subdirectory_name/*'
ds = xarray.open_mfdataset(paths, engine=""zarr"", chunks=""auto"", concat_dim=""time"", combine=""nested"")
```
does yield the desired behavior.
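For reference, here is a minimal sketch of what a glob like the one above expands to, using only the standard library. The directory layout, the ``run_*`` and ``t*.zarr`` names, and the ``expand_store_glob`` helper are all made up for illustration. I am also assuming the matches get sorted before they are concatenated, so treat the ordering as an assumption and not as a statement about the internals.

```python
import glob
import os
import tempfile


def expand_store_glob(pattern):
    # Hypothetical helper: return every match, sorted so the order is reproducible.
    return sorted(glob.glob(pattern))


# Throwaway layout that mimics 'directory_name/*/subdirectory_name/*'.
root = tempfile.mkdtemp()
for run in ('run_b', 'run_a'):
    for store in ('t1.zarr', 't0.zarr'):
        os.makedirs(os.path.join(root, 'directory_name', run, 'subdirectory_name', store))

pattern = os.path.join(root, 'directory_name', '*', 'subdirectory_name', '*')
stores = expand_store_glob(pattern)

# Every match is a directory (one per store), not a single file.
relative = [os.path.relpath(path, root) for path in stores]
```

With ``combine='nested'`` the order of the inputs decides the order along ``concat_dim``, so it is worth checking the expanded list before opening anything.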
This differs from ``intake-xarray``, which uses ```fsspec.open_local``` for netcdf files and ```fsspec.get_mapper``` for zarr stores, so a glob is handled differently in each case. But agreed, that can be addressed in a different PR.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,606683601
https://github.com/pydata/xarray/pull/4003#issuecomment-620133764,https://api.github.com/repos/pydata/xarray/issues/4003,620133764,MDEyOklzc3VlQ29tbWVudDYyMDEzMzc2NA==,8241481,2020-04-27T17:45:25Z,2020-04-27T17:45:25Z,CONTRIBUTOR,"I like this approach (adding the capability to open multiple zarr files to open_mfdataset), as it is the easiest and cleanest. I considered it, and I am glad this is coming up because I wanted to hear different opinions. Two things influenced my decision to keep ```open_mzarr``` separate from ```open_mfdataset```:
1. Zarr stores are inherently different from netcdf files, which becomes more evident when opening multiple files given a glob path (```paths='directory*/subdirectory*/*'```). Zarr stores can be recognized as directories rather than files (as opposed to, e.g., ```paths='directory*/subdirectory*/*.nc'```). This distinction comes into play when, for example, trying to open files (```zarr``` vs ```netcdf```) through ```intake-xarray```. I know this is upstream behavior, but I think it needs to be considered, because my end goal in allowing ```xarray``` to read multiple ```zarr``` files (in parallel) is to use ```intake-xarray``` to read them. The way ```intake-xarray``` opens files (```zarr``` vs others) is again kept separate and uses different functions. That is,
**For netcdf**-files (```intake-xarray/netcdf.py```):
```python
url_path = fsspec.open_local(paths, **kwargs)
```
which can interpret a glob path. The resulting ```url_path``` is then passed to ```xarray.open_mfdataset```.
**For zarr** files (```intake-xarray/xzarr```):
```python
url_path = fsspec.get_mapper(paths, **kwargs)
```
```fsspec.get_mapper``` does not recognize glob paths, and ```fsspec.open_local```, which does recognize globs, cannot detect zarr stores (as these are recognized as directories rather than files with a known extension). See the issue I opened about this behavior: https://github.com/intake/filesystem_spec/issues/286#issue-606019293 (apologies, I am new to GitHub and don't know if this is the correct way to link issues across repositories).
2. Zarr continues to be under development, and it appears that its behavior will rely more heavily on ```fsspec``` in the future. I wonder if such future development is the reason why, even in ```xarray```, ```open_zarr``` lives in a different file from ```open_mfdataset```, which is also the case in ```intake-xarray```.
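To make the directory-vs-file point in item 1 concrete, here is a rough sketch using only the standard library. It assumes that a zarr (v2) directory store has a ``.zgroup``, ``.zarray`` or ``.zmetadata`` file at its root. The helper names ``looks_like_zarr_store`` and ``split_matches`` are hypothetical and are not part of ``xarray``, ``fsspec`` or ``intake-xarray``.

```python
import glob
import os
import tempfile

# Assumption: a zarr (v2) directory store carries one of these files at its root.
ZARR_MARKERS = ('.zgroup', '.zarray', '.zmetadata')


def looks_like_zarr_store(path):
    # A store is a directory, so a file-only check would skip it.
    return os.path.isdir(path) and any(
        os.path.exists(os.path.join(path, marker)) for marker in ZARR_MARKERS
    )


def split_matches(pattern):
    # Separate glob matches into zarr-like directories and regular files.
    stores, files = [], []
    for path in sorted(glob.glob(pattern)):
        if looks_like_zarr_store(path):
            stores.append(path)
        elif os.path.isfile(path):
            files.append(path)
    return stores, files


# Throwaway layout: one zarr-like directory next to one netcdf-like file.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, 'sub', 'a.zarr'))
open(os.path.join(root, 'sub', 'a.zarr', '.zgroup'), 'w').close()
open(os.path.join(root, 'sub', 'b.nc'), 'w').close()

stores, files = split_matches(os.path.join(root, 'sub', '*'))
```

A check along these lines is one way a glob-aware opener could tell stores apart from plain files without relying on the extension.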
I am extremely interested in what people think about ```xarray``` and ```intake-xarray``` compatibility/development when it comes to ```zarr``` files being read in parallel...","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,606683601