

issue_comments


10 rows where author_association = "NONE" and user = 3309802 sorted by updated_at descending



id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1317352980 https://github.com/pydata/xarray/issues/6799#issuecomment-1317352980 https://api.github.com/repos/pydata/xarray/issues/6799 IC_kwDOAMm_X85OhTYU gjoseph92 3309802 2022-11-16T17:00:04Z 2022-11-16T17:00:04Z NONE

The current code also has the unfortunate side-effect of merging all chunks too

Don't really know what I'm talking about here, but it looks to me like the current dask-interpolation routine uses blockwise. That is, it's trying to simply map a function over each chunk in the array. To get the chunks into a structure where this is correct to do, you have to first merge all the chunks along the interpolation axis.

I would have expected interpolation to use map_overlap. You'd add some padding to each chunk, map the interpolation over each chunk (without combining them), then trim off the extra. By using overlap, you don't need to combine all the chunks into one big array first, so the operation can actually be parallel.

FYI, fixing this would probably be a big deal to geospatial people—then you could do array reprojection without GDAL! Unfortunately not something I have time to work on right now, but perhaps someone else would be interested?
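The pad/map/trim idea can be sketched in plain NumPy (a toy illustration of the overlap approach, not dask's actual `map_overlap` machinery; the helper name, chunk size, and halo width are all invented for the example):

```python
import numpy as np

def interp_chunkwise(x, xp, fp, chunk_size, halo):
    """Interpolate target points chunk by chunk: each chunk only needs the
    source points that bracket it, plus a small halo of extra points --
    no merging of all chunks along the interpolation axis."""
    out = []
    for start in range(0, len(x), chunk_size):
        pts = x[start:start + chunk_size]
        # source indices covering this chunk, padded by `halo` points each side
        lo = max(0, np.searchsorted(xp, pts.min()) - halo)
        hi = min(len(xp), np.searchsorted(xp, pts.max()) + halo)
        out.append(np.interp(pts, xp[lo:hi], fp[lo:hi]))
    return np.concatenate(out)

xp = np.linspace(0, 10, 101)   # source grid
fp = np.sin(xp)                # source values
x = np.linspace(0.5, 9.5, 37)  # target points
chunked = interp_chunkwise(x, xp, fp, chunk_size=8, halo=2)
full = np.interp(x, xp, fp)
assert np.allclose(chunked, full)  # per-chunk results match the global interp
```

Because linear interpolation only depends on the bracketing source points, the per-chunk results are identical to the global result, which is what makes the overlap version embarrassingly parallel.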

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  `interp` performance with chunked dimensions 1307112340
1165001097 https://github.com/pydata/xarray/issues/6709#issuecomment-1165001097 https://api.github.com/repos/pydata/xarray/issues/6709 IC_kwDOAMm_X85FcIGJ gjoseph92 3309802 2022-06-23T23:15:19Z 2022-06-23T23:15:19Z NONE

I took a little bit more of a look at this and I don't think root task overproduction is the (only) problem here.

I also feel like intuitively, this operation shouldn't require holding so many root tasks around at once. But the graph dask is making, or how it's ordering it, doesn't seem to work that way. We can see the ordering is pretty bad:

When we actually run it (on https://github.com/dask/distributed/pull/6614 with overproduction fixed), you can see that dask requires keeping tons of the input chunks in memory, because they're going to be needed by a future task that isn't able to run yet (because not all of its inputs have been computed):

I feel like it's possible that the order in which dask is executing the input tasks is bad? But I think it's more likely that I just haven't thought about the problem enough, and there's an obvious reason why the graph is structured like this.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Means of zarr arrays cause a memory overload in dask workers 1277437106
1164690164 https://github.com/pydata/xarray/issues/6709#issuecomment-1164690164 https://api.github.com/repos/pydata/xarray/issues/6709 IC_kwDOAMm_X85Fa8L0 gjoseph92 3309802 2022-06-23T17:37:59Z 2022-06-23T17:37:59Z NONE

FYI @robin-cls I would be a bit surprised if there is anything you can do on your end to fix things here with off-the-shelf dask. What @dcherian mentioned in https://github.com/dask/distributed/issues/6360#issuecomment-1129484190 is probably the only thing that might work. Otherwise you'll need to run one of my experimental branches.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Means of zarr arrays cause a memory overload in dask workers 1277437106
1164660225 https://github.com/pydata/xarray/issues/6709#issuecomment-1164660225 https://api.github.com/repos/pydata/xarray/issues/6709 IC_kwDOAMm_X85Fa04B gjoseph92 3309802 2022-06-23T17:05:12Z 2022-06-23T17:05:12Z NONE

Thanks @dcherian, yeah this is definitely root task overproduction. I think your case is somewhat similar to @TomNicholas's https://github.com/dask/distributed/issues/6571 (that one might even be a little simpler actually).

There's some prototyping going on to address this, but I'd say "soon" is probably on a couple-month timescale right now, FYI.

https://github.com/dask/distributed/pull/6598 or https://github.com/dask/distributed/pull/6614 will probably make this work. I'm hopefully going to benchmark these against some real workloads in the next couple days, so I'll probably add yours in. Thanks for the MVCE!

Is my understanding of distributed mean wrong? Why are the random-sample tasks not flushed?

See https://github.com/dask/distributed/issues/6360#issuecomment-1129434333 and the linked issues for why this happens.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Means of zarr arrays cause a memory overload in dask workers 1277437106
1085150420 https://github.com/pydata/xarray/pull/5879#issuecomment-1085150420 https://api.github.com/repos/pydata/xarray/issues/5879 IC_kwDOAMm_X85ArhTU gjoseph92 3309802 2022-03-31T21:41:32Z 2022-03-31T21:41:32Z NONE

Yeah, I guess I expected OpenFile to, well, act like an open file. So maybe this is more of an fsspec interface issue?

I'll open a separate issue for improving the UX of this in xarray though. I think this would be rather confusing for new users.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Check for path-like objects rather than Path type, use os.fspath 1031275532
1085125053 https://github.com/pydata/xarray/issues/2314#issuecomment-1085125053 https://api.github.com/repos/pydata/xarray/issues/2314 IC_kwDOAMm_X85ArbG9 gjoseph92 3309802 2022-03-31T21:15:59Z 2022-03-31T21:15:59Z NONE

Just noticed this issue; people needing to do this sort of thing might want to look at stackstac (especially playing with the chunks= parameter) or odc-stac for loading the data. The graph will be cleaner than what you'd get from xr.concat([xr.open_rasterio(...) for ...]).

still appears to "over-eagerly" load more than just what is being worked on

FYI, this is basically expected behavior for distributed, see:
  • https://github.com/dask/distributed/issues/5223
  • https://github.com/dask/distributed/issues/5555

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Chunked processing across multiple raster (geoTIF) files 344621749
1085077801 https://github.com/pydata/xarray/pull/5879#issuecomment-1085077801 https://api.github.com/repos/pydata/xarray/issues/5879 IC_kwDOAMm_X85ArPkp gjoseph92 3309802 2022-03-31T20:34:51Z 2022-03-31T20:34:51Z NONE

"s3://noaa-nwm-retrospective-2-1-zarr-pds/lakeout.zarr" is a directory, right? You cannot open that as a file

Yeah correct. I oversimplified this from the problem I actually cared about, since of course zarr is not a single file that can be fsspec.open'd in the first place, and the zarr engine is doing some magic there when passed the plain string.

Here's a more illustrative example:

```python
In [1]: import xarray as xr

In [2]: import fsspec

In [3]: import os

In [4]: url = "s3://noaa-nwm-retrospective-2-1-pds/model_output/1979/197902010100.CHRTOUT_DOMAIN1.comp"  # a netCDF file in s3

In [5]: f = fsspec.open(url)

In [6]: f
Out[6]: <OpenFile 'noaa-nwm-retrospective-2-1-pds/model_output/1979/197902010100.CHRTOUT_DOMAIN1.comp'>

In [7]: isinstance(f, os.PathLike)
Out[7]: True

In [8]: s3f = f.open()

In [9]: s3f
Out[9]: <File-like object S3FileSystem, noaa-nwm-retrospective-2-1-pds/model_output/1979/197902010100.CHRTOUT_DOMAIN1.comp>

In [10]: isinstance(s3f, os.PathLike)
Out[10]: False

In [11]: ds = xr.open_dataset(s3f, engine='h5netcdf')

In [12]: ds
Out[12]:
<xarray.Dataset>
Dimensions:         (time: 1, reference_time: 1, feature_id: 2776738)
Coordinates:
  * time            (time) datetime64[ns] 1979-02-01T01:00:00
  * reference_time  (reference_time) datetime64[ns] 1979-02-01
  * feature_id      (feature_id) int32 101 179 181 ... 1180001803 1180001804
    latitude        (feature_id) float32 ...
    longitude       (feature_id) float32 ...
Data variables:
    crs             |S1 ...
    order           (feature_id) int32 ...
    elevation       (feature_id) float32 ...
    streamflow      (feature_id) float64 ...
    q_lateral       (feature_id) float64 ...
    velocity        (feature_id) float64 ...
    qSfcLatRunoff   (feature_id) float64 ...
    qBucket         (feature_id) float64 ...
    qBtmVertRunoff  (feature_id) float64 ...
Attributes: (12/18)
    TITLE:                      OUTPUT FROM WRF-Hydro v5.2.0-beta2
    featureType:                timeSeries
    proj4:                      +proj=lcc +units=m +a=6370000.0 +b=6370000.0 ...
    model_initialization_time:  1979-02-01_00:00:00
    station_dimension:          feature_id
    model_output_valid_time:    1979-02-01_01:00:00
    ...                         ...
    model_configuration:        retrospective
    dev_OVRTSWCRT:              1
    dev_NOAH_TIMESTEP:          3600
    dev_channel_only:           0
    dev_channelBucket_only:     0
    dev:                        dev_ prefix indicates development/internal me...

In [13]: ds = xr.open_dataset(f, engine='h5netcdf')
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-13-de834ca911b4> in <module>
----> 1 ds = xr.open_dataset(f, engine='h5netcdf')

~/dev/dask-playground/env/lib/python3.9/site-packages/xarray/backends/api.py in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, backend_kwargs, *args, **kwargs)
    493
    494     overwrite_encoded_chunks = kwargs.pop("overwrite_encoded_chunks", None)
--> 495     backend_ds = backend.open_dataset(
    496         filename_or_obj,
    497         drop_variables=drop_variables,

~/dev/dask-playground/env/lib/python3.9/site-packages/xarray/backends/h5netcdf_.py in open_dataset(self, filename_or_obj, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, use_cftime, decode_timedelta, format, group, lock, invalid_netcdf, phony_dims, decode_vlen_strings)
    384         ):
    385
--> 386         filename_or_obj = _normalize_path(filename_or_obj)
    387         store = H5NetCDFStore.open(
    388             filename_or_obj,

~/dev/dask-playground/env/lib/python3.9/site-packages/xarray/backends/common.py in _normalize_path(path)
     21 def _normalize_path(path):
     22     if isinstance(path, os.PathLike):
---> 23         path = os.fspath(path)
     24
     25     if isinstance(path, str) and not is_remote_uri(path):

~/dev/dask-playground/env/lib/python3.9/site-packages/fsspec/core.py in __fspath__(self)
     96     def __fspath__(self):
     97         # may raise if cannot be resolved to local file
---> 98         return self.open().__fspath__()
     99
    100     def __enter__(self):

AttributeError: 'S3File' object has no attribute '__fspath__'
```

Because the plain fsspec.OpenFile object has an __fspath__ attribute (but calling it raises an error), it causes xarray.backends.common._normalize_path to fail.

Because the s3fs.S3File object does not have an `__fspath__` attribute, `_normalize_path` doesn't try to call `os.fspath` on it, so the file-like object can be passed all the way down into h5netcdf, which handles it fine.

Note though that if I downgrade xarray to 0.19.0 (the last version before this PR was merged), I still can't use the plain `fsspec.OpenFile` object successfully. It's not xarray's fault anymore—it gets passed all the way into h5netcdf—but h5netcdf also tries to call `fspath` on the `OpenFile`, which fails in the same way.

```python
In [1]: import xarray as xr

In [2]: import fsspec

In [3]: xr.__version__
Out[3]: '0.19.0'

In [4]: url = "s3://noaa-nwm-retrospective-2-1-pds/model_output/1979/197902010100.CHRTOUT_DOMAIN1.comp"  # a netCDF file in s3

In [5]: f = fsspec.open(url)

In [6]: xr.open_dataset(f.open(), engine="h5netcdf")
Out[6]:
<xarray.Dataset>
Dimensions:         (time: 1, reference_time: 1, feature_id: 2776738)
Coordinates:
  * time            (time) datetime64[ns] 1979-02-01T01:00:00
  * reference_time  (reference_time) datetime64[ns] 1979-02-01
  * feature_id      (feature_id) int32 101 179 181 ... 1180001803 1180001804
    latitude        (feature_id) float32 ...
    longitude       (feature_id) float32 ...
Data variables:
    crs             |S1 ...
    order           (feature_id) int32 ...
    elevation       (feature_id) float32 ...
    streamflow      (feature_id) float64 ...
    q_lateral       (feature_id) float64 ...
    velocity        (feature_id) float64 ...
    qSfcLatRunoff   (feature_id) float64 ...
    qBucket         (feature_id) float64 ...
    qBtmVertRunoff  (feature_id) float64 ...
Attributes: (12/18)
    TITLE:                      OUTPUT FROM WRF-Hydro v5.2.0-beta2
    featureType:                timeSeries
    proj4:                      +proj=lcc +units=m +a=6370000.0 +b=6370000.0 ...
    model_initialization_time:  1979-02-01_00:00:00
    station_dimension:          feature_id
    model_output_valid_time:    1979-02-01_01:00:00
    ...                         ...
    model_configuration:        retrospective
    dev_OVRTSWCRT:              1
    dev_NOAH_TIMESTEP:          3600
    dev_channel_only:           0
    dev_channelBucket_only:     0
    dev:                        dev_ prefix indicates development/internal me...

In [7]: xr.open_dataset(f, engine="h5netcdf")
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~/dev/dask-playground/env/lib/python3.9/site-packages/xarray/backends/file_manager.py in _acquire_with_cache_info(self, needs_lock)
    198         try:
--> 199             file = self._cache[self._key]
    200         except KeyError:

~/dev/dask-playground/env/lib/python3.9/site-packages/xarray/backends/lru_cache.py in __getitem__(self, key)
     52         with self._lock:
---> 53             value = self._cache[key]
     54             self._cache.move_to_end(key)

KeyError: [<class 'h5netcdf.core.File'>, (<OpenFile 'noaa-nwm-retrospective-2-1-pds/model_output/1979/197902010100.CHRTOUT_DOMAIN1.comp'>,), 'r', (('decode_vlen_strings', True), ('invalid_netcdf', None))]

During handling of the above exception, another exception occurred:

AttributeError                            Traceback (most recent call last)
<ipython-input-7-e6098b8ab402> in <module>
----> 1 xr.open_dataset(f, engine="h5netcdf")

~/dev/dask-playground/env/lib/python3.9/site-packages/xarray/backends/api.py in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, backend_kwargs, *args, **kwargs)
    495
    496     overwrite_encoded_chunks = kwargs.pop("overwrite_encoded_chunks", None)
--> 497     backend_ds = backend.open_dataset(
    498         filename_or_obj,
    499         drop_variables=drop_variables,

~/dev/dask-playground/env/lib/python3.9/site-packages/xarray/backends/h5netcdf_.py in open_dataset(self, filename_or_obj, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, use_cftime, decode_timedelta, format, group, lock, invalid_netcdf, phony_dims, decode_vlen_strings)
    372
    373     filename_or_obj = _normalize_path(filename_or_obj)
--> 374     store = H5NetCDFStore.open(
    375         filename_or_obj,
    376         format=format,

~/dev/dask-playground/env/lib/python3.9/site-packages/xarray/backends/h5netcdf_.py in open(cls, filename, mode, format, group, lock, autoclose, invalid_netcdf, phony_dims, decode_vlen_strings)
    176
    177     manager = CachingFileManager(h5netcdf.File, filename, mode=mode, kwargs=kwargs)
--> 178     return cls(manager, group=group, mode=mode, lock=lock, autoclose=autoclose)
    179
    180     def _acquire(self, needs_lock=True):

~/dev/dask-playground/env/lib/python3.9/site-packages/xarray/backends/h5netcdf_.py in __init__(self, manager, group, mode, lock, autoclose)
    121         # todo: utilizing find_root_and_group seems a bit clunky
    122         # making filename available on h5netcdf.Group seems better
--> 123         self._filename = find_root_and_group(self.ds)[0].filename
    124         self.is_remote = is_remote_uri(self._filename)
    125         self.lock = ensure_lock(lock)

~/dev/dask-playground/env/lib/python3.9/site-packages/xarray/backends/h5netcdf_.py in ds(self)
    187     @property
    188     def ds(self):
--> 189         return self._acquire()
    190
    191     def open_store_variable(self, name, var):

~/dev/dask-playground/env/lib/python3.9/site-packages/xarray/backends/h5netcdf_.py in _acquire(self, needs_lock)
    179
    180     def _acquire(self, needs_lock=True):
--> 181         with self._manager.acquire_context(needs_lock) as root:
    182             ds = _nc4_require_group(
    183                 root, self._group, self._mode, create_group=_h5netcdf_create_group

~/.pyenv/versions/3.9.1/lib/python3.9/contextlib.py in __enter__(self)
    115         del self.args, self.kwds, self.func
    116         try:
--> 117             return next(self.gen)
    118         except StopIteration:
    119             raise RuntimeError("generator didn't yield") from None

~/dev/dask-playground/env/lib/python3.9/site-packages/xarray/backends/file_manager.py in acquire_context(self, needs_lock)
    185     def acquire_context(self, needs_lock=True):
    186         """Context manager for acquiring a file."""
--> 187         file, cached = self._acquire_with_cache_info(needs_lock)
    188         try:
    189             yield file

~/dev/dask-playground/env/lib/python3.9/site-packages/xarray/backends/file_manager.py in _acquire_with_cache_info(self, needs_lock)
    203             kwargs = kwargs.copy()
    204             kwargs["mode"] = self._mode
--> 205             file = self._opener(*self._args, **kwargs)
    206             if self._mode == "w":
    207                 # ensure file doesn't get overriden when opened again

~/dev/dask-playground/env/lib/python3.9/site-packages/h5netcdf/core.py in __init__(self, path, mode, invalid_netcdf, phony_dims, **kwargs)
    978         self._preexisting_file = mode in {"r", "r+", "a"}
    979         self._h5py = h5py
--> 980         self._h5file = self._h5py.File(
    981             path, mode, track_order=track_order, **kwargs
    982         )

~/dev/dask-playground/env/lib/python3.9/site-packages/h5py/_hl/files.py in __init__(self, name, mode, driver, libver, userblock_size, swmr, rdcc_nslots, rdcc_nbytes, rdcc_w0, track_order, fs_strategy, fs_persist, fs_threshold, fs_page_size, page_buf_size, min_meta_keep, min_raw_keep, locking, **kwds)
    484             name = repr(name).encode('ASCII', 'replace')
    485         else:
--> 486             name = filename_encode(name)
    487
    488         if track_order is None:

~/dev/dask-playground/env/lib/python3.9/site-packages/h5py/_hl/compat.py in filename_encode(filename)
     17     filenames in h5py for more information.
     18     """
---> 19     filename = fspath(filename)
     20     if sys.platform == "win32":
     21         if isinstance(filename, str):

~/dev/dask-playground/env/lib/python3.9/site-packages/fsspec/core.py in __fspath__(self)
     96     def __fspath__(self):
     97         # may raise if cannot be resolved to local file
---> 98         return self.open().__fspath__()
     99
    100     def __enter__(self):

AttributeError: 'S3File' object has no attribute '__fspath__'
```

The problem is that `OpenFile` doesn't have a `read` or `seek` method, so h5py doesn't think it's a proper file-like object and tries to `fspath` it here: https://github.com/h5py/h5py/blob/master/h5py/_hl/files.py#L509

So I may just be misunderstanding what an fsspec.OpenFile object is supposed to be (it's not actually a file-like object until you .open() it?). But I expect users would be similarly confused by this distinction.
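The duck-typing h5py applies can be mimicked in a few stdlib lines (a hypothetical sketch, not h5py's actual code: objects with `read`/`seek` are treated as file-like and used directly, everything else gets `os.fspath()`'d):

```python
import io
import os

def open_target(obj):
    """Sketch of the dispatch described above: genuine file-like objects
    (read + seek) are used as-is; anything else is coerced with os.fspath(),
    which is exactly where an fsspec.OpenFile-style object blows up."""
    if hasattr(obj, "read") and hasattr(obj, "seek"):
        return obj            # file-like: pass straight through
    return os.fspath(obj)     # path-like (or so we hope)

buf = io.BytesIO(b"data")     # a real file-like object
assert open_target(buf) is buf

class OpenFileLike:
    """Mimics fsspec.OpenFile: no read/seek until .open() is called,
    and resolving it to a local path raises."""
    def __fspath__(self):
        raise AttributeError("'S3File' object has no attribute '__fspath__'")

try:
    open_target(OpenFileLike())
    failed = False
except AttributeError:
    failed = True
assert failed  # the confusing failure mode described above
```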

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Check for path-like objects rather than Path type, use os.fspath 1031275532
1085030197 https://github.com/pydata/xarray/pull/5879#issuecomment-1085030197 https://api.github.com/repos/pydata/xarray/issues/5879 IC_kwDOAMm_X85ArD81 gjoseph92 3309802 2022-03-31T19:46:40Z 2022-03-31T19:50:07Z NONE

@martindurant exactly, os.PathLike just uses duck-typing, which fsspec matches.

This generally means you can't pass s3fs/gcsfs files into xr.open_dataset (from what I've tried so far). (I don't know if you actually should be able to do this, but regardless, the error would be very confusing to a new user.)

```python
In [32]: xr.open_dataset("s3://noaa-nwm-retrospective-2-1-zarr-pds/lakeout.zarr", engine="zarr")
Out[32]:
<xarray.Dataset>
Dimensions:         (feature_id: 5783, time: 367439)
Coordinates:
  * feature_id      (feature_id) int32 491 531 747 ... 947070204 1021092845
    latitude        (feature_id) float32 ...
    longitude       (feature_id) float32 ...
  * time            (time) datetime64[ns] 1979-02-01T01:00:00 ... 2020-12-31T...
Data variables:
    crs             |S1 ...
    inflow          (time, feature_id) float64 ...
    outflow         (time, feature_id) float64 ...
    water_sfc_elev  (time, feature_id) float32 ...
Attributes:
    Conventions:                  CF-1.6
    TITLE:                        OUTPUT FROM WRF-Hydro v5.2.0-beta2
    code_version:                 v5.2.0-beta2
    featureType:                  timeSeries
    model_configuration:          retrospective
    model_output_type:            reservoir
    proj4:                        +proj=lcc +units=m +a=6370000.0 +b=6370000....
    reservoir_assimilated_value:  Assimilation not performed
    reservoir_type:               1 = level pool everywhere
    station_dimension:            lake_id

In [33]: xr.open_dataset(fsspec.open("s3://noaa-nwm-retrospective-2-1-zarr-pds/lakeout.zarr"), engine="zarr")
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-33-76e10d75e2c2> in <module>
----> 1 xr.open_dataset(fsspec.open("s3://noaa-nwm-retrospective-2-1-zarr-pds/lakeout.zarr"), engine="zarr")

~/dev/dask-playground/env/lib/python3.9/site-packages/xarray/backends/api.py in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, backend_kwargs, *args, **kwargs)
    493
    494     overwrite_encoded_chunks = kwargs.pop("overwrite_encoded_chunks", None)
--> 495     backend_ds = backend.open_dataset(
    496         filename_or_obj,
    497         drop_variables=drop_variables,

~/dev/dask-playground/env/lib/python3.9/site-packages/xarray/backends/zarr.py in open_dataset(self, filename_or_obj, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, use_cftime, decode_timedelta, group, mode, synchronizer, consolidated, chunk_store, storage_options, stacklevel)
    797         ):
    798
--> 799         filename_or_obj = _normalize_path(filename_or_obj)
    800         store = ZarrStore.open_group(
    801             filename_or_obj,

~/dev/dask-playground/env/lib/python3.9/site-packages/xarray/backends/common.py in _normalize_path(path)
     21 def _normalize_path(path):
     22     if isinstance(path, os.PathLike):
---> 23         path = os.fspath(path)
     24
     25     if isinstance(path, str) and not is_remote_uri(path):

~/dev/dask-playground/env/lib/python3.9/site-packages/fsspec/core.py in __fspath__(self)
     96     def __fspath__(self):
     97         # may raise if cannot be resolved to local file
---> 98         return self.open().__fspath__()
     99
    100     def __enter__(self):

~/dev/dask-playground/env/lib/python3.9/site-packages/fsspec/core.py in open(self)
    138         been deleted; but a with-context is better style.
    139         """
--> 140         out = self.__enter__()
    141         closer = out.close
    142         fobjects = self.fobjects.copy()[:-1]

~/dev/dask-playground/env/lib/python3.9/site-packages/fsspec/core.py in __enter__(self)
    101         mode = self.mode.replace("t", "").replace("b", "") + "b"
    102
--> 103         f = self.fs.open(self.path, mode=mode)
    104
    105         self.fobjects = [f]

~/dev/dask-playground/env/lib/python3.9/site-packages/fsspec/spec.py in open(self, path, mode, block_size, cache_options, compression, **kwargs)
   1007         else:
   1008             ac = kwargs.pop("autocommit", not self._intrans)
-> 1009             f = self._open(
   1010                 path,
   1011                 mode=mode,

~/dev/dask-playground/env/lib/python3.9/site-packages/s3fs/core.py in _open(self, path, mode, block_size, acl, version_id, fill_cache, cache_type, autocommit, requester_pays, **kwargs)
    532             cache_type = self.default_cache_type
    533
--> 534         return S3File(
    535             self,
    536             path,

~/dev/dask-playground/env/lib/python3.9/site-packages/s3fs/core.py in __init__(self, s3, path, mode, block_size, acl, version_id, fill_cache, s3_additional_kwargs, autocommit, cache_type, requester_pays)
   1824
   1825         if "r" in mode:
-> 1826             self.req_kw["IfMatch"] = self.details["ETag"]
   1827
   1828     def _call_s3(self, method, *kwarglist, **kwargs):

KeyError: 'ETag'
```

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Check for path-like objects rather than Path type, use os.fspath 1031275532
1085012805 https://github.com/pydata/xarray/pull/5879#issuecomment-1085012805 https://api.github.com/repos/pydata/xarray/issues/5879 IC_kwDOAMm_X85Aq_tF gjoseph92 3309802 2022-03-31T19:25:28Z 2022-03-31T19:25:28Z NONE

Note that isinstance(fsspec.OpenFile(...), os.PathLike) is True, due to the magic of ABCs. Are we sure that we want to be calling os.fspath on fsspec files? In many cases (like an S3File, GCSFile, etc.) this will fail with a confusing error like 'S3File' object has no attribute '__fspath__'.
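The ABC magic is easy to see with two plain classes (stdlib only; the class names are made up for illustration): `os.PathLike` uses a `__subclasshook__` that only checks for the *presence* of `__fspath__`, so the isinstance check passes long before `__fspath__` is ever called.

```python
import os

class HasFspath:
    """Defines __fspath__, but it raises at call time -- like an
    fsspec OpenFile backed by a remote file."""
    def __fspath__(self):
        raise RuntimeError("only resolvable for local files")

class NoFspath:
    pass

# No inheritance or registration needed: the ABC hook checks for the
# attribute's existence only.
assert isinstance(HasFspath(), os.PathLike)
assert not isinstance(NoFspath(), os.PathLike)

# ...so passing the isinstance check is no guarantee os.fspath() succeeds:
try:
    os.fspath(HasFspath())
    raised = False
except RuntimeError:
    raised = True
assert raised
```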

cc @martindurant

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Check for path-like objects rather than Path type, use os.fspath 1031275532
856285749 https://github.com/pydata/xarray/pull/5449#issuecomment-856285749 https://api.github.com/repos/pydata/xarray/issues/5449 MDEyOklzc3VlQ29tbWVudDg1NjI4NTc0OQ== gjoseph92 3309802 2021-06-07T21:45:30Z 2021-06-07T21:45:30Z NONE

@mathause sorry for breaking things here. Note that passing output_dtypes didn't work as it was supposed to before, and also didn't cause a cast. We went back and forth on whether output_types should cause explicit casting, and whether it was sensible to provide both it and meta. Ultimately we decided they should be mutually exclusive, and should not cause casting, but without much knowledge of how downstream libraries were using these arguments. So maybe we should revisit that choice in dask?

Also I think maybe this test should be changed rather than skipped. Saying output_dtypes=[int] and then assert float == actual.dtype just seems weird to me. Perhaps removing one of output_dtypes or meta from the test would be the best solution.
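The mutually-exclusive, no-casting contract can be sketched in NumPy (a toy `resolve_meta` helper invented for illustration, not dask's implementation of `apply_gufunc`):

```python
import numpy as np

def resolve_meta(func, args, meta=None, output_dtypes=None):
    """Toy sketch of the contract discussed above: meta and output_dtypes
    are mutually exclusive, and neither causes the result to be cast."""
    if meta is not None and output_dtypes is not None:
        raise ValueError("meta and output_dtypes are mutually exclusive")
    if meta is None:
        if output_dtypes is not None:
            # declared dtype is recorded as a zero-sized "meta" array
            meta = np.empty((0,), dtype=output_dtypes[0])
        else:
            # otherwise infer by running func on zero-sized inputs
            meta = func(*(np.empty((0,), dtype=a.dtype) for a in args))
    return meta

a = np.arange(4, dtype=np.float64)
meta = resolve_meta(np.sqrt, (a,), output_dtypes=[int])
assert meta.dtype == np.dtype(int)      # declared dtype is recorded...
assert np.sqrt(a).dtype == np.float64   # ...but the actual result is not cast
```

This is why `output_dtypes=[int]` followed by `assert float == actual.dtype` can both "work": the declaration and the computed dtype are allowed to disagree.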

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  fix dask meta and output_dtypes error 913830070


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);