issues

17 rows where user = 10137 sorted by updated_at descending

14 issues and 3 pull requests · 16 closed, 1 open · all in repo xarray
#7574 · issue · xr.open_mfdataset doesn't work with fsspec and dask · ghost (user 10137) · state: closed · 12 comments · created 2023-03-01 · closed 2023-09-08

What happened?

I was trying to read multiple netCDF files as byte streams (which requires the h5netcdf engine) with xr.open_mfdataset and parallel=True, to leverage dask.delayed (parallel=False works, though), but it failed.

The netCDF files were NOAA GOES-16 satellite images, but I can't tell if that matters.

What did you expect to happen?

It should have loaded all the netCDF files into an xarray.Dataset object.

Minimal Complete Verifiable Example

```python
import fsspec
import xarray as xr

paths = [
    's3://noaa-goes16/ABI-L2-LSTC/2022/185/03/OR_ABI-L2-LSTC-M6_G16_s20221850301180_e20221850303553_c20221850305091.nc',
    's3://noaa-goes16/ABI-L2-LSTC/2022/185/02/OR_ABI-L2-LSTC-M6_G16_s20221850201180_e20221850203553_c20221850205142.nc',
]

fs = fsspec.filesystem('s3')

xr.open_mfdataset(
    [fs.open(path, mode="rb") for path in paths],
    engine="h5netcdf",
    combine="nested",
    concat_dim="t",
    parallel=True,
)
```

MVCE confirmation

  • [ ] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [ ] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

```Python
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File ~/miniconda3/envs/rxr/lib/python3.11/site-packages/xarray/backends/file_manager.py:210, in CachingFileManager._acquire_with_cache_info(self, needs_lock)
    209 try:
--> 210     file = self._cache[self._key]
    211 except KeyError:

File ~/miniconda3/envs/rxr/lib/python3.11/site-packages/xarray/backends/lru_cache.py:56, in LRUCache.__getitem__(self, key)
     55 with self._lock:
---> 56     value = self._cache[key]
     57     self._cache.move_to_end(key)

KeyError: [<class 'h5netcdf.core.File'>, ((b'\x89HDF\r\n', b'\x1a\n', b'\x02\x08\x08\x00\x00\x00 ... EXTREMELY STRING ... 00\x00\x00\x00\x00\x00\x0ef']

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
Cell In[9], line 11
      4 paths = [
      5     's3://noaa-goes16/ABI-L2-LSTC/2022/185/03/OR_ABI-L2-LSTC-M6_G16_s20221850301180_e20221850303553_c20221850305091.nc',
      6     's3://noaa-goes16/ABI-L2-LSTC/2022/185/02/OR_ABI-L2-LSTC-M6_G16_s20221850201180_e20221850203553_c20221850205142.nc'
      7 ]
      9 fs = fsspec.filesystem('s3')
---> 11 xr.open_mfdataset(
     12     [fs.open(path, mode="rb") for path in paths],
     13     engine="h5netcdf",
     14     combine="nested",
     15     concat_dim="t",
     16     parallel=True
     17 ).LST

File ~/miniconda3/envs/rxr/lib/python3.11/site-packages/xarray/backends/api.py:991, in open_mfdataset(paths, chunks, concat_dim, compat, preprocess, engine, data_vars, coords, combine, parallel, join, attrs_file, combine_attrs, **kwargs)
    986     datasets = [preprocess(ds) for ds in datasets]
    988 if parallel:
    989     # calling compute here will return the datasets/file_objs lists,
    990     # the underlying datasets will still be stored as dask arrays
--> 991     datasets, closers = dask.compute(datasets, closers)
    993 # Combine all datasets, closing them in case of a ValueError
    994 try:

File ~/miniconda3/envs/rxr/lib/python3.11/site-packages/dask/base.py:599, in compute(traverse, optimize_graph, scheduler, get, *args, **kwargs)
    596     keys.append(x.__dask_keys__())
    597     postcomputes.append(x.__dask_postcompute__())
--> 599 results = schedule(dsk, keys, **kwargs)
    600 return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])

File ~/miniconda3/envs/rxr/lib/python3.11/site-packages/dask/threaded.py:89, in get(dsk, keys, cache, num_workers, pool, **kwargs)
     86 elif isinstance(pool, multiprocessing.pool.Pool):
     87     pool = MultiprocessingPoolExecutor(pool)
---> 89 results = get_async(
     90     pool.submit,
     91     pool._max_workers,
     92     dsk,
     93     keys,
     94     cache=cache,
     95     get_id=_thread_get_id,
     96     pack_exception=pack_exception,
     97     **kwargs,
     98 )
    100 # Cleanup pools associated to dead threads
    101 with pools_lock:

File ~/miniconda3/envs/rxr/lib/python3.11/site-packages/dask/local.py:511, in get_async(submit, num_workers, dsk, result, cache, get_id, rerun_exceptions_locally, pack_exception, raise_exception, callbacks, dumps, loads, chunksize, **kwargs)
    509     _execute_task(task, data)  # Re-execute locally
    510 else:
--> 511     raise_exception(exc, tb)
    512 res, worker_id = loads(res_info)
    513 state["cache"][key] = res

File ~/miniconda3/envs/rxr/lib/python3.11/site-packages/dask/local.py:319, in reraise(exc, tb)
    317 if exc.__traceback__ is not tb:
    318     raise exc.with_traceback(tb)
--> 319 raise exc

File ~/miniconda3/envs/rxr/lib/python3.11/site-packages/dask/local.py:224, in execute_task(key, task_info, dumps, loads, get_id, pack_exception)
    222 try:
    223     task, data = loads(task_info)
--> 224     result = _execute_task(task, data)
    225     id = get_id()
    226     result = dumps((result, id))

File ~/miniconda3/envs/rxr/lib/python3.11/site-packages/dask/core.py:119, in _execute_task(arg, cache, dsk)
    115     func, args = arg[0], arg[1:]
    116     # Note: Don't assign the subtask results to a variable. numpy detects
    117     # temporaries by their reference count and can execute certain
    118     # operations in-place.
--> 119     return func(*(_execute_task(a, cache) for a in args))
    120 elif not ishashable(arg):
    121     return arg

File ~/miniconda3/envs/rxr/lib/python3.11/site-packages/dask/utils.py:73, in apply(func, args, kwargs)
     42 """Apply a function given its positional and keyword arguments.
     43
     44 Equivalent to func(*args, **kwargs)
(...)

---> 19 filename = fspath(filename)
     20 if sys.platform == "win32":
     21     if isinstance(filename, str):

TypeError: expected str, bytes or os.PathLike object, not tuple
```
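
A hedged workaround sketch, not a confirmed fix: since the failure appears when open file objects travel through the dask graph, one option is to open the filesystem and each file inside its own delayed task instead. The anon=True access to the public noaa-goes16 bucket is an assumption, and s3fs must be installed:

```python
import dask
import fsspec
import xarray as xr

paths = [
    's3://noaa-goes16/ABI-L2-LSTC/2022/185/03/OR_ABI-L2-LSTC-M6_G16_s20221850301180_e20221850303553_c20221850305091.nc',
    's3://noaa-goes16/ABI-L2-LSTC/2022/185/02/OR_ABI-L2-LSTC-M6_G16_s20221850201180_e20221850203553_c20221850205142.nc',
]

@dask.delayed
def open_one(path):
    # Open the filesystem and the file inside the task itself, so no open
    # file handle has to survive a round trip through the dask graph.
    fs = fsspec.filesystem('s3', anon=True)  # anon=True is an assumption
    return xr.open_dataset(fs.open(path, mode='rb'), engine='h5netcdf')

datasets = dask.compute(*[open_one(p) for p in paths])
combined = xr.concat(datasets, dim='t')
```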

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS
------------------
commit: None
python: 3.11.0 | packaged by conda-forge | (main, Jan 15 2023, 05:44:48) [Clang 14.0.6 ]
python-bits: 64
OS: Darwin
OS-release: 21.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: (None, 'UTF-8')
libhdf5: 1.12.2
libnetcdf: None

xarray: 2023.2.0
pandas: 1.5.3
numpy: 1.24.2
scipy: 1.10.1
netCDF4: None
pydap: None
h5netcdf: 1.1.0
h5py: 3.8.0
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.3.6
cfgrib: None
iris: None
bottleneck: None
dask: 2023.2.1
distributed: 2023.2.1
matplotlib: 3.7.0
cartopy: 0.21.1
seaborn: 0.12.2
numbagg: None
fsspec: 2023.1.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 67.4.0
pip: 23.0.1
conda: None
pytest: 7.2.1
mypy: None
IPython: 8.10.0
sphinx: None

/Users/jo/miniconda3/envs/rxr/lib/python3.11/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
  warnings.warn("Setuptools is replacing distutils.")
#3684 · issue · open_mfdataset - different behavior with dask.distributed.LocalCluster · ghost (user 10137) · state: open · 3 comments · created 2020-01-10 · updated 2023-09-05

Big fan of xarray! I'm not that familiar with submitting tickets like this, so my apologies for any rule-breaking. Also, if this belongs over in the dask project, I can move it there.

dask 2.6.0, numpy 1.17.3, xarray 0.14.1, netCDF4 1.5.3

I am attempting to use open_mfdataset on .nc files I've generated through dask/xarray after initializing the dask LocalCluster. I've found that I am able to compute successfully when I don't run the distributed cluster, but if I do, I get a variety of issues. I've got a synthetic-data-generating example here: running soundspeed.compute() will sometimes succeed and will sometimes cause worker restarts, resulting in HDF errors and no return.

I was thinking it was something with serialization; I've seen other tickets with similar issues, but I don't see how it applies to my test case.

Example code:

```python
import numpy as np
import xarray as xr
import os
from dask.distributed import Client

cl = Client()
outpth = r'D:\dasktest\data_dir\EM2040\converted\test'
mint = 0
maxt = 1000

for i in range(100):
    times = np.arange(mint, maxt)
    beams = np.arange(250)
    sectors = ['40107_0_260000', '40107_1_320000', '40107_2_290000']
    soundspeed = np.random.randn(1000, 3, 250)
    ds = xr.Dataset(
        {'soundspeed': (('time', 'sectors', 'beams'), soundspeed)},
        {'time': times, 'sectors': sectors, 'beams': beams},
    )
    ds.to_netcdf(os.path.join(outpth, 'test{}.nc'.format(i)), mode='w')
    mint = maxt
    maxt += 1000

fils = [os.path.join(outpth, x) for x in os.listdir(outpth)
        if os.path.splitext(x)[1] == '.nc']
tst = xr.open_mfdataset(fils, concat_dim='time', combine='nested')
tst.soundspeed.compute()
```

I've found that running this example with fewer than 10 files reduces the number of errors I'm getting dramatically. I've tried this on different machines in different domain environments just to be sure. I really just want to make sure I'm not making a silly mistake somewhere. I appreciate the help.
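
A debugging sketch rather than a fix (it reuses `tst` from the example above): forcing dask's single-threaded scheduler for one compute() call rules concurrency in or out as the trigger. If the HDF errors vanish, concurrent access to the netCDF files is implicated.

```python
import dask

# Run the same computation without any worker threads or processes.
with dask.config.set(scheduler='synchronous'):
    tst.soundspeed.compute()
```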

My last run on actual data:

```python

ra.soundspeed.compute()
distributed.nanny - WARNING - Restarting worker
distributed.nanny - WARNING - Restarting worker
distributed.nanny - WARNING - Restarting worker
distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001F83F1E2360>, key=BasicIndexer((slice(None, None, None), slice(None, None, None)))))), (slice(0, 1719, None), slice(0, 3, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error')

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\core\dataarray.py", line 837, in compute
    return new.load(**kwargs)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\core\dataarray.py", line 811, in load
    ds = self._to_temp_dataset().load(**kwargs)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\core\dataset.py", line 649, in load
    evaluated_data = da.compute(*lazy_data.values(), **kwargs)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\dask\base.py", line 436, in compute
    results = schedule(dsk, keys, **kwargs)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\distributed\client.py", line 2545, in get
    results = self.gather(packed, asynchronous=asynchronous, direct=direct)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\distributed\client.py", line 1845, in gather
    asynchronous=asynchronous,
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\distributed\client.py", line 762, in sync
    self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\distributed\utils.py", line 333, in sync
    raise exc.with_traceback(tb)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\distributed\utils.py", line 317, in f
    result[0] = yield future
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\tornado\gen.py", line 735, in run
    value = future.result()
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\distributed\client.py", line 1701, in gather
    raise exception.with_traceback(traceback)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\dask\array\core.py", line 106, in getter
    c = np.asarray(c)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\numpy\core\_asarray.py", line 85, in asarray
    return array(a, dtype, copy=False, order=order)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\core\indexing.py", line 481, in __array__
    return np.asarray(self.array, dtype=dtype)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\numpy\core\_asarray.py", line 85, in asarray
    return array(a, dtype, copy=False, order=order)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\core\indexing.py", line 643, in __array__
    return np.asarray(self.array, dtype=dtype)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\numpy\core\_asarray.py", line 85, in asarray
    return array(a, dtype, copy=False, order=order)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\core\indexing.py", line 547, in __array__
    return np.asarray(array[self.key], dtype=None)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\backends\netCDF4_.py", line 72, in __getitem__
    key, self.shape, indexing.IndexingSupport.OUTER, self._getitem
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\core\indexing.py", line 827, in explicit_indexing_adapter
    result = raw_indexing_method(raw_key.tuple)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\backends\netCDF4_.py", line 83, in _getitem
    original_array = self.get_array(needs_lock=False)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\backends\netCDF4_.py", line 62, in get_array
    ds = self.datastore.acquire(needs_lock)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\backends\netCDF4_.py", line 360, in _acquire
    with self._manager.acquire_context(needs_lock) as root:
  File "C:\PydroXL_19\envs\dasktest\lib\contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\backends\file_manager.py", line 186, in acquire_context
    file, cached = self._acquire_with_cache_info(needs_lock)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\backends\file_manager.py", line 204, in _acquire_with_cache_info
    file = self._opener(*self._args, **kwargs)
  File "netCDF4\_netCDF4.pyx", line 2321, in netCDF4._netCDF4.Dataset.__init__
  File "netCDF4\_netCDF4.pyx", line 1885, in netCDF4._netCDF4._ensure_nc_success
OSError: [Errno -101] NetCDF: HDF error: b'D:\dasktest\data_dir\EM2040\converted\rangeangle_20.nc'
```

My last run on the synthetic data set generated above:

```python

tst.soundspeed.compute()
distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FC8AA20>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error')

distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FCB82D0>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error')

distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FCB8240>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error')

distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FCB81F8>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error')

distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FCB81B0>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error')

distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FCB8360>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error')

distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FCB83A8>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error')

distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FCB8510>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error')

distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FCB8750>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error')

distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FCB8990>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error')

distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FCB8BD0>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error')

distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FCB8E10>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error')

distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FC9D090>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error')

distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FC9D2D0>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error')

distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FC9D510>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error')

distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FC9D750>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error')

distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FC9DC18>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error')

distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FC9DBD0>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error')

distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FC9DCA8>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error')

distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FC9DD38>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error')

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\core\dataarray.py", line 837, in compute
    return new.load(**kwargs)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\core\dataarray.py", line 811, in load
    ds = self._to_temp_dataset().load(**kwargs)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\core\dataset.py", line 649, in load
    evaluated_data = da.compute(*lazy_data.values(), **kwargs)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\dask\base.py", line 436, in compute
    results = schedule(dsk, keys, **kwargs)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\distributed\client.py", line 2545, in get
    results = self.gather(packed, asynchronous=asynchronous, direct=direct)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\distributed\client.py", line 1845, in gather
    asynchronous=asynchronous,
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\distributed\client.py", line 762, in sync
    self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\distributed\utils.py", line 333, in sync
    raise exc.with_traceback(tb)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\distributed\utils.py", line 317, in f
    result[0] = yield future
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\tornado\gen.py", line 735, in run
    value = future.result()
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\distributed\client.py", line 1701, in gather
    raise exception.with_traceback(traceback)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\dask\array\core.py", line 106, in getter
    c = np.asarray(c)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\numpy\core\_asarray.py", line 85, in asarray
    return array(a, dtype, copy=False, order=order)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\core\indexing.py", line 481, in __array__
    return np.asarray(self.array, dtype=dtype)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\numpy\core\_asarray.py", line 85, in asarray
    return array(a, dtype, copy=False, order=order)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\core\indexing.py", line 643, in __array__
    return np.asarray(self.array, dtype=dtype)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\numpy\core\_asarray.py", line 85, in asarray
    return array(a, dtype, copy=False, order=order)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\core\indexing.py", line 547, in __array__
    return np.asarray(array[self.key], dtype=None)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\backends\netCDF4_.py", line 72, in __getitem__
    key, self.shape, indexing.IndexingSupport.OUTER, self._getitem
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\core\indexing.py", line 827, in explicit_indexing_adapter
    result = raw_indexing_method(raw_key.tuple)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\backends\netCDF4_.py", line 83, in _getitem
    original_array = self.get_array(needs_lock=False)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\backends\netCDF4_.py", line 62, in get_array
    ds = self.datastore.acquire(needs_lock)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\backends\netCDF4_.py", line 360, in _acquire
    with self._manager.acquire_context(needs_lock) as root:
  File "C:\PydroXL_19\envs\dasktest\lib\contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\backends\file_manager.py", line 186, in acquire_context
    file, cached = self._acquire_with_cache_info(needs_lock)
  File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\backends\file_manager.py", line 204, in _acquire_with_cache_info
    file = self._opener(*self._args, **kwargs)
  File "netCDF4\_netCDF4.pyx", line 2321, in netCDF4._netCDF4.Dataset.__init__
  File "netCDF4\_netCDF4.pyx", line 1885, in netCDF4._netCDF4._ensure_nc_success
OSError: [Errno -101] NetCDF: HDF error: b'D:\dasktest\data_dir\EM2040\converted\test\test4.nc'
```

#6625 · issue · Why am I getting 'Passing method to Float64Index.get_loc is deprecated' error when using the .sel method to extract some data, and how do I solve it? · ghost (user 10137) · state: closed · 5 comments · created 2022-05-21 · closed 2022-07-09

What is your issue?

```python
climateModels['CSIRO-QCCCE-CSIRO-Mk3-6']['RCP 45'][2]['tasmax'].sel(lon=74, lat=31, time='2041-06-16', method='nearest').data[0]
```

```
\anaconda3\lib\site-packages\xarray\core\indexes.py:234: FutureWarning: Passing method to Float64Index.get_loc is deprecated and will raise in a future version. Use index.get_indexer([item], method=...) instead.
```

I don't know much about how to solve this issue; can anyone help me out, please?
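
A hedged stopgap sketch: the message is a FutureWarning from pandas, not an error, so one option while the installed xarray/pandas pair still triggers it is to silence it. The DataArray here is a synthetic stand-in for the model data selected above, and .item() replaces the original .data[0] to pull out the scalar.

```python
import warnings

import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for one of the model DataArrays above.
da = xr.DataArray(
    np.random.rand(2, 2, 2),
    coords={"lon": [73.5, 74.5], "lat": [30.5, 31.5],
            "time": pd.date_range("2041-06-15", periods=2)},
    dims=("lon", "lat", "time"),
)

with warnings.catch_warnings():
    warnings.simplefilter("ignore", FutureWarning)
    value = da.sel(lon=74, lat=31, time="2041-06-16", method="nearest").item()
```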

#6766 · issue · xr.open_dataset(url) gives NetCDF4 (lru_cache.py) error "oc_open: Could not read url" · ghost (user 10137) · state: closed · 8 comments · created 2022-07-08 · closed 2022-07-11

What is your issue?

This code I use was working about a year ago but today gives me an error:

```python
import xarray as xr

url = 'http://psl.noaa.gov/thredds/dodsC/Datasets/NARR/monolevel/uwnd.10m.2000.nc'
ds = xr.open_dataset(url)
```

The traceback includes the following:

```
File "C:\Users\Codiga_D\AppData\Local\Continuum\miniconda3\envs\EQ\lib\site-packages\xarray\backends\lru_cache.py", line 53, in __getitem__
    value = self._cache[key]
```

and

```
OSError: [Errno -68] NetCDF: I/O failure: b'http://psl.noaa.gov/thredds/dodsC/Datasets/NARR/monolevel/uwnd.10m.2000.nc'

Note:Caching=1
Error:curl error: SSL connect error
curl error details:
Warning:oc_open: Could not read url
```

I have confirmed that the file I am trying to read is on the server and that the server is not requiring a password (nothing I am aware of about the server has changed since my code used to work successfully).

I am on Windows using a conda virtual env (no pip). My xarray is 0.20.2 and my netCDF4 is 1.6.0; these are almost certainly more recent than the ones I was using when my code last succeeded, but I didn't record which version(s) used to work.

It was suggested that I pin netcdf4 to 1.5.8, so I tried this but got the same error.

Recently I had to update security certificates locally here, and this could be related, but I'm not sure.

Any suggestions for how I should troubleshoot this?

Also, should I post an issue at https://github.com/Unidata/netcdf4-python instead of, and/or in addition to, this one?

I found these issues, which seem possibly related, but don't seem to be resolved well yet: https://github.com/Unidata/netcdf4-python/issues/755 https://github.com/pydata/xarray/issues/4925

(I also opened 'discussion' #6742 but so far there has been little response there.)
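
A hedged troubleshooting sketch (the requests dependency is my addition, not part of the report): fetching the dataset's .dds document directly separates "server reachable at this URL" from "netCDF-C/curl SSL configuration" problems.

```python
import requests  # assumes requests is installed

# An OPeNDAP server exposes a plain-text DDS description at url + '.dds'.
url = 'http://psl.noaa.gov/thredds/dodsC/Datasets/NARR/monolevel/uwnd.10m.2000.nc'
resp = requests.get(url + '.dds', timeout=30)
print(resp.status_code)
print(resp.text[:300])
```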

#1682 · pull request · Add option "engine" · ghost (user 10137) · state: closed · 11 comments · created 2017-11-02 · closed 2022-04-15

Implements a new xarray option, "engine", for setting the default backend data read/write engine. Inspired by this Stack Overflow answer.

This PR is not ready for merge yet but I wanted to verify if the code changes are on the right track.

The default value of the engine option is None. If the option is set, the _get_default_engine() function returns its value without going through the chain of import statements.
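
Purely for illustration, a sketch of how the proposed option might be used if it were exposed through xr.set_options like other xarray options; this PR was never merged, so this is NOT a real xarray API:

```python
import xarray as xr

# Hypothetical: set a process-wide default engine once...
xr.set_options(engine="h5netcdf")

# ...so later reads would not need an explicit engine= argument.
ds = xr.open_dataset("example.nc")  # would default to h5netcdf
```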

  • [ ] Closes #xxxx
  • [ ] Tests added / passed
  • [ ] Passes git diff upstream/master **/*py | flake8 --diff
  • [ ] Fully documented, including whats-new.rst for all changes and api.rst for new API
#5434 · issue · xarray.open_rasterio · ghost (user 10137) · state: closed · 2 comments · created 2021-06-03 · closed 2022-04-09

Could you please promote xarray.open_rasterio from experimental to stable, ideally with faster reading of GeoTIFF files (if possible)? For the original array indexing capabilities, I would rather stick with xarray than rioxarray. With much respect, thank you.
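
For context, a hedged note rather than a change to the request: open_rasterio was later deprecated in favor of rioxarray, whose reader returns a DataArray with the usual xarray indexing intact. A minimal equivalent, assuming rioxarray is installed and "example.tif" is a placeholder file name:

```python
import rioxarray  # assumes rioxarray is installed

da = rioxarray.open_rasterio("example.tif")  # "example.tif" is a placeholder
print(da.dims)  # typically ("band", "y", "x"), with normal xarray indexing
```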

#5131 · pull request · Remove trailing space from DatasetGroupBy repr · ghost (user 10137) · state: closed · 1 comment · created 2021-04-08 · closed 2021-04-08

Remove trailing whitespace from DatasetGroupBy representation because flake8 reports it as a violation when present in doctests.

Fix #5130

#5130 · issue · Trailing whitespace in DatasetGroupBy text representation · ghost (user 10137) · state: closed · 1 comment · created 2021-04-08 · closed 2021-04-08

When displaying a DatasetGroupBy in an interactive Python session, the first line of output contains a trailing whitespace. The first example in the documentation demonstrates this:

```pycon
>>> import xarray as xr, numpy as np
>>> ds = xr.Dataset(
...     {"foo": (("x", "y"), np.random.rand(4, 3))},
...     coords={"x": [10, 20, 30, 40], "letters": ("x", list("abba"))},
... )
>>> ds.groupby("letters")
DatasetGroupBy, grouped over 'letters' 
2 groups with labels 'a', 'b'.
```

There is a trailing whitespace in the first line of output, which is "DatasetGroupBy, grouped over 'letters' ". This can be seen more clearly by converting the object to a string (note the whitespace before \n):

```pycon
>>> str(ds.groupby("letters"))
"DatasetGroupBy, grouped over 'letters' \n2 groups with labels 'a', 'b'."
```

While this isn't a problem in itself, it causes an issue for us because we use flake8 in continuous integration to verify that our code is correctly formatted, and we also have doctests that rely on the DatasetGroupBy textual representation. Flake8 reports a violation on the trailing whitespace in our docstrings. If we remove the trailing whitespace, our doctests fail because the expected output doesn't match the actual output. So we have conflicting constraints coming from our tools, both of which seem reasonable: flake8 forbids trailing whitespace because, among other reasons, it leads to noisy git diffs, while doctest wants the expected output to be exactly the same as the actual output and considers a trailing whitespace a significant difference.

We could configure flake8 to ignore this particular violation for the files in which we have these doctests, but that may let other trailing whitespace creep into our code, which we don't want. Unfortunately it's not possible to just add # noqa comments to get flake8 to ignore the violation only for specific lines, because that creates a difference between expected and actual output from doctest's point of view. Flake8 doesn't allow disabling checks for blocks of code either.
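
To make the conflict concrete, here is a hypothetical docstring of the kind described above (not actual xarray code); the trailing space after 'letters' in the expected output is exactly what flake8 flags as W291 and what doctest requires verbatim:

```python
def demo(ds):
    """Hypothetical doctest reproducing the conflict (note the trailing space).

    >>> ds.groupby("letters")
    DatasetGroupBy, grouped over 'letters' 
    2 groups with labels 'a', 'b'.
    """
    return ds.groupby("letters")
```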

Is there a reason for having this trailing whitespace in the DatasetGroupBy representation? Would it be OK to remove it? If so, please let me know and I can make a pull request.

#2139 · issue · From pandas to xarray without blowing up memory · ghost (user 10137) · state: closed · 15 comments · created 2018-05-16 · closed 2019-08-27

I have a billion rows of data, but really it's just two categorical variables, time, lat, lon and some data variables.

Thinking it would somehow help me get the data into xarray, I created a five-level pandas MultiIndex out of the data, but thus far this has not been successful: xarray tries to create a product of the index levels, and that's just not going to work.

Trying to write a NetCDF file has presented its own issues, and I'm left wondering if there isn't a much simpler way to go about this?
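
For readers landing here later, a minimal sketch of the direction xarray eventually took, assuming a version new enough to have the sparse option on from_dataframe (it did not exist when this issue was filed) and the sparse package installed:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Tiny stand-in for the billion-row frame described above.
idx = pd.MultiIndex.from_product(
    [pd.date_range("2000-01-01", periods=3), [10.0, 20.0], [100.0, 110.0]],
    names=["time", "lat", "lon"],
)
df = pd.DataFrame({"value": np.arange(12.0)}, index=idx)

# sparse=True keeps from_dataframe from materializing the full dense
# product of every index level combination.
ds = xr.Dataset.from_dataframe(df, sparse=True)
```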

#3007 · issue · NaN values for variables when converting from a pandas dataframe to xarray.DataSet · ghost (user 10137) · state: closed · 5 comments · created 2019-06-10 · closed 2020-03-23

Code Sample, a copy-pastable example if possible

```python
                                         wind_surface       hurs  ...       bui       fwi
lat       lon       time
34.511383 16.467664 1971-01-10 12:00:00     29.658546  70.481293  ...  8.134300  7.409146
34.515558 16.723973 1971-01-10 12:00:00     30.896049  71.356644  ...  8.874528  8.399877
34.517359 16.852138 1971-01-10 12:00:00     31.514799  71.708603  ...  8.789351  8.763743
34.518970 16.980310 1971-01-10 12:00:00     32.105423  72.023773  ...  8.962551  9.125644
34.520391 17.108487 1971-01-10 12:00:00     32.724174  72.106110  ...  8.725038  9.249104

[5 rows x 10 columns]

In [81]: df.to_xarray()
Out[81]:
<xarray.Dataset>
Dimensions:       (lat: 5, lon: 5, time: 1)
Coordinates:
  * lat           (lat) float64 34.51 34.52 34.52 34.52 34.52
  * lon           (lon) float64 16.47 16.72 16.85 16.98 17.11
  * time          (time) object '1971-01-10 12:00:00'
Data variables:
    wind_surface  (lat, lon, time) float64 29.658546 nan nan ... nan 32.724174
    hurs          (lat, lon, time) float64 70.48129 nan nan ... nan nan 72.10611
    precip        (lat, lon, time) float64 0.0 nan nan nan ... nan nan nan 0.0
    tmax          (lat, lon, time) float64 16.060822 nan nan ... nan 16.185822
    ffmc          (lat, lon, time) float64 83.58528 nan nan ... nan nan 84.05673
    isi           (lat, lon, time) float64 7.7641253 nan nan ... nan nan 9.64494
    dmc           (lat, lon, time) float64 6.797345 nan nan ... nan nan 7.90833
    dc            (lat, lon, time) float64 25.314878 nan nan ... nan 24.324644
    bui           (lat, lon, time) float64 8.1343 nan nan ... nan nan 8.725038
    fwi           (lat, lon, time) float64 7.409146 nan nan ... nan 9.2491045
```

Problem description

Hi, I get those NaN values for variables when I try to convert a pandas.DataFrame with a MultiIndex to xarray. The same happens if I build an xarray.Dataset and then unstack the MultiIndex, as shown below:

```python
ds = xr.Dataset(df)
ds.unstack('dim_0')
```

```
<xarray.Dataset>
Dimensions:       (lat: 5, lon: 5, time: 1)
Coordinates:
  * lat           (lat) float64 34.51 34.52 34.52 34.52 34.52
  * lon           (lon) float64 16.47 16.72 16.85 16.98 17.11
  * time          (time) object '1971-01-10 12:00:00'
Data variables:
    wind_surface  (lat, lon, time) float32 29.658546 nan nan ... nan 32.724174
    hurs          (lat, lon, time) float32 70.48129 nan nan ... nan nan 72.10611
    precip        (lat, lon, time) float32 0.0 nan nan nan ... nan nan nan 0.0
```

Maybe it's not an issue. I don't know. I'm lost. Any help is welcome.

Regards
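
A minimal sketch of why the NaNs appear, under the assumption that each (lat, lon) pair occurs only once in the index: to_xarray() builds the dense lat x lon product, so every combination that was never observed is filled with NaN.

```python
import pandas as pd

# Each (lat, lon) pair occurs exactly once...
df = pd.DataFrame(
    {"lat": [34.51, 34.52], "lon": [16.47, 16.72], "v": [1.0, 2.0]}
).set_index(["lat", "lon"])

# ...so the 2x2 grid built by to_xarray() has NaN at the two
# unobserved combinations (34.51, 16.72) and (34.52, 16.47).
print(df.to_xarray())
```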

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.3 (default, May 9 2019, 11:55:04) [GCC 8.3.0]
python-bits: 64
OS: Linux
OS-release: 5.0.0-16-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2

xarray: 0.12.1
pandas: 0.24.2
numpy: 1.16.3
scipy: 1.3.0
netCDF4: 1.5.2
pydap: installed
h5netcdf: 0.7.3
h5py: 2.9.0
Nio: None
zarr: 2.3.1
cftime: 1.0.1
nc_time_axis: 1.1.0
PseudonetCDF: None
rasterio: 1.0.23
cfgrib: None
iris: 2.3.0dev0
bottleneck: 1.2.1
dask: 1.2.2
distributed: None
matplotlib: 3.1.0
cartopy: 0.17.1.dev168+
seaborn: 0.9.0
setuptools: 40.8.0
pip: 19.1.1
conda: None
pytest: None
IPython: 7.5.0
sphinx: 2.0.1
#1683 · pull request · Add h5netcdf to the engine import hierarchy · ghost (user 10137) · state: closed · 2 comments · created 2017-11-02 · closed 2018-02-12

h5netcdf is now part of the import statements in the _get_default_engine() function. The order is: netcdf4, scipy.io.netcdf, h5netcdf.

  • [ ] Closes #xxxx
  • [ ] Tests added / passed
  • [ ] Passes git diff upstream/master **/*py | flake8 --diff
  • [ ] Fully documented, including whats-new.rst for all changes and api.rst for new API
#1860 · issue · IndexError when accessing a data variable through a PydapDataStore · ghost (user 10137) · state: closed · 4 comments · created 2018-01-26 · closed 2018-01-27

Code Sample, a copy-pastable example if possible

```python
import xarray as xa
from pydap.cas.urs import setup_session

url = 'https://goldsmr4.gesdisc.eosdis.nasa.gov/dods/M2T1NXFLX'
session = setup_session(username='****', password='****', check_url=url)
store = xa.backends.PydapDataStore.open(url, session=session)
ds = xa.open_dataset(store)
tlml = ds['tlml']
print(tlml[0, 0, 0])
```

Problem description

I was trying to connect to NASA MERRA-2 data through OPeNDAP, following the documentation here: http://xarray.pydata.org/en/stable/io.html#OPeNDAP. Opening the dataset works fine, but trying to access a data variable throws a strange IndexError. Traceback below:

```
Traceback (most recent call last):
  File "C:\Anaconda3\envs\jaws\lib\site-packages\IPython\core\formatters.py", line 702, in __call__
    printer.pretty(obj)
  File "C:\Anaconda3\envs\jaws\lib\site-packages\IPython\lib\pretty.py", line 395, in pretty
    return _default_pprint(obj, self, cycle)
  File "C:\Anaconda3\envs\jaws\lib\site-packages\IPython\lib\pretty.py", line 510, in _default_pprint
    _repr_pprint(obj, p, cycle)
  File "C:\Anaconda3\envs\jaws\lib\site-packages\IPython\lib\pretty.py", line 701, in _repr_pprint
    output = repr(obj)
  File "c:\src\xarray\xarray\core\common.py", line 100, in __repr__
    return formatting.array_repr(self)
  File "c:\src\xarray\xarray\core\formatting.py", line 393, in array_repr
    summary.append(short_array_repr(arr.values))
  File "c:\src\xarray\xarray\core\dataarray.py", line 411, in values
    return self.variable.values
  File "c:\src\xarray\xarray\core\variable.py", line 392, in values
    return _as_array_or_item(self._data)
  File "c:\src\xarray\xarray\core\variable.py", line 216, in _as_array_or_item
    data = np.asarray(data)
  File "C:\Anaconda3\envs\jaws\lib\site-packages\numpy\core\numeric.py", line 492, in asarray
    return array(a, dtype, copy=False, order=order)
  File "c:\src\xarray\xarray\core\indexing.py", line 572, in __array__
    self._ensure_cached()
  File "c:\src\xarray\xarray\core\indexing.py", line 569, in _ensure_cached
    self.array = NumpyIndexingAdapter(np.asarray(self.array))
  File "C:\Anaconda3\envs\jaws\lib\site-packages\numpy\core\numeric.py", line 492, in asarray
    return array(a, dtype, copy=False, order=order)
  File "c:\src\xarray\xarray\core\indexing.py", line 553, in __array__
    return np.asarray(self.array, dtype=dtype)
  File "C:\Anaconda3\envs\jaws\lib\site-packages\numpy\core\numeric.py", line 492, in asarray
    return array(a, dtype, copy=False, order=order)
  File "c:\src\xarray\xarray\core\indexing.py", line 520, in __array__
    return np.asarray(array[self.key], dtype=None)
  File "c:\src\xarray\xarray\conventions.py", line 134, in __getitem__
    return np.asarray(self.array[key], dtype=self.dtype)
  File "c:\src\xarray\xarray\coding\variables.py", line 71, in __getitem__
    return self.func(self.array[key])
  File "c:\src\xarray\xarray\coding\variables.py", line 140, in _apply_mask
    data = np.asarray(data, dtype=dtype)
  File "C:\Anaconda3\envs\jaws\lib\site-packages\numpy\core\numeric.py", line 492, in asarray
    return array(a, dtype, copy=False, order=order)
  File "c:\src\xarray\xarray\core\indexing.py", line 520, in __array__
    return np.asarray(array[self.key], dtype=None)
  File "c:\src\xarray\xarray\backends\pydap_.py", line 33, in __getitem__
    result = robust_getitem(array, key, catch=ValueError)
  File "c:\src\xarray\xarray\backends\common.py", line 67, in robust_getitem
    return array[key]
  File "C:\src\pydap\src\pydap\model.py", line 320, in __getitem__
    out.data = self._get_data_index(index)
  File "C:\src\pydap\src\pydap\model.py", line 350, in _get_data_index
    return self._data[index]
  File "C:\src\pydap\src\pydap\handlers\dap.py", line 149, in __getitem__
    return dataset[self.id].data
  File "C:\src\pydap\src\pydap\model.py", line 426, in __getitem__
    return self._getitem_string(key)
  File "C:\src\pydap\src\pydap\model.py", line 410, in _getitem_string
    return self[splitted[0]]['.'.join(splitted[1:])]
  File "C:\src\pydap\src\pydap\model.py", line 320, in __getitem__
    out.data = self._get_data_index(index)
  File "C:\src\pydap\src\pydap\model.py", line 350, in _get_data_index
    return self._data[index]
IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices
```

Expected Output

Expecting to see the value of variable 'tlml' at these coordinates.

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 61 Stepping 4, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en
LOCALE: None.None

xarray: 0.10.0+dev44.g0a0593d
pandas: 0.22.0
numpy: 1.14.0
scipy: 1.0.0
netCDF4: 1.3.1
h5netcdf: 0.5.0
Nio: None
zarr: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.16.1
distributed: 1.20.2
matplotlib: None
cartopy: None
seaborn: None
setuptools: 38.4.0
pip: 9.0.1
conda: None
pytest: None
IPython: 6.2.1
sphinx: 1.6.6

#1857 · issue · AttributeError: '<class 'pydap.model.GridType'>' object has no attribute 'shape' · ghost (user 10137) · state: closed · 6 comments · created 2018-01-25 · closed 2018-01-26

Code Sample, a copy-pastable example if possible

```python
import xarray as xa
from pydap.cas.urs import setup_session

url = 'https://goldsmr4.gesdisc.eosdis.nasa.gov/dods/M2T1NXFLX'
session = setup_session(username='****', password='****', check_url=url)
store = xa.backends.PydapDataStore.open(url, session=session)
ds = xa.open_dataset(store)
```

Problem description

I was trying to connect to NASA MERRA-2 data through OPeNDAP, following the documentation here: http://xarray.pydata.org/en/stable/io.html#OPeNDAP. I was able to get past a previous bug (#1775) by installing the latest master version of xarray.

Expected Output

Expecting the collection (M2T1NXFLX) content to show up as an xarray dataset.

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 61 Stepping 4, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

xarray: 0.10.0+dev44.g0a0593d
pandas: 0.22.0
numpy: 1.14.0
scipy: 1.0.0
netCDF4: None
h5netcdf: None
Nio: None
zarr: None
bottleneck: None
cyordereddict: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
setuptools: 38.4.0
pip: 9.0.1
conda: None
pytest: None
IPython: None
sphinx: None
#1486 · issue · boolean indexing · ghost (user 10137) · state: closed · 2 comments · created 2017-07-24 · closed 2017-09-07

I am trying to figure out how boolean indexing works in xarray. I have a couple of data arrays:

  • X_day[latitude, longitude, time]
  • X_night[latitude, longitude, time]
  • Rule[latitude, longitude]

I want to merge X_day and X_night into a new X based on Rule. First I make a copy of X_day to be X:

    X = X_day

Then I tried:

    X[Rule==True, :] = X_night

and this:

    X.values[Rule==True, :] = X_night.values

and also this:

    X.where(Rule==True) = X_night

None of these assignments worked. Please help.
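
A hedged sketch of the merge being asked for, using xr.where instead of boolean assignment; the dimension sizes and names here are stand-ins:

```python
import numpy as np
import xarray as xr

# Stand-in shapes; the real arrays are [latitude, longitude, time].
rule = xr.DataArray(
    np.array([[True, False], [False, True]]), dims=("latitude", "longitude")
)
x_day = xr.DataArray(np.zeros((2, 2, 3)), dims=("latitude", "longitude", "time"))
x_night = xr.DataArray(np.ones((2, 2, 3)), dims=("latitude", "longitude", "time"))

# Take X_night where Rule is True and X_day elsewhere; the 2-D condition
# broadcasts across the time dimension automatically.
x = xr.where(rule, x_night, x_day)
```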

#1484 · issue · Matrix cross product in xarray · ghost (user 10137) · state: closed · 4 comments · created 2017-07-21 · closed 2017-07-24

Hi, I am new to xarray and need some advice on one task. I need to do a cross product calculation using variables from a netCDF file and write the output to a new netCDF file. I feel this could be done using netcdf-python and pandas, but I hope to use xarray to simplify the task. My code will be something like this:

```
ds = xr.open_dataset(NC_FILE)
var1 = ds['VAR1']
var2 = ds['VAR2']
var3 = ds['VAR3']
var4 = ds['VAR4']
```

var1-4 above will have dimensions [latitude, longitude]. I will use var1-4 to generate a matrix of dimensions [Nlat, Nlon, M], something like: [var1, var1-var2, var1-var3, (var1-var2)*np.cos(var4)] (here, M=4).

My question here is, how do I build this matrix in xarray Dataset? Since this matrix will be eventually used to cross product with another matrix (pd.DataFrame) of dimensions [M, K], is it better to convert var1-4 to pd.DataFrame first?

The following code will be something like this:

```
matrix = [var1, var1-var2, var1-var3, (var1-var2)*np.cos(var4)]  # Nlat x Nlon x M
factor = pd.read_csv(somefile)  # M x K
result = pd.DataFrame.dot(matrix, factor)  # Nlat x Nlon x K
result2 = xr.Dataset(result)
result2.to_netcdf(outfile)
```

Can someone show me the correct code to build the Nlat x Nlon x M matrix? Can the cross product be done in xr.Dataset to avoid conversion to and from pd.DataFrame?

Thank you, Xin
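
A hedged sketch of one way to do this entirely in xarray, with random stand-in data: xr.concat stacks the four expressions along a new "M" dimension, and a multiply-plus-sum contracts over M (xr.dot is an alternative), so no round trip through pd.DataFrame is needed.

```python
import numpy as np
import xarray as xr

# Random stand-ins for the four variables read from the netCDF file.
rng = np.random.default_rng(0)
dims = ("latitude", "longitude")
var1, var2, var3, var4 = (
    xr.DataArray(rng.random((2, 3)), dims=dims) for _ in range(4)
)

# Stack the four expressions along a new "M" dimension: (M, Nlat, Nlon).
matrix = xr.concat(
    [var1, var1 - var2, var1 - var3, (var1 - var2) * np.cos(var4)], dim="M"
)

# A (M, K) factor matrix, e.g. loaded from CSV in the real code.
factor = xr.DataArray(rng.random((4, 5)), dims=("M", "K"))

# Contract over M; the result has dims (latitude, longitude, K).
result = (matrix * factor).sum(dim="M")
result.to_dataset(name="result").to_netcdf("outfile.nc")
```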

#505 · issue · Resampling drops datavars with unsigned integer datatypes · ghost (user 10137) · state: closed · 1 comment · created 2015-07-31 · closed 2015-07-31

If a variable has an unsigned integer type (uint16, uint32, etc.), resampling time will drop that variable. Does not occur with signed integer types (int16, etc.).

```
import numpy as np
import pandas as pd
import xray

numbers = np.arange(1, 6).astype('uint32')
ds = xray.Dataset(
    {'numbers': ('time', numbers)},
    coords={'time': pd.date_range('2000-01-01', periods=5)})
resampled = ds.resample('24H', dim='time')
assert 'numbers' not in resampled
```

#448 · issue · asarray Compatibility · ghost (user 10137) · state: closed · 3 comments · created 2015-06-29 · closed 2015-06-30

To "numpify" a function, usually asarray is used:

```python
def db2w(arr):
    return 10 ** (np.asarray(arr) / 20.0)
```

Now you could replace the divide with np.divide, but it seems much simpler to use np.asarray. Unfortunately, if you use any function that has been "vectorized" this way, it will only return the values of the DataArray as an ndarray. This strips the object of any xray metadata and severely limits the usefulness of the class. It requires that any function that wants to work seamlessly with a DataArray explicitly check that it's an instance of DataArray.

This seems counter-intuitive to the numpy framework where any function, once properly vectorized, can work with python scalars (int, float) or list types (tuple, list) as well as the actual ndarray class. It would be awesome if code that worked for these cases could just work for DataArrays.
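
A hedged sketch of the duck-typing alternative the paragraph above points toward: skip np.asarray entirely and rely on the arithmetic operators, which Python scalars, ndarrays, and DataArrays all support, so the metadata survives.

```python
import numpy as np
import xarray as xr

def db2w(arr):
    # No np.asarray: the arithmetic dispatches to whatever type arr is.
    return 10 ** (arr / 20.0)

da = xr.DataArray(np.array([0.0, 20.0]), dims="x")
print(db2w(da))    # still a DataArray; dims survive
print(db2w(20.0))  # a plain Python float still works
```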

