issues
17 rows where user = 10137 sorted by updated_at descending
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1605108888 | I_kwDOAMm_X85frASY | 7574 | xr.open_mfdataset doesn't work with fsspec and dask | ghost 10137 | closed | 0 | 12 | 2023-03-01T14:45:56Z | 2023-09-08T00:33:41Z | 2023-09-08T00:33:41Z | NONE |

What happened?

I was trying to read multiple netCDF files supplied as byte streams (which requires the h5netcdf engine) with xr.open_mfdataset and parallel=True to leverage dask.delayed (parallel=False works, though), but it failed (a sketch of the working serial variant follows this row). The netCDF files were noaa-goes16 satellite images, but I can't tell if that matters.

What did you expect to happen?

It should have loaded all the netCDF files into an xarray.Dataset object.

Minimal Complete Verifiable Example

```python
import fsspec
import xarray as xr

paths = [
    's3://noaa-goes16/ABI-L2-LSTC/2022/185/03/OR_ABI-L2-LSTC-M6_G16_s20221850301180_e20221850303553_c20221850305091.nc',
    's3://noaa-goes16/ABI-L2-LSTC/2022/185/02/OR_ABI-L2-LSTC-M6_G16_s20221850201180_e20221850203553_c20221850205142.nc'
]

fs = fsspec.filesystem('s3')

xr.open_mfdataset(
    [fs.open(path, mode="rb") for path in paths],
    engine="h5netcdf",
    combine="nested",
    concat_dim="t",
    parallel=True
)
```

MVCE confirmation
Relevant log output```PythonKeyError Traceback (most recent call last) File ~/miniconda3/envs/rxr/lib/python3.11/site-packages/xarray/backends/file_manager.py:210, in CachingFileManager._acquire_with_cache_info(self, needs_lock) 209 try: --> 210 file = self._cache[self._key] 211 except KeyError: File ~/miniconda3/envs/rxr/lib/python3.11/site-packages/xarray/backends/lru_cache.py:56, in LRUCache.getitem(self, key) 55 with self._lock: ---> 56 value = self._cache[key] 57 self._cache.move_to_end(key) ?[0;31mKeyError?[0m: [<class 'h5netcdf.core.File'>, ((b'\x89HDF\r\n', b'\x1a\n', b'\x02\x08\x08\x00\x00\x00 ... EXTREMELY STRING ... 00\x00\x00\x00\x00\x00\x0ef'] During handling of the above exception, another exception occurred: TypeError Traceback (most recent call last) Cell In[9], line 11 4 paths = [ 5 's3://noaa-goes16/ABI-L2-LSTC/2022/185/03/OR_ABI-L2-LSTC-M6_G16_s20221850301180_e20221850303553_c20221850305091.nc', 6 's3://noaa-goes16/ABI-L2-LSTC/2022/185/02/OR_ABI-L2-LSTC-M6_G16_s20221850201180_e20221850203553_c20221850205142.nc' 7 ] 9 fs = fsspec.filesystem('s3') ---> 11 xr.open_mfdataset( 12 [fs.open(path, mode="rb") for path in paths], 13 engine="h5netcdf", 14 combine="nested", 15 concat_dim="t", 16 parallel=True 17 ).LST File ~/miniconda3/envs/rxr/lib/python3.11/site-packages/xarray/backends/api.py:991, in open_mfdataset(paths, chunks, concat_dim, compat, preprocess, engine, data_vars, coords, combine, parallel, join, attrs_file, combine_attrs, **kwargs) 986 datasets = [preprocess(ds) for ds in datasets] 988 if parallel: 989 # calling compute here will return the datasets/file_objs lists, 990 # the underlying datasets will still be stored as dask arrays --> 991 datasets, closers = dask.compute(datasets, closers) 993 # Combine all datasets, closing them in case of a ValueError 994 try: File ~/miniconda3/envs/rxr/lib/python3.11/site-packages/dask/base.py:599, in compute(traverse, optimize_graph, scheduler, get, args, kwargs) 596 keys.append(x.dask_keys()) 597 postcomputes.append(x.dask_postcompute()) --> 599 results = schedule(dsk, keys, kwargs) 600 return repack([f(r, a) for r, (f, a) in zip(results, postcomputes)]) File ~/miniconda3/envs/rxr/lib/python3.11/site-packages/dask/threaded.py:89, in get(dsk, keys, cache, num_workers, pool, kwargs) 86 elif isinstance(pool, multiprocessing.pool.Pool): 87 pool = MultiprocessingPoolExecutor(pool) ---> 89 results = get_async( 90 pool.submit, 91 pool._max_workers, 92 dsk, 93 keys, 94 cache=cache, 95 get_id=_thread_get_id, 96 pack_exception=pack_exception, 97 kwargs, 98 ) 100 # Cleanup pools associated to dead threads 101 with pools_lock: File ~/miniconda3/envs/rxr/lib/python3.11/site-packages/dask/local.py:511, in get_async(submit, num_workers, dsk, result, cache, get_id, rerun_exceptions_locally, pack_exception, raise_exception, callbacks, dumps, loads, chunksize, **kwargs) 509 _execute_task(task, data) # Re-execute locally 510 else: --> 511 raise_exception(exc, tb) 512 res, worker_id = loads(res_info) 513 state["cache"][key] = res File ~/miniconda3/envs/rxr/lib/python3.11/site-packages/dask/local.py:319, in reraise(exc, tb) 317 if exc.traceback is not tb: 318 raise exc.with_traceback(tb) --> 319 raise exc File ~/miniconda3/envs/rxr/lib/python3.11/site-packages/dask/local.py:224, in execute_task(key, task_info, dumps, loads, get_id, pack_exception) 222 try: 223 task, data = loads(task_info) --> 224 result = _execute_task(task, data) 225 id = get_id() 226 result = dumps((result, id)) File 
~/miniconda3/envs/rxr/lib/python3.11/site-packages/dask/core.py:119, in _execute_task(arg, cache, dsk) 115 func, args = arg[0], arg[1:] 116 # Note: Don't assign the subtask results to a variable. numpy detects 117 # temporaries by their reference count and can execute certain 118 # operations in-place. --> 119 return func(*(_execute_task(a, cache) for a in args)) 120 elif not ishashable(arg): 121 return arg File ~/miniconda3/envs/rxr/lib/python3.11/site-packages/dask/utils.py:73, in apply(func, args, kwargs)
42 """Apply a function given its positional and keyword arguments.
43
44 Equivalent to TypeError: expected str, bytes or os.PathLike object, not tuple ``` Anything else we need to know?No response Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.11.0 | packaged by conda-forge | (main, Jan 15 2023, 05:44:48) [Clang 14.0.6 ]
python-bits: 64
OS: Darwin
OS-release: 21.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: (None, 'UTF-8')
libhdf5: 1.12.2
libnetcdf: None
xarray: 2023.2.0
pandas: 1.5.3
numpy: 1.24.2
scipy: 1.10.1
netCDF4: None
pydap: None
h5netcdf: 1.1.0
h5py: 3.8.0
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.3.6
cfgrib: None
iris: None
bottleneck: None
dask: 2023.2.1
distributed: 2023.2.1
matplotlib: 3.7.0
cartopy: 0.21.1
seaborn: 0.12.2
numbagg: None
fsspec: 2023.1.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 67.4.0
pip: 23.0.1
conda: None
pytest: 7.2.1
mypy: None
IPython: 8.10.0
sphinx: None
/Users/jo/miniconda3/envs/rxr/lib/python3.11/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
warnings.warn("Setuptools is replacing distutils.")
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7574/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
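A minimal sketch of the serial variant that the report above says does work (parallel=False), for contrast with the failing parallel=True call. The S3 paths are the ones from the issue; anonymous access (anon=True) is an assumption, not something stated in the report.

```python
import fsspec
import xarray as xr

paths = [
    's3://noaa-goes16/ABI-L2-LSTC/2022/185/03/OR_ABI-L2-LSTC-M6_G16_s20221850301180_e20221850303553_c20221850305091.nc',
    's3://noaa-goes16/ABI-L2-LSTC/2022/185/02/OR_ABI-L2-LSTC-M6_G16_s20221850201180_e20221850203553_c20221850205142.nc',
]

fs = fsspec.filesystem('s3', anon=True)  # anon=True is an assumption

# parallel=False opens the files sequentially in the main process, so the
# open file objects never have to round-trip through dask.delayed.
ds = xr.open_mfdataset(
    [fs.open(path, mode="rb") for path in paths],
    engine="h5netcdf",
    combine="nested",
    concat_dim="t",
    parallel=False,
)
```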
548263148 | MDU6SXNzdWU1NDgyNjMxNDg= | 3684 | open_mfdataset - different behavior with dask.distributed.LocalCluster | ghost 10137 | open | 0 | 3 | 2020-01-10T19:58:19Z | 2023-09-05T10:56:23Z | NONE |

Big fan of Xarray! Not that familiar with submitting tickets like this, so my apologies for rule breaking. Also, if this belongs over in the dask project, I can move it there.

dask 2.6.0, numpy 1.17.3, xarray 0.14.1, netCDF4 1.5.3

I am attempting to use open_mfdataset on nc files I've generated through dask/xarray after initializing the dask LocalCluster. I've found that I am able to compute successfully when I don't run the distributed cluster, but if I do, I get a variety of issues. I've got a synthetic data generating example here. Running soundspeed.compute() will sometimes succeed and will sometimes cause worker restarts resulting in HDF errors and no return. I was thinking it was something with serialization; I've seen other tickets with similar issues, but I don't see how it applies to my test case.

Example code:

```python
import numpy as np
import xarray as xr
import os
from dask.distributed import Client

cl = Client()
outpth = r'D:\dasktest\data_dir\EM2040\converted\test'
mint = 0
maxt = 1000
for i in range(100):
    times = np.arange(mint, maxt)
    beams = np.arange(250)
    sectors = ['40107_0_260000', '40107_1_320000', '40107_2_290000']
    soundspeed = np.random.randn(1000, 3, 250)
    ds = xr.Dataset({'soundspeed': (('time', 'sectors', 'beams'), soundspeed)},
                    {'time': times, 'sectors': sectors, 'beams': beams},)
    ds.to_netcdf(os.path.join(outpth, 'test{}.nc'.format(i)), mode='w')
    mint = maxt
    maxt += 1000

fils = [os.path.join(outpth, x) for x in os.listdir(outpth) if os.path.splitext(x)[1] == '.nc']
tst = xr.open_mfdataset(fils, concat_dim='time', combine='nested')
tst.soundspeed.compute()
```

I've found that running this example with <10 files reduces the number of errors I'm getting dramatically. I've tried this on different machines in different domain environments just to be sure. I really just want to make sure I'm not making a silly mistake somewhere. Appreciate the help. (A small scheduler-comparison sketch follows this row.)

My last run on actual data:

```python
Traceback (most recent call last): File "<stdin>", line 1, in <module> File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\core\dataarray.py", line 837, in compute return new.load(kwargs) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\core\dataarray.py", line 811, in load ds = self._to_temp_dataset().load(kwargs) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\core\dataset.py", line 649, in load evaluated_data = da.compute(lazy_data.values(), kwargs) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\dask\base.py", line 436, in compute results = schedule(dsk, keys, kwargs) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\distributed\client.py", line 2545, in get results = self.gather(packed, asynchronous=asynchronous, direct=direct) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\distributed\client.py", line 1845, in gather asynchronous=asynchronous, File "C:\PydroXL_19\envs\dasktest\lib\site-packages\distributed\client.py", line 762, in sync self.loop, func, args, callback_timeout=callback_timeout, kwargs File "C:\PydroXL_19\envs\dasktest\lib\site-packages\distributed\utils.py", line 333, in sync raise exc.with_traceback(tb) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\distributed\utils.py", line 317, in f result[0] = yield future File "C:\PydroXL_19\envs\dasktest\lib\site-packages\tornado\gen.py", line 735, in run value = future.result() File "C:\PydroXL_19\envs\dasktest\lib\site-packages\distributed\client.py", line 1701, in gather raise exception.with_traceback(traceback) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\dask\array\core.py", line 106, in getter c = np.asarray(c) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\numpy\core_asarray.py", line 85, in asarray return array(a, dtype, copy=False, order=order) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\core\indexing.py", line 481, in array return np.asarray(self.array, dtype=dtype) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\numpy\core_asarray.py", line 85, in asarray return array(a, dtype, copy=False, order=order) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\core\indexing.py", line 643, in array return np.asarray(self.array, dtype=dtype) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\numpy\core_asarray.py", line 85, in asarray return array(a, dtype, copy=False, order=order) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\core\indexing.py", line 547, in array return np.asarray(array[self.key], dtype=None) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\backends\netCDF4.py", line 72, in getitem key, self.shape, indexing.IndexingSupport.OUTER, self.getitem File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\core\indexing.py", line 827, in explicit_indexing_adapter result = raw_indexing_method(raw_key.tuple) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\backends\netCDF4.py", line 83, in getitem original_array = self.get_array(needs_lock=False) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\backends\netCDF4.py", line 62, in get_array ds = self.datastore.acquire(needs_lock) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\backends\netCDF4.py", line 360, in _acquire with self._manager.acquire_context(needs_lock) as root: File "C:\PydroXL_19\envs\dasktest\lib\contextlib.py", line 81, in enter return next(self.gen) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\backends\file_manager.py", line 186, in acquire_context file, cached = 
self._acquire_with_cache_info(needs_lock) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\backends\file_manager.py", line 204, in _acquire_with_cache_info file = self._opener(*self._args, kwargs) File "netCDF4_netCDF4.pyx", line 2321, in netCDF4._netCDF4.Dataset.init File "netCDF4_netCDF4.pyx", line 1885, in netCDF4._netCDF4._ensure_nc_success OSError: [Errno -101] NetCDF: HDF error: b'D:\dasktest\data_dir\EM2040\converted\rangeangle_20.nc' ``` My last run on the synthetic data set generated above: ```python
distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FCB82D0>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error') distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FCB8240>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error') distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FCB81F8>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error') distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FCB81B0>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error') distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FCB8360>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error') distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FCB83A8>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error') distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FCB8510>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error') distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FCB8750>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), 
(slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error') distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FCB8990>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error') distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FCB8BD0>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error') distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FCB8E10>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error') distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FC9D090>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error') distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FC9D2D0>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error') distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FC9D510>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error') distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FC9D750>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error') distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 
0x000001BB5FC9DC18>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error') distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FC9DBD0>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error') distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FC9DCA8>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error') Traceback (most recent call last): File "<stdin>", line 1, in <module> distributed.worker - WARNING - Compute Failed Function: getter args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x000001BB5FC9DD38>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(0, 1000, None), slice(0, 3, None), slice(0, 250, None))) kwargs: {} Exception: OSError(-101, 'NetCDF: HDF error') File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\core\dataarray.py", line 837, in compute return new.load(kwargs) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\core\dataarray.py", line 811, in load ds = self._to_temp_dataset().load(kwargs) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\core\dataset.py", line 649, in load evaluated_data = da.compute(lazy_data.values(), kwargs) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\dask\base.py", line 436, in compute results = schedule(dsk, keys, kwargs) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\distributed\client.py", line 2545, in get results = self.gather(packed, asynchronous=asynchronous, direct=direct) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\distributed\client.py", line 1845, in gather asynchronous=asynchronous, File "C:\PydroXL_19\envs\dasktest\lib\site-packages\distributed\client.py", line 762, in sync self.loop, func, args, callback_timeout=callback_timeout, kwargs File "C:\PydroXL_19\envs\dasktest\lib\site-packages\distributed\utils.py", line 333, in sync raise exc.with_traceback(tb) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\distributed\utils.py", line 317, in f result[0] = yield future File "C:\PydroXL_19\envs\dasktest\lib\site-packages\tornado\gen.py", line 735, in run value = future.result() File "C:\PydroXL_19\envs\dasktest\lib\site-packages\distributed\client.py", line 1701, in gather raise exception.with_traceback(traceback) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\dask\array\core.py", line 106, in getter c = np.asarray(c) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\numpy\core_asarray.py", line 85, in asarray return array(a, dtype, copy=False, order=order) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\core\indexing.py", line 481, in array return np.asarray(self.array, 
dtype=dtype) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\numpy\core_asarray.py", line 85, in asarray return array(a, dtype, copy=False, order=order) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\core\indexing.py", line 643, in array return np.asarray(self.array, dtype=dtype) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\numpy\core_asarray.py", line 85, in asarray return array(a, dtype, copy=False, order=order) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\core\indexing.py", line 547, in array return np.asarray(array[self.key], dtype=None) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\backends\netCDF4.py", line 72, in getitem key, self.shape, indexing.IndexingSupport.OUTER, self.getitem File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\core\indexing.py", line 827, in explicit_indexing_adapter result = raw_indexing_method(raw_key.tuple) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\backends\netCDF4.py", line 83, in getitem original_array = self.get_array(needs_lock=False) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\backends\netCDF4.py", line 62, in get_array ds = self.datastore.acquire(needs_lock) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\backends\netCDF4.py", line 360, in _acquire with self._manager.acquire_context(needs_lock) as root: File "C:\PydroXL_19\envs\dasktest\lib\contextlib.py", line 81, in enter return next(self.gen) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\backends\file_manager.py", line 186, in acquire_context file, cached = self._acquire_with_cache_info(needs_lock) File "C:\PydroXL_19\envs\dasktest\lib\site-packages\xarray\backends\file_manager.py", line 204, in _acquire_with_cache_info file = self._opener(*self._args, kwargs) File "netCDF4_netCDF4.pyx", line 2321, in netCDF4._netCDF4.Dataset.init File "netCDF4_netCDF4.pyx", line 1885, in netCDF4._netCDF4._ensure_nc_success OSError: [Errno -101] NetCDF: HDF error: b'D:\dasktest\data_dir\EM2040\converted\test\test4.nc' ``` |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/3684/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
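A small debugging sketch related to #3684 above, assuming the files generated by the reporter's loop already exist in outpth: it compares the default threaded scheduler (reported to work) against a distributed LocalCluster (reported to fail intermittently with the HDF error).

```python
import os

import dask
import xarray as xr
from dask.distributed import Client

outpth = r'D:\dasktest\data_dir\EM2040\converted\test'  # path from the issue
fils = [os.path.join(outpth, x) for x in os.listdir(outpth)
        if os.path.splitext(x)[1] == '.nc']

tst = xr.open_mfdataset(fils, concat_dim='time', combine='nested')

# 1) Local threaded scheduler: the report says computing without the
#    distributed cluster completes fine.
with dask.config.set(scheduler='threads'):
    tst.soundspeed.compute()

# 2) Distributed LocalCluster: the report says this intermittently fails
#    with "OSError: [Errno -101] NetCDF: HDF error" while workers read the files.
client = Client()
tst.soundspeed.compute()
client.close()
```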
1244030662 | I_kwDOAMm_X85KJmbG | 6625 | Why am I getting 'Passing method to Float64Index.get_loc is deprecated' error when using the .sel method to extract some data, and how do I solve it? | ghost 10137 | closed | 0 | 5 | 2022-05-21T16:40:53Z | 2022-09-26T08:47:03Z | 2022-07-09T00:41:53Z | NONE | What is your issue?
`\anaconda3\lib\site-packages\xarray\core\indexes.py:234: FutureWarning: Passing method to Float64Index.get_loc is deprecated and will raise in a future version. Use index.get_indexer([item], method=...) instead.`

I don't know much about how to solve this issue; can anyone help me out, please? (A sketch of the kind of call that triggers this warning follows this row.) |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6625/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
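The triggering code isn't included in #6625 above. This is a minimal sketch, with a made-up dataset and coordinates, of the kind of .sel(..., method="nearest") call that routes through pandas Float64Index.get_loc on older xarray releases and emits the quoted FutureWarning; upgrading xarray, which no longer passes method to get_loc, typically resolves it.

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {"t2m": (("lat", "lon"), np.random.rand(3, 3))},
    coords={"lat": [10.0, 20.0, 30.0], "lon": [100.0, 110.0, 120.0]},
)

# Nearest-neighbour selection on a float coordinate: with early-2022 xarray
# and pandas 1.4, this is the kind of call that prints
# "Passing method to Float64Index.get_loc is deprecated".
point = ds["t2m"].sel(lat=12.3, lon=101.7, method="nearest")
print(point)
```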
1299316581 | I_kwDOAMm_X85Ncf9l | 6766 | xr.open_dataset(url) gives NetCDF4 (lru_cache.py) error "oc_open: Could not read url" | ghost 10137 | closed | 0 | 8 | 2022-07-08T18:15:18Z | 2022-07-11T14:49:10Z | 2022-07-11T14:49:09Z | NONE |

What is your issue?

This code I use was working about a year ago but today gives me an error:

```
Note:Caching=1
Error:curl error: SSL connect error
curl error details:
Warning:oc_open: Could not read url
```

I have confirmed that the file I am trying to read is on the server and the server is not requiring a password (nothing I am aware of about the server has changed since my code used to work successfully). I am on Windows using a conda virtual env (no pip). My xarray is 0.20.2 and my netCDF4 is 1.6.0; these are almost certainly more recent than the ones I was using when my code used to succeed, but I didn't record which version(s) used to work. It was suggested that I pin netcdf4 to 1.5.8, so I tried this but got the same error. Recently I had to update security certificates locally here, and this could be related, but I'm not sure.

Any suggestions for how I should troubleshoot this? (A small isolation sketch follows this row.) Also, should I post an issue at https://github.com/Unidata/netcdf4-python instead of, and/or in addition to, this one? I found these issues, which seem possibly related, but don't seem to be resolved well yet:

https://github.com/Unidata/netcdf4-python/issues/755
https://github.com/pydata/xarray/issues/4925

(I also opened 'discussion' #6742 but so far there has been little response there.) |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6766/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
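A small isolation sketch related to #6766 above: the quoted "curl error: SSL connect error" comes from the netCDF-C/libcurl layer, so checking the URL with netCDF4 directly shows whether xarray is involved at all. The URL here is a placeholder for the reporter's OPeNDAP endpoint.

```python
import netCDF4
import xarray as xr

url = "https://example.org/opendap/some/dataset"  # placeholder, not the real endpoint

# If this raises the same "oc_open: Could not read url" / SSL error, the problem
# lies in the netCDF-C / libcurl stack (e.g. certificate bundle), not in xarray.
nc = netCDF4.Dataset(url)
nc.close()

# The xarray call from the report, for comparison.
ds = xr.open_dataset(url)
```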
270677100 | MDExOlB1bGxSZXF1ZXN0MTUwMzA4NTg0 | 1682 | Add option “engine” | ghost 10137 | closed | 0 | 11 | 2017-11-02T14:38:07Z | 2022-04-15T02:01:28Z | 2022-04-15T02:01:28Z | NONE | 0 | pydata/xarray/pulls/1682 |

Implements a new xarray option, `engine` (per the PR title). This PR is not ready for merge yet, but I wanted to verify whether the code changes are on the right track. The default

(A sketch of the proposed usage follows this row.)
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1682/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
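A sketch of how the option proposed in #1682 might look in use. xr.set_options is a real API, but the engine key here is only this PR's proposal and may not exist as a released option, so treat it as illustrative.

```python
import xarray as xr

# Proposed usage: set a session-wide default backend engine.
with xr.set_options(engine="h5netcdf"):           # hypothetical option from this PR
    ds = xr.open_dataset("file.nc")               # would default to h5netcdf

# An explicit engine argument would still take precedence.
ds2 = xr.open_dataset("file.nc", engine="netcdf4")
```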
910844095 | MDU6SXNzdWU5MTA4NDQwOTU= | 5434 | xarray.open_rasterio | ghost 10137 | closed | 0 | 2 | 2021-06-03T20:51:38Z | 2022-04-09T01:31:26Z | 2022-04-09T01:31:26Z | NONE | Could you please change |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5434/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
853260893 | MDExOlB1bGxSZXF1ZXN0NjExMzc3OTQ0 | 5131 | Remove trailing space from DatasetGroupBy repr | ghost 10137 | closed | 0 | 1 | 2021-04-08T09:19:30Z | 2021-04-08T14:49:15Z | 2021-04-08T14:49:15Z | NONE | 0 | pydata/xarray/pulls/5131 | Remove trailing whitespace from DatasetGroupBy representation because flake8 reports it as a violation when present in doctests. Fix #5130 |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5131/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
853168658 | MDU6SXNzdWU4NTMxNjg2NTg= | 5130 | Trailing whitespace in DatasetGroupBy text representation | ghost 10137 | closed | 0 | 1 | 2021-04-08T07:39:08Z | 2021-04-08T14:49:14Z | 2021-04-08T14:49:14Z | NONE |

When displaying a DatasetGroupBy in an interactive Python session, the first line of output contains a trailing whitespace. The first example in the documentation demonstrates this (the pycon snippets are truncated in this export; a reproduction sketch follows this row). There is a trailing whitespace in the first line of output, which is "DatasetGroupBy, grouped over 'letters' ". This can be seen more clearly by converting the object to a string (note the whitespace before the line break).

While this isn't a problem in itself, it causes an issue for us because we use flake8 in continuous integration to verify that our code is correctly formatted, and we also have doctests that rely on the DatasetGroupBy textual representation. Flake8 reports a violation on the trailing whitespaces in our docstrings. If we remove the trailing whitespaces, our doctests fail because the expected output doesn't match the actual output. So we have conflicting constraints coming from our tools, which both seem reasonable.

Trailing whitespaces are forbidden by flake8 because, among other reasons, they lead to noisy git diffs. Doctest wants the expected output to be exactly the same as the actual output and considers a trailing whitespace to be a significant difference. We could configure flake8 to ignore this particular violation for the files in which we have these doctests, but this may cause other trailing whitespaces to creep into our code, which we don't want. Unfortunately it's not possible to just add

Is there a reason for having this trailing whitespace in the DatasetGroupBy representation? Would it be OK to remove it? If so please let me know and I can make a pull request. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5130/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
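A reproduction sketch for #5130 above, since the pycon examples were lost in the export; the dataset is made up, shaped to match the 'letters' grouping quoted in the report.

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {"foo": ("x", np.arange(4))},
    coords={"letters": ("x", ["a", "a", "b", "b"])},
)
grouped = ds.groupby("letters")

first_line = str(grouped).splitlines()[0]
print(repr(first_line))
# On affected versions the first line is "DatasetGroupBy, grouped over 'letters' "
# (note the trailing space), which is what flake8 and doctest disagree about.
print(first_line.endswith(" "))
```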
323703742 | MDU6SXNzdWUzMjM3MDM3NDI= | 2139 | From pandas to xarray without blowing up memory | ghost 10137 | closed | 0 | 15 | 2018-05-16T16:51:09Z | 2020-10-14T19:34:54Z | 2019-08-27T08:54:26Z | NONE | I have a billion rows of data, but really it's just two categorical variables, time, lat, lon and some data variables. Thinking it would somehow help me get the data into xarray, I created a five-level pandas MultiIndex out of the data, but thus far this has not been successful: xarray tries to create the full product of the index levels, and that's just not going to work. Trying to write a NetCDF file has presented its own issues, and I'm left wondering if there isn't a much simpler way to go about this? (A sketch of the dense vs. sparse conversion follows this row.) |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/2139/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
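A toy sketch of the trade-off described in #2139 above: df.to_xarray() on a MultiIndexed frame densifies to the full product of the index levels, while Dataset.from_dataframe(..., sparse=True) (available in newer xarray releases, with the sparse package installed) keeps only the observed combinations. The column and level names here are made up.

```python
import pandas as pd
import xarray as xr

# Long/tidy stand-in for the billion-row data: two categoricals plus time/lat/lon.
df = pd.DataFrame(
    {
        "cat1": ["a", "a", "b"],
        "cat2": ["x", "y", "x"],
        "time": pd.to_datetime(["2018-05-01", "2018-05-01", "2018-05-02"]),
        "lat": [10.0, 10.0, 20.0],
        "lon": [100.0, 110.0, 100.0],
        "value": [1.0, 2.0, 3.0],
    }
).set_index(["cat1", "cat2", "time", "lat", "lon"])

dense = df.to_xarray()                                  # full 5-D product, mostly NaN
sparse_ds = xr.Dataset.from_dataframe(df, sparse=True)  # needs the "sparse" package

print(type(dense["value"].data))      # numpy.ndarray
print(type(sparse_ds["value"].data))  # sparse.COO
```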
454073421 | MDU6SXNzdWU0NTQwNzM0MjE= | 3007 | NaN values for variables when converting from a pandas dataframe to xarray.DataSet | ghost 10137 | closed | 0 | 5 | 2019-06-10T09:15:21Z | 2020-03-23T13:15:16Z | 2020-03-23T13:15:15Z | NONE |

Code Sample, a copy-pastable example if possible

(The code sample and its printed DataFrame are mostly lost in this export; what remains shows a DataFrame with a (lat, lon, time) MultiIndex and 10 columns, including wind_surface, hurs, bui and fwi, on which `df.to_xarray()` is called.)

Problem description

Hi, I get those NaN values for variables when I try to convert from a pandas.DataFrame with a MultiIndex to an xarray.DataArray. The same happens if I try to build an xarray.Dataset and then unstack the MultiIndex as shown below. (A reconstructed sketch follows this row.)

Regards

Output of
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/3007/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
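A reconstructed sketch of the behaviour reported in #3007 above, with made-up coordinates since the original sample is truncated: (lat, lon, time) combinations that are absent from the DataFrame come back as NaN because to_xarray() expands the MultiIndex to the full grid.

```python
import pandas as pd

df = pd.DataFrame(
    {"hurs": [0.1, 0.2, 0.3], "fwi": [1.0, 2.0, 3.0]},
    index=pd.MultiIndex.from_tuples(
        [
            (45.0, 5.0, "2019-06-01"),
            (45.0, 5.5, "2019-06-02"),
            (46.0, 5.0, "2019-06-03"),
        ],
        names=["lat", "lon", "time"],
    ),
)

ds = df.to_xarray()
# 2 lats x 2 lons x 3 times = 12 grid cells, but only 3 rows exist,
# so 9 of the 12 values of each variable are NaN.
print(int(ds["hurs"].isnull().sum()))  # 9
```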
270701183 | MDExOlB1bGxSZXF1ZXN0MTUwMzI2NzMw | 1683 | Add h5netcdf to the engine import hierarchy | ghost 10137 | closed | 0 | 2 | 2017-11-02T15:39:35Z | 2018-06-05T05:16:40Z | 2018-02-12T16:06:44Z | NONE | 0 | pydata/xarray/pulls/1683 | h5netcdf is now part of the import statements in the
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1683/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | pull | |||||
291926319 | MDU6SXNzdWUyOTE5MjYzMTk= | 1860 | IndexError when accesing a data variable through a PydapDataStore | ghost 10137 | closed | 0 | 4 | 2018-01-26T14:58:14Z | 2018-01-27T08:41:21Z | 2018-01-27T08:41:21Z | NONE | Code Sample, a copy-pastable example if possible
Problem description

I was trying to connect to NASA MERRA-2 data through OPeNDAP, following the documentation here: http://xarray.pydata.org/en/stable/io.html#OPeNDAP. Opening the dataset works fine, but trying to access a data variable throws a strange IndexError (the code sample above is truncated in this export; a usage sketch follows this row):

File "C:\Anaconda3\envs\jaws\lib\site-packages\IPython\core\formatters.py", line 702, in __call__
printer.pretty(obj)
File "C:\Anaconda3\envs\jaws\lib\site-packages\IPython\lib\pretty.py", line 395, in pretty
return default_pprint(obj, self, cycle)
File "C:\Anaconda3\envs\jaws\lib\site-packages\IPython\lib\pretty.py", line 510, in _default_pprint
_repr_pprint(obj, p, cycle)
File "C:\Anaconda3\envs\jaws\lib\site-packages\IPython\lib\pretty.py", line 701, in _repr_pprint
output = repr(obj)
File "c:\src\xarray\xarray\core\common.py", line 100, in __repr__
return formatting.array_repr(self)
File "c:\src\xarray\xarray\core\formatting.py", line 393, in array_repr
summary.append(short_array_repr(arr.values))
File "c:\src\xarray\xarray\core\dataarray.py", line 411, in values
return self.variable.values
File "c:\src\xarray\xarray\core\variable.py", line 392, in values
return _as_array_or_item(self._data)
File "c:\src\xarray\xarray\core\variable.py", line 216, in _as_array_or_item
data = np.asarray(data)
File "C:\Anaconda3\envs\jaws\lib\site-packages\numpy\core\numeric.py", line 492, in asarray
return array(a, dtype, copy=False, order=order)
File "c:\src\xarray\xarray\core\indexing.py", line 572, in __array__
self._ensure_cached()
File "c:\src\xarray\xarray\core\indexing.py", line 569, in _ensure_cached
self.array = NumpyIndexingAdapter(np.asarray(self.array))
File "C:\Anaconda3\envs\jaws\lib\site-packages\numpy\core\numeric.py", line 492, in asarray
return array(a, dtype, copy=False, order=order)
File "c:\src\xarray\xarray\core\indexing.py", line 553, in __array__
return np.asarray(self.array, dtype=dtype)
File "C:\Anaconda3\envs\jaws\lib\site-packages\numpy\core\numeric.py", line 492, in asarray
return array(a, dtype, copy=False, order=order)
File "c:\src\xarray\xarray\core\indexing.py", line 520, in __array__
return np.asarray(array[self.key], dtype=None)
File "c:\src\xarray\xarray\conventions.py", line 134, in __getitem__
return np.asarray(self.array[key], dtype=self.dtype)
File "c:\src\xarray\xarray\coding\variables.py", line 71, in __getitem__
return self.func(self.array[key])
File "c:\src\xarray\xarray\coding\variables.py", line 140, in _apply_mask
data = np.asarray(data, dtype=dtype)
File "C:\Anaconda3\envs\jaws\lib\site-packages\numpy\core\numeric.py", line 492, in asarray
return array(a, dtype, copy=False, order=order)
File "c:\src\xarray\xarray\core\indexing.py", line 520, in __array__
return np.asarray(array[self.key], dtype=None)
File "c:\src\xarray\xarray\backends\pydap.py", line 33, in getitem
result = robust_getitem(array, key, catch=ValueError)
File "c:\src\xarray\xarray\backends\common.py", line 67, in robust_getitem
return array[key]
File "C:\src\pydap\src\pydap\model.py", line 320, in getitem
out.data = self._get_data_index(index)
File "C:\src\pydap\src\pydap\model.py", line 350, in _get_data_index
return self._data[index]
File "C:\src\pydap\src\pydap\handlers\dap.py", line 149, in getitem
return dataset[self.id].data
File "C:\src\pydap\src\pydap\model.py", line 426, in getitem
return self._getitem_string(key)
File "C:\src\pydap\src\pydap\model.py", line 410, in _getitem_string
return self[splitted[0]]['.'.join(splitted[1:])]
File "C:\src\pydap\src\pydap\model.py", line 320, in getitem
out.data = self._get_data_index(index)
File "C:\src\pydap\src\pydap\model.py", line 350, in _get_data_index
return self._data[index]
IndexError: only integers, slices (

Expected Output

Expecting to see the value of variable 'tlml' at these coordinates.

Output of
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1860/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
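The code sample for #1860 above is truncated; this is a sketch of the session-based PydapDataStore access the OPeNDAP docs describe and the traceback implies. The URL and credentials are placeholders, and accessing ds["tlml"] is the step the report says raised the IndexError.

```python
import xarray as xr
from pydap.cas.urs import setup_session

url = "https://an-opendap-server/dods/MERRA2-collection"   # placeholder
session = setup_session("urs_username", "urs_password", check_url=url)

store = xr.backends.PydapDataStore.open(url, session=session)
ds = xr.open_dataset(store)   # opening works, per the report

ds["tlml"]                    # accessing a data variable is what raised the IndexError
```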
291524555 | MDU6SXNzdWUyOTE1MjQ1NTU= | 1857 | AttributeError: '<class 'pydap.model.GridType'>' object has no attribute 'shape' | ghost 10137 | closed | 0 | 6 | 2018-01-25T10:42:20Z | 2018-01-26T13:25:07Z | 2018-01-26T13:25:07Z | NONE | Code Sample, a copy-pastable example if possible
Problem description

I was trying to connect to NASA MERRA-2 data through OPeNDAP, following the documentation here: http://xarray.pydata.org/en/stable/io.html#OPeNDAP. I was able to get through a previous bug (#1775) by installing the latest master version of xarray.

Expected Output

Expecting the collection (M2T1NXFLX) content to show up as an xarray dataset.

Output of
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1857/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
245186903 | MDU6SXNzdWUyNDUxODY5MDM= | 1486 | boolean indexing | ghost 10137 | closed | 0 | 2 | 2017-07-24T19:39:42Z | 2017-09-07T08:06:49Z | 2017-09-07T08:06:48Z | NONE | I am trying to figure out how boolean indexing works in xarray.
I have a couple of data arrays below:
I want to merge X_day and X_night into a new X based on Rule (a sketch of one way to do this follows this row).
First I make a copy of X_day to be X:
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1486/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
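The data arrays in #1486 above are truncated; this is a sketch of one way to express the described merge with xr.where, using made-up shapes and values.

```python
import numpy as np
import xarray as xr

time = np.arange(6)
X_day = xr.DataArray(np.random.rand(6), dims="time", coords={"time": time})
X_night = xr.DataArray(np.random.rand(6), dims="time", coords={"time": time})
Rule = xr.DataArray(np.array([True, False, True, True, False, False]),
                    dims="time", coords={"time": time})

# Take X_day where Rule is True and X_night everywhere else.
X = xr.where(Rule, X_day, X_night)
print(X)
```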
244702576 | MDU6SXNzdWUyNDQ3MDI1NzY= | 1484 | Matrix cross product in xarray | ghost 10137 | closed | 0 | 4 | 2017-07-21T15:21:37Z | 2017-07-24T16:36:22Z | 2017-07-24T16:36:22Z | NONE |

Hi, I am new to xarray and need some advice on one task. I need to do a cross product calculation using variables from a netCDF file and write the output to a new netCDF file. I feel this could be done using netcdf-python and pandas, but I hope to use xarray to simplify the task. My code will be something like this:

ds = xr.open_dataset(NC_FILE)
var1 = ds['VAR1']
var2 = ds['VAR2']
var3 = ds['VAR3']
var4 = ds['VAR4']

var1-4 above will have dimensions [latitude, longitude]. I will use var1-4 to generate a matrix of dimensions [Nlat, Nlon, M], something like: [var1, var1-var2, var1-var3, (var1-var2)*np.cos(var4)] (here, M=4). My question here is, how do I build this matrix in an xarray Dataset? Since this matrix will eventually be multiplied with another matrix (pd.DataFrame) of dimensions [M, K], is it better to convert var1-4 to pd.DataFrame first? The following code would be something like this:

matrix = [var1, var1-var2, var1-var3, (var1-var2)*np.cos(var4)]  # Nlat x Nlon x M
factor = pd.read_csv(somefile)  # M x K
result = pd.DataFrame.dot(matrix, factor)  # Nlat x Nlon x K
result2 = xr.Dataset(result)
result2.to_netcdf(outfile)

Can someone show me the correct code to build the Nlat x Nlon x M matrix? Can the product be computed within xarray to avoid conversion to and from pd.DataFrame? (A sketch follows this row.) Thank you, Xin |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1484/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
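A sketch of one way to do the computation asked about in #1484 above entirely in xarray: stack the derived fields along a new dimension with xr.concat and contract them against the factor table with xr.dot, so no round-trip through pandas is needed. The variable names mirror the pseudo-code; the file names and CSV layout are assumptions.

```python
import numpy as np
import pandas as pd
import xarray as xr

ds = xr.open_dataset("input.nc")  # stands in for NC_FILE
var1, var2, var3, var4 = ds["VAR1"], ds["VAR2"], ds["VAR3"], ds["VAR4"]

# Stack the four derived fields along a new "m" dimension: (m, latitude, longitude).
matrix = xr.concat(
    [var1, var1 - var2, var1 - var3, (var1 - var2) * np.cos(var4)],
    dim="m",
)

# Factor table of shape (M, K), read from CSV into a labelled DataArray.
factor = xr.DataArray(pd.read_csv("factors.csv", header=None).values,
                      dims=("m", "k"))

result = xr.dot(matrix, factor, dims="m")            # dims: (latitude, longitude, k)
result.to_dataset(name="result").to_netcdf("output.nc")
```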
98442885 | MDU6SXNzdWU5ODQ0Mjg4NQ== | 505 | Resampling drops datavars with unsigned integer datatypes | ghost 10137 | closed | 0 | 1 | 2015-07-31T18:04:51Z | 2015-07-31T19:44:32Z | 2015-07-31T19:44:32Z | NONE |

If a variable has an unsigned integer type (uint16, uint32, etc.), resampling in time will drop that variable. This does not occur with signed integer types (int16, etc.).

```
import numpy as np
import pandas as pd
import xray

numbers = np.arange(1, 6).astype('uint32')

ds = xray.Dataset(
    {'numbers': ('time', numbers)},
    coords={'time': pd.date_range('2000-01-01', periods=5)})

resampled = ds.resample('24H', dim='time')
assert 'numbers' not in resampled
```

(A possible workaround sketch follows this row.) |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/505/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
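A possible workaround sketch for #505 above, mirroring the report's (old) xray API; casting the unsigned variable to a signed dtype before resampling is an assumption about what keeps it from being dropped, not a confirmed fix.

```python
import numpy as np
import pandas as pd
import xray  # the library's name at the time of this report

numbers = np.arange(1, 6).astype('uint32')
ds = xray.Dataset(
    {'numbers': ('time', numbers)},
    coords={'time': pd.date_range('2000-01-01', periods=5)})

# Hypothetical workaround: cast to a signed integer dtype first.
ds['numbers'] = ds['numbers'].astype('int64')

resampled = ds.resample('24H', dim='time')
assert 'numbers' in resampled
```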
91676831 | MDU6SXNzdWU5MTY3NjgzMQ== | 448 | asarray Compatibility | ghost 10137 | closed | 0 | 3 | 2015-06-29T02:45:25Z | 2015-06-30T23:02:57Z | 2015-06-30T23:02:57Z | NONE |

To "numpify" a function, usually asarray is used:

def db2w(arr):
    return 10 ** (np.asarray(arr) / 20.0)

Now you could replace the divide with np.divide, but it seems much simpler to use np.asarray. Unfortunately, if you use any function that has been "vectorized" this way, it will only return the values of the DataArray as an ndarray. This strips the object of any "xray" metadata and severely limits the use of this class. It requires that any function that wants to work seamlessly with a DataArray explicitly check that it's an instance of DataArray. This seems counter-intuitive to the numpy framework, where any function, once properly vectorized, can work with Python scalars (int, float) or list types (tuple, list) as well as the actual ndarray class. It would be awesome if code that worked for these cases could just work for DataArrays. (A sketch of the described behavior follows this row.) |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/448/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue |
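A small sketch of the behaviour described in #448 above, using the report's db2w function; it contrasts np.asarray (which returns a plain ndarray) with ordinary arithmetic on the DataArray (which keeps dims and coords). It is written against the modern xarray package rather than the old xray name.

```python
import numpy as np
import xarray as xr

def db2w(arr):
    # np.asarray coerces the input to a plain ndarray, dropping DataArray metadata.
    return 10 ** (np.asarray(arr) / 20.0)

da = xr.DataArray([0.0, 20.0], dims="x", coords={"x": [1, 2]})

print(type(db2w(da)))            # numpy.ndarray: dims and coords are lost
print(type(10 ** (da / 20.0)))   # xarray DataArray: metadata is preserved
```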
CREATE TABLE [issues] (
    [id] INTEGER PRIMARY KEY,
    [node_id] TEXT,
    [number] INTEGER,
    [title] TEXT,
    [user] INTEGER REFERENCES [users]([id]),
    [state] TEXT,
    [locked] INTEGER,
    [assignee] INTEGER REFERENCES [users]([id]),
    [milestone] INTEGER REFERENCES [milestones]([id]),
    [comments] INTEGER,
    [created_at] TEXT,
    [updated_at] TEXT,
    [closed_at] TEXT,
    [author_association] TEXT,
    [active_lock_reason] TEXT,
    [draft] INTEGER,
    [pull_request] TEXT,
    [body] TEXT,
    [reactions] TEXT,
    [performed_via_github_app] TEXT,
    [state_reason] TEXT,
    [repo] INTEGER REFERENCES [repos]([id]),
    [type] TEXT
);
CREATE INDEX [idx_issues_repo] ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone] ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee] ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user] ON [issues] ([user]);