issues
10 rows where comments = 4, repo = 13221727 and "updated_at" is on date 2022-04-09 sorted by updated_at descending
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at ▲ | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
643035732 | MDU6SXNzdWU2NDMwMzU3MzI= | 4169 | "write to read-only" Error in xarray.open_mfdataset() when trying to write to a netcdf file | EliT1626 65610153 | closed | 0 | 4 | 2020-06-22T12:35:57Z | 2022-04-09T15:50:51Z | 2022-04-09T15:50:51Z | NONE | Code Sample
```
xr.set_options(file_cache_maxsize=10)

# Assumes daily increments
def list_dates(start, end):
    num_days = (end - start).days
    return [start + dt.timedelta(days=x) for x in range(num_days)]

def list_dates1(start, end):
    num_days = (end - start).days
    dates = [start + dt.timedelta(days=x) for x in range(num_days)]
    sorted_dates = sorted(dates, key=lambda date: (date.month, date.day))
    grouped_dates = [list(g) for _, g in groupby(sorted_dates, key=lambda date: (date.month, date.day))]
    return grouped_dates

start_date = dt.date(2010, 1, 1)
end_date = dt.date(2019, 12, 31)
date_list = list_dates1(start_date, end_date)
window1 = dt.timedelta(days=5)
window2 = dt.timedelta(days=6)
url = 'https://www.ncei.noaa.gov/thredds/dodsC/OisstBase/NetCDF/V2.1/AVHRR/{0:%Y%m}/oisst-avhrr-v02r01.{0:%Y%m%d}.nc'
end_date2 = dt.date(2010, 1, 2)
sst_mean = []
cur_date = start_date
for cur_date in date_list:
    sst_mean_calc = []
    for i in cur_date:
        date_window = list_dates(i - window1, i + window2)
        url_list_window = [url.format(x) for x in date_window]
        window_data = xr.open_mfdataset(url_list_window).sst
sst_mean_calc.append(window_data.mean('time')) sst_mean_climo_test=xr.concat(sst_mean, dim='time') sst_std=xr.concat(sst_std_calc, dim=pd.DatetimeIndex(date_list, name='time'))sst_min = xr.concat(sst_min_calc, dim=pd.DatetimeIndex(date_list, name='time'))sst_max = xr.concat(sst_max_calc, dim=pd.DatetimeIndex(date_list, name='time'))sst_mean_climo_test.to_netcdf(path='E:/Riskpulse_HD/SST_stuff/sst_mean_climo_test') ``` Explanation of Code This code (climatology for SSTs) creates a list of dates between the specified start and end dates that contains the same day number for every month through the year span. For example, date_list[0] contains 10 datetime dates that start with 1-1-2010, 1-1-2011...1-1-2019. I then request OISST data from an opendap server and take a centered mean of the date in question (this case I did it for the first and second of January). In other words, I am opening the files for Dec 27-Jan 6 and averaging all of them together. The final xarray dataset then contains two 'times', which is 10 years worth of data for Jan 1 and Jan 2. I want to then send this to a netcdf file such that I can save it on my local machine and use to create plots down the road. Hope this makes sense. Error Messages ``` KeyError Traceback (most recent call last) ~\Anaconda3\lib\site-packages\xarray\backends\file_manager.py in _acquire_with_cache_info(self, needs_lock) 197 try: --> 198 file = self._cache[self._key] 199 except KeyError: ~\Anaconda3\lib\site-packages\xarray\backends\lru_cache.py in getitem(self, key) 52 with self._lock: ---> 53 value = self._cache[key] 54 self._cache.move_to_end(key) KeyError: [<class 'netCDF4._netCDF4.Dataset'>, ('https://www.ncei.noaa.gov/thredds/dodsC/OisstBase/NetCDF/V2.1/AVHRR/201801/oisst-avhrr-v02r01.20180106.nc',), 'r', (('clobber', True), ('diskless', False), ('format', 'NETCDF4'), ('persist', False))] During handling of the above exception, another exception occurred: RuntimeError Traceback (most recent call last) <ipython-input-3-f8395dcffb5e> in <module> 1 #xr.set_options(file_cache_maxsize=500) ----> 2 sst_mean_climo_test.to_netcdf(path='E:/Riskpulse_HD/SST_stuff/sst_mean_climo_test') ~\Anaconda3\lib\site-packages\xarray\core\dataarray.py in to_netcdf(self, args, kwargs) 2356 dataset = self.to_dataset() 2357 -> 2358 return dataset.to_netcdf(args, **kwargs) 2359 2360 def to_dict(self, data: bool = True) -> dict: ~\Anaconda3\lib\site-packages\xarray\core\dataset.py in to_netcdf(self, path, mode, format, group, engine, encoding, unlimited_dims, compute, invalid_netcdf) 1552 unlimited_dims=unlimited_dims, 1553 compute=compute, -> 1554 invalid_netcdf=invalid_netcdf, 1555 ) 1556 ~\Anaconda3\lib\site-packages\xarray\backends\api.py in to_netcdf(dataset, path_or_file, mode, format, group, engine, encoding, unlimited_dims, compute, multifile, invalid_netcdf) 1095 return writer, store 1096 -> 1097 writes = writer.sync(compute=compute) 1098 1099 if path_or_file is None: ~\Anaconda3\lib\site-packages\xarray\backends\common.py in sync(self, compute) 202 compute=compute, 203 flush=True, --> 204 regions=self.regions, 205 ) 206 self.sources = [] ~\Anaconda3\lib\site-packages\dask\array\core.py in store(sources, targets, lock, regions, compute, return_stored, kwargs) 943 944 if compute: --> 945 result.compute(kwargs) 946 return None 947 else: ~\Anaconda3\lib\site-packages\dask\base.py in compute(self, kwargs) 164 dask.base.compute 165 """ --> 166 (result,) = compute(self, traverse=False, kwargs) 167 return result 168 ~\Anaconda3\lib\site-packages\dask\base.py in 
compute(args, kwargs) 442 postcomputes.append(x.dask_postcompute()) 443 --> 444 results = schedule(dsk, keys, kwargs) 445 return repack([f(r, a) for r, (f, a) in zip(results, postcomputes)]) 446 ~\Anaconda3\lib\site-packages\dask\threaded.py in get(dsk, result, cache, num_workers, pool, kwargs) 82 get_id=_thread_get_id, 83 pack_exception=pack_exception, ---> 84 kwargs 85 ) 86 ~\Anaconda3\lib\site-packages\dask\local.py in get_async(apply_async, num_workers, dsk, result, cache, get_id, rerun_exceptions_locally, pack_exception, raise_exception, callbacks, dumps, loads, **kwargs) 484 _execute_task(task, data) # Re-execute locally 485 else: --> 486 raise_exception(exc, tb) 487 res, worker_id = loads(res_info) 488 state["cache"][key] = res ~\Anaconda3\lib\site-packages\dask\local.py in reraise(exc, tb) 314 if exc.traceback is not tb: 315 raise exc.with_traceback(tb) --> 316 raise exc 317 318 ~\Anaconda3\lib\site-packages\dask\local.py in execute_task(key, task_info, dumps, loads, get_id, pack_exception) 220 try: 221 task, data = loads(task_info) --> 222 result = _execute_task(task, data) 223 id = get_id() 224 result = dumps((result, id)) ~\Anaconda3\lib\site-packages\dask\core.py in _execute_task(arg, cache, dsk) 119 # temporaries by their reference count and can execute certain 120 # operations in-place. --> 121 return func(*(_execute_task(a, cache) for a in args)) 122 elif not ishashable(arg): 123 return arg ~\Anaconda3\lib\site-packages\dask\array\core.py in getter(a, b, asarray, lock) 98 c = a[b] 99 if asarray: --> 100 c = np.asarray(c) 101 finally: 102 if lock: ~\Anaconda3\lib\site-packages\numpy\core_asarray.py in asarray(a, dtype, order) 83 84 """ ---> 85 return array(a, dtype, copy=False, order=order) 86 87 ~\Anaconda3\lib\site-packages\xarray\core\indexing.py in array(self, dtype) 489 490 def array(self, dtype=None): --> 491 return np.asarray(self.array, dtype=dtype) 492 493 def getitem(self, key): ~\Anaconda3\lib\site-packages\numpy\core_asarray.py in asarray(a, dtype, order) 83 84 """ ---> 85 return array(a, dtype, copy=False, order=order) 86 87 ~\Anaconda3\lib\site-packages\xarray\core\indexing.py in array(self, dtype) 651 652 def array(self, dtype=None): --> 653 return np.asarray(self.array, dtype=dtype) 654 655 def getitem(self, key): ~\Anaconda3\lib\site-packages\numpy\core_asarray.py in asarray(a, dtype, order) 83 84 """ ---> 85 return array(a, dtype, copy=False, order=order) 86 87 ~\Anaconda3\lib\site-packages\xarray\core\indexing.py in array(self, dtype) 555 def array(self, dtype=None): 556 array = as_indexable(self.array) --> 557 return np.asarray(array[self.key], dtype=None) 558 559 def transpose(self, order): ~\Anaconda3\lib\site-packages\numpy\core_asarray.py in asarray(a, dtype, order) 83 84 """ ---> 85 return array(a, dtype, copy=False, order=order) 86 87 ~\Anaconda3\lib\site-packages\xarray\coding\variables.py in array(self, dtype) 70 71 def array(self, dtype=None): ---> 72 return self.func(self.array) 73 74 def repr(self): ~\Anaconda3\lib\site-packages\xarray\coding\variables.py in _scale_offset_decoding(data, scale_factor, add_offset, dtype) 216 217 def _scale_offset_decoding(data, scale_factor, add_offset, dtype): --> 218 data = np.array(data, dtype=dtype, copy=True) 219 if scale_factor is not None: 220 data *= scale_factor ~\Anaconda3\lib\site-packages\xarray\coding\variables.py in array(self, dtype) 70 71 def array(self, dtype=None): ---> 72 return self.func(self.array) 73 74 def repr(self): ~\Anaconda3\lib\site-packages\xarray\coding\variables.py in 
_apply_mask(data, encoded_fill_values, decoded_fill_value, dtype) 136 ) -> np.ndarray: 137 """Mask all matching values in a NumPy arrays.""" --> 138 data = np.asarray(data, dtype=dtype) 139 condition = False 140 for fv in encoded_fill_values: ~\Anaconda3\lib\site-packages\numpy\core_asarray.py in asarray(a, dtype, order) 83 84 """ ---> 85 return array(a, dtype, copy=False, order=order) 86 87 ~\Anaconda3\lib\site-packages\xarray\core\indexing.py in array(self, dtype) 555 def array(self, dtype=None): 556 array = as_indexable(self.array) --> 557 return np.asarray(array[self.key], dtype=None) 558 559 def transpose(self, order): ~\Anaconda3\lib\site-packages\xarray\backends\netCDF4_.py in getitem(self, key) 71 def getitem(self, key): 72 return indexing.explicit_indexing_adapter( ---> 73 key, self.shape, indexing.IndexingSupport.OUTER, self._getitem 74 ) 75 ~\Anaconda3\lib\site-packages\xarray\core\indexing.py in explicit_indexing_adapter(key, shape, indexing_support, raw_indexing_method) 835 """ 836 raw_key, numpy_indices = decompose_indexer(key, shape, indexing_support) --> 837 result = raw_indexing_method(raw_key.tuple) 838 if numpy_indices.tuple: 839 # index the loaded np.ndarray ~\Anaconda3\lib\site-packages\xarray\backends\netCDF4_.py in _getitem(self, key) 82 try: 83 with self.datastore.lock: ---> 84 original_array = self.get_array(needs_lock=False) 85 array = getitem(original_array, key) 86 except IndexError: ~\Anaconda3\lib\site-packages\xarray\backends\netCDF4_.py in get_array(self, needs_lock) 61 62 def get_array(self, needs_lock=True): ---> 63 ds = self.datastore._acquire(needs_lock) 64 variable = ds.variables[self.variable_name] 65 variable.set_auto_maskandscale(False) ~\Anaconda3\lib\site-packages\xarray\backends\netCDF4_.py in _acquire(self, needs_lock) 359 360 def _acquire(self, needs_lock=True): --> 361 with self._manager.acquire_context(needs_lock) as root: 362 ds = _nc4_require_group(root, self._group, self._mode) 363 return ds ~\Anaconda3\lib\contextlib.py in enter(self) 110 del self.args, self.kwds, self.func 111 try: --> 112 return next(self.gen) 113 except StopIteration: 114 raise RuntimeError("generator didn't yield") from None ~\Anaconda3\lib\site-packages\xarray\backends\file_manager.py in acquire_context(self, needs_lock) 184 def acquire_context(self, needs_lock=True): 185 """Context manager for acquiring a file.""" --> 186 file, cached = self._acquire_with_cache_info(needs_lock) 187 try: 188 yield file ~\Anaconda3\lib\site-packages\xarray\backends\file_manager.py in _acquire_with_cache_info(self, needs_lock) 206 # ensure file doesn't get overriden when opened again 207 self._mode = "a" --> 208 self._cache[self._key] = file 209 return file, False 210 else: ~\Anaconda3\lib\site-packages\xarray\backends\lru_cache.py in setitem(self, key, value) 71 elif self._maxsize: 72 # make room if necessary ---> 73 self._enforce_size_limit(self._maxsize - 1) 74 self._cache[key] = value 75 elif self._on_evict is not None: ~\Anaconda3\lib\site-packages\xarray\backends\lru_cache.py in _enforce_size_limit(self, capacity) 61 key, value = self._cache.popitem(last=False) 62 if self._on_evict is not None: ---> 63 self._on_evict(key, value) 64 65 def setitem(self, key: K, value: V) -> None: ~\Anaconda3\lib\site-packages\xarray\backends\file_manager.py in <lambda>(k, v) 12 # Global cache for storing open files. 
13 FILE_CACHE: LRUCache[str, io.IOBase] = LRUCache( ---> 14 maxsize=cast(int, OPTIONS["file_cache_maxsize"]), on_evict=lambda k, v: v.close() 15 ) 16 assert FILE_CACHE.maxsize, "file cache must be at least size one" netCDF4_netCDF4.pyx in netCDF4._netCDF4.Dataset.close() netCDF4_netCDF4.pyx in netCDF4._netCDF4.Dataset._close() netCDF4_netCDF4.pyx in netCDF4._netCDF4._ensure_nc_success() RuntimeError: NetCDF: HDF error ``` I also tried changing setting xr.set_options(file_cache_maxsize=500) outside of the loop before trying to create the netcdf file and received this error: ``` KeyError Traceback (most recent call last) ~\Anaconda3\lib\site-packages\xarray\backends\file_manager.py in _acquire_with_cache_info(self, needs_lock) 197 try: --> 198 file = self._cache[self._key] 199 except KeyError: ~\Anaconda3\lib\site-packages\xarray\backends\lru_cache.py in getitem(self, key) 52 with self._lock: ---> 53 value = self._cache[key] 54 self._cache.move_to_end(key) KeyError: [<class 'netCDF4._netCDF4.Dataset'>, ('https://www.ncei.noaa.gov/thredds/dodsC/OisstBase/NetCDF/V2.1/AVHRR/201512/oisst-avhrr-v02r01.20151231.nc',), 'r', (('clobber', True), ('diskless', False), ('format', 'NETCDF4'), ('persist', False))] During handling of the above exception, another exception occurred: OSError Traceback (most recent call last) <ipython-input-4-474cdce51e60> in <module> 1 xr.set_options(file_cache_maxsize=500) ----> 2 sst_mean_climo_test.to_netcdf(path='E:/Riskpulse_HD/SST_stuff/sst_mean_climo_test') ~\Anaconda3\lib\site-packages\xarray\core\dataarray.py in to_netcdf(self, args, kwargs) 2356 dataset = self.to_dataset() 2357 -> 2358 return dataset.to_netcdf(args, **kwargs) 2359 2360 def to_dict(self, data: bool = True) -> dict: ~\Anaconda3\lib\site-packages\xarray\core\dataset.py in to_netcdf(self, path, mode, format, group, engine, encoding, unlimited_dims, compute, invalid_netcdf) 1552 unlimited_dims=unlimited_dims, 1553 compute=compute, -> 1554 invalid_netcdf=invalid_netcdf, 1555 ) 1556 ~\Anaconda3\lib\site-packages\xarray\backends\api.py in to_netcdf(dataset, path_or_file, mode, format, group, engine, encoding, unlimited_dims, compute, multifile, invalid_netcdf) 1095 return writer, store 1096 -> 1097 writes = writer.sync(compute=compute) 1098 1099 if path_or_file is None: ~\Anaconda3\lib\site-packages\xarray\backends\common.py in sync(self, compute) 202 compute=compute, 203 flush=True, --> 204 regions=self.regions, 205 ) 206 self.sources = [] ~\Anaconda3\lib\site-packages\dask\array\core.py in store(sources, targets, lock, regions, compute, return_stored, kwargs) 943 944 if compute: --> 945 result.compute(kwargs) 946 return None 947 else: ~\Anaconda3\lib\site-packages\dask\base.py in compute(self, kwargs) 164 dask.base.compute 165 """ --> 166 (result,) = compute(self, traverse=False, kwargs) 167 return result 168 ~\Anaconda3\lib\site-packages\dask\base.py in compute(args, kwargs) 442 postcomputes.append(x.dask_postcompute()) 443 --> 444 results = schedule(dsk, keys, kwargs) 445 return repack([f(r, a) for r, (f, a) in zip(results, postcomputes)]) 446 ~\Anaconda3\lib\site-packages\dask\threaded.py in get(dsk, result, cache, num_workers, pool, kwargs) 82 get_id=_thread_get_id, 83 pack_exception=pack_exception, ---> 84 kwargs 85 ) 86 ~\Anaconda3\lib\site-packages\dask\local.py in get_async(apply_async, num_workers, dsk, result, cache, get_id, rerun_exceptions_locally, pack_exception, raise_exception, callbacks, dumps, loads, **kwargs) 484 _execute_task(task, data) # Re-execute locally 485 else: --> 486 
raise_exception(exc, tb) 487 res, worker_id = loads(res_info) 488 state["cache"][key] = res ~\Anaconda3\lib\site-packages\dask\local.py in reraise(exc, tb) 314 if exc.traceback is not tb: 315 raise exc.with_traceback(tb) --> 316 raise exc 317 318 ~\Anaconda3\lib\site-packages\dask\local.py in execute_task(key, task_info, dumps, loads, get_id, pack_exception) 220 try: 221 task, data = loads(task_info) --> 222 result = _execute_task(task, data) 223 id = get_id() 224 result = dumps((result, id)) ~\Anaconda3\lib\site-packages\dask\core.py in _execute_task(arg, cache, dsk) 119 # temporaries by their reference count and can execute certain 120 # operations in-place. --> 121 return func(*(_execute_task(a, cache) for a in args)) 122 elif not ishashable(arg): 123 return arg ~\Anaconda3\lib\site-packages\dask\array\core.py in getter(a, b, asarray, lock) 98 c = a[b] 99 if asarray: --> 100 c = np.asarray(c) 101 finally: 102 if lock: ~\Anaconda3\lib\site-packages\numpy\core_asarray.py in asarray(a, dtype, order) 83 84 """ ---> 85 return array(a, dtype, copy=False, order=order) 86 87 ~\Anaconda3\lib\site-packages\xarray\core\indexing.py in array(self, dtype) 489 490 def array(self, dtype=None): --> 491 return np.asarray(self.array, dtype=dtype) 492 493 def getitem(self, key): ~\Anaconda3\lib\site-packages\numpy\core_asarray.py in asarray(a, dtype, order) 83 84 """ ---> 85 return array(a, dtype, copy=False, order=order) 86 87 ~\Anaconda3\lib\site-packages\xarray\core\indexing.py in array(self, dtype) 651 652 def array(self, dtype=None): --> 653 return np.asarray(self.array, dtype=dtype) 654 655 def getitem(self, key): ~\Anaconda3\lib\site-packages\numpy\core_asarray.py in asarray(a, dtype, order) 83 84 """ ---> 85 return array(a, dtype, copy=False, order=order) 86 87 ~\Anaconda3\lib\site-packages\xarray\core\indexing.py in array(self, dtype) 555 def array(self, dtype=None): 556 array = as_indexable(self.array) --> 557 return np.asarray(array[self.key], dtype=None) 558 559 def transpose(self, order): ~\Anaconda3\lib\site-packages\numpy\core_asarray.py in asarray(a, dtype, order) 83 84 """ ---> 85 return array(a, dtype, copy=False, order=order) 86 87 ~\Anaconda3\lib\site-packages\xarray\coding\variables.py in array(self, dtype) 70 71 def array(self, dtype=None): ---> 72 return self.func(self.array) 73 74 def repr(self): ~\Anaconda3\lib\site-packages\xarray\coding\variables.py in _scale_offset_decoding(data, scale_factor, add_offset, dtype) 216 217 def _scale_offset_decoding(data, scale_factor, add_offset, dtype): --> 218 data = np.array(data, dtype=dtype, copy=True) 219 if scale_factor is not None: 220 data *= scale_factor ~\Anaconda3\lib\site-packages\xarray\coding\variables.py in array(self, dtype) 70 71 def array(self, dtype=None): ---> 72 return self.func(self.array) 73 74 def repr(self): ~\Anaconda3\lib\site-packages\xarray\coding\variables.py in _apply_mask(data, encoded_fill_values, decoded_fill_value, dtype) 136 ) -> np.ndarray: 137 """Mask all matching values in a NumPy arrays.""" --> 138 data = np.asarray(data, dtype=dtype) 139 condition = False 140 for fv in encoded_fill_values: ~\Anaconda3\lib\site-packages\numpy\core_asarray.py in asarray(a, dtype, order) 83 84 """ ---> 85 return array(a, dtype, copy=False, order=order) 86 87 ~\Anaconda3\lib\site-packages\xarray\core\indexing.py in array(self, dtype) 555 def array(self, dtype=None): 556 array = as_indexable(self.array) --> 557 return np.asarray(array[self.key], dtype=None) 558 559 def transpose(self, order): 
~\Anaconda3\lib\site-packages\xarray\backends\netCDF4_.py in getitem(self, key) 71 def getitem(self, key): 72 return indexing.explicit_indexing_adapter( ---> 73 key, self.shape, indexing.IndexingSupport.OUTER, self._getitem 74 ) 75 ~\Anaconda3\lib\site-packages\xarray\core\indexing.py in explicit_indexing_adapter(key, shape, indexing_support, raw_indexing_method) 835 """ 836 raw_key, numpy_indices = decompose_indexer(key, shape, indexing_support) --> 837 result = raw_indexing_method(raw_key.tuple) 838 if numpy_indices.tuple: 839 # index the loaded np.ndarray ~\Anaconda3\lib\site-packages\xarray\backends\netCDF4_.py in _getitem(self, key) 82 try: 83 with self.datastore.lock: ---> 84 original_array = self.get_array(needs_lock=False) 85 array = getitem(original_array, key) 86 except IndexError: ~\Anaconda3\lib\site-packages\xarray\backends\netCDF4_.py in get_array(self, needs_lock) 61 62 def get_array(self, needs_lock=True): ---> 63 ds = self.datastore._acquire(needs_lock) 64 variable = ds.variables[self.variable_name] 65 variable.set_auto_maskandscale(False) ~\Anaconda3\lib\site-packages\xarray\backends\netCDF4_.py in _acquire(self, needs_lock) 359 360 def _acquire(self, needs_lock=True): --> 361 with self._manager.acquire_context(needs_lock) as root: 362 ds = _nc4_require_group(root, self._group, self._mode) 363 return ds ~\Anaconda3\lib\contextlib.py in enter(self) 110 del self.args, self.kwds, self.func 111 try: --> 112 return next(self.gen) 113 except StopIteration: 114 raise RuntimeError("generator didn't yield") from None ~\Anaconda3\lib\site-packages\xarray\backends\file_manager.py in acquire_context(self, needs_lock) 184 def acquire_context(self, needs_lock=True): 185 """Context manager for acquiring a file.""" --> 186 file, cached = self._acquire_with_cache_info(needs_lock) 187 try: 188 yield file ~\Anaconda3\lib\site-packages\xarray\backends\file_manager.py in _acquire_with_cache_info(self, needs_lock) 202 kwargs = kwargs.copy() 203 kwargs["mode"] = self._mode --> 204 file = self._opener(self._args, *kwargs) 205 if self._mode == "w": 206 # ensure file doesn't get overriden when opened again netCDF4_netCDF4.pyx in netCDF4._netCDF4.Dataset.init() netCDF4_netCDF4.pyx in netCDF4._netCDF4._ensure_nc_success() OSError: [Errno -37] NetCDF: Write to read only: b'https://www.ncei.noaa.gov/thredds/dodsC/OisstBase/NetCDF/V2.1/AVHRR/201512/oisst-avhrr-v02r01.20151231.nc' ``` I believe these errors have something to do with a post that I created a couple weeks ago (https://github.com/pydata/xarray/issues/4082). I'm not sure if you can @ users on here, but @rsignell-usgs found out something about the caching before hand. It seems that this is some sort of Windows issue. Versions python: 3.7.4 xarray: 0.15.1 pandas: 1.0.3 numpy: 1.18.1 scipy: 1.4.1 netcdf4: 1.4.2 |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/4169/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
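One commonly suggested workaround for the error above (a sketch, not taken from the issue thread; it assumes network access to the NOAA THREDDS server) is to force the lazily loaded OPeNDAP data into memory before writing, so that to_netcdf() does not have to go back to the remote files during the write:

```python
import datetime as dt
import xarray as xr

url = ('https://www.ncei.noaa.gov/thredds/dodsC/OisstBase/NetCDF/V2.1/AVHRR/'
       '{0:%Y%m}/oisst-avhrr-v02r01.{0:%Y%m%d}.nc')

# Hypothetical small example: three days instead of the issue's full decade.
dates = [dt.date(2010, 1, 1) + dt.timedelta(days=i) for i in range(3)]
ds = xr.open_mfdataset([url.format(d) for d in dates])

sst_mean = ds.sst.mean('time')
sst_mean.load()                               # pull the result into memory first
sst_mean.to_netcdf('sst_mean_climo_test.nc')  # the write no longer touches the URLs
```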
878481461 | MDU6SXNzdWU4Nzg0ODE0NjE= | 5276 | open_mfdataset: Not a valid ID | minhhg 11815787 | closed | 0 | 4 | 2021-05-07T05:34:02Z | 2022-04-09T15:49:50Z | 2022-04-09T15:49:50Z | NONE | I have about 601 NetCDF4 files saved using xarray. We use open_mfdataset to access these files. The main code calls this function many times; the first few calls work fine, but after a while it throws the following error: "RuntimeError: NetCDF: Not a valid ID"
Environment: Output of <tt>xr.show_versions()</tt>: INSTALLED VERSIONS ------------------ commit: None python: 3.6.8.final.0 python-bits: 64 OS: Linux OS-release: 5.4.0-1047-aws machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: C.UTF-8 LOCALE: en_US.UTF-8 xarray: 0.11.0 pandas: 0.24.1 numpy: 1.15.4 scipy: 1.2.0 netCDF4: 1.4.2 h5netcdf: None h5py: 2.9.0 Nio: None zarr: None cftime: 1.0.3.4 PseudonetCDF: None rasterio: None iris: None bottleneck: 1.2.1 cyordereddict: None dask: 1.1.1 distributed: 1.25.3 matplotlib: 3.0.2 cartopy: None seaborn: 0.9.0 setuptools: 40.7.3 pip: 19.0.1 conda: None pytest: 4.2.0 IPython: 7.1.1 sphinx: 1.8.4

This error also happens with xarray version 0.10.9. Error trace:
```python
2021-05-05 09:28:19,911, DEBUG 7621, sim_io.py:483 - load_unique_document(), xpath=/home/ubuntu/runs/20210331_001/nominal_dfs/uk
2021-05-05 09:28:42,774, ERROR 7621, run_gov_ret.py:33 - <module>(), Unknown error=NetCDF: Not a valid ID
Traceback (most recent call last):
  File "/home/ubuntu/dev/py36/python/ev/model/api3/run_gov_ret.py", line 31, in <module>
    res = govRet()
  File "/home/ubuntu/dev/py36/python/ev/model/api3/returns.py", line 56, in __call__
    decompose=self.decompose))
  File "/home/ubuntu/dev/py36/python/ev/model/returns/returnsGenerator.py", line 70, in calc_returns
    dfs_data = self.mongo_dfs.get_data(mats=[1,mat,mat-1])
  File "/home/ubuntu/dev/py36/python/ev/model/api3/dfs.py", line 262, in get_data
    record = self.mdb.load_unique_document(self.dfs_collection_name, spec)
  File "/home/ubuntu/dev/py36/python/ev/model/api3/sim_io.py", line 1109, in load_unique_document
    return self.collections[collection].load_unique_document(query, *args, **kwargs)
  File "/home/ubuntu/dev/py36/python/ev/model/api3/sim_io.py", line 501, in load_unique_document
    doc['data'] = ar1.load().values
  File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/xarray/core/dataarray.py", line 631, in load
    ds = self._to_temp_dataset().load(**kwargs)
  File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/xarray/core/dataset.py", line 494, in load
    evaluated_data = da.compute(*lazy_data.values(), **kwargs)
  File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/dask/base.py", line 398, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/dask/threaded.py", line 76, in get
    pack_exception=pack_exception, **kwargs)
  File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/dask/local.py", line 459, in get_async
    raise_exception(exc, tb)
  File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/dask/compatibility.py", line 112, in reraise
    raise exc
  File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/dask/local.py", line 230, in execute_task
    result = _execute_task(task, data)
  File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/dask/core.py", line 119, in _execute_task
    return func(*args2)
  File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/dask/array/core.py", line 82, in getter
    c = np.asarray(c)
  File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/numpy/core/numeric.py", line 501, in asarray
    return array(a, dtype, copy=False, order=order)
  File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/xarray/core/indexing.py", line 602, in __array__
    return np.asarray(self.array, dtype=dtype)
  File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/numpy/core/numeric.py", line 501, in asarray
    return array(a, dtype, copy=False, order=order)
  File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/xarray/core/indexing.py", line 508, in __array__
    return np.asarray(array[self.key], dtype=None)
  File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/xarray/backends/netCDF4_.py", line 64, in __getitem__
    self._getitem)
  File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/xarray/core/indexing.py", line 776, in explicit_indexing_adapter
    result = raw_indexing_method(raw_key.tuple)
  File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/xarray/backends/netCDF4_.py", line 76, in _getitem
    array = getitem(original_array, key)
  File "netCDF4/_netCDF4.pyx", line 4095, in netCDF4._netCDF4.Variable.__getitem__
  File "netCDF4/_netCDF4.pyx", line 3798, in netCDF4._netCDF4.Variable.shape.__get__
  File "netCDF4/_netCDF4.pyx", line 3746, in netCDF4._netCDF4.Variable._getdims
  File "netCDF4/_netCDF4.pyx", line 1754, in netCDF4._netCDF4._ensure_nc_success
RuntimeError: NetCDF: Not a valid ID
```
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5276/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
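A defensive pattern that is sometimes used for this kind of failure (a sketch, not from the issue thread): read each batch eagerly inside a context manager, so the data are already in memory and the underlying NetCDF handles are closed before the next open_mfdataset call can recycle them.

```python
import glob
import xarray as xr

def load_batch(pattern):
    # Hypothetical helper: the `with` block guarantees the files are closed
    # once .load() has pulled everything into memory.
    with xr.open_mfdataset(sorted(glob.glob(pattern))) as ds:
        return ds.load()

batch = load_batch('data/*.nc')  # 'data/*.nc' is a made-up path
```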
208312826 | MDU6SXNzdWUyMDgzMTI4MjY= | 1273 | replace a dim with a coordinate from another dataset | rabernat 1197350 | open | 0 | 4 | 2017-02-17T02:15:36Z | 2022-04-09T15:26:20Z | MEMBER | I often want a function that takes a dataarray / dataset and replaces a dimension with a coordinate from a different dataset. @shoyer proposed the following simple solution.
```python
def replace_dim(da, olddim, newdim):
    renamed = da.rename({olddim: newdim.name})
```

Is this of broad enough interest to add a built-in method for? |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1273/reactions", "total_count": 3, "+1": 3, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
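The snippet above is truncated in this export; here is a hedged completion of the helper (the body after the rename is an assumption, not the original proposal) plus a tiny usage example:

```python
import numpy as np
import xarray as xr

def replace_dim(da, olddim, newdim):
    """Replace dimension `olddim` of `da` with the 1-D coordinate `newdim`."""
    renamed = da.rename({olddim: newdim.name})
    # Overwrite the coordinate values along the renamed dimension.
    return renamed.assign_coords({newdim.name: newdim.values})

da = xr.DataArray(np.arange(3), dims='x')
depth = xr.DataArray([10.0, 20.0, 30.0], dims='z', name='z')
print(replace_dim(da, 'x', depth))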
438947247 | MDU6SXNzdWU0Mzg5NDcyNDc= | 2933 | Stack() & unstack() issues on Multindex | ray306 1559890 | closed | 0 | 4 | 2019-04-30T19:47:51Z | 2022-04-09T15:23:28Z | 2022-04-09T15:23:28Z | NONE | I would like to reshape the DataArray by one level in the MultiIndex, and I thought the stack()/unstack() methods would do it. Make a DataArray with MultiIndex:
Stack problem: I want a dimension to merge into another one:
Unstack problem: Unstacking by the whole MultiIndex worked:
Coordinates:
* variable (variable) int32 0 1 2 3
* first (first) object 'bar' 'baz' 'foo'
* second (second) object 'one' 'two'
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/2933/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
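For context, a small self-contained sketch (not from the issue) of stacking two dimensions into a MultiIndex and unstacking it again; the single-level reshape the issue asks about is the part that has no direct one-step equivalent:

```python
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.arange(6).reshape(3, 2),
    coords={'first': ['bar', 'baz', 'foo'], 'second': ['one', 'two']},
    dims=('first', 'second'),
)

# stack() builds a MultiIndex over (first, second)...
stacked = da.stack(dim_0=('first', 'second'))
print(stacked.indexes['dim_0'])

# ...and unstack() of that stacked dimension restores the original 2-D array.
print(stacked.unstack('dim_0'))
```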
607718350 | MDU6SXNzdWU2MDc3MTgzNTA= | 4011 | missing empty group when iterate over groupby_bins | miniufo 9312831 | open | 0 | 4 | 2020-04-27T17:22:31Z | 2022-04-09T03:08:14Z | NONE | When I try to iterate over the grouped object, one of these bins will be empty:
```
bins = [0, 4, 5]
grouped = array.groupby_bins('dim_0', bins)

for i, group in enumerate(grouped):
    print(str(i) + ' ' + group)
```
When a bin contains no samples (the (4, 5] bin), the empty group is dropped. How can I iterate over the full set of bins even when some of them contain nothing? I've read the related issue #1019, but my case needs the correct order in grouped, and the empty groups need to be iterated over as well. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/4011/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
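One way to visit every bin, including empty ones (a sketch, not from the issue thread): give groupby_bins explicit labels, build a dict from the non-empty groups, and loop over the labels:

```python
import numpy as np
import xarray as xr

array = xr.DataArray(np.arange(4), dims='dim_0',
                     coords={'dim_0': [0.5, 1.5, 2.5, 3.5]})

bins = [0, 4, 5]
labels = ['0-4', '4-5']
grouped = array.groupby_bins('dim_0', bins, labels=labels)

groups = dict(grouped)            # only non-empty bins appear here
for label in labels:              # ...so iterate over the labels instead
    group = groups.get(label)
    print(label, 'empty' if group is None else group.values)
```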
673682661 | MDU6SXNzdWU2NzM2ODI2NjE= | 4313 | Using Dependabot to manage doc build and CI versions | jthielen 3460034 | open | 0 | 4 | 2020-08-05T16:24:24Z | 2022-04-09T02:59:21Z | CONTRIBUTOR | As brought up on the bi-weekly community developers meeting, it sounds like Pandas v1.1.0 is breaking doc builds on RTD. One solution to the issues of frequent breakages in doc builds and CI due to upstream updates is having fixed version lists for all of these, which are then incrementally updated as new versions come out. @dopplershift has done a lot of great work in MetPy getting such a workflow set up with Dependabot (https://github.com/Unidata/MetPy/pull/1410) among other CI updates, and this could be adapted for use here in xarray. We've generally been quite happy with our updated CI configuration with Dependabot over the past couple weeks. The only major issue has been https://github.com/Unidata/MetPy/issues/1424 / https://github.com/dependabot/dependabot-core/issues/2198#issuecomment-649726022, which has required some contributors to have to delete and recreate their forks in order for Dependabot to not auto-submit PRs to the forked repos. Any thoughts that you had here @dopplershift would be appreciated! xref https://github.com/pydata/xarray/issues/4287, https://github.com/pydata/xarray/pull/4296 |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/4313/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
484699415 | MDU6SXNzdWU0ODQ2OTk0MTU= | 3256 | .item() on a DataArray with dtype='datetime64[ns]' returns int | IvoCrnkovic 1778852 | open | 0 | 4 | 2019-08-23T20:29:50Z | 2022-04-09T02:03:43Z | NONE | MCVE Code Sample
```python
import datetime
import xarray as xr

test_da = xr.DataArray(datetime.datetime(2019, 1, 1, 1, 1))
test_da
# <xarray.DataArray ()>
# array('2019-01-01T01:01:00.000000000', dtype='datetime64[ns]')

test_da.item()
# 1546304460000000000
```
Expected Output
I would think it would be nice to get a
Output of
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/3256/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
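For reference, two ways to get a real timestamp back from a datetime64[ns] scalar (a sketch, not part of the issue):

```python
import datetime
import pandas as pd
import xarray as xr

test_da = xr.DataArray(datetime.datetime(2019, 1, 1, 1, 1))

# .item() on datetime64[ns] data returns the raw integer nanoseconds...
ns = test_da.item()                      # 1546304460000000000

# ...which pandas reinterprets, since integers default to nanoseconds since epoch:
print(pd.Timestamp(ns))                  # 2019-01-01 01:01:00

# Alternatively, downcast to microsecond precision first; .item() on a
# datetime64[us] scalar returns a datetime.datetime object.
print(test_da.values.astype('datetime64[us]').item())
```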
882105903 | MDU6SXNzdWU4ODIxMDU5MDM= | 5281 | 'Parallelized' apply_ufunc for scripy.interpolate.griddata | LJaksic 74414841 | open | 0 | 4 | 2021-05-09T10:08:46Z | 2022-04-09T01:39:13Z | NONE | Hi, I'm working with large files from an ocean model with an unstructured grid. For instance, variable flow velocity. For smaller computational domains (smaller nFlowElement dimension) I am still able to load the dataarray in my work memory. Then, the following code gives me the wanted result:
```
def interp_to_grid(u, xc, yc, xint, yint):
    print(u.shape, xc.shape, xint.shape)
    ug = griddata((xc, yc), u, (xint, yint), method='nearest', fill_value=np.nan)
    return ug

uxg = xr.apply_ufunc(interp_to_grid,
ux, xc, yc, xint, yint,
dask = 'allowed',
input_core_dims=[['nFlowElem','time','laydim'],['nFlowElem'],['nFlowElem'],['dim_0','dim_1'],['dim_0','dim_1']],
output_core_dims=[['dim_0','dim_1','time','laydim']],
output_dtypes = [xr.DataArray]
)
```
However, for much larger spatial domains it is required to work with dask = 'parallelized', because these input dataarrays can no longer be loaded into my working memory. I have tried to apply chunks over the time dimension, but also over the nFlowElement dimension. I am aware that it is not possible to chunk over core dimensions. This is one of my "parallel" attempts (with chunks along the time dim): Input ux:
File "interpnd.pyx", line 192, in scipy.interpolate.interpnd._check_init_shape ValueError: different number of values and points
Any advice is very welcome! Best Wishes, Luka |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5281/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
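For reference, a sketch of how the dask='parallelized' variant of this pattern is often written (an assumption, not the issue author's working solution; it needs dask and scipy installed): keep the nFlowElem core dimension in a single chunk, let vectorize=True loop over time/laydim, and use synthetic stand-ins for the issue's variables:

```python
import numpy as np
import xarray as xr
from scipy.interpolate import griddata

def interp_to_grid(u, xc, yc, xint, yint):
    # With vectorize=True, apply_ufunc passes one (time, laydim) slice at a
    # time, so u, xc and yc are 1-D over the unstructured nFlowElem dimension.
    return griddata((xc, yc), u, (xint, yint), method='nearest', fill_value=np.nan)

# Hypothetical small stand-ins for the issue's ux, xc, yc, xint, yint:
rng = np.random.default_rng(0)
n_elem, n_time, n_lay = 50, 4, 3
xc = xr.DataArray(rng.uniform(0, 1, n_elem), dims='nFlowElem')
yc = xr.DataArray(rng.uniform(0, 1, n_elem), dims='nFlowElem')
ux = xr.DataArray(rng.normal(size=(n_elem, n_time, n_lay)),
                  dims=('nFlowElem', 'time', 'laydim'))
xg, yg = np.meshgrid(np.linspace(0, 1, 20), np.linspace(0, 1, 25))
xint = xr.DataArray(xg, dims=('dim_0', 'dim_1'))
yint = xr.DataArray(yg, dims=('dim_0', 'dim_1'))

uxg = xr.apply_ufunc(
    interp_to_grid,
    ux.chunk({'time': 1, 'nFlowElem': -1}),  # the core dim must stay in one chunk
    xc, yc, xint, yint,
    input_core_dims=[['nFlowElem'], ['nFlowElem'], ['nFlowElem'],
                     ['dim_0', 'dim_1'], ['dim_0', 'dim_1']],
    output_core_dims=[['dim_0', 'dim_1']],
    vectorize=True,
    dask='parallelized',
    output_dtypes=[float],
    # Some xarray versions need the output grid sizes declared explicitly, e.g.
    # dask_gufunc_kwargs={'output_sizes': {'dim_0': 25, 'dim_1': 20}}.
)
print(uxg.compute().shape)  # (time, laydim, dim_0, dim_1)
```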
1030768250 | I_kwDOAMm_X849cEZ6 | 5877 | Rolling() gives values different from pd.rolling() | chiaral 8453445 | open | 0 | 4 | 2021-10-19T21:41:42Z | 2022-04-09T01:29:07Z | CONTRIBUTOR | I am not sure this is a bug - but it clearly doesn't give the results the user would expect. The rolling sum of zeros gives me values that are not zeros ```python var = np.array([0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.31 , 0.91999996, 8.3 , 1.42 , 0.03 , 1.22 , 0.09999999, 0.14 , 0.13 , 0. , 0.12 , 0.03 , 2.53 , 0. , 0.19999999, 0.19999999, 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ], dtype='float32') timet = np.array([ 43200000000000, 129600000000000, 216000000000000, 302400000000000, 388800000000000, 475200000000000, 561600000000000, 648000000000000, 734400000000000, 820800000000000, 907200000000000, 993600000000000, 1080000000000000, 1166400000000000, 1252800000000000, 1339200000000000, 1425600000000000, 1512000000000000, 1598400000000000, 1684800000000000, 1771200000000000, 1857600000000000, 1944000000000000, 2030400000000000, 2116800000000000, 2203200000000000, 2289600000000000, 2376000000000000, 2462400000000000, 2548800000000000, 2635200000000000, 2721600000000000, 2808000000000000, 2894400000000000, 2980800000000000], dtype='timedelta64[ns]') ds_ex = xr.Dataset(data_vars=dict( pr=(["time"], var), ), coords=dict( time=("time", timet) ), ) ds_ex.rolling(time=3).sum().pr.values ``` it gives me this result: array([ nan, nan, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 3.1000000e-01, 1.2300000e+00, 9.5300007e+00, 1.0640000e+01, 9.7500000e+00, 2.6700001e+00, 1.3500001e+00, 1.4600002e+00, 3.7000012e-01, 2.7000013e-01, 2.5000012e-01, 1.5000013e-01, 2.6800001e+00, 2.5600002e+00, 2.7300003e+00, 4.0000033e-01, 4.0000033e-01, 2.0000035e-01, 3.5762787e-07, 3.5762787e-07, 3.5762787e-07, 3.5762787e-07, 3.5762787e-07, 3.5762787e-07, 3.5762787e-07, 3.5762787e-07, 3.5762787e-07, 3.5762787e-07], dtype=float32) Note the non zero values - the non zero value changes depending on whether i use float64 or float32 as precision of my data. So this seems to be a precision related issue (although the first values are correctly set to zero), in fact other sums of values are not exactly what they should be. The small difference at the 8th/9th decimal position can be expected due to precision, but the fact that the 0s become non zeros is problematic imho, especially if not documented. Oftentimes zero in geoscience data can mean a very specific thing (i.e. zero rainfall will be characterized differently than non-zero). in pandas this instead works:
array([[ nan, nan, 0. , 0. , 0. , 0. , 0. , 0.31 , 1.22999996, 9.53000015, 10.6400001 , 9.75000015, 2.66999999, 1.35000001, 1.46000002, 0.36999998, 0.27 , 0.24999999, 0.15 , 2.67999997, 2.55999997, 2.72999996, 0.39999998, 0.39999998, 0.19999999, 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ]]) What you expected to happen: the sum of zeros should be zero. If this cannot be achieved/expected because of precision issues, it should be documented. Anything else we need to know?: I discovered this behavior in my old environments, but I created a new ad hoc environment with the latest versions, and it does the same thing. Environment: INSTALLED VERSIONScommit: None python: 3.9.7 (default, Sep 16 2021, 08:50:36) [Clang 10.0.0 ] python-bits: 64 OS: Darwin OS-release: 17.7.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: ('en_US', 'UTF-8') libhdf5: None libnetcdf: None xarray: 0.19.0 pandas: 1.3.3 numpy: 1.21.2 scipy: None netCDF4: None pydap: None h5netcdf: None h5py: None Nio: None zarr: None cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.3.2 dask: None distributed: None matplotlib: None cartopy: None seaborn: None numbagg: None pint: None setuptools: 58.0.4 pip: 21.2.4 conda: None pytest: None IPython: 7.28.0 sphinx: None |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5877/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
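One workaround that sidesteps the running-sum accumulation (a sketch, not from the issue thread) is to materialise the windows with construct() and reduce each window independently:

```python
import numpy as np
import xarray as xr

var = np.array([0., 0., 0., 0.31, 0.92, 8.3, 0., 0., 0.], dtype='float32')
ds = xr.Dataset({'pr': ('time', var)})

# Each window is summed on its own, so no running sum is carried along the
# array and windows of zeros sum to exactly 0.  Note the edge handling differs
# from rolling().sum(): the padded first windows become 0 rather than NaN.
rolled = ds.pr.rolling(time=3).construct('window').sum('window')
print(rolled.values)
```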
653442225 | MDU6SXNzdWU2NTM0NDIyMjU= | 4209 | `xr.save_mfdataset()` doesn't honor `compute=False` argument | andersy005 13301940 | open | 0 | 4 | 2020-07-08T16:40:11Z | 2022-04-09T01:25:56Z | MEMBER | What happened: While using What you expected to happen: I expect the datasets to be written when I explicitly call Minimal Complete Verifiable Example: ```python In [2]: import xarray as xr In [3]: ds = xr.tutorial.open_dataset('rasm', chunks={}) In [4]: ds Out[4]: <xarray.Dataset> Dimensions: (time: 36, x: 275, y: 205) Coordinates: * time (time) object 1980-09-16 12:00:00 ... 1983-08-17 00:00:00 xc (y, x) float64 dask.array<chunksize=(205, 275), meta=np.ndarray> yc (y, x) float64 dask.array<chunksize=(205, 275), meta=np.ndarray> Dimensions without coordinates: x, y Data variables: Tair (time, y, x) float64 dask.array<chunksize=(36, 205, 275), meta=np.ndarray> Attributes: title: /workspace/jhamman/processed/R1002RBRxaaa01a/l... institution: U.W. source: RACM R1002RBRxaaa01a output_frequency: daily output_mode: averaged convention: CF-1.4 references: Based on the initial model of Liang et al., 19... comment: Output from the Variable Infiltration Capacity... nco_openmp_thread_number: 1 NCO: "4.6.0" history: Tue Dec 27 14:15:22 2016: ncatted -a dimension... In [5]: path = "test.nc" In [7]: ls -ltrh test.nc ls: cannot access test.nc: No such file or directory In [8]: tasks = xr.save_mfdataset(datasets=[ds], paths=[path], compute=False) In [9]: tasks Out[9]: Delayed('list-aa0b52e0-e909-4e65-849f-74526d137542') In [10]: ls -ltrh test.nc -rw-r--r-- 1 abanihi ncar 14K Jul 8 10:29 test.nc ``` Anything else we need to know?: Environment: Output of <tt>xr.show_versions()</tt>```python INSTALLED VERSIONS ------------------ commit: None python: 3.7.6 | packaged by conda-forge | (default, Jun 1 2020, 18:57:50) [GCC 7.5.0] python-bits: 64 OS: Linux OS-release: 3.10.0-693.21.1.el7.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: en_US.UTF-8 LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.10.5 libnetcdf: 4.7.4 xarray: 0.15.1 pandas: 0.25.3 numpy: 1.18.5 scipy: 1.5.0 netCDF4: 1.5.3 pydap: None h5netcdf: None h5py: 2.10.0 Nio: None zarr: None cftime: 1.2.0 nc_time_axis: 1.2.0 PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2.20.0 distributed: 2.20.0 matplotlib: 3.2.1 cartopy: None seaborn: None numbagg: None setuptools: 49.1.0.post20200704 pip: 20.1.1 conda: None pytest: None IPython: 7.16.1 sphinx: None ``` |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/4209/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue |
CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo] ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone] ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee] ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user] ON [issues] ([user]);