home / github


issues


10 rows where comments = 4 and "updated_at" is on date 2022-04-09 sorted by updated_at descending


state 2

  • open 7
  • closed 3

type 1

  • issue 10

repo 1

  • xarray 10
id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
643035732 MDU6SXNzdWU2NDMwMzU3MzI= 4169 "write to read-only" Error in xarray.open_mfdataset() when trying to write to a netcdf file EliT1626 65610153 closed 0     4 2020-06-22T12:35:57Z 2022-04-09T15:50:51Z 2022-04-09T15:50:51Z NONE      

Code Sample

```python
import datetime as dt
from itertools import groupby

import pandas as pd
import xarray as xr

xr.set_options(file_cache_maxsize=10)

# Assumes daily increments
def list_dates(start, end):
    num_days = (end - start).days
    return [start + dt.timedelta(days=x) for x in range(num_days)]

def list_dates1(start, end):
    num_days = (end - start).days
    dates = [start + dt.timedelta(days=x) for x in range(num_days)]
    sorted_dates = sorted(dates, key=lambda date: (date.month, date.day))
    grouped_dates = [list(g) for _, g in groupby(sorted_dates, key=lambda date: (date.month, date.day))]
    return grouped_dates

start_date = dt.date(2010, 1, 1)
end_date = dt.date(2019, 12, 31)
date_list = list_dates1(start_date, end_date)
window1 = dt.timedelta(days=5)
window2 = dt.timedelta(days=6)

url = 'https://www.ncei.noaa.gov/thredds/dodsC/OisstBase/NetCDF/V2.1/AVHRR/{0:%Y%m}/oisst-avhrr-v02r01.{0:%Y%m%d}.nc'
end_date2 = dt.date(2010, 1, 2)

sst_mean = []
cur_date = start_date

for cur_date in date_list:
    sst_mean_calc = []
    for i in cur_date:
        date_window = list_dates(i - window1, i + window2)
        url_list_window = [url.format(x) for x in date_window]
        window_data = xr.open_mfdataset(url_list_window).sst
        sst_mean_calc.append(window_data.mean('time'))
    sst_mean.append(xr.concat(sst_mean_calc, dim='time').mean('time'))
    cur_date += cur_date
    if cur_date[0] >= end_date2:
        break
    else:
        continue

sst_mean_climo_test = xr.concat(sst_mean, dim='time')

sst_std = xr.concat(sst_std_calc, dim=pd.DatetimeIndex(date_list, name='time'))

sst_min = xr.concat(sst_min_calc, dim=pd.DatetimeIndex(date_list, name='time'))

sst_max = xr.concat(sst_max_calc, dim=pd.DatetimeIndex(date_list, name='time'))

sst_mean_climo_test.to_netcdf(path='E:/Riskpulse_HD/SST_stuff/sst_mean_climo_test')
```

Explanation of Code

This code (a climatology for SSTs) creates a list of dates between the specified start and end dates that contains the same day number for every month through the year span. For example, date_list[0] contains 10 datetime dates that start with 1-1-2010, 1-1-2011...1-1-2019. I then request OISST data from an OPeNDAP server and take a centered mean of the date in question (in this case I did it for the first and second of January). In other words, I am opening the files for Dec 27-Jan 6 and averaging all of them together. The final xarray dataset then contains two 'times', which is 10 years worth of data for Jan 1 and Jan 2. I then want to write this to a netCDF file so that I can save it on my local machine and use it to create plots down the road. Hope this makes sense.

Error Messages

``` KeyError Traceback (most recent call last) ~\Anaconda3\lib\site-packages\xarray\backends\file_manager.py in _acquire_with_cache_info(self, needs_lock) 197 try: --> 198 file = self._cache[self._key] 199 except KeyError:

~\Anaconda3\lib\site-packages\xarray\backends\lru_cache.py in getitem(self, key) 52 with self._lock: ---> 53 value = self._cache[key] 54 self._cache.move_to_end(key)

KeyError: [<class 'netCDF4._netCDF4.Dataset'>, ('https://www.ncei.noaa.gov/thredds/dodsC/OisstBase/NetCDF/V2.1/AVHRR/201801/oisst-avhrr-v02r01.20180106.nc',), 'r', (('clobber', True), ('diskless', False), ('format', 'NETCDF4'), ('persist', False))]

During handling of the above exception, another exception occurred:

RuntimeError Traceback (most recent call last) <ipython-input-3-f8395dcffb5e> in <module> 1 #xr.set_options(file_cache_maxsize=500) ----> 2 sst_mean_climo_test.to_netcdf(path='E:/Riskpulse_HD/SST_stuff/sst_mean_climo_test')

~\Anaconda3\lib\site-packages\xarray\core\dataarray.py in to_netcdf(self, args, kwargs) 2356 dataset = self.to_dataset() 2357 -> 2358 return dataset.to_netcdf(args, **kwargs) 2359 2360 def to_dict(self, data: bool = True) -> dict:

~\Anaconda3\lib\site-packages\xarray\core\dataset.py in to_netcdf(self, path, mode, format, group, engine, encoding, unlimited_dims, compute, invalid_netcdf) 1552 unlimited_dims=unlimited_dims, 1553 compute=compute, -> 1554 invalid_netcdf=invalid_netcdf, 1555 ) 1556

~\Anaconda3\lib\site-packages\xarray\backends\api.py in to_netcdf(dataset, path_or_file, mode, format, group, engine, encoding, unlimited_dims, compute, multifile, invalid_netcdf) 1095 return writer, store 1096 -> 1097 writes = writer.sync(compute=compute) 1098 1099 if path_or_file is None:

~\Anaconda3\lib\site-packages\xarray\backends\common.py in sync(self, compute) 202 compute=compute, 203 flush=True, --> 204 regions=self.regions, 205 ) 206 self.sources = []

~\Anaconda3\lib\site-packages\dask\array\core.py in store(sources, targets, lock, regions, compute, return_stored, kwargs) 943 944 if compute: --> 945 result.compute(kwargs) 946 return None 947 else:

~\Anaconda3\lib\site-packages\dask\base.py in compute(self, kwargs) 164 dask.base.compute 165 """ --> 166 (result,) = compute(self, traverse=False, kwargs) 167 return result 168

~\Anaconda3\lib\site-packages\dask\base.py in compute(args, kwargs) 442 postcomputes.append(x.dask_postcompute()) 443 --> 444 results = schedule(dsk, keys, kwargs) 445 return repack([f(r, a) for r, (f, a) in zip(results, postcomputes)]) 446

~\Anaconda3\lib\site-packages\dask\threaded.py in get(dsk, result, cache, num_workers, pool, kwargs) 82 get_id=_thread_get_id, 83 pack_exception=pack_exception, ---> 84 kwargs 85 ) 86

~\Anaconda3\lib\site-packages\dask\local.py in get_async(apply_async, num_workers, dsk, result, cache, get_id, rerun_exceptions_locally, pack_exception, raise_exception, callbacks, dumps, loads, **kwargs) 484 _execute_task(task, data) # Re-execute locally 485 else: --> 486 raise_exception(exc, tb) 487 res, worker_id = loads(res_info) 488 state["cache"][key] = res

~\Anaconda3\lib\site-packages\dask\local.py in reraise(exc, tb) 314 if exc.traceback is not tb: 315 raise exc.with_traceback(tb) --> 316 raise exc 317 318

~\Anaconda3\lib\site-packages\dask\local.py in execute_task(key, task_info, dumps, loads, get_id, pack_exception) 220 try: 221 task, data = loads(task_info) --> 222 result = _execute_task(task, data) 223 id = get_id() 224 result = dumps((result, id))

~\Anaconda3\lib\site-packages\dask\core.py in _execute_task(arg, cache, dsk) 119 # temporaries by their reference count and can execute certain 120 # operations in-place. --> 121 return func(*(_execute_task(a, cache) for a in args)) 122 elif not ishashable(arg): 123 return arg

~\Anaconda3\lib\site-packages\dask\array\core.py in getter(a, b, asarray, lock) 98 c = a[b] 99 if asarray: --> 100 c = np.asarray(c) 101 finally: 102 if lock:

~\Anaconda3\lib\site-packages\numpy\core_asarray.py in asarray(a, dtype, order) 83 84 """ ---> 85 return array(a, dtype, copy=False, order=order) 86 87

~\Anaconda3\lib\site-packages\xarray\core\indexing.py in array(self, dtype) 489 490 def array(self, dtype=None): --> 491 return np.asarray(self.array, dtype=dtype) 492 493 def getitem(self, key):

~\Anaconda3\lib\site-packages\numpy\core_asarray.py in asarray(a, dtype, order) 83 84 """ ---> 85 return array(a, dtype, copy=False, order=order) 86 87

~\Anaconda3\lib\site-packages\xarray\core\indexing.py in array(self, dtype) 651 652 def array(self, dtype=None): --> 653 return np.asarray(self.array, dtype=dtype) 654 655 def getitem(self, key):

~\Anaconda3\lib\site-packages\numpy\core_asarray.py in asarray(a, dtype, order) 83 84 """ ---> 85 return array(a, dtype, copy=False, order=order) 86 87

~\Anaconda3\lib\site-packages\xarray\core\indexing.py in array(self, dtype) 555 def array(self, dtype=None): 556 array = as_indexable(self.array) --> 557 return np.asarray(array[self.key], dtype=None) 558 559 def transpose(self, order):

~\Anaconda3\lib\site-packages\numpy\core_asarray.py in asarray(a, dtype, order) 83 84 """ ---> 85 return array(a, dtype, copy=False, order=order) 86 87

~\Anaconda3\lib\site-packages\xarray\coding\variables.py in array(self, dtype) 70 71 def array(self, dtype=None): ---> 72 return self.func(self.array) 73 74 def repr(self):

~\Anaconda3\lib\site-packages\xarray\coding\variables.py in _scale_offset_decoding(data, scale_factor, add_offset, dtype) 216 217 def _scale_offset_decoding(data, scale_factor, add_offset, dtype): --> 218 data = np.array(data, dtype=dtype, copy=True) 219 if scale_factor is not None: 220 data *= scale_factor

~\Anaconda3\lib\site-packages\xarray\coding\variables.py in array(self, dtype) 70 71 def array(self, dtype=None): ---> 72 return self.func(self.array) 73 74 def repr(self):

~\Anaconda3\lib\site-packages\xarray\coding\variables.py in _apply_mask(data, encoded_fill_values, decoded_fill_value, dtype) 136 ) -> np.ndarray: 137 """Mask all matching values in a NumPy arrays.""" --> 138 data = np.asarray(data, dtype=dtype) 139 condition = False 140 for fv in encoded_fill_values:

~\Anaconda3\lib\site-packages\numpy\core_asarray.py in asarray(a, dtype, order) 83 84 """ ---> 85 return array(a, dtype, copy=False, order=order) 86 87

~\Anaconda3\lib\site-packages\xarray\core\indexing.py in array(self, dtype) 555 def array(self, dtype=None): 556 array = as_indexable(self.array) --> 557 return np.asarray(array[self.key], dtype=None) 558 559 def transpose(self, order):

~\Anaconda3\lib\site-packages\xarray\backends\netCDF4_.py in getitem(self, key) 71 def getitem(self, key): 72 return indexing.explicit_indexing_adapter( ---> 73 key, self.shape, indexing.IndexingSupport.OUTER, self._getitem 74 ) 75

~\Anaconda3\lib\site-packages\xarray\core\indexing.py in explicit_indexing_adapter(key, shape, indexing_support, raw_indexing_method) 835 """ 836 raw_key, numpy_indices = decompose_indexer(key, shape, indexing_support) --> 837 result = raw_indexing_method(raw_key.tuple) 838 if numpy_indices.tuple: 839 # index the loaded np.ndarray

~\Anaconda3\lib\site-packages\xarray\backends\netCDF4_.py in _getitem(self, key) 82 try: 83 with self.datastore.lock: ---> 84 original_array = self.get_array(needs_lock=False) 85 array = getitem(original_array, key) 86 except IndexError:

~\Anaconda3\lib\site-packages\xarray\backends\netCDF4_.py in get_array(self, needs_lock) 61 62 def get_array(self, needs_lock=True): ---> 63 ds = self.datastore._acquire(needs_lock) 64 variable = ds.variables[self.variable_name] 65 variable.set_auto_maskandscale(False)

~\Anaconda3\lib\site-packages\xarray\backends\netCDF4_.py in _acquire(self, needs_lock) 359 360 def _acquire(self, needs_lock=True): --> 361 with self._manager.acquire_context(needs_lock) as root: 362 ds = _nc4_require_group(root, self._group, self._mode) 363 return ds

~\Anaconda3\lib\contextlib.py in enter(self) 110 del self.args, self.kwds, self.func 111 try: --> 112 return next(self.gen) 113 except StopIteration: 114 raise RuntimeError("generator didn't yield") from None

~\Anaconda3\lib\site-packages\xarray\backends\file_manager.py in acquire_context(self, needs_lock) 184 def acquire_context(self, needs_lock=True): 185 """Context manager for acquiring a file.""" --> 186 file, cached = self._acquire_with_cache_info(needs_lock) 187 try: 188 yield file

~\Anaconda3\lib\site-packages\xarray\backends\file_manager.py in _acquire_with_cache_info(self, needs_lock) 206 # ensure file doesn't get overriden when opened again 207 self._mode = "a" --> 208 self._cache[self._key] = file 209 return file, False 210 else:

~\Anaconda3\lib\site-packages\xarray\backends\lru_cache.py in setitem(self, key, value) 71 elif self._maxsize: 72 # make room if necessary ---> 73 self._enforce_size_limit(self._maxsize - 1) 74 self._cache[key] = value 75 elif self._on_evict is not None:

~\Anaconda3\lib\site-packages\xarray\backends\lru_cache.py in _enforce_size_limit(self, capacity) 61 key, value = self._cache.popitem(last=False) 62 if self._on_evict is not None: ---> 63 self._on_evict(key, value) 64 65 def setitem(self, key: K, value: V) -> None:

~\Anaconda3\lib\site-packages\xarray\backends\file_manager.py in <lambda>(k, v) 12 # Global cache for storing open files. 13 FILE_CACHE: LRUCache[str, io.IOBase] = LRUCache( ---> 14 maxsize=cast(int, OPTIONS["file_cache_maxsize"]), on_evict=lambda k, v: v.close() 15 ) 16 assert FILE_CACHE.maxsize, "file cache must be at least size one"

netCDF4_netCDF4.pyx in netCDF4._netCDF4.Dataset.close()

netCDF4_netCDF4.pyx in netCDF4._netCDF4.Dataset._close()

netCDF4_netCDF4.pyx in netCDF4._netCDF4._ensure_nc_success()

RuntimeError: NetCDF: HDF error ```

I also tried setting xr.set_options(file_cache_maxsize=500) outside of the loop before trying to create the netCDF file, and received this error:

``` KeyError Traceback (most recent call last) ~\Anaconda3\lib\site-packages\xarray\backends\file_manager.py in _acquire_with_cache_info(self, needs_lock) 197 try: --> 198 file = self._cache[self._key] 199 except KeyError:

~\Anaconda3\lib\site-packages\xarray\backends\lru_cache.py in getitem(self, key) 52 with self._lock: ---> 53 value = self._cache[key] 54 self._cache.move_to_end(key)

KeyError: [<class 'netCDF4._netCDF4.Dataset'>, ('https://www.ncei.noaa.gov/thredds/dodsC/OisstBase/NetCDF/V2.1/AVHRR/201512/oisst-avhrr-v02r01.20151231.nc',), 'r', (('clobber', True), ('diskless', False), ('format', 'NETCDF4'), ('persist', False))]

During handling of the above exception, another exception occurred:

OSError Traceback (most recent call last) <ipython-input-4-474cdce51e60> in <module> 1 xr.set_options(file_cache_maxsize=500) ----> 2 sst_mean_climo_test.to_netcdf(path='E:/Riskpulse_HD/SST_stuff/sst_mean_climo_test')

~\Anaconda3\lib\site-packages\xarray\core\dataarray.py in to_netcdf(self, args, kwargs) 2356 dataset = self.to_dataset() 2357 -> 2358 return dataset.to_netcdf(args, **kwargs) 2359 2360 def to_dict(self, data: bool = True) -> dict:

~\Anaconda3\lib\site-packages\xarray\core\dataset.py in to_netcdf(self, path, mode, format, group, engine, encoding, unlimited_dims, compute, invalid_netcdf) 1552 unlimited_dims=unlimited_dims, 1553 compute=compute, -> 1554 invalid_netcdf=invalid_netcdf, 1555 ) 1556

~\Anaconda3\lib\site-packages\xarray\backends\api.py in to_netcdf(dataset, path_or_file, mode, format, group, engine, encoding, unlimited_dims, compute, multifile, invalid_netcdf) 1095 return writer, store 1096 -> 1097 writes = writer.sync(compute=compute) 1098 1099 if path_or_file is None:

~\Anaconda3\lib\site-packages\xarray\backends\common.py in sync(self, compute) 202 compute=compute, 203 flush=True, --> 204 regions=self.regions, 205 ) 206 self.sources = []

~\Anaconda3\lib\site-packages\dask\array\core.py in store(sources, targets, lock, regions, compute, return_stored, kwargs) 943 944 if compute: --> 945 result.compute(kwargs) 946 return None 947 else:

~\Anaconda3\lib\site-packages\dask\base.py in compute(self, kwargs) 164 dask.base.compute 165 """ --> 166 (result,) = compute(self, traverse=False, kwargs) 167 return result 168

~\Anaconda3\lib\site-packages\dask\base.py in compute(args, kwargs) 442 postcomputes.append(x.dask_postcompute()) 443 --> 444 results = schedule(dsk, keys, kwargs) 445 return repack([f(r, a) for r, (f, a) in zip(results, postcomputes)]) 446

~\Anaconda3\lib\site-packages\dask\threaded.py in get(dsk, result, cache, num_workers, pool, kwargs) 82 get_id=_thread_get_id, 83 pack_exception=pack_exception, ---> 84 kwargs 85 ) 86

~\Anaconda3\lib\site-packages\dask\local.py in get_async(apply_async, num_workers, dsk, result, cache, get_id, rerun_exceptions_locally, pack_exception, raise_exception, callbacks, dumps, loads, **kwargs) 484 _execute_task(task, data) # Re-execute locally 485 else: --> 486 raise_exception(exc, tb) 487 res, worker_id = loads(res_info) 488 state["cache"][key] = res

~\Anaconda3\lib\site-packages\dask\local.py in reraise(exc, tb) 314 if exc.traceback is not tb: 315 raise exc.with_traceback(tb) --> 316 raise exc 317 318

~\Anaconda3\lib\site-packages\dask\local.py in execute_task(key, task_info, dumps, loads, get_id, pack_exception) 220 try: 221 task, data = loads(task_info) --> 222 result = _execute_task(task, data) 223 id = get_id() 224 result = dumps((result, id))

~\Anaconda3\lib\site-packages\dask\core.py in _execute_task(arg, cache, dsk) 119 # temporaries by their reference count and can execute certain 120 # operations in-place. --> 121 return func(*(_execute_task(a, cache) for a in args)) 122 elif not ishashable(arg): 123 return arg

~\Anaconda3\lib\site-packages\dask\array\core.py in getter(a, b, asarray, lock) 98 c = a[b] 99 if asarray: --> 100 c = np.asarray(c) 101 finally: 102 if lock:

~\Anaconda3\lib\site-packages\numpy\core_asarray.py in asarray(a, dtype, order) 83 84 """ ---> 85 return array(a, dtype, copy=False, order=order) 86 87

~\Anaconda3\lib\site-packages\xarray\core\indexing.py in array(self, dtype) 489 490 def array(self, dtype=None): --> 491 return np.asarray(self.array, dtype=dtype) 492 493 def getitem(self, key):

~\Anaconda3\lib\site-packages\numpy\core_asarray.py in asarray(a, dtype, order) 83 84 """ ---> 85 return array(a, dtype, copy=False, order=order) 86 87

~\Anaconda3\lib\site-packages\xarray\core\indexing.py in array(self, dtype) 651 652 def array(self, dtype=None): --> 653 return np.asarray(self.array, dtype=dtype) 654 655 def getitem(self, key):

~\Anaconda3\lib\site-packages\numpy\core_asarray.py in asarray(a, dtype, order) 83 84 """ ---> 85 return array(a, dtype, copy=False, order=order) 86 87

~\Anaconda3\lib\site-packages\xarray\core\indexing.py in array(self, dtype) 555 def array(self, dtype=None): 556 array = as_indexable(self.array) --> 557 return np.asarray(array[self.key], dtype=None) 558 559 def transpose(self, order):

~\Anaconda3\lib\site-packages\numpy\core_asarray.py in asarray(a, dtype, order) 83 84 """ ---> 85 return array(a, dtype, copy=False, order=order) 86 87

~\Anaconda3\lib\site-packages\xarray\coding\variables.py in array(self, dtype) 70 71 def array(self, dtype=None): ---> 72 return self.func(self.array) 73 74 def repr(self):

~\Anaconda3\lib\site-packages\xarray\coding\variables.py in _scale_offset_decoding(data, scale_factor, add_offset, dtype) 216 217 def _scale_offset_decoding(data, scale_factor, add_offset, dtype): --> 218 data = np.array(data, dtype=dtype, copy=True) 219 if scale_factor is not None: 220 data *= scale_factor

~\Anaconda3\lib\site-packages\xarray\coding\variables.py in array(self, dtype) 70 71 def array(self, dtype=None): ---> 72 return self.func(self.array) 73 74 def repr(self):

~\Anaconda3\lib\site-packages\xarray\coding\variables.py in _apply_mask(data, encoded_fill_values, decoded_fill_value, dtype) 136 ) -> np.ndarray: 137 """Mask all matching values in a NumPy arrays.""" --> 138 data = np.asarray(data, dtype=dtype) 139 condition = False 140 for fv in encoded_fill_values:

~\Anaconda3\lib\site-packages\numpy\core_asarray.py in asarray(a, dtype, order) 83 84 """ ---> 85 return array(a, dtype, copy=False, order=order) 86 87

~\Anaconda3\lib\site-packages\xarray\core\indexing.py in array(self, dtype) 555 def array(self, dtype=None): 556 array = as_indexable(self.array) --> 557 return np.asarray(array[self.key], dtype=None) 558 559 def transpose(self, order):

~\Anaconda3\lib\site-packages\xarray\backends\netCDF4_.py in getitem(self, key) 71 def getitem(self, key): 72 return indexing.explicit_indexing_adapter( ---> 73 key, self.shape, indexing.IndexingSupport.OUTER, self._getitem 74 ) 75

~\Anaconda3\lib\site-packages\xarray\core\indexing.py in explicit_indexing_adapter(key, shape, indexing_support, raw_indexing_method) 835 """ 836 raw_key, numpy_indices = decompose_indexer(key, shape, indexing_support) --> 837 result = raw_indexing_method(raw_key.tuple) 838 if numpy_indices.tuple: 839 # index the loaded np.ndarray

~\Anaconda3\lib\site-packages\xarray\backends\netCDF4_.py in _getitem(self, key) 82 try: 83 with self.datastore.lock: ---> 84 original_array = self.get_array(needs_lock=False) 85 array = getitem(original_array, key) 86 except IndexError:

~\Anaconda3\lib\site-packages\xarray\backends\netCDF4_.py in get_array(self, needs_lock) 61 62 def get_array(self, needs_lock=True): ---> 63 ds = self.datastore._acquire(needs_lock) 64 variable = ds.variables[self.variable_name] 65 variable.set_auto_maskandscale(False)

~\Anaconda3\lib\site-packages\xarray\backends\netCDF4_.py in _acquire(self, needs_lock) 359 360 def _acquire(self, needs_lock=True): --> 361 with self._manager.acquire_context(needs_lock) as root: 362 ds = _nc4_require_group(root, self._group, self._mode) 363 return ds

~\Anaconda3\lib\contextlib.py in enter(self) 110 del self.args, self.kwds, self.func 111 try: --> 112 return next(self.gen) 113 except StopIteration: 114 raise RuntimeError("generator didn't yield") from None

~\Anaconda3\lib\site-packages\xarray\backends\file_manager.py in acquire_context(self, needs_lock) 184 def acquire_context(self, needs_lock=True): 185 """Context manager for acquiring a file.""" --> 186 file, cached = self._acquire_with_cache_info(needs_lock) 187 try: 188 yield file

~\Anaconda3\lib\site-packages\xarray\backends\file_manager.py in _acquire_with_cache_info(self, needs_lock) 202 kwargs = kwargs.copy() 203 kwargs["mode"] = self._mode --> 204 file = self._opener(self._args, *kwargs) 205 if self._mode == "w": 206 # ensure file doesn't get overriden when opened again

netCDF4_netCDF4.pyx in netCDF4._netCDF4.Dataset.init()

netCDF4_netCDF4.pyx in netCDF4._netCDF4._ensure_nc_success()

OSError: [Errno -37] NetCDF: Write to read only: b'https://www.ncei.noaa.gov/thredds/dodsC/OisstBase/NetCDF/V2.1/AVHRR/201512/oisst-avhrr-v02r01.20151231.nc' ```

I believe these errors have something to do with an issue that I created a couple of weeks ago (https://github.com/pydata/xarray/issues/4082).

I'm not sure if you can @ users on here, but @rsignell-usgs found out something about the caching beforehand. It seems that this is some sort of Windows issue.
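If the root cause is xarray's file cache evicting and re-opening the remote OPeNDAP files while to_netcdf() computes the lazy window means (an assumption based on the tracebacks above, not something verified), one way to sidestep it would be to force each window mean into memory while its source files are still open, so the final concat and write no longer touch the remote URLs:

```python
# hypothetical change inside the inner loop: load each window mean eagerly
window_data = xr.open_mfdataset(url_list_window).sst
sst_mean_calc.append(window_data.mean('time').load())
```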

Versions python: 3.7.4 xarray: 0.15.1 pandas: 1.0.3 numpy: 1.18.1 scipy: 1.4.1 netcdf4: 1.4.2

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4169/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
878481461 MDU6SXNzdWU4Nzg0ODE0NjE= 5276 open_mfdataset: Not a valid ID minhhg 11815787 closed 0     4 2021-05-07T05:34:02Z 2022-04-09T15:49:50Z 2022-04-09T15:49:50Z NONE      

I have about 601 NETCDF4 files saved using xarray. We use open_mfdataset to access these files, and the main code calls this function many times. The first few calls work fine, but after a while it throws the following error message: "RuntimeError: NetCDF: Not a valid ID".

```python
def func(xpath, spec):
    doc = deepcopy(spec)
    with xr.open_mfdataset(xpath + "/*.nc", concat_dim='maturity') as data:
        var_name = list(data.data_vars)[0]
        ar = data[var_name]
        maturity = spec['maturity']
        ann = ar.cumsum(dim='maturity')
        ann = ann - 1
        ar1 = ann.sel(maturity=maturity)
        doc['data'] = ar1.load().values
    return doc
```

Environment:

Output of <tt>xr.show_versions()</tt> INSTALLED VERSIONS ------------------ commit: None python: 3.6.8.final.0 python-bits: 64 OS: Linux OS-release: 5.4.0-1047-aws machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: C.UTF-8 LOCALE: en_US.UTF-8 xarray: 0.11.0 pandas: 0.24.1 numpy: 1.15.4 scipy: 1.2.0 netCDF4: 1.4.2 h5netcdf: None h5py: 2.9.0 Nio: None zarr: None cftime: 1.0.3.4 PseudonetCDF: None rasterio: None iris: None bottleneck: 1.2.1 cyordereddict: None dask: 1.1.1 distributed: 1.25.3 matplotlib: 3.0.2 cartopy: None seaborn: 0.9.0 setuptools: 40.7.3 pip: 19.0.1 conda: None pytest: 4.2.0 IPython: 7.1.1 sphinx: 1.8.4

This error also happens with xarray version 0.10.9

Error trace:

```python 2021-05-05 09:28:19,911, DEBUG 7621, sim_io.py:483 - load_unique_document(), xpa th=/home/ubuntu/runs/20210331_001/nominal_dfs/uk 2021-05-05 09:28:42,774, ERROR 7621, run_gov_ret.py:33 - <module>(), Unknown error=NetCDF: Not a valid ID Traceback (most recent call last): File "/home/ubuntu/dev/py36/python/ev/model/api3/run_gov_ret.py", line 31, in <module> res = govRet() File "/home/ubuntu/dev/py36/python/ev/model/api3/returns.py", line 56, in __ca ll__ decompose=self.decompose)) File "/home/ubuntu/dev/py36/python/ev/model/returns/returnsGenerator.py", line 70, in calc_returns dfs_data = self.mongo_dfs.get_data(mats=[1,mat,mat-1]) File "/home/ubuntu/dev/py36/python/ev/model/api3/dfs.py", line 262, in get_dat a record = self.mdb.load_unique_document(self.dfs_collection_name, spec) File "/home/ubuntu/dev/py36/python/ev/model/api3/sim_io.py", line 1109, in load_unique_document return self.collections[collection].load_unique_document(query, *args, **kwargs) File "/home/ubuntu/dev/py36/python/ev/model/api3/sim_io.py", line 501, in load_unique_document doc['data'] = ar1.load().values File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/xarray/core/dataarray.py", line 631, in load ds = self._to_temp_dataset().load(**kwargs) File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/xarray/core/dataset.py", line 494, in load evaluated_data = da.compute(*lazy_data.values(), **kwargs) File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/dask/base.py", line 398, in compute results = schedule(dsk, keys, **kwargs) File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/dask/threaded.py", line 76, in get pack_exception=pack_exception, **kwargs) pack_exception=pack_exception, **kwargs) File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/dask/local .py", line 459, in get_async raise_exception(exc, tb) File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/dask/compa tibility.py", line 112, in reraise raise exc File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/dask/local .py", line 230, in execute_task result = _execute_task(task, data) File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/dask/core. 
py", line 119, in _execute_task return func(*args2) File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/dask/array /core.py", line 82, in getter c = np.asarray(c) File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/numpy/core /numeric.py", line 501, in asarray return array(a, dtype, copy=False, order=order) File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/xarray/cor e/indexing.py", line 602, in __array__ return np.asarray(self.array, dtype=dtype) File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/numpy/core/numeric.py", line 501, in asarray return array(a, dtype, copy=False, order=order) File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/xarray/core/indexing.py", line 508, in __array__ return np.asarray(array[self.key], dtype=None) File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/xarray/backends/netCDF4_.py", line 64, in __getitem__ self._getitem) File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/xarray/core/indexing.py", line 776, in explicit_indexing_adapter result = raw_indexing_method(raw_key.tuple) File "/home/ubuntu/miniconda3/envs/egan/lib/python3.6/site-packages/xarray/backends/netCDF4_.py", line 76, in _getitem array = getitem(original_array, key) File "netCDF4/_netCDF4.pyx", line 4095, in netCDF4._netCDF4.Variable.__getitem__ File "netCDF4/_netCDF4.pyx", line 3798, in netCDF4._netCDF4.Variable.shape.__get__ File "netCDF4/_netCDF4.pyx", line 3746, in netCDF4._netCDF4.Variable._getdims File "netCDF4/_netCDF4.pyx", line 1754, in netCDF4._netCDF4._ensure_nc_success RuntimeError: NetCDF: Not a valid ID ```
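One knob that may be relevant here (a guess, not a confirmed fix): xarray keeps opened netCDF files in a bounded LRU cache (default size 128), and opening ~601 files per open_mfdataset call can force handles to be closed while dask still holds references to them. The limit can be raised before the loop of calls:

```python
import xarray as xr

# raise the open-file cache above the number of files touched per call
# (assumption: cache eviction is what invalidates the netCDF IDs)
xr.set_options(file_cache_maxsize=1024)
```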
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5276/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
208312826 MDU6SXNzdWUyMDgzMTI4MjY= 1273 replace a dim with a coordinate from another dataset rabernat 1197350 open 0     4 2017-02-17T02:15:36Z 2022-04-09T15:26:20Z   MEMBER      

I often want a function that takes a dataarray / dataset and replaces a dimension with a coordinate from a different dataset.

@shoyer proposed the following simple solution:

```python
def replace_dim(da, olddim, newdim):
    renamed = da.rename({olddim: newdim.name})

    # note that alignment along a dimension is skipped when you are overriding
    # the relevant coordinate values
    renamed.coords[newdim.name] = newdim
    return renamed
```
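For illustration, a minimal usage sketch (the arrays here are made up):

```python
import numpy as np
import xarray as xr

# data whose 'x' dimension has no meaningful coordinate
da = xr.DataArray(np.arange(3), dims='x')

# coordinate taken from another dataset, with the same length and the name 'lon'
lon = xr.DataArray([10.0, 20.0, 30.0], dims='lon', name='lon')

da_lon = replace_dim(da, 'x', lon)
# da_lon now has dimension 'lon' with coordinate values [10., 20., 30.]
```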

Is this of broad enough interest to add a built-in method for?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1273/reactions",
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
438947247 MDU6SXNzdWU0Mzg5NDcyNDc= 2933 Stack() & unstack() issues on Multindex ray306 1559890 closed 0     4 2019-04-30T19:47:51Z 2022-04-09T15:23:28Z 2022-04-09T15:23:28Z NONE      

I would like to reshape the DataArray by one level of the MultiIndex, and I thought stack()/unstack() should be the solution.

Make a DataArray with a MultiIndex:

```python
import numpy as np
import pandas as pd

arrays = [np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo']),
          np.array(['one', 'two', 'one', 'two', 'one', 'two'])]
da = pd.DataFrame(np.random.randn(6, 4)).to_xarray().to_array()
da.coords['index'] = pd.MultiIndex.from_arrays(arrays, names=['first', 'second'])
da
```

```
<xarray.DataArray (variable: 4, index: 6)>
array([[ 0.379189,  1.082292, -2.073478, -0.84626 , -1.529927, -0.837407],
       [-0.267983, -0.2516  , -1.016653, -0.085762, -0.058382, -0.667891],
       [-0.013488, -0.855332, -0.038072, -0.385211, -2.149742, -0.304361],
       [ 1.749561, -0.606031,  1.914146,  1.6292  , -0.515519,  1.996283]])
Coordinates:
  * index     (index) MultiIndex
  - first     (index) object 'bar' 'bar' 'baz' 'baz' 'foo' 'foo'
  - second    (index) object 'one' 'two' 'one' 'two' 'one' 'two'
  * variable  (variable) int32 0 1 2 3
```

Stack problem:

I want one dimension to merge into another:

```python
da.stack({'index': ['variable']})
```

```
ValueError: cannot create a new dimension with the same name as an existing dimension
```

Unstack problem:

Unstacking by the whole MultiIndex worked:

```python
da.unstack('index')
```

```
<xarray.DataArray (variable: 4, first: 3, second: 2)>
array([[[ 0.379189,  1.082292],
        [-2.073478, -0.84626 ],
        [-1.529927, -0.837407]],

       [[-0.267983, -0.2516  ],
        [-1.016653, -0.085762],
        [-0.058382, -0.667891]],

       [[-0.013488, -0.855332],
        [-0.038072, -0.385211],
        [-2.149742, -0.304361]],

       [[ 1.749561, -0.606031],
        [ 1.914146,  1.6292  ],
        [-0.515519,  1.996283]]])
Coordinates:
  * variable  (variable) int32 0 1 2 3
  * first     (first) object 'bar' 'baz' 'foo'
  * second    (second) object 'one' 'two'
```

But unstacking by a specified level failed:

```python
da.unstack('first')
```

```
ValueError: Dataset does not contain the dimensions: ['first']
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2933/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
607718350 MDU6SXNzdWU2MDc3MTgzNTA= 4011 missing empty group when iterate over groupby_bins miniufo 9312831 open 0     4 2020-04-27T17:22:31Z 2022-04-09T03:08:14Z   NONE      

When I try to iterate over the grouped object returned by groupby_bins, I found that empty groups go missing silently. Here is a simple case:

```python
import numpy as np
import xarray as xr

array = xr.DataArray(np.arange(4), dims='dim_0')

# one of these bins will be empty
bins = [0, 4, 5]
grouped = array.groupby_bins('dim_0', bins)

for i, group in enumerate(grouped):
    print(str(i) + ' ' + group)
```

When a bin contains no samples (here the bin (4, 5]), the empty group is dropped. How can I iterate over the full set of bins even when some of them contain nothing? I've read the related issue #1019, but my case needs the correct order in grouped, and the empty groups need to be iterated over as well.
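A possible workaround, sketched under the assumption that the group labels are the default pandas Interval objects produced by groupby_bins: build the full set of bins independently and look each one up, so empty bins can still be visited in order:

```python
import numpy as np
import pandas as pd
import xarray as xr

array = xr.DataArray(np.arange(4), dims='dim_0')
bins = [0, 4, 5]

# non-empty groups keyed by their interval label
groups = dict(array.groupby_bins('dim_0', bins))

# visit every bin in order, handling empty ones explicitly
for i, interval in enumerate(pd.IntervalIndex.from_breaks(bins)):
    group = groups.get(interval)
    if group is None:
        print(i, interval, 'empty bin')
    else:
        print(i, interval, group.values)
```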

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4011/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
673682661 MDU6SXNzdWU2NzM2ODI2NjE= 4313 Using Dependabot to manage doc build and CI versions jthielen 3460034 open 0     4 2020-08-05T16:24:24Z 2022-04-09T02:59:21Z   CONTRIBUTOR      

As brought up on the bi-weekly community developers meeting, it sounds like Pandas v1.1.0 is breaking doc builds on RTD. One solution to the issues of frequent breakages in doc builds and CI due to upstream updates is having fixed version lists for all of these, which are then incrementally updated as new versions come out. @dopplershift has done a lot of great work in MetPy getting such a workflow set up with Dependabot (https://github.com/Unidata/MetPy/pull/1410) among other CI updates, and this could be adapted for use here in xarray.

We've generally been quite happy with our updated CI configuration with Dependabot over the past couple weeks. The only major issue has been https://github.com/Unidata/MetPy/issues/1424 / https://github.com/dependabot/dependabot-core/issues/2198#issuecomment-649726022, which has required some contributors to have to delete and recreate their forks in order for Dependabot to not auto-submit PRs to the forked repos.

Any thoughts that you had here @dopplershift would be appreciated!

xref https://github.com/pydata/xarray/issues/4287, https://github.com/pydata/xarray/pull/4296

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4313/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
484699415 MDU6SXNzdWU0ODQ2OTk0MTU= 3256 .item() on a DataArray with dtype='datetime64[ns]' returns int IvoCrnkovic 1778852 open 0     4 2019-08-23T20:29:50Z 2022-04-09T02:03:43Z   NONE      

MCVE Code Sample

```python
import datetime
import xarray as xr

test_da = xr.DataArray(datetime.datetime(2019, 1, 1, 1, 1))

test_da
# <xarray.DataArray ()>
# array('2019-01-01T01:01:00.000000000', dtype='datetime64[ns]')

test_da.item()
# 1546304460000000000
```

Expected Output

I would think it would be nicer to get a datetime out of the .item() call than the nanosecond representation.
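As a workaround on the caller's side (a sketch, not the library's intended behaviour), the value can be recovered through pandas, or by dropping to microsecond precision before calling .item():

```python
import pandas as pd

# interpret the nanosecond integer returned by .item()
ts = pd.Timestamp(test_da.item())            # Timestamp('2019-01-01 01:01:00')

# or reduce precision so that numpy's .item() yields a datetime.datetime
py_dt = test_da.values.astype('datetime64[us]').item()
```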

Output of xr.show_versions()

When I call xr.show_versions() I get an error, but I'm running xarray 0.12.3.
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3256/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
882105903 MDU6SXNzdWU4ODIxMDU5MDM= 5281 'Parallelized' apply_ufunc for scripy.interpolate.griddata LJaksic 74414841 open 0     4 2021-05-09T10:08:46Z 2022-04-09T01:39:13Z   NONE      

Hi,

I'm working with large files from an ocean model with an unstructured grid. For instance, the flow velocity variable ux has dimensions (194988, 1009, 20) for, respectively, 'nFlowElement' (the unstructured grid element), 'time' and 'laydim' (the depth dimension). I'd like to interpolate these results to a structured grid with dimensions (600, 560, 1009, 20) for, respectively, latitude, longitude, time and laydim. For this I am using scipy.interpolate.griddata. As these data arrays are too large to load into working memory at once, I am trying to work with 'chunks' (dask). Unfortunately, I run into problems when trying to use apply_ufunc with dask='parallelized'.

For smaller computational domains (a smaller nFlowElement dimension) I am still able to load the data array into working memory. Then the following code gives me the wanted result:

```python
import numpy as np
import xarray as xr
from scipy.interpolate import griddata

def interp_to_grid(u, xc, yc, xint, yint):
    print(u.shape, xc.shape, xint.shape)
    ug = griddata((xc, yc), u, (xint, yint), method='nearest', fill_value=np.nan)
    return ug

uxg = xr.apply_ufunc(
    interp_to_grid,
    ux, xc, yc, xint, yint,
    dask='allowed',
    input_core_dims=[['nFlowElem', 'time', 'laydim'],
                     ['nFlowElem'], ['nFlowElem'],
                     ['dim_0', 'dim_1'], ['dim_0', 'dim_1']],
    output_core_dims=[['dim_0', 'dim_1', 'time', 'laydim']],
    output_dtypes=[xr.DataArray],
)
```

Notice that in the function interp_to_grid the input variables have the following dimensions:

- u (i.e. ux, the original flow velocity output): (194988, 1009, 20) for (nFlowElem, time, laydim)
- xc, yc (the latitude and longitude coordinates associated with these 194988 elements): both (194988,)
- xint, yint (the structured grid coordinates to which I would like to interpolate the data): both (600, 560) for (dim_0, dim_1)

Notice that scipy.interpolate.griddata does not require me to loop over the time and laydim dimensions (as formulated in the code above). For this it is critical to feed griddata the dimensions in the right order ('time' and 'laydim' last). The interpolated result, uxg, has dimensions (600, 560, 1009, 20), as wanted and expected.

However, for much larger spatial domains it is necessary to work with dask='parallelized', because the input data arrays can no longer be loaded into my working memory. I have tried to apply chunks over the time dimension, but also over the nFlowElement dimension. I am aware that it is not possible to chunk over core dimensions.

This is one of my "parallel" attempts (with chunks along the time dim):

Input ux:

```
<xarray.DataArray 'ucx' (nFlowElem: 194988, time: 1009, laydim: 20)>
dask.array<transpose, shape=(194988, 1009, 20), dtype=float64, chunksize=(194988, 10, 20), chunktype=numpy.ndarray>
Coordinates:
    FlowElem_xcc  (nFlowElem) float64 dask.array<chunksize=(194988,), meta=np.ndarray>
    FlowElem_ycc  (nFlowElem) float64 dask.array<chunksize=(194988,), meta=np.ndarray>
  * time          (time) datetime64[ns] 2014-09-17 ... 2014-10-01
Dimensions without coordinates: nFlowElem, laydim
Attributes:
    standard_name:  eastward_sea_water_velocity
    long_name:      velocity on flow element center, x-component
    units:          m s-1
    grid_mapping:   wgs84
```

apply_ufunc:

```python
uxg = xr.apply_ufunc(
    interp_to_grid,
    ux, xc, yc, xint, yint,
    dask='parallelized',
    input_core_dims=[['nFlowElem'], ['nFlowElem'], ['nFlowElem'],
                     ['dim_0', 'dim_1'], ['dim_0', 'dim_1']],
    output_core_dims=[['dim_0', 'dim_1']],
    output_dtypes=[xr.DataArray],
)
```

Gives error:

```
File "interpnd.pyx", line 78, in scipy.interpolate.interpnd.NDInterpolatorBase.__init__

File "interpnd.pyx", line 192, in scipy.interpolate.interpnd._check_init_shape

ValueError: different number of values and points
```

I have played around a lot with changing the core dimensions in apply_ufunc and the dimension along which to chunk. I have also tried to manually change the order of the dimensions of the data array u which is fed to griddata (in interp_to_grid).

Any advice is very welcome! Best Wishes, Luka
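One direction that might work (a sketch only, not verified against this dataset): keep 'nFlowElem' as the only core dimension of u, so that time and laydim can stay chunked, and make the wrapped function tolerant of the extra axes that apply_ufunc then passes through. Since core dimensions are moved to the end of each block, u arrives as (..., nFlowElem) and griddata sees more values than points, which matches the error above; moving the element axis back to the front restores the layout that worked in the dask='allowed' case:

```python
import numpy as np

def interp_to_grid_blockwise(u, xc, yc, xint, yint):
    # u arrives as (..., nFlowElem) because core dims are moved to the end;
    # put the element axis first so griddata sees (npoints, extra_dims...)
    u = np.moveaxis(u, -1, 0)
    ug = griddata((xc, yc), u, (xint, yint), method='nearest', fill_value=np.nan)
    # griddata returns (dim_0, dim_1, extra_dims...); move the grid axes to the end
    return np.moveaxis(ug, [0, 1], [-2, -1])

uxg = xr.apply_ufunc(
    interp_to_grid_blockwise,
    ux, xc, yc, xint, yint,
    dask='parallelized',
    input_core_dims=[['nFlowElem'], ['nFlowElem'], ['nFlowElem'],
                     ['dim_0', 'dim_1'], ['dim_0', 'dim_1']],
    output_core_dims=[['dim_0', 'dim_1']],
    output_dtypes=[float],
)
```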

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5281/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1030768250 I_kwDOAMm_X849cEZ6 5877 Rolling() gives values different from pd.rolling() chiaral 8453445 open 0     4 2021-10-19T21:41:42Z 2022-04-09T01:29:07Z   CONTRIBUTOR      

I am not sure this is a bug - but it clearly doesn't give the results the user would expect.

The rolling sum of zeros gives me values that are not zeros

```python
import numpy as np
import xarray as xr

var = np.array([0.        , 0.        , 0.        , 0.        , 0.        ,
                0.        , 0.        , 0.31      , 0.91999996, 8.3       ,
                1.42      , 0.03      , 1.22      , 0.09999999, 0.14      ,
                0.13      , 0.        , 0.12      , 0.03      , 2.53      ,
                0.        , 0.19999999, 0.19999999, 0.        , 0.        ,
                0.        , 0.        , 0.        , 0.        , 0.        ,
                0.        , 0.        , 0.        , 0.        , 0.        ],
               dtype='float32')

timet = np.array([  43200000000000,  129600000000000,  216000000000000,
                    302400000000000,  388800000000000,  475200000000000,
                    561600000000000,  648000000000000,  734400000000000,
                    820800000000000,  907200000000000,  993600000000000,
                   1080000000000000, 1166400000000000, 1252800000000000,
                   1339200000000000, 1425600000000000, 1512000000000000,
                   1598400000000000, 1684800000000000, 1771200000000000,
                   1857600000000000, 1944000000000000, 2030400000000000,
                   2116800000000000, 2203200000000000, 2289600000000000,
                   2376000000000000, 2462400000000000, 2548800000000000,
                   2635200000000000, 2721600000000000, 2808000000000000,
                   2894400000000000, 2980800000000000], dtype='timedelta64[ns]')

ds_ex = xr.Dataset(
    data_vars=dict(pr=(["time"], var)),
    coords=dict(time=("time", timet)),
)

ds_ex.rolling(time=3).sum().pr.values
```

it gives me this result:

```
array([           nan,            nan, 0.0000000e+00, 0.0000000e+00,
       0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 3.1000000e-01,
       1.2300000e+00, 9.5300007e+00, 1.0640000e+01, 9.7500000e+00,
       2.6700001e+00, 1.3500001e+00, 1.4600002e+00, 3.7000012e-01,
       2.7000013e-01, 2.5000012e-01, 1.5000013e-01, 2.6800001e+00,
       2.5600002e+00, 2.7300003e+00, 4.0000033e-01, 4.0000033e-01,
       2.0000035e-01, 3.5762787e-07, 3.5762787e-07, 3.5762787e-07,
       3.5762787e-07, 3.5762787e-07, 3.5762787e-07, 3.5762787e-07,
       3.5762787e-07, 3.5762787e-07, 3.5762787e-07], dtype=float32)
```

Note the non-zero values: the non-zero value changes depending on whether I use float64 or float32 as the precision of my data, so this seems to be a precision-related issue (although the first values are correctly set to zero); other sums are also not exactly what they should be.

A small difference at the 8th/9th decimal place can be expected due to precision, but the fact that the 0s become non-zero is problematic imho, especially if it is not documented. Oftentimes zero in geoscience data means something very specific (i.e. zero rainfall will be characterized differently than non-zero rainfall).

in pandas this instead works:

```python
df_ex = ds_ex.to_dataframe()
df_ex.rolling(window=3).sum().values.T
```

gives me

```
array([[        nan,         nan,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.31      ,  1.22999996,  9.53000015,
        10.6400001 ,  9.75000015,  2.66999999,  1.35000001,  1.46000002,
         0.36999998,  0.27      ,  0.24999999,  0.15      ,  2.67999997,
         2.55999997,  2.72999996,  0.39999998,  0.39999998,  0.19999999,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ]])
```

What you expected to happen:

the sum of zeros should be zero. If this cannot be achieved/expected because of precision issues, it should be documented.

Anything else we need to know?:

I discovered this behavior in my old environments, but I created a new ad hoc environment with the latest versions, and it does the same thing.
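A workaround that avoids the drift (a sketch; it assumes the non-zero values come from a running-sum implementation in the accelerated rolling path, e.g. bottleneck): materialize the windows with rolling(...).construct() and reduce them directly, so each window is summed independently:

```python
exact = (
    ds_ex.pr
    .rolling(time=3)
    .construct('window')           # build an explicit window dimension
    .sum('window', skipna=False)   # sum each window independently; NaN for incomplete windows
)
```

For this example that should give exact zeros for the all-zero windows, at the cost of materializing the extra window dimension.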

Environment:

INSTALLED VERSIONS

commit: None python: 3.9.7 (default, Sep 16 2021, 08:50:36) [Clang 10.0.0 ] python-bits: 64 OS: Darwin OS-release: 17.7.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: ('en_US', 'UTF-8') libhdf5: None libnetcdf: None

xarray: 0.19.0 pandas: 1.3.3 numpy: 1.21.2 scipy: None netCDF4: None pydap: None h5netcdf: None h5py: None Nio: None zarr: None cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.3.2 dask: None distributed: None matplotlib: None cartopy: None seaborn: None numbagg: None pint: None setuptools: 58.0.4 pip: 21.2.4 conda: None pytest: None IPython: 7.28.0 sphinx: None

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5877/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
653442225 MDU6SXNzdWU2NTM0NDIyMjU= 4209 `xr.save_mfdataset()` doesn't honor `compute=False` argument andersy005 13301940 open 0     4 2020-07-08T16:40:11Z 2022-04-09T01:25:56Z   MEMBER      

What happened:

While using the xr.save_mfdataset() function with compute=False, I noticed that the function returns a dask.delayed object but doesn't actually defer the computation, i.e. it writes the datasets right away.

What you expected to happen:

I expect the datasets to be written only when I explicitly call .compute() on the returned delayed object.

Minimal Complete Verifiable Example:

```python
In [2]: import xarray as xr

In [3]: ds = xr.tutorial.open_dataset('rasm', chunks={})

In [4]: ds
Out[4]:
<xarray.Dataset>
Dimensions:  (time: 36, x: 275, y: 205)
Coordinates:
  * time     (time) object 1980-09-16 12:00:00 ... 1983-08-17 00:00:00
    xc       (y, x) float64 dask.array<chunksize=(205, 275), meta=np.ndarray>
    yc       (y, x) float64 dask.array<chunksize=(205, 275), meta=np.ndarray>
Dimensions without coordinates: x, y
Data variables:
    Tair     (time, y, x) float64 dask.array<chunksize=(36, 205, 275), meta=np.ndarray>
Attributes:
    title:                     /workspace/jhamman/processed/R1002RBRxaaa01a/l...
    institution:               U.W.
    source:                    RACM R1002RBRxaaa01a
    output_frequency:          daily
    output_mode:               averaged
    convention:                CF-1.4
    references:                Based on the initial model of Liang et al., 19...
    comment:                   Output from the Variable Infiltration Capacity...
    nco_openmp_thread_number:  1
    NCO:                       "4.6.0"
    history:                   Tue Dec 27 14:15:22 2016: ncatted -a dimension...

In [5]: path = "test.nc"

In [7]: ls -ltrh test.nc
ls: cannot access test.nc: No such file or directory

In [8]: tasks = xr.save_mfdataset(datasets=[ds], paths=[path], compute=False)

In [9]: tasks
Out[9]: Delayed('list-aa0b52e0-e909-4e65-849f-74526d137542')

In [10]: ls -ltrh test.nc
-rw-r--r-- 1 abanihi ncar 14K Jul  8 10:29 test.nc
```
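For comparison, the pattern I expected to work (a sketch, mirroring how Dataset.to_netcdf(compute=False) is documented to behave) is that nothing is written until the delayed object is computed:

```python
import dask

tasks = xr.save_mfdataset(datasets=[ds], paths=[path], compute=False)
# expectation: test.nc should not exist yet at this point ...

dask.compute(tasks)
# ... and should only appear after the delayed object is computed
```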

Anything else we need to know?:

Environment:

Output of <tt>xr.show_versions()</tt> ```python INSTALLED VERSIONS ------------------ commit: None python: 3.7.6 | packaged by conda-forge | (default, Jun 1 2020, 18:57:50) [GCC 7.5.0] python-bits: 64 OS: Linux OS-release: 3.10.0-693.21.1.el7.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: en_US.UTF-8 LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.10.5 libnetcdf: 4.7.4 xarray: 0.15.1 pandas: 0.25.3 numpy: 1.18.5 scipy: 1.5.0 netCDF4: 1.5.3 pydap: None h5netcdf: None h5py: 2.10.0 Nio: None zarr: None cftime: 1.2.0 nc_time_axis: 1.2.0 PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2.20.0 distributed: 2.20.0 matplotlib: 3.2.1 cartopy: None seaborn: None numbagg: None setuptools: 49.1.0.post20200704 pip: 20.1.1 conda: None pytest: None IPython: 7.16.1 sphinx: None ```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4209/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);