issues
3 rows where user = 42455466 sorted by updated_at descending
**Issue 6084: Initialise zarr metadata without computing dask graph**

id: 1083621690 | node_id: I_kwDOAMm_X85AlsE6 | user: dougiesquire (42455466) | state: open | locked: 0 | comments: 6
created_at: 2021-12-17T21:17:42Z | updated_at: 2024-04-03T19:08:26Z | author_association: NONE

**Is your feature request related to a problem? Please describe.**

On writing large zarr stores, the xarray docs recommend first creating an initial Zarr store without writing all of its array data. The recommended approach is to first create a dummy dask-backed `Dataset` and then write only its metadata with `to_zarr(compute=False)`.

It seems that in one common use case for this approach (including the example in the above docs), the entire dataset to be written to zarr is already represented in a dask-backed `Dataset`:

- https://discourse.pangeo.io/t/many-netcdf-to-single-zarr-store-using-concurrent-futures/2029
- https://discourse.pangeo.io/t/map-blocks-and-to-zarr-region/2019
- https://discourse.pangeo.io/t/netcdf-to-zarr-best-practices/1119/12
- https://discourse.pangeo.io/t/best-practice-for-memory-management-to-iteratively-write-a-large-dataset-with-xarray/1989

However, calling `to_zarr` with `compute=False` on such a dataset still spends time constructing the full dask graph.

**Describe the solution you'd like**

Is there scope to add an option to `to_zarr` that initialises the zarr metadata without constructing or computing the dask graph?
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6084/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
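For context, a minimal sketch of the docs-recommended pattern the issue describes; the store path, shape, and chunking here are illustrative, not taken from the issue:

```python
import dask.array as da
import xarray as xr

# Dummy dask-backed Dataset with the shape and dtype of the final store.
ds = xr.Dataset(
    {'var': (('x',), da.zeros(100, chunks=10))},
    coords={'x': range(100)},
)

# compute=False writes the zarr metadata without computing or writing
# the array data (the step this feature request wants to make cheaper).
ds.to_zarr('store.zarr', compute=False)

# Real data can then be filled in later with region writes, e.g.:
ds.isel(x=slice(0, 10)).drop_vars('x').to_zarr(
    'store.zarr', region={'x': slice(0, 10)}
)
```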
**Issue 6026: Delaying open produces different type of `cftime` object**

id: 1063046540 | node_id: I_kwDOAMm_X84_XM2M | user: dougiesquire (42455466) | state: closed | locked: 0 | comments: 3
created_at: 2021-11-25T00:47:22Z | updated_at: 2022-01-13T13:49:27Z | closed_at: 2022-01-13T13:49:27Z | author_association: NONE

**What happened:**

Delaying the task of opening a dataset (e.g. a netcdf or zarr file) with a time coordinate using `dask.delayed` produces a different type of `cftime` object than opening the dataset directly.

**What you expected to happen:**
Consistent `cftime` objects whether or not the open is delayed.

**Minimal Complete Verifiable Example:**

```python
import dask
import numpy as np
import xarray as xr
from dask.distributed import LocalCluster, Client

cluster = LocalCluster()
client = Client(cluster)

# Write some data
var = np.random.random(4)
time = xr.cftime_range('2000-01-01', periods=4, calendar='julian')
ds = xr.Dataset(data_vars={'var': ('time', var)}, coords={'time': time})
ds.to_netcdf('test.nc', mode='w')

# Open written data
ds1 = xr.open_dataset('test.nc', use_cftime=True)
print(f'ds1: {ds1.time} \n')

# Delayed open written data
ds2 = dask.delayed(xr.open_dataset)('test.nc', use_cftime=True)
ds2 = dask.compute(ds2)[0]
print(f'ds2: {ds2.time} \n')

# Operations like xr.open_mfdataset which use dask.delayed internally
# when parallel=True (I think) produce the same result as ds2
ds3 = xr.open_mfdataset('test.nc', use_cftime=True, parallel=True)
print(f'ds3: {ds3.time}')
```

The printed output for `ds2` and `ds3`:

```
ds2: <xarray.DataArray 'time' (time: 4)>
array([cftime.datetime(2000, 1, 1, 0, 0, 0, 0, calendar='julian', has_year_zero=False),
       cftime.datetime(2000, 1, 2, 0, 0, 0, 0, calendar='julian', has_year_zero=False),
       cftime.datetime(2000, 1, 3, 0, 0, 0, 0, calendar='julian', has_year_zero=False),
       cftime.datetime(2000, 1, 4, 0, 0, 0, 0, calendar='julian', has_year_zero=False)],
      dtype=object)
Coordinates:
  * time     (time) object 2000-01-01 00:00:00 ... 2000-01-04 00:00:00

ds3: <xarray.DataArray 'time' (time: 4)>
array([cftime.datetime(2000, 1, 1, 0, 0, 0, 0, calendar='julian', has_year_zero=False),
       cftime.datetime(2000, 1, 2, 0, 0, 0, 0, calendar='julian', has_year_zero=False),
       cftime.datetime(2000, 1, 3, 0, 0, 0, 0, calendar='julian', has_year_zero=False),
       cftime.datetime(2000, 1, 4, 0, 0, 0, 0, calendar='julian', has_year_zero=False)],
      dtype=object)
Coordinates:
  * time     (time) object 2000-01-01 00:00:00 ... 2000-01-04 00:00:00
```
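A quick way to make the difference concrete; this is a minimal check to run after the MCVE above, and the specific class names are inferred from the reprs and the traceback below rather than stated in the issue:

```python
# Inspect the classes of the decoded time values. With the versions in
# the environment below, the direct open returns a calendar-specific
# subclass (e.g. cftime.DatetimeJulian) while the delayed open returns
# the base cftime.datetime class, which is what later breaks mixed
# date arithmetic.
print(type(ds1.time.values[0]))  # direct open
print(type(ds2.time.values[0]))  # delayed open
assert type(ds1.time.values[0]) is not type(ds2.time.values[0])
```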
**Anything else we need to know?:**

I noticed this because the `DatetimeAccessor` methods fail on the delayed-opened data; for example, `ds3.time.dt.floor('D')` raises:

```
TypeError                                 Traceback (most recent call last)
<ipython-input-10-613e63624953> in <module>
----> 1 ds3.time.dt.floor('D')

/g/data/xv83/ds0092/software/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/core/accessor_dt.py in floor(self, freq)
    220         """
    221
--> 222         return self._tslib_round_accessor("floor", freq)
    223
    224     def ceil(self, freq):

/g/data/xv83/ds0092/software/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/core/accessor_dt.py in _tslib_round_accessor(self, name, freq)
    202     def _tslib_round_accessor(self, name, freq):
    203         obj_type = type(self._obj)
--> 204         result = _round_field(self._obj.data, name, freq)
    205         return obj_type(result, name=name, coords=self._obj.coords, dims=self._obj.dims)
    206

/g/data/xv83/ds0092/software/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/core/accessor_dt.py in _round_field(values, name, freq)
    142         )
    143     else:
--> 144         return _round_through_series_or_index(values, name, freq)
    145
    146

/g/data/xv83/ds0092/software/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/core/accessor_dt.py in _round_through_series_or_index(values, name, freq)
    110     method = getattr(values_as_cftimeindex, name)
    111
--> 112     field_values = method(freq=freq).values
    113
    114     return field_values.reshape(values.shape)

/g/data/xv83/ds0092/software/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/coding/cftimeindex.py in floor(self, freq)
    733         CFTimeIndex
    734         """
--> 735         return self._round_via_method(freq, _floor_int)
    736
    737     def ceil(self, freq):

/g/data/xv83/ds0092/software/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/coding/cftimeindex.py in _round_via_method(self, freq, method)
    714
    715         unit = _total_microseconds(offset.as_timedelta())
--> 716         values = self.asi8
    717         rounded = method(values, unit)
    718         return _cftimeindex_from_i8(rounded, self.date_type, self.name)

/g/data/xv83/ds0092/software/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/coding/cftimeindex.py in asi8(self)
    684         epoch = self.date_type(1970, 1, 1)
    685         return np.array(
--> 686             [
    687                 _total_microseconds(exact_cftime_datetime_difference(epoch, date))
    688                 for date in self.values

/g/data/xv83/ds0092/software/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/coding/cftimeindex.py in <listcomp>(.0)
    685         return np.array(
    686             [
--> 687                 _total_microseconds(exact_cftime_datetime_difference(epoch, date))
    688                 for date in self.values
    689             ],

/g/data/xv83/ds0092/software/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/core/resample_cftime.py in exact_cftime_datetime_difference(a, b)
    356     datetime.timedelta
    357     """
--> 358     seconds = b.replace(microsecond=0) - a.replace(microsecond=0)
    359     seconds = int(round(seconds.total_seconds()))
    360     microseconds = b.microsecond - a.microsecond

src/cftime/_cftime.pyx in cftime._cftime.datetime.__sub__()

TypeError: cannot compute the time difference between dates with different calendars
```

My apologies for conflating two issues here. I'm happy to open a separate issue for this if that's preferred.
**Environment:**

Output of `xr.show_versions()`:

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.9.4 | packaged by conda-forge | (default, May 10 2021, 22:13:33) [GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 4.18.0-305.19.1.el8.nci.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: None
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.10.6
libnetcdf: 4.7.4

xarray: 0.20.1
pandas: 1.3.4
numpy: 1.21.4
scipy: 1.6.3
netCDF4: 1.5.6
pydap: None
h5netcdf: 0.11.0
h5py: 3.3.0
Nio: None
zarr: 2.9.5
cftime: 1.5.0
nc_time_axis: 1.4.0
PseudoNetCDF: None
rasterio: 1.2.4
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2021.11.2
distributed: 2021.11.2
matplotlib: 3.4.2
cartopy: 0.19.0.post1
seaborn: None
numbagg: None
fsspec: 2021.05.0
cupy: None
pint: 0.18
sparse: None
setuptools: 49.6.0.post20210108
pip: 21.1.2
conda: 4.10.1
pytest: None
IPython: 7.24.0
sphinx: None
```
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6026/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
**Issue 4833: Strange behaviour when overwriting files with to_netcdf and html repr**

id: 789755611 | node_id: MDU6SXNzdWU3ODk3NTU2MTE= | user: dougiesquire (42455466) | state: closed | locked: 0 | comments: 2
created_at: 2021-01-20T08:28:35Z | updated_at: 2021-01-20T20:00:23Z | closed_at: 2021-01-20T20:00:23Z | author_association: NONE

**What happened:**

I'm experiencing some strange behaviour when overwriting netcdf files using `to_netcdf`. I can generate some data, save it to a netcdf file, reopen it and everything works as expected:

```python
import numpy as np
import xarray as xr

ones = xr.DataArray(np.ones(5), coords=[range(5)], dims=['x']).to_dataset(name='a')
ones.to_netcdf('./a.nc')
print(xr.open_dataset('./a.nc')['a'])
```

However, subsequently overwriting the same file can fail:

```
KeyError                                  Traceback (most recent call last)
.../lib/python3.8/site-packages/xarray/backends/file_manager.py in _acquire_with_cache_info(self, needs_lock)
    198         try:
--> 199             file = self._cache[self._key]
    200         except KeyError:

.../lib/python3.8/site-packages/xarray/backends/lru_cache.py in __getitem__(self, key)
     52         with self._lock:
---> 53             value = self._cache[key]
     54             self._cache.move_to_end(key)

KeyError: [<class 'netCDF4._netCDF4.Dataset'>, ('.../a.nc',), 'a', (('clobber', True), ('diskless', False), ('format', 'NETCDF4'), ('persist', False))]

During handling of the above exception, another exception occurred:

.
.
.

PermissionError: [Errno 13] Permission denied: b'.../a.nc'
```
Note that in the last example, the data saved on disk is correct (i.e. contains ones), but xarray is still somehow linked to the stale cached file handle.

**Anything else we need to know?:**

I've come across this unexpected behaviour a few times. In the above example, I've had to add an explicit workaround to get the overwrite to succeed.
{ "url": "https://api.github.com/repos/pydata/xarray/issues/4833/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue |
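For illustration, a minimal sketch of the kind of workaround referred to above, assuming the stale cached netCDF4 handle (visible in the `file_manager`/`lru_cache` frames of the traceback) is what blocks the overwrite; this is an illustration, not necessarily the fix adopted for the issue:

```python
import numpy as np
import xarray as xr

ones = xr.DataArray(np.ones(5), coords=[range(5)], dims=['x']).to_dataset(name='a')
ones.to_netcdf('./a.nc')

# Pull the values into memory and close the file explicitly, so that
# xarray's cached file handle for ./a.nc is released before the path
# is overwritten.
ds = xr.open_dataset('./a.nc')
ds.load()
ds.close()

# With no open handle left on ./a.nc, overwriting it behaves as expected.
(ones * 2).to_netcdf('./a.nc', mode='w')
```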
Table schema:

```sql
CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo] ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone] ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee] ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user] ON [issues] ([user]);
```
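Given this schema, the view at the top of this page ("3 rows where user = 42455466 sorted by updated_at descending") corresponds to a query like the following sketch; the database filename `github.db` is an assumption for illustration:

```python
import sqlite3

conn = sqlite3.connect('github.db')  # hypothetical filename
rows = conn.execute(
    '''
    SELECT number, title, state, updated_at
    FROM issues
    WHERE user = ?
    ORDER BY updated_at DESC
    ''',
    (42455466,),
).fetchall()
for number, title, state, updated_at in rows:
    print(number, state, updated_at, title)
```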