
issues


22 rows where state = "closed", type = "issue" and user = 17162724 sorted by updated_at descending

id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1013881639 I_kwDOAMm_X848bpsn 5836 to_zarr returns <xarray.backends.zarr.ZarrStore at ...> raybellwaves 17162724 closed 0     2 2021-10-02T01:53:27Z 2023-03-12T15:59:14Z 2023-03-12T15:59:13Z CONTRIBUTOR      

What happened:

When calling to_zarr it returns <xarray.backends.zarr.ZarrStore at ...>

What you expected to happen:

It should return None, the same behaviour as pandas.DataFrame.to_parquet().

Minimal Complete Verifiable Example:

```python
import xarray as xr

ds = xr.tutorial.open_dataset("air_temperature")
ds.to_zarr("ds.zarr")

ds.to_dataframe().to_parquet("ds.parquet")
```
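For comparison, pandas writers return None when given a path. A minimal sketch of that convention, using to_csv so no parquet engine is required (the csv writer here stands in for to_parquet):

```python
import pandas as pd

# pandas writers return None when writing to a path;
# the request is for Dataset.to_zarr to follow the same convention
df = pd.DataFrame({"air": [1.0, 2.0]})
result = df.to_csv("ds.csv")
print(result)  # None
```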

Anything else we need to know?:

Environment:

Output of <tt>xr.show_versions()</tt>

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.9.7 | packaged by conda-forge | (default, Sep 23 2021, 07:31:23) [Clang 11.1.0 ]
python-bits: 64
OS: Darwin
OS-release: 20.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.10.6
libnetcdf: 4.8.0
xarray: 0.19.0
pandas: 1.3.3
numpy: 1.21.2
scipy: 1.7.1
netCDF4: 1.5.7
pydap: installed
h5netcdf: 0.11.0
h5py: 3.3.0
Nio: None
zarr: 2.10.0
cftime: 1.5.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.2.6
cfgrib: 0.9.9.0
iris: None
bottleneck: 1.3.2
dask: 2021.09.1
distributed: 2021.09.1
matplotlib: 3.4.3
cartopy: 0.20.0
seaborn: 0.11.2
numbagg: None
pint: 0.17
setuptools: 58.0.2
pip: 21.2.4
conda: None
pytest: 6.2.5
IPython: 7.27.0
sphinx: None
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5836/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1294978633 I_kwDOAMm_X85NL85J 6754 to_array to create a dimension as last axis raybellwaves 17162724 closed 0     6 2022-07-06T01:28:50Z 2022-07-06T17:12:25Z 2022-07-06T17:12:25Z CONTRIBUTOR      

Is your feature request related to a problem?

I do ds.to_array(dim="variable").transpose("latitude", "longitude", "variable"). I would like to avoid the extra transpose call.

Describe the solution you'd like

  • ds.to_array(dim="variable", new_axis="last") where new_axis is Literal["first", "last"] = "first", or
  • ds.to_array(dim="variable", new_axis=0) where new_axis is Literal[0, -1] = 0

code to change: https://github.com/pydata/xarray/blob/main/xarray/core/dataset.py#L5770

Describe alternatives you've considered

No response

Additional context

I imagine new_axis could be of type int to place the new axis where you would like but the proposal above may be a good first step.

For reference, I'm doing deep learning and want to shape the data as width (latitude), height (longitude), channel (feature).
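The current workaround, as a runnable sketch (synthetic variables standing in for the real data):

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {
        "t2m": (("latitude", "longitude"), np.zeros((2, 3))),
        "u10": (("latitude", "longitude"), np.ones((2, 3))),
    }
)
# to_array inserts the new dimension first, so a transpose is needed
# to get the channels-last layout used by deep learning frameworks
da = ds.to_array(dim="variable").transpose("latitude", "longitude", "variable")
print(da.dims)  # ('latitude', 'longitude', 'variable')
```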

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6754/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
903922477 MDU6SXNzdWU5MDM5MjI0Nzc= 5386 Add xr.open_dataset("file.tif", engine="rasterio") to docs raybellwaves 17162724 closed 0     1 2021-05-27T15:39:29Z 2022-04-09T03:15:45Z 2022-04-09T03:15:45Z CONTRIBUTOR      

Kind of related to https://github.com/pydata/xarray/issues/4697

I see https://corteva.github.io/rioxarray/stable/getting_started/getting_started.html#rioxarray

shows

ds = xarray.open_dataset("file.tif", engine="rasterio")

This could be added to

https://xarray.pydata.org/en/latest/user-guide/io.html#rasterio

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5386/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1015260231 I_kwDOAMm_X848g6RH 5838 update ValueError for open_mfdataset for wild-card matching raybellwaves 17162724 closed 0     0 2021-10-04T14:32:45Z 2021-10-11T02:58:47Z 2021-10-11T02:58:47Z CONTRIBUTOR      

What happened:

Took engine="zarr" out of my open_mfdataset call and got

```
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_24527/4212238570.py in <module>
      3 TqdmCallback(desc="dask tasks").register()
      4
----> 5 ds = xr.open_mfdataset(
      6     "s3://era5-pds/zarr///data/air_temperature_at_2_metres.zarr",
      7     parallel=True,

/opt/userenvs/ray.bell/main/lib/python3.9/site-packages/xarray/backends/api.py in open_mfdataset(paths, chunks, concat_dim, compat, preprocess, engine, data_vars, coords, combine, parallel, join, attrs_file, combine_attrs, **kwargs)
    860             paths = [fs.get_mapper(path) for path in paths]
    861         elif is_remote_uri(paths):
--> 862             raise ValueError(
    863                 "cannot do wild-card matching for paths that are remote URLs: "
    864                 "{!r}. Instead, supply paths as an explicit list of strings.".format(

ValueError: cannot do wild-card matching for paths that are remote URLs: 's3://era5-pds/zarr///data/air_temperature_at_2_metres.zarr'. Instead, supply paths as an explicit list of strings.
```

What you expected to happen:

The error message should suggest that this can be fixed by passing engine="zarr".

Minimal Complete Verifiable Example:

```python
import xarray as xr

ds = xr.open_mfdataset(
    "s3://era5-pds/zarr/2020/1*/data/eastward_wind_at_10_metres.zarr",
    backend_kwargs=dict(storage_options={"anon": True}),
)
```

```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/ray/miniconda3/envs/main/lib/python3.9/site-packages/xarray/backends/api.py", line 862, in open_mfdataset
    raise ValueError(
ValueError: cannot do wild-card matching for paths that are remote URLs: 's3://era5-pds/zarr/2020/1*/data/eastward_wind_at_10_metres.zarr'. Instead, supply paths as an explicit list of strings.
```

```python
ds = xr.open_mfdataset(
    "s3://era5-pds/zarr/2020/1*/data/eastward_wind_at_10_metres.zarr",
    backend_kwargs=dict(storage_options={"anon": True}),
    engine="zarr",
)
```

works

Anything else we need to know?:

message here: https://github.com/pydata/xarray/blob/main/xarray/backends/api.py#L861
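A minimal sketch of the guard involved (the function name is taken from api.py, but the scheme list and structure here are simplified assumptions, not xarray's actual implementation):

```python
# simplified sketch of the open_mfdataset guard; scheme list is assumed
def is_remote_uri(path: str) -> bool:
    return path.startswith(("http://", "https://", "s3://", "gs://"))

path = "s3://era5-pds/zarr/2020/1*/data/eastward_wind_at_10_metres.zarr"
needs_listing = "*" in path and is_remote_uri(path)
# True: the caller must glob the paths (or, per this issue, pass engine="zarr")
print(needs_listing)  # True
```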

Environment:

Output of <tt>xr.show_versions()</tt>
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5838/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
962974408 MDU6SXNzdWU5NjI5NzQ0MDg= 5679 DOC: remove suggestion to install pytest-xdist in docs raybellwaves 17162724 closed 0     1 2021-08-06T18:58:47Z 2021-08-19T22:16:19Z 2021-08-19T22:16:19Z CONTRIBUTOR      

In http://xarray.pydata.org/en/stable/contributing.html#running-the-test-suite

The suggestion is

> Using pytest-xdist, one can speed up local testing on multicore machines. To use this feature, you will need to install pytest-xdist via: `pip install pytest-xdist`. Then, run pytest with the optional `-n` argument: `pytest xarray -n 4`

pytest-xdist is already in the environment (https://github.com/pydata/xarray/blob/main/ci/requirements/environment.yml#L39), so there is no need to install it separately.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5679/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
942198905 MDU6SXNzdWU5NDIxOTg5MDU= 5596 permission error on writing zarr to s3 raybellwaves 17162724 closed 0     1 2021-07-12T15:48:07Z 2021-07-12T16:01:47Z 2021-07-12T16:01:46Z CONTRIBUTOR      

Note: for upstream issue see https://github.com/dask/dask/issues/7887

What happened:

ds.to_zarr("s3://BUCKET/file.zarr") gives a PermissionError (full traceback below).

What you expected to happen:

Finds AWS credentials and writes zarr object to AWS S3 cloud storage.

Minimal Complete Verifiable Example:

```python
import xarray as xr

ds = xr.tutorial.open_dataset("air_temperature").isel(time=0)
ds.to_zarr("s3://BUCKET/file.zarr")
```

```
ClientError Traceback (most recent call last) ~/opt/miniconda3/envs/main/lib/python3.9/site-packages/s3fs/core.py in _call_s3(self, method, akwarglist, kwargs) 245 try: --> 246 out = await method(*additional_kwargs) 247 return out

~/opt/miniconda3/envs/main/lib/python3.9/site-packages/aiobotocore/client.py in _make_api_call(self, operation_name, api_params) 153 error_class = self.exceptions.from_code(error_code) --> 154 raise error_class(parsed_response, operation_name) 155 else:

ClientError: An error occurred (AccessDenied) when calling the PutObject operation: Access Denied

The above exception was the direct cause of the following exception:

PermissionError Traceback (most recent call last) ~/opt/miniconda3/envs/main/lib/python3.9/site-packages/zarr/storage.py in setitem(self, key, value) 1110 self.fs.rm(path, recursive=True) -> 1111 self.map[key] = value 1112 self.fs.invalidate_cache(self.fs._parent(path))

~/opt/miniconda3/envs/main/lib/python3.9/site-packages/fsspec/mapping.py in setitem(self, key, value) 151 self.fs.mkdirs(self.fs._parent(key), exist_ok=True) --> 152 self.fs.pipe_file(key, maybe_convert(value)) 153

~/opt/miniconda3/envs/main/lib/python3.9/site-packages/fsspec/asyn.py in wrapper(args, kwargs) 86 self = obj or args[0] ---> 87 return sync(self.loop, func, args, **kwargs) 88

~/opt/miniconda3/envs/main/lib/python3.9/site-packages/fsspec/asyn.py in sync(loop, func, timeout, args, *kwargs) 67 if isinstance(result[0], BaseException): ---> 68 raise result[0] 69 return result[0]

~/opt/miniconda3/envs/main/lib/python3.9/site-packages/fsspec/asyn.py in _runner(event, coro, result, timeout) 23 try: ---> 24 result[0] = await coro 25 except Exception as ex:

~/opt/miniconda3/envs/main/lib/python3.9/site-packages/s3fs/core.py in _pipe_file(self, path, data, chunksize, kwargs) 865 if size < min(5 * 2 ** 30, 2 * chunksize): --> 866 return await self._call_s3( 867 "put_object", Bucket=bucket, Key=key, Body=data, kwargs

~/opt/miniconda3/envs/main/lib/python3.9/site-packages/s3fs/core.py in _call_s3(self, method, akwarglist, *kwargs) 264 err = e --> 265 raise translate_boto_error(err) 266

PermissionError: Access Denied

The above exception was the direct cause of the following exception:

KeyError Traceback (most recent call last) /var/folders/rf/26llfhwd68x7cftb1z3h000w0000gp/T/ipykernel_10507/3272073269.py in <module> 1 ds = xr.tutorial.open_dataset("air_temperature").isel(time=0) ----> 2 ds.to_zarr("s3://BUCKET/file.zarr")

~/opt/miniconda3/envs/main/lib/python3.9/site-packages/xarray/core/dataset.py in to_zarr(self, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks) 1920 encoding = {} 1921 -> 1922 return to_zarr( 1923 self, 1924 store=store,

~/opt/miniconda3/envs/main/lib/python3.9/site-packages/xarray/backends/api.py in to_zarr(dataset, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks) 1432 ) 1433 -> 1434 zstore = backends.ZarrStore.open_group( 1435 store=store, 1436 mode=mode,

~/opt/miniconda3/envs/main/lib/python3.9/site-packages/xarray/backends/zarr.py in open_group(cls, store, mode, synchronizer, group, consolidated, consolidate_on_close, chunk_store, storage_options, append_dim, write_region, safe_chunks) 336 zarr_group = zarr.open_consolidated(store, open_kwargs) 337 else: --> 338 zarr_group = zarr.open_group(store, open_kwargs) 339 return cls( 340 zarr_group, consolidate_on_close, append_dim, write_region, safe_chunks

~/opt/miniconda3/envs/main/lib/python3.9/site-packages/zarr/hierarchy.py in open_group(store, mode, cache_attrs, synchronizer, path, chunk_store, storage_options) 1183 raise ContainsGroupError(path) 1184 else: -> 1185 init_group(store, path=path, chunk_store=chunk_store) 1186 1187 # determine read only status

~/opt/miniconda3/envs/main/lib/python3.9/site-packages/zarr/storage.py in init_group(store, overwrite, path, chunk_store) 486 487 # initialise metadata --> 488 _init_group_metadata(store=store, overwrite=overwrite, path=path, 489 chunk_store=chunk_store) 490

~/opt/miniconda3/envs/main/lib/python3.9/site-packages/zarr/storage.py in _init_group_metadata(store, overwrite, path, chunk_store) 513 meta = dict() # type: ignore 514 key = _path_to_prefix(path) + group_meta_key --> 515 store[key] = encode_group_metadata(meta) 516 517

~/opt/miniconda3/envs/main/lib/python3.9/site-packages/zarr/storage.py in setitem(self, key, value) 1112 self.fs.invalidate_cache(self.fs._parent(path)) 1113 except self.exceptions as e: -> 1114 raise KeyError(key) from e 1115 1116 def delitem(self, key):

KeyError: '.zgroup'
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5596/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
830507003 MDU6SXNzdWU4MzA1MDcwMDM= 5028 Saving zarr to remote location lower cases all data_vars raybellwaves 17162724 closed 0     5 2021-03-12T21:40:37Z 2021-06-17T12:53:22Z 2021-06-15T21:37:15Z CONTRIBUTOR      

What happened:

I saved a zarr store to a remote location (s3) and read it again and realized the name of the data variables (DataArrays) had reverted to lower case

What you expected to happen:

This does not happen when you save the zarr store locally

Minimal Complete Verifiable Example:

```python
import xarray as xr

ds = xr.tutorial.open_dataset("air_temperature")
ds = ds.rename({"air": "AIR"})

# Save to local
ds.to_zarr("ds.zarr")

# Save to remote
ds.to_zarr("s3://BUCKET/ds.zarr")

# Read local
xr.open_dataset("ds.zarr", engine="zarr").data_vars
# Data variables:
#     AIR      (time, lat, lon) float32 ...

# Read remote
xr.open_dataset("s3://BUCKET/ds.zarr", engine="zarr").data_vars
# Data variables:
#     air      (time, lat, lon) float32 ...
```

Anything else we need to know?:

Environment:

Output of <tt>xr.show_versions()</tt>

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.8 | packaged by conda-forge | (default, Feb 20 2021, 16:22:27) [GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-1009-aws
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4
xarray: 0.17.0
pandas: 1.2.3
numpy: 1.20.1
scipy: 1.5.3
netCDF4: 1.5.6
pydap: installed
h5netcdf: 0.10.0
h5py: 3.1.0
Nio: None
zarr: 2.6.1
cftime: 1.4.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.2.1
cfgrib: 0.9.8.5
iris: None
bottleneck: 1.3.2
dask: 2021.03.0
distributed: 2021.03.0
matplotlib: 3.3.4
cartopy: 0.18.0
seaborn: 0.11.1
numbagg: None
pint: 0.16.1
setuptools: 49.6.0.post20210108
pip: 21.0.1
conda: None
pytest: 6.2.2
IPython: 7.21.0
sphinx: None
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5028/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
858096239 MDU6SXNzdWU4NTgwOTYyMzk= 5159 DOC: link to_zarr docs to io section on zarr raybellwaves 17162724 closed 0     0 2021-04-14T17:12:27Z 2021-04-16T15:30:12Z 2021-04-16T15:30:12Z CONTRIBUTOR      

The to_zarr docs (http://xarray.pydata.org/en/latest/generated/xarray.Dataset.to_zarr.html) could add a "See Also" section pointing to http://xarray.pydata.org/en/latest/user-guide/io.html?highlight=zarr#zarr

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5159/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
822250814 MDU6SXNzdWU4MjIyNTA4MTQ= 4993 add xr.set_options to docs raybellwaves 17162724 closed 0     3 2021-03-04T15:41:20Z 2021-03-07T16:45:46Z 2021-03-07T16:45:46Z CONTRIBUTOR      

https://github.com/pydata/xarray/issues/4992#issuecomment-790685921

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4993/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
822202759 MDU6SXNzdWU4MjIyMDI3NTk= 4992 Feature request: xr.set_option('display.max_data_vars', N) raybellwaves 17162724 closed 0     4 2021-03-04T14:48:29Z 2021-03-04T15:14:50Z 2021-03-04T15:14:50Z CONTRIBUTOR      

idea discussed here: https://github.com/pydata/xarray/discussions/4991

Copying here

the data_vars property provides a convenient way to see the variables, and the formatting (https://github.com/pydata/xarray/blob/master/xarray/core/dataset.py#L484) is nice. However, there are times I would like to see the full variable names, not just a subset (12 is the default).

Ideally there would be a setting to specify how many entries the formatter shows, akin to https://pandas.pydata.org/pandas-docs/stable/user_guide/options.html#overview

xr.set_option('display.max_data_vars', N)
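The pandas option machinery referenced above, as a quick illustration (using the existing display.max_rows option, since the proposed display.max_data_vars does not exist):

```python
import pandas as pd

# pandas lets you cap how many rows a repr shows; the request is for an
# analogous xarray option capping the number of data_vars displayed
with pd.option_context("display.max_rows", 5):
    n = pd.get_option("display.max_rows")
print(n)  # 5
```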

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4992/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
748229907 MDU6SXNzdWU3NDgyMjk5MDc= 4598 Calling pd.to_datetime on cftime variable raybellwaves 17162724 closed 0     4 2020-11-22T12:14:27Z 2021-02-16T02:42:35Z 2021-02-16T02:42:35Z CONTRIBUTOR      

It would be nice to be able to convert cftime variables to pandas datetime to utilize the functionality there.

I understand this is an upstream issue as pandas probably isn't aware of cftime. However, I'm curious whether a method could be added to cftime, such as .to_dataframe().

I've found pd.to_datetime(np.datetime64(date_cf)) is the best way to do this currently.

```python
import xarray as xr
import numpy as np
import pandas as pd

date_str = '2020-01-01'
date_np = np.datetime64(date_str)

date_np
# numpy.datetime64('2020-01-01')
date_pd = pd.to_datetime(date_np)
date_pd
# Timestamp('2020-01-01 00:00:00')

date_cf = xr.cftime_range(start=date_str, periods=1)[0]
pd.to_datetime(date_cf)
```

```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ray/local/bin/anaconda3/envs/a/lib/python3.8/site-packages/pandas/core/tools/datetimes.py", line 830, in to_datetime
    result = convert_listlike(np.array([arg]), format)[0]
  File "/home/ray/local/bin/anaconda3/envs/a/lib/python3.8/site-packages/pandas/core/tools/datetimes.py", line 459, in _convert_listlike_datetimes
    result, tz_parsed = objects_to_datetime64ns(
  File "/home/ray/local/bin/anaconda3/envs/a/lib/python3.8/site-packages/pandas/core/arrays/datetimes.py", line 2044, in objects_to_datetime64ns
    result, tz_parsed = tslib.array_to_datetime(
  File "pandas/_libs/tslib.pyx", line 352, in pandas._libs.tslib.array_to_datetime
  File "pandas/_libs/tslib.pyx", line 579, in pandas._libs.tslib.array_to_datetime
  File "pandas/_libs/tslib.pyx", line 718, in pandas._libs.tslib.array_to_datetime_object
  File "pandas/_libs/tslib.pyx", line 552, in pandas._libs.tslib.array_to_datetime
TypeError: <class 'cftime._cftime.DatetimeGregorian'> is not convertible to datetime
```
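The numpy/pandas leg of the workaround in isolation (no cftime needed to see the round-trip):

```python
import numpy as np
import pandas as pd

# np.datetime64 acts as the bridge: pandas understands it natively,
# so converting a cftime date via np.datetime64 first succeeds
date_np = np.datetime64("2020-01-01")
ts = pd.to_datetime(date_np)
print(repr(ts))  # Timestamp('2020-01-01 00:00:00')
```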

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4598/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
807614965 MDU6SXNzdWU4MDc2MTQ5NjU= 4901 document example of preprocess with open_mfdataset raybellwaves 17162724 closed 0     1 2021-02-12T23:37:04Z 2021-02-13T03:21:08Z 2021-02-13T03:21:08Z CONTRIBUTOR      

@jhamman's SO answer circa 2018 helped me this week https://stackoverflow.com/a/51714004/6046019

I wonder if it's worth (not sure where) providing an example of how to use preprocess with open_mfdataset?

Add an Examples entry to the doc string? (http://xarray.pydata.org/en/latest/generated/xarray.open_mfdataset.html / https://github.com/pydata/xarray/blob/5296ed18272a856d478fbbb3d3253205508d1c2d/xarray/backends/api.py#L895)

While not a small example (as the remote files are large) this is how I used it:

```python
import xarray as xr
import s3fs

def preprocess(ds):
    return ds.expand_dims('time')

fs = s3fs.S3FileSystem(anon=True)
f1 = fs.open('s3://fmi-opendata-rcrhirlam-surface-grib/2021/02/03/00/numerical-hirlam74-forecast-MaximumWind-20210203T000000Z.grb2')
f2 = fs.open('s3://fmi-opendata-rcrhirlam-surface-grib/2021/02/03/06/numerical-hirlam74-forecast-MaximumWind-20210203T060000Z.grb2')

ds = xr.open_mfdataset([f1, f2], engine="cfgrib", preprocess=preprocess, parallel=True)
```

with one file looking like:

```
xr.open_dataset("LOCAL_numerical-hirlam74-forecast-MaximumWind-20210203T000000Z.grb2", engine="cfgrib")
<xarray.Dataset>
Dimensions:            (latitude: 947, longitude: 5294, step: 55)
Coordinates:
    time               datetime64[ns] ...
  * step               (step) timedelta64[ns] 01:00:00 ... 2 days 07:00:00
    heightAboveGround  int64 ...
  * latitude           (latitude) float64 25.65 25.72 25.78 ... 89.86 89.93 90.0
  * longitude          (longitude) float64 -180.0 -179.9 -179.9 ... 179.9 180.0
    valid_time         (step) datetime64[ns] ...
Data variables:
    fg10               (step, latitude, longitude) float32 ...
Attributes:
    GRIB_edition:            2
    GRIB_centre:             ecmf
    GRIB_centreDescription:  European Centre for Medium-Range Weather Forecasts
    GRIB_subCentre:          0
    Conventions:             CF-1.7
    institution:             European Centre for Medium-Range Weather Forecasts
    history:                 2021-02-12T18:06:52 GRIB to CDM+CF via cfgrib-0....
```

A smaller example could be (WIP; note I was hoping ds would concat along t but it doesn't do what I expect):

```python
import numpy as np
import xarray as xr

f1 = xr.DataArray(np.arange(2), coords=[np.arange(2)], dims=["a"], name="f1")
f1 = f1.assign_coords(t=0)
f1.to_dataset().to_zarr("f1.zarr")
# What's the best way to store small files to open again with mf_dataset?
# csv via xarray objects? can you use open_mfdataset on pkl objects?

f2 = xr.DataArray(np.arange(2), coords=[np.arange(2)], dims=["a"], name="f2")
f2 = f2.assign_coords(t=1)
f2.to_dataset().to_zarr("f2.zarr")

# Concat along t
def preprocess(ds):
    return ds.expand_dims('t')

ds = xr.open_mfdataset(["f1.zarr", "f2.zarr"], engine="zarr", concat_dim="t", preprocess=preprocess)
```

```
ds
<xarray.Dataset>
Dimensions:  (a: 2, t: 1)
Coordinates:
  * t        (t) int64 0
  * a        (a) int64 0 1
Data variables:
    f1       (t, a) int64 dask.array<chunksize=(1, 2), meta=np.ndarray>
    f2       (t, a) int64 dask.array<chunksize=(1, 2), meta=np.ndarray>
```
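The core of what preprocess does here can be run without any remote files or zarr (a synthetic stand-in for the datasets above):

```python
import xarray as xr

def preprocess(ds):
    # promote the scalar coordinate "t" to a real dimension of size 1,
    # so that open_mfdataset can concatenate the inputs along it
    return ds.expand_dims("t")

f1 = xr.DataArray([0, 1], dims=["a"], name="f1").assign_coords(t=0).to_dataset()
out = preprocess(f1)
print(out.sizes["t"], out.sizes["a"])  # 1 2
```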

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4901/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
753037814 MDU6SXNzdWU3NTMwMzc4MTQ= 4620 Link concat info to concat doc string raybellwaves 17162724 closed 0     1 2020-11-29T22:59:08Z 2020-12-19T23:20:22Z 2020-12-19T23:20:22Z CONTRIBUTOR      

Could link http://xarray.pydata.org/en/stable/combining.html#concatenate or add the example to the concat doc string: http://xarray.pydata.org/en/stable/generated/xarray.concat.html

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4620/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
697283733 MDU6SXNzdWU2OTcyODM3MzM= 4413 rtd DataArray.to_netcdf raybellwaves 17162724 closed 0     1 2020-09-10T00:59:03Z 2020-09-17T12:59:09Z 2020-09-17T12:59:09Z CONTRIBUTOR      

http://xarray.pydata.org/en/stable/generated/xarray.DataArray.to_netcdf.html#xarray.DataArray.to_netcdf

I believe there is a way to hyperlink to xarray.Dataset.to_netcdf, and also a way for sphinx to render the Note section, e.g. by adding .. note:: in the docstring.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4413/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
638080883 MDU6SXNzdWU2MzgwODA4ODM= 4151 doc: reading via cfgrib raybellwaves 17162724 closed 0     2 2020-06-13T02:37:37Z 2020-06-17T16:52:30Z 2020-06-17T16:52:30Z CONTRIBUTOR      

In the docs http://xarray.pydata.org/en/stable/io.html#grib-format-via-cfgrib

Curious if eccodes is needed when reading a grib using the cfgrib backend?

Does installing cfgrib via conda also install the binary dependencies? https://github.com/ecmwf/cfgrib#installation

cc. @alexamici

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4151/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
588118015 MDU6SXNzdWU1ODgxMTgwMTU= 3895 xarray.Dataset.from_dataframe link to pandas.DataFrame.to_xarray raybellwaves 17162724 closed 0     3 2020-03-26T02:57:24Z 2020-04-23T07:58:09Z 2020-04-23T07:58:09Z CONTRIBUTOR      

Is it worth referencing https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_xarray.html in http://xarray.pydata.org/en/stable/generated/xarray.Dataset.from_dataframe.html?

Is the pandas method preferred?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3895/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
561043452 MDU6SXNzdWU1NjEwNDM0NTI= 3756 pynio package not found raybellwaves 17162724 closed 0     1 2020-02-06T14:20:45Z 2020-03-09T14:07:03Z 2020-03-09T14:07:03Z CONTRIBUTOR      

Copied the install instructions from the docs (http://xarray.pydata.org/en/stable/installing.html#instructions) on my Windows machine:

conda install -c conda-forge xarray cartopy pynio pseudonetcdf

Got a PackagesNotFoundError: pynio

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3756/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
558742452 MDU6SXNzdWU1NTg3NDI0NTI= 3742 GRIB Data Example notebook - dataset not found raybellwaves 17162724 closed 0     3 2020-02-02T19:15:55Z 2020-02-03T17:05:07Z 2020-02-03T17:05:07Z CONTRIBUTOR      

Just testing the docs on binder (good job). Noticed the ERA5-Grib-example.ipynb (https://github.com/pydata/xarray/blob/master/doc/examples/ERA5-GRIB-example.ipynb) was not working

```

HTTPError                                 Traceback (most recent call last)
<ipython-input-2-783584127f97> in <module>
----> 1 ds = xr.tutorial.load_dataset('era5-2mt-2019-03-uk.grib', engine='cfgrib')

/srv/conda/envs/notebook/lib/python3.8/site-packages/xarray/tutorial.py in load_dataset(args, kwargs) 107 open_dataset 108 """ --> 109 with open_dataset(args, **kwargs) as ds: 110 return ds.load() 111

/srv/conda/envs/notebook/lib/python3.8/site-packages/xarray/tutorial.py in open_dataset(name, cache, cache_dir, github_url, branch, **kws) 75 76 url = "/".join((github_url, "raw", branch, fullname)) ---> 77 urlretrieve(url, localfile) 78 url = "/".join((github_url, "raw", branch, md5name)) 79 urlretrieve(url, md5file)

/srv/conda/envs/notebook/lib/python3.8/urllib/request.py in urlretrieve(url, filename, reporthook, data) 245 url_type, path = _splittype(url) 246 --> 247 with contextlib.closing(urlopen(url, data)) as fp: 248 headers = fp.info() 249

/srv/conda/envs/notebook/lib/python3.8/urllib/request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context) 220 else: 221 opener = _opener --> 222 return opener.open(url, data, timeout) 223 224 def install_opener(opener):

/srv/conda/envs/notebook/lib/python3.8/urllib/request.py in open(self, fullurl, data, timeout) 529 for processor in self.process_response.get(protocol, []): 530 meth = getattr(processor, meth_name) --> 531 response = meth(req, response) 532 533 return response

/srv/conda/envs/notebook/lib/python3.8/urllib/request.py in http_response(self, request, response) 638 # request was successfully received, understood, and accepted. 639 if not (200 <= code < 300): --> 640 response = self.parent.error( 641 'http', request, response, code, msg, hdrs) 642

/srv/conda/envs/notebook/lib/python3.8/urllib/request.py in error(self, proto, args) 567 if http_err: 568 args = (dict, 'default', 'http_error_default') + orig_args --> 569 return self._call_chain(args) 570 571 # XXX probably also want an abstract factory that knows when it makes

/srv/conda/envs/notebook/lib/python3.8/urllib/request.py in _call_chain(self, chain, kind, meth_name, args) 500 for handler in handlers: 501 func = getattr(handler, meth_name) --> 502 result = func(args) 503 if result is not None: 504 return result

/srv/conda/envs/notebook/lib/python3.8/urllib/request.py in http_error_default(self, req, fp, code, msg, hdrs) 647 class HTTPDefaultErrorHandler(BaseHandler): 648 def http_error_default(self, req, fp, code, msg, hdrs): --> 649 raise HTTPError(req.full_url, code, msg, hdrs, fp) 650 651 class HTTPRedirectHandler(BaseHandler):

HTTPError: HTTP Error 404: Not Found ```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3742/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
223610210 MDU6SXNzdWUyMjM2MTAyMTA= 1381 xr.where not picking up nan from another NetCDF file raybellwaves 17162724 closed 0     5 2017-04-23T03:45:24Z 2019-04-25T15:23:43Z 2019-04-25T15:23:43Z CONTRIBUTOR      

I posted this question here: http://stackoverflow.com/questions/43485347/python-xarray-copy-nan-from-one-dataarray-to-another

I have uploaded the files and code to https://github.com/raybellwaves/xarray_issue.git

When I originally discovered the issue I was using python version 3.6.0 (default, Jan 28 2017, 13:49:14) [GCC Intel(R) C++ gcc 4.4 mode]

I tested the code with version 2.7.13 | packaged by conda-forge | (default, Mar 20 2017, 14:26:36) [GCC 4.2.1 Compatible Apple LLVM 6.1.0 (clang-602.0.53)] and I did not get the error message which I posted in the question; the code ran, but the variable ws10_masked is empty.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1381/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
321917084 MDU6SXNzdWUzMjE5MTcwODQ= 2113 Rolling mean of dask array conflicting sizes for data and coordinate in rolling operation raybellwaves 17162724 closed 0     4 2018-05-10T12:40:19Z 2018-05-12T06:15:55Z 2018-05-12T06:15:55Z CONTRIBUTOR      

Code Sample, a copy-pastable example if possible

```python
import xarray as xr

remote_data = xr.open_dataarray('http://iridl.ldeo.columbia.edu/SOURCES/.Models'
                                '/.SubX/.RSMAS/.CCSM4/.hindcast/.zg/dods',
                                chunks={'L': 1, 'S': 1})
da = remote_data.isel(P=0, L=0, M=0, X=0, Y=0)
da_day_clim = da.groupby('S.dayofyear').mean('S')
da_day_clim2 = da_day_clim.chunk({'dayofyear': 366})
da_day_clim_smooth = da_day_clim2.rolling(dayofyear=31, center=True).mean()
```

Problem description

Initially discussed on SO: https://stackoverflow.com/questions/50265586/xarray-rolling-mean-of-dask-array-conflicting-sizes-for-data-and-coordinate-in

The rolling operation gives `ValueError: conflicting sizes for dimension 'dayofyear': length 351 on the data but length 366 on coordinate 'dayofyear'`. The length of 351 on the data is produced by the rolling operation itself.

Here's the full traceback:
```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-57-6acf382cdd3d> in <module>()
      4 da_day_clim = da.groupby('S.dayofyear').mean('S')
      5 da_day_clim2 = da_day_clim.chunk({'dayofyear': 366})
----> 6 da_day_clim_smooth = da_day_clim2.rolling(dayofyear=31, center=True).mean()

~/anaconda/envs/SubXNAO/lib/python3.6/site-packages/xarray/core/rolling.py in wrapped_func(self, **kwargs)
    307         if self.center:
    308             values = values[valid]
--> 309         result = DataArray(values, self.obj.coords)
    310
    311         return result

~/anaconda/envs/SubXNAO/lib/python3.6/site-packages/xarray/core/dataarray.py in __init__(self, data, coords, dims, name, attrs, encoding, fastpath)
    224
    225         data = as_compatible_data(data)
--> 226         coords, dims = _infer_coords_and_dims(data.shape, coords, dims)
    227         variable = Variable(dims, data, attrs, encoding, fastpath=True)
    228

~/anaconda/envs/SubXNAO/lib/python3.6/site-packages/xarray/core/dataarray.py in _infer_coords_and_dims(shape, coords, dims)
     79                 raise ValueError('conflicting sizes for dimension %r: '
     80                                  'length %s on the data but length %s on '
---> 81                                  'coordinate %r' % (d, sizes[d], s, k))
     82
     83         if k in sizes and v.shape != (sizes[k],):

ValueError: conflicting sizes for dimension 'dayofyear': length 351 on the data but length 366 on coordinate 'dayofyear'
```

Expected Output

The rolling operation would work on the dask array as it would on the in-memory DataArray, e.g.

```python
import pandas as pd
import xarray as xr
import numpy as np

dates = pd.date_range('1/1/1980', '31/12/2000', freq='D')
data = np.linspace(1, len(dates), num=len(dates), dtype=np.float)
da = xr.DataArray(data, coords=[dates], dims='time')
da_day_clim = da.groupby('time.dayofyear').mean('time')
da_day_clim_smooth = da_day_clim.rolling(dayofyear=31, center=True).mean()
```
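For reference, the in-memory path can be run end-to-end on synthetic data; `rolling(..., center=True).mean()` keeps the full `dayofyear` length and returns NaN at the edges where the 31-day window is incomplete. This is a sketch of the expected behaviour on a numpy-backed array, not the dask-backed reproducer:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic daily series covering a leap year, so dayofyear reaches 366
dates = pd.date_range("2000-01-01", "2002-12-31", freq="D")
da = xr.DataArray(np.arange(len(dates), dtype=float), coords=[dates], dims="time")

clim = da.groupby("time.dayofyear").mean("time")

# rolling preserves the dimension length; with center=True the edges
# (positions with fewer than 31 neighbours) come back as NaN
smooth = clim.rolling(dayofyear=31, center=True).mean()
```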

Output of xr.show_versions()

/Users/Ray/anaconda/envs/SubXNAO/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`. from ._conv import register_converters as _register_converters INSTALLED VERSIONS ------------------ commit: None python: 3.6.5.final.0 python-bits: 64 OS: Darwin OS-release: 17.5.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 xarray: 0.10.3 pandas: 0.22.0 numpy: 1.14.2 scipy: 1.0.1 netCDF4: 1.3.1 h5netcdf: 0.5.1 h5py: 2.7.1 Nio: None zarr: None bottleneck: 1.2.1 cyordereddict: None dask: 0.17.4 distributed: 1.21.8 matplotlib: 2.2.2 cartopy: 0.16.0 seaborn: None setuptools: 39.1.0 pip: 9.0.3 conda: None pytest: None IPython: 6.3.1 sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2113/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
320007162 MDU6SXNzdWUzMjAwMDcxNjI= 2102 resample DeprecationWarning only on 1-D arrays? raybellwaves 17162724 closed 0     1 2018-05-03T17:13:55Z 2018-05-08T17:36:22Z 2018-05-08T17:36:22Z CONTRIBUTOR      

Code Sample, a copy-pastable example if possible

```python
>>> da = xr.DataArray(np.array([1, 2, 3, 4], dtype=np.float).reshape(2, 2),
...                   coords=[pd.date_range('1/1/2000', '1/2/2000', freq='D'),
...                           np.linspace(0, 1, num=2)],
...                   dims=['time', 'latitude'])

>>> da.resample(freq='M', dim='time', how='mean')
/Users/Ray/anaconda/envs/rot-eof-dev-env/bin/ipython:1: DeprecationWarning:
.resample() has been modified to defer calculations. Instead of passing 'dim' and
'how="mean"', instead consider using .resample(time="M").mean()
  #!/Users/Ray/anaconda/envs/rot-eof-dev-env/bin/python
Out[66]:
<xarray.DataArray (time: 1, latitude: 2)>
array([[2., 3.]])
Coordinates:
  * time      (time) datetime64[ns] 2000-01-31
  * latitude  (latitude) float64 0.0 1.0

>>> da.resample(time="M").mean()
<xarray.DataArray (time: 1)>
array([2.5])
Coordinates:
  * time      (time) datetime64[ns] 2000-01-31
```

Problem description

The example suggested in the DeprecationWarning only reproduces the old behaviour for 1-D arrays, since `.resample(time="M").mean()` is not told which dimension to average along.

A quick fix could be to show that example only when the DataArray/Dataset is 1-D.

A more thorough fix could be to make `.resample(time="M").mean()` behave like `.resample(freq='M', dim='time', how='mean')`?
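For context, a sketch of the deferred API on current xarray releases, which is roughly the behaviour this issue asked for: the reduction is applied only along the resampled dimension, so the other dimensions survive. The `"MS"` (month start) alias is an assumption here, chosen to avoid the `"M"` alias deprecated by newer pandas:

```python
import numpy as np
import pandas as pd
import xarray as xr

da = xr.DataArray(
    np.array([1.0, 2.0, 3.0, 4.0]).reshape(2, 2),
    coords=[pd.date_range("2000-01-01", "2000-01-02", freq="D"),
            np.linspace(0, 1, num=2)],
    dims=["time", "latitude"],
)

# On current releases, .mean() after .resample() reduces only along
# "time", so "latitude" is preserved in the result
monthly = da.resample(time="MS").mean()
```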

Expected Output

Same as da.resample(freq='M', dim='time', how='mean')

Output of xr.show_versions()

xr.show_versions() # Not sure about the h5py FutureWarning? /Users/Ray/anaconda/envs/rot-eof-dev-env/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`. from ._conv import register_converters as _register_converters INSTALLED VERSIONS ------------------ commit: None python: 3.6.5.final.0 python-bits: 64 OS: Darwin OS-release: 17.5.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 xarray: 0.10.3 pandas: 0.22.0 numpy: 1.14.2 scipy: 1.0.1 netCDF4: 1.3.1 h5netcdf: 0.5.1 h5py: 2.7.1 Nio: None zarr: None bottleneck: 1.2.1 cyordereddict: None dask: 0.17.2 distributed: 1.21.6 matplotlib: 2.2.2 cartopy: 0.16.0 seaborn: None setuptools: 39.0.1 pip: 9.0.3 conda: None pytest: None IPython: 6.3.1 sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2102/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
278743801 MDU6SXNzdWUyNzg3NDM4MDE= 1757 open_dataarray docs still contains *args, **kwargs raybellwaves 17162724 closed 0     4 2017-12-03T04:36:48Z 2018-01-19T05:13:51Z 2018-01-19T05:13:51Z CONTRIBUTOR      

I noticed the `open_dataset` docs provide the full input parameters, whereas the `open_dataarray` docs still list `*args, **kwargs`. If you point me to where this is generated, I can make the change if it is preferred.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1757/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue


```sql
CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
```
Powered by Datasette · Queries took 25.701ms · About: xarray-datasette