
issues


22 rows where state = "closed", type = "issue" and user = 17162724 sorted by updated_at descending

id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1013881639 I_kwDOAMm_X848bpsn 5836 to_zarr returns <xarray.backends.zarr.ZarrStore at ...> raybellwaves 17162724 closed 0     2 2021-10-02T01:53:27Z 2023-03-12T15:59:14Z 2023-03-12T15:59:13Z CONTRIBUTOR      

What happened:

When calling to_zarr it returns <xarray.backends.zarr.ZarrStore at ...>

What you expected to happen:

It should return None, the same behaviour as pandas.DataFrame.to_parquet().

Minimal Complete Verifiable Example:

```python
import xarray as xr

ds = xr.tutorial.open_dataset("air_temperature")
ds.to_zarr("ds.zarr")

ds.to_dataframe().to_parquet("ds.parquet")
```
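For comparison, pandas writers return None when given a path. A minimal sketch of that convention, using to_csv so no parquet engine is required (the csv writer here stands in for to_parquet):

```python
import pandas as pd

# pandas writers return None when writing to a path;
# the request is for Dataset.to_zarr to follow the same convention
df = pd.DataFrame({"air": [1.0, 2.0]})
result = df.to_csv("ds.csv")
print(result)  # None
```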

Anything else we need to know?:

Environment:

Output of <tt>xr.show_versions()</tt>

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.9.7 | packaged by conda-forge | (default, Sep 23 2021, 07:31:23) [Clang 11.1.0 ]
python-bits: 64
OS: Darwin
OS-release: 20.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.10.6
libnetcdf: 4.8.0
xarray: 0.19.0
pandas: 1.3.3
numpy: 1.21.2
scipy: 1.7.1
netCDF4: 1.5.7
pydap: installed
h5netcdf: 0.11.0
h5py: 3.3.0
Nio: None
zarr: 2.10.0
cftime: 1.5.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.2.6
cfgrib: 0.9.9.0
iris: None
bottleneck: 1.3.2
dask: 2021.09.1
distributed: 2021.09.1
matplotlib: 3.4.3
cartopy: 0.20.0
seaborn: 0.11.2
numbagg: None
pint: 0.17
setuptools: 58.0.2
pip: 21.2.4
conda: None
pytest: 6.2.5
IPython: 7.27.0
sphinx: None
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5836/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1294978633 I_kwDOAMm_X85NL85J 6754 to_array to create a dimension as last axis raybellwaves 17162724 closed 0     6 2022-07-06T01:28:50Z 2022-07-06T17:12:25Z 2022-07-06T17:12:25Z CONTRIBUTOR      

Is your feature request related to a problem?

I do ds.to_array(dim="variable").transpose("latitude", "longitude", "variable"). I would like to avoid the extra transpose call.

Describe the solution you'd like

  • ds.to_array(dim="variable", new_axis="last") where new_axis is Literal["first", "last"] = "first", or
  • ds.to_array(dim="variable", new_axis=0) where new_axis is Literal[0, -1] = 0

code to change: https://github.com/pydata/xarray/blob/main/xarray/core/dataset.py#L5770

Describe alternatives you've considered

No response

Additional context

I imagine new_axis could be of type int to place the new axis where you would like but the proposal above may be a good first step.

For reference, I'm doing deep learning and want to shape the data as width (latitude), height (longitude), channel (feature).
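The current workaround, as a runnable sketch (synthetic variables standing in for the real data):

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {
        "t2m": (("latitude", "longitude"), np.zeros((2, 3))),
        "u10": (("latitude", "longitude"), np.ones((2, 3))),
    }
)
# to_array inserts the new dimension first, so a transpose is needed
# to get the channels-last layout used by deep learning frameworks
da = ds.to_array(dim="variable").transpose("latitude", "longitude", "variable")
print(da.dims)  # ('latitude', 'longitude', 'variable')
```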

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6754/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
903922477 MDU6SXNzdWU5MDM5MjI0Nzc= 5386 Add xr.open_dataset("file.tif", engine="rasterio") to docs raybellwaves 17162724 closed 0     1 2021-05-27T15:39:29Z 2022-04-09T03:15:45Z 2022-04-09T03:15:45Z CONTRIBUTOR      

Kind of related to https://github.com/pydata/xarray/issues/4697

I see https://corteva.github.io/rioxarray/stable/getting_started/getting_started.html#rioxarray

shows

ds = xarray.open_dataset("file.tif", engine="rasterio")

This could be added to

https://xarray.pydata.org/en/latest/user-guide/io.html#rasterio

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5386/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1015260231 I_kwDOAMm_X848g6RH 5838 update ValueError for open_mfdataset for wild-card matching raybellwaves 17162724 closed 0     0 2021-10-04T14:32:45Z 2021-10-11T02:58:47Z 2021-10-11T02:58:47Z CONTRIBUTOR      

What happened:

Took engine="zarr" out of my open_mfdataset call and got

```
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_24527/4212238570.py in <module>
      3 TqdmCallback(desc="dask tasks").register()
      4
----> 5 ds = xr.open_mfdataset(
      6     "s3://era5-pds/zarr///data/air_temperature_at_2_metres.zarr",
      7     parallel=True,

/opt/userenvs/ray.bell/main/lib/python3.9/site-packages/xarray/backends/api.py in open_mfdataset(paths, chunks, concat_dim, compat, preprocess, engine, data_vars, coords, combine, parallel, join, attrs_file, combine_attrs, **kwargs)
    860             paths = [fs.get_mapper(path) for path in paths]
    861         elif is_remote_uri(paths):
--> 862             raise ValueError(
    863                 "cannot do wild-card matching for paths that are remote URLs: "
    864                 "{!r}. Instead, supply paths as an explicit list of strings.".format(

ValueError: cannot do wild-card matching for paths that are remote URLs: 's3://era5-pds/zarr///data/air_temperature_at_2_metres.zarr'. Instead, supply paths as an explicit list of strings.
```

What you expected to happen:

The error message should suggest that this can be fixed by passing engine="zarr".

Minimal Complete Verifiable Example:

```python
import xarray as xr

ds = xr.open_mfdataset(
    "s3://era5-pds/zarr/2020/1*/data/eastward_wind_at_10_metres.zarr",
    backend_kwargs=dict(storage_options={"anon": True}),
)
```

```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/ray/miniconda3/envs/main/lib/python3.9/site-packages/xarray/backends/api.py", line 862, in open_mfdataset
    raise ValueError(
ValueError: cannot do wild-card matching for paths that are remote URLs: 's3://era5-pds/zarr/2020/1*/data/eastward_wind_at_10_metres.zarr'. Instead, supply paths as an explicit list of strings.
```

```python
ds = xr.open_mfdataset(
    "s3://era5-pds/zarr/2020/1*/data/eastward_wind_at_10_metres.zarr",
    backend_kwargs=dict(storage_options={"anon": True}),
    engine="zarr",
)
```

works

Anything else we need to know?:

message here: https://github.com/pydata/xarray/blob/main/xarray/backends/api.py#L861
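A minimal sketch of the guard involved (the function name is taken from api.py, but the scheme list and structure here are simplified assumptions, not xarray's actual implementation):

```python
# simplified sketch of the open_mfdataset guard; scheme list is assumed
def is_remote_uri(path: str) -> bool:
    return path.startswith(("http://", "https://", "s3://", "gs://"))

path = "s3://era5-pds/zarr/2020/1*/data/eastward_wind_at_10_metres.zarr"
needs_listing = "*" in path and is_remote_uri(path)
# True: the caller must glob the paths (or, per this issue, pass engine="zarr")
print(needs_listing)  # True
```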

Environment:

Output of <tt>xr.show_versions()</tt>
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5838/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
962974408 MDU6SXNzdWU5NjI5NzQ0MDg= 5679 DOC: remove suggestion to install pytest-xdist in docs raybellwaves 17162724 closed 0     1 2021-08-06T18:58:47Z 2021-08-19T22:16:19Z 2021-08-19T22:16:19Z CONTRIBUTOR      

In http://xarray.pydata.org/en/stable/contributing.html#running-the-test-suite

The suggestion is

> Using pytest-xdist, one can speed up local testing on multicore machines. To use this feature, you will need to install pytest-xdist via: `pip install pytest-xdist`. Then, run pytest with the optional `-n` argument: `pytest xarray -n 4`

pytest-xdist is already in the environment (https://github.com/pydata/xarray/blob/main/ci/requirements/environment.yml#L39), so there is no need to install it separately.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5679/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
942198905 MDU6SXNzdWU5NDIxOTg5MDU= 5596 permission error on writing zarr to s3 raybellwaves 17162724 closed 0     1 2021-07-12T15:48:07Z 2021-07-12T16:01:47Z 2021-07-12T16:01:46Z CONTRIBUTOR      

Note: for upstream issue see https://github.com/dask/dask/issues/7887

What happened:

ds.to_zarr("s3://BUCKET/file.zarr") gives a PermissionError (full traceback below).

What you expected to happen:

Finds AWS credentials and writes zarr object to AWS S3 cloud storage.

Minimal Complete Verifiable Example:

```python
import xarray as xr

ds = xr.tutorial.open_dataset("air_temperature").isel(time=0)
ds.to_zarr("s3://BUCKET/file.zarr")
```

```
ClientError Traceback (most recent call last) ~/opt/miniconda3/envs/main/lib/python3.9/site-packages/s3fs/core.py in _call_s3(self, method, akwarglist, kwargs) 245 try: --> 246 out = await method(*additional_kwargs) 247 return out

~/opt/miniconda3/envs/main/lib/python3.9/site-packages/aiobotocore/client.py in _make_api_call(self, operation_name, api_params) 153 error_class = self.exceptions.from_code(error_code) --> 154 raise error_class(parsed_response, operation_name) 155 else:

ClientError: An error occurred (AccessDenied) when calling the PutObject operation: Access Denied

The above exception was the direct cause of the following exception:

PermissionError Traceback (most recent call last) ~/opt/miniconda3/envs/main/lib/python3.9/site-packages/zarr/storage.py in setitem(self, key, value) 1110 self.fs.rm(path, recursive=True) -> 1111 self.map[key] = value 1112 self.fs.invalidate_cache(self.fs._parent(path))

~/opt/miniconda3/envs/main/lib/python3.9/site-packages/fsspec/mapping.py in setitem(self, key, value) 151 self.fs.mkdirs(self.fs._parent(key), exist_ok=True) --> 152 self.fs.pipe_file(key, maybe_convert(value)) 153

~/opt/miniconda3/envs/main/lib/python3.9/site-packages/fsspec/asyn.py in wrapper(args, kwargs) 86 self = obj or args[0] ---> 87 return sync(self.loop, func, args, **kwargs) 88

~/opt/miniconda3/envs/main/lib/python3.9/site-packages/fsspec/asyn.py in sync(loop, func, timeout, args, *kwargs) 67 if isinstance(result[0], BaseException): ---> 68 raise result[0] 69 return result[0]

~/opt/miniconda3/envs/main/lib/python3.9/site-packages/fsspec/asyn.py in _runner(event, coro, result, timeout) 23 try: ---> 24 result[0] = await coro 25 except Exception as ex:

~/opt/miniconda3/envs/main/lib/python3.9/site-packages/s3fs/core.py in _pipe_file(self, path, data, chunksize, kwargs) 865 if size < min(5 * 2 ** 30, 2 * chunksize): --> 866 return await self._call_s3( 867 "put_object", Bucket=bucket, Key=key, Body=data, kwargs

~/opt/miniconda3/envs/main/lib/python3.9/site-packages/s3fs/core.py in _call_s3(self, method, akwarglist, *kwargs) 264 err = e --> 265 raise translate_boto_error(err) 266

PermissionError: Access Denied

The above exception was the direct cause of the following exception:

KeyError Traceback (most recent call last) /var/folders/rf/26llfhwd68x7cftb1z3h000w0000gp/T/ipykernel_10507/3272073269.py in <module> 1 ds = xr.tutorial.open_dataset("air_temperature").isel(time=0) ----> 2 ds.to_zarr("s3://BUCKET/file.zarr")

~/opt/miniconda3/envs/main/lib/python3.9/site-packages/xarray/core/dataset.py in to_zarr(self, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks) 1920 encoding = {} 1921 -> 1922 return to_zarr( 1923 self, 1924 store=store,

~/opt/miniconda3/envs/main/lib/python3.9/site-packages/xarray/backends/api.py in to_zarr(dataset, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks) 1432 ) 1433 -> 1434 zstore = backends.ZarrStore.open_group( 1435 store=store, 1436 mode=mode,

~/opt/miniconda3/envs/main/lib/python3.9/site-packages/xarray/backends/zarr.py in open_group(cls, store, mode, synchronizer, group, consolidated, consolidate_on_close, chunk_store, storage_options, append_dim, write_region, safe_chunks) 336 zarr_group = zarr.open_consolidated(store, open_kwargs) 337 else: --> 338 zarr_group = zarr.open_group(store, open_kwargs) 339 return cls( 340 zarr_group, consolidate_on_close, append_dim, write_region, safe_chunks

~/opt/miniconda3/envs/main/lib/python3.9/site-packages/zarr/hierarchy.py in open_group(store, mode, cache_attrs, synchronizer, path, chunk_store, storage_options) 1183 raise ContainsGroupError(path) 1184 else: -> 1185 init_group(store, path=path, chunk_store=chunk_store) 1186 1187 # determine read only status

~/opt/miniconda3/envs/main/lib/python3.9/site-packages/zarr/storage.py in init_group(store, overwrite, path, chunk_store) 486 487 # initialise metadata --> 488 _init_group_metadata(store=store, overwrite=overwrite, path=path, 489 chunk_store=chunk_store) 490

~/opt/miniconda3/envs/main/lib/python3.9/site-packages/zarr/storage.py in _init_group_metadata(store, overwrite, path, chunk_store) 513 meta = dict() # type: ignore 514 key = _path_to_prefix(path) + group_meta_key --> 515 store[key] = encode_group_metadata(meta) 516 517

~/opt/miniconda3/envs/main/lib/python3.9/site-packages/zarr/storage.py in setitem(self, key, value) 1112 self.fs.invalidate_cache(self.fs._parent(path)) 1113 except self.exceptions as e: -> 1114 raise KeyError(key) from e 1115 1116 def delitem(self, key):

KeyError: '.zgroup'
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5596/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
830507003 MDU6SXNzdWU4MzA1MDcwMDM= 5028 Saving zarr to remote location lower cases all data_vars raybellwaves 17162724 closed 0     5 2021-03-12T21:40:37Z 2021-06-17T12:53:22Z 2021-06-15T21:37:15Z CONTRIBUTOR      

What happened:

I saved a zarr store to a remote location (s3) and read it again and realized the name of the data variables (DataArrays) had reverted to lower case

What you expected to happen:

This does not happen when you save the zarr store locally

Minimal Complete Verifiable Example:

```python
import xarray as xr

ds = xr.tutorial.open_dataset("air_temperature")
ds = ds.rename({"air": "AIR"})

# Save to local
ds.to_zarr("ds.zarr")

# Save to remote
ds.to_zarr("s3://BUCKET/ds.zarr")

# Read local
xr.open_dataset("ds.zarr", engine="zarr").data_vars
# Data variables:
#     AIR      (time, lat, lon) float32 ...

# Read remote
xr.open_dataset("s3://BUCKET/ds.zarr", engine="zarr").data_vars
# Data variables:
#     air      (time, lat, lon) float32 ...
```

Anything else we need to know?:

Environment:

Output of <tt>xr.show_versions()</tt>

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.8 | packaged by conda-forge | (default, Feb 20 2021, 16:22:27) [GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-1009-aws
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4
xarray: 0.17.0
pandas: 1.2.3
numpy: 1.20.1
scipy: 1.5.3
netCDF4: 1.5.6
pydap: installed
h5netcdf: 0.10.0
h5py: 3.1.0
Nio: None
zarr: 2.6.1
cftime: 1.4.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.2.1
cfgrib: 0.9.8.5
iris: None
bottleneck: 1.3.2
dask: 2021.03.0
distributed: 2021.03.0
matplotlib: 3.3.4
cartopy: 0.18.0
seaborn: 0.11.1
numbagg: None
pint: 0.16.1
setuptools: 49.6.0.post20210108
pip: 21.0.1
conda: None
pytest: 6.2.2
IPython: 7.21.0
sphinx: None
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5028/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
858096239 MDU6SXNzdWU4NTgwOTYyMzk= 5159 DOC: link to_zarr docs to io section on zarr raybellwaves 17162724 closed 0     0 2021-04-14T17:12:27Z 2021-04-16T15:30:12Z 2021-04-16T15:30:12Z CONTRIBUTOR      

The to_zarr docs (http://xarray.pydata.org/en/latest/generated/xarray.Dataset.to_zarr.html) could add a "See Also" section pointing to http://xarray.pydata.org/en/latest/user-guide/io.html?highlight=zarr#zarr

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5159/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
822250814 MDU6SXNzdWU4MjIyNTA4MTQ= 4993 add xr.set_options to docs raybellwaves 17162724 closed 0     3 2021-03-04T15:41:20Z 2021-03-07T16:45:46Z 2021-03-07T16:45:46Z CONTRIBUTOR      

https://github.com/pydata/xarray/issues/4992#issuecomment-790685921

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4993/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
822202759 MDU6SXNzdWU4MjIyMDI3NTk= 4992 Feature request: xr.set_option('display.max_data_vars', N) raybellwaves 17162724 closed 0     4 2021-03-04T14:48:29Z 2021-03-04T15:14:50Z 2021-03-04T15:14:50Z CONTRIBUTOR      

idea discussed here: https://github.com/pydata/xarray/discussions/4991

Copying here

the data_vars property provides a convenient way to see the variables, and the formatting (https://github.com/pydata/xarray/blob/master/xarray/core/dataset.py#L484) is nice. However, there are times I would like to see the full variable names, not just a subset (12 is the default).

Ideally there would be a setting to specify how many entries the formatter shows, akin to https://pandas.pydata.org/pandas-docs/stable/user_guide/options.html#overview

xr.set_option('display.max_data_vars', N)
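The pandas option machinery referenced above, as a quick illustration (using the existing display.max_rows option, since the proposed display.max_data_vars does not exist):

```python
import pandas as pd

# pandas lets you cap how many rows a repr shows; the request is for an
# analogous xarray option capping the number of data_vars displayed
with pd.option_context("display.max_rows", 5):
    n = pd.get_option("display.max_rows")
print(n)  # 5
```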

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4992/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
748229907 MDU6SXNzdWU3NDgyMjk5MDc= 4598 Calling pd.to_datetime on cftime variable raybellwaves 17162724 closed 0     4 2020-11-22T12:14:27Z 2021-02-16T02:42:35Z 2021-02-16T02:42:35Z CONTRIBUTOR      

It would be nice to be able to convert cftime variables to pandas datetime to utilize the functionality there.

I understand this is an upstream issue as pandas probably isn't aware of cftime. However, I'm curious whether a method could be added to cftime, such as .to_dataframe().

I've found pd.to_datetime(np.datetime64(date_cf)) is the best way to do this currently.

```python
import xarray as xr
import numpy as np
import pandas as pd

date_str = '2020-01-01'
date_np = np.datetime64(date_str)

date_np
# numpy.datetime64('2020-01-01')
date_pd = pd.to_datetime(date_np)
date_pd
# Timestamp('2020-01-01 00:00:00')

date_cf = xr.cftime_range(start=date_str, periods=1)[0]
pd.to_datetime(date_cf)
```

```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ray/local/bin/anaconda3/envs/a/lib/python3.8/site-packages/pandas/core/tools/datetimes.py", line 830, in to_datetime
    result = convert_listlike(np.array([arg]), format)[0]
  File "/home/ray/local/bin/anaconda3/envs/a/lib/python3.8/site-packages/pandas/core/tools/datetimes.py", line 459, in _convert_listlike_datetimes
    result, tz_parsed = objects_to_datetime64ns(
  File "/home/ray/local/bin/anaconda3/envs/a/lib/python3.8/site-packages/pandas/core/arrays/datetimes.py", line 2044, in objects_to_datetime64ns
    result, tz_parsed = tslib.array_to_datetime(
  File "pandas/_libs/tslib.pyx", line 352, in pandas._libs.tslib.array_to_datetime
  File "pandas/_libs/tslib.pyx", line 579, in pandas._libs.tslib.array_to_datetime
  File "pandas/_libs/tslib.pyx", line 718, in pandas._libs.tslib.array_to_datetime_object
  File "pandas/_libs/tslib.pyx", line 552, in pandas._libs.tslib.array_to_datetime
TypeError: <class 'cftime._cftime.DatetimeGregorian'> is not convertible to datetime
```
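The numpy/pandas leg of the workaround in isolation (no cftime needed to see the round-trip):

```python
import numpy as np
import pandas as pd

# np.datetime64 acts as the bridge: pandas understands it natively,
# so converting a cftime date via np.datetime64 first succeeds
date_np = np.datetime64("2020-01-01")
ts = pd.to_datetime(date_np)
print(repr(ts))  # Timestamp('2020-01-01 00:00:00')
```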

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4598/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
807614965 MDU6SXNzdWU4MDc2MTQ5NjU= 4901 document example of preprocess with open_mfdataset raybellwaves 17162724 closed 0     1 2021-02-12T23:37:04Z 2021-02-13T03:21:08Z 2021-02-13T03:21:08Z CONTRIBUTOR      

@jhamman's SO answer circa 2018 helped me this week https://stackoverflow.com/a/51714004/6046019

I wonder if it's worth (not sure where) providing an example of how to use preprocess with open_mfdataset?

Add an Examples entry to the doc string? (http://xarray.pydata.org/en/latest/generated/xarray.open_mfdataset.html / https://github.com/pydata/xarray/blob/5296ed18272a856d478fbbb3d3253205508d1c2d/xarray/backends/api.py#L895)

While not a small example (as the remote files are large) this is how I used it:

```python
import xarray as xr
import s3fs

def preprocess(ds):
    return ds.expand_dims('time')

fs = s3fs.S3FileSystem(anon=True)
f1 = fs.open('s3://fmi-opendata-rcrhirlam-surface-grib/2021/02/03/00/numerical-hirlam74-forecast-MaximumWind-20210203T000000Z.grb2')
f2 = fs.open('s3://fmi-opendata-rcrhirlam-surface-grib/2021/02/03/06/numerical-hirlam74-forecast-MaximumWind-20210203T060000Z.grb2')

ds = xr.open_mfdataset([f1, f2], engine="cfgrib", preprocess=preprocess, parallel=True)
```

with one file looking like:

```
xr.open_dataset("LOCAL_numerical-hirlam74-forecast-MaximumWind-20210203T000000Z.grb2", engine="cfgrib")
<xarray.Dataset>
Dimensions:            (latitude: 947, longitude: 5294, step: 55)
Coordinates:
    time               datetime64[ns] ...
  * step               (step) timedelta64[ns] 01:00:00 ... 2 days 07:00:00
    heightAboveGround  int64 ...
  * latitude           (latitude) float64 25.65 25.72 25.78 ... 89.86 89.93 90.0
  * longitude          (longitude) float64 -180.0 -179.9 -179.9 ... 179.9 180.0
    valid_time         (step) datetime64[ns] ...
Data variables:
    fg10               (step, latitude, longitude) float32 ...
Attributes:
    GRIB_edition:            2
    GRIB_centre:             ecmf
    GRIB_centreDescription:  European Centre for Medium-Range Weather Forecasts
    GRIB_subCentre:          0
    Conventions:             CF-1.7
    institution:             European Centre for Medium-Range Weather Forecasts
    history:                 2021-02-12T18:06:52 GRIB to CDM+CF via cfgrib-0....
```

A smaller example could be (WIP; note I was hoping ds would concat along t but it doesn't do what I expect):

```python
import numpy as np
import xarray as xr

f1 = xr.DataArray(np.arange(2), coords=[np.arange(2)], dims=["a"], name="f1")
f1 = f1.assign_coords(t=0)
f1.to_dataset().to_zarr("f1.zarr")
# What's the best way to store small files to open again with mf_dataset?
# csv via xarray objects? can you use open_mfdataset on pkl objects?

f2 = xr.DataArray(np.arange(2), coords=[np.arange(2)], dims=["a"], name="f2")
f2 = f2.assign_coords(t=1)
f2.to_dataset().to_zarr("f2.zarr")

# Concat along t
def preprocess(ds):
    return ds.expand_dims('t')

ds = xr.open_mfdataset(["f1.zarr", "f2.zarr"], engine="zarr", concat_dim="t", preprocess=preprocess)
```

```
ds
<xarray.Dataset>
Dimensions:  (a: 2, t: 1)
Coordinates:
  * t        (t) int64 0
  * a        (a) int64 0 1
Data variables:
    f1       (t, a) int64 dask.array<chunksize=(1, 2), meta=np.ndarray>
    f2       (t, a) int64 dask.array<chunksize=(1, 2), meta=np.ndarray>
```
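The core of what preprocess does here can be run without any remote files or zarr (a synthetic stand-in for the datasets above):

```python
import xarray as xr

def preprocess(ds):
    # promote the scalar coordinate "t" to a real dimension of size 1,
    # so that open_mfdataset can concatenate the inputs along it
    return ds.expand_dims("t")

f1 = xr.DataArray([0, 1], dims=["a"], name="f1").assign_coords(t=0).to_dataset()
out = preprocess(f1)
print(out.sizes["t"], out.sizes["a"])  # 1 2
```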

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4901/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
753037814 MDU6SXNzdWU3NTMwMzc4MTQ= 4620 Link concat info to concat doc string raybellwaves 17162724 closed 0     1 2020-11-29T22:59:08Z 2020-12-19T23:20:22Z 2020-12-19T23:20:22Z CONTRIBUTOR      

Could link http://xarray.pydata.org/en/stable/combining.html#concatenate or add the example to the concat doc string: http://xarray.pydata.org/en/stable/generated/xarray.concat.html

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4620/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
697283733 MDU6SXNzdWU2OTcyODM3MzM= 4413 rtd DataArray.to_netcdf raybellwaves 17162724 closed 0     1 2020-09-10T00:59:03Z 2020-09-17T12:59:09Z 2020-09-17T12:59:09Z CONTRIBUTOR      

http://xarray.pydata.org/en/stable/generated/xarray.DataArray.to_netcdf.html#xarray.DataArray.to_netcdf

I believe there is a way to hyperlink to xarray.Dataset.to_netcdf, and also a way for sphinx to render the Note section, e.g. by adding .. note:: in the docstring.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4413/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
638080883 MDU6SXNzdWU2MzgwODA4ODM= 4151 doc: reading via cfgrib raybellwaves 17162724 closed 0     2 2020-06-13T02:37:37Z 2020-06-17T16:52:30Z 2020-06-17T16:52:30Z CONTRIBUTOR      

In the docs http://xarray.pydata.org/en/stable/io.html#grib-format-via-cfgrib

Curious if eccodes is needed when reading a grib using the cfgrib backend?

Does installing cfgrib via conda also install the binary dependencies? https://github.com/ecmwf/cfgrib#installation

cc. @alexamici

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4151/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
588118015 MDU6SXNzdWU1ODgxMTgwMTU= 3895 xarray.Dataset.from_dataframe link to pandas.DataFrame.to_xarray raybellwaves 17162724 closed 0     3 2020-03-26T02:57:24Z 2020-04-23T07:58:09Z 2020-04-23T07:58:09Z CONTRIBUTOR      

Is it worth referencing https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_xarray.html in http://xarray.pydata.org/en/stable/generated/xarray.Dataset.from_dataframe.html?

Is the pandas method preferred?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3895/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
561043452 MDU6SXNzdWU1NjEwNDM0NTI= 3756 pynio package not found raybellwaves 17162724 closed 0     1 2020-02-06T14:20:45Z 2020-03-09T14:07:03Z 2020-03-09T14:07:03Z CONTRIBUTOR      

Copied the install instructions from the docs (http://xarray.pydata.org/en/stable/installing.html#instructions) on my Windows machine:

conda install -c conda-forge xarray cartopy pynio pseudonetcdf

Got a PackagesNotFoundError: pynio

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3756/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
558742452 MDU6SXNzdWU1NTg3NDI0NTI= 3742 GRIB Data Example notebook - dataset not found raybellwaves 17162724 closed 0     3 2020-02-02T19:15:55Z 2020-02-03T17:05:07Z 2020-02-03T17:05:07Z CONTRIBUTOR      

Just testing the docs on binder (good job). Noticed the ERA5-Grib-example.ipynb (https://github.com/pydata/xarray/blob/master/doc/examples/ERA5-GRIB-example.ipynb) was not working

```

HTTPError                                 Traceback (most recent call last)
<ipython-input-2-783584127f97> in <module>
----> 1 ds = xr.tutorial.load_dataset('era5-2mt-2019-03-uk.grib', engine='cfgrib')

/srv/conda/envs/notebook/lib/python3.8/site-packages/xarray/tutorial.py in load_dataset(args, kwargs) 107 open_dataset 108 """ --> 109 with open_dataset(args, **kwargs) as ds: 110 return ds.load() 111

/srv/conda/envs/notebook/lib/python3.8/site-packages/xarray/tutorial.py in open_dataset(name, cache, cache_dir, github_url, branch, **kws) 75 76 url = "/".join((github_url, "raw", branch, fullname)) ---> 77 urlretrieve(url, localfile) 78 url = "/".join((github_url, "raw", branch, md5name)) 79 urlretrieve(url, md5file)

/srv/conda/envs/notebook/lib/python3.8/urllib/request.py in urlretrieve(url, filename, reporthook, data) 245 url_type, path = _splittype(url) 246 --> 247 with contextlib.closing(urlopen(url, data)) as fp: 248 headers = fp.info() 249

/srv/conda/envs/notebook/lib/python3.8/urllib/request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context) 220 else: 221 opener = _opener --> 222 return opener.open(url, data, timeout) 223 224 def install_opener(opener):

/srv/conda/envs/notebook/lib/python3.8/urllib/request.py in open(self, fullurl, data, timeout) 529 for processor in self.process_response.get(protocol, []): 530 meth = getattr(processor, meth_name) --> 531 response = meth(req, response) 532 533 return response

/srv/conda/envs/notebook/lib/python3.8/urllib/request.py in http_response(self, request, response) 638 # request was successfully received, understood, and accepted. 639 if not (200 <= code < 300): --> 640 response = self.parent.error( 641 'http', request, response, code, msg, hdrs) 642

/srv/conda/envs/notebook/lib/python3.8/urllib/request.py in error(self, proto, args) 567 if http_err: 568 args = (dict, 'default', 'http_error_default') + orig_args --> 569 return self._call_chain(args) 570 571 # XXX probably also want an abstract factory that knows when it makes

/srv/conda/envs/notebook/lib/python3.8/urllib/request.py in _call_chain(self, chain, kind, meth_name, args) 500 for handler in handlers: 501 func = getattr(handler, meth_name) --> 502 result = func(args) 503 if result is not None: 504 return result

/srv/conda/envs/notebook/lib/python3.8/urllib/request.py in http_error_default(self, req, fp, code, msg, hdrs) 647 class HTTPDefaultErrorHandler(BaseHandler): 648 def http_error_default(self, req, fp, code, msg, hdrs): --> 649 raise HTTPError(req.full_url, code, msg, hdrs, fp) 650 651 class HTTPRedirectHandler(BaseHandler):

HTTPError: HTTP Error 404: Not Found ```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3742/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
223610210 MDU6SXNzdWUyMjM2MTAyMTA= 1381 xr.where not picking up nan from another NetCDF file raybellwaves 17162724 closed 0     5 2017-04-23T03:45:24Z 2019-04-25T15:23:43Z 2019-04-25T15:23:43Z CONTRIBUTOR      

I posted this question here: http://stackoverflow.com/questions/43485347/python-xarray-copy-nan-from-one-dataarray-to-another

I have uploaded the files and code to https://github.com/raybellwaves/xarray_issue.git

When I originally discovered the issue I was using python version 3.6.0 (default, Jan 28 2017, 13:49:14) [GCC Intel(R) C++ gcc 4.4 mode]

I tested the code with version 2.7.13 | packaged by conda-forge | (default, Mar 20 2017, 14:26:36) [GCC 4.2.1 Compatible Apple LLVM 6.1.0 (clang-602.0.53)] and I did not get the error message which I posted in the question; the code ran, but the variable ws10_masked is empty.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1381/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
321917084 MDU6SXNzdWUzMjE5MTcwODQ= 2113 Rolling mean of dask array conflicting sizes for data and coordinate in rolling operation raybellwaves 17162724 closed 0     4 2018-05-10T12:40:19Z 2018-05-12T06:15:55Z 2018-05-12T06:15:55Z CONTRIBUTOR      

Code Sample, a copy-pastable example if possible

```python
import xarray as xr

remote_data = xr.open_dataarray('http://iridl.ldeo.columbia.edu/SOURCES/.Models'
                                '/.SubX/.RSMAS/.CCSM4/.hindcast/.zg/dods',
                                chunks={'L': 1, 'S': 1})
da = remote_data.isel(P=0, L=0, M=0, X=0, Y=0)
da_day_clim = da.groupby('S.dayofyear').mean('S')
da_day_clim2 = da_day_clim.chunk({'dayofyear': 366})
da_day_clim_smooth = da_day_clim2.rolling(dayofyear=31, center=True).mean()
```

Problem description

Initially discussed on SO: https://stackoverflow.com/questions/50265586/xarray-rolling-mean-of-dask-array-conflicting-sizes-for-data-and-coordinate-in

The rolling operation gives `ValueError: conflicting sizes for dimension 'dayofyear': length 351 on the data but length 366 on coordinate 'dayofyear'`. The length of 351 on the data is produced by the rolling operation itself.

Here's the full traceback:
```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-57-6acf382cdd3d> in <module>()
      4 da_day_clim = da.groupby('S.dayofyear').mean('S')
      5 da_day_clim2 = da_day_clim.chunk({'dayofyear': 366})
----> 6 da_day_clim_smooth = da_day_clim2.rolling(dayofyear=31, center=True).mean()

~/anaconda/envs/SubXNAO/lib/python3.6/site-packages/xarray/core/rolling.py in wrapped_func(self, **kwargs)
    307         if self.center:
    308             values = values[valid]
--> 309         result = DataArray(values, self.obj.coords)
    310
    311         return result

~/anaconda/envs/SubXNAO/lib/python3.6/site-packages/xarray/core/dataarray.py in __init__(self, data, coords, dims, name, attrs, encoding, fastpath)
    224
    225         data = as_compatible_data(data)
--> 226         coords, dims = _infer_coords_and_dims(data.shape, coords, dims)
    227         variable = Variable(dims, data, attrs, encoding, fastpath=True)
    228

~/anaconda/envs/SubXNAO/lib/python3.6/site-packages/xarray/core/dataarray.py in _infer_coords_and_dims(shape, coords, dims)
     79                 raise ValueError('conflicting sizes for dimension %r: '
     80                                  'length %s on the data but length %s on '
---> 81                                  'coordinate %r' % (d, sizes[d], s, k))
     82
     83         if k in sizes and v.shape != (sizes[k],):

ValueError: conflicting sizes for dimension 'dayofyear': length 351 on the data but length 366 on coordinate 'dayofyear'
```

Expected Output

The rolling operation would work on the dask array as it would on the in-memory DataArray, e.g.

```python
import pandas as pd
import xarray as xr
import numpy as np

dates = pd.date_range('1/1/1980', '31/12/2000', freq='D')
data = np.linspace(1, len(dates), num=len(dates), dtype=np.float)
da = xr.DataArray(data, coords=[dates], dims='time')
da_day_clim = da.groupby('time.dayofyear').mean('time')
da_day_clim_smooth = da_day_clim.rolling(dayofyear=31, center=True).mean()
```
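For reference, the in-memory path can be run end-to-end on synthetic data; `rolling(..., center=True).mean()` keeps the full `dayofyear` length and returns NaN at the edges where the 31-day window is incomplete. This is a sketch of the expected behaviour on a numpy-backed array, not the dask-backed reproducer:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic daily series covering a leap year, so dayofyear reaches 366
dates = pd.date_range("2000-01-01", "2002-12-31", freq="D")
da = xr.DataArray(np.arange(len(dates), dtype=float), coords=[dates], dims="time")

clim = da.groupby("time.dayofyear").mean("time")

# rolling preserves the dimension length; with center=True the edges
# (positions with fewer than 31 neighbours) come back as NaN
smooth = clim.rolling(dayofyear=31, center=True).mean()
```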

Output of xr.show_versions()

/Users/Ray/anaconda/envs/SubXNAO/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`. from ._conv import register_converters as _register_converters INSTALLED VERSIONS ------------------ commit: None python: 3.6.5.final.0 python-bits: 64 OS: Darwin OS-release: 17.5.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 xarray: 0.10.3 pandas: 0.22.0 numpy: 1.14.2 scipy: 1.0.1 netCDF4: 1.3.1 h5netcdf: 0.5.1 h5py: 2.7.1 Nio: None zarr: None bottleneck: 1.2.1 cyordereddict: None dask: 0.17.4 distributed: 1.21.8 matplotlib: 2.2.2 cartopy: 0.16.0 seaborn: None setuptools: 39.1.0 pip: 9.0.3 conda: None pytest: None IPython: 6.3.1 sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2113/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
320007162 MDU6SXNzdWUzMjAwMDcxNjI= 2102 resample DeprecationWarning only on 1-D arrays? raybellwaves 17162724 closed 0     1 2018-05-03T17:13:55Z 2018-05-08T17:36:22Z 2018-05-08T17:36:22Z CONTRIBUTOR      

Code Sample, a copy-pastable example if possible

```python
>>> da = xr.DataArray(np.array([1, 2, 3, 4], dtype=np.float).reshape(2, 2),
...                   coords=[pd.date_range('1/1/2000', '1/2/2000', freq='D'),
...                           np.linspace(0, 1, num=2)],
...                   dims=['time', 'latitude'])

>>> da.resample(freq='M', dim='time', how='mean')
/Users/Ray/anaconda/envs/rot-eof-dev-env/bin/ipython:1: DeprecationWarning:
.resample() has been modified to defer calculations. Instead of passing 'dim' and
'how="mean"', instead consider using .resample(time="M").mean()
  #!/Users/Ray/anaconda/envs/rot-eof-dev-env/bin/python
Out[66]:
<xarray.DataArray (time: 1, latitude: 2)>
array([[2., 3.]])
Coordinates:
  * time      (time) datetime64[ns] 2000-01-31
  * latitude  (latitude) float64 0.0 1.0

>>> da.resample(time="M").mean()
<xarray.DataArray (time: 1)>
array([2.5])
Coordinates:
  * time      (time) datetime64[ns] 2000-01-31
```

Problem description

The example suggested in the DeprecationWarning only reproduces the old behaviour for 1-D arrays, since `.resample(time="M").mean()` is not told which dimension to average along.

A quick fix could be to show that example only when the DataArray/Dataset is 1-D.

A more thorough fix could be to make `.resample(time="M").mean()` behave like `.resample(freq='M', dim='time', how='mean')`?
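For context, a sketch of the deferred API on current xarray releases, which is roughly the behaviour this issue asked for: the reduction is applied only along the resampled dimension, so the other dimensions survive. The `"MS"` (month start) alias is an assumption here, chosen to avoid the `"M"` alias deprecated by newer pandas:

```python
import numpy as np
import pandas as pd
import xarray as xr

da = xr.DataArray(
    np.array([1.0, 2.0, 3.0, 4.0]).reshape(2, 2),
    coords=[pd.date_range("2000-01-01", "2000-01-02", freq="D"),
            np.linspace(0, 1, num=2)],
    dims=["time", "latitude"],
)

# On current releases, .mean() after .resample() reduces only along
# "time", so "latitude" is preserved in the result
monthly = da.resample(time="MS").mean()
```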

Expected Output

Same as da.resample(freq='M', dim='time', how='mean')

Output of xr.show_versions()

xr.show_versions() # Not sure about the h5py FutureWarning? /Users/Ray/anaconda/envs/rot-eof-dev-env/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`. from ._conv import register_converters as _register_converters INSTALLED VERSIONS ------------------ commit: None python: 3.6.5.final.0 python-bits: 64 OS: Darwin OS-release: 17.5.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 xarray: 0.10.3 pandas: 0.22.0 numpy: 1.14.2 scipy: 1.0.1 netCDF4: 1.3.1 h5netcdf: 0.5.1 h5py: 2.7.1 Nio: None zarr: None bottleneck: 1.2.1 cyordereddict: None dask: 0.17.2 distributed: 1.21.6 matplotlib: 2.2.2 cartopy: 0.16.0 seaborn: None setuptools: 39.0.1 pip: 9.0.3 conda: None pytest: None IPython: 6.3.1 sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2102/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
278743801 MDU6SXNzdWUyNzg3NDM4MDE= 1757 open_dataarray docs still contains *args, **kwargs raybellwaves 17162724 closed 0     4 2017-12-03T04:36:48Z 2018-01-19T05:13:51Z 2018-01-19T05:13:51Z CONTRIBUTOR      

I noticed the `open_dataset` docs provide the full input parameters, whereas the `open_dataarray` docs still list `*args, **kwargs`. If you point me to where this is generated, I can make the change if it is preferred.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1757/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue


```sql
CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
```
Powered by Datasette · Queries took 25.701ms · About: xarray-datasette