id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
144630996,MDU6SXNzdWUxNDQ2MzA5OTY=,810,correct DJF mean,10194086,closed,0,,,4,2016-03-30T15:36:42Z,2022-04-06T16:19:47Z,2016-05-04T12:56:30Z,MEMBER,,,,"This started as a question and I add it as a reference. Maybe you have a comment. There are several ways to calculate time series of seasonal data (starting from monthly or daily data):

```
# load libraries
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

# Create Example Dataset
time = pd.date_range('2000.01.01', '2010.12.31', freq='M')
data = np.random.rand(*time.shape)
ds = xr.DataArray(data, coords=dict(time=time))

# (1) using resample
ds_res = ds.resample('Q-FEB', 'time')
ds_res = ds_res.sel(time=ds_res['time.month'] == 2)
ds_res = ds_res.groupby('time.year').mean('time')

# (2) this is wrong
ds_season = ds.where(ds['time.season'] == 'DJF').groupby('time.year').mean('time')

# (3) using where and rolling
# mask other months with nan
ds_DJF = ds.where(ds['time.season'] == 'DJF')
# rolling mean -> only Jan is not nan
# however, we lose Jan/Feb in the first year and Dec in the last
ds_DJF = ds_DJF.rolling(min_periods=3, center=True, time=3).mean()
# make annual mean
ds_DJF = ds_DJF.groupby('time.year').mean('time')

ds_res.plot(marker='*')
ds_season.plot()
ds_DJF.plot()
plt.show()
```

(1) The first is to use `resample` with 'Q-FEB' as argument. This works fine. It does include Jan/Feb in the first year, and Dec in the last year + 1. Whether this makes sense can be debated. One case where this does not work is when you have, say, two regions in your data set: for one you want to calculate DJF, and for the other NovDecJan.

(2) Using 'time.season' is wrong, as it combines Jan, Feb and Dec from the same year.
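A possible correct alternative for (2), not part of the original snippet, is to build a custom season-year coordinate (here called `season_year`, a made-up name) that counts December towards the following year's DJF mean:

```python
import numpy as np
import pandas as pd
import xarray as xr

# example data mirroring the snippet above (month-start frequency)
time = pd.date_range('2000-01-01', '2010-12-31', freq='MS')
ds = xr.DataArray(np.random.rand(time.size), coords=dict(time=time))

# December belongs to the *following* year's DJF season
season_year = (ds['time.year'] + (ds['time.month'] == 12)).rename('season_year')

# mask non-DJF months, then group by the shifted year
ds_DJF = ds.where(ds['time.season'] == 'DJF').groupby(season_year).mean('time')
```

This avoids the wrong grouping in (2); the first and last groups still contain incomplete seasons and may need to be dropped.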
(3) The third uses `where` and `rolling` and you lose 'incomplete' seasons. If you replace `ds.where(ds['time.season'] == 'DJF')` with `ds.groupby('time.month').where(summer_months)`, where `summer_months` is a boolean array, it also works for non-standard 'summers' (or seasons) across the globe. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/810/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
800118528,MDU6SXNzdWU4MDAxMTg1Mjg=,4858,doctest failure with numpy 1.20,10194086,closed,0,,,4,2021-02-03T08:57:43Z,2021-02-07T21:57:34Z,2021-02-07T21:57:34Z,MEMBER,,,,"**What happened**:

Our doctests fail since numpy 1.20 came out: https://github.com/pydata/xarray/pull/4760/checks?check_run_id=1818512841#step:8:69

**What you expected to happen**:

They don't ;-)

**Minimal Complete Verifiable Example**:

The following fails with numpy 1.20, while it converted `np.NaN` to an integer before ([xarray.DataArray.pad](http://xarray.pydata.org/en/v0.16.2/generated/xarray.DataArray.pad.html#xarray.DataArray.pad) at the bottom):

```python
import numpy as np

x = np.arange(10)
x = np.pad(x, 1, ""constant"", constant_values=np.nan)
```

requires numpy 1.20

**Anything else we need to know?**:

- that's probably related to https://numpy.org/doc/stable/release/1.20.0-notes.html#numpy-scalars-are-cast-when-assigned-to-arrays
- I asked if this behavior will stay: https://github.com/numpy/numpy/issues/16499#issuecomment-772342087
- One possibility is to add a check `np.can_cast(constant_values.dtype, array.dtype)` (or similar) for a better error message.
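The suggested `np.can_cast` check could look roughly like this; `pad_with_check` is a hypothetical helper for illustration, not numpy or xarray API:

```python
import numpy as np

def pad_with_check(array, pad_width, constant_values):
    # reject fill values that cannot be cast to the array dtype,
    # raising a clearer error than the generic numpy casting failure
    values = np.asarray(constant_values)
    if not np.can_cast(values.dtype, array.dtype, casting='same_kind'):
        raise TypeError(
            f'cannot cast fill value of dtype {values.dtype} '
            f'to array of dtype {array.dtype}'
        )
    return np.pad(array, pad_width, 'constant', constant_values=constant_values)
```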
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4858/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
688115687,MDU6SXNzdWU2ODgxMTU2ODc=,4385,warnings from internal use of apply_ufunc,10194086,closed,0,,,4,2020-08-28T14:28:56Z,2020-08-30T16:37:52Z,2020-08-30T16:37:52Z,MEMBER,,,,"Another follow-up from #4060: `quantile` now emits a `FutureWarning`:

**Minimal Complete Verifiable Example**:

```python
xr.DataArray([1, 2, 3]).quantile(q=0.5)
```

```
~/.conda/envs/ipcc_ar6/lib/python3.7/site-packages/xarray/core/variable.py:1866: FutureWarning: ``output_sizes`` should be given in the ``dask_gufunc_kwargs`` parameter. It will be removed as direct parameter in a future version.
  kwargs={""q"": q, ""axis"": axis, ""interpolation"": interpolation},
```

We should probably check the warnings in the test suite - there may be others. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4385/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
559864146,MDU6SXNzdWU1NTk4NjQxNDY=,3750,isort pre-commit hook does not skip text files,10194086,closed,0,,,4,2020-02-04T17:18:31Z,2020-05-06T01:50:29Z,2020-03-28T20:58:15Z,MEMBER,,,,"#### MCVE Code Sample

Add an arbitrary change to the file `doc/pandas.rst`:

```bash
git add doc/pandas.rst
git commit -m ""test""
```

The pre-commit hook will fail.

#### Expected Output

The pre-commit hook passes.

#### Problem Description

Running `isort -rc doc/*` will change the following files:

```
modified:   contributing.rst
modified:   howdoi.rst
modified:   internals.rst
modified:   io.rst
modified:   pandas.rst
modified:   quick-overview.rst
```

Unfortunately, it does not behave properly and deletes/changes arbitrary lines. Can the pre-commit hook be told to only run on *.py files?
On the command line this would be `isort -rc *.py`. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3750/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
545764524,MDU6SXNzdWU1NDU3NjQ1MjQ=,3665,Cannot roundtrip time in NETCDF4_CLASSIC,10194086,closed,0,,,4,2020-01-06T14:47:48Z,2020-01-16T18:27:15Z,2020-01-16T18:27:14Z,MEMBER,,,,"#### MCVE Code Sample

```python
import numpy as np
import xarray as xr

time = xr.cftime_range(""2006-01-01"", periods=2, calendar=""360_day"")
da = xr.DataArray(time, dims=[""time""])
da.encoding[""dtype""] = np.float64

da.to_netcdf(""tst.nc"", format=""NETCDF4_CLASSIC"")

ds = xr.open_dataset(""tst.nc"")
ds.to_netcdf(""tst2.nc"", format=""NETCDF4_CLASSIC"")
```

yields:

```python
ValueError: could not safely cast array from dtype int64 to int32
```

Or an example without `to_netcdf`:

```python
import numpy as np
import xarray as xr

time = xr.cftime_range(""2006-01-01"", periods=2, calendar=""360_day"")
da = xr.DataArray(time, dims=[""time""])
da.encoding[""_FillValue""] = np.array([np.nan])

xr.backends.netcdf3.encode_nc3_variable(xr.conventions.encode_cf_variable(da))
```

#### Expected Output

xarray can save the dataset / an `xr.Variable`.

#### Problem Description

If there is a time variable that can be encoded using integers only, but that has a `_FillValue` set to `NaN`, saving with `to_netcdf(name, format=""NETCDF4_CLASSIC"")` fails. The problem is that xarray adds an (unnecessary) `_FillValue` when saving the file.
Note: if the time cannot be encoded using integers only, it works:

```python
da = xr.DataArray(time, dims=[""time""])
da.encoding[""_FillValue""] = np.array([np.nan])
da.encoding[""units""] = ""days since 2006-01-01T12:00:00""

xr.backends.netcdf3.encode_nc3_variable(xr.conventions.encode_cf_variable(da))
```

Another note: when saving with NETCDF4

```python
da = xr.DataArray(time, dims=[""time""])
da.encoding[""_FillValue""] = np.array([np.nan])

xr.backends.netCDF4_._encode_nc4_variable(xr.conventions.encode_cf_variable(da))
```

the following is returned:

```
array([0, 1])
Attributes:
    units:       days since 2006-01-01 00:00:00.000000
    calendar:    proleptic_gregorian
    _FillValue:  [-9223372036854775808]
```

#### Output of ``xr.show_versions()``
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.3 | packaged by conda-forge | (default, Jul 1 2019, 21:52:21) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 4.12.14-lp151.28.36-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.7.1

xarray: 0.14.1
pandas: 0.25.2
numpy: 1.17.3
scipy: 1.3.1
netCDF4: 1.5.3
pydap: None
h5netcdf: 0.7.4
h5py: 2.10.0
Nio: None
zarr: None
cftime: 1.0.4.2
nc_time_axis: 1.2.0
PseudoNetCDF: None
rasterio: 1.1.1
cfgrib: None
iris: None
bottleneck: 1.3.1
dask: 2.6.0
distributed: 2.6.0
matplotlib: 3.1.2
cartopy: 0.17.0
seaborn: 0.9.0
numbagg: None
setuptools: 41.4.0
pip: 19.3.1
conda: None
pytest: 5.2.2
IPython: 7.9.0
sphinx: 2.2.1
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3665/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
106595746,MDU6SXNzdWUxMDY1OTU3NDY=,577,wrap lon coordinates to 360,10194086,closed,0,,,4,2015-09-15T16:36:37Z,2019-01-17T09:34:56Z,2019-01-15T20:15:01Z,MEMBER,,,,"Assume I have two datasets with the same lat/lon grid. However, one has `lon = 0...359` and the other `lon = -180...179`. How can I wrap around one of them? ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/577/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
310819233,MDU6SXNzdWUzMTA4MTkyMzM=,2036,better error message for to_netcdf -> unlimited_dims,10194086,closed,0,,,4,2018-04-03T12:39:21Z,2018-05-18T14:48:32Z,2018-05-18T14:48:32Z,MEMBER,,,,"#### Code Sample, a copy-pastable example if possible

```python
import numpy as np
import xarray as xr

x = np.arange(10)
da = xr.Dataset(data_vars=dict(data=('dim1', x)),
                coords=dict(dim1=('dim1', x), dim2=('dim2', x)))

da.to_netcdf('tst.nc', format='NETCDF4_CLASSIC', unlimited_dims='dim1')
```

#### Problem Description

This creates the error `RuntimeError: NetCDF: NC_UNLIMITED size already in use`. With `format='NETCDF4'` it silently creates the dimensions `d`, `i`, `m`, and `1`, because the string is iterated over. The correct syntax is `unlimited_dims=['dim1']`. With `format='NETCDF4_CLASSIC'` and `unlimited_dims=['dim1', 'dim2']`, it still raises the not-so-helpful `NC_UNLIMITED` error. I only tested with netCDF4 as backend.

#### Expected Output

* better error message
* work with `unlimited_dims='dim1'`

#### Output of ``xr.show_versions()``
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.120-45-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8

xarray: 0.10.2
pandas: 0.22.0
numpy: 1.14.2
scipy: 1.0.1
netCDF4: 1.3.1
h5netcdf: 0.5.0
h5py: 2.7.1
Nio: None
zarr: None
bottleneck: 1.2.1
cyordereddict: 1.0.0
dask: 0.17.2
distributed: 1.21.5
matplotlib: 2.2.2
cartopy: 0.16.0
seaborn: 0.8.1
setuptools: 39.0.1
pip: 9.0.3
conda: None
pytest: 3.5.0
IPython: 6.3.0
sphinx: 1.7.2
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2036/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
106581329,MDU6SXNzdWUxMDY1ODEzMjk=,576,define fill value for where,10194086,closed,0,,,4,2015-09-15T15:27:32Z,2017-08-08T17:00:30Z,2017-08-08T17:00:30Z,MEMBER,,,,"It would be nice if `where` accepted an `other` argument:

```
def where(self, cond, other=np.NaN):
    pass
```
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/576/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
67332234,MDU6SXNzdWU2NzMzMjIzNA==,386,"""loosing"" virtual variables",10194086,closed,0,,,4,2015-04-09T10:35:31Z,2015-04-20T03:55:44Z,2015-04-20T03:55:44Z,MEMBER,,,,"Once I take a mean over virtual variables, they are not available any more.

```
import pandas as pd
import numpy as np
import xray

t = pd.date_range('2000-01-01', '2000-12-31', freq='6H')
x = np.random.rand(*t.shape)
time = xray.DataArray(t, name='t', dims='time')
ts = xray.Dataset({'x' : ('time', x), 'time' : time})

ts_mean = ts.groupby('time.date').mean()
ts_mean.virtual_variables
```

Is this intended behaviour? And could I get them back somehow? ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/386/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue