id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
1309966595,PR_kwDOAMm_X847rKxS,6812,Improved CF decoding,145117,open,0,,,7,2022-07-19T19:44:27Z,2023-04-01T15:26:04Z,,CONTRIBUTOR,,0,pydata/xarray/pulls/6812,"
- [X] Closes #2304 - but only for my specific use case.
- [x] Tests added
The comments above this line state, ""so we just use a float64,"" but the function then returns `np.float32`. I assume the comments are correct. Changing this also fixes a bug I ran into.
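A minimal sketch of the precision at stake (the names `packed` and `scale_factor` here are illustrative, not xarray internals): if `scale_factor` is float64 but the decode target dtype is float32, the extra precision is discarded.

```python
import numpy as np

# Illustrative only - not xarray's code path. A float64 scale_factor
# applied to packed integer data produces values that a float32 target
# dtype cannot hold without rounding.
scale_factor = np.float64(1e-7)   # small factor that needs float64
packed = np.int16(12345)
decoded64 = packed * scale_factor      # full-precision decode (float64)
decoded32 = np.float32(decoded64)      # what a float32 target would keep
print(decoded64 - decoded32)           # nonzero: precision was lost
```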
Note that currently, `_choose_float_dtype` returns `float32` if the data is float16 or float32, even if the `scale_factor` dtype is float64.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6812/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
1568970352,PR_kwDOAMm_X85JKJls,7499,Make text match code example,145117,closed,0,,,1,2023-02-03T00:21:02Z,2023-02-03T00:47:14Z,2023-02-03T00:47:14Z,CONTRIBUTOR,,0,pydata/xarray/pulls/7499,"
- [ ] Closes #xxxx
- [ ] Tests added
- [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
- [ ] New functions/methods are listed in `api.rst`
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7499/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
1322645651,PR_kwDOAMm_X848Vs5i,6851,"Fix logic bug - add_offset is in encoding, not attrs.",145117,closed,0,,,3,2022-07-29T20:21:47Z,2022-08-01T18:07:04Z,2022-08-01T18:07:04Z,CONTRIBUTOR,,0,pydata/xarray/pulls/6851,"`_pop_to` performs a pop operation - it removes the key/value pair. So the line above this change removes `add_offset` from `attrs` if it exists. The second line then checks for `add_offset` in `attrs`, which will therefore always be `False`.
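The bug shape, reduced to plain dicts (names illustrative, not xarray's actual variables):

```python
# After a pop-style move, the key is gone from the source dict, so a
# later membership check on that dict can never succeed.
attrs = {'add_offset': 2.0, 'units': 'K'}
encoding = {}

encoding['add_offset'] = attrs.pop('add_offset')  # pop removes the key

print('add_offset' in attrs)  # False - this check is dead code
```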
This is implemented in #6812, but the work there is significantly more involved (see, for example, #2304, a different PR addressing the same problem), so I'm pulling this fix out and submitting it as its own PR.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6851/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
718436141,MDU6SXNzdWU3MTg0MzYxNDE=,4498,Resample is ~100x slower than Pandas resample; Speed is related to resample period (unlike Pandas),145117,closed,0,,,7,2020-10-09T21:37:20Z,2022-05-15T02:38:29Z,2022-05-15T02:38:29Z,CONTRIBUTOR,,,,"**What happened**:
I have a 10 minute frequency time series. When I resample it to hourly, it is slow. When I resample it to daily, it is fast. If I drop down to Pandas and resample there, the speeds are ~100x faster than xarray and the same regardless of the resample period. I've posted this to SO: https://stackoverflow.com/questions/64282393/
**What you expected to happen**:
I expect xarray to be within an order of magnitude speed of Pandas, not > 2 orders of magnitude slower.
**Minimal Complete Verifiable Example**:
```python
import time

import numpy as np
import pandas as pd
import xarray as xr

size = 10000
times = pd.date_range('2000-01-01', periods=size, freq='10Min')
da = xr.DataArray(np.random.random(size), dims=['time'], coords={'time': times}, name='foo')

# Time xarray vs. pandas resample at two frequencies.
for freq in ['1H', '1D']:
    start = time.time()
    da.resample({'time': freq}).mean()
    print(freq, 'xr', time.time() - start)

    start = time.time()
    da.to_dataframe().resample(freq).mean()
    print(freq, 'pd', time.time() - start)
```
Output/timings
```
1H xr 0.1761918067932129
1H pd 0.0021948814392089844

1D xr 0.00958395004272461
1D pd 0.001646280288696289
```
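A possible workaround until this is fixed (a sketch, assuming the time index round-trips cleanly through pandas): resample on the pandas side and convert back.

```python
import numpy as np
import pandas as pd
import xarray as xr

size = 10000
times = pd.date_range('2000-01-01', periods=size, freq='10Min')
da = xr.DataArray(np.random.random(size), dims=['time'], coords={'time': times}, name='foo')

# Resample in pandas, then convert the resulting Series back to xarray.
hourly = da.to_dataframe().resample('1H').mean()['foo'].to_xarray()
```

The result matches `da.resample({'time': '1H'}).mean()` but runs at pandas speed.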
**Anything else we need to know?**:
**Environment**:
Output of xr.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.5 | packaged by conda-forge | (default, Aug 21 2020, 18:21:27)
[GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-48-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4
xarray: 0.16.0
pandas: 1.1.1
numpy: 1.19.1
scipy: 1.5.2
netCDF4: 1.5.4
pydap: None
h5netcdf: 0.8.1
h5py: 2.10.0
Nio: None
zarr: None
cftime: 1.2.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.1.5
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.3.1
cartopy: None
seaborn: None
numbagg: None
pint: 0.15
setuptools: 49.6.0.post20200814
pip: 20.2.2
conda: None
pytest: None
IPython: 7.17.0
sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4498/reactions"", ""total_count"": 4, ""+1"": 4, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
297780998,MDU6SXNzdWUyOTc3ODA5OTg=,1917,Decode times adds micro-second noise to standard calendar,145117,closed,0,,,5,2018-02-16T13:14:15Z,2018-02-26T10:28:17Z,2018-02-26T10:28:17Z,CONTRIBUTOR,,,,"#### Code Sample, a copy-pastable example if possible
I have a simplified NetCDF file with the following header:
```bash
netcdf foo {
dimensions:
time = UNLIMITED ; // (366 currently)
x = 2 ;
y = 2 ;
variables:
float time(time) ;
time:standard_name = ""time"" ;
time:long_name = ""time"" ;
time:units = ""DAYS since 2000-01-01 00:00:00"" ;
time:calendar = ""standard"" ;
time:axis = ""T"" ;
...
}
```
I would expect xarray to be able to decode these times. It does, but appears to do so incorrectly and without reporting any issues. Note the fractional time added to each date.
```python
In [4]: xr.open_dataset('foo.nc').time
Out[4]:
array(['2000-01-01T00:00:00.000000000', '2000-01-02T00:00:00.003211264',
'2000-01-03T00:00:00.006422528', ..., '2000-12-29T00:00:01.962606592',
'2000-12-30T00:00:01.672216576', '2000-12-31T00:00:01.381826560'], dtype='datetime64[ns]')
Coordinates:
* time (time) datetime64[ns] 2000-01-01 2000-01-02T00:00:00.003211264 ...
Attributes:
standard_name: time
long_name: time
axis: T
```
#### Problem description
Days since a valid date on a `standard` calendar should not add microseconds.
I know that xarray has time issues, for example #118 #521 numpy:#6207 #531 #789 and #848. But all of those appear to address non-standard times. This bug (if it is a bug) occurs with a very simple and straightforward calendar, and is silent, so it took me 2 days to figure out what was going on.
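My guess at the mechanism (an assumption on my part, not a confirmed reading of the decoding code): the file stores `time` as float32, and the nanoseconds-per-day factor is not exactly representable in float32. Its rounding error is exactly the per-day offset shown above.

```python
import numpy as np

NS_PER_DAY = 86_400_000_000_000  # nanoseconds in one day

# float64 stores this integer exactly; float32 cannot. The nearest
# float32 value overshoots by 3211264 ns = 0.003211264 s, which is
# precisely the noise on 2000-01-02 in the decoded output above.
err = int(np.float32(NS_PER_DAY)) - NS_PER_DAY
print(err)  # 3211264
```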
#### Output of ``xr.show_versions()``
In [5]: xr.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.5.4.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
xarray: 0.10.0
pandas: 0.22.0
numpy: 1.12.1
scipy: 0.19.1
netCDF4: 1.3.1
h5netcdf: None
Nio: None
bottleneck: 1.2.1
cyordereddict: 1.0.0
dask: 0.16.0
matplotlib: 2.1.1
cartopy: 0.15.1
seaborn: 0.8.1
setuptools: 38.4.0
pip: 9.0.1
conda: None
pytest: None
IPython: 6.2.1
sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1917/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue