html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/pull/7500#issuecomment-1416861976,https://api.github.com/repos/pydata/xarray/issues/7500,1416861976,IC_kwDOAMm_X85Uc5kY,43613877,2023-02-04T22:16:49Z,2023-02-04T22:16:49Z,CONTRIBUTOR,I'm just mimicking the netCDF4 driver here. Maybe one could use attribute names that are less likely to collide than `source`? Or add a prefix like `xarray_` to those attributes? I'm open to suggestions.,"{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1571143098
https://github.com/pydata/xarray/issues/7115#issuecomment-1265699141,https://api.github.com/repos/pydata/xarray/issues/7115,1265699141,IC_kwDOAMm_X85LcQlF,43613877,2022-10-03T16:13:20Z,2022-10-03T16:13:20Z,CONTRIBUTOR,"I was about to open the same issue and can confirm that I only see the error with `python<3.8` and `importlib_metadata==5.0.0`.
@Illviljan, `importlib-metadata` should not be necessary for Python >= 3.8, as it became part of the standard library (`importlib.metadata`).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1394854820
https://github.com/pydata/xarray/issues/6995#issuecomment-1263824497,https://api.github.com/repos/pydata/xarray/issues/6995,1263824497,IC_kwDOAMm_X85LVG5x,43613877,2022-09-30T17:18:22Z,2022-09-30T17:18:22Z,CONTRIBUTOR,This issue might be a duplicate of #5897 and it continues to exist in version `2022.09.0`.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1362683132
https://github.com/pydata/xarray/pull/6656#issuecomment-1146453489,https://api.github.com/repos/pydata/xarray/issues/6656,1146453489,IC_kwDOAMm_X85EVX3x,43613877,2022-06-03T23:36:54Z,2022-06-03T23:37:34Z,CONTRIBUTOR,"> Our tests on `min-all-deps` are running with pydap 3.2.2 but were passing. Was the test xfailed? If so can we remove it.
Actually, the general unit test should fail, but the specific tests are skipped. Only when the flag `--run-network-tests` is passed to pytest do the tests run, and in the past they would have failed.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1255359858
https://github.com/pydata/xarray/pull/6656#issuecomment-1143316526,https://api.github.com/repos/pydata/xarray/issues/6656,1143316526,IC_kwDOAMm_X85EJaAu,43613877,2022-06-01T08:50:43Z,2022-06-01T08:50:43Z,CONTRIBUTOR,Shall we raise a warning in case `verify` and/or `user_charset` are given and the installed pydap version is older than 3.0.0? Or is it fine to just ignore those arguments in this case without warning the user?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1255359858
https://github.com/pydata/xarray/issues/6069#issuecomment-1033814820,https://api.github.com/repos/pydata/xarray/issues/6069,1033814820,IC_kwDOAMm_X849nsMk,43613877,2022-02-09T14:23:54Z,2022-02-09T14:36:48Z,CONTRIBUTOR,"You are right, the coordinates should not be dropped.
I think the function [_validate_region](https://github.com/pydata/xarray/blob/39860f9bd3ed4e84a5d694adda10c82513ed519f/xarray/backends/api.py#L1244) has a bug. Currently it checks, for all `ds.variables`, whether at least one of their dimensions agrees with the ones given in the `region` argument. However, `ds.variables` also returns the [coordinates](https://xarray.pydata.org/en/stable/generated/xarray.Dataset.variables.html), while we actually only want to check whether the `ds.data_vars` have a dimension intersecting with the given `region`.
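To illustrate the difference (a minimal sketch with a made-up toy dataset):
```python
import numpy as np
import xarray as xr

# toy dataset: one data variable on dim 'x' plus a coordinate on dim 'y'
ds = xr.Dataset(
    {'data': (('x',), np.arange(3))},
    coords={'extra': (('y',), np.arange(2))},
)
print(set(ds.variables))  # {'data', 'extra'} -- includes the coordinate
print(set(ds.data_vars))  # {'data'} -- only the actual data variables
```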
Changing the function to
```python
def _validate_region(ds, region):
    if not isinstance(region, dict):
        raise TypeError(f""``region`` must be a dict, got {type(region)}"")

    for k, v in region.items():
        if k not in ds.dims:
            raise ValueError(
                f""all keys in ``region`` are not in Dataset dimensions, got ""
                f""{list(region)} and {list(ds.dims)}""
            )
        if not isinstance(v, slice):
            raise TypeError(
                ""all values in ``region`` must be slice objects, got ""
                f""region={region}""
            )
        if v.step not in {1, None}:
            raise ValueError(
                ""step on all slices in ``region`` must be 1 or None, got ""
                f""region={region}""
            )

    non_matching_vars = [
        k for k, v in ds.data_vars.items() if not set(region).intersection(v.dims)
    ]
    if non_matching_vars:
        raise ValueError(
            f""when setting `region` explicitly in to_zarr(), all ""
            f""variables in the dataset to write must have at least ""
            f""one dimension in common with the region's dimensions ""
            f""{list(region.keys())}, but that is not ""
            f""the case for some variables here. To drop these variables ""
            f""from this dataset before exporting to zarr, write: ""
            f"".drop({non_matching_vars!r})""
        )
```
seems to work.","{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 1, ""eyes"": 0}",,1077079208
https://github.com/pydata/xarray/issues/6069#issuecomment-1031773761,https://api.github.com/repos/pydata/xarray/issues/6069,1031773761,IC_kwDOAMm_X849f55B,43613877,2022-02-07T18:19:08Z,2022-02-07T18:19:08Z,CONTRIBUTOR,"Hi @Boorhin,
I just ran into the same issue. The `region` argument has to be of type `slice`; in your case, `slice(t)` instead of just `t` works:
```python
import xarray as xr
from datetime import datetime,timedelta
import numpy as np

dt= datetime.now()
times= np.arange(dt,dt+timedelta(days=6), timedelta(hours=1))
nodesx,nodesy,layers=np.arange(10,50), np.arange(10,50)+15, np.arange(10)
ds=xr.Dataset()
ds.coords['time']=('time', times)
ds.coords['node_x']=('node', nodesx)
ds.coords['node_y']=('node', nodesy)
ds.coords['layer']=('layer', layers)
outfile='my_zarr'
varnames=['potato','banana', 'apple']
for var in varnames:
    ds[var]=(('time', 'layer', 'node'), np.zeros((len(times), len(layers),len(nodesx))))
ds.to_zarr(outfile, mode='a')
for t in range(len(times)):
    for var in varnames:
        ds[var].isel(time=slice(t)).values += np.random.random((len(layers),len(nodesx)))
    ds.isel(time=slice(t)).to_zarr(outfile, region={""time"": slice(t)})
```
However, this leads to another issue:
```python
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-...> in <module>
     18     for var in varnames:
     19         ds[var].isel(time=slice(t)).values += np.random.random((len(layers),len(nodesx)))
---> 20     ds.isel(time=slice(t)).to_zarr(outfile, region={""time"": slice(t)})

~/.local/lib/python3.8/site-packages/xarray/core/dataset.py in to_zarr(self, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks)
   2029             encoding = {}
   2030
-> 2031         return to_zarr(
   2032             self,
   2033             store=store,

~/.local/lib/python3.8/site-packages/xarray/backends/api.py in to_zarr(dataset, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks)
   1359
   1360     if region is not None:
-> 1361         _validate_region(dataset, region)
   1362     if append_dim is not None and append_dim in region:
   1363         raise ValueError(

~/.local/lib/python3.8/site-packages/xarray/backends/api.py in _validate_region(ds, region)
   1272     ]
   1273     if non_matching_vars:
-> 1274         raise ValueError(
   1275             f""when setting `region` explicitly in to_zarr(), all ""
   1276             f""variables in the dataset to write must have at least ""

ValueError: when setting `region` explicitly in to_zarr(), all variables in the dataset to write must have at least one dimension in common with the region's dimensions ['time'], but that is not the case for some variables here. To drop these variables from this dataset before exporting to zarr, write: .drop(['node_x', 'node_y', 'layer'])
```
Here, however, the solution is provided by the error message itself. Following its instructions, the snippet below finally works (as far as I can tell):
```python
import xarray as xr
from datetime import datetime,timedelta
import numpy as np

dt= datetime.now()
times= np.arange(dt,dt+timedelta(days=6), timedelta(hours=1))
nodesx,nodesy,layers=np.arange(10,50), np.arange(10,50)+15, np.arange(10)
ds=xr.Dataset()
ds.coords['time']=('time', times)
# ds.coords['node_x']=('node', nodesx)
# ds.coords['node_y']=('node', nodesy)
# ds.coords['layer']=('layer', layers)
outfile='my_zarr'
varnames=['potato','banana', 'apple']
for var in varnames:
    ds[var]=(('time', 'layer', 'node'), np.zeros((len(times), len(layers),len(nodesx))))
ds.to_zarr(outfile, mode='a')
for t in range(len(times)):
    for var in varnames:
        ds[var].isel(time=slice(t)).values += np.random.random((len(layers),len(nodesx)))
    ds.isel(time=slice(t)).to_zarr(outfile, region={""time"": slice(t)})
```
Maybe one would like to generalise `region` in `api.py` to allow for single indices, or at least throw a hint in case a type other than a slice is provided. A rough sketch of such a normalisation is below.
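(A sketch only; `_normalize_region` is a hypothetical helper, not part of xarray:)
```python
def _normalize_region(region):
    # Hypothetical helper: convert integer indices to single-element
    # slices, so region={'time': 5} behaves like region={'time': slice(5, 6)}.
    out = {}
    for dim, idx in region.items():
        if isinstance(idx, int):
            idx = slice(idx, idx + 1)
        elif not isinstance(idx, slice):
            raise TypeError('region values must be slices or ints, got ' + repr(idx))
        out[dim] = idx
    return out
```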
Cheers","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1077079208
https://github.com/pydata/xarray/pull/4994#issuecomment-800145214,https://api.github.com/repos/pydata/xarray/issues/4994,800145214,MDEyOklzc3VlQ29tbWVudDgwMDE0NTIxNA==,43613877,2021-03-16T10:34:19Z,2021-03-16T10:34:19Z,CONTRIBUTOR,Thanks for this great tool and the great support!,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,822256201
https://github.com/pydata/xarray/pull/4994#issuecomment-799452244,https://api.github.com/repos/pydata/xarray/issues/4994,799452244,MDEyOklzc3VlQ29tbWVudDc5OTQ1MjI0NA==,43613877,2021-03-15T14:12:30Z,2021-03-15T14:12:30Z,CONTRIBUTOR,"Great! @spencerkclark I added the information to [`api-hidden.rst`](https://github.com/pydata/xarray/blob/master/doc/api-hidden.rst) and also to [`api.rst`](https://github.com/pydata/xarray/blob/master/doc/api.rst).
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,822256201
https://github.com/pydata/xarray/issues/4995#issuecomment-799047819,https://api.github.com/repos/pydata/xarray/issues/4995,799047819,MDEyOklzc3VlQ29tbWVudDc5OTA0NzgxOQ==,43613877,2021-03-15T02:28:51Z,2021-03-15T02:28:51Z,CONTRIBUTOR,"Thanks @dcherian, this is doing the job. I'll close this issue as there seems to be no need to implement this into the `sel` method.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,822320976
https://github.com/pydata/xarray/issues/4995#issuecomment-791019238,https://api.github.com/repos/pydata/xarray/issues/4995,791019238,MDEyOklzc3VlQ29tbWVudDc5MTAxOTIzOA==,43613877,2021-03-04T23:10:11Z,2021-03-04T23:10:11Z,CONTRIBUTOR,"Introducing a `fill_value` seems like a good idea, such that the size of the output does not change compared to the intended selection.
Choosing the originally requested coordinate as the label for the missing data point seems like a valid choice, because this position has already been searched for valid nearby data without success.
I would suggest that the `fill_value` be determined automatically from the `_FillValue` attribute first, then from the datatype, and only as a last resort require the user to set `fill_value` explicitly.
However, the shortcoming that I see in using a `fill_value` is that the indexing has to modify the data (insert e.g. `-999`) and also 'invent' a new coordinate point (here `40`). This gets reasonably complex when applied to a dataset with DataArrays of different types, e.g.
```python
import numpy as np
import xarray as xr
ds = xr.Dataset()
ds['data1'] = xr.DataArray(np.array([1,2,3,4,5], dtype=int), dims=[""lat""], coords={'lat':[10,20,30,50,60]})
ds['data2'] = xr.DataArray(np.array([1,2,3,4,5], dtype=float), dims=[""lat""], coords={'lat':[10,20,30,50,60]})
```
One `fill_value` might not fit all data arrays, be it because of the datatype or the actual data. E.g. `-999` might be a good `fill_value` for one DataArray but a valid data point in another. A per-variable resolution could look like the sketch below.
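(A minimal sketch of the resolution order suggested above; `resolve_fill_value` is a hypothetical helper:)
```python
import numpy as np

def resolve_fill_value(da, user_fill_value=None):
    # 1. the variable's own _FillValue attribute, if present
    if '_FillValue' in da.attrs:
        return da.attrs['_FillValue']
    # 2. a dtype-based default, e.g. NaN for floating-point data
    if np.issubdtype(da.dtype, np.floating):
        return np.nan
    # 3. only as a last resort, require an explicit user-provided value
    if user_fill_value is not None:
        return user_fill_value
    raise ValueError('no fill_value could be determined for ' + repr(da.name))
```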
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,822320976
https://github.com/pydata/xarray/issues/4983#issuecomment-790718449,https://api.github.com/repos/pydata/xarray/issues/4983,790718449,MDEyOklzc3VlQ29tbWVudDc5MDcxODQ0OQ==,43613877,2021-03-04T15:52:38Z,2021-03-04T15:53:22Z,CONTRIBUTOR,"I hadn't thought of using `da.time.dt.floor(""D"")`. This is indeed great to know, but as there seem to be more folks expecting `da.time.dt.date` to work, I'd still like to see this implemented.
The already implemented `time` attribute has the same issue in that it does not exist for `cftime`:
```python
import numpy as np
import pandas as pd
import xarray as xr
attrs = {""units"": ""hours since 3000-01-01""}
ds = xr.Dataset({""time"": (""time"", [0, 1, 2, 3], attrs)})
xr.decode_cf(ds).time.dt.time
# AttributeError: 'CFTimeIndex' object has no attribute 'time'
```
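In contrast, the suggested `floor` workaround does work for cftime-backed times (a quick check of the same example):
```python
import xarray as xr

attrs = {'units': 'hours since 3000-01-01'}
ds = xr.Dataset({'time': ('time', [0, 1, 2, 3], attrs)})
# flooring to daily resolution works for CFTimeIndex-backed data
print(xr.decode_cf(ds).time.dt.floor('D'))
```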
I implemented the `date` attribute in PR #4994. Using `date` with a `CFTimeIndex` now raises an explicit `AttributeError` that points to `floor(""D"")` instead.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,819897789
https://github.com/pydata/xarray/pull/2716#issuecomment-457963623,https://api.github.com/repos/pydata/xarray/issues/2716,457963623,MDEyOklzc3VlQ29tbWVudDQ1Nzk2MzYyMw==,43613877,2019-01-27T23:16:58Z,2019-01-27T23:24:46Z,CONTRIBUTOR,"Sure @jhamman, I'll add some tests.
However, I thought the test should rather go into [test_dataarray.py](https://github.com/pydata/xarray/blob/master/xarray/tests/test_dataarray.py) than [test_missing.py](https://github.com/pydata/xarray/blob/master/xarray/tests/test_missing.py), since this is an improvement to resample/_upsample?
Something like
```python
def test_upsample_tolerance(self):
    # Test tolerance keyword for upsample methods bfill, pad, nearest
    times = pd.date_range('2000-01-01', freq='1D', periods=2)
    times_upsampled = pd.date_range('2000-01-01', freq='6H', periods=5)
    array = DataArray(np.arange(2), [('time', times)])

    # Forward fill
    actual = array.resample(time='6H').ffill(tolerance='12H')
    expected = DataArray([0., 0., 0., np.nan, 1.],
                         [('time', times_upsampled)])
    assert_identical(expected, actual)
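    # (hypothetical extension, not part of the sketch above) bfill and
    # nearest could be tested analogously, e.g.:
    # actual = array.resample(time='6H').bfill(tolerance='6H')
    # actual = array.resample(time='6H').nearest(tolerance='6H')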
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,403462155
https://github.com/pydata/xarray/issues/2695#issuecomment-456239824,https://api.github.com/repos/pydata/xarray/issues/2695,456239824,MDEyOklzc3VlQ29tbWVudDQ1NjIzOTgyNA==,43613877,2019-01-22T01:25:55Z,2019-01-22T01:25:55Z,CONTRIBUTOR,"Thanks for the clarification! I think the `tolerance` argument might even be superior to `limit`; at the very least, the resample methods would benefit from either of these arguments.
My above-mentioned changes to the code, despite mixing up `limit` and `tolerance`, actually seem to implement the `tolerance` argument.
```python
import xarray as xr
import pandas as pd
import datetime as dt
dates=[dt.datetime(2018,1,1), dt.datetime(2018,1,2)]
data=[10,20]
df=pd.DataFrame(data,index=dates)
xdf = xr.Dataset.from_dataframe(df)
xdf.resample({'index':'1H'}).nearest(tolerance=dt.timedelta(hours=2))
```
would lead to
```
<xarray.Dataset>
Dimensions:  (index: 25)
Coordinates:
  * index    (index) datetime64[ns] 2018-01-01 ... 2018-01-02
Data variables:
    0        (index) float64 10.0 10.0 10.0 nan nan ... nan nan 20.0 20.0 20.0
```
Would that be helpful to include? If so, I'd write a pull request.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,401392318