id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
114773593,MDU6SXNzdWUxMTQ3NzM1OTM=,644,Feature request: only allow nearest-neighbor .sel for valid data (not NaN positions),13906519,closed,0,,,10,2015-11-03T09:17:21Z,2023-05-12T22:31:46Z,2019-03-01T04:00:07Z,NONE,,,,"Hi.
Maybe I'm missing something, but I'm trying to get data by providing lat & lon coordinates to the .sel operator together with the nearest argument. This works, but in my case ocean pixels containing NaN are often nearby. Is there a way to only return the nearest _valid_ point, even if it is a bit further away than the invalid locations?
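For now, a brute-force workaround sketch (the helper sel_nearest_valid is hypothetical, not xarray API): mask the NaN cells, compute the distance from the query point to every valid cell, and take the argmin:
```python
import numpy as np
import xarray as xr

def sel_nearest_valid(da, lat, lon):
    # distance from the query point to every grid cell (in degrees; fine locally)
    lat2d, lon2d = np.meshgrid(da['lat'].values, da['lon'].values, indexing='ij')
    dist2 = (lat2d - lat) ** 2 + (lon2d - lon) ** 2
    # push NaN cells out of consideration
    dist2[np.isnan(da.values)] = np.inf
    i, j = np.unravel_index(np.argmin(dist2), dist2.shape)
    return da.isel(lat=int(i), lon=int(j))

da = xr.DataArray([[np.nan, 1.0], [2.0, 3.0]],
                  coords=[('lat', [0.0, 1.0]), ('lon', [0.0, 1.0])])
nearest_valid = sel_nearest_valid(da, 0.0, 0.0)  # skips the NaN at (0, 0)
```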
Christian
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/644/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
606165039,MDU6SXNzdWU2MDYxNjUwMzk=,4000,Add hook to get progress of long-running operations,13906519,closed,0,,,3,2020-04-24T09:13:02Z,2022-04-09T03:08:45Z,2022-04-09T03:08:45Z,NONE,,,,"
Hi. I currently work on a large dataframe that I convert to an xarray Dataset. It works, but takes quite some (unknown) amount of time.
#### MCVE Code Sample
```python
data = pd.DataFrame(""huge data frame with time, lat, Lon as multiindex and about 60 data columns "")
dsout = xr.Dataset.from_dataframe(data)
```
#### Expected Output
A progress report/bar for the operation
#### Problem Description
It would be nice to have some hook or other functionality to tap into xr.Dataset.from_dataframe() and report a progress status that I could then pass to tqdm or something similar...
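In the meantime, a workaround sketch (my own helper, not an xarray hook): convert the frame one column at a time, so the loop itself can report progress, e.g. by wrapping it in tqdm:
```python
import pandas as pd
import xarray as xr

def dataset_from_frame_with_progress(df, callback=None):
    # build the Dataset column by column so each step can report progress
    ds = xr.Dataset()
    for n, col in enumerate(df.columns, start=1):
        ds[col] = xr.DataArray.from_series(df[col])
        if callback is not None:
            callback(n, len(df.columns))  # e.g. hook up tqdm here
    return ds

idx = pd.MultiIndex.from_product(
    [pd.date_range('2020-01-01', periods=3), [0.0, 0.5]],
    names=['time', 'lat'])
df = pd.DataFrame({'a': range(6), 'b': range(6)}, index=idx)
ds = dataset_from_frame_with_progress(
    df, callback=lambda n, total: print(f'{n}/{total}'))
```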
#### Versions
0.15.1
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4000/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
307444427,MDU6SXNzdWUzMDc0NDQ0Mjc=,2005,What is the recommended way to do proper compression/ scaling of vars?,13906519,closed,0,,,3,2018-03-21T22:45:05Z,2020-03-22T10:38:42Z,2020-03-22T09:38:42Z,NONE,,,,"#### Code Sample, a copy-pastable example if possible
```python
if level == 'HIGH':
    ctype = 'i4'
else:
    ctype = 'i2'

max_int = np.iinfo(ctype).max
min_val = ds[data_var].min().values
max_val = ds[data_var].max().values

offset_ = min_val
if max_val - min_val == 0:
    scale_ = 1.0
else:
    scale_ = float(max_int / (max_val - min_val))
```
#### Problem description
Using i2 I mostly get proper output. However, when I try to use i4 or i8 I don't get anything resembling my input...
I write to file with format='NETCDF4_CLASSIC'
My DataArray encoding is:
```python
ENCODING = dict(dtype=ctype, add_offset=offset_, scale_factor=scale_, zlib=True, _FillValue=-9999)
ds[data_var] = ds[data_var].astype(ctype) #, casting='unsafe')
ds[data_var].extra.update_encoding(ENCODING)
```
extra is added to DataArray like so:
```python
@xr.register_dataarray_accessor('extra')
class TileAccessor(object):
    def __init__(self, xarray_obj):
        self._obj = xarray_obj

    def update_encoding(self, *args, **kwargs):
        """"""Update the encoding of an xarray DataArray.""""""
        def _update_encoding(obj, *args, **kwargs):
            obj.encoding.update(*args, **kwargs)
            return obj
        self._obj.pipe(_update_encoding, *args, **kwargs)
```
#### Expected Output
Not sure if one is supposed to use types other than i2 for compressed storage? Or is my approach wrong?
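For reference, the CF/NUG convention unpacks stored values as physical = stored * scale_factor + add_offset, so scale_factor should be physical-range divided by integer-range — the reciprocal of the `scale_` expression above. A sketch of what I would expect (hypothetical helper, values chosen for illustration):
```python
import numpy as np

def packing_params(min_val, max_val, ctype='i2'):
    # CF convention: physical = stored * scale_factor + add_offset,
    # so scale_factor maps the integer range onto the physical range
    info = np.iinfo(ctype)
    if max_val == min_val:
        return min_val, 1.0
    scale = (max_val - min_val) / info.max  # map [min, max] onto [0, info.max]
    return min_val, scale

offset_, scale_ = packing_params(0.0, 100.0, 'i2')
stored = np.round((100.0 - offset_) / scale_).astype('i2')
roundtrip = float(stored) * scale_ + offset_  # recovers ~100.0, within one step
```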
#### Output of ``xr.show_versions()``
xarray: 0.10.0
pandas: 0.22.0
numpy: 1.14.2
scipy: 1.0.0
netCDF4: 1.3.1
h5netcdf: None
Nio: None
bottleneck: None
cyordereddict: None
dask: None
matplotlib: 2.1.2
cartopy: None
seaborn: 0.8.1
setuptools: 38.5.2
pip: 9.0.1
conda: None
pytest: 3.4.2
IPython: 6.2.1
sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2005/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
507211596,MDU6SXNzdWU1MDcyMTE1OTY=,3399,"With sel_points deprecated, how do you replace it?",13906519,closed,0,,,2,2019-10-15T12:27:54Z,2019-10-15T13:55:48Z,2019-10-15T13:55:48Z,NONE,,,,"#### MCVE Code Sample
```python
import numpy as np
import xarray as xr
da = xr.DataArray(np.arange(16).reshape((4,4)), coords=[('lat', [40,45,50,55]), ('lon', [5,10,15,20])])
points = da.sel_points(lat=[40.1, 42.3, 39.78], lon=[7.1, 6.2, 13.2], method='nearest')
for p in points:
print(p)
```
#### Expected Output
This works fine in versions < 0.13, but is now deprecated. I don't know (and cannot find an example in the docs) how one would replicate this with vectorised indexing.
I have to iterate over the returned points as each will be exported into an XML element.
Note that in my real code I work with an xr.Dataset with many variables instead of a simple 2D xr.DataArray.
#### Problem Description
I think this deprecation and the current docs will trip up a lot of users of sel_points/isel_points.
Any example of how to replace the code with the current API would be highly appreciated!
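If I understand the vectorised-indexing docs correctly, the replacement is to pass DataArray indexers that share a new common dimension, which triggers pointwise selection:
```python
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.arange(16).reshape((4, 4)),
    coords=[('lat', [40, 45, 50, 55]), ('lon', [5, 10, 15, 20])])

# sel_points(...) replacement: DataArray indexers sharing the new
# 'points' dimension give vectorised (pointwise) selection
points = da.sel(
    lat=xr.DataArray([40.1, 42.3, 39.78], dims='points'),
    lon=xr.DataArray([7.1, 6.2, 13.2], dims='points'),
    method='nearest')

for p in points:  # iterate over the 'points' dimension, one scalar each
    print(p.values)
```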
#### Output of ``xr.show_versions()``
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3399/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
258316935,MDU6SXNzdWUyNTgzMTY5MzU=,1575,Allow rasterio_open to be used on in-memory rasterio objects? ,13906519,closed,0,,,2,2017-09-17T16:57:31Z,2019-09-17T20:26:31Z,2019-09-17T20:26:31Z,NONE,,,,"After I open and reproject a GTiff to WGS84 I want to place the bands into an xarray Dataset object. As I try to reduce disc writes as much as possible, I work with a rasterio in-memory object.
As far as I can see it is currently not possible to pass a rasterio in-memory object to the open_rasterio function, right? Is there a workaround for this, or do I have to build the dataset from scratch?
If it's currently not possible I'd like to suggest this as a future addition.
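Building the dataset from scratch is at least doable. A sketch, assuming `bands` comes from `src.read()` on the in-memory dataset and the corner/resolution values come from its affine transform (the helper and its argument names are my own):
```python
import numpy as np
import xarray as xr

def dataset_from_bands(bands, west, north, xres, yres):
    # bands would be src.read() -> shape (band, row, col);
    # west/north/xres/yres would come from src.transform
    nband, ny, nx = bands.shape
    lon = west + (np.arange(nx) + 0.5) * xres   # pixel-centre longitudes
    lat = north - (np.arange(ny) + 0.5) * yres  # pixel-centre latitudes
    da = xr.DataArray(
        bands,
        coords=[('band', np.arange(1, nband + 1)), ('lat', lat), ('lon', lon)])
    return da.to_dataset(name='raster')

ds = dataset_from_bands(np.zeros((3, 2, 4)), west=5.0, north=50.0,
                        xres=0.5, yres=0.5)
```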
[and thanks for your great xarray module]","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1575/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
159791786,MDU6SXNzdWUxNTk3OTE3ODY=,880,Possibility to provide default map background/ elements to plot,13906519,closed,0,,,2,2016-06-11T20:11:25Z,2019-02-26T17:51:16Z,2019-02-26T17:51:16Z,NONE,,,,"Hi.
Really enjoying xarray! I'm converting all my scripts and am on a quest to simplify my plotting.
For my current project I need loads of maps, and I generally add a land/ocean layer and country boundaries, as well as a default extent, to the plot axes...
For instance, I use an extra layer like this:
```
import cartopy.crs as ccrs
import cartopy.feature as cfeature

# default map backgrounds
countries = cfeature.NaturalEarthFeature(
    category='cultural',
    name='admin_0_countries',
    scale='50m',
    facecolor='none')
```
An example from a class I work on right now:
```
def plot_monthly(self, figsize=(12, 10)):
    _unit = 'n.d.'
    _name = 'var'
    if 'units' in self._dataArray.attrs.keys():
        _unit = self._dataArray.attrs['units']

    self._fig, self._axes = plt.subplots(ncols=4, nrows=3,
        figsize=figsize, subplot_kw={'projection': ccrs.PlateCarree()})

    _min = self._dataArray.min()
    _max = self._dataArray.max()
    print(self._dataArray)

    for i, ax in enumerate(self._axes.flatten()):
        ax.set_extent(self._extent)
        ax.add_feature(self._countries, edgecolor='gray')
        ax.add_feature(self._ocean)
        ax.coastlines(resolution='50m')
        ax.set_title(self._monthNames[i + 1])
        _plot = xr.plot.plot(self._dataArray.sel(month=i + 1), ax=ax,
                             vmin=_min, vmax=_max,
                             transform=ccrs.PlateCarree(),
                             add_labels=False,
                             robust=True, add_colorbar=False)

    self._fig.subplots_adjust(right=0.8)
    _cbar_ax = self._fig.add_axes([0.85, 0.15, 0.05, 0.7])
    _var_str = '%s [%s]' % (_name, _unit)
    self._fig.colorbar(_plot, cax=_cbar_ax, label=_var_str)
```
This is a bit clumsy at the moment - I basically define the axes for each subplot (say months 1..12), loop over them, select each axis and add the layers with ax.set_extent(), ax.add_feature(), etc.
This works, but I'd rather use the plot function within DataArray or Dataset... I'm also thinking about using FacetGrid instead of doing it manually.
I thought about using the new accessors to patch this onto a custom plot directive, but my Python is not really advanced enough for accessors and the inner workings of xarray, it seems...
```
@xr.register_dataarray_accessor('map')
class MapDecorator(object):
    def __init__(self, xarray_obj):
        self._obj = xarray_obj

    def plot(self, **kwargs):
        """"""Plot data on a map.""""""
        print('my decorator')
        #ax.set_extent([102, 110, 8, 24])
        ## add map elements
        #ax.add_feature(countries, edgecolor='gray')  # country borders
        #ax.add_feature(ocean)  # ocean
        #ax.coastlines(resolution='50m')  # coastlines
        p = self._obj.plot(**kwargs)
        return p
```
Not sure if this is a valid path or how one would do that? I would also like to have some default arguments (projection=)
Cheers,
Christian
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/880/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
102416901,MDU6SXNzdWUxMDI0MTY5MDE=,545,Negative timesteps after .to_netcdf with long time periods?,13906519,closed,0,,,3,2015-08-21T16:36:36Z,2019-02-01T01:15:22Z,2019-02-01T01:15:22Z,NONE,,,,"Hi.
I discovered that I get negative times on the time dimension for long intervals (years 1700-2013, monthly timestep).
``` python
import xray
import numpy as np
import pandas as pd
years = range(1700,2014)
LATS = np.arange(-89.75, 90.0, 0.5)
LONS = np.arange(-179.75, 180.0, 0.5)
tlist = pd.date_range('%d-01-01' % years[0], periods=12*len(years), freq='M')
da = xray.DataArray(np.ones((12*len(years), 360, 720))*-9999, \
[('time', tlist), ('latitude', LATS), ('longitude', LONS) ])
# i then fill the dataarray with info from a text file (using read_csv from pandas)
# eventually I dump to netcdf
ds = xray.Dataset({""mgpp"": da})
ds.to_netcdf('test_%d-%d.nc' % (years[0], years[-1]))
```
If I ""ncdump -c mgpp_1700-2013.nc I get:
```
netcdf mgpp_1700-2013 {
dimensions:
latitude = 360 ;
time = 3768 ;
longitude = 720 ;
variables:
float latitude(latitude) ;
float mgpp(time, latitude, longitude) ;
mgpp:units = ""gCm-2"" ;
float longitude(longitude) ;
float time(time) ;
time:units = ""days since 1700-01-31 00:00:00"" ;
time:calendar = ""proleptic_gregorian"" ;
data:
time = 0, 28, 59, 89, 120, 150, 181, 212, 242, 273, 303, 334, 365, 393, 424,
454, 485, 515, 546, 577, 607, 638, 668, 699, 730, 758, 789, 819, 850,
880, 911, 942, 972, 1003, 1033, 1064, 1095, 1123, 1154, 1184, 1215, 1245,
1276, 1307, 1337, 1368, 1398, 1429, 1460, 1489, 1520, 1550, 1581, 1611,
1642, 1673, 1703, 1734, 1764, 1795, 1826, 1854, 1885, 1915, 1946, 1976,
2007, 2038, 2068, 2099, 2129, 2160, 2191, 2219, 2250, 2280, 2311, 2341,
2372, 2403, 2433, 2464, 2494, 2525, 2556, 2584, 2615, 2645, 2676, 2706,
2737, 2768, 2798, 2829, 2859, 2890, 2921, 2950, 2981, 3011, 3042, 3072,
3103, 3134, 3164, 3195, 3225, 3256, 3287, 3315, 3346, 3376, 3407, 3437,
3468, 3499, 3529, 3560, 3590, 3621, 3652, 3680, 3711, 3741, 3772, 3802, (...)
```
and eventually:
```
(...) 106435, 106466, 106497, 106527, 106558, 106588, 106619, 106650, 106679,
106710, 106740, -106732.982337963, -106702.982337963, -106671.982337963,
-106640.982337963, -106610.982337963, -106579.982337963,
-106549.982337963, -106518.982337963, -106487.982337963,
-106459.982337963, -106428.982337963, -106398.982337963,
-106367.982337963, -106337.982337963, -106306.982337963,
-106275.982337963, -106245.982337963, -106214.982337963,
-106184.982337963, -106153.982337963, -106122.982337963,
-106094.982337963, -106063.982337963, -106033.982337963,
-106002.982337963, -105972.982337963, -105941.982337963,
-105910.982337963, -105880.982337963, -105849.982337963, (...)
```
Not sure if I can influence that at ""dump"" time with to_netcdf? I know about the time limitation, but my years should be non-critical, no?
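The flip to large negative values is at least consistent with signed 64-bit nanosecond arithmetic wrapping around after roughly 292 years (2**63 ns), though I'm not sure that's the actual cause. A workaround sketch: pin the on-disk time encoding explicitly before writing:
```python
import numpy as np
import pandas as pd
import xarray as xr

times = pd.date_range('1700-01-01', periods=24, freq='D')
ds = xr.Dataset({'mgpp': ('time', np.zeros(len(times)))},
                coords={'time': times})
# force a safe on-disk representation instead of the default encoding
ds['time'].encoding.update({'units': 'days since 1700-01-01',
                            'dtype': 'float64',
                            'calendar': 'proleptic_gregorian'})
# then ds.to_netcdf(...) as before
```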
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/545/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
105536609,MDU6SXNzdWUxMDU1MzY2MDk=,564,to_netcdf() writes attrs as unicode strings ,13906519,closed,0,,,2,2015-09-09T07:37:17Z,2015-09-09T16:57:48Z,2015-09-09T16:57:48Z,NONE,,,,"Hi.
Seems like xray writes attributes as unicode strings to netCDF files. This causes problems with other software: the attrs show up as ""string ..."" with ncdump, for instance, and ncview does not recognise the units attributes, etc.
For the moment I do this:
nc.attrs = OrderedDict( (str(k),str(v)) for k,v in nc1_backup.attrs.iteritems() )
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/564/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue