id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
114773593,MDU6SXNzdWUxMTQ3NzM1OTM=,644,Feature request: only allow nearest-neighbor .sel for valid data (not NaN positions),13906519,closed,0,,,10,2015-11-03T09:17:21Z,2023-05-12T22:31:46Z,2019-03-01T04:00:07Z,NONE,,,,"Hi. Maybe I'm missing something, but I'm trying to get data by providing lat & lon coordinates to the .sel operator together with the `method='nearest'` argument. This works, but in my case ocean pixels are often nearby, and those contain NaN. Is there a way to only return the nearest _valid_ point, even if it is a bit further away than the invalid locations?

Christian
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/644/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
809332917,MDU6SXNzdWU4MDkzMzI5MTc=,4914,Record processing steps into history attribute with context manager,13906519,open,0,,,4,2021-02-16T13:55:05Z,2022-05-23T13:28:48Z,,NONE,,,,"I often want to record an entry in the history attribute of my netcdf file/ xarray. While one can always add it manually, i.e.

```python
ds.attrs[""history""] = ds.attrs[""history""] + ""\n"" + ""message""
```

I was wondering if there's a better way... In a first attempt I tried using a context manager for this. Not sure if there are other approaches? Would that be something useful for xarray core? What are other people using for this?

**Demo:**

```python
import datetime
import xarray as xr


class XrHistory():

    def __init__(self, array, message, timestamp=True):
        self._array = array
        self._message = message
        self._timestamp = timestamp

    def __enter__(self):
        if 'history' not in self._array.attrs:
            self._array.attrs['history'] = """"

        if self._message != self._array.attrs['history'].split('\n')[-1]:
            ts = f""{datetime.datetime.now().strftime('%a %b %d %H:%M:%S %Y')}: "" if self._timestamp else """"
            self._array.attrs['history'] += f""\n{ts}{self._message}""
            self._message = None
        return self._array

    def __exit__(self, exc_type, exc_value, exc_traceback):
        pass


# ds is any xarray dataset...
with XrHistory(ds, ""normalise data"") as ds:
    ds[""array_one""] = (ds.array_one - ds.array_one.mean(dim='time')) / ds.array_one.std(dim='time')

with XrHistory(ds, ""subset data"") as ds:
    ds = ds.sel(x=slice(10, 20), y=slice(10, 20))

# ...
```","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4914/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
616025171,MDU6SXNzdWU2MTYwMjUxNzE=,4052,Opendap access problem when subsetting via latitude and longitude...,13906519,open,0,,,1,2020-05-11T16:44:02Z,2022-04-29T00:37:51Z,,NONE,,,," I am trying to access a subset of a netcdf file hosted on a THREDDS server. I can inspect the metadata, but I cannot subset the file via lat and lon slices. A download via the http link provided on the page works...
#### MCVE Code Sample

```python
import xarray as xr

# URL as defined in the pydap example below
URL = ""https://thredds.daac.ornl.gov/thredds/dodsC/ornldaac/1247/T_CLAY.nc4""

# this works
test1 = xr.open_dataset(URL)
display(test1)

# this also works
test2 = xr.open_dataset(URL).sel(lat=slice(30,40))
display(test2)

# this also works
test3 = xr.open_dataset(URL).sel(lon=slice(100,110))
display(test3)

# this fails
test4 = xr.open_dataset(URL).sel(lat=slice(30,40), lon=slice(100,110))
display(test4)
```

#### Problem Description

Error:

```
~/.pyenv/versions/miniconda3-latest/envs/datascience/lib/python3.7/site-packages/xarray/backends/common.py in robust_getitem(array, key, catch, max_retries, initial_delay)
     52     for n in range(max_retries + 1):
     53         try:
---> 54             return array[key]
     55         except catch:
     56             if n == max_retries:

netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Variable.__getitem__()

netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Variable._get()

netCDF4/_netCDF4.pyx in netCDF4._netCDF4._ensure_nc_success()

RuntimeError: NetCDF: Access failure
```

I also tried to use the pydap engine, but I'm not sure if this tells me something about the problem or if I'm using this option incorrectly...

```python
URL = ""https://thredds.daac.ornl.gov/thredds/dodsC/ornldaac/1247/T_CLAY.nc4""
test1 = xr.open_dataset(URL, engine='pydap')
test1
```

result:

```
UnicodeDecodeError: 'ascii' codec can't decode byte 0x8b in position 1: ordinal not in range(128)
```

#### Versions

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.4 (default, Aug 13 2019, 15:17:50) [Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 19.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.1

xarray: 0.15.1
pandas: 1.0.1
numpy: 1.18.1
scipy: 1.4.1
netCDF4: 1.4.2
pydap: installed
h5netcdf: None
h5py: None
Nio: None
zarr: 2.4.0
cftime: 1.0.4.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.0.21
cfgrib: None
iris: None
bottleneck: None
dask: 2.11.0
distributed: 2.11.0
matplotlib: 3.1.3
cartopy: 0.17.0
seaborn: 0.10.0
numbagg: None
setuptools: 45.2.0.post20200210
pip: 20.0.2
conda: None
pytest: None
IPython: 7.12.0
sphinx: None
```","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4052/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
606165039,MDU6SXNzdWU2MDYxNjUwMzk=,4000,Add hook to get progress of long-running operations,13906519,closed,0,,,3,2020-04-24T09:13:02Z,2022-04-09T03:08:45Z,2022-04-09T03:08:45Z,NONE,,,," Hi. I currently work with a large dataframe that I convert to an xarray Dataset. It works, but takes quite some time, with no feedback on progress.

#### MCVE Code Sample

```python
data = pd.DataFrame(""huge data frame with time, lat, lon as multiindex and about 60 data columns"")

# from_dataframe is a classmethod, so no intermediate Dataset() is needed
dsout = xr.Dataset.from_dataframe(data)
```

#### Expected Output

A progress report/ bar for the operation.

#### Problem Description

It would be nice to have some hook or other functionality to tap into xr.Dataset.from_dataframe() and return a progress status that I could then pass to tqdm or something similar...
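In the meantime I use a rough workaround (only a sketch, and it assumes the MultiIndex has a `time` level to split on - any outer level would do): convert the frame piece by piece so tqdm can report progress, then concatenate:

```python
import xarray as xr
from tqdm import tqdm

# assumption: `data` has a (time, lat, lon) MultiIndex as described above
times = data.index.get_level_values('time').unique()
parts = [
    xr.Dataset.from_dataframe(data.xs(t, level='time', drop_level=False))
    for t in tqdm(times, desc='from_dataframe')
]
dsout = xr.concat(parts, dim='time')
```

This trades one opaque call for many small ones (and is somewhat slower overall), but at least the progress is visible.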
#### Versions

0.15.1
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4000/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
650549352,MDU6SXNzdWU2NTA1NDkzNTI=,4197,"Provide a ""shrink"" command to remove bounding nan/ whitespace of DataArray",13906519,open,0,,,7,2020-07-03T11:55:05Z,2022-04-09T01:22:31Z,,NONE,,,," I'm currently trying to come up with an elegant solution to remove extra whitespace/ nan-values along the edges of a 2D DataArray. I'm working with geographic data and am searching for an automatic way to shrink the extent to valid data only. Think of a map of the EU, but remove all cols/ rows of the array (starting from the edges) that contain only nan.

**Describe the solution you'd like**

A shrink command that removes all nan rows/ cols at the edges of a DataArray.

**Describe alternatives you've considered**

I currently do this with NumPy, operating on the raw data and creating a new DataArray afterwards.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4197/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
307444427,MDU6SXNzdWUzMDc0NDQ0Mjc=,2005,What is the recommended way to do proper compression/ scaling of vars?,13906519,closed,0,,,3,2018-03-21T22:45:05Z,2020-03-22T10:38:42Z,2020-03-22T09:38:42Z,NONE,,,,"#### Code Sample, a copy-pastable example if possible

```python
if level == 'HIGH':
    ctype = 'i4'
else:
    ctype = 'i2'

max_int = np.iinfo(ctype).max
min_val = ds[data_var].min().values
max_val = ds[data_var].max().values
offset_ = min_val

if max_val - min_val == 0:
    scale_ = 1.0
else:
    scale_ = float(max_int / (max_val - min_val))
```

#### Problem Description

Using i2 I mostly get proper output. However, when I try to use i4 or i8 I don't get anything resembling my input... I write to file with format='NETCDF4_CLASSIC'.

My DataArray encoding is:

```python
ENCODING = dict(dtype=ctype, add_offset=offset_, scale_factor=scale_, zlib=True, _FillValue=-9999)

ds[data_var] = ds[data_var].astype(ctype)  #, casting='unsafe')
ds[data_var].extra.update_encoding(ENCODING)
```

extra is added to the DataArray like so:

```python
@xr.register_dataarray_accessor('extra')
class TileAccessor(object):
    def __init__(self, xarray_obj):
        self._obj = xarray_obj

    def update_encoding(self, *args, **kwargs):
        """"""Update the encoding in a xarray DataArray""""""
        def _update_encoding(obj, *args, **kwargs):
            obj.encoding.update(*args, **kwargs)
            return obj
        self._obj.pipe(_update_encoding, *args, **kwargs)
```

#### Expected Output

Not sure if one is supposed to use types other than i2 for compressed storage? Or is my approach wrong?

#### Output of ``xr.show_versions()``
xarray: 0.10.0
pandas: 0.22.0
numpy: 1.14.2
scipy: 1.0.0
netCDF4: 1.3.1
h5netcdf: None
Nio: None
bottleneck: None
cyordereddict: None
dask: None
matplotlib: 2.1.2
cartopy: None
seaborn: 0.8.1
setuptools: 38.5.2
pip: 9.0.1
conda: None
pytest: 3.4.2
IPython: 6.2.1
sphinx: None
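PS: for reference, the packing recipe I understand to be conventional (a sketch of my understanding, not a confirmed xarray recommendation) scales by the value range divided by the usable integer range and centres the offset, reserving the lowest integer for the fill value. Also, as far as I know the netCDF classic data model has no 64-bit integer type, so i8 probably cannot work with format='NETCDF4_CLASSIC' at all:

```python
import numpy as np

def packing(vmin, vmax, ctype):
    # decoded = packed * scale_factor + add_offset (CF convention);
    # use 2**nbits - 2 steps so the lowest integer stays free for _FillValue
    nbits = np.iinfo(ctype).bits
    scale = (vmax - vmin) / (2.0 ** nbits - 2)
    offset = (vmax + vmin) / 2.0
    fill = np.iinfo(ctype).min
    return scale, offset, fill

scale_, offset_, fill_ = packing(min_val, max_val, 'i4')
```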
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2005/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 507211596,MDU6SXNzdWU1MDcyMTE1OTY=,3399,"With sel_points deprecated, how do you replace it?",13906519,closed,0,,,2,2019-10-15T12:27:54Z,2019-10-15T13:55:48Z,2019-10-15T13:55:48Z,NONE,,,,"#### MCVE Code Sample ```python import numpy as np import xarray as xr da = xr.DataArray(np.arange(16).reshape((4,4)), coords=[('lat', [40,45,50,55]), ('lon', [5,10,15,20])]) points = da.sel_points(lat=[40.1, 42.3, 39.78], lon=[7.1, 6.2, 13.2], method='nearest') for p in points: print(p) ``` #### Expected Output This works fine < 0.13, but is now deprecated. I don't know (and cannot find an example in the docs) how one would replicate this with vectorised indexing?!? I have to iterate over the returned points as each will be exported into an XML element. Note, that in my real code I work with a xr.Dataset with many variables instead of a simple 2d xr. DataArray) #### Problem Description I think this deprecation and the current docs will trip a lot of users using sel_points/ isle_points. Any example how to replace the code with the current API is highly appreciated! #### Output of ``xr.show_versions()``
# Paste the output here xr.show_versions() here
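**Update:** I think I found it in the vectorised indexing docs - pointwise selection works by passing DataArray indexers that share a new common dimension. A sketch with the toy data from above (the dimension name `points` is my choice):

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(16).reshape((4,4)), coords=[('lat', [40,45,50,55]), ('lon', [5,10,15,20])])

# the indexers share the new dimension 'points', so .sel picks one value per
# (lat, lon) pair instead of the outer product of both lists
lats = xr.DataArray([40.1, 42.3, 39.78], dims='points')
lons = xr.DataArray([7.1, 6.2, 13.2], dims='points')

points = da.sel(lat=lats, lon=lons, method='nearest')
for p in points:
    print(p)
```

The same call should work unchanged on a Dataset with many variables.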
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3399/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 258316935,MDU6SXNzdWUyNTgzMTY5MzU=,1575,Allow rasterio_open to be used on in-memory rasterio objects? ,13906519,closed,0,,,2,2017-09-17T16:57:31Z,2019-09-17T20:26:31Z,2019-09-17T20:26:31Z,NONE,,,,"After I open and reproject a GTiff to WGS84 I want to place the bands into a xarray Dataset object. As I try to reduce disc writes as possible I work with a rasterio in-memory object. As far as I can see it is currently not possible to pass a rasterio In-Memory object to the rasterio_open function, right? Is there a workaround for this or du I have to build the dataset from scratch? If it's currently not possible I'd like to suggest this as a future addition. [and thanks for your great xarray module]","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1575/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 159791786,MDU6SXNzdWUxNTk3OTE3ODY=,880,Possibility to provide default map background/ elements to plot,13906519,closed,0,,,2,2016-06-11T20:11:25Z,2019-02-26T17:51:16Z,2019-02-26T17:51:16Z,NONE,,,,"Hi. Really enjoying xarray! Converting all my scripts and on the quest to simplify my plotting. For my current project I need loads of maps and generally add a land/ ocean layer and country boundaries as well as an default extent to the plot axes... I use for instance a extra layer like this: ``` import cartopy.crs as cars import cartopy.feature as feature # default map backgrounds countries = cfeature.NaturalEarthFeature( category='cultural', name='admin_0_countries', scale='50m', facecolor='none') ``` An example from a class I work on right now: ``` def plot_monthly(self, figsize=(12,10)): _unit = 'n.d.' _name = 'var' if 'units' in self._dataArray.attrs.keys(): _unit = self._dataArray.attrs['units'] self._fig, self._axes = plt.subplots(ncols=4, nrows=3, \ figsize=figsize, subplot_kw={'projection': ccrs.PlateCarree()}) _min = self._dataArray.min() _max = self._dataArray.max() print self._dataArray for i, ax in enumerate(self._axes.flatten()): ax.set_extent( self._extent ) ax.add_feature(self._countries, edgecolor='gray') ax.add_feature(self._ocean) ax.coastlines(resolution='50m') ax.set_title(self._monthNames[i+1]) _plot = xr.plot.plot(self._dataArray.sel(month=i+1), ax=ax, \ vmin=_min, vmax=_max, \ transform=ccrs.PlateCarree(), \ add_labels=False, robust=True, add_colorbar=False) self._fig.subplots_adjust(right=0.8) _cbar_ax = self._fig.add_axes([0.85, 0.15, 0.05, 0.7]) _var_str = ""%s [%s]"" % (_name, _unit) self._fig.colorbar(_plot, cax=_cbar_ax, label=_var_str) ``` This is a bit clumsy at the moment - I basically define the axes for each subplot (say 1..12 month), loop over them, select the axis and add them with ax.set_extent(), ax.add_feature() etc... This works, but I'd rather use the plot function within DataArray or DataSet... Also thinking about using FacetGrid instead of doing it manually. I thought about using the new decorators to patch this onto a custom plot directive but my Python is not really advanced enough for decorators and the inner functioning of array it seems... 
```
@xr.register_dataarray_accessor('map')
class MapDecorator(object):
    def __init__(self, xarray_obj):
        self._obj = xarray_obj

    def plot(self, **kwargs):
        """"""Plot data on a map.""""""
        print ""my decorator""

        #ax.set_extent([102, 110, 8, 24])

        ## add map elements
        #ax.add_feature(countries, edgecolor='gray')  # country borders
        #ax.add_feature(ocean)                        # ocean
        #ax.coastlines(resolution='50m')              # coastlines

        p = self._obj.plot(**kwargs)
        return p
```

Not sure if this is a valid path or how one would do that? I would also like to have some default arguments (projection=).

Cheers,
Christian
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/880/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
102416901,MDU6SXNzdWUxMDI0MTY5MDE=,545,Negative timesteps after .to_netcdf with long time periods?,13906519,closed,0,,,3,2015-08-21T16:36:36Z,2019-02-01T01:15:22Z,2019-02-01T01:15:22Z,NONE,,,,"Hi. I discovered that I get negative times on the time dimension for long intervals (years 1700-2013, monthly timestep).

``` python
import xray
import numpy as np
import pandas as pd

years = range(1700,2014)

LATS = np.arange(-89.75, 90.0, 0.5)
LONS = np.arange(-179.75, 180.0, 0.5)

tlist = pd.date_range('%d-01-01' % years[0], periods=12*len(years), freq='M')

da = xray.DataArray(np.ones((12*len(years), 360, 720))*-9999, \
        [('time', tlist), ('latitude', LATS), ('longitude', LONS) ])

# i then fill the dataarray with info from a text file (using read_csv from pandas)

# eventually I dump to netcdf
ds = xray.Dataset({""mgpp"": da})
ds.to_netcdf('test_%d-%d.nc' % (years[0], years[-1]))
```

If I ""ncdump -c"" mgpp_1700-2013.nc I get:

```
netcdf mgpp_1700-2013 {
dimensions:
	latitude = 360 ;
	time = 3768 ;
	longitude = 720 ;
variables:
	float latitude(latitude) ;
	float mgpp(time, latitude, longitude) ;
		mgpp:units = ""gCm-2"" ;
	float longitude(longitude) ;
	float time(time) ;
		time:units = ""days since 1700-01-31 00:00:00"" ;
		time:calendar = ""proleptic_gregorian"" ;

data:

 time = 0, 28, 59, 89, 120, 150, 181, 212, 242, 273, 303, 334, 365, 393, 424,
    454, 485, 515, 546, 577, 607, 638, 668, 699, 730, 758, 789, 819, 850,
    880, 911, 942, 972, 1003, 1033, 1064, 1095, 1123, 1154, 1184, 1215, 1245,
    1276, 1307, 1337, 1368, 1398, 1429, 1460, 1489, 1520, 1550, 1581, 1611,
    1642, 1673, 1703, 1734, 1764, 1795, 1826, 1854, 1885, 1915, 1946, 1976,
    2007, 2038, 2068, 2099, 2129, 2160, 2191, 2219, 2250, 2280, 2311, 2341,
    2372, 2403, 2433, 2464, 2494, 2525, 2556, 2584, 2615, 2645, 2676, 2706,
    2737, 2768, 2798, 2829, 2859, 2890, 2921, 2950, 2981, 3011, 3042, 3072,
    3103, 3134, 3164, 3195, 3225, 3256, 3287, 3315, 3346, 3376, 3407, 3437,
    3468, 3499, 3529, 3560, 3590, 3621, 3652, 3680, 3711, 3741, 3772, 3802,
(...)
```

and eventually:

```
(...)
    106435, 106466, 106497, 106527, 106558, 106588, 106619, 106650, 106679,
    106710, 106740, -106732.982337963, -106702.982337963, -106671.982337963,
    -106640.982337963, -106610.982337963, -106579.982337963,
    -106549.982337963, -106518.982337963, -106487.982337963,
    -106459.982337963, -106428.982337963, -106398.982337963,
    -106367.982337963, -106337.982337963, -106306.982337963,
    -106275.982337963, -106245.982337963, -106214.982337963,
    -106184.982337963, -106153.982337963, -106122.982337963,
    -106094.982337963, -106063.982337963, -106033.982337963,
    -106002.982337963, -105972.982337963, -105941.982337963,
    -105910.982337963, -105880.982337963, -105849.982337963,
(...)
```

Not sure if I can influence that at ""dump"" time with to_netcdf? I know about the time limitation, but my years should be non-critical, no?
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/545/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
105536609,MDU6SXNzdWUxMDU1MzY2MDk=,564,to_netcdf() writes attrs as unicode strings ,13906519,closed,0,,,2,2015-09-09T07:37:17Z,2015-09-09T16:57:48Z,2015-09-09T16:57:48Z,NONE,,,,"Hi. It seems that xray writes attributes as unicode strings to netcdf files. This causes problems with other software: the attrs show up as ""string ..."" with ncdump, for instance, and ncview does not recognise the units attributes etc.

For the moment I do this:

nc.attrs = OrderedDict( (str(k), str(v)) for k, v in nc1_backup.attrs.iteritems() )
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/564/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue