
issues


64 rows where state = "closed", type = "issue" and user = 2443309 sorted by updated_at descending

id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
33637243 MDU6SXNzdWUzMzYzNzI0Mw== 131 Dataset summary methods jhamman 2443309 closed 0   0.2 650893 10 2014-05-16T00:17:56Z 2023-09-28T12:42:34Z 2014-05-21T21:47:29Z MEMBER      

Add summary methods to the Dataset object. For example, it would be great if you could summarize an entire dataset in a single line.

(1) Mean of all variables in dataset.

```python
mean_ds = ds.mean()
```

(2) Mean of all variables in dataset along a dimension:

```python
time_mean_ds = ds.mean(dim='time')
```

In the case where a dimension is specified and there are variables that don't use that dimension, I'd imagine you would just pass that variable through unchanged.
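
For illustration, a small sketch of the desired pass-through behavior (the Dataset here is made up):

```python
import numpy as np
import xarray as xr

# 'temp' varies along time; 'elevation' does not.
ds = xr.Dataset({
    'temp': (('time', 'x'), np.random.rand(3, 4)),
    'elevation': ('x', np.random.rand(4)),
})

time_mean_ds = ds.mean(dim='time')
# desired: 'temp' is averaged over 'time', while 'elevation'
# passes through unchanged since it has no 'time' dimension
```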

Related to #122.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/131/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1644429340 I_kwDOAMm_X85iBAAc 7692 Feature proposal: DataArray.to_zarr() jhamman 2443309 closed 0     5 2023-03-28T18:00:24Z 2023-04-03T15:53:37Z 2023-04-03T15:53:37Z MEMBER      

Is your feature request related to a problem?

It would be nice to mimic the behavior of DataArray.to_netcdf for the Zarr backend.

Describe the solution you'd like

This should be possible:

```python
xr.open_dataarray('file.nc').to_zarr('store.zarr')
```
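
One way this could work (a rough sketch modeled on how DataArray.to_netcdf wraps the Dataset method; the function name and the fallback variable name are illustrative assumptions):

```python
import xarray as xr

# Hypothetical sketch: convert to a temporary single-variable Dataset,
# mirroring DataArray.to_netcdf, then reuse Dataset.to_zarr.
def dataarray_to_zarr(da: xr.DataArray, store, **kwargs):
    name = da.name if da.name is not None else '__xarray_dataarray_variable__'
    return da.to_dataset(name=name).to_zarr(store, **kwargs)
```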

Describe alternatives you've considered

None.

Additional context

xref DataArray.to_netcdf issue/PR: #915 / #990

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7692/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1642635191 I_kwDOAMm_X85h6J-3 7686 Add reset_encoding to Dataset and DataArray objects jhamman 2443309 closed 0     2 2023-03-27T18:51:39Z 2023-03-30T21:09:17Z 2023-03-30T21:09:17Z MEMBER      

Is your feature request related to a problem?

Xarray maintains the encoding of datasets read from most of its supported backend formats (e.g. NetCDF, Zarr, etc.). This is very useful when you want a perfect roundtrip, but it often gets in the way, causing conflicts when writing a modified dataset or when appending to another dataset. Most of the time, the solution is to simply remove the encoding from the dataset and move on. The following code sample appears in a number of issues that reference this problem.

```python
for v in list(ds.coords.keys()):
    if ds.coords[v].dtype == object:
        ds[v].encoding.clear()

for v in list(ds.variables.keys()):
    if ds[v].dtype == object:
        ds[v].encoding.clear()
```

A sample of issues that show variants of this problem:

  • https://github.com/pydata/xarray/issues/3476
  • https://github.com/pydata/xarray/issues/3739
  • https://github.com/pydata/xarray/issues/4380
  • https://github.com/pydata/xarray/issues/5219
  • https://github.com/pydata/xarray/issues/5969
  • https://github.com/pydata/xarray/issues/6329
  • https://github.com/pydata/xarray/issues/6352

Describe the solution you'd like

In many cases, the solution to these problems is to leave the original dataset's encoding behind and either use Xarray's default encoding (or the backend's default) or specify one's own encoding options. Both cases would benefit from a convenience method to reset the original encoding. Something like the following would serve this purpose:

```python
ds = xr.open_dataset(...).reset_encoding()
```
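
A minimal sketch of what such a method might do under the hood (illustrative only, not the proposed implementation):

```python
import xarray as xr

def reset_encoding(ds: xr.Dataset) -> xr.Dataset:
    # Hypothetical sketch: return a shallow copy with all encoding
    # cleared, on the dataset itself and on every variable/coordinate.
    ds = ds.copy()
    ds.encoding = {}
    for var in ds.variables.values():
        var.encoding = {}
    return ds
```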

Describe alternatives you've considered

Variations on the API above could also be considered:

```python
xr.open_dataset(..., keep_encoding=False)
```

or even:

```python
with xr.set_options(keep_encoding=False):
    ds = xr.open_dataset(...)
```

We can/should also do a better job of surfacing inconsistent encoding in our backends (e.g. to_netcdf).

Additional context

No response

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7686/reactions",
    "total_count": 2,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 2,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1558497871 I_kwDOAMm_X85c5MpP 7479 Use NumPy's SupportsDType jhamman 2443309 closed 0     0 2023-01-26T17:21:32Z 2023-02-28T23:23:47Z 2023-02-28T23:23:47Z MEMBER      

What is your issue?

Now that we've bumped our minimum NumPy version to 1.21, we can address this comment:

https://github.com/pydata/xarray/blob/b21f62ee37eea3650a58e9ffa3a7c9f4ae83006b/xarray/core/types.py#L57-L62

I decided not to tackle this as part of #7461 but we may be able to do something like this:

```python
from numpy.typing._dtype_like import _DTypeLikeNested, _ShapeLike, _SupportsDType
```
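
For context, a rough sketch of the kind of alias this would enable (these names are private NumPy internals, so treat them as assumptions that may change between releases):

```python
from typing import Union

import numpy as np
from numpy.typing._dtype_like import _SupportsDType

# Illustrative alias: a plain dtype, or any object exposing a `.dtype`
# attribute (the generic _SupportsDType protocol captures the latter).
DTypeLikeSave = Union[np.dtype, _SupportsDType[np.dtype]]
```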

xref: #6834 cc @headtr1ck

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7479/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1247014308 I_kwDOAMm_X85KU-2k 6634 Optionally include encoding in Dataset to_dict jhamman 2443309 closed 0     0 2022-05-24T19:10:01Z 2022-05-26T19:17:35Z 2022-05-26T19:17:35Z MEMBER      

Is your feature request related to a problem?

When using Xarray's to_dict methods to record a Dataset's schema, it would be useful to (optionally) include encoding in the output.

Describe the solution you'd like

The feature request could be resolved by simply adding an encoding keyword argument, which might look like this:

```python
ds = xr.Dataset(...)
ds.to_dict(data=False, encoding=True)
```

Describe alternatives you've considered

It is currently possible to manually extract encoding attributes, but this is a less desirable solution; a sketch follows below.
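
A minimal sketch of that manual extraction (the top-level 'encoding' key is our own convention here, not part of to_dict):

```python
import xarray as xr

# Hypothetical helper: collect per-variable encoding by hand and
# attach it next to the schema produced by to_dict(data=False).
def schema_with_encoding(ds: xr.Dataset) -> dict:
    schema = ds.to_dict(data=False)
    schema['encoding'] = {name: dict(var.encoding)
                          for name, var in ds.variables.items()}
    return schema
```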

xref: https://github.com/pangeo-forge/pangeo-forge-recipes/issues/256

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6634/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
636449225 MDU6SXNzdWU2MzY0NDkyMjU= 4139 [Feature request] Support file-like objects in open_rasterio jhamman 2443309 closed 0     2 2020-06-10T18:11:26Z 2022-04-19T17:15:21Z 2022-04-19T17:15:20Z MEMBER      

With some acrobatics, it is possible to open file-like objects with rasterio. It would be useful if xarray supported this workflow directly, particularly for working with cloud-optimized GeoTIFFs and fsspec.

MCVE Code Sample

```python
with open('my_data.tif', 'rb') as f:
    da = xr.open_rasterio(f)
```

Expected Output

A DataArray equivalent to xr.open_rasterio('my_data.tif').

Problem Description

We currently only allow str, rasterio.DatasetReader, or rasterio.WarpedVRT as inputs to open_rasterio.
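
For reference, the "acrobatics" mentioned above can look something like the sketch below (assuming rasterio's MemoryFile; the eager .load() is there to guard against lazy reads after the file is closed):

```python
import rasterio
import xarray as xr

with open('my_data.tif', 'rb') as f:
    with rasterio.io.MemoryFile(f.read()) as memfile:
        with memfile.open() as src:  # src is a rasterio DatasetReader
            da = xr.open_rasterio(src).load()
```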

Versions

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: 2a288f6ed4286910fcf3ab9895e1e9cbd44d30b4
python: 3.8.2 | packaged by conda-forge | (default, Apr 24 2020, 07:56:27) [Clang 9.0.1 ]
python-bits: 64
OS: Darwin
OS-release: 18.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: None
libnetcdf: None
xarray: 0.15.2.dev68+gb896a68f
pandas: 1.0.4
numpy: 1.18.5
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.1.5
cfgrib: None
iris: None
bottleneck: None
dask: 2.18.1
distributed: 2.18.0
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 46.1.3.post20200325
pip: 20.1
conda: None
pytest: 5.4.3
IPython: 7.13.0
sphinx: 3.0.3
```

xref: https://github.com/pangeo-data/pangeo-datastore/issues/109

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4139/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1108564253 I_kwDOAMm_X85CE1kd 6176 Xarray versioning to switch to CalVer jhamman 2443309 closed 0     10 2022-01-19T21:09:45Z 2022-03-03T04:32:10Z 2022-01-31T18:35:27Z MEMBER      

Xarray is planning to switch to Calendar versioning (calver). This issue serves as a general announcement.

The idea has come up in multiple developer meetings (#4001) and is part of a larger effort to increase our release cadence (#5927). Today's developer meeting included unanimous consent for the change. Other projects in Xarray's ecosystem have also made this change recently (e.g. https://github.com/dask/community/issues/100). While it is likely we will make this change in the next release or two, users and developers should feel free to voice objections here.

The proposed calver implementation follows the same schema as the Dask project, that is, YYYY.MM.X (4-digit year, two-digit month, one-digit zero-indexed micro version). For example, the code block below compares the current and proposed version tags:

```python
In [1]: import xarray as xr

# current
In [2]: xr.__version__
Out[2]: '0.19.1'

# proposed
In [2]: xr.__version__
Out[2]: '2022.01.0'
```

cc @pydata/xarray

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6176/reactions",
    "total_count": 6,
    "+1": 6,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
139064764 MDU6SXNzdWUxMzkwNjQ3NjQ= 787 Add Groupby and Rolling methods to docs jhamman 2443309 closed 0     2 2016-03-07T19:10:26Z 2021-11-08T19:51:00Z 2021-11-08T19:51:00Z MEMBER      

The injected apply/reduce methods for the Groupby and Rolling objects are not shown in the api documentation page. While there is obviously a fair bit of overlap with the similar DataArray/Dataset methods, it would help users to know what methods are available on the Groupby and Rolling objects if we explicitly listed them in the documentation. Suggestions on the best format to show these methods (e.g. Rolling.mean) are welcome.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/787/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
663968779 MDU6SXNzdWU2NjM5Njg3Nzk= 4253 [community] Backends refactor meeting jhamman 2443309 closed 0     13 2020-07-22T18:39:19Z 2021-03-11T20:42:33Z 2021-03-11T20:42:33Z MEMBER      

In today's dev call, we opted to schedule a separate meeting to discuss the backends refactor that BOpen (@alexamici and his team) is beginning to work on. This issue is meant to coordinate the scheduling of this meeting. To that end, I've created the following Doodle Poll to help choose a time: https://doodle.com/poll/4mtzxncka7gee4mq

Anyone from @pydata/xarray should feel free to join if there is interest. At a minimum, I'm hoping to have @alexamici, @aurghs, @shoyer, and @rabernat there.

Please respond to the poll by COB tomorrow so I can quickly get the meeting on the books. Thanks!

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4253/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
287223508 MDU6SXNzdWUyODcyMjM1MDg= 1815 apply_ufunc(dask='parallelized') with multiple outputs jhamman 2443309 closed 0     17 2018-01-09T20:40:52Z 2020-08-19T06:57:55Z 2020-08-19T06:57:55Z MEMBER      

I have an application where I'd like to use apply_ufunc with dask on a function that requires multiple inputs and outputs. This was left as a TODO item in #1517. However, it's not clear to me, looking at the code, how this can be done given the current form of dask's atop. I'm hoping @shoyer has already thought of a clever solution here...

Code Sample, a copy-pastable example if possible

```python
def func(foo, bar):
    assert foo.shape == bar.shape
    spam = np.zeros_like(bar)
    spam2 = np.full_like(bar, 2)

    return spam, spam2

foo = xr.DataArray(np.zeros((10, 10))).chunk()
bar = xr.DataArray(np.zeros((10, 10))).chunk() + 5

xrfunc = xr.apply_ufunc(func, foo, bar,
                        output_core_dims=[[], []],
                        dask='parallelized')
```

Problem description

This currently raises a NotImplementedError.

Expected Output

Multiple dask arrays. In my example above, two dask arrays.
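
One workaround to consider while this is unimplemented (a sketch, not a recommendation): pack the outputs along a new core dimension so apply_ufunc sees a single output, then split afterwards. The 'out' dimension name is our own invention here.

```python
# Hypothetical workaround: stack both outputs into one array along a
# trailing 'out' core dimension, which dask='parallelized' can handle.
def func_stacked(foo, bar):
    spam, spam2 = func(foo, bar)
    return np.stack([spam, spam2], axis=-1)

stacked = xr.apply_ufunc(func_stacked, foo, bar,
                         output_core_dims=[['out']],
                         dask='parallelized',
                         output_dtypes=[float],
                         output_sizes={'out': 2})
spam, spam2 = stacked.isel(out=0), stacked.isel(out=1)
```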

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.86+
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
xarray: 0.10.0+dev.c92020a
pandas: 0.22.0
numpy: 1.13.3
scipy: 1.0.0
netCDF4: 1.3.1
h5netcdf: 0.5.0
Nio: None
zarr: 2.2.0a2.dev176
bottleneck: 1.2.1
cyordereddict: None
dask: 0.16.0
distributed: 1.20.2+36.g7387410
matplotlib: 2.1.1
cartopy: None
seaborn: None
setuptools: 38.4.0
pip: 9.0.1
conda: 4.3.29
pytest: 3.3.2
IPython: 6.2.1
sphinx: None
```

cc @mrocklin, @arbennett

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1815/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
318988669 MDU6SXNzdWUzMTg5ODg2Njk= 2094 Drop win-32 platform CI from appveyor matrix? jhamman 2443309 closed 0     3 2018-04-30T18:29:17Z 2020-03-30T20:30:58Z 2020-03-24T03:41:24Z MEMBER      

Conda-forge has dropped support for 32-bit windows builds (https://github.com/conda-forge/cftime-feedstock/issues/2#issuecomment-385485144). Do we want to continue testing against this environment? The point becomes moot after #1876 gets wrapped up in ~7 months.

xref: https://github.com/pydata/xarray/pull/1252

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2094/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
578017585 MDU6SXNzdWU1NzgwMTc1ODU= 3851 Exposing Zarr backend internals as semi-public API jhamman 2443309 closed 0     3 2020-03-09T16:04:49Z 2020-03-27T22:37:26Z 2020-03-27T22:37:26Z MEMBER      

We recently built a prototype REST API for serving xarray datasets via a Fast-API application (see #3850 for more details). In the process of doing this, we needed to use a few internal functions in Xarray's Zarr backend:

```python
from xarray.backends.zarr import (
    _DIMENSION_KEY,
    _encode_zarr_attr_value,
    _extract_zarr_variable_encoding,
    encode_zarr_variable,
)
from xarray.core.pycompat import dask_array_type
from xarray.util.print_versions import get_sys_info, netcdf_and_hdf5_versions
```

Obviously, none of these imports are really meant for use outside of Xarray's backends so I'd like to discuss how we may go about exposing these functions (or variables) as semi-public (advanced use) API features. Thoughts?

cc @rabernat

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3851/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
197920258 MDU6SXNzdWUxOTc5MjAyNTg= 1188 Should we deprecate the compat and encoding constructor arguments? jhamman 2443309 closed 0     5 2016-12-28T21:41:26Z 2020-03-24T14:34:37Z 2020-03-24T14:34:37Z MEMBER      

In https://github.com/pydata/xarray/pull/1170#discussion_r94078121, @shoyer writes:

...I would consider deprecating the encoding argument to DataArray instead. It would also make sense to get rid of the compat argument to Dataset.

These extra arguments are not part of the fundamental xarray data model and thus are a little distracting, especially to new users.

@pydata/xarray and others, what do we think about deprecating the compat argument to the Dataset constructor and the encoding argument to the DataArray (and Dataset via #1170) constructors?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1188/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
508743579 MDU6SXNzdWU1MDg3NDM1Nzk= 3413 Can apply_ufunc be used on arrays with different dimension sizes jhamman 2443309 closed 0     2 2019-10-17T22:04:00Z 2019-12-11T22:32:23Z 2019-12-11T22:32:23Z MEMBER      

We have an application where we want to use apply_ufunc to apply a function that takes two 1-D arrays and returns a scalar value (basically a reduction over the only axis). We start with two DataArrays that share all the same dimensions - except for the lengths of the dimension we'll be reducing along (t in this case):

```python
def diff_mean(X, y):
    '''a function that only works on 1d arrays that are different lengths'''
    assert X.ndim == 1, X.ndim
    assert y.ndim == 1, y.ndim
    assert len(X) != len(y), X
    return X.mean() - y.mean()

X = np.random.random((10, 4, 5))
y = np.random.random((6, 4, 5))

Xda = xr.DataArray(X, dims=('t', 'x', 'y')).chunk({'t': -1, 'x': 2, 'y': 2})
yda = xr.DataArray(y, dims=('t', 'x', 'y')).chunk({'t': -1, 'x': 2, 'y': 2})
```

Then, we'd like to use apply_ufunc to apply our function (e.g. diff_mean):

```python
out = xr.apply_ufunc(
    diff_mean, Xda, yda,
    vectorize=True,
    dask="parallelized",
    output_dtypes=[np.float],
    input_core_dims=[['t'], ['t']],
)
```

This fails with an error when aligning the t dimensions:

```python-traceback
ValueError                                Traceback (most recent call last)
<ipython-input-4-e90cf6fba482> in <module>
      9     dask="parallelized",
     10     output_dtypes=[np.float],
---> 11     input_core_dims=[['t'], ['t']],
     12 )

~/miniconda3/envs/xarray-ml/lib/python3.7/site-packages/xarray/core/computation.py in apply_ufunc(func, input_core_dims, output_core_dims, exclude_dims, vectorize, join, dataset_join, dataset_fill_value, keep_attrs, kwargs, dask, output_dtypes, output_sizes, *args)
   1042             join=join,
   1043             exclude_dims=exclude_dims,
-> 1044             keep_attrs=keep_attrs
   1045         )
   1046     elif any(isinstance(a, Variable) for a in args):

~/miniconda3/envs/xarray-ml/lib/python3.7/site-packages/xarray/core/computation.py in apply_dataarray_vfunc(func, signature, join, exclude_dims, keep_attrs, *args)
    222     if len(args) > 1:
    223         args = deep_align(
--> 224             args, join=join, copy=False, exclude=exclude_dims, raise_on_invalid=False
    225         )
    226

~/miniconda3/envs/xarray-ml/lib/python3.7/site-packages/xarray/core/alignment.py in deep_align(objects, join, copy, indexes, exclude, raise_on_invalid, fill_value)
    403         indexes=indexes,
    404         exclude=exclude,
--> 405         fill_value=fill_value
    406     )
    407

~/miniconda3/envs/xarray-ml/lib/python3.7/site-packages/xarray/core/alignment.py in align(join, copy, indexes, exclude, fill_value, *objects)
    321             "arguments without labels along dimension %r cannot be "
    322             "aligned because they have different dimension sizes: %r"
--> 323             % (dim, sizes)
    324         )
    325

ValueError: arguments without labels along dimension 't' cannot be aligned because they have different dimension sizes: {10, 6}
```

https://nbviewer.jupyter.org/gist/jhamman/0e52d9bb29f679e26b0878c58bb813d2

I'm curious if this can be made to work with apply_ufunc or if we should pursue other options here. Advice and suggestions appreciated.
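
One possibility worth checking (a sketch; exclude_dims tells apply_ufunc to skip alignment and broadcasting for the named core dims, which is exactly the step that raises here):

```python
out = xr.apply_ufunc(
    diff_mean, Xda, yda,
    vectorize=True,
    dask="parallelized",
    output_dtypes=[np.float],
    input_core_dims=[['t'], ['t']],
    exclude_dims={'t'},  # don't try to align 't' across arguments
)
```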

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.3 | packaged by conda-forge | (default, Jul 1 2019, 14:38:56) [Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 18.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: None
libnetcdf: None
xarray: 0.14.0
pandas: 0.25.1
numpy: 1.17.1
scipy: 1.3.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.3.2
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.3.0
distributed: 2.3.2
matplotlib: 3.1.1
cartopy: None
seaborn: None
numbagg: None
setuptools: 41.2.0
pip: 19.2.3
conda: None
pytest: 5.0.1
IPython: 7.8.0
sphinx: 2.2.0
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3413/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
503700649 MDU6SXNzdWU1MDM3MDA2NDk= 3380 [Release] 0.14 jhamman 2443309 closed 0     19 2019-10-07T21:28:28Z 2019-10-15T01:08:11Z 2019-10-14T21:26:59Z MEMBER      

#3358 is going to make some fairly major changes to the minimum supported versions of required and optional dependencies. We also have a few bug fixes that have landed since releasing 0.13 that would be good to get out.

From what I can tell, the following pending PRs are close enough to get into this release:

  • [ ] ~tests for arrays with units #3238~
  • [x] map_blocks #3276
  • [x] Rolling minimum dependency versions policy #3358
  • [x] Remove all OrderedDict's (#3389)
  • [x] Speed up isel and __getitem__ #3375
  • [x] Fix concat bug when concatenating unlabeled dimensions. #3362
  • [ ] ~Add hypothesis test for netCDF4 roundtrip #3283~
  • [x] Fix groupby reduce for dataarray #3338
  • [x] Need a fix for https://github.com/pydata/xarray/issues/3377

Am I missing anything else that needs to get in?

I think we should aim to wrap this release up soon (this week). I can volunteer to go through the release steps once we're ready.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3380/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
297227247 MDU6SXNzdWUyOTcyMjcyNDc= 1910 Pynio tests are being skipped on TravisCI jhamman 2443309 closed 0     3 2018-02-14T20:03:31Z 2019-02-07T00:08:17Z 2019-02-07T00:08:17Z MEMBER      

Problem description

Currently on Travis, the Pynio tests are being skipped. The py27-cdat+iris+pynio build is supposed to run tests for each of these, but it is not.

https://travis-ci.org/pydata/xarray/jobs/341426116#L2429-L2518

I can't look at this right now in depth but I'm wondering if this is related to #1531.

reported by @WeatherGod

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1910/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
302930480 MDU6SXNzdWUzMDI5MzA0ODA= 1971 Should we be testing against multiple dask schedulers? jhamman 2443309 closed 0     5 2018-03-07T01:25:37Z 2019-01-13T20:58:21Z 2019-01-13T20:58:20Z MEMBER      

Almost all of our unit tests run against dask's default scheduler (usually dask.threaded). While the beauty of dask is that one can separate the scheduler from the logical implementation, there are a few idiosyncrasies to consider, particularly in xarray's backends. To that end, we have a few tests covering the integration of the distributed scheduler with xarray's backends, but the test coverage is not particularly complete.

If nothing more, I think it is worth considering tests that use the threaded, multiprocessing, and distributed schedulers for a larger subset of the backends tests (those that use dask); see the sketch below.
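
A minimal sketch of what that could look like for the single-machine schedulers (illustrative test and fixture names, using dask's config mechanism; the distributed scheduler would need its own fixture):

```python
import dask
import pytest

@pytest.mark.parametrize('scheduler', ['threads', 'processes', 'synchronous'])
def test_backend_roundtrip(scheduler, tmp_path):
    # Hypothetical test body: run the same backend roundtrip
    # under each single-machine scheduler.
    with dask.config.set(scheduler=scheduler):
        ...
```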

Note, I'm bringing this up because I'm seeing some failing tests in #1793 that are unrelated to my code change but do appear to be related to dask and possibly a different default scheduler (example failure).

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1971/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
293414745 MDU6SXNzdWUyOTM0MTQ3NDU= 1876 DEP: drop Python 2.7 support jhamman 2443309 closed 0     2 2018-02-01T06:11:07Z 2019-01-02T04:52:04Z 2019-01-02T04:52:04Z MEMBER      

The timeline for dropping Python 2.7 support for new Xarray releases is the end of 2018.

This issue can be used to track the necessary documentation and code changes to make that happen.

xref: #1830

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1876/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
323765896 MDU6SXNzdWUzMjM3NjU4OTY= 2142 add CFTimeIndex enabled date_range function jhamman 2443309 closed 0     1 2018-05-16T20:02:08Z 2018-09-19T20:24:40Z 2018-09-19T20:24:40Z MEMBER      

Pandas' date_range function is a fast and flexible way to create DatetimeIndex objects. Now that we have a functioning CFTimeIndex, it would be great to add a version of the date_range function that supports other calendars and dates out of range for Pandas.

Code Sample and expected output

```python
In [1]: import xarray as xr

In [2]: xr.date_range('2000-02-26', '2000-03-02')
Out[2]: DatetimeIndex(['2000-02-26', '2000-02-27', '2000-02-28', '2000-02-29',
                       '2000-03-01', '2000-03-02'],
                      dtype='datetime64[ns]', freq='D')

In [3]: xr.date_range('2000-02-26', '2000-03-02', calendar='noleap')
Out[3]: CFTimeIndex(['2000-02-26', '2000-02-27', '2000-02-28',
                     '2000-03-01', '2000-03-02'],
                    dtype='cftime.datetime', freq='D')
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2142/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
288465429 MDU6SXNzdWUyODg0NjU0Mjk= 1829 Drop support for Python 3.4 jhamman 2443309 closed 0   0.11 2856429 13 2018-01-15T02:38:19Z 2018-07-08T00:55:32Z 2018-07-08T00:55:32Z MEMBER      

Python 3.7-final is due out in June (PEP 537). When do we want to deprecate 3.4, and when should we drop support altogether? @maxim-lian brought this up in a PR he's working on: https://github.com/pydata/xarray/pull/1828#issuecomment-357562144.

For reference, we dropped Python 3.3 in #1175 (12/20/2016).

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1829/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
327893262 MDU6SXNzdWUzMjc4OTMyNjI= 2203 Update minimum version of dask jhamman 2443309 closed 0     6 2018-05-30T20:47:57Z 2018-07-08T00:55:32Z 2018-07-08T00:55:32Z MEMBER      

Xarray currently states that it supports dask version 0.9 and later. However, 1) I don't think this is true and my quick test shows that some of our tests fail using dask 0.9, and 2) we have a growing number of tests that are being skipped for older dask versions:

```
$ grep -irn "dask.__version__" xarray/tests/*py
xarray/tests/__init__.py:90:    if LooseVersion(dask.__version__) < '0.18':
xarray/tests/test_computation.py:755:    if LooseVersion(dask.__version__) < LooseVersion('0.17.3'):
xarray/tests/test_computation.py:841:    if not use_dask or LooseVersion(dask.__version__) > LooseVersion('0.17.4'):
xarray/tests/test_dask.py:211:    @pytest.mark.skipif(LooseVersion(dask.__version__) <= '0.15.4',
xarray/tests/test_dask.py:223:    @pytest.mark.skipif(LooseVersion(dask.__version__) <= '0.15.4',
xarray/tests/test_dask.py:284:    @pytest.mark.skipif(LooseVersion(dask.__version__) <= '0.15.4',
xarray/tests/test_dask.py:296:    @pytest.mark.skipif(LooseVersion(dask.__version__) <= '0.15.4',
xarray/tests/test_dask.py:387:    if LooseVersion(dask.__version__) == LooseVersion('0.15.3'):
xarray/tests/test_dask.py:784:    pytest.mark.skipif(LooseVersion(dask.__version__) <= '0.15.4',
xarray/tests/test_dask.py:802:    pytest.mark.skipif(LooseVersion(dask.__version__) <= '0.15.4',
xarray/tests/test_dask.py:818:@pytest.mark.skipif(LooseVersion(dask.__version__) <= '0.15.4',
xarray/tests/test_variable.py:1664:    if LooseVersion(dask.__version__) <= LooseVersion('0.15.1'):
xarray/tests/test_variable.py:1670:    if LooseVersion(dask.__version__) <= LooseVersion('0.15.1'):
```

I'd like to see xarray bump the minimum version number of dask to something around 0.15.4 (Oct. 2017) or 0.16 (Nov. 2017).

cc @mrocklin, @pydata/xarray

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2203/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
327875183 MDU6SXNzdWUzMjc4NzUxODM= 2200 DEPS: drop numpy < 1.12 jhamman 2443309 closed 0     0 2018-05-30T19:52:40Z 2018-07-08T00:55:31Z 2018-07-08T00:55:31Z MEMBER      

Pandas is dropping Numpy 1.11 and earlier in their 0.24 release. It is probably easiest to follow suit with xarray.

xref: https://github.com/pandas-dev/pandas/issues/21242

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2200/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
331415995 MDU6SXNzdWUzMzE0MTU5OTU= 2225 Zarr Backend: check for non-uniform chunks is too strict jhamman 2443309 closed 0     3 2018-06-12T02:36:05Z 2018-06-13T05:51:36Z 2018-06-13T05:51:36Z MEMBER      

I think the following block of code is more strict than either dask or zarr requires:

https://github.com/pydata/xarray/blob/6c3abedf906482111b06207b9016ea8493c42713/xarray/backends/zarr.py#L80-L89

It should be possible to have uneven chunks in the last position of multiple dimensions in a zarr dataset.

Code Sample, a copy-pastable example if possible

```python
In [1]: import xarray as xr

In [2]: import dask.array as dsa

In [3]: da = xr.DataArray(dsa.random.random((8, 7, 11), chunks=(3, 3, 3)),
   ...:                   dims=('x', 'y', 't'))

In [4]: da
Out[4]:
<xarray.DataArray 'da.random.random_sample-1aed3ea2f9dd784ec947cb119459fa56' (x: 8, y: 7, t: 11)>
dask.array<shape=(8, 7, 11), dtype=float64, chunksize=(3, 3, 3)>
Dimensions without coordinates: x, y, t

In [5]: da.data.chunks
Out[5]: ((3, 3, 2), (3, 3, 1), (3, 3, 3, 2))

In [6]: da.to_dataset('varname').to_zarr('/Users/jhamman/workdir/test_chunks.zarr')
/Users/jhamman/anaconda/bin/ipython:1: FutureWarning: the order of the arguments on DataArray.to_dataset has changed; you now need to supply name as a keyword argument
  #!/Users/jhamman/anaconda/bin/python

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-7-32fa9a7d0276> in <module>()
----> 1 da.to_dataset('varname').to_zarr('/Users/jhamman/workdir/test_chunks.zarr')

~/anaconda/lib/python3.6/site-packages/xarray/core/dataset.py in to_zarr(self, store, mode, synchronizer, group, encoding, compute)
   1185         from ..backends.api import to_zarr
   1186         return to_zarr(self, store=store, mode=mode, synchronizer=synchronizer,
-> 1187                        group=group, encoding=encoding, compute=compute)
   1188
   1189     def __unicode__(self):

~/anaconda/lib/python3.6/site-packages/xarray/backends/api.py in to_zarr(dataset, store, mode, synchronizer, group, encoding, compute)
    856     # I think zarr stores should always be sync'd immediately
    857     # TODO: figure out how to properly handle unlimited_dims
--> 858     dataset.dump_to_store(store, sync=True, encoding=encoding, compute=compute)
    859
    860     if not compute:

~/anaconda/lib/python3.6/site-packages/xarray/core/dataset.py in dump_to_store(self, store, encoder, sync, encoding, unlimited_dims, compute)
   1073
   1074         store.store(variables, attrs, check_encoding,
-> 1075                     unlimited_dims=unlimited_dims)
   1076         if sync:
   1077             store.sync(compute=compute)

~/anaconda/lib/python3.6/site-packages/xarray/backends/zarr.py in store(self, variables, attributes, *args, **kwargs)
    341     def store(self, variables, attributes, *args, **kwargs):
    342         AbstractWritableDataStore.store(self, variables, attributes,
--> 343                                         *args, **kwargs)
    344
    345     def sync(self, compute=True):

~/anaconda/lib/python3.6/site-packages/xarray/backends/common.py in store(self, variables, attributes, check_encoding_set, unlimited_dims)
    366         self.set_dimensions(variables, unlimited_dims=unlimited_dims)
    367         self.set_variables(variables, check_encoding_set,
--> 368                            unlimited_dims=unlimited_dims)
    369
    370     def set_attributes(self, attributes):

~/anaconda/lib/python3.6/site-packages/xarray/backends/common.py in set_variables(self, variables, check_encoding_set, unlimited_dims)
    403             check = vn in check_encoding_set
    404             target, source = self.prepare_variable(
--> 405                 name, v, check, unlimited_dims=unlimited_dims)
    406
    407             self.writer.add(source, target)

~/anaconda/lib/python3.6/site-packages/xarray/backends/zarr.py in prepare_variable(self, name, variable, check_encoding, unlimited_dims)
    325
    326         encoding = _extract_zarr_variable_encoding(
--> 327             variable, raise_on_invalid=check_encoding)
    328
    329         encoded_attrs = OrderedDict()

~/anaconda/lib/python3.6/site-packages/xarray/backends/zarr.py in _extract_zarr_variable_encoding(variable, raise_on_invalid)
    181
    182     chunks = _determine_zarr_chunks(encoding.get('chunks'), variable.chunks,
--> 183                                     variable.ndim)
    184     encoding['chunks'] = chunks
    185     return encoding

~/anaconda/lib/python3.6/site-packages/xarray/backends/zarr.py in _determine_zarr_chunks(enc_chunks, var_chunks, ndim)
     87             "Zarr requires uniform chunk sizes excpet for final chunk."
     88             " Variable %r has incompatible chunks. Consider "
---> 89             "rechunking using chunk()." % (var_chunks,))
     90     # last chunk is allowed to be smaller
     91     last_var_chunk = all_var_chunks[-1]

ValueError: Zarr requires uniform chunk sizes excpet for final chunk. Variable ((3, 3, 2), (3, 3, 1), (3, 3, 3, 2)) has incompatible chunks. Consider rechunking using chunk().
```


Expected Output

IIUC, Zarr allows multiple dims to have uneven chunks, so long as they are all in the last position:

```python
In [9]: import zarr

In [10]: z = zarr.zeros((8, 7, 11), chunks=(3, 3, 3), dtype='i4')

In [11]: z.chunks
Out[11]: (3, 3, 3)
```

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Darwin
OS-release: 17.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
xarray: 0.10.7
pandas: 0.22.0
numpy: 1.14.3
scipy: 1.1.0
netCDF4: 1.3.1
h5netcdf: 0.5.1
h5py: 2.7.1
Nio: None
zarr: 2.2.0
bottleneck: 1.2.1
cyordereddict: None
dask: 0.17.2
distributed: 1.21.6
matplotlib: 2.2.2
cartopy: 0.16.0
seaborn: 0.8.1
setuptools: 39.0.1
pip: 9.0.3
conda: 4.5.4
pytest: 3.5.1
IPython: 6.3.1
sphinx: 1.7.4
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2225/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
322445312 MDU6SXNzdWUzMjI0NDUzMTI= 2121 rasterio backend should use DataStorePickleMixin (or something similar) jhamman 2443309 closed 0     2 2018-05-11T21:51:59Z 2018-06-07T18:02:56Z 2018-06-07T18:02:56Z MEMBER      

Code Sample, a copy-pastable example if possible

```python
In [1]: import xarray as xr

In [2]: ds = xr.open_rasterio('RGB.byte.tif')

In [3]: ds
Out[3]:
<xarray.DataArray (band: 3, y: 718, x: 791)>
[1703814 values with dtype=uint8]
Coordinates:
  * band     (band) int64 1 2 3
  * y        (y) float64 2.827e+06 2.826e+06 2.826e+06 2.826e+06 2.826e+06 ...
  * x        (x) float64 1.021e+05 1.024e+05 1.027e+05 1.03e+05 1.033e+05 ...
Attributes:
    transform:   (101985.0, 300.0379266750948, 0.0, 2826915.0, 0.0, -300.0417...
    crs:         +init=epsg:32618
    res:         (300.0379266750948, 300.041782729805)
    is_tiled:    0
    nodatavals:  (0.0, 0.0, 0.0)

In [4]: import pickle

In [5]: pickle.dumps(ds)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-5-a165c2473431> in <module>()
----> 1 pickle.dumps(ds)

TypeError: can't pickle rasterio._io.RasterReader objects
```

Problem description

Originally reported by @rsignell-usgs in https://github.com/pangeo-data/pangeo/issues/249#issuecomment-388445370, the rasterio backend is not pickle-able. This obviously causes problems when using dask-distributed. We probably need to use DataStorePickleMixin or something similar on rasterio datasets to allow multiple readers of the same dataset.
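
Roughly the shape of the fix (a hypothetical sketch of the DataStorePickleMixin-style approach: drop the unpicklable handle on pickling and reopen it on unpickling; the class and attribute names are illustrative):

```python
import rasterio

class PickleableRasterio:
    def __init__(self, filename):
        self.filename = filename
        self.riods = rasterio.open(filename, mode='r')

    def __getstate__(self):
        state = self.__dict__.copy()
        del state['riods']  # the rasterio reader can't be pickled
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.riods = rasterio.open(self.filename, mode='r')
```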

Expected Output

```python
pickle.dumps(ds)
```

returns a pickled dataset.

Output of xr.show_versions()

```
/Users/jhamman/anaconda/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Darwin
OS-release: 17.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
xarray: 0.10.3
pandas: 0.22.0
numpy: 1.14.2
scipy: 1.0.1
netCDF4: 1.3.1
h5netcdf: 0.5.1
h5py: 2.7.1
Nio: None
zarr: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.17.2
distributed: 1.21.6
matplotlib: 2.2.2
cartopy: 0.16.0
seaborn: 0.8.1
setuptools: 39.0.1
pip: 9.0.3
conda: 4.5.1
pytest: 3.5.1
IPython: 6.3.1
sphinx: 1.7.4
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2121/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
304201107 MDU6SXNzdWUzMDQyMDExMDc= 1981 use dask to open datasets in parallel jhamman 2443309 closed 0     5 2018-03-11T22:33:52Z 2018-04-20T12:04:23Z 2018-04-20T12:04:23Z MEMBER      

Code Sample, a copy-pastable example if possible

```python
xr.open_mfdataset('path/to/many/files*.nc', method='parallel')
```

Problem description

We have many issues describing the less than stellar performance of open_mfdataset (e.g. #511, #893, #1385, #1788, #1823). The problem can be broken into three pieces: 1) open each file, 2) decode/preprocess each dataset, and 3) merge/combine/concat the collection of datasets. We can perform (1) and (2) in parallel (performance improvements to (3) would be a separate task). Lately, I'm finding that for large numbers of files, it can take many seconds to many minutes just to open all the files in a multi-file dataset of mine.

I'm proposing that we use something like dask.bag to parallelize steps (1) and (2). I've played around with this a bit and it "works" almost right out of the box, provided you are using the "autoclose=True" option. A concrete example:

We could change the line:

```python
datasets = [open_dataset(p, **open_kwargs) for p in paths]
```

to:

```python
import dask.bag as db

paths_bag = db.from_sequence(paths)
datasets = paths_bag.map(open_dataset, **open_kwargs).compute()
```

I'm curious what others think of this idea and what the potential downfalls may be.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1981/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
295621576 MDU6SXNzdWUyOTU2MjE1NzY= 1897 Vectorized indexing with cache=False jhamman 2443309 closed 0     5 2018-02-08T18:38:18Z 2018-03-06T22:00:57Z 2018-03-06T22:00:57Z MEMBER      

Code Sample, a copy-pastable example if possible

```python
import numpy as np
import xarray as xr

n_times = 4; n_lats = 10; n_lons = 15
n_points = 4

ds = xr.Dataset({'test_var': (['time', 'latitude', 'longitude'],
                              np.random.random((n_times, n_lats, n_lons)))})
ds.to_netcdf('test.nc')

rand_lons = xr.Variable('points', np.random.randint(0, high=n_lons, size=n_points))
rand_lats = xr.Variable('points', np.random.randint(0, high=n_lats, size=n_points))

ds = xr.open_dataset('test.nc', cache=False)
points = ds['test_var'][:, rand_lats, rand_lons]
```

yields:

```python-traceback
NotImplementedError                       Traceback (most recent call last)
<ipython-input-7-f16e4cae9456> in <module>()
     12
     13 ds = xr.open_dataset('test.nc', cache=False)
---> 14 points = ds['test_var'][:, rand_lats, rand_lons]

~/anaconda/envs/pangeo/lib/python3.6/site-packages/xarray/core/dataarray.py in __getitem__(self, key)
    478         else:
    479             # xarray-style array indexing
--> 480             return self.isel(**self._item_key_to_dict(key))
    481
    482     def __setitem__(self, key, value):

~/anaconda/envs/pangeo/lib/python3.6/site-packages/xarray/core/dataarray.py in isel(self, drop, **indexers)
    759         DataArray.sel
    760         """
--> 761         ds = self._to_temp_dataset().isel(drop=drop, **indexers)
    762         return self._from_temp_dataset(ds)
    763

~/anaconda/envs/pangeo/lib/python3.6/site-packages/xarray/core/dataset.py in isel(self, drop, **indexers)
   1390         for name, var in iteritems(self._variables):
   1391             var_indexers = {k: v for k, v in indexers_list if k in var.dims}
-> 1392             new_var = var.isel(**var_indexers)
   1393             if not (drop and name in var_indexers):
   1394                 variables[name] = new_var

~/anaconda/envs/pangeo/lib/python3.6/site-packages/xarray/core/variable.py in isel(self, **indexers)
    851             if dim in indexers:
    852                 key[i] = indexers[dim]
--> 853         return self[tuple(key)]
    854
    855     def squeeze(self, dim=None):

~/anaconda/envs/pangeo/lib/python3.6/site-packages/xarray/core/variable.py in __getitem__(self, key)
    620         """
    621         dims, indexer, new_order = self._broadcast_indexes(key)
--> 622         data = as_indexable(self._data)[indexer]
    623         if new_order:
    624             data = np.moveaxis(data, range(len(new_order)), new_order)

~/anaconda/envs/pangeo/lib/python3.6/site-packages/xarray/core/indexing.py in __getitem__(self, key)
    554
    555     def __getitem__(self, key):
--> 556         return type(self)(_wrap_numpy_scalars(self.array[key]))
    557
    558     def __setitem__(self, key, value):

~/anaconda/envs/pangeo/lib/python3.6/site-packages/xarray/core/indexing.py in __getitem__(self, indexer)
    521
    522     def __getitem__(self, indexer):
--> 523         return type(self)(self.array, self._updated_key(indexer))
    524
    525     def __setitem__(self, key, value):

~/anaconda/envs/pangeo/lib/python3.6/site-packages/xarray/core/indexing.py in _updated_key(self, new_key)
    491                 'Vectorized indexing for {} is not implemented. Load your '
    492                 'data first with .load() or .compute(), or disable caching by '
--> 493                 'setting cache=False in open_dataset.'.format(type(self)))
    494
    495         iter_new_key = iter(expanded_indexer(new_key.tuple, self.ndim))

NotImplementedError: Vectorized indexing for <class 'xarray.core.indexing.LazilyIndexedArray'> is not implemented. Load your data first with .load() or .compute(), or disable caching by setting cache=False in open_dataset.
```

Problem description

Raising a NotImplementedError here is fine, but it instructs the user to "disable caching by setting cache=False in open_dataset", which I've already done. So my questions are: 1) should we expect this to work, and 2) if not

Expected Output

Ideally, we can get the same behavior as:

```python
ds = xr.open_dataset('test2.nc', cache=False).load()
points = ds['test_var'][:, rand_lats, rand_lons]

<xarray.DataArray 'test_var' (time: 4, points: 4)>
array([[0.939469, 0.406885, 0.939469, 0.759075],
       [0.470116, 0.585546, 0.470116, 0.37833 ],
       [0.274321, 0.648218, 0.274321, 0.383391],
       [0.754121, 0.078878, 0.754121, 0.903788]])
Dimensions without coordinates: time, points
```

without needing to use .load()

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Linux
OS-release: 3.10.0-693.5.2.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
xarray: 0.10.0+dev55.g1d32399
pandas: 0.22.0
numpy: 1.14.0
scipy: 1.0.0
netCDF4: 1.3.1
h5netcdf: 0.5.0
h5py: 2.7.1
Nio: None
zarr: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.16.1
distributed: 1.20.2
matplotlib: 2.1.2
cartopy: 0.15.1
seaborn: 0.8.1
setuptools: 38.4.0
pip: 9.0.1
conda: None
pytest: 3.4.0
IPython: 6.2.1
sphinx: None
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1897/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
287852184 MDU6SXNzdWUyODc4NTIxODQ= 1821 v0.10.1 Release jhamman 2443309 closed 0   0.10.3 3008859 11 2018-01-11T16:56:08Z 2018-02-26T23:20:45Z 2018-02-26T01:48:32Z MEMBER      

We're close to a minor/bug-fix release (0.10.1). What do we need to get done before that can happen?

  • [x] #1800 Performance improvements to Zarr (@jhamman)
  • [ ] #1793 Fix for to_netcdf writes with dask-distributed (@jhamman, could use help)
  • [x] #1819 Normalisation for RGB imshow

Help wanted / bugs that no-one is working on:

  • [ ] #1792 Comparison to masked numpy arrays
  • [ ] #1764 groupby_bins fails for empty bins

What else?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1821/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
113497063 MDU6SXNzdWUxMTM0OTcwNjM= 640 Use pytest to simplify unit tests jhamman 2443309 closed 0     2 2015-10-27T03:06:48Z 2018-02-05T21:00:02Z 2018-02-05T21:00:02Z MEMBER      

xray's unit testing system uses Python's standard unittest framework. pytest offers a more flexible framework requiring less boilerplate code. I recently (#638) introduced pytest into xray's CI builds. This issue proposes incrementally migrating and simplifying xray's unit testing framework to pytest.
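
As a small illustration of the boilerplate reduction (a generic example, not taken from the xray test suite):

```python
# unittest style: a TestCase subclass with assert* methods
import unittest

class TestMath(unittest.TestCase):
    def test_add(self):
        self.assertEqual(1 + 1, 2)

# pytest style: a plain function and a plain assert
def test_add():
    assert 1 + 1 == 2
```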

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/640/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
288466108 MDU6SXNzdWUyODg0NjYxMDg= 1830 Drop support for Python 2 jhamman 2443309 closed 0     7 2018-01-15T02:44:15Z 2018-02-01T06:04:08Z 2018-02-01T06:04:08Z MEMBER      

When do we want to drop Python 2 support for Xarray? For reference, Pandas has a stated drop date for Python 2 of the end of 2018 (this year), and Numpy's is slightly later, with an incremental deprecation that is final on Jan. 1, 2020.

We may also consider signing this pledge to help make it clear when/why we're dropping Python 2 support: http://www.python3statement.org/

xref: https://github.com/pandas-dev/pandas/issues/18894, https://github.com/numpy/numpy/pull/10006, https://github.com/python3statement/python3statement.github.io/issues/11

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1830/reactions",
    "total_count": 5,
    "+1": 4,
    "-1": 0,
    "laugh": 0,
    "hooray": 1,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
287186057 MDU6SXNzdWUyODcxODYwNTc= 1813 Test Failure: test_datetime_line_plot jhamman 2443309 closed 0     3 2018-01-09T18:29:35Z 2018-01-10T07:13:53Z 2018-01-10T07:13:53Z MEMBER      

We're getting a single test failure in the plot tests on master (link to travis failure). I haven't been able to reproduce this locally yet, so I'm just going to post here to see if anyone has any ideas.

Code Sample

```
_________________ TestDatetimePlot.test_datetime_line_plot _________________

self = <xarray.tests.test_plot.TestDatetimePlot testMethod=test_datetime_line_plot>

    def test_datetime_line_plot(self):
        # test if line plot raises no Exception
>       self.darray.plot.line()

xarray/tests/test_plot.py:1333:
xarray/plot/plot.py:328: in line
    return line(self._da, *args, **kwargs)
xarray/plot/plot.py:223: in line
    _ensure_plottable(x)

args = (<xarray.DataArray 'time' (time: 12)>
array([datetime.datetime(2017, 1, 1, 0, 0), datetime.datetime(2017, 2, 1,
       ... 12, 1, 0, 0)], dtype=object)
Coordinates:
  * time     (time) object 2017-01-01 2017-02-01 2017-03-01 2017-04-01 ...,)
numpy_types = [<class 'numpy.floating'>, <class 'numpy.integer'>, <class 'numpy.timedelta64'>, <class 'numpy.datetime64'>]
other_types = [<class 'datetime.datetime'>]
x = <xarray.DataArray 'time' (time: 12)>
array([datetime.datetime(2017, 1, 1, 0, 0), datetime.datetime(2017, 2, 1, ...7, 12, 1, 0, 0)], dtype=object)
Coordinates:
  * time     (time) object 2017-01-01 2017-02-01 2017-03-01 2017-04-01 ...

    def _ensure_plottable(*args):
        """
        Raise exception if there is anything in args that can't be plotted on
        an axis.
        """
        numpy_types = [np.floating, np.integer, np.timedelta64, np.datetime64]
        other_types = [datetime]

        for x in args:
            if not (_valid_numpy_subdtype(np.array(x), numpy_types)
                    or _valid_other_type(np.array(x), other_types)):
>               raise TypeError('Plotting requires coordinates to be numeric '
                                'or dates.')
E               TypeError: Plotting requires coordinates to be numeric or dates.

xarray/plot/plot.py:57: TypeError
```

Expected Output

This test was previously passing.

Output of xr.show_versions()

https://travis-ci.org/pydata/xarray/jobs/326640013#L1262

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1813/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
265056503 MDU6SXNzdWUyNjUwNTY1MDM= 1631 Resample / upsample behavior diverges from pandas jhamman 2443309 closed 0     5 2017-10-12T19:22:44Z 2017-12-30T06:21:42Z 2017-12-30T06:21:42Z MEMBER      

I've found a few issues where xarray's new resample / upsample functionality diverges from Pandas. I think they mostly surround how NaNs are treated. Thoughts from @shoyer, @darothen, and others are welcome.

Gist with all the juicy details: https://gist.github.com/jhamman/354f0e5ff32a39550ffd25800e7214fc#file-xarray_resample-ipynb

xref: #1608, #1272

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1631/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
283984555 MDU6SXNzdWUyODM5ODQ1NTU= 1798 BUG: set_variables in backends.commons loads target dataset jhamman 2443309 closed 0     1 2017-12-21T19:43:05Z 2017-12-28T05:40:17Z 2017-12-28T05:40:17Z MEMBER      

Problem description

In #1609 we (I) implemented a fix for appending to datasets with existing variables. In doing so, it looks like I added a regression wherein the variables property on the AbstractWritableDataStore is repeatedly queried. This property calls .load() on the underlying dataset.

This was discovered while diagnosing some problems with the zarr backend (#1770, https://github.com/pangeo-data/pangeo/issues/48#issuecomment-353223737).

I have a potential fix for this that I will post once the tests pass.

cc @rabernat, @mrocklin

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: 20f957db105a9348b0f7d2dac076c17c31cbccee
python: 3.6.0.final.0
python-bits: 64
OS: Darwin
OS-release: 17.3.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
xarray: 0.10.0+dev18.g4a9c1e3
pandas: 0.21.0
numpy: 1.13.3
scipy: 0.19.1
netCDF4: 1.3.0
h5netcdf: 0.5.0
Nio: None
zarr: 2.1.4
bottleneck: 1.2.1
cyordereddict: None
dask: 0.15.4
distributed: 1.19.3
matplotlib: 2.0.2
cartopy: 0.15.1
seaborn: 0.8.1
setuptools: 33.1.0.post20170122
pip: 9.0.1
conda: None
pytest: 3.2.3
IPython: 5.2.2
sphinx: 1.6.3
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1798/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
279958650 MDU6SXNzdWUyNzk5NTg2NTA= 1766 Pandas has deprecated the TimeGrouper jhamman 2443309 closed 0     0 2017-12-07T00:40:11Z 2017-12-07T01:33:29Z 2017-12-07T01:33:29Z MEMBER      

Code Sample, a copy-pastable example if possible

```python
da.resample(time='MS').sum('time')
```

Problem description

Pandas has deprecated the TimeGrouper class (https://github.com/pandas-dev/pandas/issues/16747) and that warning has started popping out during xarray resample operations. We can make this go away quite easily. (I'll submit a PR shortly).

Output of xr.show_versions()

```
In [2]: xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.0.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
xarray: 0.9.6-75-g246c352
pandas: 0.21.0
numpy: 1.13.3
scipy: 0.19.1
netCDF4: 1.3.0
h5netcdf: 0.5.0
Nio: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.15.4
matplotlib: 2.0.2
cartopy: 0.15.1
seaborn: 0.8.1
setuptools: 33.1.0.post20170122
pip: 9.0.1
conda: None
pytest: 3.2.3
IPython: 5.2.2
sphinx: 1.6.3
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1766/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
253463226 MDU6SXNzdWUyNTM0NjMyMjY= 1535 v0.10 Release jhamman 2443309 closed 0     18 2017-08-28T21:31:43Z 2017-11-20T20:13:52Z 2017-11-20T17:27:24Z MEMBER      

I'd like to issue the v0.10 release within the next few weeks, after merging the following PRs:

Features

  • [x] #1272 Groupby-like API for resampling (@darothen)
  • [x] #1473 Indexing with broadcasting (@fujiisoup, @shoyer)
  • [x] #1489 to_dask_dataframe() (@jmunroe)
  • [x] #1508 Support using opened netCDF4.Dataset (@dopplershift)
  • [x] #1514 Add pathlib.Path support to open_(mf)dataset (@willirath)
  • [x] #1543 pass dask compute/persist args through from load/compute/perist (@jhamman)

Bug Fixes

  • [x] #1532 Avoid computing dask variables on __repr__ and __getattr__ (@crusaderky)
  • [x] #1542 Pandas dev test failures (@shoyer)
  • [x] #1538 Disallow improper DataArray construction (@jhamman)

Misc

  • [x] #1485 xr.show_versions() (@jhamman)
  • [x] #1530 Deprecate old pandas support (@fujiisoup)
  • [x] #1539 Remove support for dataset construction w/o dims. (@jhamman)

TODO

  • [x] #1333 Deprecate indexing with non-aligned DataArray objects

Let me know if there's anything else critical to get in.

CC @pydata/xarray

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1535/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
267354113 MDU6SXNzdWUyNjczNTQxMTM= 1644 Formalize contract between XArray and the dask.distributed scheduler jhamman 2443309 closed 0     1 2017-10-21T06:09:22Z 2017-11-14T23:40:06Z 2017-11-14T23:40:06Z MEMBER      

From @mrocklin in https://github.com/pangeo-data/pangeo/issues/5#issue-255329911:

XArray was designed long before the dask.distributed task scheduler. As a result newer ways of doing things, like asynchronous computing, persist, etc. either don't function well, or were hacked on in a less-than-optimal-way. We should improve this relationship so that XArray can take advantage of newer dask.distributed features today and also adhere to contracts so that it benefits from changes in the future.

There is conversation towards the end of dask/dask#1068 about what such a contract might look like. I think that @jcrist is planning to work on this on the Dask side some time in the next week or two.

There is a new "Dask Collection Interface" implemented in https://github.com/dask/dask/pull/2748 (and documented in the dask docs).

I'm creating this issue here (in addition to https://github.com/pangeo-data/pangeo/issues/5) to track design considerations on the xarray side and to get input from the @pydata/xarray team.

cc @mrocklin, @shoyer, @jcrist, @rabernat

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1644/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
270808895 MDU6SXNzdWUyNzA4MDg4OTU= 1684 Dask arrays and DataArray coords that share name with dimensions jhamman 2443309 closed 0     3 2017-11-02T21:11:58Z 2017-11-05T01:29:45Z 2017-11-05T01:29:45Z MEMBER      

First reported by @mrocklin in here.

```python
In [1]: import xarray

In [2]: import dask.array as da

In [3]: coord = da.arange(8, chunks=(4,))
   ...: data = da.random.random((8, 8), chunks=(4, 4)) + 1
   ...: array = xarray.DataArray(data,
   ...:                          coords={'x': coord, 'y': coord},
   ...:                          dims=['x', 'y'])

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3-b90a33ebf436> in <module>()
      3 array = xarray.DataArray(data,
      4                          coords={'x': coord, 'y': coord},
----> 5                          dims=['x', 'y'])

/home/mrocklin/workspace/xarray/xarray/core/dataarray.py in __init__(self, data, coords, dims, name, attrs, encoding, fastpath)
    227
    228             data = as_compatible_data(data)
--> 229             coords, dims = _infer_coords_and_dims(data.shape, coords, dims)
    230             variable = Variable(dims, data, attrs, encoding, fastpath=True)
    231

/home/mrocklin/workspace/xarray/xarray/core/dataarray.py in _infer_coords_and_dims(shape, coords, dims)
     68     if utils.is_dict_like(coords):
     69         for k, v in coords.items():
---> 70             new_coords[k] = as_variable(v, name=k)
     71     elif coords is not None:
     72         for dim, coord in zip(dims, coords):

/home/mrocklin/workspace/xarray/xarray/core/variable.py in as_variable(obj, name)
     94                             '{}'.format(obj))
     95     elif utils.is_scalar(obj):
---> 96         obj = Variable([], obj)
     97     elif getattr(obj, 'name', None) is not None:
     98         obj = Variable(obj.name, obj)

/home/mrocklin/workspace/xarray/xarray/core/variable.py in __init__(self, dims, data, attrs, encoding, fastpath)
    275         """
    276         self._data = as_compatible_data(data, fastpath=fastpath)
--> 277         self._dims = self._parse_dimensions(dims)
    278         self._attrs = None
    279         self._encoding = None

/home/mrocklin/workspace/xarray/xarray/core/variable.py in _parse_dimensions(self, dims)
    439             raise ValueError('dimensions %s must have the same length as the '
    440                              'number of data dimensions, ndim=%s'
--> 441                              % (dims, self.ndim))
    442         return dims
    443

ValueError: dimensions () must have the same length as the number of data dimensions, ndim=1
```

or a similar setup that computes the coordinates immediately

```python
In [18]: x = xr.Variable('x', da.arange(8, chunks=(4,)))
    ...: y = xr.Variable('y', da.arange(8, chunks=(4,)) * 2)
    ...: data = da.random.random((8, 8), chunks=(4, 4)) + 1
    ...: array = xr.DataArray(data, dims=['x', 'y'])
    ...: array.coords['x'] = x
    ...: array.coords['y'] = y

In [19]: array
Out[19]:
<xarray.DataArray 'add-7d8ed340e5dd8fe107ea681573c72e87' (x: 8, y: 8)>
dask.array<shape=(8, 8), dtype=float64, chunksize=(4, 4)>
Coordinates:
  * x        (x) int64 0 1 2 3 4 5 6 7
  * y        (y) int64 0 2 4 6 8 10 12 14
```

Problem description

I think we have two, possibly related, problems with using dask arrays as DataArray coordinates.

  1. As the first snippet shows, the constructor fails when coordinates are specified as raw dask arrays. This does not occur when coord is a numpy array.
  2. When coordinates are specified as dask arrays via the coords attribute, they are computed immediately.
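For what it's worth, a workaround sketch for problem 1: wrapping each coordinate in an explicit 1-d Variable sidesteps the scalar misclassification in the constructor (though, per problem 2, the values may still end up computed eagerly):

```python
import dask.array as da
import xarray

coord = da.arange(8, chunks=(4,))
data = da.random.random((8, 8), chunks=(4, 4)) + 1

# explicit dims on each coordinate keep as_variable from treating the
# raw dask array as a scalar
array = xarray.DataArray(
    data,
    coords={'x': xarray.Variable('x', coord),
            'y': xarray.Variable('y', coord)},
    dims=['x', 'y'],
)
```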

Expected Output

Output of xr.show_versions()

In [23]: xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.0.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

xarray: 0.10.0rc1
pandas: 0.20.3
numpy: 1.13.1
scipy: 0.19.1
netCDF4: None
h5netcdf: 0.3.1
Nio: None
bottleneck: 1.2.0
cyordereddict: None
dask: 0.15.4
matplotlib: 2.0.2
cartopy: 0.15.1
seaborn: 0.8.1
setuptools: 36.6.0
pip: 9.0.1
conda: 4.3.29
pytest: 3.0.5
IPython: 5.1.0
sphinx: 1.5.1
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1684/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
265827204 MDU6SXNzdWUyNjU4MjcyMDQ= 1633 seaborn.apionly module is deprecated jhamman 2443309 closed 0     1 2017-10-16T16:11:29Z 2017-10-23T15:58:09Z 2017-10-23T15:58:09Z MEMBER      

Xarray is using the apionly module from seaborn which is now raising this warning:

```python
...python3.6/site-packages/seaborn/apionly.py:6: UserWarning: As seaborn
no longer sets a default style on import, the seaborn.apionly module is
deprecated. It will be removed in a future version.
  warnings.warn(msg, UserWarning)
```

I think the only places we use seaborn are here:

https://github.com/pydata/xarray/blob/2949558b75a65404a500a237ec54834fd6946d07/xarray/plot/utils.py#L76-L87

This shouldn't be a difficult fix if/when we decide to change it.
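A sketch of the likely change, assuming nothing in our plotting utilities relies on apionly-specific behavior:

```python
try:
    # seaborn >= 0.8 no longer sets a global style on import, so the
    # plain module can replace the deprecated apionly entry point
    import seaborn as sns
except ImportError:
    sns = None
```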

xref: https://github.com/mwaskom/seaborn/pull/1216

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1633/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
266250898 MDU6SXNzdWUyNjYyNTA4OTg= 1636 support writing unlimited dimensions with h5netcdf jhamman 2443309 closed 0     0 2017-10-17T19:33:11Z 2017-10-18T19:56:43Z 2017-10-18T19:56:43Z MEMBER      

h5netcdf v0.5 (just released) added support for unlimited dimensions. This may (should) allow us to enable writing unlimited dimensions with the h5netcdf backend.

xref: https://github.com/shoyer/h5netcdf/pull/33
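If this lands, usage should presumably mirror the netCDF4 backend — a sketch, assuming to_netcdf's existing unlimited_dims argument is simply honored for engine='h5netcdf':

```python
import xarray as xr

ds = xr.Dataset({'t2m': ('time', [280.0, 281.5])})
# 'time' becomes an unlimited (appendable) dimension in the output file
ds.to_netcdf('out.nc', engine='h5netcdf', unlimited_dims=['time'])
```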

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1636/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
262847801 MDU6SXNzdWUyNjI4NDc4MDE= 1605 Resample interpolate failing on tutorial dataset jhamman 2443309 closed 0     3 2017-10-04T16:17:56Z 2017-10-05T16:34:14Z 2017-10-05T16:34:14Z MEMBER      

I'm getting some unexpected behavior/errors from the new resample/interpolate methods.

@darothen - any idea what's going on here?

```python
In [1]: import xarray as xr

In [2]: ds = xr.tutorial.load_dataset('air_temperature')

In [3]: ds.resample(time='15d').interpolate(kind='linear')
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-3-ef931d7ebbda> in <module>()
----> 1 ds.resample(time='15d').interpolate(kind='linear')

/glade/p/work/jhamman/storylines/src/xarray/xarray/core/resample.py in interpolate(self, kind)
    110
    111         """
--> 112         return self._interpolate(kind=kind)
    113
    114     def _interpolate(self, kind='linear'):

/glade/p/work/jhamman/storylines/src/xarray/xarray/core/resample.py in _interpolate(self, kind)
    312
    313         old_times = self._obj[self._dim].astype(float)
--> 314         new_times = self._full_index.values.astype(float)
    315
    316         data_vars = OrderedDict()

AttributeError: 'NoneType' object has no attribute 'values'
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1605/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
262858955 MDU6SXNzdWUyNjI4NTg5NTU= 1606 BUG: _extract_nc4_variable_encoding raises when shuffle argument is set jhamman 2443309 closed 0     0 2017-10-04T16:55:59Z 2017-10-05T00:12:38Z 2017-10-05T00:12:38Z MEMBER      

I think we're missing the shuffle key from the valid encodings list below:

https://github.com/pydata/xarray/blob/24643ecee2eab04d0f84c41715d753e829f448e6/xarray/backends/netCDF4_.py#L155-L156

```python
var = xr.Variable(('x',), [1, 2, 3], {}, {'chunking': (2, 1)})
encoding = _extract_nc4_variable_encoding(var, raise_on_invalid=True)
```

```
variable = <xarray.Variable (x: 3)>
array([1, 2, 3]), raise_on_invalid = True, lsd_okay = True, backend = 'netCDF4'

def _extract_nc4_variable_encoding(variable, raise_on_invalid=False,
                                   lsd_okay=True, backend='netCDF4'):
    encoding = variable.encoding.copy()

    safe_to_drop = set(['source', 'original_shape'])
    valid_encodings = set(['zlib', 'complevel', 'fletcher32', 'contiguous',
                           'chunksizes'])
    if lsd_okay:
        valid_encodings.add('least_significant_digit')

    if (encoding.get('chunksizes') is not None and
            (encoding.get('original_shape', variable.shape) !=
                variable.shape) and not raise_on_invalid):
        del encoding['chunksizes']

    for k in safe_to_drop:
        if k in encoding:
            del encoding[k]

    if raise_on_invalid:
        invalid = [k for k in encoding if k not in valid_encodings]
        if invalid:
            raise ValueError('unexpected encoding parameters for %r backend: '
                           ' %r' % (backend, invalid))

E ValueError: unexpected encoding parameters for 'netCDF4' backend: ['shuffle']

xarray/backends/netCDF4_.py:173: ValueError
```
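The fix is presumably just adding the missing key to valid_encodings — a sketch:

```python
# add 'shuffle' to the encodings the netCDF4 backend accepts
valid_encodings = set(['zlib', 'complevel', 'fletcher32', 'contiguous',
                       'chunksizes', 'shuffle'])
```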

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1606/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
245893358 MDU6SXNzdWUyNDU4OTMzNTg= 1493 ENH: points coord from isel/sel_points should be a MultiIndex jhamman 2443309 closed 0     1 2017-07-27T00:33:42Z 2017-09-07T15:25:40Z 2017-09-07T15:25:40Z MEMBER      

We implemented the pointwise indexing methods (isel_points and sel_points) before we had MultiIndex support. Would it make sense to update these methods to return objects with coordinates defined as a MultiIndex?

Current behavior:

```python
print('original --> \n', ds)

lons = [-88, -85.9]
lats = [34.2, 31.9]

subset = ds.sel_points(lon=lons, lat=lats, method='nearest')
print('subset --> \n', subset)
```

yields:

```
original -->
<xarray.Dataset>
Dimensions:  (lat: 224, lon: 464, time: 19709)
Coordinates:
  * lat      (lat) float64 25.06 25.19 25.31 25.44 25.56 25.69 25.81 25.94 ...
  * lon      (lon) float64 -124.9 -124.8 -124.7 -124.6 -124.4 -124.3 -124.2 ...
  * time     (time) float64 5.548e+04 5.548e+04 5.548e+04 5.548e+04 ...
Data variables:
    pcp      (time, lat, lon) float64 nan nan nan nan nan nan nan nan nan ...
subset -->
<xarray.Dataset>
Dimensions:  (points: 2, time: 19709)
Coordinates:
    lat      (points) float64 34.19 31.94
    lon      (points) float64 -87.94 -85.94
  * time     (time) float64 5.548e+04 5.548e+04 5.548e+04 5.548e+04 ...
Dimensions without coordinates: points
Data variables:
    pcp      (points, time) float64 0.0 5.698 0.0 0.0 14.66 0.0 0.0 0.0 0.0 ...
```

Maybe it makes sense to return an object with a MultiIndex like:

```python
new = pd.MultiIndex.from_arrays([subset.lon.to_index(), subset.lat.to_index()],
                                names=['lon', 'lat'])
print(new)
MultiIndex(levels=[[-87.9375, -85.9375], [31.9375, 34.1875]],
           labels=[[0, 1], [1, 0]],
           names=['lon', 'lat'])
```
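Roughly the same result could presumably also be spelled with set_index — a sketch, assuming lon and lat are coordinates along the points dimension:

```python
# promote the two point coordinates into a single MultiIndex
indexed = subset.set_index(points=['lon', 'lat'])
```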

xref: #214, #475, #507

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1493/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
254430377 MDU6SXNzdWUyNTQ0MzAzNzc= 1542 Testing: Failing tests on py36-pandas-dev jhamman 2443309 closed 0   0.10 2415632 4 2017-08-31T18:40:47Z 2017-09-05T22:22:32Z 2017-09-05T22:22:32Z MEMBER      

We currently have 7 failing tests when run against the pandas development code (travis).

Question for @shoyer - can you take a look at these and see if we should try to get a fix in place prior to v0.10.0? It looks like pandas 0.21 is slated for release on Sept. 30.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1542/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
254217141 MDU6SXNzdWUyNTQyMTcxNDE= 1540 BUG: Dask distributed integration tests failing on Travis jhamman 2443309 closed 0     10 2017-08-31T05:41:50Z 2017-09-05T09:18:01Z 2017-09-01T01:09:11Z MEMBER      

Recent builds on travis are failing for the integration tests for dask distributed (example). Those tests are:

  • test_dask_distributed_integration_test[h5netcdf]
  • test_dask_distributed_integration_test[netcdf4]

The traceback includes this detail:

```
_______________ test_dask_distributed_integration_test[netcdf4] _______________
loop = <tornado.platform.epoll.EPollIOLoop object at 0x7fe36dc9e250>
engine = 'netcdf4'

    @pytest.mark.parametrize('engine', ENGINES)
    def test_dask_distributed_integration_test(loop, engine):
        with cluster() as (s, ):
            with distributed.Client(s['address'], loop=loop):
                original = create_test_data()
                with create_tmp_file(allow_cleanup_failure=ON_WINDOWS) as filename:
                    original.to_netcdf(filename, engine=engine)
                    with xr.open_dataset(filename, chunks=3, engine=engine) as restored:
                        assert isinstance(restored.var1.data, da.Array)
>                       computed = restored.compute()

xarray/tests/test_distributed.py:33:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
xarray/core/dataset.py:487: in compute
    return new.load()
xarray/core/dataset.py:464: in load
    evaluated_data = da.compute(*lazy_data.values())
../../../miniconda/envs/test_env/lib/python2.7/site-packages/dask/base.py:206: in compute
    results = get(dsk, keys, **kwargs)
../../../miniconda/envs/test_env/lib/python2.7/site-packages/distributed/client.py:1923: in get
    results = self.gather(packed, asynchronous=asynchronous)
../../../miniconda/envs/test_env/lib/python2.7/site-packages/distributed/client.py:1368: in gather
    asynchronous=asynchronous)
../../../miniconda/envs/test_env/lib/python2.7/site-packages/distributed/client.py:540: in sync
    return sync(self.loop, func, *args, **kwargs)
../../../miniconda/envs/test_env/lib/python2.7/site-packages/distributed/utils.py:239: in sync
    six.reraise(*error[0])
../../../miniconda/envs/test_env/lib/python2.7/site-packages/distributed/utils.py:227: in f
    result[0] = yield make_coro()
../../../miniconda/envs/test_env/lib/python2.7/site-packages/tornado/gen.py:1055: in run
    value = future.result()
../../../miniconda/envs/test_env/lib/python2.7/site-packages/tornado/concurrent.py:238: in result
    raise_exc_info(self._exc_info)
../../../miniconda/envs/test_env/lib/python2.7/site-packages/tornado/gen.py:1063: in run
    yielded = self.gen.throw(*exc_info)
../../../miniconda/envs/test_env/lib/python2.7/site-packages/distributed/client.py:1246: in _gather
    traceback)

>       c = a[b]
E       TypeError: string indices must be integers
```

Distributed v1.18.1 was released 5 days ago, so there must have been a breaking change that has been passed down to us.

cc @shoyer, @mrocklin

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1540/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
140063713 MDU6SXNzdWUxNDAwNjM3MTM= 790 ENH: Optional Read-Only RasterIO backend jhamman 2443309 closed 0     15 2016-03-11T02:00:32Z 2017-06-06T10:25:22Z 2017-06-06T10:25:22Z MEMBER      

RasterIO is a GDAL-based library that provides fast and direct raster I/O for use with NumPy and SciPy. I've only used it a bit but have been generally impressed with its support for a range of ASCII and binary raster formats. It might be a nice addition to the suite of backends already available in xarray.

I'm envisioning functionality akin to what we provide in the PyNIO backend, which is to say, read-only support for whichever file types RasterIO supports.
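For a flavor of what such a backend would wrap, a minimal sketch (the GeoTIFF path is hypothetical):

```python
import rasterio
import xarray as xr

with rasterio.open('example.tif') as src:
    data = src.read()  # numpy array with shape (band, y, x)
    da = xr.DataArray(data, dims=('band', 'y', 'x'),
                      attrs={'crs': str(src.crs)})
```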

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/790/reactions",
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
124569898 MDU6SXNzdWUxMjQ1Njk4OTg= 696 Doc updates jhamman 2443309 closed 0     1 2016-01-02T01:37:58Z 2016-12-29T02:36:56Z 2016-12-29T02:36:56Z MEMBER      

Now that ReadTheDocs supports using conda, we can:

  • use cartopy to plot the maps at build time
  • standardize on Python 3

xref: #695

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/696/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
138045063 MDU6SXNzdWUxMzgwNDUwNjM= 781 Don't infer x/y coordinates interval breaks for cartopy plot axes jhamman 2443309 closed 0     9 2016-03-03T01:22:19Z 2016-11-10T22:55:05Z 2016-11-10T22:55:05Z MEMBER      

The DataArray.plot.pcolormesh() method modifies the x/y coordinates of its plots. I'm finding that, at least for custom cartopy projections, the offset applied here causes some real issues downstream.

@clarkfitzg - Do you see any problem with treating the x/y offset in the same way as the axis limits?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/781/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
109589162 MDU6SXNzdWUxMDk1ODkxNjI= 605 Support Two-Dimensional Coordinate Variables jhamman 2443309 closed 0   1.0 741199 11 2015-10-02T23:27:18Z 2016-07-31T23:02:46Z 2016-07-31T23:02:46Z MEMBER      

The CF Conventions supports the notion of a 2d coordinate variable in the case of irregularly spaced data. An example of this sort of dataset is below. The CF Convention is to add a "coordinates" attribute with a string describing the 2d coordinates.

```
dimensions:
  xc = 128 ;
  yc = 64 ;
  lev = 18 ;
variables:
  float T(lev,yc,xc) ;
    T:long_name = "temperature" ;
    T:units = "K" ;
    T:coordinates = "lon lat" ;
  float xc(xc) ;
    xc:axis = "X" ;
    xc:long_name = "x-coordinate in Cartesian system" ;
    xc:units = "m" ;
  float yc(yc) ;
    yc:axis = "Y" ;
    yc:long_name = "y-coordinate in Cartesian system" ;
    yc:units = "m" ;
  float lev(lev) ;
    lev:long_name = "pressure level" ;
    lev:units = "hPa" ;
  float lon(yc,xc) ;
    lon:long_name = "longitude" ;
    lon:units = "degrees_east" ;
  float lat(yc,xc) ;
    lat:long_name = "latitude" ;
    lat:units = "degrees_north" ;
```

I'd like to discuss how we could support this in xray. The motivating application for this is in plotting operations, but it may also have application in other grouping and remapping operations (e.g. #324, #475, #486).

One option would be to simply honor the "coordinates" attr in plotting and use the specified coordinates as the x/y values.
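A sketch of that option, assuming lon and lat have been promoted to coordinates and passed as the plot axes:

```python
# honor the CF "coordinates" attribute by using the 2d lon/lat
# variables as the plot axes
ds = ds.set_coords(['lon', 'lat'])
ds['T'].isel(lev=0).plot.pcolormesh(x='lon', y='lat')
```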

ref: http://cfconventions.org/Data/cf-conventions/cf-conventions-1.6/build/cf-conventions.html#idp5559280

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/605/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
122776511 MDU6SXNzdWUxMjI3NzY1MTE= 681 to_netcdf on Python 3: "string" qualifier on attributes jhamman 2443309 closed 0     8 2015-12-17T16:56:59Z 2016-06-16T08:27:33Z 2016-03-01T21:49:36Z MEMBER      

I've had a number of people ask me about this and I think we can figure out a way to fix this. In Python 3, variable attributes in files written with Dataset.to_netcdf end up with the "string" type qualifier shown below. This causes all sorts of problems with other netCDF programs. Is this related to https://github.com/Unidata/netcdf4-python/issues/485?

```bash
PRISM$ ncdump -h prism_historical_conus4k.189501-201510.nc
netcdf prism_historical_conus4k.189501-201510 {
dimensions:
        latitude = 621 ;
        longitude = 1405 ;
        time = 1450 ;
variables:
        double latitude(latitude) ;
        double longitude(longitude) ;
        int64 time(time) ;
                string time:units = "days since 1895-01-01 00:00:00" ;
                string time:calendar = "proleptic_gregorian" ;
        float prcp(time, latitude, longitude) ;
                string prcp:units = "mm" ;
                string prcp:description = "precipitation " ;
                string prcp:long_name = "precipitation" ;

// global attributes:
                string :title = "PRISM: Parameter-elevation Regressions on Independent Slopes Model" ;
}
```
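One possible workaround sketch (an untested assumption, not a fix): store attribute values as bytes, which netCDF4-python should write as classic NC_CHAR attributes rather than the variable-length string type:

```python
# encode str attribute values to bytes before writing (assumption:
# netCDF4-python writes bytes attrs without the "string" qualifier)
for var in ds.variables.values():
    var.attrs = {k: v.encode() if isinstance(v, str) else v
                 for k, v in var.attrs.items()}
ds.to_netcdf('prism_fixed.nc')
```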

cc @lizaclark

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/681/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
156186767 MDU6SXNzdWUxNTYxODY3Njc= 855 drop support for Python 2.6 jhamman 2443309 closed 0     0 2016-05-23T01:53:15Z 2016-05-23T19:38:07Z 2016-05-23T19:38:07Z MEMBER      

@shoyer polled the xarray users list about dropping Python 2.6 from the supported versions of Python for xarray. There were no complaints so it looks like we are moving forward on this at the next major release (0.8).

xref: https://groups.google.com/forum/#!searchin/xarray/2.6/xarray/JVIUiIhEW_8/qBjxmestCQAJ

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/855/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
113499493 MDU6SXNzdWUxMTM0OTk0OTM= 641 add rolling_apply method or function jhamman 2443309 closed 0     13 2015-10-27T03:30:11Z 2016-02-20T02:32:33Z 2016-02-20T02:32:33Z MEMBER      

Pandas has a generic rolling_apply function. It would be nice to support a similar API on xray objects. The API I have in mind is something like:

```python
DataArray.rolling_apply(window, func, min_periods=None, freq=None,
                        center=False, args=(), kwargs={})

da.rolling_apply(7, np.mean)
```
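For comparison, a sketch of the rolling API xarray ultimately adopted — a rolling-window constructor plus generic reduce — assuming the array has a 'time' dimension:

```python
da.rolling(time=7, center=False).reduce(np.mean)
da.rolling(time=7).mean()  # common reductions also get named methods
```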

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/641/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
108769226 MDU6SXNzdWUxMDg3NjkyMjY= 593 Bug when accessing sorted dataset before loading jhamman 2443309 closed 0     6 2015-09-28T23:58:29Z 2016-01-04T23:11:55Z 2015-10-02T21:41:11Z MEMBER      

I ran into this bug this afternoon. If I sort a Dataset using isel before loading the data, I end up with an error in the netCDF4 backend. If I call Dataset.load() before sorting the Dataset, I get the expected behavior.

First some info on my environment (everything should be fresh):

Python version  : 3.4.3 |Anaconda 2.3.0 (x86_64)| (default, Mar  6 2015, 12:07:41)
                  [GCC 4.2.1 (Apple Inc. build 5577)]
xray version    : 0.6.0
numpy version   : 1.9.3
netCDF4 version : 1.1.9

Now for a simplified example that reproduces the bug:

```python
In [1]: import xray
        import numpy as np
        import netCDF4

In [2]: random_data = np.random.random(size=(4, 6))
        dim0 = [0, 1, 2, 3]
        dim1 = [0, 2, 1, 3, 5, 4]  # We will sort this in a later step
        da = xray.DataArray(data=random_data, dims=('dim0', 'dim1'),
                            coords={'dim0': dim0, 'dim1': dim1},
                            name='randovar')
        ds = da.to_dataset()
        ds.to_netcdf('rando.nc')

In [3]: ds2 = xray.open_dataset('rando.nc')
        # ds2.load()  # work around to prevent IndexError
        inds = np.argsort(ds2.dim1.values)
        ds2 = ds2.isel(dim1=inds)
        print(ds2.randovar)

Out[3]:
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-3-9b4ab63c0fd2> in <module>()
      2 inds = np.argsort(ds2.dim1.values)
      3 ds2 = ds2.isel(dim1=inds)
----> 4 print(ds2.randovar)

...

/Users/jhamman/anaconda/lib/python3.4/site-packages/xray/backends/netCDF4_.py in __getitem__(self, key)
     43         else:
     44             getitem = operator.getitem
---> 45         data = getitem(self.array, key)
     46         if self.ndim == 0:
     47             # work around for netCDF4-python's broken handling of 0-d

netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Variable.__getitem__ (netCDF4/_netCDF4.c:30994)()

/Users/jhamman/anaconda/lib/python3.4/site-packages/netCDF4/utils.py in _StartCountStride(elem, shape, dimensions, grp, datashape, put)
    220         # duplicate indices in the sequence)
    221         msg = "integer sequences in slices must be sorted and cannot have duplicates"
--> 222         raise IndexError(msg)
    223     # convert to boolean array.
    224     # if unlim, let boolean array be longer than current dimension

IndexError: integer sequences in slices must be sorted and cannot have duplicates
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/593/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
110102454 MDU6SXNzdWUxMTAxMDI0NTQ= 611 facet grid axis labels are None jhamman 2443309 closed 0   0.6.1 1307323 4 2015-10-06T21:12:50Z 2016-01-04T23:11:55Z 2015-10-09T14:25:57Z MEMBER      

The dim names on this plot are not showing up (e.g. None is not right, it should be x and y):

```python
data = (np.random.random(size=(20, 25, 12)) + np.linspace(-3, 3, 12))  # range is ~ -3 to 4
da = xray.DataArray(data, dims=['x', 'y', 'time'], name='data')
fg = da.plot.pcolormesh(col='time', col_wrap=4)
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/611/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
109434899 MDU6SXNzdWUxMDk0MzQ4OTk= 602 latest docs are broken jhamman 2443309 closed 0 shoyer 1217238 0.7.0 1368762 4 2015-10-02T05:48:21Z 2016-01-02T01:31:17Z 2016-01-02T01:31:17Z MEMBER      

Looking at the doc build from tonight, something happened and netCDF4 isn't getting picked up. All the docs depending on the netCDF4 package are broken (e.g. plotting, IO, etc.).

@shoyer - You may be able to just resubmit the doc build, or maybe we need to fix something.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/602/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
110801359 MDU6SXNzdWUxMTA4MDEzNTk= 617 travis builds are broken jhamman 2443309 closed 0     2 2015-10-10T15:39:51Z 2015-10-23T22:26:43Z 2015-10-23T22:26:43Z MEMBER      

Tests are failing on Python 2.7 and 3.4. We just started getting pandas 0.17 and numpy 1.10 so that is probably the source of the issue.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/617/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
110040239 MDU6SXNzdWUxMTAwNDAyMzk= 610 don't throw away variable specific coordinates information jhamman 2443309 closed 0     0 2015-10-06T15:50:41Z 2015-10-08T18:03:19Z 2015-10-08T18:03:19Z MEMBER      

Currently, we decode the coordinates attribute, when present, but it doesn't end up in the DataArray's encoding attribute (https://github.com/xray/xray/blob/master/xray/conventions.py#L822-L832). This should be changed so the user can reference the coordinates attribute after decoding.

xref: #605

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/610/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
109553434 MDU6SXNzdWUxMDk1NTM0MzQ= 603 support using Cartopy with facet grids jhamman 2443309 closed 0     1 2015-10-02T19:06:33Z 2015-10-06T15:10:01Z 2015-10-06T15:10:01Z MEMBER      

Currently, I don't think it is possible to specify a Cartopy projection for facet grid plots.

It would be nice to be able to specify either the subplots array including Cartopy projections (e.g. ax=axes) or a projection keyword argument (e.g. subplots_kw=dict(projection=...)) directly when using xray's facet grid.
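A usage sketch of the second option, assuming a subplot_kws-style argument that forwards a cartopy projection to every facet axis (with a transform for the data coordinates):

```python
import cartopy.crs as ccrs

fg = da.plot.pcolormesh(col='time', col_wrap=4,
                        subplot_kws={'projection': ccrs.PlateCarree()},
                        transform=ccrs.PlateCarree())
```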

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/603/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
101061611 MDU6SXNzdWUxMDEwNjE2MTE= 533 DataArray.name should always be a string jhamman 2443309 closed 0     2 2015-08-14T17:36:02Z 2015-09-18T17:35:26Z 2015-09-18T17:35:26Z MEMBER      

Consider the following example:

```python
import numpy as np
import xray

da = xray.DataArray(np.random.random((4, 5)))
ds = da.to_dataset(name=0)  # or name=True, or name=(4)
ds.to_netcdf('test.nc')
```

raises this error:

```python
/Users/jhamman/anaconda/lib/python3.4/site-packages/xray/backends/netCDF4_.py in prepare_variable(self, name, variable)
    228                 endian='native',
    229                 least_significant_digit=encoding.get('least_significant_digit'),
--> 230                 fill_value=fill_value)
    231             nc4_var.set_auto_maskandscale(False)
    232

netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Dataset.createVariable (netCDF4/_netCDF4.c:13217)()

/Users/jhamman/anaconda/lib/python3.4/posixpath.py in normpath(path)
    330     if path == empty:
    331         return dot
--> 332     initial_slashes = path.startswith(sep)
    333     # POSIX allows one or two initial slashes, but treats three or more
    334     # as single slash.

AttributeError: 'int' object has no attribute 'startswith'
```

I think one way to solve this is to cast the name attribute to a string at the time of assignment. Another way is just to raise an error if a non-string variable name is used.
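A sketch of the second option — validating up front instead of failing deep inside the netCDF4 backend (the helper name is hypothetical):

```python
def _validate_name(name):
    # hypothetical check at assignment time
    if name is not None and not isinstance(name, str):
        raise TypeError('DataArray.name must be a string or None, '
                        'got %r' % (name,))
```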

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/533/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
97861940 MDU6SXNzdWU5Nzg2MTk0MA== 500 discrete colormap option for imshow and pcolormesh jhamman 2443309 closed 0     9 2015-07-29T05:07:18Z 2015-08-06T16:06:33Z 2015-08-06T16:06:33Z MEMBER      

It may be nice to include an option for a discrete colormap/colorbar for the imshow and pcolormesh methods. I would suggest that the default behavior remains a continuous colormap. Perhaps adding an argument such as cmap_intervals would allow for easy discretization of the colormap.

The logic in #499 takes care of most of the details for this issue.
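In the meantime, matplotlib's BoundaryNorm already produces a discrete colorbar through the existing norm/cmap passthrough — a sketch:

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import BoundaryNorm

levels = np.linspace(0, 1, 11)  # 10 discrete bins
norm = BoundaryNorm(levels, ncolors=plt.get_cmap('viridis').N)
da.plot.pcolormesh(norm=norm, cmap='viridis')
```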

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/500/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
83000406 MDU6SXNzdWU4MzAwMDQwNg== 411 unexpected positional indexing behavior with Dataset and DataArray "isel" jhamman 2443309 closed 0     5 2015-05-31T04:48:10Z 2015-06-01T05:03:38Z 2015-06-01T05:03:29Z MEMBER      

I may be missing something here, but I think the indexing behavior in isel is surprisingly different from that of numpy and is incongruent with the xray documentation. Either this is a bug or a feature that I don't understand.

From the xray docs on positional indexing:

Indexing a DataArray directly works (mostly) just like it does for numpy arrays, except that the returned object is always another DataArray

My example below uses two 1d numpy arrays to select from a 3d numpy array. When using pure numpy, I get a 2d array back. In my view, this is the expected behavior. When using the xray.Dataset or xray.DataArray, I get an oddly shaped 3d array back with a duplicate dimension.

```python
import numpy as np
import xray
import sys

print('python version:', sys.version)
print('numpy version:', np.version.full_version)
print('xray version:', xray.version.version)
```

```
python version: 3.4.3 |Anaconda 2.2.0 (x86_64)| (default, Mar  6 2015, 12:07:41)
[GCC 4.2.1 (Apple Inc. build 5577)]
numpy version: 1.9.2
xray version: 0.4.1
```

```python
# A few numpy arrays
time = np.arange(100)
lons = np.arange(40, 60)
lats = np.arange(25, 70)
np_data = np.random.random(size=(len(time), len(lats), len(lons)))

# pick some random points to select
ys, xs = np.nonzero(np_data[0] > 0.8)
print(len(ys))
```

```
176
```

```python
# create a xray.DataArray and xray.Dataset
xr_data = xray.DataArray(np_data, [('time', time), ('y', lats), ('x', lons)])  # DataArray
xr_ds = xr_data.to_dataset(name='data')  # Dataset

# numpy indexing
print('numpy: ', np_data[:, ys, xs].shape)

# xray positional indexing
print('xray1: ', xr_data.isel(y=ys, x=xs).shape)
print('xray2: ', xr_data[:, ys, xs].shape)
print('xray3: ', xr_ds.isel(y=ys, x=xs))
```

```
numpy:  (100, 176)
xray1:  (100, 176, 176)
xray2:  (100, 176, 176)
xray3:  <xray.Dataset>
Dimensions: (time: 100, x: 176, y: 176)
Coordinates:
  * x     (x) int64 46 47 57 45 48 50 51 54 57 59 48 52 49 50 52 53 55 57 43 46 47 48 53 ...
  * time  (time) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 ...
  * y     (y) int64 25 25 25 26 26 26 26 26 26 26 27 27 28 28 28 28 28 28 29 29 29 29 29 ...
Data variables:
    data  (time, y, x) float64 0.9343 0.8311 0.8842 0.3188 0.02052 0.4506 0.04177 ...
```
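For reference, a sketch of how the pointwise behavior can be requested explicitly — DataArray-valued indexers sharing a new points dimension, the spelling later formalized as vectorized indexing:

```python
points_y = xray.DataArray(ys, dims='points')
points_x = xray.DataArray(xs, dims='points')
# indexers that share a dimension are selected pointwise, matching numpy
print(xr_data.isel(y=points_y, x=points_x).shape)  # (100, 176)
```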

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/411/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
33273199 MDU6SXNzdWUzMzI3MzE5OQ== 122 Dataset.groupby summary methods jhamman 2443309 closed 0     3 2014-05-11T23:28:18Z 2014-06-23T07:25:08Z 2014-06-23T07:25:08Z MEMBER      

This may just be a documentation issue but the summary apply and combine methods for the Dataset.GroupBy object seem to be missing.

```python
In [146]: foo_values = np.random.RandomState(0).rand(3, 4)
          times = pd.date_range('2000-01-01', periods=3)
          ds = xray.Dataset({'time': ('time', times),
                             'foo': (['time', 'space'], foo_values)})

          ds.groupby('time').mean()  # replace time with time.month after #121 is addressed
          ds.groupby('time').apply(np.mean)  # also errors here

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-146-eec1e73cff23> in <module>()
      3 ds = xray.Dataset({'time': ('time', times),
      4                    'foo': (['time', 'space'], foo_values)})
----> 5 ds.groupby('time').mean()
      6 ds.groupby('time').apply(np.mean)

AttributeError: 'DatasetGroupBy' object has no attribute 'mean'
```

Adding this functionality, if not already present, seems like a really nice addition to the package.
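For context, a one-line sketch of the requested API once DatasetGroupBy grows reduction methods (grouping here by the time.month virtual variable mentioned in the comment above):

```python
monthly_means = ds.groupby('time.month').mean()
```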

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/122/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
33942756 MDU6SXNzdWUzMzk0Mjc1Ng== 138 keep attrs when reducing xray objects jhamman 2443309 closed 0     4 2014-05-21T00:40:19Z 2014-05-22T00:29:22Z 2014-05-22T00:29:22Z MEMBER      

Reduction operations currently drop all Variable and Dataset attrs. I'm proposing adding a keyword to these methods to allow for copying of the original Variable or Dataset attrs.

The default value of the keep_attrs keyword would be False.

For example:

```python
new = ds.mean(keep_attrs=True)
```

returns new with all of the Variable and Dataset attrs that ds contained.

Some previous discussion in #131 and #137.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/138/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
33272937 MDU6SXNzdWUzMzI3MjkzNw== 121 virtual variables not available when using open_dataset jhamman 2443309 closed 0     5 2014-05-11T23:11:21Z 2014-05-16T00:37:39Z 2014-05-16T00:37:39Z MEMBER      

The tutorial provides an example of how to use xray's virtual_variables. The same functionality is not available from a Dataset object created by open_dataset.

Tutorial:

```python
In [135]: foo_values = np.random.RandomState(0).rand(3, 4)
          times = pd.date_range('2000-01-01', periods=3)
          ds = xray.Dataset({'time': ('time', times),
                             'foo': (['time', 'space'], foo_values)})
          ds['time.dayofyear']

Out[135]:
<xray.DataArray 'time.dayofyear' (time: 3)>
array([1, 2, 3], dtype=int32)
Attributes:
    Empty
```

However, reading a time coordinate/variable from a netCDF4 file and applying the same logic raises an error:

```python
In [136]: ds = xray.open_dataset('sample_for_xray.nc')
          ds['time']

Out[136]:
<xray.DataArray 'time' (time: 4)>
array([1979-09-16 12:00:00, 1979-10-17 00:00:00, 1979-11-16 12:00:00,
       1979-12-17 00:00:00], dtype=object)
Attributes:
    dimensions: 1
    long_name: time
    type_preferred: int

In [137]: ds['time.dayofyear']
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-137-bfe1ae778782> in <module>()
----> 1 ds['time.dayofyear']

/Users/jhamman/anaconda/lib/python2.7/site-packages/xray-0.2.0.dev_cc5e1b2-py2.7.egg/xray/dataset.pyc in __getitem__(self, key)
    408         """Access the given variable name in this dataset as a DataArray.
    409         """
--> 410         return data_array.DataArray._constructor(self, key)
    411
    412     def __setitem__(self, key, value):

/Users/jhamman/anaconda/lib/python2.7/site-packages/xray-0.2.0.dev_cc5e1b2-py2.7.egg/xray/data_array.pyc in _constructor(cls, dataset, name)
     95         if name not in dataset and name not in dataset.virtual_variables:
     96             raise ValueError('name %r must be a variable in dataset %r'
---> 97                              % (name, dataset))
     98         obj._dataset = dataset
     99         obj._name = name

ValueError: name 'time.dayofyear' must be a variable in dataset <xray.Dataset>
Dimensions:     (time: 4, x: 275, y: 205)
Coordinates:
    time            X
    x                        X
    y                                 X
Noncoordinates:
    Wind            0        2        1
Attributes:
    sample data for xray from RASM project
```

Is there a reason that the virtual time variables are only available if the dataset is created from a pandas date_range? Lastly, this could be related to #118.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/121/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
33112594 MDU6SXNzdWUzMzExMjU5NA== 118 Problems parsing time variable using open_dataset jhamman 2443309 closed 0     4 2014-05-08T18:57:31Z 2014-05-16T00:37:28Z 2014-05-16T00:37:28Z MEMBER      

I'm noticing a problem parsing the time variable for at least the noleap calendar for a properly formatted time dimension. Any thoughts on why this is?

```bash
ncdump -c -t sample_for_xray.nc
netcdf sample_for_xray {
dimensions:
        time = UNLIMITED ; // (4 currently)
        y = 205 ;
        x = 275 ;
variables:
        double Wind(time, y, x) ;
                Wind:units = "m/s" ;
                Wind:long_name = "Wind speed" ;
                Wind:coordinates = "latitude longitude" ;
                Wind:dimensions = "2" ;
                Wind:type_preferred = "double" ;
                Wind:time_rep = "instantaneous" ;
                Wind:_FillValue = 9.96920996838687e+36 ;
        double time(time) ;
                time:calendar = "noleap" ;
                time:dimensions = "1" ;
                time:long_name = "time" ;
                time:type_preferred = "int" ;
                time:units = "days since 0001-01-01 0:0:0" ;

// global attributes:
...
data:

 time = "1979-09-16 12", "1979-10-17", "1979-11-16 12", "1979-12-17" ;
```

```python
ds = xray.open_dataset('sample_for_xray.nc')
print ds['time']
```

```
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-46-65c280e7a283> in <module>()
      1 ds = xray.open_dataset('sample_for_xray.nc')
----> 2 print ds['time']

/home/jhamman/anaconda/lib/python2.7/site-packages/xray/common.pyc in __repr__(self)
     40
     41     def __repr__(self):
---> 42         return array_repr(self)
     43
     44     def __iter__(self):

/home/jhamman/anaconda/lib/python2.7/site-packages/xray/common.pyc in array_repr(arr)
    122     summary = ['<xray.%s %s(%s)>' % (type(arr).__name__, name_str, dim_summary)]
    123     if arr.size < 1e5 or arr._in_memory():
--> 124         summary.append(repr(arr.values))
    125     else:
    126         summary.append('[%s values with dtype=%s]' % (arr.size, arr.dtype))

/home/jhamman/anaconda/lib/python2.7/site-packages/xray/data_array.pyc in values(self)
    147     def values(self):
    148         """The variables's data as a numpy.ndarray"""
--> 149         return self.variable.values
    150
    151     @values.setter

/home/jhamman/anaconda/lib/python2.7/site-packages/xray/variable.pyc in values(self)
    217     def values(self):
    218         """The variable's data as a numpy.ndarray"""
--> 219         return utils.as_array_or_item(self._data_cached())
    220
    221     @values.setter

/home/jhamman/anaconda/lib/python2.7/site-packages/xray/utils.pyc in as_array_or_item(values, dtype)
     56         # converted into an integer instead :(
     57         return values
---> 58     values = as_safe_array(values, dtype=dtype)
     59     if values.ndim == 0 and values.dtype.kind == 'O':
     60         # unpack 0d object arrays to be consistent with numpy

/home/jhamman/anaconda/lib/python2.7/site-packages/xray/utils.pyc in as_safe_array(values, dtype)
     40     """Like np.asarray, but convert all datetime64 arrays to ns precision
     41     """
---> 42     values = np.asarray(values, dtype=dtype)
     43     if values.dtype.kind == 'M':
     44         # np.datetime64

/home/jhamman/anaconda/lib/python2.7/site-packages/numpy/core/numeric.pyc in asarray(a, dtype, order)
    458
    459     """
--> 460     return array(a, dtype, copy=False, order=order)
    461
    462 def asanyarray(a, dtype=None, order=None):

/home/jhamman/anaconda/lib/python2.7/site-packages/xray/variable.pyc in __array__(self, dtype)
    121         if dtype is None:
    122             dtype = self.dtype
--> 123         return self.array.values.astype(dtype)
    124
    125     def __getitem__(self, key):

TypeError: Cannot cast datetime.date object from metadata [D] to [ns] according to the rule 'same_kind'
```

This file is available here: ftp://ftp.hydro.washington.edu/pub/jhamman/sample_for_xray.nc

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/118/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
33273376 MDU6SXNzdWUzMzI3MzM3Ng== 123 Selective variable reads in open_dataset jhamman 2443309 closed 0     2 2014-05-11T23:39:12Z 2014-05-12T02:25:10Z 2014-05-12T02:25:10Z MEMBER      

One of the beautiful things about the netCDF data model is that the variables can be read individually. I'm suggesting adding a variables keyword (or something along those lines) to the open_dataset function to support selecting one or more or all variables in a file. This will allow for faster reads and smaller memory usage when the full set of variables is not needed.
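For comparison, open_dataset later gained the inverse of this via the drop_variables keyword — a usage sketch (file name hypothetical):

```python
import xarray as xr

# read everything except the large 'Wind' variable
ds = xr.open_dataset('sample_for_xray.nc', drop_variables=['Wind'])
```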

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/123/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);