id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type 602256880,MDU6SXNzdWU2MDIyNTY4ODA=,3981,[Proposal] Expose Variable without Pandas dependency,2443309,open,0,,,23,2020-04-17T22:00:10Z,2024-04-24T17:19:55Z,,MEMBER,,,,"This issue proposes exposing Xarray's `Variable` class as a stand-alone array class with named axes (`dims`) and arbitrary metadata (`attrs`) but without coordinates (`indexes`). Yes, this already exists, but the `Variable` class is currently inseparable from our Pandas dependency, despite not utilizing any of its functionality. What would this entail? The biggest change would be in making Pandas an optional dependency and isolating any imports. This change could be confined to the `Variable` object or could be propagated further as the Explicit Indexes work proceeds (#1603). ### Why? Within Xarray, the `Variable` class is a vital building block for many of our internal data structures. Recently, the utility of a simple array with named dimensions has been highlighted by a few potential user communities: - Scikit-learn: https://github.com/scikit-learn/enhancement_proposals/pull/18 - PyTorch: (https://pytorch.org/tutorials/intermediate/named_tensor_tutorial.html, http://nlp.seas.harvard.edu/NamedTensor) [An example from the above linked SLEP](https://github.com/scikit-learn/enhancement_proposals/pull/18#issuecomment-511226842) as to why users may not want Pandas as a dependency in Xarray: > @amueller: ...If we go this route, I think we need to make xarray, and therefore pandas, a mandatory dependency... > ... > @adrinjalali: ...And we still do have the option of making a NamedArray. xarray uses the pandas' index classes for the indexing and stuff, which is something we really don't need... Since we already have a class developed that meets these applications' use cases, it seems only prudent to evaluate the feasibility of exposing the `Variable` as a low-level API object. In conclusion, I'm not sure this is currently worth the effort, but it's probably worth exploring at this point. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3981/reactions"", ""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 606530049,MDU6SXNzdWU2MDY1MzAwNDk=,4001,[community] Bi-weekly community developers meeting,2443309,open,0,,,14,2020-04-24T19:22:01Z,2024-03-27T15:33:28Z,,MEMBER,,,,"Hello Xarray Community and @pydata/xarray, Starting next week, we will be hosting a bi-weekly 30-minute community/developers meeting. The goal of this meeting is to help coordinate Xarray development efforts and better connect the user/developer community. ### When Every other Wednesday at 8:30a PT (11:30a ET) beginning April 29th, 2020. Calendar options: - [Google Calendar](https://calendar.google.com/calendar/embed?src=59589f9634ab4ef304e8209be66cda9812dababca71eb8a01a6fa2d167f90d94%40group.calendar.google.com&ctz=America%2FLos_Angeles) - [Ical format](https://calendar.google.com/calendar/ical/59589f9634ab4ef304e8209be66cda9812dababca71eb8a01a6fa2d167f90d94%40group.calendar.google.com/public/basic.ics) ### Where https://us02web.zoom.us/j/87503265754?pwd=cEFJMzFqdTFaS3BMdkx4UkNZRk1QZz09 ### Rolling agenda and meeting notes We'll keep a rolling agenda and set of meeting notes - [Through Sept.
2022](https://hackmd.io/@U4W-olO3TX-hc-cvbjNe4A/xarray-dev-meeting/edit). - [Starting October 2022](https://hackmd.io/fx7KNO2vTKutZeUysE-SIA?both) (requires sign-in) - [Starting March 2024](https://hackmd.io/LFOk5e8BSnqjX3QiKWy5Mw)","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4001/reactions"", ""total_count"": 5, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 5, ""eyes"": 0}",,,13221727,issue 1564627108,I_kwDOAMm_X85dQlCk,7495,"Deprecate open_zarr in favor of open_dataset(..., engine='zarr')",2443309,open,0,,,2,2023-01-31T16:21:07Z,2023-12-12T18:00:15Z,,MEMBER,,,,"### What is your issue? We have discussed many times deprecating `xarray.open_zarr` in favor of `xarray.open_dataset(..., engine='zarr')`. This issue tracks that process and is a place for us to discuss any issues that may arise as a result of the change. xref: https://github.com/pydata/xarray/issues/2812, https://github.com/pydata/xarray/issues/7293 ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7495/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 33637243,MDU6SXNzdWUzMzYzNzI0Mw==,131,Dataset summary methods,2443309,closed,0,,650893,10,2014-05-16T00:17:56Z,2023-09-28T12:42:34Z,2014-05-21T21:47:29Z,MEMBER,,,,"Add summary methods to the Dataset object. For example, it would be great if you could summarize an entire dataset in a single line. (1) Mean of all variables in dataset. ``` python mean_ds = ds.mean() ``` (2) Mean of all variables in dataset along a dimension: ``` python time_mean_ds = ds.mean(dim='time') ``` In the case where a dimension is specified and there are variables that don't use that dimension, I'd imagine you would just pass that variable through unchanged. Related to #122. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/131/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 1673579421,I_kwDOAMm_X85jwMud,7765,Revisiting Xarray's Minimum dependency versions policy,2443309,open,0,,,9,2023-04-18T17:46:03Z,2023-09-19T15:54:09Z,,MEMBER,,,,"### What is your issue? We have recently had a few reports expressing frustration with our minimum dependency version policy. This issue aims to discuss whether changes to our policy are needed. ## Background 1. Our current minimum dependency versions policy reads: > ### Minimum dependency versions > > Xarray adopts a rolling policy regarding the minimum supported version of its dependencies: > > - Python: 24 months ([NEP-29](https://numpy.org/neps/nep-0029-deprecation_policy.html)) > - numpy: 18 months ([NEP-29](https://numpy.org/neps/nep-0029-deprecation_policy.html)) > - all other libraries: 12 months > > This means the latest minor (X.Y) version from N months prior. Patch versions (x.y.Z) are not pinned, and only the latest available at the moment of publishing the xarray release is guaranteed to work. > > You can see the actual minimum tested versions: > > [pydata/xarray](https://github.com/pydata/xarray/blob/main/ci/requirements/min-all-deps.yml) 2. We have a script that checks versions and dates and advises us on when to bump minimum versions. https://github.com/pydata/xarray/blob/main/ci/min_deps_check.py ## Diagnosis 1.
Our policy and `min_deps_check.py` script have greatly reduced our deliberations on which versions to support and the maintenance burden of supporting outdated versions of dependencies. 2. We likely need to update our policy and `min_deps_check.py` script to properly account for Python's SEMVER bugfix releases. Depending on how you interpret the policy, we may have prematurely dropped Python 3.8 (see below for a potential action item). ## Discussion questions 1. Is the policy working as designed, and are the support windows documented above still appropriate for where Xarray is today? 2. Is this policy still in line with how our peer libraries are operating? ## Action items 1. There is likely a bug in the patch-version comparison in the minimum Python version. Moreover, we don't differentiate between bugfix and security releases. I suggest we have a special policy for our minimum supported Python version that reads something like: > Python: 24 months from the last bugfix release (security releases are not considered). ------ xref: https://github.com/pydata/xarray/issues/4179, https://github.com/pydata/xarray/pull/7461 Moderator's note: I suspect a number of folks will want to comment on this issue with ""Please support Python 3.8 for longer..."". If that is the nature of your comment, please just give this a ❤️ reaction rather than filling up the discussion. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7765/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,reopened,13221727,issue 95114700,MDU6SXNzdWU5NTExNDcwMA==,475,API design for pointwise indexing,2443309,open,0,,,39,2015-07-15T06:04:47Z,2023-08-23T12:37:23Z,,MEMBER,,,,"There have been a number of threads discussing possible improvements/extensions to `xray` indexing. The current indexing behavior for `isel` is orthogonal indexing - in other words, each coordinate is treated independently (see #214 and #411 for more discussion). So the question: what is the best way to incorporate diagonal or pointwise indexing in `xray`? I see two main goals / applications: 1. support a simple form of [_`numpy`_ style integer array indexing](http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#purely-integer-array-indexing) 2. support pointwise array indexing along coordinates via computation of nearest-neighbor indexes - I think this can also be thought of as a form of resampling. Input from @WeatherGod, @wholmgren, and @shoyer would be great. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/475/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 1383037028,I_kwDOAMm_X85Sb3hk,7071,Should Xarray have a read_csv method?,2443309,open,0,,,5,2022-09-22T21:28:46Z,2023-06-13T01:45:33Z,,MEMBER,,,,"### Is your feature request related to a problem? Most users of Xarray/Pandas start with an IO call of some sort. In Xarray, our `open_dataset(..., engine=engine)` interface provides an extensible interface to more complex backends (NetCDF, Zarr, GRIB, etc.). For tabular data types, we have [traditionally](https://docs.xarray.dev/en/stable/user-guide/io.html#csv-and-other-formats-supported-by-pandas) pointed users to Pandas. While this works for users who are comfortable with Pandas, it is an added hurdle to users getting started with Xarray.
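For concreteness, the detour new users currently take looks roughly like this (a minimal sketch only; `example.csv` and the `index_col` choice are hypothetical): ```python import pandas as pd import xarray as xr # Today: read the tabular data with Pandas, then convert to an Xarray Dataset. df = pd.read_csv('example.csv', index_col=0) # hypothetical file ds = xr.Dataset.from_dataframe(df) ```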
### Describe the solution you'd like It should be easy and obvious how a user can get a CSV (or other tabular data) into Xarray. Ideally, we don't force the user to use a third-party library. ### Describe alternatives you've considered I can think of three possible solutions: 1. We expose a new function `read_csv`; it might do something like this: ```python def read_csv(filepath_or_buffer, **kwargs): df = pd.read_csv(filepath_or_buffer, **kwargs) ds = xr.Dataset.from_dataframe(df) return ds ``` 2. We develop a storage backend to support reading CSV-like data: ```python ds = open_dataset(filepath, engine='csv') ``` 3. We copy (1) as an example and put it in Xarray's documentation, explicitly showing how you would use Pandas to produce a Dataset from a CSV. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7071/reactions"", ""total_count"": 5, ""+1"": 5, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 1575494367,I_kwDOAMm_X85d6CLf,7515,Aesara as an array backend in Xarray,2443309,open,0,,,11,2023-02-08T05:15:35Z,2023-05-01T14:40:39Z,,MEMBER,,,,"### Is your feature request related to a problem? I recently learned about a meta-tensor library called [Aesara](https://aesara.readthedocs.io/), which got me wondering if it would be a good array backend for Xarray. > Aesara is a Python library that allows you to define, optimize/rewrite, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. It is composed of different parts: > - Symbolic representation of mathematical operations on arrays > - Speed and stability optimization > - Efficient symbolic differentiation > - Powerful rewrite system to programmatically modify your models > - Extendable backends. Aesara currently compiles to C, Jax and Numba. ![image](https://user-images.githubusercontent.com/2443309/217439615-724db5f3-80c1-4577-ac95-620b2b27bf72.png) xref: https://github.com/aesara-devs/aesara/issues/352, @OriolAbril, @twiecki Has anyone looked into this yet? ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7515/reactions"", ""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 1}",,,13221727,issue 110820316,MDU6SXNzdWUxMTA4MjAzMTY=,620,Don't squeeze DataArray before plotting,2443309,open,0,,,5,2015-10-10T22:26:51Z,2023-04-08T17:20:50Z,,MEMBER,,,,"As was discussed in #608, we should honor the shape of the DataArray when selecting plot methods. Currently, we're squeezing the DataArray before plotting. This ends up plotting a line plot for a DataArray with shape `(N, 1)`. We should find a way to plot a pcolormesh or imshow plot in this case. The trick will be figuring out what to do in `_infer_interval_breaks`. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/620/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 1644429340,I_kwDOAMm_X85iBAAc,7692,Feature proposal: DataArray.to_zarr(),2443309,closed,0,,,5,2023-03-28T18:00:24Z,2023-04-03T15:53:37Z,2023-04-03T15:53:37Z,MEMBER,,,,"### Is your feature request related to a problem? It would be nice to mimic the behavior of `DataArray.to_netcdf` for the Zarr backend. ### Describe the solution you'd like This should be possible: ```python xr.open_dataarray('file.nc').to_zarr('store.zarr') ``` ### Describe alternatives you've considered None.
### Additional context xref `DataArray.to_netcdf` issue/PR: #915 / #990 ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7692/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 1642635191,I_kwDOAMm_X85h6J-3,7686,Add reset_encoding to Dataset and DataArray objects,2443309,closed,0,,,2,2023-03-27T18:51:39Z,2023-03-30T21:09:17Z,2023-03-30T21:09:17Z,MEMBER,,,,"### Is your feature request related to a problem? Xarray maintains the encoding of datasets read from most of its supported backend formats (e.g. NetCDF, Zarr, etc.). This is very useful when you want to round-trip perfectly, but it often gets in the way, causing conflicts when writing a modified dataset or when appending to another dataset. Most of the time, the solution is to just remove the encoding from the dataset and continue on. The following code sample is found in a number of issues that reference this problem. ```python for v in list(ds.coords.keys()): if ds.coords[v].dtype == object: ds[v].encoding.clear() for v in list(ds.variables.keys()): if ds[v].dtype == object: ds[v].encoding.clear() ``` A sample of issues that show variants of this problem: - https://github.com/pydata/xarray/issues/3476 - https://github.com/pydata/xarray/issues/3739 - https://github.com/pydata/xarray/issues/4380 - https://github.com/pydata/xarray/issues/5219 - https://github.com/pydata/xarray/issues/5969 - https://github.com/pydata/xarray/issues/6329 - https://github.com/pydata/xarray/issues/6352 ### Describe the solution you'd like In many cases, the solution to these problems is to leave the original dataset encoding behind and either use Xarray's default encoding (or the backend's default) or to specify one's own encoding options. Both cases would benefit from a convenience method to reset the original encoding. Something like this would serve the purpose: ```python ds = xr.open_dataset(...).reset_encoding() ``` ### Describe alternatives you've considered Variations on the API above could also be considered: ```python xr.open_dataset(..., keep_encoding=False) ``` or even: ```python with xr.set_options(keep_encoding=False): ds = xr.open_dataset(...) ``` We can/should also do a better job of surfacing inconsistent encoding in our backends (e.g. `to_netcdf`). ### Additional context _No response_","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7686/reactions"", ""total_count"": 2, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 2, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 1558497871,I_kwDOAMm_X85c5MpP,7479,Use NumPy's SupportsDType,2443309,closed,0,,,0,2023-01-26T17:21:32Z,2023-02-28T23:23:47Z,2023-02-28T23:23:47Z,MEMBER,,,,"### What is your issue?
Now that we've bumped our minimum NumPy version to 1.21, we can address this comment: https://github.com/pydata/xarray/blob/b21f62ee37eea3650a58e9ffa3a7c9f4ae83006b/xarray/core/types.py#L57-L62 I decided not to tackle this as part of #7461 but we may be able to do something like this: ```python from numpy.typing._dtype_like import _DTypeLikeNested, _ShapeLike, _SupportsDType ``` xref: #6834 cc @headtr1ck ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7479/reactions"", ""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 681291824,MDU6SXNzdWU2ODEyOTE4MjQ=,4348,maximum recursion with dask and pydap backend,2443309,open,0,,,2,2020-08-18T19:47:26Z,2022-12-15T18:47:38Z,,MEMBER,,,," **What happened**: I'm getting a maximum recursion error when using the Pydap backend with Dask distributed. It seems that we're failing to successfully pickle the pydap backend store. **What you expected to happen**: Successful parallel loading of an OPeNDAP dataset. **Minimal Complete Verifiable Example**: ```python import xarray as xr from dask.distributed import Client client = Client() ds = xr.open_dataset('http://thredds.northwestknowledge.net:8080/thredds/dodsC/agg_terraclimate_pet_1958_CurrentYear_GLOBE.nc', engine='pydap', chunks={'lat': 1024, 'lon': 1024, 'time': 12}).load() ``` yields: Killed worker on the client:
--------------------------------------------------------------------------- KilledWorker Traceback (most recent call last) in 4 client = Client() 5 ----> 6 ds = xr.open_dataset('http://thredds.northwestknowledge.net:8080/thredds/dodsC/agg_terraclimate_pet_1958_CurrentYear_GLOBE.nc', 7 engine='pydap', chunks={'lat': 1024, 'lon': 1024, 'time': 12}).load() ~/miniconda3/envs/carbonplan38/lib/python3.8/site-packages/xarray/core/dataset.py in load(self, **kwargs) 652 653 # evaluate all the dask arrays simultaneously --> 654 evaluated_data = da.compute(*lazy_data.values(), **kwargs) 655 656 for k, data in zip(lazy_data, evaluated_data): ~/miniconda3/envs/carbonplan38/lib/python3.8/site-packages/dask/base.py in compute(*args, **kwargs) 435 keys = [x.__dask_keys__() for x in collections] 436 postcomputes = [x.__dask_postcompute__() for x in collections] --> 437 results = schedule(dsk, keys, **kwargs) 438 return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)]) 439 ~/miniconda3/envs/carbonplan38/lib/python3.8/site-packages/distributed/client.py in get(self, dsk, keys, restrictions, loose_restrictions, resources, sync, asynchronous, direct, retries, priority, fifo_timeout, actors, **kwargs) 2594 should_rejoin = False 2595 try: -> 2596 results = self.gather(packed, asynchronous=asynchronous, direct=direct) 2597 finally: 2598 for f in futures.values(): ~/miniconda3/envs/carbonplan38/lib/python3.8/site-packages/distributed/client.py in gather(self, futures, errors, direct, asynchronous) 1886 else: 1887 local_worker = None -> 1888 return self.sync( 1889 self._gather, 1890 futures, ~/miniconda3/envs/carbonplan38/lib/python3.8/site-packages/distributed/client.py in sync(self, func, asynchronous, callback_timeout, *args, **kwargs) 775 return future 776 else: --> 777 return sync( 778 self.loop, func, *args, callback_timeout=callback_timeout, **kwargs 779 ) ~/miniconda3/envs/carbonplan38/lib/python3.8/site-packages/distributed/utils.py in sync(loop, func, callback_timeout, *args, **kwargs) 346 if error[0]: 347 typ, exc, tb = error[0] --> 348 raise exc.with_traceback(tb) 349 else: 350 return result[0] ~/miniconda3/envs/carbonplan38/lib/python3.8/site-packages/distributed/utils.py in f() 330 if callback_timeout is not None: 331 future = asyncio.wait_for(future, callback_timeout) --> 332 result[0] = yield future 333 except Exception as exc: 334 error[0] = sys.exc_info() ~/miniconda3/envs/carbonplan38/lib/python3.8/site-packages/tornado/gen.py in run(self) 733 734 try: --> 735 value = future.result() 736 except Exception: 737 exc_info = sys.exc_info() ~/miniconda3/envs/carbonplan38/lib/python3.8/site-packages/distributed/client.py in _gather(self, futures, errors, direct, local_worker) 1751 exc = CancelledError(key) 1752 else: -> 1753 raise exception.with_traceback(traceback) 1754 raise exc 1755 if errors == ""skip"": KilledWorker: ('open_dataset-54c87cd25bf4e9df37cb3030e6602974pet-d39db76f8636f3803611948183e52c13', )
and the above-mentioned recursion error on the workers:
distributed.worker - INFO - ------------------------------------------------- distributed.worker - INFO - Registered to: tcp://127.0.0.1:57334 distributed.worker - INFO - ------------------------------------------------- distributed.worker - ERROR - maximum recursion depth exceeded Traceback (most recent call last): File ""/Users/jhamman/miniconda3/envs/carbonplan38/lib/python3.8/site-packages/distributed/worker.py"", line 931, in handle_scheduler await self.handle_stream( File ""/Users/jhamman/miniconda3/envs/carbonplan38/lib/python3.8/site-packages/distributed/core.py"", line 455, in handle_stream msgs = await comm.read() File ""/Users/jhamman/miniconda3/envs/carbonplan38/lib/python3.8/site-packages/distributed/comm/tcp.py"", line 211, in read msg = await from_frames( File ""/Users/jhamman/miniconda3/envs/carbonplan38/lib/python3.8/site-packages/distributed/comm/utils.py"", line 75, in from_frames res = _from_frames() File ""/Users/jhamman/miniconda3/envs/carbonplan38/lib/python3.8/site-packages/distributed/comm/utils.py"", line 60, in _from_frames return protocol.loads( File ""/Users/jhamman/miniconda3/envs/carbonplan38/lib/python3.8/site-packages/distributed/protocol/core.py"", line 130, in loads value = _deserialize(head, fs, deserializers=deserializers) File ""/Users/jhamman/miniconda3/envs/carbonplan38/lib/python3.8/site-packages/distributed/protocol/serialize.py"", line 269, in deserialize return loads(header, frames) File ""/Users/jhamman/miniconda3/envs/carbonplan38/lib/python3.8/site-packages/distributed/protocol/serialize.py"", line 59, in pickle_loads return pickle.loads(b"""".join(frames)) File ""/Users/jhamman/miniconda3/envs/carbonplan38/lib/python3.8/site-packages/distributed/protocol/pickle.py"", line 59, in loads return pickle.loads(x) File ""/Users/jhamman/miniconda3/envs/carbonplan38/lib/python3.8/site-packages/pydap/model.py"", line 235, in __getattr__ return self.attributes[attr] File ""/Users/jhamman/miniconda3/envs/carbonplan38/lib/python3.8/site-packages/pydap/model.py"", line 235, in __getattr__ return self.attributes[attr] File ""/Users/jhamman/miniconda3/envs/carbonplan38/lib/python3.8/site-packages/pydap/model.py"", line 235, in __getattr__ return self.attributes[attr] [Previous line repeated 973 more times] RecursionError: maximum recursion depth exceeded distributed.worker - INFO - Connection to scheduler broken. Reconnecting...
**Anything else we need to know?**: I've found this to be reproducible with a few kinds of Dask clusters. Setting `Client(processes=False)` does correct the problem at the expense of multiprocessing. **Environment**:
Output of xr.show_versions() INSTALLED VERSIONS ------------------ commit: None python: 3.8.2 | packaged by conda-forge | (default, Mar 5 2020, 16:54:44) [Clang 9.0.1 ] python-bits: 64 OS: Darwin OS-release: 19.5.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.10.5 libnetcdf: 4.7.3 xarray: 0.15.1 pandas: 1.0.3 numpy: 1.18.1 scipy: 1.4.1 netCDF4: 1.5.3 pydap: installed h5netcdf: 0.8.0 h5py: 2.10.0 Nio: None zarr: 2.4.0 cftime: 1.1.1.2 nc_time_axis: 1.2.0 PseudoNetCDF: None rasterio: 1.0.28 cfgrib: None iris: None bottleneck: 1.3.2 dask: 2.13.0 distributed: 2.13.0 matplotlib: 3.2.1 cartopy: 0.17.0 seaborn: 0.10.0 numbagg: installed setuptools: 46.1.3.post20200325 pip: 20.0.2 conda: installed pytest: 5.4.1 IPython: 7.13.0 sphinx: 3.1.1
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4348/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 1247014308,I_kwDOAMm_X85KU-2k,6634,Optionally include encoding in Dataset to_dict,2443309,closed,0,,,0,2022-05-24T19:10:01Z,2022-05-26T19:17:35Z,2022-05-26T19:17:35Z,MEMBER,,,,"### Is your feature request related to a problem? When using Xarray's `to_dict` methods to record a `Dataset`'s schema, it would be useful to (optionally) include `encoding` in the output. ### Describe the solution you'd like The feature request may be resolved by simply adding an `encoding` keyword argument. This may look like this: ```python ds = xr.Dataset(...) ds.to_dict(data=False, encoding=True) ``` ### Describe alternatives you've considered It is currently possible to manually extract encoding attributes but this is a less desirable solution. xref: https://github.com/pangeo-forge/pangeo-forge-recipes/issues/256","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6634/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 636449225,MDU6SXNzdWU2MzY0NDkyMjU=,4139,[Feature request] Support file-like objects in open_rasterio,2443309,closed,0,,,2,2020-06-10T18:11:26Z,2022-04-19T17:15:21Z,2022-04-19T17:15:20Z,MEMBER,,,," With some acrobatics, it is possible to open file-like objects to rasterio. It would be useful if xarray supported this workflow, particularly for working with cloud optimized geotiffs and fs-spec. #### MCVE Code Sample ```python with open('my_data.tif', 'rb') as f: da = xr.open_rasterio(f) ``` #### Expected Output DataArray -> equivalent to `xr.open_rasterio('my_data.tif')` #### Problem Description We only currently allow str, rasterio.DatasetReader, or rasterio.WarpedVRT as inputs to `open_rasterio`. #### Versions
Output of xr.show_versions() INSTALLED VERSIONS ------------------ commit: 2a288f6ed4286910fcf3ab9895e1e9cbd44d30b4 python: 3.8.2 | packaged by conda-forge | (default, Apr 24 2020, 07:56:27) [Clang 9.0.1 ] python-bits: 64 OS: Darwin OS-release: 18.7.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: None libnetcdf: None xarray: 0.15.2.dev68+gb896a68f pandas: 1.0.4 numpy: 1.18.5 scipy: None netCDF4: None pydap: None h5netcdf: None h5py: None Nio: None zarr: None cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: 1.1.5 cfgrib: None iris: None bottleneck: None dask: 2.18.1 distributed: 2.18.0 matplotlib: None cartopy: None seaborn: None numbagg: None pint: None setuptools: 46.1.3.post20200325 pip: 20.1 conda: None pytest: 5.4.3 IPython: 7.13.0 sphinx: 3.0.3
xref: https://github.com/pangeo-data/pangeo-datastore/issues/109","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4139/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 1108564253,I_kwDOAMm_X85CE1kd,6176,Xarray versioning to switch to CalVer,2443309,closed,0,,,10,2022-01-19T21:09:45Z,2022-03-03T04:32:10Z,2022-01-31T18:35:27Z,MEMBER,,,,"Xarray is planning to switch to [Calendar versioning (calver)](https://calver.org/). This issue serves as a general announcement. The idea has come up in multiple developer meetings (#4001) and is part of a larger effort to increase our release cadence (#5927). Today's developer meeting included unanimous consent for the change. Other projects in Xarray's ecosystem have also made this change recently (e.g. https://github.com/dask/community/issues/100). While it is likely we will make this change in the next release or two, users and developers should feel free to voice objections here. The proposed calver implementation follows the same schema as the Dask project, that is, `YYYY.MM.X` (four-digit year, two-digit month, one-digit zero-indexed micro version). For example, the code block below provides a comparison of the current and future version tags: ```python In [1]: import xarray as xr # current In [2]: xr.__version__ Out[2]: '0.19.1' # proposed In [2]: xr.__version__ Out[2]: '2022.01.0' ``` cc @pydata/xarray ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6176/reactions"", ""total_count"": 6, ""+1"": 6, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 139064764,MDU6SXNzdWUxMzkwNjQ3NjQ=,787,Add Groupby and Rolling methods to docs,2443309,closed,0,,,2,2016-03-07T19:10:26Z,2021-11-08T19:51:00Z,2021-11-08T19:51:00Z,MEMBER,,,,"The injected `apply`/`reduce` methods for the `Groupby` and `Rolling` objects are not shown in the API documentation page. While there is obviously a fair bit of overlap between the similar `DataArray`/`Dataset` methods, it would help users to know what methods are available on the `Groupby` and `Rolling` objects if we explicitly listed them in the documentation. Suggestions on the best format to show these methods (e.g. `Rolling.mean`) are welcome. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/787/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 663968779,MDU6SXNzdWU2NjM5Njg3Nzk=,4253,[community] Backends refactor meeting,2443309,closed,0,,,13,2020-07-22T18:39:19Z,2021-03-11T20:42:33Z,2021-03-11T20:42:33Z,MEMBER,,,,"In today's dev call, we opted to schedule a separate meeting to discuss the backends refactor that BOpen (@alexamici and his team) is beginning to work on. This issue is meant to coordinate the scheduling of this meeting. To that end, I've created the following Doodle Poll to help choose a time: https://doodle.com/poll/4mtzxncka7gee4mq Anyone from @pydata/xarray should feel free to join if there is interest. At a minimum, I'm hoping to have @alexamici, @aurghs, @shoyer, and @rabernat there. _Please respond to the poll by COB tomorrow so I can quickly get the meeting on the books.
Thanks!_","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4253/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 110807626,MDU6SXNzdWUxMTA4MDc2MjY=,619,Improve plot aspect handling when using cartopy,2443309,open,0,,,5,2015-10-10T17:43:55Z,2021-01-03T16:17:29Z,,MEMBER,,,,"This applies to single plots and FacetGrids. The current plotting behavior when using a projection that changes the plot aspect is as follows: ``` Python from xray.tutorial import load_dataset ds = load_dataset('air_temperature') ax = plt.subplot(projection=ccrs.LambertConformal()) ds.air.isel(time=0).plot(transform=ccrs.PlateCarree()) ax.coastlines() ax.gridlines() ``` ![single](https://cloud.githubusercontent.com/assets/2443309/10412452/2fb3f428-6f3b-11e5-8ef8-7bda8bc33426.png) ``` Python fg = ds.air.isel(time=slice(0, 9)).plot(col='time', col_wrap=3, transform=ccrs.PlateCarree(), subplot_kws=dict(projection=ccrs.LambertConformal())) for ax in fg.axes.flat: ax.coastlines() ax.gridlines() ``` ![facet](https://cloud.githubusercontent.com/assets/2443309/10412453/45e81cec-6f3b-11e5-9dc5-7aba5a8053b8.png) There are two problems here; I think both are related to the aspect of the subplot: 1. In the single case, the subplot aspect is correct but the colorbar is not scaled appropriately 2. In the FacetGrid case, the subplot aspects are not correct but the colorbar is. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/619/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 140264913,MDU6SXNzdWUxNDAyNjQ5MTM=,792,ENH: Don't infer pcolormesh interval breaks for unevenly spaced coordinates,2443309,open,0,,,7,2016-03-11T19:06:30Z,2020-12-29T17:50:33Z,,MEMBER,,,,"Based on discussion in #781 and #782, it seems like a bad idea to infer (guess) the spacing of coordinates when they are unevenly spaced. As @ocefpaf points out: > guessing should be an active user choice, not the automatic behavior. So the options moving forward are to 1. never infer the interval breaks and be okay with pcolormesh and imshow producing dissimilar plots, or 2. only infer the interval breaks when the coordinates are evenly spaced. cc @clarkfitzg ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/792/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 302806158,MDU6SXNzdWUzMDI4MDYxNTg=,1970,API Design for Xarray Backends,2443309,open,0,,,9,2018-03-06T18:02:05Z,2020-10-06T06:15:56Z,,MEMBER,,,,"It has come time to formalize the API for Xarray backends. We now have the following backends implemented in xarray: | Backend | Read | Write | |----------------|------|-------| | netcdf4-python | x | x | | h5netcdf | x | x | | pydap | x | | | pynio | x | | | scipy | x | x | | rasterio* | x | | | zarr | x | x | \* currently does not inherit from `backends.AbstractDatastore` And there are conversations about adding additional backends, for example: - TileDB: https://github.com/pangeo-data/storage-benchmarks/issues/6 - PseudoNetCDF: #1905 However, as anyone who has worked on implementing or optimizing any of our current backends can attest, the existing DataStore API is not particularly user/developer friendly.
@shoyer asked me to open an issue to discuss what a more user-friendly backend API would look like, so that is what this issue will be. I have left out a thorough description of the current API because, well, I don't think it can be done in a succinct manner (that's the problem). Note that @shoyer started down an API refactor some time ago in #1087 but that effort has stalled, presumably because we don't have a well-defined set of development goals here. cc @pydata/xarray ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1970/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 287223508,MDU6SXNzdWUyODcyMjM1MDg=,1815,apply_ufunc(dask='parallelized') with multiple outputs,2443309,closed,0,,,17,2018-01-09T20:40:52Z,2020-08-19T06:57:55Z,2020-08-19T06:57:55Z,MEMBER,,,,"I have an application where I'd like to use `apply_ufunc` with dask on a function that requires multiple inputs and outputs. This was left as a TODO item in #1517. However, it's not clear to me, looking at the code, how this can be done given the current form of dask's atop. I'm hoping @shoyer has already thought of a clever solution here... #### Code Sample, a copy-pastable example if possible ```python def func(foo, bar): assert foo.shape == bar.shape spam = np.zeros_like(bar) spam2 = np.full_like(bar, 2) return spam, spam2 foo = xr.DataArray(np.zeros((10, 10))).chunk() bar = xr.DataArray(np.zeros((10, 10))).chunk() + 5 xrfunc = xr.apply_ufunc(func, foo, bar, output_core_dims=[[], []], dask='parallelized') ``` #### Problem description This currently raises a `NotImplementedError`. #### Expected Output Multiple dask arrays. In my example above, two dask arrays. #### Output of ``xr.show_versions()``
INSTALLED VERSIONS ------------------ commit: None python: 3.6.4.final.0 python-bits: 64 OS: Linux OS-release: 4.4.86+ machine: x86_64 processor: x86_64 byteorder: little LC_ALL: en_US.UTF-8 LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 xarray: 0.10.0+dev.c92020a pandas: 0.22.0 numpy: 1.13.3 scipy: 1.0.0 netCDF4: 1.3.1 h5netcdf: 0.5.0 Nio: None zarr: 2.2.0a2.dev176 bottleneck: 1.2.1 cyordereddict: None dask: 0.16.0 distributed: 1.20.2+36.g7387410 matplotlib: 2.1.1 cartopy: None seaborn: None setuptools: 38.4.0 pip: 9.0.1 conda: 4.3.29 pytest: 3.3.2 IPython: 6.2.1 sphinx: None
cc @mrocklin, @arbennett ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1815/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 264049503,MDU6SXNzdWUyNjQwNDk1MDM=,1614,Rules for propagating attrs and encoding,2443309,open,0,,,15,2017-10-09T22:56:02Z,2020-04-05T19:12:10Z,,MEMBER,,,,"We need to come up with some clear rules for when and how xarray should propagate metadata (attrs/encoding). This has come up routinely (e.g. #25, #138, #442, #688, #828, #988, #1009, #1271, #1297, #1586) and we don't have a clear direction as to when to keep/drop metadata. I'll take a first cut: | operation | attrs | encoding | status | |------------ |------------ |------------ |------------ | | reduce | drop | drop | | | arithmetic | drop | drop | implemented | | copy | keep | keep | | | concat | keep first | keep first | implemented | | slice | keep | drop | | | where | keep | keep | | cc @shoyer (following up on https://github.com/pydata/xarray/issues/1586#issuecomment-334954046) ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1614/reactions"", ""total_count"": 3, ""+1"": 3, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 318988669,MDU6SXNzdWUzMTg5ODg2Njk=,2094,Drop win-32 platform CI from appveyor matrix?,2443309,closed,0,,,3,2018-04-30T18:29:17Z,2020-03-30T20:30:58Z,2020-03-24T03:41:24Z,MEMBER,,,,"Conda-forge has dropped support for 32-bit Windows builds (https://github.com/conda-forge/cftime-feedstock/issues/2#issuecomment-385485144). Do we want to continue testing against this environment? The point becomes moot after #1876 gets wrapped up in ~7 months. xref: https://github.com/pydata/xarray/pull/1252 ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2094/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 578017585,MDU6SXNzdWU1NzgwMTc1ODU=,3851,Exposing Zarr backend internals as semi-public API,2443309,closed,0,,,3,2020-03-09T16:04:49Z,2020-03-27T22:37:26Z,2020-03-27T22:37:26Z,MEMBER,,,,"We recently built a prototype REST API for serving xarray datasets via a FastAPI application (see #3850 for more details). In the process of doing this, we needed to use [a few internal functions in Xarray's Zarr backend](https://github.com/jhamman/xpublish/blob/aff49ec09136a29b56167a1d627fcb3a13fa4d01/xpublish/rest.py#L13-L20): ```python from xarray.backends.zarr import ( _DIMENSION_KEY, _encode_zarr_attr_value, _extract_zarr_variable_encoding, encode_zarr_variable, ) from xarray.core.pycompat import dask_array_type from xarray.util.print_versions import get_sys_info, netcdf_and_hdf5_versions ``` Obviously, none of these imports are really meant for use outside of Xarray's backends, so I'd like to discuss how we may go about exposing these functions (or variables) as semi-public (advanced use) API features. Thoughts?
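One purely hypothetical shape this could take (the imported names are the existing internals listed above; the `zarr_api` module and the public aliases are invented for illustration): ```python # Hypothetical module xarray/backends/zarr_api.py: re-export selected # internals under documented, semi-public names. from xarray.backends.zarr import ( _extract_zarr_variable_encoding as extract_zarr_variable_encoding, encode_zarr_variable, ) __all__ = ['extract_zarr_variable_encoding', 'encode_zarr_variable'] ```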
cc @rabernat ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3851/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 197920258,MDU6SXNzdWUxOTc5MjAyNTg=,1188,Should we deprecate the compat and encoding constructor arguments?,2443309,closed,0,,,5,2016-12-28T21:41:26Z,2020-03-24T14:34:37Z,2020-03-24T14:34:37Z,MEMBER,,,,"In https://github.com/pydata/xarray/pull/1170#discussion_r94078121, @shoyer writes: > ...I would consider deprecating the encoding argument to DataArray instead. It would also make sense to get rid of the compat argument to Dataset. > > These extra arguments are not part of the fundamental xarray data model and thus are a little distracting, especially to new users. @pydata/xarray and others, what do we think about deprecating the `compat` argument to the `Dataset` constructor and the `encoding` argument to the `DataArray` (and `Dataset` via #1170)? ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1188/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 508743579,MDU6SXNzdWU1MDg3NDM1Nzk=,3413,Can apply_ufunc be used on arrays with different dimension sizes,2443309,closed,0,,,2,2019-10-17T22:04:00Z,2019-12-11T22:32:23Z,2019-12-11T22:32:23Z,MEMBER,,,,"We have an application where we want to use `apply_ufunc` to apply a function that takes two 1-D arrays and returns a scalar value (basically a reduction over the only axis). We start with two DataArrays that share all the same dimensions - except for the lengths of the dimension we'll be reducing along (`t` in this case): ```python def diff_mean(X, y): ''' a function that only works on 1d arrays that are different lengths''' assert X.ndim == 1, X.ndim assert y.ndim == 1, y.ndim assert len(X) != len(y), X return X.mean() - y.mean() X = np.random.random((10, 4, 5)) y = np.random.random((6, 4, 5)) Xda = xr.DataArray(X, dims=('t', 'x', 'y')).chunk({'t': -1, 'x': 2, 'y': 2}) yda = xr.DataArray(y, dims=('t', 'x', 'y')).chunk({'t': -1, 'x': 2, 'y': 2}) ``` Then, we'd like to use `apply_ufunc` to apply our function (e.g.
`diff_mean`): ```python out = xr.apply_ufunc( diff_mean, Xda, yda, vectorize=True, dask=""parallelized"", output_dtypes=[np.float], input_core_dims=[['t'], ['t']], ) ``` This fails with an error when aligning the `t` dimensions: ```python-traceback --------------------------------------------------------------------------- ValueError Traceback (most recent call last) in 9 dask=""parallelized"", 10 output_dtypes=[np.float], ---> 11 input_core_dims=[['t'], ['t']], 12 ) ~/miniconda3/envs/xarray-ml/lib/python3.7/site-packages/xarray/core/computation.py in apply_ufunc(func, input_core_dims, output_core_dims, exclude_dims, vectorize, join, dataset_join, dataset_fill_value, keep_attrs, kwargs, dask, output_dtypes, output_sizes, *args) 1042 join=join, 1043 exclude_dims=exclude_dims, -> 1044 keep_attrs=keep_attrs 1045 ) 1046 elif any(isinstance(a, Variable) for a in args): ~/miniconda3/envs/xarray-ml/lib/python3.7/site-packages/xarray/core/computation.py in apply_dataarray_vfunc(func, signature, join, exclude_dims, keep_attrs, *args) 222 if len(args) > 1: 223 args = deep_align( --> 224 args, join=join, copy=False, exclude=exclude_dims, raise_on_invalid=False 225 ) 226 ~/miniconda3/envs/xarray-ml/lib/python3.7/site-packages/xarray/core/alignment.py in deep_align(objects, join, copy, indexes, exclude, raise_on_invalid, fill_value) 403 indexes=indexes, 404 exclude=exclude, --> 405 fill_value=fill_value 406 ) 407 ~/miniconda3/envs/xarray-ml/lib/python3.7/site-packages/xarray/core/alignment.py in align(join, copy, indexes, exclude, fill_value, *objects) 321 ""arguments without labels along dimension %r cannot be "" 322 ""aligned because they have different dimension sizes: %r"" --> 323 % (dim, sizes) 324 ) 325 ValueError: arguments without labels along dimension 't' cannot be aligned because they have different dimension sizes: {10, 6} ``` https://nbviewer.jupyter.org/gist/jhamman/0e52d9bb29f679e26b0878c58bb813d2 I'm curious if this can be made to work with `apply_ufunc` or if we should pursue other options here. Advice and suggestions appreciated. #### Output of ``xr.show_versions()``
INSTALLED VERSIONS ------------------ commit: None python: 3.7.3 | packaged by conda-forge | (default, Jul 1 2019, 14:38:56) [Clang 4.0.1 (tags/RELEASE_401/final)] python-bits: 64 OS: Darwin OS-release: 18.7.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: None libnetcdf: None xarray: 0.14.0 pandas: 0.25.1 numpy: 1.17.1 scipy: 1.3.1 netCDF4: None pydap: None h5netcdf: None h5py: None Nio: None zarr: 2.3.2 cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2.3.0 distributed: 2.3.2 matplotlib: 3.1.1 cartopy: None seaborn: None numbagg: None setuptools: 41.2.0 pip: 19.2.3 conda: None pytest: 5.0.1 IPython: 7.8.0 sphinx: 2.2.0
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3413/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 132774456,MDU6SXNzdWUxMzI3NzQ0NTY=,757,Ordered Groupby Keys,2443309,open,0,,,6,2016-02-10T18:05:08Z,2019-11-20T16:12:41Z,,MEMBER,,,,"The current behavior of the xarray's `Groupby.groups` property provides a standard (unordered) dictionary. This is fine for most cases but leads to odd orderings in use cases like this one where I am using xarray's FacetGrid plotting: ``` Python plot_kwargs = dict(col='season', vmin=15, vmax=35, levels=12, extend='both') da_obs = ds_obs.SALT.isel(depth=0).groupby('time.season').mean('time') da_obs.plot(**plot_kwargs) ``` ![index](https://cloud.githubusercontent.com/assets/2443309/12956558/791d70fa-cfdd-11e5-80b6-8ca0bb564d36.png) _Note that MAM and JJA are out of order._ I think this could be easily fixed by using an `OrderedDict` in [`xarray.core.Groupby.groups`](https://github.com/pydata/xarray/blob/master/xarray/core/groupby.py#L162-L168). ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/757/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 280385592,MDU6SXNzdWUyODAzODU1OTI=,1769,Extend to_masked_array to support dask MaskedArrays,2443309,open,0,,,5,2017-12-08T06:22:56Z,2019-11-08T17:19:44Z,,MEMBER,,,,"Following @shoyer's [comment](https://github.com/pydata/xarray/pull/1750#discussion_r155692312), it will be pretty straightforward to support creating dask masked arrays within the `to_masked_array` method. My thought would be that data arrays use dask, would be converted to dask masked arrays, rather than to numpy arrays as they are currently. Two kinks: 1) The dask masked array feature requires dask 0.15.3 or newer. 2) I'm not sure how to test if an object is a `dask.array.ma.MaskedArray` (Dask doesn't have a `MaskedArray` class). @mrocklin - thoughts?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1769/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 503700649,MDU6SXNzdWU1MDM3MDA2NDk=,3380,[Release] 0.14,2443309,closed,0,,,19,2019-10-07T21:28:28Z,2019-10-15T01:08:11Z,2019-10-14T21:26:59Z,MEMBER,,,,"#3358 is going to make some fairly major changes to the minimum supported versions of required and optional dependencies. We also have a few bug fixes that have landed since releasing 0.13 that would be good to get out. From what I can tell, the following pending PRs are close enough to get into this release. - [ ] ~tests for arrays with units #3238~ - [x] map_blocks #3276 - [x] Rolling minimum dependency versions policy #3358 - [x] Remove all OrderedDict's (#3389) - [x] Speed up isel and \_\_getitem\_\_ #3375 - [x] Fix concat bug when concatenating unlabeled dimensions. #3362 - [ ] ~Add hypothesis test for netCDF4 roundtrip #3283~ - [x] Fix groupby reduce for dataarray #3338 - [x] Need a fix for https://github.com/pydata/xarray/issues/3377 Am I missing anything else that needs to get in? I think we should aim to wrap this release up soon (this week). I can volunteer to go through the release steps once we're ready. 
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3380/reactions"", ""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 297227247,MDU6SXNzdWUyOTcyMjcyNDc=,1910,Pynio tests are being skipped on TravisCI,2443309,closed,0,,,3,2018-02-14T20:03:31Z,2019-02-07T00:08:17Z,2019-02-07T00:08:17Z,MEMBER,,,,"#### Problem description Currently on Travis, the Pynio tests are being skipped. The `py27-cdat+iris+pynio` is supposed to be running tests for each of these but it is not. https://travis-ci.org/pydata/xarray/jobs/341426116#L2429-L2518 I can't look at this right now in depth but I'm wondering if this is related to #1531. reported by @WeatherGod","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1910/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 302930480,MDU6SXNzdWUzMDI5MzA0ODA=,1971,Should we be testing against multiple dask schedulers?,2443309,closed,0,,,5,2018-03-07T01:25:37Z,2019-01-13T20:58:21Z,2019-01-13T20:58:20Z,MEMBER,,,,"Almost all of our unit tests are against the dask's default scheduler (usually dask.threaded). While it is true that beauty of dask is that one can separate the scheduler from the logical implementation, there are a few idiosyncrasies to consider, particularly in xarray's backends. To that end, we have a few tests covering the integration of the distributed scheduler with xarray's backends but the test coverage is not particularly complete. If nothing more, I think it is worth considering tests that use the threaded, multiprocessing, and distributed schedulers for a larger subset of the backends tests (those that use dask). *Note, I'm bringing this up because I'm seeing some failing tests in #1793 that are unrelated to my code change but do appear to be related to dask and possibly a different different default scheduler ([example failure](https://travis-ci.org/pydata/xarray/jobs/349955403#L6606-L6764)).*","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1971/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 293414745,MDU6SXNzdWUyOTM0MTQ3NDU=,1876,DEP: drop Python 2.7 support,2443309,closed,0,,,2,2018-02-01T06:11:07Z,2019-01-02T04:52:04Z,2019-01-02T04:52:04Z,MEMBER,,,,"The timeline for dropping Python 2.7 support for new Xarray releases is the end of 2018. This issue can be used to track the necessary documentation and code changes to make that happen. xref: #1830 ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1876/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 323765896,MDU6SXNzdWUzMjM3NjU4OTY=,2142,add CFTimeIndex enabled date_range function,2443309,closed,0,,,1,2018-05-16T20:02:08Z,2018-09-19T20:24:40Z,2018-09-19T20:24:40Z,MEMBER,,,,"Pandas' [`date_range`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.date_range.html) function is a fast and flexible way to create `DateTimeIndex` objects. Now that we have a functioning `CFTimeIndex`, it would be great to add a version of the `date_range` function that supports other calendars and dates out of range for Pandas. 
#### Code Sample and expected output ```python In [1]: import xarray as xr In [2]: xr.date_range('2000-02-26', '2000-03-02') Out[2]: DatetimeIndex(['2000-02-26', '2000-02-27', '2000-02-28', '2000-02-29', '2000-03-01', '2000-03-02'], dtype='datetime64[ns]', freq='D') In [3]: xr.date_range('2000-02-26', '2000-03-02', calendar='noleap') Out[3]: CFTimeIndex(['2000-02-26', '2000-02-27', '2000-02-28', '2000-03-01', '2000-03-02'], dtype='cftime.datetime', freq='D') ``` ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2142/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 288465429,MDU6SXNzdWUyODg0NjU0Mjk=,1829,Drop support for Python 3.4,2443309,closed,0,,2856429,13,2018-01-15T02:38:19Z,2018-07-08T00:55:32Z,2018-07-08T00:55:32Z,MEMBER,,,,"Python 3.7-final is due out in June ([PEP 537](https://www.python.org/dev/peps/pep-0537/)). When do we want to deprecate 3.4, and when should we drop support altogether? @maxim-lian brought this up in a PR he's working on: https://github.com/pydata/xarray/pull/1828#issuecomment-357562144. For reference, we dropped Python 3.3 in #1175 (12/20/2016).","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1829/reactions"", ""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 327893262,MDU6SXNzdWUzMjc4OTMyNjI=,2203,Update minimum version of dask,2443309,closed,0,,,6,2018-05-30T20:47:57Z,2018-07-08T00:55:32Z,2018-07-08T00:55:32Z,MEMBER,,,,"Xarray currently states that it supports dask version 0.9 and later. However, 1) I don't think this is true, as my quick test shows that some of our tests fail using dask 0.9, and 2) we have a growing number of tests that are being skipped for older dask versions: ``` $ grep -irn ""dask.__version__"" xarray/tests/*py xarray/tests/__init__.py:90: if LooseVersion(dask.__version__) < '0.18': xarray/tests/test_computation.py:755: if LooseVersion(dask.__version__) < LooseVersion('0.17.3'): xarray/tests/test_computation.py:841: if not use_dask or LooseVersion(dask.__version__) > LooseVersion('0.17.4'): xarray/tests/test_dask.py:211: @pytest.mark.skipif(LooseVersion(dask.__version__) <= '0.15.4', xarray/tests/test_dask.py:223: @pytest.mark.skipif(LooseVersion(dask.__version__) <= '0.15.4', xarray/tests/test_dask.py:284: @pytest.mark.skipif(LooseVersion(dask.__version__) <= '0.15.4', xarray/tests/test_dask.py:296: @pytest.mark.skipif(LooseVersion(dask.__version__) <= '0.15.4', xarray/tests/test_dask.py:387: if LooseVersion(dask.__version__) == LooseVersion('0.15.3'): xarray/tests/test_dask.py:784: pytest.mark.skipif(LooseVersion(dask.__version__) <= '0.15.4', xarray/tests/test_dask.py:802: pytest.mark.skipif(LooseVersion(dask.__version__) <= '0.15.4', xarray/tests/test_dask.py:818:@pytest.mark.skipif(LooseVersion(dask.__version__) <= '0.15.4', xarray/tests/test_variable.py:1664: if LooseVersion(dask.__version__) <= LooseVersion('0.15.1'): xarray/tests/test_variable.py:1670: if LooseVersion(dask.__version__) <= LooseVersion('0.15.1'): ``` I'd like to see xarray bump the minimum version number of dask to something around 0.15.4 (Oct. 2017) or 0.16 (Nov. 2017).
cc @mrocklin, @pydata/xarray ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2203/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 327875183,MDU6SXNzdWUzMjc4NzUxODM=,2200,DEPS: drop numpy < 1.12,2443309,closed,0,,,0,2018-05-30T19:52:40Z,2018-07-08T00:55:31Z,2018-07-08T00:55:31Z,MEMBER,,,,"Pandas is dropping Numpy 1.11 and earlier in their 0.24 release. It is probably easiest to follow suit with xarray. xref: https://github.com/pandas-dev/pandas/issues/21242","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2200/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 331415995,MDU6SXNzdWUzMzE0MTU5OTU=,2225,Zarr Backend: check for non-uniform chunks is too strict,2443309,closed,0,,,3,2018-06-12T02:36:05Z,2018-06-13T05:51:36Z,2018-06-13T05:51:36Z,MEMBER,,,,"I think the following block of code is more strict than either dask or zarr requires: https://github.com/pydata/xarray/blob/6c3abedf906482111b06207b9016ea8493c42713/xarray/backends/zarr.py#L80-L89 It should be possible to have uneven chunks in the last position of multiple dimensions in a zarr dataset. #### Code Sample, a copy-pastable example if possible ```python In [1]: import xarray as xr In [2]: import dask.array as dsa In [3]: da = xr.DataArray(dsa.random.random((8, 7, 11), chunks=(3, 3, 3)), dims=('x', 'y', 't')) In [4]: da Out[4]: dask.array Dimensions without coordinates: x, y, t In [5]: da.data.chunks Out[5]: ((3, 3, 2), (3, 3, 1), (3, 3, 3, 2)) In [6]: da.to_dataset('varname').to_zarr('/Users/jhamman/workdir/test_chunks.zarr') /Users/jhamman/anaconda/bin/ipython:1: FutureWarning: the order of the arguments on DataArray.to_dataset has changed; you now need to supply ``name`` as a keyword argument #!/Users/jhamman/anaconda/bin/python --------------------------------------------------------------------------- ValueError Traceback (most recent call last) in () ----> 1 da.to_dataset('varname').to_zarr('/Users/jhamman/workdir/test_chunks.zarr') ~/anaconda/lib/python3.6/site-packages/xarray/core/dataset.py in to_zarr(self, store, mode, synchronizer, group, encoding, compute) 1185 from ..backends.api import to_zarr 1186 return to_zarr(self, store=store, mode=mode, synchronizer=synchronizer, -> 1187 group=group, encoding=encoding, compute=compute) 1188 1189 def __unicode__(self): ~/anaconda/lib/python3.6/site-packages/xarray/backends/api.py in to_zarr(dataset, store, mode, synchronizer, group, encoding, compute) 856 # I think zarr stores should always be sync'd immediately 857 # TODO: figure out how to properly handle unlimited_dims --> 858 dataset.dump_to_store(store, sync=True, encoding=encoding, compute=compute) 859 860 if not compute: ~/anaconda/lib/python3.6/site-packages/xarray/core/dataset.py in dump_to_store(self, store, encoder, sync, encoding, unlimited_dims, compute) 1073 1074 store.store(variables, attrs, check_encoding, -> 1075 unlimited_dims=unlimited_dims) 1076 if sync: 1077 store.sync(compute=compute) ~/anaconda/lib/python3.6/site-packages/xarray/backends/zarr.py in store(self, variables, attributes, *args, **kwargs) 341 def store(self, variables, attributes, *args, **kwargs): 342 AbstractWritableDataStore.store(self, variables, attributes, --> 343 *args, **kwargs) 344 345 def sync(self, compute=True): 
~/anaconda/lib/python3.6/site-packages/xarray/backends/common.py in store(self, variables, attributes, check_encoding_set, unlimited_dims) 366 self.set_dimensions(variables, unlimited_dims=unlimited_dims) 367 self.set_variables(variables, check_encoding_set, --> 368 unlimited_dims=unlimited_dims) 369 370 def set_attributes(self, attributes): ~/anaconda/lib/python3.6/site-packages/xarray/backends/common.py in set_variables(self, variables, check_encoding_set, unlimited_dims) 403 check = vn in check_encoding_set 404 target, source = self.prepare_variable( --> 405 name, v, check, unlimited_dims=unlimited_dims) 406 407 self.writer.add(source, target) ~/anaconda/lib/python3.6/site-packages/xarray/backends/zarr.py in prepare_variable(self, name, variable, check_encoding, unlimited_dims) 325 326 encoding = _extract_zarr_variable_encoding( --> 327 variable, raise_on_invalid=check_encoding) 328 329 encoded_attrs = OrderedDict() ~/anaconda/lib/python3.6/site-packages/xarray/backends/zarr.py in _extract_zarr_variable_encoding(variable, raise_on_invalid) 181 182 chunks = _determine_zarr_chunks(encoding.get('chunks'), variable.chunks, --> 183 variable.ndim) 184 encoding['chunks'] = chunks 185 return encoding ~/anaconda/lib/python3.6/site-packages/xarray/backends/zarr.py in _determine_zarr_chunks(enc_chunks, var_chunks, ndim) 87 ""Zarr requires uniform chunk sizes excpet for final chunk."" 88 "" Variable %r has incompatible chunks. Consider "" ---> 89 ""rechunking using `chunk()`."" % (var_chunks,)) 90 # last chunk is allowed to be smaller 91 last_var_chunk = all_var_chunks[-1] ValueError: Zarr requires uniform chunk sizes excpet for final chunk. Variable ((3, 3, 2), (3, 3, 1), (3, 3, 3, 2)) has incompatible chunks. Consider rechunking using `chunk()`. ``` #### Problem description [this should explain **why** the current behavior is a problem and why the expected output is a better solution.] #### Expected Output IIUC, Zarr allows multiple dims to have uneven chunks, so long as they are all in the last position: ```Python In [9]: import zarr In [10]: z = zarr.zeros((8, 7, 11), chunks=(3, 3, 3), dtype='i4') In [11]: z.chunks Out[11]: (3, 3, 3) ``` #### Output of ``xr.show_versions()``
INSTALLED VERSIONS ------------------ commit: None python: 3.6.5.final.0 python-bits: 64 OS: Darwin OS-release: 17.5.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 xarray: 0.10.7 pandas: 0.22.0 numpy: 1.14.3 scipy: 1.1.0 netCDF4: 1.3.1 h5netcdf: 0.5.1 h5py: 2.7.1 Nio: None zarr: 2.2.0 bottleneck: 1.2.1 cyordereddict: None dask: 0.17.2 distributed: 1.21.6 matplotlib: 2.2.2 cartopy: 0.16.0 seaborn: 0.8.1 setuptools: 39.0.1 pip: 9.0.3 conda: 4.5.4 pytest: 3.5.1 IPython: 6.3.1 sphinx: 1.7.4
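For reference, a minimal sketch of the relaxed validation being argued for (hypothetical helper, not the actual patch): interior chunks must be uniform, but the final chunk of every dimension may be smaller, matching zarr's own storage model.

```python
def zarr_compatible_chunks(var_chunks):
    # var_chunks is a tuple of per-dimension chunk tuples, as reported
    # by dask, e.g. ((3, 3, 2), (3, 3, 1), (3, 3, 3, 2))
    for dim_chunks in var_chunks:
        body, last = dim_chunks[:-1], dim_chunks[-1]
        if len(set(body)) > 1:
            return False  # interior chunks must be uniform
        if body and last > body[0]:
            return False  # the final chunk may only be smaller or equal
    return True

# the chunking rejected in the traceback above should be accepted
assert zarr_compatible_chunks(((3, 3, 2), (3, 3, 1), (3, 3, 3, 2)))
```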
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2225/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 322445312,MDU6SXNzdWUzMjI0NDUzMTI=,2121,rasterio backend should use DataStorePickleMixin (or something similar),2443309,closed,0,,,2,2018-05-11T21:51:59Z,2018-06-07T18:02:56Z,2018-06-07T18:02:56Z,MEMBER,,,,"#### Code Sample, a copy-pastable example if possible ```Python In [1]: import xarray as xr In [2]: ds = xr.open_rasterio('RGB.byte.tif') In [3]: ds Out[3]: [1703814 values with dtype=uint8] Coordinates: * band (band) int64 1 2 3 * y (y) float64 2.827e+06 2.826e+06 2.826e+06 2.826e+06 2.826e+06 ... * x (x) float64 1.021e+05 1.024e+05 1.027e+05 1.03e+05 1.033e+05 ... Attributes: transform: (101985.0, 300.0379266750948, 0.0, 2826915.0, 0.0, -300.0417... crs: +init=epsg:32618 res: (300.0379266750948, 300.041782729805) is_tiled: 0 nodatavals: (0.0, 0.0, 0.0) In [4]: import pickle In [5]: pickle.dumps(ds) --------------------------------------------------------------------------- TypeError Traceback (most recent call last) in () ----> 1 pickle.dumps(ds) TypeError: can't pickle rasterio._io.RasterReader objects ``` #### Problem description Originally reported by @rsignell-usgs in https://github.com/pangeo-data/pangeo/issues/249#issuecomment-388445370, the rasterio backend is not pickle-able. This obviously causes problems when using dask-distributed. We probably need to use `DataStorePickleMixin` or something similar on rasterio datasets to allow multiple readers of the same dataset. #### Expected Output ```python pickle.dumps(ds) ``` returns a pickled dataset. #### Output of ``xr.show_versions()``
xr.show_versions() /Users/jhamman/anaconda/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`. from ._conv import register_converters as _register_converters INSTALLED VERSIONS ------------------ commit: None python: 3.6.5.final.0 python-bits: 64 OS: Darwin OS-release: 17.5.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 xarray: 0.10.3 pandas: 0.22.0 numpy: 1.14.2 scipy: 1.0.1 netCDF4: 1.3.1 h5netcdf: 0.5.1 h5py: 2.7.1 Nio: None zarr: None bottleneck: 1.2.1 cyordereddict: None dask: 0.17.2 distributed: 1.21.6 matplotlib: 2.2.2 cartopy: 0.16.0 seaborn: 0.8.1 setuptools: 39.0.1 pip: 9.0.3 conda: 4.5.1 pytest: 3.5.1 IPython: 6.3.1 sphinx: 1.7.4
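For illustration, a minimal sketch of the reopen-on-unpickle pattern that `DataStorePickleMixin` provides for other backends (a toy class, not xarray's implementation):

```python
import rasterio

class PicklableRasterioReader:
    # wrap a rasterio dataset handle so it can cross process boundaries
    def __init__(self, filename):
        self.filename = filename
        self.riods = rasterio.open(filename)

    def __getstate__(self):
        # the open file handle itself cannot be pickled; ship only the path
        return {'filename': self.filename}

    def __setstate__(self, state):
        # reopen the file on the receiving side (e.g. a dask worker)
        self.__init__(state['filename'])
```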
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2121/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 304201107,MDU6SXNzdWUzMDQyMDExMDc=,1981,use dask to open datasets in parallel,2443309,closed,0,,,5,2018-03-11T22:33:52Z,2018-04-20T12:04:23Z,2018-04-20T12:04:23Z,MEMBER,,,,"#### Code Sample, a copy-pastable example if possible ```python xr.open_mfdataset('path/to/many/files*.nc', method='parallel') ``` #### Problem description We have many issues describing the less than stelar performance of open_mfdataset (e.g. #511, #893, #1385, #1788, #1823). The problem can be broken into three pieces: 1) open each file, 2) decode/preprocess each datasets, and 3) merge/combine/concat the collection of datasets. We can perform (1) and (2) in parallel (performance improvements to (3) would be a separate task). Lately, I'm finding that for large numbers of files, it can take many seconds to many minutes just to open all the files in a multi-file dataset of mine. I'm proposing that we use something like `dask.bag` to parallelize steps (1) and (2). I've played around with this a bit and it ""works"" almost right out of the box, provided you are using the ""autoclose=True"" option. A concrete example: We could change the line: ```Python datasets = [open_dataset(p, **open_kwargs) for p in paths] ``` to ```Python import dask.bag as db paths_bag = db.from_sequence(paths) datasets = paths_bag.map(open_dataset, **open_kwargs).compute() ``` I'm curious what others think of this idea and what the potential downfalls may be. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1981/reactions"", ""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 295621576,MDU6SXNzdWUyOTU2MjE1NzY=,1897,Vectorized indexing with cache=False,2443309,closed,0,,,5,2018-02-08T18:38:18Z,2018-03-06T22:00:57Z,2018-03-06T22:00:57Z,MEMBER,,,,"#### Code Sample, a copy-pastable example if possible ```python import numpy as np import xarray as xr n_times = 4; n_lats = 10; n_lons = 15 n_points = 4 ds = xr.Dataset({'test_var': (['time', 'latitude', 'longitude'], np.random.random((n_times, n_lats, n_lons)))}) ds.to_netcdf('test.nc') rand_lons = xr.Variable('points', np.random.randint(0, high=n_lons, size=n_points)) rand_lats = xr.Variable('points', np.random.randint(0, high=n_lats, size=n_points)) ds = xr.open_dataset('test.nc', cache=False) points = ds['test_var'][:, rand_lats, rand_lons] ``` yields: ``` --------------------------------------------------------------------------- NotImplementedError Traceback (most recent call last) in () 12 13 ds = xr.open_dataset('test.nc', cache=False) ---> 14 points = ds['test_var'][:, rand_lats, rand_lons] ~/anaconda/envs/pangeo/lib/python3.6/site-packages/xarray/core/dataarray.py in __getitem__(self, key) 478 else: 479 # xarray-style array indexing --> 480 return self.isel(**self._item_key_to_dict(key)) 481 482 def __setitem__(self, key, value): ~/anaconda/envs/pangeo/lib/python3.6/site-packages/xarray/core/dataarray.py in isel(self, drop, **indexers) 759 DataArray.sel 760 """""" --> 761 ds = self._to_temp_dataset().isel(drop=drop, **indexers) 762 return self._from_temp_dataset(ds) 763 ~/anaconda/envs/pangeo/lib/python3.6/site-packages/xarray/core/dataset.py in isel(self, drop, **indexers) 1390 for name, var in iteritems(self._variables): 1391 var_indexers = 
{k: v for k, v in indexers_list if k in var.dims} -> 1392 new_var = var.isel(**var_indexers) 1393 if not (drop and name in var_indexers): 1394 variables[name] = new_var ~/anaconda/envs/pangeo/lib/python3.6/site-packages/xarray/core/variable.py in isel(self, **indexers) 851 if dim in indexers: 852 key[i] = indexers[dim] --> 853 return self[tuple(key)] 854 855 def squeeze(self, dim=None): ~/anaconda/envs/pangeo/lib/python3.6/site-packages/xarray/core/variable.py in __getitem__(self, key) 620 """""" 621 dims, indexer, new_order = self._broadcast_indexes(key) --> 622 data = as_indexable(self._data)[indexer] 623 if new_order: 624 data = np.moveaxis(data, range(len(new_order)), new_order) ~/anaconda/envs/pangeo/lib/python3.6/site-packages/xarray/core/indexing.py in __getitem__(self, key) 554 555 def __getitem__(self, key): --> 556 return type(self)(_wrap_numpy_scalars(self.array[key])) 557 558 def __setitem__(self, key, value): ~/anaconda/envs/pangeo/lib/python3.6/site-packages/xarray/core/indexing.py in __getitem__(self, indexer) 521 522 def __getitem__(self, indexer): --> 523 return type(self)(self.array, self._updated_key(indexer)) 524 525 def __setitem__(self, key, value): ~/anaconda/envs/pangeo/lib/python3.6/site-packages/xarray/core/indexing.py in _updated_key(self, new_key) 491 'Vectorized indexing for {} is not implemented. Load your ' 492 'data first with .load() or .compute(), or disable caching by ' --> 493 'setting cache=False in open_dataset.'.format(type(self))) 494 495 iter_new_key = iter(expanded_indexer(new_key.tuple, self.ndim)) NotImplementedError: Vectorized indexing for is not implemented. Load your data first with .load() or .compute(), or disable caching by setting cache=False in open_dataset. ``` #### Problem description Raising a `NotImplementedError` here is fine but it instructs the user to ""disable caching by setting cache=False in open_dataset"" which I've already done. So my questions are: 1) should we expect this to work, and 2) if not, should the error message stop suggesting an option that is already in use? #### Expected Output Ideally, we can get the same behavior as: ```python ds = xr.open_dataset('test2.nc', cache=False).load() points = ds['test_var'][:, rand_lats, rand_lons] array([[0.939469, 0.406885, 0.939469, 0.759075], [0.470116, 0.585546, 0.470116, 0.37833 ], [0.274321, 0.648218, 0.274321, 0.383391], [0.754121, 0.078878, 0.754121, 0.903788]]) Dimensions without coordinates: time, points ``` without needing to use `.load()` #### Output of ``xr.show_versions()``
INSTALLED VERSIONS ------------------ commit: None python: 3.6.4.final.0 python-bits: 64 OS: Linux OS-release: 3.10.0-693.5.2.el7.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 xarray: 0.10.0+dev55.g1d32399 pandas: 0.22.0 numpy: 1.14.0 scipy: 1.0.0 netCDF4: 1.3.1 h5netcdf: 0.5.0 h5py: 2.7.1 Nio: None zarr: None bottleneck: 1.2.1 cyordereddict: None dask: 0.16.1 distributed: 1.20.2 matplotlib: 2.1.2 cartopy: 0.15.1 seaborn: 0.8.1 setuptools: 38.4.0 pip: 9.0.1 conda: None pytest: 3.4.0 IPython: 6.2.1 sphinx: None
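Until backends grow vectorized-indexing support, one hedged workaround (reusing the variable names from the snippet above) is to load only the bounding box with basic indexing, then apply the vectorized indexer in memory:

```python
# slice out the hyperslab that covers all requested points, then load it
lat_lo = int(rand_lats.values.min()); lat_hi = int(rand_lats.values.max()) + 1
lon_lo = int(rand_lons.values.min()); lon_hi = int(rand_lons.values.max()) + 1
sub = ds['test_var'][:, lat_lo:lat_hi, lon_lo:lon_hi].load()

# vectorized indexing works against the in-memory subset
points = sub[:, rand_lats - lat_lo, rand_lons - lon_lo]
```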
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1897/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 287852184,MDU6SXNzdWUyODc4NTIxODQ=,1821,v0.10.1 Release,2443309,closed,0,,3008859,11,2018-01-11T16:56:08Z,2018-02-26T23:20:45Z,2018-02-26T01:48:32Z,MEMBER,,,,"We're close to a minor/bug-fix release (0.10.1). What do we need to get done before that can happen? - [x] #1800 Performance improvements to Zarr (@jhamman) - [ ] #1793 Fix for to_netcdf writes with dask-distributed (@jhamman, could use help) - [x] #1819 Normalisation for RGB imshow Help wanted / bugs that no-one is working on: - [ ] #1792 Comparison to masked numpy arrays - [ ] #1764 groupby_bins fails for empty bins What else? ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1821/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 113497063,MDU6SXNzdWUxMTM0OTcwNjM=,640,Use pytest to simplify unit tests,2443309,closed,0,,,2,2015-10-27T03:06:48Z,2018-02-05T21:00:02Z,2018-02-05T21:00:02Z,MEMBER,,,,"xray's unit testing system uses Python's standard `unittest` framework. [pytest](http://pytest.org/latest/) offers a more flexible framework requiring less boilerplate code. I recently (#638) introduced pytest into xray's CI builds. This issue proposes incrementally migrating and simplifying xray's unit testing framework to pytest. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/640/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 288466108,MDU6SXNzdWUyODg0NjYxMDg=,1830,Drop support for Python 2,2443309,closed,0,,,7,2018-01-15T02:44:15Z,2018-02-01T06:04:08Z,2018-02-01T06:04:08Z,MEMBER,,,,"When do we want to drop Python 2 support for Xarray. For reference, Pandas has a stated drop date for Python 2 of the end of 2018 (this year) and Numpy is slightly later and includes an incremental depreciation, final on Jan. 1, 2020. We may also consider signing this pledge to help make it clear when/why we're dropping Python 2 support: http://www.python3statement.org/ xref: https://github.com/pandas-dev/pandas/issues/18894, https://github.com/numpy/numpy/pull/10006, https://github.com/python3statement/python3statement.github.io/issues/11 ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1830/reactions"", ""total_count"": 5, ""+1"": 4, ""-1"": 0, ""laugh"": 0, ""hooray"": 1, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 287186057,MDU6SXNzdWUyODcxODYwNTc=,1813,Test Failure: test_datetime_line_plot,2443309,closed,0,,,3,2018-01-09T18:29:35Z,2018-01-10T07:13:53Z,2018-01-10T07:13:53Z,MEMBER,,,,"We're getting a single test failure in the plot tests on master ([link to travis failure](https://travis-ci.org/pydata/xarray/jobs/326640013#L5176). I haven't been able to reproduce this locally yet so I'm just going to post here to see if anyone has any ideas. 
#### Code Sample ```python ___________________ TestDatetimePlot.test_datetime_line_plot ___________________ self = def test_datetime_line_plot(self): # test if line plot raises no Exception > self.darray.plot.line() xarray/tests/test_plot.py:1333: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ xarray/plot/plot.py:328: in line return line(self._da, *args, **kwargs) xarray/plot/plot.py:223: in line _ensure_plottable(x) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ args = ( array([datetime.datetime(2017, 1, 1, 0, 0), datetime.datetime(2017, 2, 1,... 12, 1, 0, 0)], dtype=object) Coordinates: * time (time) object 2017-01-01 2017-02-01 2017-03-01 2017-04-01 ...,) numpy_types = [, , , ] other_types = [] x = array([datetime.datetime(2017, 1, 1, 0, 0), datetime.datetime(2017, 2, 1, ...7, 12, 1, 0, 0)], dtype=object) Coordinates: * time (time) object 2017-01-01 2017-02-01 2017-03-01 2017-04-01 ... def _ensure_plottable(*args): """""" Raise exception if there is anything in args that can't be plotted on an axis. """""" numpy_types = [np.floating, np.integer, np.timedelta64, np.datetime64] other_types = [datetime] for x in args: if not (_valid_numpy_subdtype(np.array(x), numpy_types) or _valid_other_type(np.array(x), other_types)): > raise TypeError('Plotting requires coordinates to be numeric ' 'or dates.') E TypeError: Plotting requires coordinates to be numeric or dates. xarray/plot/plot.py:57: TypeError ``` #### Expected Output *This test was previously passing* #### Output of ``xr.show_versions()`` https://travis-ci.org/pydata/xarray/jobs/326640013#L1262 ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1813/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 265056503,MDU6SXNzdWUyNjUwNTY1MDM=,1631,Resample / upsample behavior diverges from pandas ,2443309,closed,0,,,5,2017-10-12T19:22:44Z,2017-12-30T06:21:42Z,2017-12-30T06:21:42Z,MEMBER,,,,"I've found a few issues where xarray's new resample / upsample functionality is diverging from Pandas. I think they are mostly surrounding how NaNs are treated. Thoughts from @shoyer, @darothen and others. Gist with all the juicy details: https://gist.github.com/jhamman/354f0e5ff32a39550ffd25800e7214fc#file-xarray_resample-ipynb xref: #1608, #1272","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1631/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 283984555,MDU6SXNzdWUyODM5ODQ1NTU=,1798,BUG: set_variables in backends.commons loads target dataset,2443309,closed,0,,,1,2017-12-21T19:43:05Z,2017-12-28T05:40:17Z,2017-12-28T05:40:17Z,MEMBER,,,,"#### Problem description In #1609 we (I) implemented a fix for appending to datasets with existing variables. In doing so, it looks like I added a regression wherein the `variables` property on the `AbstractWritableDataStore` is repeatedly queried. This property calls `.load()` on the underlying dataset. This was discovered while diagnosing some problems with the zarr backend (#1770, https://github.com/pangeo-data/pangeo/issues/48#issuecomment-353223737). I have a potential fix for this that I will post once the tests pass. cc @rabernat, @mrocklin #### Output of ``xr.show_versions()``
INSTALLED VERSIONS ------------------ commit: 20f957db105a9348b0f7d2dac076c17c31cbccee python: 3.6.0.final.0 python-bits: 64 OS: Darwin OS-release: 17.3.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 xarray: 0.10.0+dev18.g4a9c1e3 pandas: 0.21.0 numpy: 1.13.3 scipy: 0.19.1 netCDF4: 1.3.0 h5netcdf: 0.5.0 Nio: None zarr: 2.1.4 bottleneck: 1.2.1 cyordereddict: None dask: 0.15.4 distributed: 1.19.3 matplotlib: 2.0.2 cartopy: 0.15.1 seaborn: 0.8.1 setuptools: 33.1.0.post20170122 pip: 9.0.1 conda: None pytest: 3.2.3 IPython: 5.2.2 sphinx: 1.6.3
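A toy reproduction of the pattern (not xarray's code) to make the failure mode concrete: a property whose getter loads data must be evaluated once, not once per variable.

```python
class ToyStore:
    @property
    def variables(self):
        print('load triggered')  # stand-in for Dataset.load()
        return {'a': 1, 'b': 2}

store = ToyStore()

# anti-pattern: the loading property is re-evaluated on every iteration
for name in ['a', 'b', 'c']:
    _ = name in store.variables  # prints three times

# fix: evaluate the expensive property once and reuse the result
existing = set(store.variables)  # prints once
for name in ['a', 'b', 'c']:
    _ = name in existing
```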
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1798/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 279958650,MDU6SXNzdWUyNzk5NTg2NTA=,1766,Pandas has deprecated the TimeGrouper,2443309,closed,0,,,0,2017-12-07T00:40:11Z,2017-12-07T01:33:29Z,2017-12-07T01:33:29Z,MEMBER,,,,"#### Code Sample, a copy-pastable example if possible ```python da.resample(time='MS').sum('time') ``` #### Problem description Pandas has deprecated the `TimeGrouper` class (https://github.com/pandas-dev/pandas/issues/16747) and that warning has started popping out during xarray resample operations. We can make this go away quite easily. (I'll submit a PR shortly). #### Output of ``xr.show_versions()``
In [2]: xr.show_versions() INSTALLED VERSIONS ------------------ commit: None python: 3.6.0.final.0 python-bits: 64 OS: Darwin OS-release: 16.7.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 xarray: 0.9.6-75-g246c352 pandas: 0.21.0 numpy: 1.13.3 scipy: 0.19.1 netCDF4: 1.3.0 h5netcdf: 0.5.0 Nio: None bottleneck: 1.2.1 cyordereddict: None dask: 0.15.4 matplotlib: 2.0.2 cartopy: 0.15.1 seaborn: 0.8.1 setuptools: 33.1.0.post20170122 pip: 9.0.1 conda: None pytest: 3.2.3 IPython: 5.2.2 sphinx: 1.6.3
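The swap itself is small; a hedged sketch of the replacement (`pd.Grouper` accepts the same `freq` argument):

```python
import pandas as pd

# deprecated spelling that now emits a warning:
#   grouper = pd.TimeGrouper(freq='MS')
# replacement with equivalent time-frequency grouping semantics:
grouper = pd.Grouper(freq='MS')
```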
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1766/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 253463226,MDU6SXNzdWUyNTM0NjMyMjY=,1535,v0.10 Release,2443309,closed,0,,,18,2017-08-28T21:31:43Z,2017-11-20T20:13:52Z,2017-11-20T17:27:24Z,MEMBER,,,,"I'd like to issue the v0.10 release in within the next few weeks, after merging the following PRs: #### Features - [x] #1272 Groupby-like API for resampling (@darothen) - [x] #1473 Indexing with broadcasting (@fujiisoup, @shoyer) - [x] #1489 `to_dask_dataframe()` (@jmunroe) - [x] #1508 Support using opened netCDF4.Dataset (@dopplershift) - [x] #1514 Add `pathlib.Path` support to `open_(mf)dataset` (@willirath) - [x] #1543 pass dask compute/persist args through from load/compute/perist (@jhamman) #### Bug Fixes - [x] #1532 Avoid computing dask variables on `__repr__` and `__getattr__` (@crusaderky) - [x] #1542 Pandas dev test failures (@shoyer) - [x] #1538 Disallow improper DataArray construction (@jhamman) #### Misc - [x] #1485 `xr.show_versions()` (@jhamman) - [x] #1530 Deprecate old pandas support (@fujiisoup) - [x] #1539 Remove support for dataset construction w/o dims. (@jhamman) #### TODO - [x] #1333 Deprecate indexing with non-aligned DataArray objects Let me know if there's anything else critical to get in. CC @pydata/xarray ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1535/reactions"", ""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 267354113,MDU6SXNzdWUyNjczNTQxMTM=,1644,Formalize contract between XArray and the dask.distributed scheduler,2443309,closed,0,,,1,2017-10-21T06:09:22Z,2017-11-14T23:40:06Z,2017-11-14T23:40:06Z,MEMBER,,,,"From @mrocklin in https://github.com/pangeo-data/pangeo/issues/5#issue-255329911: > XArray was designed long before the dask.distributed task scheduler. As a result newer ways of doing things, like asynchronous computing, persist, etc. either don't function well, or were hacked on in a less-than-optimal-way. We should improve this relationship so that XArray can take advantage of newer dask.distributed features today and also adhere to contracts so that it benefits from changes in the future. > > There is conversation towards the end of dask/dask#1068 about what such a contract might look like. I think that @jcrist is planning to work on this on the Dask side some time in the next week or two. There is a new ""Dask Collection Interface"" implemented in https://github.com/dask/dask/pull/2748 (and the dask docs [docs](http://dask.pydata.org/en/latest/custom-collections.html)). I'm creating this issue here (in addition to https://github.com/pangeo-data/pangeo/issues/5) to track design considerations on the xarray side and to get input from the @pydata/xarray team. cc @mrocklin, @shoyer, @jcrist, @rabernat ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1644/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 270808895,MDU6SXNzdWUyNzA4MDg4OTU=,1684,Dask arrays and DataArray coords that share name with dimensions ,2443309,closed,0,,,3,2017-11-02T21:11:58Z,2017-11-05T01:29:45Z,2017-11-05T01:29:45Z,MEMBER,,,,"First reported by @mrocklin in [here](https://github.com/pydata/xarray/pull/1674/files#r148653180). 
```python In [1]: import xarray In [2]: import dask.array as da In [3]: coord = da.arange(8, chunks=(4,)) ...: data = da.random.random((8, 8), chunks=(4, 4)) + 1 ...: array = xarray.DataArray(data, ...: coords={'x': coord, 'y': coord}, ...: dims=['x', 'y']) ...: --------------------------------------------------------------------------- ValueError Traceback (most recent call last) in () 3 array = xarray.DataArray(data, 4 coords={'x': coord, 'y': coord}, ----> 5 dims=['x', 'y']) /home/mrocklin/workspace/xarray/xarray/core/dataarray.py in __init__(self, data, coords, dims, name, attrs, encoding, fastpath) 227 228 data = as_compatible_data(data) --> 229 coords, dims = _infer_coords_and_dims(data.shape, coords, dims) 230 variable = Variable(dims, data, attrs, encoding, fastpath=True) 231 /home/mrocklin/workspace/xarray/xarray/core/dataarray.py in _infer_coords_and_dims(shape, coords, dims) 68 if utils.is_dict_like(coords): 69 for k, v in coords.items(): ---> 70 new_coords[k] = as_variable(v, name=k) 71 elif coords is not None: 72 for dim, coord in zip(dims, coords): /home/mrocklin/workspace/xarray/xarray/core/variable.py in as_variable(obj, name) 94 '{}'.format(obj)) 95 elif utils.is_scalar(obj): ---> 96 obj = Variable([], obj) 97 elif getattr(obj, 'name', None) is not None: 98 obj = Variable(obj.name, obj) /home/mrocklin/workspace/xarray/xarray/core/variable.py in __init__(self, dims, data, attrs, encoding, fastpath) 275 """""" 276 self._data = as_compatible_data(data, fastpath=fastpath) --> 277 self._dims = self._parse_dimensions(dims) 278 self._attrs = None 279 self._encoding = None /home/mrocklin/workspace/xarray/xarray/core/variable.py in _parse_dimensions(self, dims) 439 raise ValueError('dimensions %s must have the same length as the ' 440 'number of data dimensions, ndim=%s' --> 441 % (dims, self.ndim)) 442 return dims 443 ValueError: dimensions () must have the same length as the number of data dimensions, ndim=1 ``` or a similar setup that computes the coordinates immediately ```Python In [18]: x = xr.Variable('x', da.arange(8, chunks=(4,))) ...: y = xr.Variable('y', da.arange(8, chunks=(4,)) * 2) ...: data = da.random.random((8, 8), chunks=(4, 4)) + 1 ...: array = xr.DataArray(data, ...: dims=['x', 'y']) ...: array.coords['x'] = x ...: array.coords['y'] = y ...: In [19]: array Out[19]: dask.array Coordinates: * x (x) int64 0 1 2 3 4 5 6 7 * y (y) int64 0 2 4 6 8 10 12 14 ``` #### Problem description I think we have two, possibly related problems with using dask arrays as DataArray coordinates. 1. As the first snippet shows, the constructor fails when coordinates are specified as raw dask arrays. This does not occur when `coord` is a numpy array. 1. When coordinates are specified as dask arrays via the `coords` attribute, they are computed immediately. #### Expected Output #### Output of ``xr.show_versions()``
In [23]: xr.show_versions() INSTALLED VERSIONS ------------------ commit: None python: 3.6.0.final.0 python-bits: 64 OS: Darwin OS-release: 16.7.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 xarray: 0.10.0rc1 pandas: 0.20.3 numpy: 1.13.1 scipy: 0.19.1 netCDF4: None h5netcdf: 0.3.1 Nio: None bottleneck: 1.2.0 cyordereddict: None dask: 0.15.4 matplotlib: 2.0.2 cartopy: 0.15.1 seaborn: 0.8.1 setuptools: 36.6.0 pip: 9.0.1 conda: 4.3.29 pytest: 3.0.5 IPython: 5.1.0 sphinx: 1.5.1
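A hedged workaround in the meantime (names reuse the first snippet): compute the coordinate values up front, since dimension coordinates are ultimately backed by an in-memory pandas.Index anyway.

```python
# materialize the dask coordinate before construction; this sidesteps
# both the constructor error and the silent eager compute
array = xarray.DataArray(data,
                         coords={'x': coord.compute(), 'y': coord.compute()},
                         dims=['x', 'y'])
```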
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1684/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 265827204,MDU6SXNzdWUyNjU4MjcyMDQ=,1633,seaborn.apionly module is deprecated,2443309,closed,0,,,1,2017-10-16T16:11:29Z,2017-10-23T15:58:09Z,2017-10-23T15:58:09Z,MEMBER,,,,"Xarray is using the apionly module from seaborn which is now raising this warning: ```Python ...python3.6/site-packages/seaborn/apionly.py:6: UserWarning: As seaborn no longer sets a default style on import, the seaborn.apionly module is deprecated. It will be removed in a future version. warnings.warn(msg, UserWarning) ``` I think the only places we use seaborn are here: https://github.com/pydata/xarray/blob/2949558b75a65404a500a237ec54834fd6946d07/xarray/plot/utils.py#L76-L87 This shouldn't a difficult fix if/when we decide to change it. xref: https://github.com/mwaskom/seaborn/pull/1216","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1633/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 266250898,MDU6SXNzdWUyNjYyNTA4OTg=,1636,support writing unlimited dimensions with h5netcdf,2443309,closed,0,,,0,2017-10-17T19:33:11Z,2017-10-18T19:56:43Z,2017-10-18T19:56:43Z,MEMBER,,,,"`h5netcdf`v0.5 (just released) added support for unlimited dimensions. This may (should) allow us to enable writing unlimited dimensions with the `h5netcdf` backend. xref: https://github.com/shoyer/h5netcdf/pull/33","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1636/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 262847801,MDU6SXNzdWUyNjI4NDc4MDE=,1605,Resample interpolate failing on tutorial dataset ,2443309,closed,0,,,3,2017-10-04T16:17:56Z,2017-10-05T16:34:14Z,2017-10-05T16:34:14Z,MEMBER,,,,"I'm getting some unexpected behavior/errors from the new resample/interpolate methods. @darothen - any idea what's going on here? 
```Python-traceback In [1]: import xarray as xr In [2]: ds = xr.tutorial.load_dataset('air_temperature') In [3]: ds.resample(time='15d').interpolate(kind='linear') --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) in () ----> 1 ds.resample(time='15d').interpolate(kind='linear') /glade/p/work/jhamman/storylines/src/xarray/xarray/core/resample.py in interpolate(self, kind) 110 111 """""" --> 112 return self._interpolate(kind=kind) 113 114 def _interpolate(self, kind='linear'): /glade/p/work/jhamman/storylines/src/xarray/xarray/core/resample.py in _interpolate(self, kind) 312 313 old_times = self._obj[self._dim].astype(float) --> 314 new_times = self._full_index.values.astype(float) 315 316 data_vars = OrderedDict() AttributeError: 'NoneType' object has no attribute 'values' ```","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1605/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 262858955,MDU6SXNzdWUyNjI4NTg5NTU=,1606,BUG: _extract_nc4_variable_encoding raises when shuffle argument is set,2443309,closed,0,,,0,2017-10-04T16:55:59Z,2017-10-05T00:12:38Z,2017-10-05T00:12:38Z,MEMBER,,,,"I think we're missing the [`shuffle`](http://unidata.github.io/netcdf4-python/#section9) key from the valid encodings list below: https://github.com/pydata/xarray/blob/24643ecee2eab04d0f84c41715d753e829f448e6/xarray/backends/netCDF4_.py#L155-L156 ```Python var = xr.Variable(('x',), [1, 2, 3], {}, {'chunking': (2, 1)}) encoding = _extract_nc4_variable_encoding(var, raise_on_invalid=True) ``` ``` variable = array([1, 2, 3]), raise_on_invalid = True, lsd_okay = True, backend = 'netCDF4' def _extract_nc4_variable_encoding(variable, raise_on_invalid=False, lsd_okay=True, backend='netCDF4'): encoding = variable.encoding.copy() safe_to_drop = set(['source', 'original_shape']) valid_encodings = set(['zlib', 'complevel', 'fletcher32', 'contiguous', 'chunksizes']) if lsd_okay: valid_encodings.add('least_significant_digit') if (encoding.get('chunksizes') is not None and (encoding.get('original_shape', variable.shape) != variable.shape) and not raise_on_invalid): del encoding['chunksizes'] for k in safe_to_drop: if k in encoding: del encoding[k] if raise_on_invalid: invalid = [k for k in encoding if k not in valid_encodings] if invalid: raise ValueError('unexpected encoding parameters for %r backend: ' > ' %r' % (backend, invalid)) E ValueError: unexpected encoding parameters for 'netCDF4' backend: ['shuffle'] xarray/backends/netCDF4_.py:173: ValueError ```","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1606/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 245893358,MDU6SXNzdWUyNDU4OTMzNTg=,1493,ENH: points coord from isel/sel_points should be a MultiIndex,2443309,closed,0,,,1,2017-07-27T00:33:42Z,2017-09-07T15:25:40Z,2017-09-07T15:25:40Z,MEMBER,,,,"We implemented the pointwise indexing methods (`isel_points` and `sel_points`) before we had MultiIndex support. Would it make sense to update these methods to return objects with coordinates defined as a MultiIndex? 
Current behavior: ```Python print('original --> \n', ds) lons = [-88, -85.9] lats = [34.2, 31.9] subset = ds.sel_points(lon=lons, lat=lats, method='nearest') print('subset --> \n', subset) ``` yields: ``` original --> Dimensions: (lat: 224, lon: 464, time: 19709) Coordinates: * lat (lat) float64 25.06 25.19 25.31 25.44 25.56 25.69 25.81 25.94 ... * lon (lon) float64 -124.9 -124.8 -124.7 -124.6 -124.4 -124.3 -124.2 ... * time (time) float64 5.548e+04 5.548e+04 5.548e+04 5.548e+04 ... Data variables: pcp (time, lat, lon) float64 nan nan nan nan nan nan nan nan nan ... subset --> Dimensions: (points: 2, time: 19709) Coordinates: lat (points) float64 34.19 31.94 lon (points) float64 -87.94 -85.94 * time (time) float64 5.548e+04 5.548e+04 5.548e+04 5.548e+04 ... Dimensions without coordinates: points Data variables: pcp (points, time) float64 0.0 5.698 0.0 0.0 14.66 0.0 0.0 0.0 0.0 ... ``` Maybe it makes sense to return an object with a MultiIndex like: ```Python new = pd.MultiIndex.from_arrays([subset.lon.to_index(), subset.lat.to_index()], names=['lon', 'lat']) print(new) ``` ``` MultiIndex(levels=[[-87.9375, -85.9375], [31.9375, 34.1875]], labels=[[0, 1], [1, 0]], names=['lon', 'lat']) ``` xref: #214, #475, #507 ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1493/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 254430377,MDU6SXNzdWUyNTQ0MzAzNzc=,1542,Testing: Failing tests on py36-pandas-dev,2443309,closed,0,,2415632,4,2017-08-31T18:40:47Z,2017-09-05T22:22:32Z,2017-09-05T22:22:32Z,MEMBER,,,,"We currently have 7 failing tests when run against the pandas development code ([travis](https://travis-ci.org/pydata/xarray/jobs/270511674)). Question for @shoyer - can you take a look at these and see if we should try to get a fix in place prior to v.0.10.0? It looks like Pandas.0.21 is slated for release on Sept. 30.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1542/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 254217141,MDU6SXNzdWUyNTQyMTcxNDE=,1540,BUG: Dask distributed integration tests failing on Travis,2443309,closed,0,,,10,2017-08-31T05:41:50Z,2017-09-05T09:18:01Z,2017-09-01T01:09:11Z,MEMBER,,,,"Recent builds on travis are failing for the integration tests for dask distributed ([example](https://travis-ci.org/pydata/xarray/jobs/270222865#L3656)). 
Those tests are: - `test_dask_distributed_integration_test[h5netcdf]` - `test_dask_distributed_integration_test[netcdf4]` The traceback includes this detail: ``` _______________ test_dask_distributed_integration_test[netcdf4] ________________ loop = engine = 'netcdf4' @pytest.mark.parametrize('engine', ENGINES) def test_dask_distributed_integration_test(loop, engine): with cluster() as (s, _): with distributed.Client(s['address'], loop=loop): original = create_test_data() with create_tmp_file(allow_cleanup_failure=ON_WINDOWS) as filename: original.to_netcdf(filename, engine=engine) with xr.open_dataset(filename, chunks=3, engine=engine) as restored: assert isinstance(restored.var1.data, da.Array) > computed = restored.compute() xarray/tests/test_distributed.py:33: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ xarray/core/dataset.py:487: in compute return new.load() xarray/core/dataset.py:464: in load evaluated_data = da.compute(*lazy_data.values()) ../../../miniconda/envs/test_env/lib/python2.7/site-packages/dask/base.py:206: in compute results = get(dsk, keys, **kwargs) ../../../miniconda/envs/test_env/lib/python2.7/site-packages/distributed/client.py:1923: in get results = self.gather(packed, asynchronous=asynchronous) ../../../miniconda/envs/test_env/lib/python2.7/site-packages/distributed/client.py:1368: in gather asynchronous=asynchronous) ../../../miniconda/envs/test_env/lib/python2.7/site-packages/distributed/client.py:540: in sync return sync(self.loop, func, *args, **kwargs) ../../../miniconda/envs/test_env/lib/python2.7/site-packages/distributed/utils.py:239: in sync six.reraise(*error[0]) ../../../miniconda/envs/test_env/lib/python2.7/site-packages/distributed/utils.py:227: in f result[0] = yield make_coro() ../../../miniconda/envs/test_env/lib/python2.7/site-packages/tornado/gen.py:1055: in run value = future.result() ../../../miniconda/envs/test_env/lib/python2.7/site-packages/tornado/concurrent.py:238: in result raise_exc_info(self._exc_info) ../../../miniconda/envs/test_env/lib/python2.7/site-packages/tornado/gen.py:1063: in run yielded = self.gen.throw(*exc_info) ../../../miniconda/envs/test_env/lib/python2.7/site-packages/distributed/client.py:1246: in _gather traceback) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > c = a[b] E TypeError: string indices must be integers ``` Distributed [v.1.18.1](https://github.com/dask/distributed/releases/tag/1.18.1) was released 5 days ago so there must have been a breaking change that has been passed down to us. cc @shoyer, @mrocklin ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1540/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 140063713,MDU6SXNzdWUxNDAwNjM3MTM=,790,ENH: Optional Read-Only RasterIO backend,2443309,closed,0,,,15,2016-03-11T02:00:32Z,2017-06-06T10:25:22Z,2017-06-06T10:25:22Z,MEMBER,,,,"RasterIO is a GDAL-based library that provides _Fast and direct raster I/O for use with Numpy and SciPy_. I've just used it a bit but have been generally impressed with its support for a range of ASCII and binary raster formats. It might be a nice addition to the suite of backends already available in `xarray`. I'm envisioning functionality akin to what we provide in the PyNIO backend, which is to say, read-only support for whichever file types RasterIO supports.
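A hedged sketch of the envisioned entry point (the name is illustrative here, though this is roughly the shape the feature eventually shipped with as `xarray.open_rasterio`):

```python
import xarray as xr

# read a GeoTIFF into a DataArray with band/y/x dimensions; the
# georeferencing (transform, crs, resolution) rides along in attrs
da = xr.open_rasterio('RGB.byte.tif')
print(da.dims, da.attrs.get('crs'))
```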
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/790/reactions"", ""total_count"": 3, ""+1"": 3, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 124569898,MDU6SXNzdWUxMjQ1Njk4OTg=,696,Doc updates,2443309,closed,0,,,1,2016-01-02T01:37:58Z,2016-12-29T02:36:56Z,2016-12-29T02:36:56Z,MEMBER,,,,"Now that ReadTheDocs supports using `conda`, we can - use `cartopy` to plot the maps at build time - standardize on Python 3 xref: #695 ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/696/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 138045063,MDU6SXNzdWUxMzgwNDUwNjM=,781,Don't infer x/y coordinates interval breaks for cartopy plot axes,2443309,closed,0,,,9,2016-03-03T01:22:19Z,2016-11-10T22:55:05Z,2016-11-10T22:55:05Z,MEMBER,,,,"The `DataArray.plot.pcolormesh()` method modifies the x/y coordinates of its plots. I'm finding that, at least for custom `cartopy` projections, the offset applied [here](https://github.com/pydata/xarray/blob/master/xarray/plot/plot.py#L543-L544) causes some real issues downstream. @clarkfitzg - Do you see any problem with treating the x/y offset in the [same way](https://github.com/pydata/xarray/blob/master/xarray/plot/plot.py#L550-L553) as the axis limits? ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/781/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 109589162,MDU6SXNzdWUxMDk1ODkxNjI=,605,Support Two-Dimensional Coordinate Variables,2443309,closed,0,,741199,11,2015-10-02T23:27:18Z,2016-07-31T23:02:46Z,2016-07-31T23:02:46Z,MEMBER,,,,"The CF Conventions supports the notion of a 2d coordinate variable in the case of irregularly spaced data. An example of this sort of dataset is below. The CF Convention is to add a ""coordinates"" attribute with a string describing the 2d coordinates. ``` dimensions: xc = 128 ; yc = 64 ; lev = 18 ; variables: float T(lev,yc,xc) ; T:long_name = ""temperature"" ; T:units = ""K"" ; T:coordinates = ""lon lat"" ; float xc(xc) ; xc:axis = ""X"" ; xc:long_name = ""x-coordinate in Cartesian system"" ; xc:units = ""m"" ; float yc(yc) ; yc:axis = ""Y"" ; yc:long_name = ""y-coordinate in Cartesian system"" ; yc:units = ""m"" ; float lev(lev) ; lev:long_name = ""pressure level"" ; lev:units = ""hPa"" ; float lon(yc,xc) ; lon:long_name = ""longitude"" ; lon:units = ""degrees_east"" ; float lat(yc,xc) ; lat:long_name = ""latitude"" ; lat:units = ""degrees_north"" ; ``` I'd like to discuss how we could support this in xray. There motivating application for this is in plotting operations but it may also have application in other grouping and remapping operations (e.g. #324, #475, #486). One option would just to honor the ""coordinates"" attr in plotting and use the specified coordinates as the x/y values. 
ref: http://cfconventions.org/Data/cf-conventions/cf-conventions-1.6/build/cf-conventions.html#idp5559280 ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/605/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 122776511,MDU6SXNzdWUxMjI3NzY1MTE=,681,"to_netcdf on Python 3: ""string"" qualifier on attributes ",2443309,closed,0,,,8,2015-12-17T16:56:59Z,2016-06-16T08:27:33Z,2016-03-01T21:49:36Z,MEMBER,,,,"I've had a number of people ask me about this and I think we can figure out a way to fix this. In Python 3, variable attributes in files written with `Dataset.to_netcdf` end up with the ""string"" type qualifier shown below. This causes all sorts of problems with other netcdf programs. Is this related to https://github.com/Unidata/netcdf4-python/issues/485? ``` bash PRISM$ ncdump -h prism_historical_conus4k.189501-201510.nc netcdf prism_historical_conus4k.189501-201510 { dimensions: latitude = 621 ; longitude = 1405 ; time = 1450 ; variables: double latitude(latitude) ; double longitude(longitude) ; int64 time(time) ; string time:units = ""days since 1895-01-01 00:00:00"" ; string time:calendar = ""proleptic_gregorian"" ; float prcp(time, latitude, longitude) ; string prcp:units = ""mm"" ; string prcp:description = ""precipitation "" ; string prcp:long_name = ""precipitation"" ; // global attributes: string :title = ""PRISM: Parameter-elevation Regressions on Independent Slopes Model"" ; } ``` cc @lizaclark ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/681/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 156186767,MDU6SXNzdWUxNTYxODY3Njc=,855,drop support for Python 2.6,2443309,closed,0,,,0,2016-05-23T01:53:15Z,2016-05-23T19:38:07Z,2016-05-23T19:38:07Z,MEMBER,,,,"@shoyer polled the xarray users list about dropping Python 2.6 from the supported versions of Python for xarray. There were no complaints so it looks like we are moving forward on this at the next major release (0.8). xref: https://groups.google.com/forum/#!searchin/xarray/2.6/xarray/JVIUiIhEW_8/qBjxmestCQAJ ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/855/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 113499493,MDU6SXNzdWUxMTM0OTk0OTM=,641,add rolling_apply method or function,2443309,closed,0,,,13,2015-10-27T03:30:11Z,2016-02-20T02:32:33Z,2016-02-20T02:32:33Z,MEMBER,,,,"Pandas has a generic [`rolling_apply`](http://pandas.pydata.org/pandas-docs/version/0.17.0/generated/pandas.rolling_apply.html) function. It would be nice to support a similar api on xray objects. The api I have in mind is something like: ``` Python # DataArray.rolling_apply(window, func, min_periods=None, freq=None, # center=False, args=(), kwargs={}) da.rolling_apply(7, np.mean) ``` ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/641/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 108769226,MDU6SXNzdWUxMDg3NjkyMjY=,593,Bug when accessing sorted dataset before loading,2443309,closed,0,,,6,2015-09-28T23:58:29Z,2016-01-04T23:11:55Z,2015-10-02T21:41:11Z,MEMBER,,,,"I ran into this bug this afternoon.
If I sort a Dataset using `isel` before loading the data, I end up with an error in the `netCDF4` backend. If I call `Dataset.load()` before sorting the Dataset, I get the expected behavior. First some info on my environment (everything should be fresh): ``` Python version : 3.4.3 |Anaconda 2.3.0 (x86_64)| (default, Mar 6 2015, 12:07:41) [GCC 4.2.1 (Apple Inc. build 5577)] xray version : 0.6.0 numpy version : 1.9.3 netCDF4 version : 1.1.9 ``` Now for a simplified example that reproduces the bug: ``` Python In [1]: import xray import numpy as np import netCDF4 In [2]: random_data = np.random.random(size=(4, 6)) dim0 = [0, 1, 2, 3] dim1 = [0, 2, 1, 3, 5, 4] # We will sort this in a later step da = xray.DataArray(data=random_data, dims=('dim0', 'dim1'), coords={'dim0': dim0, 'dim1': dim1}, name='randovar') ds = da.to_dataset() ds.to_netcdf('rando.nc') In [3]: ds2 = xray.open_dataset('rando.nc') # ds2.load() # work around to prevent IndexError inds = np.argsort(ds2.dim1.values) ds2 = ds2.isel(dim1=inds) print(ds2.randovar) Out[3]: --------------------------------------------------------------------------- IndexError Traceback (most recent call last) in () 2 inds = np.argsort(ds2.dim1.values) 3 ds2 = ds2.isel(dim1=inds) ----> 4 print(ds2.randovar) ... /Users/jhamman/anaconda/lib/python3.4/site-packages/xray/backends/netCDF4_.py in __getitem__(self, key) 43 else: 44 getitem = operator.getitem ---> 45 data = getitem(self.array, key) 46 if self.ndim == 0: 47 # work around for netCDF4-python's broken handling of 0-d netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Variable.__getitem__ (netCDF4/_netCDF4.c:30994)() /Users/jhamman/anaconda/lib/python3.4/site-packages/netCDF4/utils.py in _StartCountStride(elem, shape, dimensions, grp, datashape, put) 220 # duplicate indices in the sequence) 221 msg = ""integer sequences in slices must be sorted and cannot have duplicates"" --> 222 raise IndexError(msg) 223 # convert to boolean array. 224 # if unlim, let boolean array be longer than current dimension IndexError: integer sequences in slices must be sorted and cannot have duplicates ``` ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/593/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 110102454,MDU6SXNzdWUxMTAxMDI0NTQ=,611,facet grid axis labels are None,2443309,closed,0,,1307323,4,2015-10-06T21:12:50Z,2016-01-04T23:11:55Z,2015-10-09T14:25:57Z,MEMBER,,,,"The dim names on this plot are not showing up (e.g. `None` is not right, it should be `x` and `y`): ``` Python data = (np.random.random(size=(20, 25, 12)) + np.linspace(-3, 3, 12)) # range is ~ -3 to 4 da = xray.DataArray(data, dims=['x', 'y', 'time'], name='data') fg = da.plot.pcolormesh(col='time', col_wrap=4) ``` ![fixed](https://cloud.githubusercontent.com/assets/2443309/10322611/6821bd50-6c33-11e5-96e8-5d69a5aeca21.png) ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/611/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 109434899,MDU6SXNzdWUxMDk0MzQ4OTk=,602,latest docs are broken,2443309,closed,0,1217238,1368762,4,2015-10-02T05:48:21Z,2016-01-02T01:31:17Z,2016-01-02T01:31:17Z,MEMBER,,,,"Looking at the doc build from tonight, something happened and netCDF4 isn't getting picked up. All the docs depending on the netCDF4 package are broken (e.g. plotting, IO, etc.). 
@shoyer - You may be able to just resubmit the doc build, or maybe we need to fix something. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/602/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 110801359,MDU6SXNzdWUxMTA4MDEzNTk=,617,travis builds are broken,2443309,closed,0,,,2,2015-10-10T15:39:51Z,2015-10-23T22:26:43Z,2015-10-23T22:26:43Z,MEMBER,,,,"Tests are failing on Python 2.7 and 3.4. We just started getting pandas 0.17 and numpy 1.10 so that is probably the source of the issue. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/617/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 110040239,MDU6SXNzdWUxMTAwNDAyMzk=,610,don't throw away variable specific coordinates information,2443309,closed,0,,,0,2015-10-06T15:50:41Z,2015-10-08T18:03:19Z,2015-10-08T18:03:19Z,MEMBER,,,,"Currently, we decode the `coordinates` attribute, when present, but it doesn't end up in the DataArray's `encoding` attribute (https://github.com/xray/xray/blob/master/xray/conventions.py#L822-L832). This should be changed so the user can reference the `coordinates` attribute after decoding. xref: #605 ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/610/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 109553434,MDU6SXNzdWUxMDk1NTM0MzQ=,603,support using Cartopy with facet grids,2443309,closed,0,,,1,2015-10-02T19:06:33Z,2015-10-06T15:10:01Z,2015-10-06T15:10:01Z,MEMBER,,,,"Currently, I don't think it is possible to specify a Cartopy projection for facet grid plots. It would be nice to be able to specify either the subplots array including Cartopy projections (e.g. `ax=axes`) or a projection key word argument via (`subplots_kw=dict(projection=...)`) directly when using the xray's facet grid. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/603/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 101061611,MDU6SXNzdWUxMDEwNjE2MTE=,533,DataArray.name should always be a string,2443309,closed,0,,,2,2015-08-14T17:36:02Z,2015-09-18T17:35:26Z,2015-09-18T17:35:26Z,MEMBER,,,,"Consider the following example: ``` Python import numpy as np import xray da = xray.DataArray(np.random.random((4, 5))) ds = da.to_dataset(name=0) # or name=True, or name=(4) ds.to_netcdf('test.nc') ``` raises this error: ``` python /Users/jhamman/anaconda/lib/python3.4/site-packages/xray/backends/netCDF4_.py in prepare_variable(self, name, variable) 228 endian='native', 229 least_significant_digit=encoding.get('least_significant_digit'), --> 230 fill_value=fill_value) 231 nc4_var.set_auto_maskandscale(False) 232 netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Dataset.createVariable (netCDF4/_netCDF4.c:13217)() /Users/jhamman/anaconda/lib/python3.4/posixpath.py in normpath(path) 330 if path == empty: 331 return dot --> 332 initial_slashes = path.startswith(sep) 333 # POSIX allows one or two initial slashes, but treats three or more 334 # as single slash. AttributeError: 'int' object has no attribute 'startswith' ``` I think one way to solve this is to cast the name attribute to a string at the time of assignment. 
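A minimal sketch of that first option (toy class, not the actual patch):

```python
class NamedArray:
    # stand-in for DataArray, coercing names at assignment time
    def __init__(self, name=None):
        self.name = name

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        # backends then always see a str (or None)
        self._name = None if value is None else str(value)

assert NamedArray(0).name == '0'
```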
Another way is just to raise an error if a non-string variable name is used. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/533/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 97861940,MDU6SXNzdWU5Nzg2MTk0MA==,500,discrete colormap option for imshow and pcolormesh,2443309,closed,0,,,9,2015-07-29T05:07:18Z,2015-08-06T16:06:33Z,2015-08-06T16:06:33Z,MEMBER,,,,"It may be nice to include an option for a discrete colormap/colorbar for the `imshow` and `pcolormesh` methods. I would suggest that the default behavior remains a continuous colormap. Perhaps adding an argument such as `cmap_intervals` would allow for easy discretization of the colormap. The logic in #499 takes care of most of the details for this issue. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/500/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 83000406,MDU6SXNzdWU4MzAwMDQwNg==,411,"unexpected positional indexing behavior with Dataset and DataArray ""isel""",2443309,closed,0,,,5,2015-05-31T04:48:10Z,2015-06-01T05:03:38Z,2015-06-01T05:03:29Z,MEMBER,,,,"I may be missing something here but I think the indexing behavior in `isel` is surprisingly different from that of `numpy` and is incongruent with the `xray` documentation. Either this is a bug or a feature that I don't understand. From the [`xray` docs on positional indexing](http://xray.readthedocs.org/en/latest/indexing.html#positional-indexing): > Indexing a DataArray directly works (mostly) just like it does for numpy arrays, except that the returned object is always another DataArray My example below uses two 1d `numpy` arrays to select from a 3d numpy array. When using pure `numpy`, I get a 2d array back. In my view, this is the expected behavior. When using the `xray.Dataset` or `xray.DataArray`, I get an oddly shaped 3d array back with a duplicate dimension. ``` python import numpy as np import xray import sys print('python version:', sys.version) print('numpy version:', np.version.full_version) print('xray version:', xray.version.version) ``` ``` python version: 3.4.3 |Anaconda 2.2.0 (x86_64)| (default, Mar 6 2015, 12:07:41) [GCC 4.2.1 (Apple Inc. build 5577)] numpy version: 1.9.2 xray version: 0.4.1 ``` ``` python # A few numpy arrays time = np.arange(100) lons = np.arange(40, 60) lats = np.arange(25, 70) np_data = np.random.random(size=(len(time), len(lats), len(lons))) # pick some random points to select ys, xs = np.nonzero(np_data[0] > 0.8) print(len(ys)) ``` ``` 176 ``` ``` python # create a xray.DataArray and xray.Dataset xr_data = xray.DataArray(np_data, [('time', time), ('y', lats), ('x', lons)]) # DataArray xr_ds = xr_data.to_dataset(name='data') # Dataset # numpy indexing print('numpy: ', np_data[:, ys, xs].shape) # xray positional indexing print('xray1: ', xr_data.isel(y=ys, x=xs).shape) print('xray2: ', xr_data[:, ys, xs].shape) print('xray3: ', xr_ds.isel(y=ys, x=xs)) ``` ``` numpy: (100, 176) xray1: (100, 176, 176) xray2: (100, 176, 176) xray3: Dimensions: (time: 100, x: 176, y: 176) Coordinates: * x (x) int64 46 47 57 45 48 50 51 54 57 59 48 52 49 50 52 53 55 57 43 46 47 48 53 ... * time (time) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 ... * y (y) int64 25 25 25 26 26 26 26 26 26 26 27 27 28 28 28 28 28 28 29 29 29 29 29 ...
Data variables: data (time, y, x) float64 0.9343 0.8311 0.8842 0.3188 0.02052 0.4506 0.04177 ... ``` ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/411/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 33273199,MDU6SXNzdWUzMzI3MzE5OQ==,122,Dataset.groupby summary methods,2443309,closed,0,,,3,2014-05-11T23:28:18Z,2014-06-23T07:25:08Z,2014-06-23T07:25:08Z,MEMBER,,,,"This may just be a documentation issue but the summary apply and combine methods for the `Dataset.GroupBy` object seem to be missing. ``` python In [146]: foo_values = np.random.RandomState(0).rand(3, 4) times = pd.date_range('2000-01-01', periods=3) ds = xray.Dataset({'time': ('time', times), 'foo': (['time', 'space'], foo_values)}) ds.groupby('time').mean() #replace time with time.month after #121 is addressed # ds.groupby('time').apply(np.mean) # also Errors here --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) in () 3 ds = xray.Dataset({'time': ('time', times), 4 'foo': (['time', 'space'], foo_values)}) ----> 5 ds.groupby('time').mean() 6 ds.groupby('time').apply(np.mean) AttributeError: 'DatasetGroupBy' object has no attribute 'mean' ``` Adding this functionality, if not already present, seems like a really nice addition to the package. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/122/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 33942756,MDU6SXNzdWUzMzk0Mjc1Ng==,138,keep attrs when reducing xray objects,2443309,closed,0,,,4,2014-05-21T00:40:19Z,2014-05-22T00:29:22Z,2014-05-22T00:29:22Z,MEMBER,,,,"Reduction operations currently drop all `Variable` and `Dataset` `attrs` when a reduction operation is performed. I'm proposing adding a keyword to these methods to allow for copying of the original `Variable` or `Dataset` `attrs`. The default value of the `keep_attrs` keyword would be `False`. For example: ``` python new = ds.mean(keep_attrs=True) ``` returns `new` with all the `Variable` and `Dataset` `attrs` as `ds` contained. Some previous discussion in #131 and #137. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/138/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 33272937,MDU6SXNzdWUzMzI3MjkzNw==,121,virtual variables not available when using open_dataset,2443309,closed,0,,,5,2014-05-11T23:11:21Z,2014-05-16T00:37:39Z,2014-05-16T00:37:39Z,MEMBER,,,,"The tutorial provides an example of how to use xray's `virtual_variables`. The same functionality is not available from a Dataset object created by open_dataset.
Tutorial: ``` python In [135]: foo_values = np.random.RandomState(0).rand(3, 4) times = pd.date_range('2000-01-01', periods=3) ds = xray.Dataset({'time': ('time', times), 'foo': (['time', 'space'], foo_values)}) ds['time.dayofyear'] Out[135]: array([1, 2, 3], dtype=int32) Attributes: Empty ``` However, reading a time coordinate/variable from a netCDF4 file and applying the same logic raises an error: ``` In [136]: ds = xray.open_dataset('sample_for_xray.nc') ds['time'] Out[136]: array([1979-09-16 12:00:00, 1979-10-17 00:00:00, 1979-11-16 12:00:00, 1979-12-17 00:00:00], dtype=object) Attributes: dimensions: 1 long_name: time type_preferred: int In [137]: ds['time.dayofyear'] --------------------------------------------------------------------------- ValueError Traceback (most recent call last) in () ----> 1 ds['time.dayofyear'] /Users/jhamman/anaconda/lib/python2.7/site-packages/xray-0.2.0.dev_cc5e1b2-py2.7.egg/xray/dataset.pyc in __getitem__(self, key) 408 """"""Access the given variable name in this dataset as a `DataArray`. 409 """""" --> 410 return data_array.DataArray._constructor(self, key) 411 412 def __setitem__(self, key, value): /Users/jhamman/anaconda/lib/python2.7/site-packages/xray-0.2.0.dev_cc5e1b2-py2.7.egg/xray/data_array.pyc in _constructor(cls, dataset, name) 95 if name not in dataset and name not in dataset.virtual_variables: 96 raise ValueError('name %r must be a variable in dataset %r' ---> 97 % (name, dataset)) 98 obj._dataset = dataset 99 obj._name = name ValueError: name 'time.dayofyear' must be a variable in dataset Dimensions: (time: 4, x: 275, y: 205) Coordinates: time X x X y X Noncoordinates: Wind 0 2 1 Attributes: sample data for xray from RASM project ``` Is there a reason that the virtual time variables are only available if the dataset is created from a pandas date_range? Lastly, this could be related to #118. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/121/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 33112594,MDU6SXNzdWUzMzExMjU5NA==,118,Problems parsing time variable using open_dataset,2443309,closed,0,,,4,2014-05-08T18:57:31Z,2014-05-16T00:37:28Z,2014-05-16T00:37:28Z,MEMBER,,,,"I'm noticing a problem parsing the time variable with at least the `noleap` calendar, even for a properly formatted time dimension. Any thoughts on why this is? ``` bash ncdump -c -t sample_for_xray.nc netcdf sample_for_xray { dimensions: time = UNLIMITED ; // (4 currently) y = 205 ; x = 275 ; variables: double Wind(time, y, x) ; Wind:units = ""m/s"" ; Wind:long_name = ""Wind speed"" ; Wind:coordinates = ""latitude longitude"" ; Wind:dimensions = ""2"" ; Wind:type_preferred = ""double"" ; Wind:time_rep = ""instantaneous"" ; Wind:_FillValue = 9.96920996838687e+36 ; double time(time) ; time:calendar = ""noleap"" ; time:dimensions = ""1"" ; time:long_name = ""time"" ; time:type_preferred = ""int"" ; time:units = ""days since 0001-01-01 0:0:0"" ; // global attributes: ...
data: time = ""1979-09-16 12"", ""1979-10-17"", ""1979-11-16 12"", ""1979-12-17"" ; ``` ``` python ds = xray.open_dataset('sample_for_xray.nc') print ds['time'] ``` ``` --------------------------------------------------------------------------- TypeError Traceback (most recent call last) in () 1 ds = xray.open_dataset('sample_for_xray.nc') ----> 2 print ds['time'] /home/jhamman/anaconda/lib/python2.7/site-packages/xray/common.pyc in __repr__(self) 40 41 def __repr__(self): ---> 42 return array_repr(self) 43 44 def _iter(self): /home/jhamman/anaconda/lib/python2.7/site-packages/xray/common.pyc in array_repr(arr) 122 summary = [''% (type(arr).__name__, name_str, dim_summary)] 123 if arr.size < 1e5 or arr._in_memory(): --> 124 summary.append(repr(arr.values)) 125 else: 126 summary.append('[%s values with dtype=%s]' % (arr.size, arr.dtype)) /home/jhamman/anaconda/lib/python2.7/site-packages/xray/data_array.pyc in values(self) 147 def values(self): 148 """"""The variables's data as a numpy.ndarray"""""" --> 149 return self.variable.values 150 151 @values.setter /home/jhamman/anaconda/lib/python2.7/site-packages/xray/variable.pyc in values(self) 217 def values(self): 218 """"""The variable's data as a numpy.ndarray"""""" --> 219 return utils.as_array_or_item(self._data_cached()) 220 221 @values.setter /home/jhamman/anaconda/lib/python2.7/site-packages/xray/utils.pyc in as_array_or_item(values, dtype) 56 # converted into an integer instead :( 57 return values ---> 58 values = as_safe_array(values, dtype=dtype) 59 if values.ndim == 0 and values.dtype.kind == 'O': 60 # unpack 0d object arrays to be consistent with numpy /home/jhamman/anaconda/lib/python2.7/site-packages/xray/utils.pyc in as_safe_array(values, dtype) 40 """"""Like np.asarray, but convert all datetime64 arrays to ns precision 41 """""" ---> 42 values = np.asarray(values, dtype=dtype) 43 if values.dtype.kind == 'M': 44 # np.datetime64 /home/jhamman/anaconda/lib/python2.7/site-packages/numpy/core/numeric.pyc in asarray(a, dtype, order) 458 459 """""" --> 460 return array(a, dtype, copy=False, order=order) 461 462 def asanyarray(a, dtype=None, order=None): /home/jhamman/anaconda/lib/python2.7/site-packages/xray/variable.pyc in __array__(self, dtype) 121 if dtype is None: 122 dtype = self.dtype --> 123 return self.array.values.astype(dtype) 124 125 def __getitem__(self, key): TypeError: Cannot cast datetime.date object from metadata [D] to [ns] according to the rule 'same_kind' ``` This file is available here: ftp://ftp.hydro.washington.edu/pub/jhamman/sample_for_xray.nc ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/118/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 33273376,MDU6SXNzdWUzMzI3MzM3Ng==,123,Selective variable reads in open_dataset,2443309,closed,0,,,2,2014-05-11T23:39:12Z,2014-05-12T02:25:10Z,2014-05-12T02:25:10Z,MEMBER,,,,"One of the beautiful things about the netCDF data model is that the variables can be read individually. I'm suggesting adding a `variables` keyword (or something along those lines) to the `open_dataset` function to support selecting one or more or all variables in a file. This will allow for faster reads and smaller memory usage when the full set of variables is not needed. 
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/123/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue