id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type 482543307,MDU6SXNzdWU0ODI1NDMzMDc=,3232,Use pytorch as backend for xarrays,923438,open,0,,,49,2019-08-19T21:45:15Z,2022-07-20T18:01:56Z,,NONE,,,,"I would be interested in using pytorch as a backend for xarrays - because: a) pytorch is very similar to numpy - so the conceptual overhead is small b) [most helpful] it would enable having a GPU as the underlying hardware for compute - which would provide non-trivial speed up c) it would allow seamless integration with deep-learning algorithms and techniques Any thoughts on what the interest for such a feature might be ? I would be open to implementing parts of it - so any suggestions on where I could start ? Thanks ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3232/reactions"", ""total_count"": 4, ""+1"": 4, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,reopened,13221727,issue 495382528,MDU6SXNzdWU0OTUzODI1Mjg=,3320,Error saving xr.Dataset with timezone aware time index to netcdf format.,923438,open,0,,,1,2019-09-18T18:20:42Z,2022-01-17T21:23:02Z,,NONE,,,,"When I try to save an xr.Dataset that was created from a pandas dataframe with a tz-aware time index (see #3291) - xarray converts the time index into int64 nanoseconds For example, this is what the converted dataset looks like: ``` Dimensions: (symbol: 3196, time: 4977) Coordinates: * time (time) object 946933200000000000 ... 1566334800000000000 * symbol (symbol) int64 0 1 2 3 4 5 6 ... 3189 3190 3191 3192 3193 3194 3195 Data variables: var_0 (time, symbol) float32 nan 4301510000.0 nan nan ... nan nan nan nan var_1 (time, symbol) object nan False nan nan nan ... nan nan nan nan nan var_2 (time, symbol) float32 nan 475.0 nan nan nan ... 
nan nan nan nan var_3 (time, symbol) float32 nan 475.0 nan nan nan ... nan nan nan nan var_5 (time, symbol) float32 nan 475.9 nan nan nan ... nan nan nan nan var_6 (time, symbol) float32 nan 475.9 nan nan nan ... nan nan nan nan var_7 (time, symbol) float32 nan 429.5 nan nan nan ... nan nan nan nan var_8 (time, symbol) float32 nan 429.5 nan nan nan ... nan nan nan nan var_10 (time, symbol) float32 nan -0.06736842 nan nan ... nan nan nan nan var_11 (time, symbol) float32 nan 0.05085102 nan nan ... nan nan nan nan var_12 (time, symbol) float32 nan 0.029103609 nan nan ... nan nan nan nan var_13 (time, symbol) float32 nan 0.048769474 nan nan ... nan nan nan nan var_14 (time, symbol) float32 nan 442.9 nan nan nan ... nan nan nan nan var_15 (time, symbol) float32 nan 442.9 nan nan nan ... nan nan nan nan var_16 (time, symbol) float32 nan nan nan nan nan ... nan nan nan nan nan var_17 (time, symbol) float32 nan nan nan nan nan ... nan nan nan nan nan var_18 (time, symbol) float32 nan nan nan nan nan ... nan nan nan nan nan var_19 (time, symbol) float32 nan nan nan nan nan ... nan nan nan nan nan var_20 (time, symbol) float32 nan nan nan nan nan ... nan nan nan nan nan var_21 (time, symbol) float32 nan 9501900.0 nan nan ... nan nan nan nan var_22 (time, symbol) float32 nan 9501900.0 nan nan ... nan nan nan nan ``` Now when I try to save this dataset using ```python pds.to_netcdf( ... ) ``` I get the following error: ![image](https://user-images.githubusercontent.com/923438/65174709-6d0e3600-da1f-11e9-954e-05e90a13753b.png) Dropping into pdb when this error is hit - it looks like the problem is with the time index. ![image](https://user-images.githubusercontent.com/923438/65174609-3afcd400-da1f-11e9-85da-652502893d01.png) After converting the time index into a regular int index by: ```python pds = pds.assign_coords(time=np.arange( len( pds.time )) ) pds.to_netcdf( ... ) ``` this works OK. And this also works !! 
```python pds = pds.assign_coords(time=pd.to_datetime( pds.time ) ) pds.to_netcdf( ... ) ``` Note `pd.to_datetime(pds.time)` drops the timezone from the index - so the issue is very much about saving tz-aware time indices. Any ideas on what I can do about this ? Thanks! -firdaus ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3320/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 480786385,MDU6SXNzdWU0ODA3ODYzODU=,3218,merge_asof functionality,923438,closed,0,,,6,2019-08-14T16:57:22Z,2021-07-21T18:18:20Z,2021-07-21T18:18:20Z,NONE,,,,"Would it be possible to add some functionality to xarray merge that mimics pandas merge_asof ? This would be very useful when aligning timeseries dataarrays where the two arrays are misaligned. Thanks.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3218/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 496688781,MDU6SXNzdWU0OTY2ODg3ODE=,3330,Feature requests for DataArray.rolling,923438,closed,0,,,1,2019-09-21T18:58:21Z,2021-07-08T16:29:18Z,2021-07-08T16:29:18Z,NONE,,,,"In `DataArray.rolling` it would be really nice to have support for window sizes specified in the units of the dimension (esp. time). 
For example if `da` has dimensions ```(time, space, feature)``` with `time` as `DatetimeIndex` - then it should be possible to specify `da.rolling( time=pd.Timedelta( 100, 'D') )` as a valid window ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3330/reactions"", ""total_count"": 4, ""+1"": 4, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 490618213,MDU6SXNzdWU0OTA2MTgyMTM=,3291,xr.DataSet.from_dataframe / xr.DataArray.from_series does not preserve DateTimeIndex with timezone,923438,open,0,,,4,2019-09-07T10:10:40Z,2021-04-21T21:00:41Z,,NONE,,,,"#### Problem Description When using DataSet.from_dataframe (DataArray.from_series) to convert a pandas dataframe with a DateTimeIndex having a timezone - xarray converts the datetime into a nanosecond index - rather than keeping it as a datetime-index type. #### MCVE Code Sample ```python print( df.index ) ``` ``` DatetimeIndex(['2000-01-03 16:00:00-05:00', '2000-01-03 16:00:00-05:00', '2000-01-03 16:00:00-05:00', '2000-01-03 16:00:00-05:00', ... '2019-08-20 16:00:00-05:00', '2019-08-20 16:00:00-05:00'], dtype='datetime64[ns, EST]', name='time', length=12713014, freq=None) ``` ```python ds = xr.DataSet.from_dataframe( df.head( 1000 ) ) print( ds['time'] ) ``` ``` array([946933200000000000, 947019600000000000, 947106000000000000, 947192400000000000, 947278800000000000, 947538000000000000, 947624400000000000, ...], dtype=object) Coordinates: * time (time) object 946933200000000000 ... 
947624400000000000 ``` #### Expected Output After removing the tz localization from the DateTimeIndex of the dataframe, the conversion to a DataSet preserves the time-index (without converting it to nanoseconds) ```python df.index = df.index.tz_convert('UTC').tz_localize(None) ds = xr.DataSet.from_dataframe( df.head(1000) ) print( ds['time'] ) ``` ``` array(['2000-01-03T21:00:00.000000000', '2000-01-04T21:00:00.000000000', '2000-01-05T21:00:00.000000000', '2000-01-06T21:00:00.000000000', '2000-01-07T21:00:00.000000000', '2000-01-10T21:00:00.000000000', '2000-01-11T21:00:00.000000000'], dtype='datetime64[ns]') Coordinates: * time (time) datetime64[ns] 2000-01-03T21:00:00 ... 2000-01-11T21:00:00 ``` #### Output of ``xr.show_versions()``
INSTALLED VERSIONS ------------------ commit: None python: 3.7.3 (default, Mar 27 2019, 22:11:17) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: 4.9.0-9-amd64 machine: x86_64 processor: byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.10.4 libnetcdf: None xarray: 0.12.3+81.g41fecd86 pandas: 0.24.2 numpy: 1.16.2 scipy: 1.2.1 netCDF4: None pydap: None h5netcdf: None h5py: 2.9.0 Nio: None zarr: None cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.2.1 dask: 1.1.4 distributed: 1.26.0 matplotlib: 3.0.3 cartopy: None seaborn: 0.9.0 numbagg: None setuptools: 40.8.0 pip: 19.0.3 conda: 4.7.11 pytest: 4.3.1 IPython: 7.4.0 sphinx: 1.8.5
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3291/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 496809167,MDU6SXNzdWU0OTY4MDkxNjc=,3332,Memory usage of `da.rolling().construct`,923438,closed,0,,,5,2019-09-22T17:35:06Z,2021-02-16T15:00:37Z,2021-02-16T15:00:37Z,NONE,,,,"If I were to do `data_array.rolling( time=1000 ).construct('temp_time')` - what is going on under the hood ? Does it make 1000 physical copies of the original dataarray - or is it only returning a view ? I feel like it's the latter - but I'm seeing a memory spike (about 20-30% increase in total process memory consumption) when I use it - so there might be something else going on ? Any ideas / pointers would be appreciated. Thanks! ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3332/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue