
issues


3 rows where repo = 13221727, state = "open" and user = 923438 sorted by updated_at descending




Issue #3232: Use pytorch as backend for xarrays
id: 482543307 · node_id: MDU6SXNzdWU0ODI1NDMzMDc= · user: fjanoos (923438) · state: open · locked: 0 · comments: 49 · created_at: 2019-08-19T21:45:15Z · updated_at: 2022-07-20T18:01:56Z · author_association: NONE

I would be interested in using pytorch as a backend for xarrays, because:
  • pytorch is very similar to numpy, so the conceptual overhead is small
  • [most helpful] it enables having a GPU as the underlying hardware for compute, which would provide a non-trivial speed-up
  • it would allow seamless integration with deep-learning algorithms and techniques

Any thoughts on what the interest for such a feature might be? I would be open to implementing parts of it, so any suggestions on where I could start?

Thanks

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3232/reactions",
    "total_count": 4,
    "+1": 4,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: reopened · repo: xarray (13221727) · type: issue
Issue #3320: Error saving xr.Dataset with timezone aware time index to netcdf format.
id: 495382528 · node_id: MDU6SXNzdWU0OTUzODI1Mjg= · user: fjanoos (923438) · state: open · locked: 0 · comments: 1 · created_at: 2019-09-18T18:20:42Z · updated_at: 2022-01-17T21:23:02Z · author_association: NONE

When I try to save an xr.Dataset that was created from a pandas dataframe with a tz-aware time index (see #3291), xarray converts the time index into int64 nanoseconds.

For example, this is what the converted dataset looks like:

```
<xarray.Dataset>
Dimensions:  (symbol: 3196, time: 4977)
Coordinates:
  * time     (time) object 946933200000000000 ... 1566334800000000000
  * symbol   (symbol) int64 0 1 2 3 4 5 6 ... 3189 3190 3191 3192 3193 3194 3195
Data variables:
    var_0    (time, symbol) float32 nan 4301510000.0 nan nan ... nan nan nan nan
    var_1    (time, symbol) object nan False nan nan nan ... nan nan nan nan nan
    var_2    (time, symbol) float32 nan 475.0 nan nan nan ... nan nan nan nan
    var_3    (time, symbol) float32 nan 475.0 nan nan nan ... nan nan nan nan
    var_5    (time, symbol) float32 nan 475.9 nan nan nan ... nan nan nan nan
    var_6    (time, symbol) float32 nan 475.9 nan nan nan ... nan nan nan nan
    var_7    (time, symbol) float32 nan 429.5 nan nan nan ... nan nan nan nan
    var_8    (time, symbol) float32 nan 429.5 nan nan nan ... nan nan nan nan
    var_10   (time, symbol) float32 nan -0.06736842 nan nan ... nan nan nan nan
    var_11   (time, symbol) float32 nan 0.05085102 nan nan ... nan nan nan nan
    var_12   (time, symbol) float32 nan 0.029103609 nan nan ... nan nan nan nan
    var_13   (time, symbol) float32 nan 0.048769474 nan nan ... nan nan nan nan
    var_14   (time, symbol) float32 nan 442.9 nan nan nan ... nan nan nan nan
    var_15   (time, symbol) float32 nan 442.9 nan nan nan ... nan nan nan nan
    var_16   (time, symbol) float32 nan nan nan nan nan ... nan nan nan nan nan
    var_17   (time, symbol) float32 nan nan nan nan nan ... nan nan nan nan nan
    var_18   (time, symbol) float32 nan nan nan nan nan ... nan nan nan nan nan
    var_19   (time, symbol) float32 nan nan nan nan nan ... nan nan nan nan nan
    var_20   (time, symbol) float32 nan nan nan nan nan ... nan nan nan nan nan
    var_21   (time, symbol) float32 nan 9501900.0 nan nan ... nan nan nan nan
    var_22   (time, symbol) float32 nan 9501900.0 nan nan ... nan nan nan nan
```

Now when I try to save this dataset using:

```python
pds.to_netcdf( ... )
```

I get the following error:

Dropping into pdb when this error is hit - it looks like the problem is with the time index.

After converting the time index into a regular int index:

```python
pds = pds.assign_coords(time=np.arange(len(pds.time)))
pds.to_netcdf( ... )
```

this works OK.

And this also works:

```python
pds = pds.assign_coords(time=pd.to_datetime(pds.time))
pds.to_netcdf( ... )
```

Note that pd.to_datetime(pds.time) drops the timezone from the index, so the issue is very much about saving tz-aware time indices.
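The second workaround works because the mangled time coordinate holds int64 nanosecond epoch values, which pd.to_datetime interprets as nanoseconds since the Unix epoch and turns back into a tz-naive datetime64 index. A minimal pandas-only sketch, using two of the sample values from the dataset repr above (the `ns_values` name is illustrative):

```python
import numpy as np
import pandas as pd

# Two of the int64 nanosecond values shown in the dataset repr above.
ns_values = np.array([946933200000000000, 947019600000000000], dtype="int64")

# pd.to_datetime treats integer input as nanoseconds since the Unix epoch
# and returns a tz-naive datetime64[ns] index, which netCDF can encode.
idx = pd.to_datetime(ns_values)
print(idx[0])  # 2000-01-03 21:00:00
```

The timezone information itself is lost either way; this only recovers a serializable naive index.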

Any ideas on what I can do about this?

Thanks! -firdaus

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3320/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: xarray (13221727) · type: issue
Issue #3291: xr.DataSet.from_dataframe / xr.DataArray.from_series does not preserve DateTimeIndex with timezone
id: 490618213 · node_id: MDU6SXNzdWU0OTA2MTgyMTM= · user: fjanoos (923438) · state: open · locked: 0 · comments: 4 · created_at: 2019-09-07T10:10:40Z · updated_at: 2021-04-21T21:00:41Z · author_association: NONE

Problem Description

When using Dataset.from_dataframe (DataArray.from_series) to convert a pandas dataframe whose DatetimeIndex has a timezone, xarray converts the datetimes into a nanosecond index rather than keeping them as a datetime-index type.

MCVE Code Sample

```python
print(df.index)
```
```
DatetimeIndex(['2000-01-03 16:00:00-05:00', '2000-01-03 16:00:00-05:00',
               '2000-01-03 16:00:00-05:00', '2000-01-03 16:00:00-05:00',
               ...
               '2019-08-20 16:00:00-05:00', '2019-08-20 16:00:00-05:00'],
              dtype='datetime64[ns, EST]', name='time', length=12713014, freq=None)
```
```python
ds = xr.Dataset.from_dataframe(df.head(1000))
print(ds['time'])
```
```
<xarray.DataArray 'time' (time: 7)>
array([946933200000000000, 947019600000000000, 947106000000000000,
       947192400000000000, 947278800000000000, 947538000000000000,
       947624400000000000, ...], dtype=object)
Coordinates:
  * time     (time) object 946933200000000000 ... 947624400000000000
```

Expected Output

After removing the tz localization from the DatetimeIndex of the dataframe, the conversion to a Dataset preserves the time index (without converting it to nanoseconds):

```python
df.index = df.index.tz_convert('UTC').tz_localize(None)
ds = xr.Dataset.from_dataframe(df.head(1000))
print(ds['time'])
```
```
<xarray.DataArray 'time' (time: 7)>
array(['2000-01-03T21:00:00.000000000', '2000-01-04T21:00:00.000000000',
       '2000-01-05T21:00:00.000000000', '2000-01-06T21:00:00.000000000',
       '2000-01-07T21:00:00.000000000', '2000-01-10T21:00:00.000000000',
       '2000-01-11T21:00:00.000000000'], dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] 2000-01-03T21:00:00 ... 2000-01-11T21:00:00
```
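The `tz_convert('UTC').tz_localize(None)` step can be demonstrated without xarray at all; a minimal pandas-only sketch with a small tz-aware index mirroring the one in the report (the sample timestamps are illustrative):

```python
import pandas as pd

# A small tz-aware index like the one in the report.
idx = pd.DatetimeIndex(
    ["2000-01-03 16:00:00", "2000-01-04 16:00:00"]
).tz_localize("EST")
print(idx.dtype)  # datetime64[ns, EST]

# Convert to UTC, then strip the tz info; the result is plain
# datetime64[ns], which from_dataframe keeps as a proper time coordinate.
naive = idx.tz_convert("UTC").tz_localize(None)
print(naive.dtype)  # datetime64[ns]
print(naive[0])     # 2000-01-03 21:00:00
```

Converting to UTC first keeps all timestamps on a common clock before the offset information is discarded.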

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.3 (default, Mar 27 2019, 22:11:17) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 4.9.0-9-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: None
xarray: 0.12.3+81.g41fecd86
pandas: 0.24.2
numpy: 1.16.2
scipy: 1.2.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: 2.9.0
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 1.1.4
distributed: 1.26.0
matplotlib: 3.0.3
cartopy: None
seaborn: 0.9.0
numbagg: None
setuptools: 40.8.0
pip: 19.0.3
conda: 4.7.11
pytest: 4.3.1
IPython: 7.4.0
sphinx: 1.8.5
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3291/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: xarray (13221727) · type: issue


CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
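The listing's filter ("3 rows where repo = 13221727, state = "open" and user = 923438 sorted by updated_at descending") is just a rendering of a plain SQL query over this schema. A minimal sketch against a trimmed, in-memory copy of the table (the column subset and truncated titles are illustrative, not the full schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER,
   [state] TEXT,
   [updated_at] TEXT,
   [repo] INTEGER
);
INSERT INTO issues VALUES
  (482543307, 3232, 'Use pytorch as backend for xarrays', 923438, 'open', '2022-07-20T18:01:56Z', 13221727),
  (495382528, 3320, 'Error saving xr.Dataset ...', 923438, 'open', '2022-01-17T21:23:02Z', 13221727),
  (490618213, 3291, 'xr.DataSet.from_dataframe ...', 923438, 'open', '2021-04-21T21:00:41Z', 13221727);
""")

# The query implied by the page header: filter by repo, state, and user,
# sorted by updated_at descending.
rows = conn.execute(
    "SELECT number, title FROM issues "
    "WHERE repo = ? AND state = ? AND user = ? "
    "ORDER BY updated_at DESC",
    (13221727, "open", 923438),
).fetchall()
print([n for n, _ in rows])  # [3232, 3320, 3291]
```

ISO 8601 timestamps stored as TEXT sort correctly with plain string comparison, which is why `ORDER BY updated_at DESC` works here without any date parsing.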