issues

2 rows where type = "issue" and user = 12465248 sorted by updated_at descending


id: 1033950863 · node_id: I_kwDOAMm_X849oNaP · number: 5888 · title: open_[mf]dataset ds.encoding['source'] for pathlib.Path? · user: jmccreight (12465248) · state: closed · comments: 2 · created_at: 2021-10-22T21:10:23Z · updated_at: 2022-09-21T17:11:58Z · closed_at: 2022-09-13T07:17:39Z · author_association: CONTRIBUTOR

The question: do you want to support pathlib objects arriving at ds.encoding['source']?

What happened: If you pass a pathlib object to open_[mf]dataset, the ds.encoding['source'] is not set.

What you expected to happen: I believe that Paths used to work and now they don't, which greatly confused me.

Minimal Complete Verifiable Example:

```python
# enc_source_pathlib.py
import numpy as np
from pathlib import Path
import xarray as xr

# Inspiration in this test:
# https://github.com/pydata/xarray/blob/97887fd9bbfb2be58b491155c6bb08498ce294ca/xarray/tests/test_backends.py#L4918

rnddata = np.random.randn(10)
ds = xr.Dataset({"foo": ("x", rnddata)})
file_name = "rnddata.nc"
ds.to_netcdf(file_name)

# Passing a str: encoding["source"] is set.
with xr.open_dataset(file_name) as ds:
    assert ds.encoding["source"] == file_name

# Passing a pathlib.Path: encoding["source"] is missing.
with xr.open_dataset(Path(file_name)) as ds:
    print(ds.encoding)
    assert ds.encoding["source"] == file_name
```

Running it:

```
(hv) jamesmcc@panama[524]:~/> ipython enc_source_pathlib.py
/Users/jamesmcc/python_libs/xarray/xarray/backends/cfgrib_.py:28: UserWarning: Failed to load cfgrib - most likely there is a problem accessing the ecCodes library. Try `import cfgrib` to get the full error message
  "Failed to load cfgrib - most likely there is a problem accessing the ecCodes library. "
/Users/jamesmcc/python_libs/xarray/xarray/backends/plugins.py:61: RuntimeWarning: Engine 'cfgrib' loading failed:
Cannot find the ecCodes library
  warnings.warn(f"Engine {name!r} loading failed:\n{ex}", RuntimeWarning)
{'unlimited_dims': set()}

KeyError                                  Traceback (most recent call last)
~/python_libs/xarray/enc_source_pathlib.py in <module>
     16 with xr.open_dataset(Path(file_name)) as ds:
     17     print(ds.encoding)
---> 18     assert ds.encoding["source"] == file_name

KeyError: 'source'
```

I scanned `git blame`, but it didn't turn up any obviously undesired changes.
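For context, one common way this kind of inconsistency gets resolved is to normalize path-like inputs up front. A minimal sketch, assuming nothing about xarray's actual fix (`normalize_source` is a hypothetical helper name, not xarray API):

```python
import os
from pathlib import Path


def normalize_source(filename_or_obj):
    # Hypothetical helper (not xarray's actual code): coerce any
    # os.PathLike object to a plain str so downstream bookkeeping such
    # as encoding["source"] sees the same value for str and Path inputs.
    if isinstance(filename_or_obj, os.PathLike):
        return os.fspath(filename_or_obj)
    return filename_or_obj


print(normalize_source(Path("rnddata.nc")))  # rnddata.nc
print(normalize_source("rnddata.nc"))        # rnddata.nc
```

With such a coercion at the top of `open_dataset`, a `Path` and the equivalent `str` would produce identical `encoding["source"]` values.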

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5888/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue
id: 433490490 · node_id: MDU6SXNzdWU0MzM0OTA0OTA= · number: 2895 · title: Set/preserve the character array dimension name · user: jmccreight (12465248) · state: closed · comments: 2 · created_at: 2019-04-15T21:40:33Z · updated_at: 2019-04-19T22:50:36Z · closed_at: 2019-04-19T22:50:36Z · author_association: CONTRIBUTOR

This is a new feature proposal, not a bug report. I'll open a PR against this issue momentarily; it consists of 4 lines of new code.

I've found it highly annoying that one cannot set the name of the character array dimension. Looking at the code, I basically found what I expected, except for what I added. Summary: using a variable's `variable.encoding`, one can decode the name into `variable.encoding['char_dim_name']`, or one can simply set it when creating data from scratch. The `'char_dim_name'` can then be applied upon encoding. It's simple. All the new code is the same code that already handled character arrays, so there may not be any nasty edge cases.

This shows how it works and the behavior it changes:

```
# Using the proposed changes...

user@machine-session-1[1]:~/Downloads> ipython

import xarray as xa
char_arr = ['abc', 'def', 'ghi']
ds = xa.Dataset(data_vars={'char_arr': char_arr})
ds.char_arr.encoding.update({"dtype": "S1"})

# Default/current behavior
ds.to_netcdf('char_arr_string.nc')

# New functionality - name the character dimension.
ds.char_arr.encoding.update({"char_dim_name": "char_dim"})
ds.to_netcdf('char_arr_named.nc')

user@machine-session-2[1]:~/Downloads> ncdump -h char_arr_string.nc
netcdf char_arr_string {
dimensions:
	char_arr = 3 ;
	string3 = 3 ;
variables:
	char char_arr(char_arr, string3) ;
		char_arr:_Encoding = "utf-8" ;
}

user@machine-session-2[2]:~/Downloads> ncdump -h char_arr_named.nc
netcdf char_arr_named {
dimensions:
	char_arr = 3 ;
	char_dim = 3 ;
variables:
	char char_arr(char_arr, char_dim) ;
		char_arr:_Encoding = "utf-8" ;
}

# New functionality - when decoding, preserve the character dimension
# name in the variable encoding for... encoding.

ds_read = xa.open_dataset('char_arr_named.nc')
ds_read.char_arr.encoding
Out[4]:
{'_Encoding': 'utf-8',
 'char_dim_name': 'char_dim',
 'chunksizes': None,
 'complevel': 0,
 'contiguous': True,
 'dtype': dtype('S1'),
 'fletcher32': False,
 'original_shape': (3, 3),
 'shuffle': False,
 'source': '/Users/james/Downloads/char_arr_named.nc',
 'zlib': False}

ds_read.to_netcdf('char_arr_named_2.nc')
exit()

user@machine-session-1[2]:~/Downloads> ncdump -h char_arr_named_2.nc
netcdf char_arr_named_2 {
dimensions:
	char_arr = 3 ;
	char_dim = 3 ;
variables:
	char char_arr(char_arr, char_dim) ;
		char_arr:_Encoding = "utf-8" ;
}

user@machine-session-1[3]:~/Downloads> pip uninstall -y xarray
user@machine-session-1[4]:~/Downloads> pip install xarray
user@machine-session-1[5]:~/Downloads> ipython

# The old behavior... does not preserve the char dim name.

import xarray as xa
ds_read = xa.open_dataset('char_arr_named.nc')
ds_read.char_arr.encoding
Out[4]:
{'_Encoding': 'utf-8',
 'chunksizes': None,
 'complevel': 0,
 'contiguous': True,
 'dtype': dtype('S1'),
 'fletcher32': False,
 'original_shape': (3, 3),
 'shuffle': False,
 'source': '/Users/james/Downloads/char_arr_named.nc',
 'zlib': False}

ds_read.to_netcdf('char_arr_string_2.nc')

user@machine-session-2[6]:~/Downloads> ncdump -h char_arr_string_2.nc
netcdf char_arr_string_2 {
dimensions:
	char_arr = 3 ;
	string3 = 3 ;
variables:
	char char_arr(char_arr, string3) ;
		char_arr:_Encoding = "utf-8" ;
}
```
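The gist of the proposal can be sketched in a few lines. This is a rough illustration of the idea, not the actual xarray code (`encode_char_dim` is a hypothetical name): when a fixed-width string variable is expanded to a character array, the name of the extra dimension comes from `encoding['char_dim_name']` if present, else falls back to netCDF's usual `string<N>` convention.

```python
import numpy as np


def encode_char_dim(var_name, strings, encoding):
    # Hypothetical sketch (not xarray's actual code): expand fixed-width
    # byte strings into a character array and pick a name for the new
    # trailing dimension.
    arr = np.array(strings, dtype="S")          # e.g. dtype('S3')
    width = arr.dtype.itemsize                  # fixed string width, e.g. 3
    chars = arr.view("S1").reshape(arr.shape + (width,))
    dim_name = encoding.get("char_dim_name", f"string{width}")
    return chars, (var_name, dim_name)


# Default: netCDF-style fallback name.
chars, dims = encode_char_dim("char_arr", ["abc", "def", "ghi"], {})
print(dims)  # ('char_arr', 'string3')

# With the proposed encoding key: user-chosen name.
chars, dims = encode_char_dim(
    "char_arr", ["abc", "def", "ghi"], {"char_dim_name": "char_dim"}
)
print(dims)  # ('char_arr', 'char_dim')
```

The decode side would be the mirror image: record the character dimension's name into `variable.encoding['char_dim_name']` so a round trip preserves it.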

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2895/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue


CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
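The query behind this page ("2 rows where type = 'issue' and user = 12465248 sorted by updated_at descending") can be reproduced against this schema with the stdlib `sqlite3` module; a quick sketch, trimmed to the columns the query touches:

```python
import sqlite3

# In-memory copy of the issues table, reduced to the relevant columns.
con = sqlite3.connect(":memory:")
con.execute(
    """
    CREATE TABLE issues (
        id INTEGER PRIMARY KEY, number INTEGER, title TEXT,
        user INTEGER, type TEXT, updated_at TEXT
    )
    """
)
con.executemany(
    "INSERT INTO issues VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1033950863, 5888,
         "open_[mf]dataset ds.encoding['source'] for pathlib.Path?",
         12465248, "issue", "2022-09-21T17:11:58Z"),
        (433490490, 2895,
         "Set/preserve the character array dimension name",
         12465248, "issue", "2019-04-19T22:50:36Z"),
    ],
)

# The page's filter and sort, expressed as SQL.
rows = con.execute(
    "SELECT number, title FROM issues "
    "WHERE type = 'issue' AND user = 12465248 "
    "ORDER BY updated_at DESC"
).fetchall()
print([r[0] for r in rows])  # [5888, 2895]
```

ISO-8601 timestamps stored as TEXT sort correctly with plain string comparison, which is why `ORDER BY updated_at DESC` works here without any date parsing.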
Powered by Datasette · About: xarray-datasette