issues

2 rows where state = "open", type = "issue" and user = 5948670 sorted by updated_at descending

id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
351846466 MDU6SXNzdWUzNTE4NDY0NjY= 2374 Suggestion: Add option for default_fillvals to open_dataset MeraX 5948670 open 0     2 2018-08-18T19:47:53Z 2021-07-17T19:59:43Z   CONTRIBUTOR      

Hi,

May I suggest adding a default_fillvals option to xarray.open_dataset (and xarray.open_dataarray)?

My problem:

I have netCDF data in which flagged values are stored as the netCDF default fill value of 9.96...e+36. But xarray (0.10.8) only masks arrays that have an explicit fill_value set:

```python
import netCDF4, xarray, numpy

nc = netCDF4.Dataset('test.nc', 'w', format='NETCDF4')
nc.createDimension('x', 3)

var1 = nc.createVariable('var1', 'f8', ('x',))
var2 = nc.createVariable('var2', 'f8', ('x',), fill_value=netCDF4.default_fillvals['f8'])

var1[:] = numpy.array([0., 1., netCDF4.default_fillvals['f8']])
var2[:] = numpy.array([0., 1., netCDF4.default_fillvals['f8']])
print('netCDF4 var1', nc.variables['var1'][:])
print('netCDF4 var2', nc.variables['var2'][:])
nc.close()

ds = xarray.open_dataset('test.nc')
print('xarray var1', ds.var1[:])
print('xarray var2', ds.var2[:])
```

The problem is that ds.var1 and ds.var2 are interpreted differently, although netCDF4 shows both as masked:

    netCDF4 var1 [0.0 1.0 --]
    netCDF4 var2 [0.0 1.0 --]
    xarray var1 <xarray.DataArray 'var1' (x: 3)>
    array([0.00000e+00, 1.00000e+00, 9.96921e+36])
    Dimensions without coordinates: x
    xarray var2 <xarray.DataArray 'var2' (x: 3)>
    array([ 0., 1., nan])
    Dimensions without coordinates: x

I agree that masking data only if the fill_value attribute is set is a good default. But I think it would be useful to be able to pass default_fillvals to open_dataset, to enable reading data that relies on the implicit default fill values.
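Until such an option exists, the masking can be applied by hand after reading. The following is only a minimal sketch of that idea using plain NumPy: `mask_default_fill` and `DEFAULT_FILL_F8` are hypothetical names introduced here, and the constant is written out under the assumption that it equals `netCDF4.default_fillvals['f8']`.

```python
import numpy as np

# Assumed to equal netCDF4.default_fillvals['f8'] (the netCDF default
# fill value for doubles); written out so the sketch needs only NumPy.
DEFAULT_FILL_F8 = 9.969209968386869e+36


def mask_default_fill(values, fill=DEFAULT_FILL_F8):
    """Hypothetical helper: replace entries equal to `fill` with NaN."""
    values = np.asarray(values, dtype="f8")
    # Exact comparison is intended: unwritten or flagged entries hold
    # the very same bit pattern as the fill constant.
    return np.where(values == fill, np.nan, values)


# Same values as var1 in the example above.
masked = mask_default_fill([0.0, 1.0, DEFAULT_FILL_F8])
print(masked)
```

On an xarray object, the analogous step would presumably be `ds['var1'].where(ds['var1'] != DEFAULT_FILL_F8)`, applied per variable after `open_dataset`.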

What do you think?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2374/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
455605431 MDU6SXNzdWU0NTU2MDU0MzE= 3020 Can we clarify decode_cf option of open_dataset? MeraX 5948670 open 0     2 2019-06-13T08:32:12Z 2019-11-16T20:45:19Z   CONTRIBUTOR      

Dear all,

I encountered a small, unforeseen bug when using the decode_cf option of open_dataset. From the docstring alone, I was not aware of the full meaning of this parameter, in particular that it overrides other options.

Example: ds = xarray.open_dataset(file_name, mask_and_scale=True, decode_cf=False)

The result is an unmasked and unscaled dataset: decode_cf=False simply overrides the other options. See the code: https://github.com/pydata/xarray/blob/stable/xarray/backends/api.py#L308-L312

A simple solution would be to explain in the docstring that decode_cf=False sets mask_and_scale, decode_times, concat_characters, and decode_coords to False. But it would probably be more convenient to detect option conflicts like the one in my example and raise a ValueError.
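The conflict check could look roughly like the sketch below. This is not xarray's code: `resolve_decode_options` is a hypothetical helper, and it assumes the individual options default to `None` so that an explicitly passed `True` can be told apart from an untouched default.

```python
def resolve_decode_options(decode_cf=True, mask_and_scale=None,
                           decode_times=None, concat_characters=None,
                           decode_coords=None):
    """Hypothetical helper: resolve decoding flags or raise on a conflict.

    Assumption: None means "not set by the caller".
    """
    options = {
        "mask_and_scale": mask_and_scale,
        "decode_times": decode_times,
        "concat_characters": concat_characters,
        "decode_coords": decode_coords,
    }
    if not decode_cf:
        # Any option explicitly enabled contradicts decode_cf=False.
        conflicts = sorted(name for name, value in options.items() if value)
        if conflicts:
            raise ValueError(
                "decode_cf=False disables " + ", ".join(conflicts)
                + ", which were explicitly set to True"
            )
        return {name: False for name in options}
    # decode_cf=True: unset options fall back to True.
    return {name: True if value is None else value
            for name, value in options.items()}


# The call from the example above would now fail loudly.
try:
    resolve_decode_options(decode_cf=False, mask_and_scale=True)
except ValueError as err:
    print(err)
```

Raising instead of silently overriding costs one extra check, and it would have made the surprise in the example visible at call time.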

What do you think? Do you prefer either of these options? I could write the small PR.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3020/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
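
As an illustration of how the filter shown at the top of this page maps onto the schema, here is a small sketch using Python's built-in sqlite3 module. It assumes an in-memory database with only a subset of the columns, filled with the two rows listed above; it is not the query Datasette itself generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Subset of the [issues] schema above, enough for the filter and sort.
conn.execute("""
    CREATE TABLE [issues] (
       [id] INTEGER PRIMARY KEY,
       [number] INTEGER,
       [title] TEXT,
       [user] INTEGER,
       [state] TEXT,
       [updated_at] TEXT,
       [type] TEXT
    )
""")
conn.execute("CREATE INDEX [idx_issues_user] ON [issues] ([user])")

# The two rows shown on this page.
rows = [
    (351846466, 2374,
     "Suggestion: Add option for default_fillvals to open_dataset",
     5948670, "open", "2021-07-17T19:59:43Z", "issue"),
    (455605431, 3020,
     "Can we clarify decode_cf option of open_dataset?",
     5948670, "open", "2019-11-16T20:45:19Z", "issue"),
]
conn.executemany("INSERT INTO [issues] VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# ISO 8601 timestamps stored as TEXT sort chronologically as strings.
query = """
    SELECT [number], [title], [updated_at]
    FROM [issues]
    WHERE [state] = ? AND [type] = ? AND [user] = ?
    ORDER BY [updated_at] DESC
"""
result = conn.execute(query, ("open", "issue", 5948670)).fetchall()
for number, title, updated_at in result:
    print(number, updated_at, title)
```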