issues

2 rows where state = "closed", type = "issue" and user = 6883049 sorted by updated_at descending

id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1343038233 I_kwDOAMm_X85QDSMZ 6929 Support new netcdf4 1.6.0 compression arguments markelg 6883049 closed 0     2 2022-08-18T12:35:34Z 2022-12-01T22:41:53Z 2022-12-01T22:41:53Z CONTRIBUTOR      

Is your feature request related to a problem?

When using the netcdf4 engine, I am not able to use the new "compression" argument to choose a compression scheme different from zlib in the encoding.

```
    if raise_on_invalid:
        invalid = [k for k in encoding if k not in valid_encodings]
        if invalid:
            raise ValueError(
                f"unexpected encoding parameters for {backend!r} backend: {invalid!r}. Valid "
                f"encodings are: {valid_encodings!r}"
            )
E   ValueError: unexpected encoding parameters for 'netCDF4' backend: ['compression']. Valid encodings are: {'fletcher32', 'zlib', 'contiguous', 'dtype', 'least_significant_digit', 'shuffle', 'complevel', '_FillValue', 'chunksizes'}

../../../../netCDF4_.py:279: ValueError
```
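The check in the traceback above can be sketched in isolation. The following is a minimal, hypothetical stand-in (the names `check_encoding`, `LEGACY_ENCODINGS` and `EXTENDED_ENCODINGS` are illustrative and not taken from xarray) showing why `compression` is rejected and how extending the set of valid keys would let it through:

```python
# Hypothetical stand-in for the backend check shown in the traceback.
# The set below copies the valid encodings listed in the error message.
LEGACY_ENCODINGS = {
    "fletcher32", "zlib", "contiguous", "dtype", "least_significant_digit",
    "shuffle", "complevel", "_FillValue", "chunksizes",
}

# Assumption: the fix amounts to accepting the new keyword as a valid key.
EXTENDED_ENCODINGS = LEGACY_ENCODINGS | {"compression"}


def check_encoding(encoding, valid_encodings, backend="netCDF4", raise_on_invalid=True):
    """Return the valid subset of `encoding`, raising on unknown keys if asked to."""
    invalid = [k for k in encoding if k not in valid_encodings]
    if invalid and raise_on_invalid:
        raise ValueError(
            f"unexpected encoding parameters for {backend!r} backend: {invalid!r}. "
            f"Valid encodings are: {valid_encodings!r}"
        )
    # With raise_on_invalid=False, unknown keys are silently dropped.
    return {k: v for k, v in encoding.items() if k in valid_encodings}
```

With `LEGACY_ENCODINGS`, passing `{"compression": "zstd"}` raises the same `ValueError` as above; with `EXTENDED_ENCODINGS`, the key is passed through unchanged.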

Furthermore, according to the netCDF4 1.6.0 release notes, the zlib argument is now deprecated:

* add 'compression' kwarg to createVariable to enable new compression functionality in netcdf-c 4.9.0. 'None','zlib','szip','zstd','bzip2' 'blosc_lz','blosc_lz4','blosc_lz4hc','blosc_zlib' and 'blosc_zstd' are currently supported. 'blosc_shuffle', 'szip_mask' and 'szip_pixels_per_block' kwargs also added. compression='zlib' is equivalent to (the now deprecated) zlib=True. If the environment variable NETCDF_PLUGIN_DIR is set to point to the directory with the compression plugin lib__nc* files, then the compression plugins will be installed within the package and be automatically available (the binary wheels have this). Otherwise, the environment variable HDF5_PLUGIN_PATH needs to be set at runtime to point to plugins in order to use the new compression options.
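Since the release notes state that `compression='zlib'` is equivalent to the deprecated `zlib=True`, a backend could translate the old flag into the new keyword before writing. A minimal sketch, assuming a hypothetical helper `normalize_compression` (not part of xarray or netCDF4) and using only the scheme names listed in the release notes:

```python
# Compression schemes named in the netCDF4 1.6.0 release notes.
SUPPORTED_COMPRESSION = {
    "zlib", "szip", "zstd", "bzip2",
    "blosc_lz", "blosc_lz4", "blosc_lz4hc", "blosc_zlib", "blosc_zstd",
}


def normalize_compression(encoding):
    """Hypothetical helper: map the legacy zlib flag onto the compression key."""
    enc = dict(encoding)  # work on a copy so the caller's dict is untouched
    legacy_zlib = enc.pop("zlib", False)
    compression = enc.get("compression")
    if compression is not None and compression not in SUPPORTED_COMPRESSION:
        raise ValueError(f"unsupported compression: {compression!r}")
    if legacy_zlib:
        if compression not in (None, "zlib"):
            raise ValueError(
                f"zlib=True conflicts with compression={compression!r}"
            )
        enc["compression"] = "zlib"
    return enc
```

For example, `normalize_compression({"zlib": True, "complevel": 4})` yields an encoding with `compression` set to `"zlib"` and no `zlib` key, while an explicit `compression="zstd"` is left as it is.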

I am using the latest versions:

xarray: 2022.6.0 pandas: 1.4.2 numpy: 1.22.4 scipy: None netCDF4: 1.6.0 pydap: None h5netcdf: None h5py: None Nio: None zarr: None cftime: 1.6.0 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2022.6.0 distributed: 2022.6.0 matplotlib: 3.5.2 cartopy: None seaborn: None numbagg: None fsspec: 2022.5.0 cupy: None pint: None sparse: None flox: None numpy_groupies: None setuptools: 62.6.0 pip: 22.1.2 conda: None pytest: 7.1.2 IPython: 8.4.0 sphinx: 5.1.1

Describe the solution you'd like

Update the netcdf4 backend to support these arguments. Should not be too difficult.

Describe alternatives you've considered

No response

Additional context

I can try to do this myself; it does not look hard.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6929/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
94012395 MDU6SXNzdWU5NDAxMjM5NQ== 457 xray raises error when opening datasets with multi-dimensional coordinate variables markelg 6883049 closed 0     2 2015-07-09T10:32:16Z 2015-09-03T13:34:42Z 2015-09-03T13:34:42Z CONTRIBUTOR      

Hello and thank you for this great package.

I have an (OPeNDAP) dataset where one coordinate (time24) is attached to a 2-dimensional coordinate variable. The reason is that it contains a set of forecasts that overlap in time, so the value of time24 depends on the run. Unfortunately it's not open, so I can't share it for tests.

The main variable is:

float32 mean2t24(run, member, time24, lat, lon)
    long_name: Mean temperature at 2 metres since last 24 hours @ Ground or water surface

And the coordinate variables are:

```
int32 run(run)
    long_name: Run time for ForecastModelRunCollection
    standard_name: forecast_reference_time
    units: hours since 1981-01-01T00:00:00
    _CoordinateAxisType: RunTime

|S1 member(member, maxStrlen64)
    standard_name: realization
    _CoordinateAxisType: Ensemble

int32 time24(run, time24)
    long_name: Forecast time for ForecastModelRunCollection
    standard_name: time
    units: hours since 1981-01-01T00:00:00
    _CoordinateAxisType: Time

float32 lon(lon)
    units: degrees_east

float32 lat(lat)
    units: degrees_north
```

xray is currently unable to open this dataset:

ValueError: an index variable must be defined with 1-dimensional data

Which is OK, since this looks difficult to support, but it would be fine if I could at least exclude the variable time24 from being read by xray, with a flag like "exclude_variable=(var1, var2, ...)". xray would then fill the coordinate with the default int64 values (0, 1, 2, 3, 4...) that it uses when there is no coordinate for a dimension. This would also be very useful for excluding troublesome variables (e.g. corrupt, with weird data types, inconsistent when concatenating) that are present in many datasets. Another option would be to issue a warning instead of an error, and then fill the variable with the default values (0, 1, 2, 3, 4...).
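Both proposals (an exclusion flag, or a warning with a fallback) can be illustrated without xray itself. The sketch below is a hypothetical, pure-Python model: `build_index_coords`, `exclude` and `on_invalid` are illustrative names, and variables are represented as plain `(dims, values)` tuples rather than real xray objects.

```python
import warnings


def build_index_coords(dim_sizes, coord_vars, exclude=(), on_invalid="raise"):
    """Build one index per dimension, falling back to 0..n-1 where needed.

    dim_sizes:  {dim_name: size}
    coord_vars: {var_name: (dims_tuple, values)}
    exclude:    variable names to skip entirely (the proposed flag)
    on_invalid: "raise" or "warn" for multi-dimensional index variables
    """
    coords = {}
    for dim, size in dim_sizes.items():
        entry = coord_vars.get(dim)
        if entry is None or dim in exclude:
            # No usable coordinate variable: default integer index.
            coords[dim] = list(range(size))
            continue
        dims, values = entry
        if len(dims) != 1:
            msg = f"index variable {dim!r} must be defined with 1-dimensional data"
            if on_invalid == "raise":
                raise ValueError(msg)
            warnings.warn(msg + "; using default integer index")
            coords[dim] = list(range(size))
            continue
        coords[dim] = list(values)
    return coords


# Toy stand-in for the dataset described above (values are made up).
dim_sizes = {"run": 2, "time24": 3}
coord_vars = {
    "run": (("run",), [0, 24]),
    "time24": (("run", "time24"), [[0, 24, 48], [24, 48, 72]]),
}
coords = build_index_coords(dim_sizes, coord_vars, exclude=("time24",))
```

Here `time24` ends up with the default index `[0, 1, 2]` while `run` keeps its own values. Later xarray releases expose a `drop_variables` argument on `open_dataset` that serves the same purpose as the flag proposed here.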

I am looking at the code to see if I can implement this myself, but I am not sure how to proceed.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/457/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
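
The filter described at the top of this page (state = "closed", type = "issue", user = 6883049, sorted by updated_at descending) can be reproduced against this schema with Python's built-in `sqlite3` module. The sketch below is only an illustration: it uses an in-memory database and a trimmed subset of the columns, populated with the two rows shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Trimmed subset of the [issues] schema, enough for the filter and sort.
conn.execute(
    """
    CREATE TABLE [issues] (
       [id] INTEGER PRIMARY KEY,
       [number] INTEGER,
       [title] TEXT,
       [user] INTEGER,
       [state] TEXT,
       [updated_at] TEXT,
       [type] TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO [issues] VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (94012395, 457,
         "xray raises error when opening datasets with multi-dimensional coordinate variables",
         6883049, "closed", "2015-09-03T13:34:42Z", "issue"),
        (1343038233, 6929,
         "Support new netcdf4 1.6.0 compression arguments",
         6883049, "closed", "2022-12-01T22:41:53Z", "issue"),
    ],
)

# ISO 8601 timestamps sort correctly as text, so ORDER BY works on the TEXT column.
query = """
    SELECT [number], [title]
    FROM [issues]
    WHERE [state] = ? AND [type] = ? AND [user] = ?
    ORDER BY [updated_at] DESC
"""
result = conn.execute(query, ("closed", "issue", 6883049)).fetchall()
for number, title in result:
    print(number, title)
```

Parameter placeholders (`?`) are used instead of string formatting so that the filter values are passed safely to SQLite.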
Powered by Datasette · Queries took 22.223ms · About: xarray-datasette