issue_comments

17 rows where user = 905179 sorted by updated_at descending
Issues represented (comment count):

  • Better compression algorithms for NetCDF (5)
  • "write to read-only" Error in xarray.open_mfdataset() with opendap datasets (5)
  • Should the zarr backend support NCZarr conventions? (5)
  • decode_cf not scaling and off-setting correctly (1)
  • Document Xarray zarr encoding conventions (1)

All 17 comments are by user DennisHeimbigner (905179); author_association: NONE.
Columns: id, html_url, issue_url, node_id, user, created_at, updated_at (sorted descending), author_association, body, reactions, performed_via_github_app, issue
1090483461 https://github.com/pydata/xarray/issues/6374#issuecomment-1090483461 https://api.github.com/repos/pydata/xarray/issues/6374 IC_kwDOAMm_X85A_3UF DennisHeimbigner 905179 2022-04-06T16:46:32Z 2022-04-06T16:46:32Z NONE

> As it is currently it is also not possible to write a zarr which follows the GDAL ZARR driver conventions. Writing the _CRS attribute also results in a TypeError:

Can you elaborate? What API are you using to do the write: python, netcdf-c, or what?

  Should the zarr backend support NCZarr conventions? 1172229856
1081236360 https://github.com/pydata/xarray/issues/6374#issuecomment-1081236360 https://api.github.com/repos/pydata/xarray/issues/6374 IC_kwDOAMm_X85AcluI DennisHeimbigner 905179 2022-03-28T23:05:51Z 2022-03-28T23:05:51Z NONE

> dimension names stored by xarray in .zattrs["_ARRAY_DIMENSIONS"] are stored by NCZarr in .zarray["_NCZARR_ARRAY"]["dimrefs"]

I made a recent change so that, where possible, all NCZarr files contain the xarray _ARRAY_DIMENSIONS attribute. By "where possible" I mean that the array is in the root group and the dimensions it references are "defined" in the root group (i.e. they have the simple FQN "/XXX", where XXX is the dim name). This means that there is sometimes a duplication of information between _ARRAY_DIMENSIONS and .zarray["_NCZARR_ARRAY"]["dimrefs"].
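The duplication described here can be sketched with a pair of hypothetical metadata dictionaries. The key names are the ones quoted in this thread; the variable and dimension names are made up for illustration:

```python
# Hypothetical metadata for a 1-D variable of length 10, sketching where each
# convention records dimension names.

# xarray convention: dimension names live in the array's .zattrs
zattrs = {"_ARRAY_DIMENSIONS": ["t"]}

# NCZarr convention: dimension references live inside .zarray as FQNs
zarray = {
    "shape": [10],
    "_NCZARR_ARRAY": {"dimrefs": ["/t"]},
}

# The duplication: for dimensions defined in the root group, taking the last
# path component of each FQN recovers the xarray dimension names.
names_from_nczarr = [ref.rsplit("/", 1)[-1]
                     for ref in zarray["_NCZARR_ARRAY"]["dimrefs"]]
assert names_from_nczarr == zattrs["_ARRAY_DIMENSIONS"]
```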

  Should the zarr backend support NCZarr conventions? 1172229856
1076821132 https://github.com/pydata/xarray/issues/6374#issuecomment-1076821132 https://api.github.com/repos/pydata/xarray/issues/6374 IC_kwDOAMm_X85ALvyM DennisHeimbigner 905179 2022-03-23T21:07:01Z 2022-03-23T21:07:01Z NONE

I guess I was not clear. If you are willing to lose netcdf specific metadata, then I believe any xarray or zarr implementation should be able to read nczarr written data with no changes needed.

  Should the zarr backend support NCZarr conventions? 1172229856
1076777717 https://github.com/pydata/xarray/issues/6374#issuecomment-1076777717 https://api.github.com/repos/pydata/xarray/issues/6374 IC_kwDOAMm_X85ALlL1 DennisHeimbigner 905179 2022-03-23T20:15:18Z 2022-03-23T20:15:18Z NONE

At the moment, NCZarr format files (as opposed to pure Zarr format files produced by NCZarr) do not include the Xarray _ARRAY_DIMENSIONS attribute. Now that I think about it, there is no reason not to include that attribute where it is meaningful, so I will make that change. After that change, the situation should be as follows:

Xarray can read any nczarr format file subject to the following conditions:
1. xarray attempts to read only the root group and ignores subgroups
    * this is because xarray cannot handle subgroups.
2. the xarray implementation ignores extra dictionary keys (e.g. in .zarray and .zattrs)
   that it does not recognize
    * this should already be the case under the principle of "read broadly, write narrowly".
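Condition 2 ("read broadly, write narrowly") can be sketched as a reader that keeps only the .zarray keys it understands and silently ignores extras such as NCZarr's "_NCZARR_ARRAY". This is an illustrative sketch, not xarray's actual implementation:

```python
import json

# Keys defined by the Zarr v2 array metadata spec; anything else is ignored.
KNOWN_ZARRAY_KEYS = {
    "zarr_format", "shape", "chunks", "dtype",
    "compressor", "fill_value", "order", "filters",
}

def parse_zarray(text):
    """Parse .zarray metadata, tolerating unrecognized extra keys."""
    meta = json.loads(text)
    # Ignore unrecognized keys rather than rejecting the file.
    return {k: v for k, v in meta.items() if k in KNOWN_ZARRAY_KEYS}

raw = json.dumps({
    "zarr_format": 2,
    "shape": [10],
    "chunks": [10],
    "dtype": "<f4",
    "compressor": None,
    "fill_value": None,
    "order": "C",
    "filters": None,
    "_NCZARR_ARRAY": {"dimrefs": ["/x"]},  # extra NCZarr key, ignored
})
meta = parse_zarray(raw)
assert "_NCZARR_ARRAY" not in meta
assert meta["shape"] == [10]
```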
  Should the zarr backend support NCZarr conventions? 1172229856
1071720621 https://github.com/pydata/xarray/issues/6374#issuecomment-1071720621 https://api.github.com/repos/pydata/xarray/issues/6374 IC_kwDOAMm_X84_4Sit DennisHeimbigner 905179 2022-03-17T22:47:59Z 2022-03-17T22:47:59Z NONE

For Unidata and netcdf, I think the situation is briefly this.

In netcdf-4, dimensions are named objects that can "reside" inside groups. So for example we might have this:

    netcdf example {
    dimensions:
        x = 1;
        y = 10;
        z = 20;
    group: g1 {
        dimensions:
            a = 1;
            y = 10;
            z = 5;
        variables:
            float v(/x, /g1/y, /z);
        }
    }

So base dimension names (e.g. "z") can occur in different groups and can represent different dimension objects (with different sizes).

It is possible to reference any dimension using fully-qualified-names (FQNs) such as "/g1/y". This capability is important so that, for example, related dimensions can be isolated within a group.

NCZarr captures this information by recording fully qualified names as special keys. This differs from XArray where fully qualified names are not supported. From the netcdf point of view, it is as if all dimension objects were declared in the root group.
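The FQN idea can be sketched in a few lines: the group path and the dimension's base name are recovered by splitting on "/", and two dimensions with the same base name but different group paths remain distinct objects. A minimal sketch, with made-up names:

```python
# A small sketch of netCDF-style fully qualified names (FQNs).
def split_fqn(fqn):
    """Split an FQN like "/g1/y" into (group path, base name)."""
    group, _, name = fqn.rpartition("/")
    return (group or "/", name)

assert split_fqn("/g1/y") == ("/g1", "y")
assert split_fqn("/x") == ("/", "x")  # dimension defined in the root group

# Two FQNs with the same base name are distinct dimension objects:
assert split_fqn("/z") != split_fqn("/g1/z")
```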

If XArray is to be extended to support the equivalent of groups and distinct sets of dimensions are going to be supported in different groups, then some equivalent of the netcdf FQN is going to be needed.

One final note. In netcdf, the dimension size is declared once and associated with a name. In zarr/xarray, the size occurs in multiple places (via the "shape" key) and the name-size association is also declared multiple times via the _ARRAY_DIMENSIONS attribute.

  Should the zarr backend support NCZarr conventions? 1172229856
641468791 https://github.com/pydata/xarray/issues/4082#issuecomment-641468791 https://api.github.com/repos/pydata/xarray/issues/4082 MDEyOklzc3VlQ29tbWVudDY0MTQ2ODc5MQ== DennisHeimbigner 905179 2020-06-09T17:40:16Z 2020-06-09T17:40:16Z NONE

I do not know because I do not understand who is doing the caching. The above archive reference is no longer relevant because the dap2 code now uses an in-memory file rather than something in /tmp. Netcdf-c keeps its curl connections open until nc_close is called. I would assume that each curl connection maintains at least one file descriptor open. But is the cache that shows the problem a python maintained cache or a Windows cache of some sort?

  "write to read-only" Error in xarray.open_mfdataset() with opendap datasets 621177286
640871586 https://github.com/pydata/xarray/issues/4082#issuecomment-640871586 https://api.github.com/repos/pydata/xarray/issues/4082 MDEyOklzc3VlQ29tbWVudDY0MDg3MTU4Ng== DennisHeimbigner 905179 2020-06-08T20:34:30Z 2020-06-08T20:34:43Z NONE

So I tried to reproduce this using Cygwin with the latest netcdf master and using ncdump. It seems to work OK. But this raises a question: can someone try this command under Windows to see if it fails?

ncdump 'https://www.ncei.noaa.gov/thredds/dodsC/OisstBase/NetCDF/V2.0/AVHRR/201703/avhrr-only-v2.20170322.nc'

If it succeeds then it may mean the problem is with python rather than netcdf.

  "write to read-only" Error in xarray.open_mfdataset() with opendap datasets 621177286
640815050 https://github.com/pydata/xarray/issues/4082#issuecomment-640815050 https://api.github.com/repos/pydata/xarray/issues/4082 MDEyOklzc3VlQ29tbWVudDY0MDgxNTA1MA== DennisHeimbigner 905179 2020-06-08T19:05:44Z 2020-06-08T19:08:34Z NONE

BTW, what version of the netcdf-c library is being used? I see this in an above comment: "netcdf4: 1.5.3". But that cannot possibly be correct.

  "write to read-only" Error in xarray.open_mfdataset() with opendap datasets 621177286
640813885 https://github.com/pydata/xarray/issues/4082#issuecomment-640813885 https://api.github.com/repos/pydata/xarray/issues/4082 MDEyOklzc3VlQ29tbWVudDY0MDgxMzg4NQ== DennisHeimbigner 905179 2020-06-08T19:03:28Z 2020-06-08T19:03:28Z NONE

I agree. To be more precise, NC_EPERM is generally thrown when an attempt is made to modify a read-only file. So it is possible that it isn't the DAP2 code, but somewhere, an attempt is being made to modify the dataset. There are pieces of the netcdf-c library that are conditional on Windows. It might be interesting if anyone can check if this occurs under cygwin.

  "write to read-only" Error in xarray.open_mfdataset() with opendap datasets 621177286
640803093 https://github.com/pydata/xarray/issues/4082#issuecomment-640803093 https://api.github.com/repos/pydata/xarray/issues/4082 MDEyOklzc3VlQ29tbWVudDY0MDgwMzA5Mw== DennisHeimbigner 905179 2020-06-08T18:41:16Z 2020-06-08T18:41:16Z NONE

You would lose your money :-) However, I can offer some info that might help. This message: OSError: [Errno -37] NetCDF: Write to read only is NC_EPERM. It is the signal for opendap that you attempted an operation that is illegal for DAP2. As an aside, it is a lousy message but I cannot find anything that is any more informative.

Anyway, it means that your code somehow called one of the following netcdf-c API functions:

nc_redef, nc__enddef, nc_create, nc_put_vara, nc_put_vars, nc_set_fill, nc_def_dim, nc_put_att, nc_def_var

Perhaps with this info, you can figure out which of those above operations you invoked. Perhaps you can set breakpoints in the python wrappers for these functions?

  "write to read-only" Error in xarray.open_mfdataset() with opendap datasets 621177286
632918356 https://github.com/pydata/xarray/pull/4047#issuecomment-632918356 https://api.github.com/repos/pydata/xarray/issues/4047 MDEyOklzc3VlQ29tbWVudDYzMjkxODM1Ng== DennisHeimbigner 905179 2020-05-22T21:37:38Z 2020-05-22T21:37:38Z NONE

I have a couple of questions about _ARRAY_DIMENSIONS. Let me make sure I understand how it is used. Suppose I am given an array X with shape (10,20,30) and an _ARRAY_DIMENSIONS attribute on X with the contents _ARRAY_DIMENSIONS=["time", "lon", "lat"]. Then this is equivalent to the following partial netcdf CDL: netcdf ... { dims: time=10; lon=20; lat=30; ...} Correct? I assume that if there are conflicts, where two variables end up assigning different sizes to the same named dimension, then that generates an error. Finally, it is unclear where xarray puts these dimensions: in the closest enclosing group, or in the root group? -Dennis Heimbigner, Unidata
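The interpretation asked about here can be sketched as combining each array's "shape" with its _ARRAY_DIMENSIONS names into one dimension table, raising an error when two arrays assign different sizes to the same name. The array names and sizes below are hypothetical:

```python
def collect_dims(arrays):
    """Build a {dimension name: size} table from (shape, dim_names) pairs,
    raising on conflicting sizes for the same named dimension."""
    dims = {}
    for name, (shape, dim_names) in arrays.items():
        for dim, size in zip(dim_names, shape):
            if dims.setdefault(dim, size) != size:
                raise ValueError(
                    f"conflicting sizes for dimension {dim!r}: "
                    f"{dims[dim]} vs {size} (array {name!r})")
    return dims

arrays = {
    "X": ((10, 20, 30), ["time", "lon", "lat"]),
    "Y": ((10, 20), ["time", "lon"]),  # consistent reuse of time/lon
}
assert collect_dims(arrays) == {"time": 10, "lon": 20, "lat": 30}

arrays["Z"] = ((15,), ["time"])  # conflicting size for "time"
try:
    collect_dims(arrays)
except ValueError:
    pass
else:
    raise AssertionError("expected a size conflict")
```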

  Document Xarray zarr encoding conventions 614814400
443918114 https://github.com/pydata/xarray/issues/2583#issuecomment-443918114 https://api.github.com/repos/pydata/xarray/issues/2583 MDEyOklzc3VlQ29tbWVudDQ0MzkxODExNA== DennisHeimbigner 905179 2018-12-04T00:04:59Z 2018-12-04T00:04:59Z NONE

Very possibly. The first thing to look at is what opendap is sending:

http://www.ncei.noaa.gov/thredds/dodsC/avhrr-patmos-x-cloudprops-noaa-asc-fc/files/2003/patmosx_v05r03_NOAA-17_asc_d20030101_c20140314.nc.das

  decode_cf not scaling and off-setting correctly 386268842
365479970 https://github.com/pydata/xarray/issues/1536#issuecomment-365479970 https://api.github.com/repos/pydata/xarray/issues/1536 MDEyOklzc3VlQ29tbWVudDM2NTQ3OTk3MA== DennisHeimbigner 905179 2018-02-14T02:55:22Z 2018-02-14T02:55:22Z NONE

The methods that need to be implemented are (in the C API) as follows:

int nc_def_var_filter(int ncid, int varid, unsigned int id, size_t nparams, const unsigned int* parms);

The only tricky part is passing a vector of unsigned integers (the parms argument).

int nc_inq_var_filter(int ncid, int varid, unsigned int* idp, size_t* nparamsp, unsigned int* params);

This requires passing values out via pointers. Also, this uses the standard netcdf-c trick in which the function is called twice: first with nparamsp defined but params set to NULL, which gives the caller the number of parameters; then a second time, after allocating the params vector, with that vector as the final argument.

  Better compression algorithms for NetCDF 253476466
365476120 https://github.com/pydata/xarray/issues/1536#issuecomment-365476120 https://api.github.com/repos/pydata/xarray/issues/1536 MDEyOklzc3VlQ29tbWVudDM2NTQ3NjEyMA== DennisHeimbigner 905179 2018-02-14T02:30:05Z 2018-02-14T02:30:05Z NONE

The API is not yet exposed thru anything but the C API, so the python, fortran, and c++ wrappers do not yet show it. Passing it thru netcdf-python is probably pretty trivial, though.

  Better compression algorithms for NetCDF 253476466
365475898 https://github.com/pydata/xarray/issues/1536#issuecomment-365475898 https://api.github.com/repos/pydata/xarray/issues/1536 MDEyOklzc3VlQ29tbWVudDM2NTQ3NTg5OA== DennisHeimbigner 905179 2018-02-14T02:28:42Z 2018-02-14T02:28:42Z NONE

A bit confusing, but I think the answer is yes. For example we provide a bzip2 compression plugin as an example (see examples/C/hdf5plugins in the netcdf-c distribution).

  Better compression algorithms for NetCDF 253476466
365419155 https://github.com/pydata/xarray/issues/1536#issuecomment-365419155 https://api.github.com/repos/pydata/xarray/issues/1536 MDEyOklzc3VlQ29tbWVudDM2NTQxOTE1NQ== DennisHeimbigner 905179 2018-02-13T21:59:35Z 2018-02-13T21:59:35Z NONE

You may already know, but should note that the filter stuff in netcdf-c is now available in netcdf-c library version 4.6.0. So any filter plugin usable with hdf5 can now be used both for reading and writing thru the netcdf-c api.

  Better compression algorithms for NetCDF 253476466
325775498 https://github.com/pydata/xarray/issues/1536#issuecomment-325775498 https://api.github.com/repos/pydata/xarray/issues/1536 MDEyOklzc3VlQ29tbWVudDMyNTc3NTQ5OA== DennisHeimbigner 905179 2017-08-29T19:35:55Z 2017-08-29T19:35:55Z NONE

The github branch filters.dmh for the netcdf-c library now exposes the HDF5 dynamic filter capability. This is documented here: https://github.com/Unidata/netcdf-c/blob/filters.dmh/docs/filters.md I welcome suggestions for improvements.

I also note that I am extending this branch to now handle szip compression. It turns out there is now a patent-free implementation called libaec (HT Rich Signell) so there is no reason not to make it available.

  Better compression algorithms for NetCDF 253476466

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);