issue_comments: 303606683

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/pull/1421#issuecomment-303606683 https://api.github.com/repos/pydata/xarray/issues/1421 303606683 MDEyOklzc3VlQ29tbWVudDMwMzYwNjY4Mw== 1217238 2017-05-24T03:18:16Z 2017-05-24T03:18:16Z MEMBER

How about something like the following:

In encode_cf_variable, create a new variable with pickle-encoded data (if appropriate). This looks something like:

```python
def encode_cf_variable(var, allow_pickle=False):
    ...
    if var.dtype == object:
        if allow_pickle:
            var = maybe_encode_pickle(var)
        else:
            raise TypeError
    return var


def maybe_encode_pickle(var):
    if var.dtype == object:
        attrs = var.attrs.copy()
        # Mark the variable so that decoding knows to unpickle it.
        safe_setitem(attrs, '_FileFormat', 'python-pickle')
        protocol = var.encoding.pop('pickle_protocol', 2)
        data = utils.encode_pickle(var.values, protocol=protocol)
        var = Variable(var.dims, data, attrs, var.encoding)
    return var
```

This reuses the encoding parameter for setting the pickle protocol version, which is already what we use for similar variable-specific encoding details.
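The utils.encode_pickle helper is not spelled out above. A minimal sketch of what it might do, assuming each element is pickled separately and stored as a 1-D int8 array inside an object array (the name, signature, and layout here are assumptions for illustration, not existing xarray code):

```python
import pickle

import numpy as np


def encode_pickle(values, protocol=2):
    """Pickle every element of an object array into its own int8 array."""
    values = np.asarray(values, dtype=object)
    encoded = np.empty(values.shape, dtype=object)
    for index in np.ndindex(values.shape):
        payload = pickle.dumps(values[index], protocol=protocol)
        # Reinterpret the pickle bytes as signed 8-bit integers (no copy).
        encoded[index] = np.frombuffer(payload, dtype=np.int8)
    return encoded
```

Each element ends up with a different length, which is why the backends need a variable-length int8 dtype to write the result.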

In the netCDF backends, add a check for variables with dtype == object and a _FileFormat attribute. If this is the case, call a create_vlen_int8_dtype method to create an appropriate dtype using backend-specific methods (the base class implementation should raise an error), and proceed with writing the data in the usual way.
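The backend hook could be sketched roughly as follows. The class names are hypothetical stand-ins rather than xarray's actual store classes, and the h5py subclass assumes h5py's special_dtype(vlen=...) is the right way to build the dtype for that backend:

```python
class AbstractWritableStore:
    """Hypothetical stand-in for a backend base class."""

    def create_vlen_int8_dtype(self):
        # Backends without variable-length support fall through to this error.
        raise NotImplementedError(
            "%s cannot store variable-length int8 data" % type(self).__name__
        )


class H5pyStore(AbstractWritableStore):
    """Hypothetical h5py-backed store."""

    def create_vlen_int8_dtype(self):
        # Imported lazily so the base class stays usable without h5py.
        import h5py
        import numpy as np

        # special_dtype(vlen=...) is h5py's API for ragged (vlen) data.
        return h5py.special_dtype(vlen=np.dtype("int8"))
```

A netCDF4-python backend would presumably go through Dataset.createVLType instead, which needs a type name as well as the base dtype.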

For decoding, reverse the process. Convert custom vlen dtypes to dtype=object in the appropriate backend-specific array wrapper type, but don't decode the data. If allow_pickle=True and var.attrs['_FileFormat'] == 'python-pickle', then decode_cf_variable should do the unpickling, moving _FileFormat from attrs to encoding.
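A rough sketch of the decoding side, working on plain data/attrs/encoding values rather than a real Variable. The function names are assumptions, and the input is assumed to be an object array whose elements are int8 arrays of pickle bytes:

```python
import pickle

import numpy as np


def decode_pickle(encoded):
    """Unpickle every int8 payload in an object array."""
    decoded = np.empty(encoded.shape, dtype=object)
    for index in np.ndindex(encoded.shape):
        payload = np.asarray(encoded[index], dtype=np.int8).tobytes()
        decoded[index] = pickle.loads(payload)
    return decoded


def maybe_decode_pickle(data, attrs, encoding, allow_pickle=False):
    """Return (data, attrs, encoding), unpickling only when opted in."""
    if not allow_pickle or attrs.get("_FileFormat") != "python-pickle":
        # Leave the raw vlen data untouched.
        return data, attrs, encoding
    attrs = dict(attrs)
    encoding = dict(encoding)
    # Move the marker from attrs to encoding so a later write round-trips.
    encoding["_FileFormat"] = attrs.pop("_FileFormat")
    return decode_pickle(data), attrs, encoding
```

Keeping unpickling behind allow_pickle matters because pickle.loads can execute arbitrary code when given untrusted input.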

For bonus points, generalize handling of vlen types with encoding. Something like encoding={'dtype': {'vlen': np.int8}} could indicate that a special vlen with np.int8 data should be created to encode this variable's data. (This is inspired by h5py's API for special types.)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  230566456