issues: 266320445

id: 266320445
node_id: MDU6SXNzdWUyNjYzMjA0NDU=
number: 1638
title: Unicode strings unexpectedly transformed to byte strings upon `open_dataset`
user: 806256
state: closed
locked: 0
comments: 7
created_at: 2017-10-18T00:16:38Z
updated_at: 2017-11-01T19:51:55Z
closed_at: 2017-10-27T23:36:07Z
author_association: NONE
assignee, milestone, active_lock_reason, draft, pull_request: (empty)

When I first create the dataset, all the metadata is stored as unicode strings (yay!):

    <xarray.Dataset>
    Dimensions: (cell: 53760, gene: 23438)
    Coordinates:
      * gene (gene) object '0610005C13Rik' ...
        Uniquely mapped reads number (cell) int64 1017682 634557 941828 1392029 ...
        Number of input reads (cell) int64 1229254 730274 1075370 ...
        EXP_ID (cell) <U29 '170925_A00111_0066_AH3TKNDMXX' ...
        TAXON (cell) <U3 'mus' 'mus' 'mus' 'mus' 'mus' ...
        WELL_MAPPING (cell) <U9 'B000126' 'B000126' 'B000126' ...
        Lysis Plate Batch (cell) <U32 '20' '20' '20' '20' '20' '20' ...
        dNTP.batch (cell) <U38 '457912' '457912' '457912' ...
        oligodT.order.no (cell) <U32 '6/23/17 12757296' ...
        plate.type (cell) <U32 'Biorad HSP3901' ...
        preparation.site (cell) <U32 'Biohub' 'Biohub' 'Biohub' ...
        date.prepared (cell) <U32 '07-06-17' '07-06-17' ...
        date.sorted (cell) <U6 '170707' '170707' '170707' ...
        tissue (cell) <U13 'Skin' 'Skin' 'Skin' 'Skin' ...
        subtissue (cell) <U32 'nan' 'nan' 'nan' 'nan' 'nan' ...
        mouse.id (cell) <U13 '3_39_F' '3_39_F' '3_39_F' ...
        FACS.selection (cell) <U52 'Multiple' 'Multiple' ...
        nozzle.size (cell) <U32 '100' '100' '100' '100' '100' ...
        FACS.instument (cell) <U32 'Sony SIM1' 'Sony SIM1' ...
        Experiment ID (cell) <U32 'exp22' 'exp22' 'exp22' ...
        Columns sorted (cell) float64 nan nan nan nan nan nan nan ...
        Double check (cell) float64 nan nan nan nan nan nan nan ...
        Plate (cell) <U32 '1' '1' '1' '1' '1' '1' '1' ...
        Location (cell) <U32 'MACA20_3' 'MACA20_3' ...
        Comments (cell) <U32 'nan' 'nan' 'nan' 'nan' 'nan' ...
        mouse.age (cell) <U1 '3' '3' '3' '3' '3' '3' '3' '3' ...
        mouse.number (cell) <U32 '39' '39' '39' '39' '39' '39' ...
        mouse.sex (cell) <U1 'F' 'F' 'F' 'F' 'F' 'F' 'F' 'F' ...
      * cell (cell) object 'A17-B000126-3_39_F-1-1' ...
    Data variables:
        counts (cell, gene) int64 0 0 0 0 442 0 0 0 0 0 0 ...
        log2 (cell, gene) float64 0.0 0.0 0.0 0.0 8.791 ...
        log10 (cell, gene) float64 0.0 0.0 0.0 0.0 2.646 ...

But when I save it with to_netcdf using the default arguments and then reopen the same file with xr.open_dataset, also using the default arguments, every string is converted to a byte string:

    <xarray.Dataset>
    Dimensions: (cell: 53760, gene: 23438)
    Coordinates:
      * cell (cell) |S24 b'A17-B000126-3_39_F-1-1' ...
      * gene (gene) |S22 b'0610005C13Rik' ...
    Data variables:
        counts (cell, gene) int32 0 0 0 0 442 0 0 0 0 0 0 ...
        log2 (cell, gene) float64 0.0 0.0 0.0 0.0 8.791 ...
        log10 (cell, gene) float64 0.0 0.0 0.0 0.0 2.646 ...
        FACS.selection (cell) |S52 b'Multiple' b'Multiple' ...
        dNTP.batch (cell) |S38 b'457912' b'457912' b'457912' ...
        EXP_ID (cell) |S29 b'170925_A00111_0066_AH3TKNDMXX' ...
        subtissue (cell) |S19 b'nan' b'nan' b'nan' b'nan' ...
        oligodT.order.no (cell) |S17 b'6/23/17 12757296' ...
        plate.type (cell) |S14 b'Biorad HSP3901' ...
        tissue (cell) |S13 b'Skin' b'Skin' b'Skin' ...
        mouse.id (cell) |S13 b'3_39_F' b'3_39_F' b'3_39_F' ...
        FACS.instument (cell) |S13 b'Sony SIM1' b'Sony SIM1' ...
        Comments (cell) |S11 b'nan' b'nan' b'nan' b'nan' ...
        WELL_MAPPING (cell) |S9 b'B000126' b'B000126' ...
        date.prepared (cell) |S9 b'07-06-17' b'07-06-17' ...
        Location (cell) |S9 b'MACA20_3' b'MACA20_3' ...
        preparation.site (cell) |S8 b'Biohub' b'Biohub' b'Biohub' ...
        date.sorted (cell) |S6 b'170707' b'170707' b'170707' ...
        Experiment ID (cell) |S6 b'exp22' b'exp22' b'exp22' ...
        TAXON (cell) |S3 b'mus' b'mus' b'mus' b'mus' ...
        Lysis Plate Batch (cell) |S3 b'20' b'20' b'20' b'20' b'20' ...
        nozzle.size (cell) |S3 b'100' b'100' b'100' b'100' ...
        Plate (cell) |S3 b'1' b'1' b'1' b'1' b'1' b'1' ...
        mouse.number (cell) |S3 b'39' b'39' b'39' b'39' b'39' ...
        Uniquely mapped reads number (cell) int32 1017682 634557 941828 1392029 ...
        Number of input reads (cell) int32 1229254 730274 1075370 ...
        Columns sorted (cell) float64 nan nan nan nan nan nan nan ...
        Double check (cell) float64 nan nan nan nan nan nan nan ...
        mouse.age (cell) |S1 b'3' b'3' b'3' b'3' b'3' b'3' ...
        mouse.sex (cell) |S1 b'F' b'F' b'F' b'F' b'F' b'F' ...
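The dtype codes in the two listings differ in kind: <U... is a NumPy unicode array and |S... is a NumPy byte-string array. The following is a minimal sketch of that difference, assuming NumPy is installed. It does not reproduce the xarray round trip itself; the values are copied from the TAXON column and the variable names are illustrative only.

```python
import numpy as np

# Unicode array, like the TAXON column before saving (dtype kind 'U').
taxon_unicode = np.array(["mus", "mus", "mus"])

# Byte-string array, like the TAXON column after reopening (dtype kind 'S').
# astype("S") is used here only to imitate the state seen in the report.
taxon_bytes = taxon_unicode.astype("S")

# Decoding element-wise turns the byte strings back into unicode strings.
# UTF-8 is an assumption about how the strings were encoded on disk.
taxon_decoded = np.char.decode(taxon_bytes, "utf-8")

print(taxon_unicode.dtype.kind, taxon_bytes.dtype.kind, taxon_decoded.dtype.kind)
```

Checking dtype.kind rather than the full dtype string avoids depending on the byte-order prefix, which can vary between platforms.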

As a result, operations I expect to work, such as selecting on gene with ds.sel(gene="Ins1"), fail unless I pass a byte string instead: ds.sel(gene=b"Ins1") works just fine.
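The selection failure is consistent with how Python 3 compares str and bytes: the two types never compare equal, so a unicode label cannot match a byte-string label. Below is a small sketch using only the standard library. The plain dict is a hypothetical stand-in for the gene index, not xarray's actual lookup code, and decoding as UTF-8 is an assumption.

```python
# Labels as they appear after reopening the file: byte strings.
gene_labels = [b"0610005C13Rik", b"Ins1"]

# Hypothetical stand-in for a label-to-position index.
index = {label: pos for pos, label in enumerate(gene_labels)}

# A unicode key does not match a byte-string label, even for ASCII text.
print("Ins1" in index)
print(b"Ins1" in index)

# Possible workaround: decode the labels once so unicode lookups work again.
decoded_index = {label.decode("utf-8"): pos for label, pos in index.items()}
print("Ins1" in decoded_index)
```

Decoding the labels once after loading keeps the rest of the analysis code free of b"..." literals.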

Do you know why this may be happening?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1638/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
repo: 13221727
type: issue
