
issue_comments: 339162189


html_url: https://github.com/pydata/xarray/issues/1638#issuecomment-339162189
issue_url: https://api.github.com/repos/pydata/xarray/issues/1638
id: 339162189
node_id: MDEyOklzc3VlQ29tbWVudDMzOTE2MjE4OQ==
user: 806256
created_at: 2017-10-24T23:02:34Z
updated_at: 2017-10-24T23:03:18Z
author_association: NONE
body, reactions, performed_via_github_app (empty), issue: shown below, in that order

Thank you for looking into this! I used the default engine to save, which looks like it was netcdf4. I ran pip install h5netcdf and saved again with that engine. Saving took longer, ~2 min instead of seconds. Loading still took 110 ms, and all the features are objects again! However, the coordinates --> variables conversion is still happening.

<xarray.Dataset>
Dimensions:  (cell: 53760, gene: 23438)
Coordinates:
  * cell     (cell) object 'A17-B000126-3_39_F-1-1' ...
  * gene     (gene) object '0610005C13Rik' ...
Data variables:
    Columns sorted                (cell) float64 nan nan nan nan nan nan nan ...
    Comments                      (cell) object 'nan' 'nan' 'nan' 'nan' ...
    Double check                  (cell) float64 nan nan nan nan nan nan nan ...
    EXP_ID                        (cell) object '170925_A00111_0066_AH3TKNDMXX' ...
    Experiment ID                 (cell) object 'exp22' 'exp22' 'exp22' ...
    FACS.instument                (cell) object 'Sony SIM1' 'Sony SIM1' ...
    FACS.selection                (cell) object 'Multiple' 'Multiple' ...
    Location                      (cell) object 'MACA20_3' 'MACA20_3' ...
    Lysis Plate Batch             (cell) object '20' '20' '20' '20' '20' ...
    Number of input reads         (cell) int64 1229254 730274 1075370 ...
    Plate                         (cell) object '1' '1' '1' '1' '1' '1' '1' ...
    TAXON                         (cell) object 'mus' 'mus' 'mus' 'mus' ...
    Uniquely mapped reads number  (cell) int64 1017682 634557 941828 1392029 ...
    WELL_MAPPING                  (cell) object 'B000126' 'B000126' ...
    counts                        (cell, gene) int64 0 0 0 0 442 0 0 0 0 0 0 ...
    dNTP.batch                    (cell) object '457912' '457912' '457912' ...
    date.prepared                 (cell) object '07-06-17' '07-06-17' ...
    date.sorted                   (cell) object '170707' '170707' '170707' ...
    log10                         (cell, gene) float64 0.0 0.0 0.0 0.0 2.646 ...
    log2                          (cell, gene) float64 0.0 0.0 0.0 0.0 8.791 ...
    mouse.age                     (cell) object '3' '3' '3' '3' '3' '3' '3' ...
    mouse.id                      (cell) object '3_39_F' '3_39_F' '3_39_F' ...
    mouse.number                  (cell) object '39' '39' '39' '39' '39' ...
    mouse.sex                     (cell) object 'F' 'F' 'F' 'F' 'F' 'F' 'F' ...
    nozzle.size                   (cell) object '100' '100' '100' '100' ...
    oligodT.order.no              (cell) object '6/23/17 12757296' ...
    plate.type                    (cell) object 'Biorad HSP3901' ...
    preparation.site              (cell) object 'Biohub' 'Biohub' 'Biohub' ...
    subtissue                     (cell) object 'nan' 'nan' 'nan' 'nan' ...
    tissue                        (cell) object 'Skin' 'Skin' 'Skin' 'Skin' ...

Not sure if it matters, but one detail is that I created ~250 individual datasets (each sized at ~300 samples x 20,000 features) and then used xr.concat(datasets, dim='cell') to concatenate them because I couldn't read them all into memory at once.
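
For reference, the concatenation step described above can be sketched on toy data. This is only an illustration, not the actual pipeline: the helper make_plate, the plate sizes, the gene names, and the choice of tissue as the metadata field are all made up. It also shows one possible way to turn a per-cell metadata variable back into a coordinate with Dataset.set_coords; whether that is the right workaround for the coordinates --> variables behavior seen here is an assumption.

```python
import numpy as np
import xarray as xr


def make_plate(plate_index, n_cells=3, genes=("geneA", "geneB")):
    """Build one small, hypothetical per-plate dataset (stand-in for a real plate)."""
    cells = [f"plate{plate_index}-cell{i}" for i in range(n_cells)]
    # Fill counts with the plate index so each plate is easy to tell apart.
    counts = np.full((n_cells, len(genes)), plate_index, dtype="int64")
    return xr.Dataset(
        {
            "counts": (("cell", "gene"), counts),
            # Per-cell metadata stored as a data variable, as in the repr above.
            "tissue": ("cell", ["Skin"] * n_cells),
        },
        coords={"cell": cells, "gene": list(genes)},
    )


# Build a handful of plates and stack them along the 'cell' dimension.
datasets = [make_plate(i) for i in range(4)]
combined = xr.concat(datasets, dim="cell")

# Promote the per-cell metadata from a data variable to a (non-index) coordinate.
restored = combined.set_coords("tissue")

print(combined.sizes)
print(list(restored.coords))
```

After set_coords, tissue is listed under Coordinates rather than Data variables, while counts stays a data variable.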

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  266320445