
issues

2 rows where user = 806256 sorted by updated_at descending

id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
270430189 MDU6SXNzdWUyNzA0MzAxODk= 1680 Coordinates get transformed to variables olgabot 806256 closed 0     7 2017-11-01T19:51:34Z 2017-11-08T21:37:04Z 2017-11-08T21:37:04Z NONE      

Code Sample, a copy-pastable example if possible

To get the data to reproduce the example: aws s3 sync s3://olgabot-maca/xarray-coordinates-to-variables/ . or download this .tar.gz file.

gist with code

Problem description

When I create a dataset, I set a bunch of metadata as coordinates, not variables:

But when I reload that exact same dataset with xr.open_dataset, the coordinates have been moved into variables!!

Interestingly, I think this is happening upon dataset creation because ds.variables already has the erroneous metadata:

Expected Output

I expected the coordinates and variables to be the same after saving the file and opening it again.
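
The distinction between coordinates and data variables described above can be inspected directly. The sketch below uses a small made-up dataset (the names and values are illustrative, not the data from this issue). Note that ds.variables lists coordinates and data variables together, so a name appearing there does not by itself show that it was demoted; ds.coords and ds.data_vars are the views that separate the two. Using set_coords to promote reloaded metadata back to coordinates is an assumption about a possible workaround, not something confirmed in this issue.

```python
import numpy as np
import xarray as xr

# Toy dataset (hypothetical values): one data variable, one index
# coordinate ("cell") and one non-index metadata coordinate ("tissue").
ds = xr.Dataset(
    data_vars={"counts": (("cell",), np.array([5, 0, 7]))},
    coords={
        "cell": ["c1", "c2", "c3"],
        "tissue": ("cell", ["Skin", "Skin", "Lung"]),
    },
)

coord_names = set(ds.coords)        # coordinates only
data_var_names = set(ds.data_vars)  # data variables only
all_names = set(ds.variables)       # both kinds together

# Simulate metadata that ended up as a data variable, then promote it back.
demoted = ds.reset_coords("tissue")       # "tissue" becomes a data variable
restored = demoted.set_coords("tissue")   # "tissue" is a coordinate again

print("coords:", sorted(coord_names))
print("data_vars:", sorted(data_var_names))
print("after reset_coords:", sorted(demoted.data_vars))
print("after set_coords:", sorted(restored.coords))
```

Comparing set(ds.coords) before saving with the same set after xr.open_dataset is a quick way to see exactly which names changed category.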

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-97-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

xarray: 0.10.0rc1-2-gf83361c
pandas: 0.20.3
numpy: 1.13.3
scipy: 0.19.1
netCDF4: None
h5netcdf: None
Nio: None
bottleneck: None
cyordereddict: None
dask: None
matplotlib: None
cartopy: None
seaborn: None
setuptools: 36.5.0.post20170921
pip: 9.0.1
conda: None
pytest: None
IPython: 6.1.0
sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1680/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
266320445 MDU6SXNzdWUyNjYzMjA0NDU= 1638 Unicode strings unexpectedly transformed to byte strings upon `open_dataset` olgabot 806256 closed 0     7 2017-10-18T00:16:38Z 2017-11-01T19:51:55Z 2017-10-27T23:36:07Z NONE      

When I first create the dataset, all the metadata is stored as unicode strings (yay!):

<xarray.Dataset>
Dimensions: (cell: 53760, gene: 23438)
Coordinates:
  * gene (gene) object '0610005C13Rik' ...
    Uniquely mapped reads number (cell) int64 1017682 634557 941828 1392029 ...
    Number of input reads (cell) int64 1229254 730274 1075370 ...
    EXP_ID (cell) <U29 '170925_A00111_0066_AH3TKNDMXX' ...
    TAXON (cell) <U3 'mus' 'mus' 'mus' 'mus' 'mus' ...
    WELL_MAPPING (cell) <U9 'B000126' 'B000126' 'B000126' ...
    Lysis Plate Batch (cell) <U32 '20' '20' '20' '20' '20' '20' ...
    dNTP.batch (cell) <U38 '457912' '457912' '457912' ...
    oligodT.order.no (cell) <U32 '6/23/17 12757296' ...
    plate.type (cell) <U32 'Biorad HSP3901' ...
    preparation.site (cell) <U32 'Biohub' 'Biohub' 'Biohub' ...
    date.prepared (cell) <U32 '07-06-17' '07-06-17' ...
    date.sorted (cell) <U6 '170707' '170707' '170707' ...
    tissue (cell) <U13 'Skin' 'Skin' 'Skin' 'Skin' ...
    subtissue (cell) <U32 'nan' 'nan' 'nan' 'nan' 'nan' ...
    mouse.id (cell) <U13 '3_39_F' '3_39_F' '3_39_F' ...
    FACS.selection (cell) <U52 'Multiple' 'Multiple' ...
    nozzle.size (cell) <U32 '100' '100' '100' '100' '100' ...
    FACS.instument (cell) <U32 'Sony SIM1' 'Sony SIM1' ...
    Experiment ID (cell) <U32 'exp22' 'exp22' 'exp22' ...
    Columns sorted (cell) float64 nan nan nan nan nan nan nan ...
    Double check (cell) float64 nan nan nan nan nan nan nan ...
    Plate (cell) <U32 '1' '1' '1' '1' '1' '1' '1' ...
    Location (cell) <U32 'MACA20_3' 'MACA20_3' ...
    Comments (cell) <U32 'nan' 'nan' 'nan' 'nan' 'nan' ...
    mouse.age (cell) <U1 '3' '3' '3' '3' '3' '3' '3' '3' ...
    mouse.number (cell) <U32 '39' '39' '39' '39' '39' '39' ...
    mouse.sex (cell) <U1 'F' 'F' 'F' 'F' 'F' 'F' 'F' 'F' ...
  * cell (cell) object 'A17-B000126-3_39_F-1-1' ...
Data variables:
    counts (cell, gene) int64 0 0 0 0 442 0 0 0 0 0 0 ...
    log2 (cell, gene) float64 0.0 0.0 0.0 0.0 8.791 ...
    log10 (cell, gene) float64 0.0 0.0 0.0 0.0 2.646 ...

But when I save with to_netcdf using the default arguments and then call xr.open_dataset on the same file, also with default arguments, all of them are converted to byte strings:

<xarray.Dataset>
Dimensions: (cell: 53760, gene: 23438)
Coordinates:
  * cell (cell) |S24 b'A17-B000126-3_39_F-1-1' ...
  * gene (gene) |S22 b'0610005C13Rik' ...
Data variables:
    counts (cell, gene) int32 0 0 0 0 442 0 0 0 0 0 0 ...
    log2 (cell, gene) float64 0.0 0.0 0.0 0.0 8.791 ...
    log10 (cell, gene) float64 0.0 0.0 0.0 0.0 2.646 ...
    FACS.selection (cell) |S52 b'Multiple' b'Multiple' ...
    dNTP.batch (cell) |S38 b'457912' b'457912' b'457912' ...
    EXP_ID (cell) |S29 b'170925_A00111_0066_AH3TKNDMXX' ...
    subtissue (cell) |S19 b'nan' b'nan' b'nan' b'nan' ...
    oligodT.order.no (cell) |S17 b'6/23/17 12757296' ...
    plate.type (cell) |S14 b'Biorad HSP3901' ...
    tissue (cell) |S13 b'Skin' b'Skin' b'Skin' ...
    mouse.id (cell) |S13 b'3_39_F' b'3_39_F' b'3_39_F' ...
    FACS.instument (cell) |S13 b'Sony SIM1' b'Sony SIM1' ...
    Comments (cell) |S11 b'nan' b'nan' b'nan' b'nan' ...
    WELL_MAPPING (cell) |S9 b'B000126' b'B000126' ...
    date.prepared (cell) |S9 b'07-06-17' b'07-06-17' ...
    Location (cell) |S9 b'MACA20_3' b'MACA20_3' ...
    preparation.site (cell) |S8 b'Biohub' b'Biohub' b'Biohub' ...
    date.sorted (cell) |S6 b'170707' b'170707' b'170707' ...
    Experiment ID (cell) |S6 b'exp22' b'exp22' b'exp22' ...
    TAXON (cell) |S3 b'mus' b'mus' b'mus' b'mus' ...
    Lysis Plate Batch (cell) |S3 b'20' b'20' b'20' b'20' b'20' ...
    nozzle.size (cell) |S3 b'100' b'100' b'100' b'100' ...
    Plate (cell) |S3 b'1' b'1' b'1' b'1' b'1' b'1' ...
    mouse.number (cell) |S3 b'39' b'39' b'39' b'39' b'39' ...
    Uniquely mapped reads number (cell) int32 1017682 634557 941828 1392029 ...
    Number of input reads (cell) int32 1229254 730274 1075370 ...
    Columns sorted (cell) float64 nan nan nan nan nan nan nan ...
    Double check (cell) float64 nan nan nan nan nan nan nan ...
    mouse.age (cell) |S1 b'3' b'3' b'3' b'3' b'3' b'3' ...
    mouse.sex (cell) |S1 b'F' b'F' b'F' b'F' b'F' b'F' ...

So things I expect to work, like selecting on gene with ds.sel(gene="Ins1"), fail unless I pass byte strings: ds.sel(gene=b"Ins1") works just fine.

Do you know why this may be happening?
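
One possible way to work around the byte-string labels after loading is to decode them back to unicode before selecting. The sketch below does this on a tiny made-up dataset whose gene labels are already bytes (the values are illustrative, not the real data). It assumes the labels are UTF-8 encoded, and it only treats the symptom; it says nothing about why the round trip produces byte strings in the first place.

```python
import numpy as np
import xarray as xr

# Toy stand-in for a reloaded dataset: the "gene" labels are byte strings.
ds = xr.Dataset(
    data_vars={"counts": (("cell", "gene"), np.array([[0, 442], [3, 0]]))},
    coords={
        "cell": ["c1", "c2"],
        "gene": np.array([b"Ins1", b"Actb"]),
    },
)

# Decode the byte labels to unicode. astype("S") guards against the labels
# being held in an object-dtype array; UTF-8 is an assumption.
decoded = np.char.decode(ds["gene"].values.astype("S"), "utf-8")

# Replace the index coordinate with the decoded labels.
fixed = ds.assign_coords(gene=("gene", decoded))

# Selection with an ordinary str label now works.
ins1_counts = fixed.sel(gene="Ins1")["counts"].values
print(fixed["gene"].values)
print(ins1_counts)
```

The same decoding step could be applied to each metadata variable that came back with a bytes dtype, for example by looping over the names whose dtype.kind is "S".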

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1638/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);