id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
270430189,MDU6SXNzdWUyNzA0MzAxODk=,1680,Coordinates get transformed to variables,806256,closed,0,,,7,2017-11-01T19:51:34Z,2017-11-08T21:37:04Z,2017-11-08T21:37:04Z,NONE,,,,"#### Code Sample, a copy-pastable example if possible

To get the data to reproduce the example: `aws s3 sync s3://olgabot-maca/xarray-coordinates-to-variables/ .` or download [this `.tar.gz` file](https://s3-us-west-2.amazonaws.com/olgabot-maca/xarray-coordinates-to-variables/olga_xarray-dev_subset_scrubbed.tar.gz).

[gist with code](https://gist.github.com/76892304d6285d5450f1d6014928fbe8)

#### Problem description

When I create a dataset, I set a bunch of metadata as `coordinates`, not `variables`:

![screen shot 2017-11-01 at 12 45 39 pm](https://user-images.githubusercontent.com/806256/32294155-9751934c-bf02-11e7-8c26-eeaec7dcfa5b.png)

But when I reload that exact same dataset with `xr.open_dataset`, the `coordinates` have been moved into `variables`!!

![screen shot 2017-11-01 at 12 45 54 pm](https://user-images.githubusercontent.com/806256/32294213-bba33ad4-bf02-11e7-9c3b-e147c8998847.png)

Interestingly, I think this is happening upon dataset creation because `ds.variables` already has the erroneous metadata:

![screen shot 2017-11-01 at 12 47 20 pm](https://user-images.githubusercontent.com/806256/32294247-d3b4b7c4-bf02-11e7-8b88-822d0513e33e.png)

#### Expected Output

I expected the coordinates and variables to be consistent between opening and closing the file

#### Output of ``xr.show_versions()``
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-97-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
xarray: 0.10.0rc1-2-gf83361c
pandas: 0.20.3
numpy: 1.13.3
scipy: 0.19.1
netCDF4: None
h5netcdf: None
Nio: None
bottleneck: None
cyordereddict: None
dask: None
matplotlib: None
cartopy: None
seaborn: None
setuptools: 36.5.0.post20170921
pip: 9.0.1
conda: None
pytest: None
IPython: 6.1.0
sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1680/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
266320445,MDU6SXNzdWUyNjYzMjA0NDU=,1638,Unicode strings unexpectedly transformed to byte strings upon `open_dataset`,806256,closed,0,,,7,2017-10-18T00:16:38Z,2017-11-01T19:51:55Z,2017-10-27T23:36:07Z,NONE,,,,"When I first create the dataset, all the metadata is stored as unicode strings (yay!):

```
Dimensions: (cell: 53760, gene: 23438)
Coordinates:
  * gene (gene) object '0610005C13Rik' ...
    Uniquely mapped reads number (cell) int64 1017682 634557 941828 1392029 ...
    Number of input reads (cell) int64 1229254 730274 1075370 ...
    EXP_ID (cell)
Dimensions: (cell: 53760, gene: 23438)
Coordinates:
  * cell (cell) |S24 b'A17-B000126-3_39_F-1-1' ...
  * gene (gene) |S22 b'0610005C13Rik' ...
Data variables:
    counts (cell, gene) int32 0 0 0 0 442 0 0 0 0 0 0 ...
    log2 (cell, gene) float64 0.0 0.0 0.0 0.0 8.791 ...
    log10 (cell, gene) float64 0.0 0.0 0.0 0.0 2.646 ...
    FACS.selection (cell) |S52 b'Multiple' b'Multiple' ...
    dNTP.batch (cell) |S38 b'457912' b'457912' b'457912' ...
    EXP_ID (cell) |S29 b'170925_A00111_0066_AH3TKNDMXX' ...
    subtissue (cell) |S19 b'nan' b'nan' b'nan' b'nan' ...
    oligodT.order.no (cell) |S17 b'6/23/17 12757296' ...
    plate.type (cell) |S14 b'Biorad HSP3901' ...
    tissue (cell) |S13 b'Skin' b'Skin' b'Skin' ...
    mouse.id (cell) |S13 b'3_39_F' b'3_39_F' b'3_39_F' ...
    FACS.instument (cell) |S13 b'Sony SIM1' b'Sony SIM1' ...
    Comments (cell) |S11 b'nan' b'nan' b'nan' b'nan' ...
    WELL_MAPPING (cell) |S9 b'B000126' b'B000126' ...
    date.prepared (cell) |S9 b'07-06-17' b'07-06-17' ...
    Location (cell) |S9 b'MACA20_3' b'MACA20_3' ...
    preparation.site (cell) |S8 b'Biohub' b'Biohub' b'Biohub' ...
    date.sorted (cell) |S6 b'170707' b'170707' b'170707' ...
    Experiment ID (cell) |S6 b'exp22' b'exp22' b'exp22' ...
    TAXON (cell) |S3 b'mus' b'mus' b'mus' b'mus' ...
    Lysis Plate Batch (cell) |S3 b'20' b'20' b'20' b'20' b'20' ...
    nozzle.size (cell) |S3 b'100' b'100' b'100' b'100' ...
    Plate (cell) |S3 b'1' b'1' b'1' b'1' b'1' b'1' ...
    mouse.number (cell) |S3 b'39' b'39' b'39' b'39' b'39' ...
    Uniquely mapped reads number (cell) int32 1017682 634557 941828 1392029 ...
    Number of input reads (cell) int32 1229254 730274 1075370 ...
    Columns sorted (cell) float64 nan nan nan nan nan nan nan ...
    Double check (cell) float64 nan nan nan nan nan nan nan ...
    mouse.age (cell) |S1 b'3' b'3' b'3' b'3' b'3' b'3' ...
    mouse.sex (cell) |S1 b'F' b'F' b'F' b'F' b'F' b'F' ...
```

So things I expect to work, like selecting on gene with `ds.sel(gene=""Ins1"")`, fail unless the key is a byte string: `ds.sel(gene=b""Ins1"")` works just fine. Do you know why this may be happening?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1638/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue