issue_comments: 300545110

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/1391#issuecomment-300545110 https://api.github.com/repos/pydata/xarray/issues/1391 300545110 MDEyOklzc3VlQ29tbWVudDMwMDU0NTExMA== 6980561 2017-05-10T16:53:25Z 2017-05-10T16:53:25Z NONE

@darothen That sounds great!

I think we should be clearer. The issue that @NicWayand and I are highlighting is coercing observational data, which often come with some fairly heinous formatting issues, into an xarray format. Stacking these data along a new dimension is usually the last step in this process, and one that can be frustrating. An example of this in practice can be found in this notebook (please be forgiving, it is one of the first things I ever wrote in Python).

https://github.com/klapo/CalRad/blob/master/CR.SurfObs.DataIngest.xray.ipynb

The data flow looks like this:

- read the CSV summarizing each station
- read data from one set of stations using pandas
- clean the data
- assign the data in a pandas DataFrame to a dictionary of DataFrames
- rinse and repeat for the other set of data
- concat the dictionary of DataFrames into a single DataFrame
- convert to an xarray Dataset

This example is a little ludicrous because I didn't know what I was doing, but I think that's the point. There is a lot of ambiguity about which tools to use at what point. Concatenating a dictionary of DataFrames into a single DataFrame and then converting to a Dataset was the only solution I could get to work, after a lot of trial and error, for putting these data in an xarray Dataset.
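For anyone following along, the last two steps can be sketched roughly as below. This is a minimal illustration and not the code from the notebook: the station names, variable names (`air_temp`, `rel_hum`), and values are all made up, and it assumes every station's DataFrame shares the same columns and a time index.

```python
import pandas as pd
import xarray as xr

# Hypothetical cleaned data: one DataFrame per station, indexed by time.
times = pd.date_range("2017-01-01", periods=3, freq="D")
station_frames = {
    "station_a": pd.DataFrame(
        {"air_temp": [1.0, 2.0, 3.0], "rel_hum": [50.0, 55.0, 60.0]},
        index=times,
    ),
    "station_b": pd.DataFrame(
        {"air_temp": [4.0, 5.0, 6.0], "rel_hum": [65.0, 70.0, 75.0]},
        index=times,
    ),
}

# Concatenating a dict uses its keys as the outer level of a MultiIndex;
# `names` labels the two index levels.
combined = pd.concat(station_frames, names=["station", "time"])

# Each MultiIndex level becomes a dimension, each column a data variable.
ds = xr.Dataset.from_dataframe(combined)
print(ds)
```

The resulting Dataset has `station` and `time` dimensions. Stations with differing time stamps should still work this way, with the gaps filled by NaN, but that case is not shown here.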

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  225536793