
issue_comments


2 rows where author_association = "NONE" and issue = 88868867 sorted by updated_at descending



Comment 112984626 · richardotis (6405510) · author_association: NONE
created_at: 2015-06-18T00:16:19Z · updated_at: 2015-06-18T00:16:19Z
html_url: https://github.com/pydata/xarray/issues/435#issuecomment-112984626
issue_url: https://api.github.com/repos/pydata/xarray/issues/435
node_id: MDEyOklzc3VlQ29tbWVudDExMjk4NDYyNg==

xray definitely seems to be the correct tool, as you suggested.

For the record, this is my first pass at coming up with the Dataset:

<xray.Dataset>
Dimensions:       (P: 20, T: 20, components: 4, id: 600, internal_dof: 9)
Coordinates:
  * components    (components) <U2 'AL' 'NI' 'CR' 'FE'
  * internal_dof  (internal_dof) <U4 'AL_0' 'NI_0' 'CR_0' 'FE_0' 'AL_1' ...
    composition   (id, components) float64 0.153 0.2138 0.2917 0.3415 0.316 ...
    Phase         <U6 'FCC_A1'
  * P             (P) float64 1e+05 1.833e+05 3.36e+05 6.158e+05 1.129e+06 ...
  * T             (T) float64 300.0 347.4 394.7 442.1 489.5 536.8 584.2 ...
  * id            (id) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 ...
Data variables:
    constitution  (id, internal_dof) float64 0.1296 0.2055 0.301 0.3639 ...
    energies      (T, P, id) float64 -5.533e+03 -605.8 -2.507e+03 -8.546e+03 ...

Full notebook: https://github.com/richardotis/pycalphad/blob/178f150b492099c32e197b417c11729f12d6dfe8/research/xrayTest.ipynb
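A minimal sketch of how a Dataset with that layout could be constructed (dimension sizes reduced from the repr, values random, and using the modern `xarray` package name rather than `xray`):

```python
import numpy as np
import xarray as xr  # "xray" was later renamed to xarray

rng = np.random.default_rng(0)
n_P, n_T, n_id = 3, 3, 5          # reduced from 20, 20, 600 in the repr
components = ["AL", "NI", "CR", "FE"]
internal_dof = ["AL_0", "NI_0", "CR_0", "FE_0"]  # truncated from 9 entries

ds = xr.Dataset(
    data_vars={
        "constitution": (("id", "internal_dof"),
                         rng.random((n_id, len(internal_dof)))),
        "energies": (("T", "P", "id"), rng.random((n_T, n_P, n_id))),
    },
    coords={
        "components": components,
        "internal_dof": internal_dof,
        # non-index coordinate defined on (id, components)
        "composition": (("id", "components"), rng.random((n_id, len(components)))),
        "Phase": "FCC_A1",  # scalar coordinate
        "P": np.geomspace(1e5, 1e6, n_P),
        "T": np.linspace(300.0, 600.0, n_T),
        "id": np.arange(n_id),
    },
)
print(ds["energies"].sel(T=300.0).shape)  # (3, 5): all pressures, all ids
```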

I decided I'm better off giving each phase its own Dataset; when I need to do multi-phase operations, I'll drop all the internal dimensions before merging them. The way to_dataframe() mixes up rows and columns makes me think I haven't yet found the optimal split between coordinates and data variables in the Dataset.
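That per-phase plan can be sketched as follows (phase names and dimension sizes are hypothetical; `drop_dims` removes the internal dimension and every variable defined on it before the phases are combined):

```python
import numpy as np
import xarray as xr

def make_phase(name: str, n_dof: int, n_id: int = 4) -> xr.Dataset:
    """Build a toy per-phase Dataset with its own internal_dof size."""
    rng = np.random.default_rng(n_dof)
    return xr.Dataset(
        {
            "constitution": (("id", "internal_dof"), rng.random((n_id, n_dof))),
            "energies": (("id",), rng.random(n_id)),
        },
        coords={"id": np.arange(n_id), "Phase": name},
    )

fcc = make_phase("FCC_A1", n_dof=4)
bcc = make_phase("BCC_A2", n_dof=3)

# internal_dof differs per phase, so drop it (and "constitution", which
# is defined on it) before combining along a new Phase dimension.
merged = xr.concat([ds.drop_dims("internal_dof") for ds in (fcc, bcc)],
                   dim="Phase")
print(merged["energies"].dims)  # ('Phase', 'id')
```

Using `concat` along the scalar `Phase` coordinate promotes it to a dimension, which sidesteps the name clash a plain `merge` of identically named variables would cause.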

Some initial queries of the data seem to function well and at a fraction of the memory cost of the pandas-based approach, so I'm feeling optimistic here.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: Working with labeled N-dimensional data with combinatoric independent variables (88868867)
Comment 112811117 · richardotis (6405510) · author_association: NONE
created_at: 2015-06-17T13:58:59Z · updated_at: 2015-06-17T13:58:59Z
html_url: https://github.com/pydata/xarray/issues/435#issuecomment-112811117
issue_url: https://api.github.com/repos/pydata/xarray/issues/435
node_id: MDEyOklzc3VlQ29tbWVudDExMjgxMTExNw==

Thank you for your thoughts. While composing my response I realized I'm actually concerned about two distinct data representation problems:

1. Energies computed for any number of phases at discrete temperatures, pressures, and compositions in a system ("energy surface data"). This is intermediate data in the computation.
2. Results of constrained equilibrium computations using the data in (1).

The shape of the data in (2) would be something like (condition axis 1, condition axis 2, ..., condition axis n). Conditions can be independent or dependent variables (the solver can work backwards), and not all combinations of conditions result in an answer.

For example, say I want to map the phase relations of a 4-component system in T-P-x space. I choose 50 temperatures, 50 pressures, and 100 points per independent composition axis (here I fix the total system size so that one composition variable is dependent). The shape of my equilibrium data would then be (50, 50, 100, 100, 100). But what is the value of each element, i.e., the equilibrium result?
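With reduced grid sizes, that condition grid could be set up like this (coordinate names such as X_AL are hypothetical; broadcasting the five 1-D condition axes yields the full combinatoric grid):

```python
import numpy as np
import xarray as xr

# Reduced sizes for illustration; the post uses 50 T, 50 P, and 100
# points per independent composition axis (X_FE is the dependent one).
T = np.linspace(300.0, 2000.0, 5)
P = np.geomspace(1e5, 1e9, 5)
x = np.linspace(0.0, 1.0, 11)

conds = xr.Dataset(coords={"T": T, "P": P, "X_AL": x, "X_NI": x, "X_CR": x})
grid = xr.broadcast(conds["T"], conds["P"], conds["X_AL"],
                    conds["X_NI"], conds["X_CR"])
print(grid[0].shape)  # (5, 5, 11, 11, 11)
```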

The equilibrium result is also multi-dimensional. I need to store the computed chemical potentials for each component (1-dimensional), the fraction of each stable phase (1-dimensional), and references to the corresponding physical states in the energy surface data (1-dimensional).
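One xarray-friendly way to express that vector-valued result is to give each result quantity its own extra dimension instead of nesting objects per grid element (all variable and dimension names here are hypothetical):

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(1)
components = ["AL", "NI", "CR", "FE"]

res = xr.Dataset(
    {
        # chemical potential per component at each condition
        "MU": (("T", "P", "components"), rng.random((3, 3, 4))),
        # fraction of each stable phase ("vertex" slots)
        "phase_fraction": (("T", "P", "vertex"), rng.random((3, 3, 2))),
        # reference back into the energy surface data
        "surface_id": (("T", "P", "vertex"), rng.integers(0, 600, (3, 3, 2))),
    },
    coords={
        "components": components,
        "vertex": [0, 1],
        "T": [300.0, 400.0, 500.0],
        "P": [1e5, 1e6, 1e7],
    },
)
print(res["MU"].sel(components="AL").dims)  # ('T', 'P')
```

This keeps every derived quantity sliceable and searchable with the same labeled indexing as the independent variables.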

Going back to (1), phases can also have "internal" composition variables that map to the overall composition in a non-invertible way, i.e., two physical states can have the same overall composition but different internal compositions. The way I've been handling this is by adding more columns to my DataFrames, but it's not a sustainable approach for reasons we've both mentioned.

The data in (1) makes the most sense to me as a "ragged ndarray", where the internal degrees of freedom of each phase are free to be different while still mapping to global composition coordinates. For (2), I imagine a "result object" bundled up inside all the conditions dimensions, but I need to be able to slice and search the derived/computed quantities just as easily as the independent variables.
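xarray has no native ragged-array type; a common workaround (sketched here with toy sizes) is to pad each phase's internal degrees of freedom with NaN up to a common length, so that all phases share one internal_dof dimension:

```python
import numpy as np
import xarray as xr

phase_dof = {"FCC_A1": 4, "BCC_A2": 3}  # internal dof count per phase (toy values)
max_dof = max(phase_dof.values())

arrays = []
for name, n_dof in phase_dof.items():
    data = np.full((1, max_dof), np.nan)      # pad unused slots with NaN
    data[0, :n_dof] = np.random.default_rng(n_dof).random(n_dof)
    arrays.append(xr.DataArray(data, dims=("Phase", "internal_dof"),
                               coords={"Phase": [name]}))

constitution = xr.concat(arrays, dim="Phase")
# BCC_A2 has one padded (NaN) slot out of max_dof = 4
print(int(np.isnan(constitution.sel(Phase="BCC_A2")).sum()))  # 1
```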

Does xray make sense for either or both of these cases?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: Working with labeled N-dimensional data with combinatoric independent variables (88868867)

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · About: xarray-datasette