issue_comments

11 rows where author_association = "NONE" and user = 10928117 sorted by updated_at descending

issue 4

  • cartesian product of coordinates and using it to index / fill empty dataset 5
  • WIP: progress toward making groupby work with multiple arguments 3
  • variable length of a dimension in DataArray 2
  • question: dataset variables as coordinates 1

user 1

  • RafalSkolasinski · 11 ✖

author_association 1

  • NONE · 11 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
435717326 https://github.com/pydata/xarray/issues/1914#issuecomment-435717326 https://api.github.com/repos/pydata/xarray/issues/1914 MDEyOklzc3VlQ29tbWVudDQzNTcxNzMyNg== RafalSkolasinski 10928117 2018-11-04T23:07:56Z 2018-11-04T23:07:56Z NONE

@jcmgray I must have missed your reply to this issue; I saw it just now. I love your code! I will definitely include xyzpy in my tools from now on ;-).

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  cartesian product of coordinates and using it to index / fill empty dataset 297560256
371184796 https://github.com/pydata/xarray/issues/1973#issuecomment-371184796 https://api.github.com/repos/pydata/xarray/issues/1973 MDEyOklzc3VlQ29tbWVudDM3MTE4NDc5Ng== RafalSkolasinski 10928117 2018-03-07T15:56:01Z 2018-03-07T18:59:18Z NONE

I just found one way to do it

```python
ds = ds2.transpose('a', 'n')
tmp = []
for n, p in enumerate(ds.a):
    a = p.data.tolist()
    x = ds.x[n].data.tolist()
    y = ds.y[n].data.tolist()
    tmp.append(xr.DataArray(y, coords={'x': x, 'a': a}, dims='x'))
```

```python
>>> ds1.equals(ds)
True
```

I have a strong feeling that there should be a much easier way to do it though...

Edit: I found a slightly nicer way to do it:

```python
ds = ds2.set_coords('x')
items = [ds.sel(a=a).swap_dims({'n': 'x'}) for a in ds.a]
ds = xr.concat(items, dim='a')
ds = ds.drop('n')
```

```python
>>> ds1.equals(ds)
True
```

However, it still does not seem intuitive enough...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  question: dataset variables as coordinates 303130664
367677038 https://github.com/pydata/xarray/issues/1914#issuecomment-367677038 https://api.github.com/repos/pydata/xarray/issues/1914 MDEyOklzc3VlQ29tbWVudDM2NzY3NzAzOA== RafalSkolasinski 10928117 2018-02-22T13:15:11Z 2018-02-22T13:15:11Z NONE

@shoyer Thanks for your suggestions and for linking the other issue. I think this one can also be labelled as a "usage question".

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  cartesian product of coordinates and using it to index / fill empty dataset 297560256
366833780 https://github.com/pydata/xarray/issues/1914#issuecomment-366833780 https://api.github.com/repos/pydata/xarray/issues/1914 MDEyOklzc3VlQ29tbWVudDM2NjgzMzc4MA== RafalSkolasinski 10928117 2018-02-20T00:27:36Z 2018-02-20T00:27:36Z NONE

After preparing a list similar to `[{'x': 0, 'y': 'a'}, {'x': 1, 'y': 'a'}, ...]`, interaction with the cluster is quite efficient. One can easily pass such a list to `map_async` of ipyparallel.

Thanks for your suggestion, I need to try a few things. I also want to try to extend it to a function that computes a few different things that could be multi-valued, e.g.

```python
def dummy(x, y):
    ds = xr.Dataset(
        {'out1': ('n', [1 * x, 2 * x, 3 * x]), 'out2': ('m', [x, y])},
        coords={'x': x, 'y': y, 'n': range(3), 'm': range(2)}
    )

    return ds
```

and then group such outputs together... OK, I know: I am going from a simple problem to a much more complicated one, but isn't that usually the case?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  cartesian product of coordinates and using it to index / fill empty dataset 297560256
366819497 https://github.com/pydata/xarray/issues/1914#issuecomment-366819497 https://api.github.com/repos/pydata/xarray/issues/1914 MDEyOklzc3VlQ29tbWVudDM2NjgxOTQ5Nw== RafalSkolasinski 10928117 2018-02-19T22:40:17Z 2018-02-19T22:58:02Z NONE

For "get done" I had, for example, the following (similar to what I linked as my initial attempt):

```python
import itertools as it

import numpy as np
import xarray as xr

coordinates = {
    'x': np.linspace(-1, 1),
    'y': np.linspace(0, 10),
}

constants = {
    'a': 1,
    'b': 5
}

# one dict of keyword arguments per point of the Cartesian product
inps = [{**constants, **{k: v for k, v in zip(coordinates.keys(), x)}}
        for x in list(it.product(*coordinates.values()))]

def f(x, y, a, b):
    """Some dummy function."""
    v = a * x**2 + b * y**2
    return xr.DataArray(v, {'x': x, 'y': y, 'a': a, 'b': b})

# simulate computation on cluster
values = list(map(lambda s: f(**s), inps))

# gather and unstack the inputs
ds = xr.concat(values, dim='new', coords='all')
ds = ds.set_index(new=list(set(ds.coords) - set(ds.dims)))
ds = ds.unstack('new')
```

It is very close to what you suggest. My main question is whether this can be done better. Mainly I am wondering about two things:

  1. Is there any built-in iterator over the Cartesian product of coordinates? If not, are there people who also think it would be useful?
  2. Gathering together / unstacking the data. My three-line combo of `concat`, `set_index` and `unstack` seems to do the trick, but it feels like an overcomplication.

Ideally I'd expect to have some mechanism that works similarly to:

```python
inputs = cartesian_product(coordinates)  # list similar to ``inps`` above
values = [function(inp) for inp in inputs]  # or using ipyparallel map

xarray_data = ...  # some empty xarray object
for inp, val in zip(inputs, values):
    xarray_data[inp] = val
```

I asked how to generate the product of coordinates from an xarray object because I was expecting that I could create `xarray_data` as an empty object with all coordinates set and then fill it.


Added comment

Starting with an empty object, that is, one filled with NaNs, would have the benefit that one could save partial results and have clear information about what has already been computed.
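
The idea of starting from a NaN-filled container can be sketched without xarray at all. The following is only an illustration under assumptions: the dictionary `grid`, the helper `pending` and the stand-in function `f` are hypothetical names, and a plain `dict` keyed by coordinate tuples takes the place of the "empty xarray object" asked about above.

```python
import itertools
import math

coordinates = {'x': [-1.0, 0.0, 1.0], 'y': [0.0, 10.0]}
names = list(coordinates)

# One NaN entry per grid point, keyed by the tuple of coordinate values.
grid = {point: math.nan
        for point in itertools.product(*(coordinates[n] for n in names))}


def f(x, y):
    """Stand-in for an expensive computation (hypothetical)."""
    return x**2 + y**2


def pending(grid):
    """Return the grid points that have not been computed yet."""
    return [point for point, value in grid.items() if math.isnan(value)]


# Compute only part of the grid, as if the job had been interrupted.
for point in pending(grid)[:4]:
    grid[point] = f(**dict(zip(names, point)))

# The points still holding NaN are exactly the ones left to compute.
remaining = pending(grid)
```

One caveat of this approach: a computation that legitimately returns NaN is indistinguishable from a point that was never computed, so a separate "done" flag may be preferable in that case.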

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  cartesian product of coordinates and using it to index / fill empty dataset 297560256
366740505 https://github.com/pydata/xarray/issues/1914#issuecomment-366740505 https://api.github.com/repos/pydata/xarray/issues/1914 MDEyOklzc3VlQ29tbWVudDM2Njc0MDUwNQ== RafalSkolasinski 10928117 2018-02-19T16:20:15Z 2018-02-19T16:23:31Z NONE

Let me give a bit of background on what I would like to do:

  1. Create an empty Dataset of the coordinates I want to explore, i.e. two np.arrays x and y, and two scalars a and b.
  2. Generate a list of the Cartesian product of all the coordinates, i.e. [ {'x': -1, 'y': 0, 'a': 1, 'b': 5}, ...] (the data format doesn't really matter).
  3. For each item of the list, compute some function: f = f(x, y, a, b). In principle this function can be expensive to compute, so I'd compute it for each item of the list from step 2 separately on the cluster.
  4. "Merge" it all together into a single xarray object.

In principle f should be allowed to return e.g. an np.array. A related issue in holoviews and the notebook with my initial attempt. In the linked notebook I managed to achieve the goal, but without starting from an xarray object containing the coordinates. Also, combining the data seems a bit inefficient, as for larger datasets it takes more time than generating the data.
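
Steps 2 and 3 can be sketched with the standard library alone. This is a minimal illustration, not an xarray feature: `cartesian_product` is a hypothetical helper written here for the example, the coordinate values are arbitrary, and the quadratic `f` merely stands in for the expensive function.

```python
import itertools


def cartesian_product(coordinates, constants=None):
    """Yield one dict per point of the Cartesian product of the coordinates.

    `coordinates` maps a name to an iterable of values; `constants` maps a
    name to a single value that is copied into every point.
    """
    constants = dict(constants or {})
    names = list(coordinates)
    for combo in itertools.product(*(coordinates[name] for name in names)):
        point = dict(zip(names, combo))
        point.update(constants)
        yield point


def f(x, y, a, b):
    """Stand-in for an expensive function evaluated on the cluster."""
    return a * x**2 + b * y**2


coordinates = {'x': [-1.0, 0.0, 1.0], 'y': [0.0, 5.0, 10.0]}
constants = {'a': 1, 'b': 5}

inputs = list(cartesian_product(coordinates, constants))
values = [f(**inp) for inp in inputs]  # a parallel map could replace this
```

Because every element of `inputs` is a plain dict of keyword arguments, the list can be handed to any map-style executor, and `zip(inputs, values)` keeps each result paired with the coordinates that produced it for the later merge step.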

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  cartesian product of coordinates and using it to index / fill empty dataset 297560256
286525479 https://github.com/pydata/xarray/pull/924#issuecomment-286525479 https://api.github.com/repos/pydata/xarray/issues/924 MDEyOklzc3VlQ29tbWVudDI4NjUyNTQ3OQ== RafalSkolasinski 10928117 2017-03-14T18:57:48Z 2017-03-14T18:57:48Z NONE

@pwolfram Unfortunately nothing from my side yet.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  WIP: progress toward making groupby work with multiple arguments 168272291
280422799 https://github.com/pydata/xarray/issues/1265#issuecomment-280422799 https://api.github.com/repos/pydata/xarray/issues/1265 MDEyOklzc3VlQ29tbWVudDI4MDQyMjc5OQ== RafalSkolasinski 10928117 2017-02-16T18:51:30Z 2017-02-16T18:51:30Z NONE

Hi, I tried to come up with a slightly more interesting but still simple example:

```python
from itertools import product
import numpy as np
import pandas as pd

import holoviews as hv
hv.notebook_extension()

def energies(L, a):
    k = np.pi * np.arange(1, L//a) / L
    return {'exact': k**2, 'approx': 2*(1 - np.cos(k * a)) / a**2}

L = np.arange(10, 21, 2)
a = np.array([1, .5, .25])

data = []
for Li, ai in product(L, a):
    output = dict(L=Li, a=ai)
    output.update(**energies(Li, ai))
    data.append(output)

df = pd.DataFrame(data)

hmap_data = {}
for n, row in df.iterrows():
    key = row.L, row.a
    val = (hv.Points((np.arange(len(row.exact)), row.exact), kdims=['n', 'E'])
           * hv.Points((np.arange(len(row.approx)), row.approx), kdims=['n', 'E']))
    hmap_data[key] = val

hv.HoloMap(hmap_data, kdims=['L', 'a']).select(n=(0, 20), E=(0, 20))
```

The example is simple and doesn't include any serious simulation. Here I compare the energies of a particle in a 1D box with what would come out of a tight-binding simulation. Although very simple, it captures a situation that often arises when calculating the spectrum of a finite system: different system sizes give different numbers of energy levels.

That simple example is manageable without any pandas or xarray machinery, but imagine a real simulation made with kwant for a series of different input parameters (system dimensions, gate voltages, chemical potentials, etc.).
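
To make the variable-length problem concrete, here is a small standard-library sketch. Padding with NaN is only one common workaround for fitting ragged results into a rectangular array; it is an assumption of this example, not something the thread settled on, and `exact_energies` and `pad_to_common_length` are hypothetical helper names.

```python
import math


def exact_energies(L, a):
    """Particle-in-a-box levels k**2 for k = pi * n / L, n = 1 .. L/a - 1."""
    n_levels = int(L // a) - 1
    return [(math.pi * n / L) ** 2 for n in range(1, n_levels + 1)]


def pad_to_common_length(rows, fill=math.nan):
    """Right-pad every list in `rows` with `fill` up to the longest length."""
    width = max(len(row) for row in rows.values())
    return {key: row + [fill] * (width - len(row)) for key, row in rows.items()}


# Different (L, a) pairs give different numbers of levels.
ragged = {(L, a): exact_energies(L, a) for L in (10, 12) for a in (1, 0.5)}

# After padding, every entry has the same length and could be stacked along
# a shared level-index dimension.
padded = pad_to_common_length(ragged)
```

The padded entries cost memory when the lengths differ a lot, which is part of why a native variable-length dimension would be attractive.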

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  variable length of a dimension in DataArray 207283854
280314275 https://github.com/pydata/xarray/pull/924#issuecomment-280314275 https://api.github.com/repos/pydata/xarray/issues/924 MDEyOklzc3VlQ29tbWVudDI4MDMxNDI3NQ== RafalSkolasinski 10928117 2017-02-16T12:09:49Z 2017-02-16T12:09:49Z NONE

@shoyer I am considering contributing to this feature. Could you give me more details about what needs to be done?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  WIP: progress toward making groupby work with multiple arguments 168272291
279514589 https://github.com/pydata/xarray/issues/1265#issuecomment-279514589 https://api.github.com/repos/pydata/xarray/issues/1265 MDEyOklzc3VlQ29tbWVudDI3OTUxNDU4OQ== RafalSkolasinski 10928117 2017-02-13T20:37:48Z 2017-02-13T20:37:48Z NONE

I believe that this is a common problem in simulations of quantum mechanical systems. I will try to come up with a more realistic / practical example that I hope will help with choosing the best solution.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  variable length of a dimension in DataArray 207283854
278944182 https://github.com/pydata/xarray/pull/924#issuecomment-278944182 https://api.github.com/repos/pydata/xarray/issues/924 MDEyOklzc3VlQ29tbWVudDI3ODk0NDE4Mg== RafalSkolasinski 10928117 2017-02-10T13:40:51Z 2017-02-10T13:40:51Z NONE

Hi, is there any active work on that feature? It would be really cool to have it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  WIP: progress toward making groupby work with multiple arguments 168272291

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 480.31ms · About: xarray-datasette