issue_comments


22 rows where user = 7300413 sorted by updated_at descending


issue (10 values)

  • Some queries 4
  • Making xray use multiple cores 3
  • Drawing only one contour 3
  • Using groupby with custom index 3
  • Creating a 2D DataArray 3
  • Query about concat 2
  • JJAS? 1
  • Unable to reference variable 1
  • colorbars in facet grids 1
  • Colorbar to FacetGrid plots 1

user (1 value)

  • JoyMonteiro · 22

author_association (1 value)

  • NONE 22
Columns: id, html_url, issue_url, node_id, user, created_at, updated_at (sorted ▲), author_association, body, reactions, performed_via_github_app, issue
346242650 https://github.com/pydata/xarray/pull/1735#issuecomment-346242650 https://api.github.com/repos/pydata/xarray/issues/1735 MDEyOklzc3VlQ29tbWVudDM0NjI0MjY1MA== JoyMonteiro 7300413 2017-11-22T04:48:51Z 2017-11-22T04:48:51Z NONE

Happy to make any changes required!!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Colorbar to FacetGrid plots 275943854
345795013 https://github.com/pydata/xarray/issues/1717#issuecomment-345795013 https://api.github.com/repos/pydata/xarray/issues/1717 MDEyOklzc3VlQ29tbWVudDM0NTc5NTAxMw== JoyMonteiro 7300413 2017-11-20T19:04:11Z 2017-11-20T19:04:11Z NONE

Great, will prepare a PR! I second @fmaussion on keeping the auto colorbar; it just makes life easy!

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  colorbars in facet grids 274233261
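
A minimal sketch of the faceting behaviour discussed above, with synthetic data; xarray draws a single shared colorbar for the whole grid by default, which is the "auto colorbar" referred to:

    import numpy as np
    import xarray as xr

    da = xr.DataArray(np.random.rand(4, 20, 30), dims=("time", "y", "x"), name="field")

    # col="time" produces a FacetGrid; one colorbar is shared by all panels.
    g = da.plot(col="time", col_wrap=2)
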
291552833 https://github.com/pydata/xarray/issues/1351#issuecomment-291552833 https://api.github.com/repos/pydata/xarray/issues/1351 MDEyOklzc3VlQ29tbWVudDI5MTU1MjgzMw== JoyMonteiro 7300413 2017-04-04T16:19:12Z 2017-04-04T16:19:12Z NONE

The DataArrays we use are a thin wrapper over xarray's, to allow conversion into desired units. If we used Dataset, we would lose this functionality unless we subclassed Dataset too. Overall, using a simple dictionary suits our purposes better.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Creating a 2D DataArray 219184224
291544315 https://github.com/pydata/xarray/issues/1351#issuecomment-291544315 https://api.github.com/repos/pydata/xarray/issues/1351 MDEyOklzc3VlQ29tbWVudDI5MTU0NDMxNQ== JoyMonteiro 7300413 2017-04-04T15:52:31Z 2017-04-04T15:52:31Z NONE

Thanks, Ryan. This is for CliMT, where the model arrays are DataArrays, so it would not make sense to use a Dataset. I think option 1 will make more sense.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Creating a 2D DataArray 219184224
291527776 https://github.com/pydata/xarray/issues/1351#issuecomment-291527776 https://api.github.com/repos/pydata/xarray/issues/1351 MDEyOklzc3VlQ29tbWVudDI5MTUyNzc3Ng== JoyMonteiro 7300413 2017-04-04T15:00:59Z 2017-04-04T15:00:59Z NONE

Also, I recall this is new functionality. What minimum version do I need to use 2D coordinates? Once I get the right syntax, that is ;)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Creating a 2D DataArray 219184224
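
A minimal sketch of a 2D DataArray with two-dimensional coordinates (the curvilinear grid here is made up for illustration); multidimensional coordinates have been available since fairly early xarray releases, so check the release notes for the exact minimum version:

    import numpy as np
    import xarray as xr

    ny, nx = 3, 4
    # Two-dimensional coordinate arrays, as on a curvilinear lat/lon grid.
    lon2d, lat2d = np.meshgrid(np.linspace(0, 30, nx), np.linspace(-10, 10, ny))

    da = xr.DataArray(
        np.random.rand(ny, nx),
        dims=("y", "x"),
        coords={"lat": (("y", "x"), lat2d), "lon": (("y", "x"), lon2d)},
    )
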
286779750 https://github.com/pydata/xarray/issues/1308#issuecomment-286779750 https://api.github.com/repos/pydata/xarray/issues/1308 MDEyOklzc3VlQ29tbWVudDI4Njc3OTc1MA== JoyMonteiro 7300413 2017-03-15T15:32:33Z 2017-03-15T15:32:33Z NONE

Not sure if this helps, but I did a %%time on both versions. For the daily climatology, the numbers are:

CPU times: user 1h 21min 8s, sys: 6h 17min 39s, total: 7h 38min 47s
Wall time: 20min 34s

For the 6-hourly one:

CPU times: user 5h 5min 6s, sys: 1d 2h 19min 45s, total: 1d 7h 24min 51s
Wall time: 1h 31min 40s

It takes around 4x more time, which makes sense because there are 4x more groups. The ratio of user to system time is more or less constant, so nothing untoward seems to be happening between the two runs.

I think it is just good to remember that the runtime scales linearly with the number of groups. I guess this is what @shoyer was talking about when he mentioned that, since grouping is done within xarray, the dask graph grows, making things slower.

Thanks again!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Using groupby with custom index 214088387
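
A sketch of the two groupings being compared, on synthetic data; the 6-hourly climatology uses a custom string index with four times as many groups, which matches the roughly 4x runtime reported above:

    import numpy as np
    import pandas as pd
    import xarray as xr

    time = pd.date_range("2000-01-01", periods=4 * 365, freq="6H")
    da = xr.DataArray(np.random.rand(time.size), coords={"time": time}, dims="time")

    # Daily climatology: one group per day of year.
    daily_clim = da.groupby("time.dayofyear").mean("time")

    # 6-hourly climatology: a custom index built from month-day-hour strings,
    # giving 4x as many groups (and a 4x larger dask graph).
    key = da["time"].dt.strftime("%m-%d %H").rename("mmddhh")
    sixhourly_clim = da.groupby(key).mean("time")
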
286509639 https://github.com/pydata/xarray/issues/1308#issuecomment-286509639 https://api.github.com/repos/pydata/xarray/issues/1308 MDEyOklzc3VlQ29tbWVudDI4NjUwOTYzOQ== JoyMonteiro 7300413 2017-03-14T18:05:54Z 2017-03-14T18:05:54Z NONE

@shoyer If I increase the size of the longitude chunk any more, it will be almost like using no chunking at all. I guess this dataset is a corner case. I will try doubling that value and see what happens. I hadn't realised that doing a groupby would also reduce the effective chunk size; thanks for pointing that out.

I'm using dask without distributed as of now; is there still some way to do the benchmark? I would be more than happy to run it.

@rabernat I would definitely favour a cloud-based sandbox to try these things out. What would be the stumbling block towards actually setting it up? I have had some recent experience setting up jupyterhub, and I can help set it up so that notebooks can be used easily in such an environment.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Using groupby with custom index 214088387
286497255 https://github.com/pydata/xarray/issues/1308#issuecomment-286497255 https://api.github.com/repos/pydata/xarray/issues/1308 MDEyOklzc3VlQ29tbWVudDI4NjQ5NzI1NQ== JoyMonteiro 7300413 2017-03-14T17:27:06Z 2017-03-14T17:31:32Z NONE

Hello Stephan,

The shape of the full data, if I read from within xarray, is (time, level, lat, lon), with level=60, lat=41, lon=480. time is 4*365*7 ~ 10000.

I am chunking only along longitude, using lon=100. I previously chunked along time, but that used too much memory (~45GB out of 128 GB) since the data is split into one file per month, and reading annual data would require reading many files into memory.

Superficially, I would think that both of the above would take similar amounts of time. In fact, calculating a daily climatology also requires grouping the four 6 hourly data points into a single day as well, which seems to be more complicated. However, it seems to run faster!

Thanks, Joy

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Using groupby with custom index 214088387
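
A sketch of the setup described above (file pattern and chunk size are assumptions): one file per month opened lazily, chunked only along longitude, then grouped for the climatology. Grouping along time subdivides every chunk into one piece per group, which is why the effective chunk size shrinks:

    import xarray as xr

    ds = xr.open_mfdataset("era_interim_*.nc", chunks={"lon": 100})
    clim = ds.groupby("time.dayofyear").mean("time")
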
268234430 https://github.com/pydata/xarray/issues/1173#issuecomment-268234430 https://api.github.com/repos/pydata/xarray/issues/1173 MDEyOklzc3VlQ29tbWVudDI2ODIzNDQzMA== JoyMonteiro 7300413 2016-12-20T12:44:25Z 2016-12-20T12:44:25Z NONE

Playing around with things sounds like much more fun :) I can see how this will be useful, will start thinking of some test cases to code.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Some queries 196541604
268113379 https://github.com/pydata/xarray/issues/1173#issuecomment-268113379 https://api.github.com/repos/pydata/xarray/issues/1173 MDEyOklzc3VlQ29tbWVudDI2ODExMzM3OQ== JoyMonteiro 7300413 2016-12-19T23:54:13Z 2016-12-19T23:54:13Z NONE

Thanks. How big of an endeavour is this? I have some free time from the 2nd-3rd week of Jan, and I could maybe contribute towards making this happen.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Some queries 196541604
268105389 https://github.com/pydata/xarray/issues/1173#issuecomment-268105389 https://api.github.com/repos/pydata/xarray/issues/1173 MDEyOklzc3VlQ29tbWVudDI2ODEwNTM4OQ== JoyMonteiro 7300413 2016-12-19T23:08:38Z 2016-12-19T23:08:38Z NONE

@shoyer: does this also work with dask.distributed? The doc seems to only mention a thread pool.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Some queries 196541604
268104755 https://github.com/pydata/xarray/issues/1173#issuecomment-268104755 https://api.github.com/repos/pydata/xarray/issues/1173 MDEyOklzc3VlQ29tbWVudDI2ODEwNDc1NQ== JoyMonteiro 7300413 2016-12-19T23:05:16Z 2016-12-19T23:05:16Z NONE

Did not know about that, thanks!!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Some queries 196541604
223646763 https://github.com/pydata/xarray/issues/866#issuecomment-223646763 https://api.github.com/repos/pydata/xarray/issues/866 MDEyOklzc3VlQ29tbWVudDIyMzY0Njc2Mw== JoyMonteiro 7300413 2016-06-03T17:49:22Z 2016-06-03T17:49:22Z NONE

Oh, not really! Thanks for the fix :)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Drawing only one contour 158212793
223629831 https://github.com/pydata/xarray/issues/866#issuecomment-223629831 https://api.github.com/repos/pydata/xarray/issues/866 MDEyOklzc3VlQ29tbWVudDIyMzYyOTgzMQ== JoyMonteiro 7300413 2016-06-03T16:40:41Z 2016-06-03T16:40:41Z NONE

Just repeating your example, but with

    z.plot.contour(ax=ax1, norm=None, **kw)

gives the expected result. For some reason, the norm is causing an issue.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Drawing only one contour 158212793
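
A minimal reproduction of the workaround, with made-up data, and keyword arguments standing in for those of the original example:

    import matplotlib.pyplot as plt
    import numpy as np
    import xarray as xr

    z = xr.DataArray(np.random.rand(30, 40), dims=("y", "x"))

    fig, ax1 = plt.subplots()
    kw = dict(levels=[0.5], colors="k")  # assumed stand-in for the original kw
    # Passing norm=None avoids the colorbar-derived norm that broke the plot.
    z.plot.contour(ax=ax1, norm=None, **kw)
    plt.show()
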
223522352 https://github.com/pydata/xarray/issues/866#issuecomment-223522352 https://api.github.com/repos/pydata/xarray/issues/866 MDEyOklzc3VlQ29tbWVudDIyMzUyMjM1Mg== JoyMonteiro 7300413 2016-06-03T08:34:58Z 2016-06-03T08:34:58Z NONE

Yes, I had encountered that as well, but did not realise it was due to the colorbar... I normally don't use the colorbar. Thanks for the example!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Drawing only one contour 158212793
201191834 https://github.com/pydata/xarray/issues/803#issuecomment-201191834 https://api.github.com/repos/pydata/xarray/issues/803 MDEyOklzc3VlQ29tbWVudDIwMTE5MTgzNA== JoyMonteiro 7300413 2016-03-25T08:03:10Z 2016-03-25T08:03:10Z NONE

Yes, that makes sense. Quite the corner case! Works perfectly now, thanks again.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Unable to reference variable 143422096
162463237 https://github.com/pydata/xarray/issues/672#issuecomment-162463237 https://api.github.com/repos/pydata/xarray/issues/672 MDEyOklzc3VlQ29tbWVudDE2MjQ2MzIzNw== JoyMonteiro 7300413 2015-12-07T09:33:17Z 2015-12-07T09:33:17Z NONE

You were right, my chunk sizes were too large. It did not matter how many threads dask used either (4 vs. 8). The I/O component is still high, but that is also because I'm writing the final computed DataArray to disk.

Thanks!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Making xray use multiple cores 120681918
162419595 https://github.com/pydata/xarray/issues/672#issuecomment-162419595 https://api.github.com/repos/pydata/xarray/issues/672 MDEyOklzc3VlQ29tbWVudDE2MjQxOTU5NQ== JoyMonteiro 7300413 2015-12-07T06:11:39Z 2015-12-07T06:11:39Z NONE

Hello, I ran it with the dask profiler and looked at the top output disaggregated by core. It does seem to use multiple cores, but prof.visualize() shows it using 8 threads (hyperthreading :P), and I feel this is killing performance.

How can I control how many threads to use?

Thanks, Joy

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Making xray use multiple cores 120681918
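
One way to cap the thread count, sketched with the current dask configuration API (the 2015-era equivalent was dask.set_options(pool=ThreadPool(4))):

    from multiprocessing.pool import ThreadPool

    import dask

    # Route the threaded scheduler through a fixed-size pool so that at
    # most 4 OS threads run dask tasks, hyperthreading notwithstanding.
    dask.config.set(scheduler="threads", pool=ThreadPool(4))
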
162417283 https://github.com/pydata/xarray/issues/672#issuecomment-162417283 https://api.github.com/repos/pydata/xarray/issues/672 MDEyOklzc3VlQ29tbWVudDE2MjQxNzI4Mw== JoyMonteiro 7300413 2015-12-07T05:46:15Z 2015-12-07T05:46:15Z NONE

I was trying to read ERA-Interim data, calculate anomalies using ds = ds - ds.mean(dim='longitude'), and similar operations along the time axis. Are such operations restricted to single cores?

Just multiplying two datasets (u*v) seems to be faster, though top shows two cores being used (I have 4 physical cores).

TIA, Joy

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Making xray use multiple cores 120681918
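
A sketch of the anomaly computation described above on a chunked dataset (file name and chunk size are assumptions):

    import xarray as xr

    ds = xr.open_dataset("era_interim.nc", chunks={"time": 100})

    # The zonal mean broadcasts back over longitude; with dask-backed
    # arrays this builds a task graph that can execute on several cores.
    anomaly = (ds - ds.mean(dim="longitude")).compute()
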
94132735 https://github.com/pydata/xarray/issues/393#issuecomment-94132735 https://api.github.com/repos/pydata/xarray/issues/393 MDEyOklzc3VlQ29tbWVudDk0MTMyNzM1 JoyMonteiro 7300413 2015-04-18T06:02:04Z 2015-04-18T06:02:04Z NONE

Thanks! Will extend my code in this fashion.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  JJAS? 69141510
76890735 https://github.com/pydata/xarray/issues/349#issuecomment-76890735 https://api.github.com/repos/pydata/xarray/issues/349 MDEyOklzc3VlQ29tbWVudDc2ODkwNzM1 JoyMonteiro 7300413 2015-03-03T05:45:50Z 2015-03-03T05:45:50Z NONE

Thanks. But that really kills my machine, even though I have 12 GB of RAM.

What I finally ended up doing is slicing the initial dataset created from one nc file to access the level+variable that I wanted. This gives me a DataArray object which I then xray.concat() with similar objects created from other variables.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Query about concat 59467251
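
A sketch of that workflow (file names, variable, and level are assumptions): slice each file's dataset down to one variable at one level, then concatenate the resulting DataArrays along time:

    import xarray as xr

    arrays = []
    for path in ["era_1979.nc", "era_1980.nc"]:
        ds = xr.open_dataset(path)
        arrays.append(ds["u"].sel(level=850))

    u850 = xr.concat(arrays, dim="time")
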
76883017 https://github.com/pydata/xarray/issues/349#issuecomment-76883017 https://api.github.com/repos/pydata/xarray/issues/349 MDEyOklzc3VlQ29tbWVudDc2ODgzMDE3 JoyMonteiro 7300413 2015-03-03T03:57:55Z 2015-03-03T03:57:55Z NONE

No, not really. Each file contains one year of data for four variables, and I have 35 files (1979-...)

I tried Dataset.merge as you suggested, but it reports a conflicting value for the variable time, which I guess is what you would expect.

Can xray modify the nc file to make the time dimension unlimited? Then I could simply use something like MFDataset...

TIA, Joy

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Query about concat 59467251
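
On the MFDataset question: in current xarray, open_mfdataset concatenates across files lazily without modifying them, so the time dimension does not need to be made unlimited (the glob pattern is an assumption):

    import xarray as xr

    ds = xr.open_mfdataset("era_*.nc", combine="by_coords")
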

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
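
The page's row selection can be reproduced against this schema with a short query; a sketch using Python's sqlite3 (the database filename is an assumption):

    import sqlite3

    conn = sqlite3.connect("github.db")
    rows = conn.execute(
        """
        SELECT id, issue_url, created_at, body
        FROM issue_comments
        WHERE user = 7300413
        ORDER BY updated_at DESC
        """
    ).fetchall()
    print(len(rows))  # 22 rows for this user
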