
issue_comments


9 rows where author_association = "CONTRIBUTOR", issue = 138332032, and user = 4295853, sorted by updated_at descending


user: pwolfram (9)
issue: Array size changes following loading of numpy array (9)
author_association: CONTRIBUTOR (9)
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
194025629 https://github.com/pydata/xarray/issues/783#issuecomment-194025629 https://api.github.com/repos/pydata/xarray/issues/783 MDEyOklzc3VlQ29tbWVudDE5NDAyNTYyOQ== pwolfram 4295853 2016-03-08T23:37:16Z 2016-03-08T23:37:16Z CONTRIBUTOR

Issue resolved via https://github.com/dask/dask/issues/1038

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Array size changes following loading of numpy array 138332032
194009858 https://github.com/pydata/xarray/issues/783#issuecomment-194009858 https://api.github.com/repos/pydata/xarray/issues/783 MDEyOklzc3VlQ29tbWVudDE5NDAwOTg1OA== pwolfram 4295853 2016-03-08T23:06:11Z 2016-03-08T23:36:42Z CONTRIBUTOR

@jcrist, @mrocklin, and @shoyer, thank you all for the fix. The problem disappeared following pip -v install git+ssh://git@github.com/jcrist/dask@xr_issue. Do any of you know when this PR will be incorporated into a conda release? I'm OK hacking things on my laptop, but I typically only install release software on clusters, to ensure production-run quality. Once it is released I can remove my temp fix that sets the chunk size. Thanks again!

193849800 https://github.com/pydata/xarray/issues/783#issuecomment-193849800 https://api.github.com/repos/pydata/xarray/issues/783 MDEyOklzc3VlQ29tbWVudDE5Mzg0OTgwMA== pwolfram 4295853 2016-03-08T16:27:14Z 2016-03-08T16:27:14Z CONTRIBUTOR

Thank you @shoyer, @mrocklin, and @jcrist for your clutch help with this issue. Also, if it would be helpful for me to submit this issue to dask, I'm happy to do that, but I'm assuming it is not necessary.

193523226 https://github.com/pydata/xarray/issues/783#issuecomment-193523226 https://api.github.com/repos/pydata/xarray/issues/783 MDEyOklzc3VlQ29tbWVudDE5MzUyMzIyNg== pwolfram 4295853 2016-03-08T00:22:05Z 2016-03-08T00:22:05Z CONTRIBUTOR

Thanks @shoyer. You are correct that the files have different sizes, in this case corresponding to the days in each month. Also, thank you for the clarification about the distinction between xarray and dask. I clearly don't have the background to debug this quickly myself, so I'll follow your and @mrocklin's lead on where to go from here. Since I have a fix in the short term, science will progress. In the long term, however, I'd like to help get this resolved, especially since our team here is using xarray more and more.

193514285 https://github.com/pydata/xarray/issues/783#issuecomment-193514285 https://api.github.com/repos/pydata/xarray/issues/783 MDEyOklzc3VlQ29tbWVudDE5MzUxNDI4NQ== pwolfram 4295853 2016-03-07T23:59:44Z 2016-03-07T23:59:44Z CONTRIBUTOR

@shoyer, the problem can be "resolved" by manually specifying the chunk size, e.g., https://gist.github.com/76dccfed2ff8e33b3a2a, specifically line 46: rlzns = rlzns.chunk({'Time':30}). The actual number appears to be unimportant; 1 and 1000 also work.

So, following @mrocklin, I'd intuit that the xarray rechunking algorithm has a bug: I'm guessing there may be incompatible or inconsistent chunk sizes across the dask arrays spawned for each file, and under some condition the chunk size gets perturbed by an off-by-one error. Setting the chunk size manually appears to ensure that sizes are maintained for small chunk sizes, while large chunk sizes are capped at the maximum size of each dask array.

Is it by design that chunk sizes are automatically changed following indexing such as rlzns.xParticle[rnum*Ntr:(rnum+1)*Ntr,:]?

@shoyer, I'm assuming you or your team could work through this bug quickly, but if not, can you please provide some high-level guidance on how to sort it out? In the short term, I can just set the chunk size manually to 100, which I will confirm works in my application.
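The workaround can be sketched end-to-end on synthetic data. This is a minimal illustration, not the actual particle files: the dimension names 'Time' and 'nParticles' follow the comment, and the values and array sizes are made up.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the concatenated particle output:
# 240 time steps, 1 particle (values are arbitrary).
ds = xr.Dataset(
    {"xParticle": (("Time", "nParticles"), np.arange(240.0).reshape(240, 1))}
)

# Manually rechunk along 'Time', as in the workaround; per the comment,
# the exact chunk size is unimportant.
ds = ds.chunk({"Time": 30})

# Slice out one realization, as in the original script.
Ntr, rnum = 30, 7
block = ds.xParticle[rnum * Ntr:(rnum + 1) * Ntr, :]

# With explicit chunks, the lazy shape and the computed shape agree.
assert block.shape == (30, 1)
assert block.values.shape == (30, 1)
```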

193312146 https://github.com/pydata/xarray/issues/783#issuecomment-193312146 https://api.github.com/repos/pydata/xarray/issues/783 MDEyOklzc3VlQ29tbWVudDE5MzMxMjE0Ng== pwolfram 4295853 2016-03-07T15:53:55Z 2016-03-07T15:53:55Z CONTRIBUTOR

Here is a further simplified script to help with the start of debugging (to avoid using my custom preprocessing wrapper): https://gist.github.com/76dccfed2ff8e33b3a2a

193306605 https://github.com/pydata/xarray/issues/783#issuecomment-193306605 https://api.github.com/repos/pydata/xarray/issues/783 MDEyOklzc3VlQ29tbWVudDE5MzMwNjYwNQ== pwolfram 4295853 2016-03-07T15:46:59Z 2016-03-07T15:46:59Z CONTRIBUTOR

I should also note that I've tested this on Linux on our cluster as well as on OS X, so the problem appears to be platform-independent (I haven't tested on Windows, though).

193303887 https://github.com/pydata/xarray/issues/783#issuecomment-193303887 https://api.github.com/repos/pydata/xarray/issues/783 MDEyOklzc3VlQ29tbWVudDE5MzMwMzg4Nw== pwolfram 4295853 2016-03-07T15:39:44Z 2016-03-07T15:39:44Z CONTRIBUTOR

Thanks @shoyer and @mrocklin. You can find a small, reduced demonstration of the problem at https://www.dropbox.com/s/l31jiol5t08lj0s/test_dask.tgz?dl=0. The data are produced by our in-situ Lagrangian particle tracking (http://journals.ametsoc.org/doi/abs/10.1175/JPO-D-14-0260.1), and I'm "clipping" the output files via:

for i in lagrPartTrack.00*; do ncks -F -d nParticles,1,100,1 -v numTimesReset,xParticle,xtime,buoyancySurfaceValues $i clipped_$i ; done

Once you have downloaded the tgz file in a new directory, run:

tar xzvf test_dask.tgz ; python test_dask_error.py -f 'clipped_lagrPartTrack.00*nc'

which should give you an error of

Traceback (most recent call last):
  File "test_dask_error.py", line 107, in <module>
    aggregate_positions(options.particlefile, options.nrealization, options.rlznrange)
  File "test_dask_error.py", line 70, in aggregate_positions
    str(rlzns.xParticle[rnum*Ntr:(rnum+1)*Ntr,:].values.shape)
AssertionError: for rnum=7 rlzns.xParticle[rnum*Ntr:(rnum+1)*Ntr,:].shape = (30, 100) and rlzns.xParticle[rnum*Ntr:(rnum+1)*Ntr,:].values.shape = (29, 100)

to demonstrate the problem. Once the script finishes cleanly (without the assert triggering), I'd say the problem is resolved for my uses.

Note that this also occurs with dask 0.8.0, and I'm using the latest xarray, 0.7.1.
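The assertion in test_dask_error.py boils down to comparing the lazy shape of a slice against the shape of its computed values. A minimal stand-alone version of that check, with a synthetic dask array in place of the netCDF files (the shapes follow the traceback above; the data themselves are made up):

```python
import numpy as np
import dask.array as da

# Synthetic stand-in for the concatenated particle data: 8 realizations
# of Ntr = 30 time steps each, 100 particles (values are arbitrary).
Ntr, nParticles = 30, 100
x = da.from_array(np.zeros((8 * Ntr, nParticles)), chunks=(Ntr, nParticles))

for rnum in range(8):
    block = x[rnum * Ntr:(rnum + 1) * Ntr, :]
    computed = block.compute()
    # The bug manifested as these two shapes disagreeing by one row
    # (e.g. (30, 100) lazily vs (29, 100) after compute).
    assert block.shape == computed.shape, (rnum, block.shape, computed.shape)
```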

192025306 https://github.com/pydata/xarray/issues/783#issuecomment-192025306 https://api.github.com/repos/pydata/xarray/issues/783 MDEyOklzc3VlQ29tbWVudDE5MjAyNTMwNg== pwolfram 4295853 2016-03-03T23:51:11Z 2016-03-03T23:51:11Z CONTRIBUTOR

This is for the following versions:

dask     0.7.5    py27_0
xarray   0.7.1    py27_0



CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
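The row filter shown at the top of the page maps directly onto this schema. A minimal self-contained sketch using Python's sqlite3, with the table recreated in memory and a single row populated from the first comment above (the other columns are omitted for brevity, and the REFERENCES clauses are dropped since the users and issues tables aren't recreated here):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Recreate issue_comments with the columns used by the filter.
conn.executescript("""
CREATE TABLE [issue_comments] (
   [html_url] TEXT, [issue_url] TEXT, [id] INTEGER PRIMARY KEY,
   [node_id] TEXT, [user] INTEGER, [created_at] TEXT, [updated_at] TEXT,
   [author_association] TEXT, [body] TEXT, [reactions] TEXT,
   [performed_via_github_app] TEXT, [issue] INTEGER
);
""")

# One row taken from the first comment listed above.
conn.execute(
    "INSERT INTO issue_comments (id, user, author_association, issue, updated_at)"
    " VALUES (194025629, 4295853, 'CONTRIBUTOR', 138332032, '2016-03-08T23:37:16Z')"
)

# The page's filter: author_association, issue, and user, newest first.
rows = conn.execute(
    "SELECT id FROM issue_comments"
    " WHERE author_association = 'CONTRIBUTOR'"
    "   AND issue = 138332032 AND user = 4295853"
    " ORDER BY updated_at DESC"
).fetchall()
assert rows == [(194025629,)]
```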
Powered by Datasette · About: xarray-datasette