issue_comments

19 rows where issue = 138332032 sorted by updated_at descending



Issue: Array size changes following loading of numpy array (19 comments)
jcrist (NONE) · 2016-03-08T23:41:38Z · https://github.com/pydata/xarray/issues/783#issuecomment-194026664

We're going to try and do a bugfix release shortly. Thanks for reporting the issue!

pwolfram (CONTRIBUTOR) · 2016-03-08T23:37:16Z · https://github.com/pydata/xarray/issues/783#issuecomment-194025629

Issue resolved via https://github.com/dask/dask/issues/1038

pwolfram (CONTRIBUTOR) · 2016-03-08T23:06:11Z, edited 2016-03-08T23:36:42Z · https://github.com/pydata/xarray/issues/783#issuecomment-194009858

@jcrist, @mrocklin, and @shoyer thank you all for the fix. The problem disappeared following a pip -v install git+ssh://git@github.com/jcrist/dask@xr_issue. Do any of you know when this PR will be incorporated in a conda release? I'm ok hacking things on my laptop but typically only try to install release software on clusters to ensure production run quality. Once it is released I can remove my temp fix that sets the chunk size. Thanks again!

jcrist (NONE) · 2016-03-08T22:20:48Z, edited 2016-03-08T22:21:28Z · https://github.com/pydata/xarray/issues/783#issuecomment-193995046

@pwolfram, thanks for the bug report. This unearthed a pretty bad bug in the slicing code of dask. Should be fixed in https://github.com/dask/dask/pull/1038. If you have the chance, can you pull this branch and see if it fixes your problem?

pwolfram (CONTRIBUTOR) · 2016-03-08T16:27:14Z · https://github.com/pydata/xarray/issues/783#issuecomment-193849800

Thank you @shoyer, @mrocklin, and @jcrist for your clutch help with this issue. Also, if it would be helpful for me to submit this issue to dask I'm happy to do that but am assuming it is not necessary.

jcrist (NONE) · 2016-03-08T05:43:16Z · https://github.com/pydata/xarray/issues/783#issuecomment-193618154

I'll look at this tomorrow if you don't beat me to it :)

mrocklin (MEMBER) · 2016-03-08T03:44:36Z · https://github.com/pydata/xarray/issues/783#issuecomment-193591506

Ah ha! Excellent. Thanks @shoyer . I'll give this a shot tomorrow (or perhaps ask @jcrist to look into it if he has time).

shoyer (MEMBER) · 2016-03-08T02:56:28Z, edited 2016-03-08T02:56:49Z · https://github.com/pydata/xarray/issues/783#issuecomment-193576856

As expected, the following all-dask.array example triggers this:

``` python
dates = pd.date_range('2001-01-01', freq='D', periods=1000)
sizes = pd.Series(dates, dates).resample('1M', how='count').values
chunks = (tuple(sizes), (100,))
x = da.ones((3630, 100), chunks=chunks)
assert x[240:270].shape == x[240:270].compute().shape
# AssertionError
```
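The invariant that assertion exercises — the shape implied by the chunk bookkeeping must match the computed shape — can be illustrated without dask. Below, `slice_chunks` is a hypothetical helper, not dask's actual slicing code; it sketches the bookkeeping a correct implementation must perform over uneven chunks:

```python
def slice_chunks(chunks, start, stop):
    """Chunk sizes covering rows [start, stop) of an axis split into `chunks`."""
    out, offset = [], 0
    for size in chunks:
        # Intersect this chunk's interval [offset, offset + size) with the slice.
        lo, hi = max(start, offset), min(stop, offset + size)
        if hi > lo:
            out.append(hi - lo)
        offset += size
    return tuple(out)

# Uneven, month-like chunks along the first axis:
chunks = (31, 28, 31, 30)
result = slice_chunks(chunks, 20, 50)
assert result == (11, 19)
assert sum(result) == 50 - 20  # output chunks must sum to the slice length
```

The bug reported here was, in effect, a violation of that last assertion inside dask's slicing code.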

shoyer (MEMBER) · 2016-03-08T00:36:14Z · https://github.com/pydata/xarray/issues/783#issuecomment-193527245

Something like this might work to generate pathological chunks for dask.array:

dates = pandas.date_range('2000-01-01', freq='D', periods=1000)
sizes = pandas.Series(dates, dates).resample('1M', how='count').values
chunks = (tuple(sizes), (100,))

(I don't have xarray or dask installed on my work computer, but I could check this later)

pwolfram (CONTRIBUTOR) · 2016-03-08T00:22:05Z · https://github.com/pydata/xarray/issues/783#issuecomment-193523226

Thanks @shoyer. You are correct that the files have different sizes, in this case corresponding to days in each month. Also, thank you for the clarification about the distinction between xarray and dask. I clearly don't have the background to debug this quickly myself. I'll follow your and @mrocklin 's lead on where to go from here. Since I have a fix in the short term science will progress. However, in the long term I'd like to help get this resolved, especially since our team here is starting to use xarray more and more.

mrocklin (MEMBER) · 2016-03-08T00:20:43Z · https://github.com/pydata/xarray/issues/783#issuecomment-193522753

@shoyer perhaps you can help to translate the code within @pwolfram 's script (in particular the lines that I've highlighted, where rnum = 7 and Ntr = 30) and say how xarray would use dask.array to accomplish this.

I think this is a case where we each have some necessary expertise to resolve this issue. We probably need to work together to efficiently hunt down what's going on.

shoyer (MEMBER) · 2016-03-08T00:17:00Z · https://github.com/pydata/xarray/issues/783#issuecomment-193521326

If you don't specify a chunksize, xarray should use each file as a full "chunk". So it would probably be useful to know the shapes of each array you are loading with open_mfdataset. My guess is that this issue only arises when indexing arrays consisting of differently sized chunks, which is exactly why using .chunk to set a fixed chunk size resolves this issue.

To be clear, all the logic implementing the chunking and indexing code for xarray objects containing dask arrays lives inside dask.array itself, not in our xarray wrapper (which is pretty thin). This doesn't make this any less of an issue for you, but I'm pretty sure (and I think @mrocklin agrees) that the bug here is probably in the dask.array layer.
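A stdlib-only sketch of the chunk layout described above: with no explicit chunksize, each input file becomes one chunk along the concatenation axis, so files of different lengths yield uneven chunks. The file lengths here are illustrative (month-like), not the reporter's actual data:

```python
file_lengths = [31, 28, 31, 30]           # e.g. days per monthly file
chunks = (tuple(file_lengths), (100,))    # dask-style chunks tuple for axis 0 and axis 1
shape = (sum(file_lengths), 100)          # shape of the concatenated array

assert chunks == ((31, 28, 31, 30), (100,))
assert shape == (120, 100)
assert sum(chunks[0]) == shape[0]         # invariant every chunk layout must satisfy
```

Forcing a fixed chunk size via `.chunk` replaces the uneven first tuple with a uniform one, which is why that workaround sidesteps the bug.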

pwolfram (CONTRIBUTOR) · 2016-03-07T23:59:44Z · https://github.com/pydata/xarray/issues/783#issuecomment-193514285

@shoyer, the problem can be "resolved" by manually specifying the chunk size, e.g., https://gist.github.com/76dccfed2ff8e33b3a2a, specifically line 46: rlzns = rlzns.chunk({'Time':30}). The actual number appears to be unimportant, meaning 1 and 1000 also work.

So, following @mrocklin, I'd intuit that the xarray rechunking algorithm has a bug and (I'm guessing) there may be incompatible or inconsistent chunk sizes for each dask array spawned for each file. Under some condition, the chunk size is getting perturbed by an off-by-one error. It appears that setting the chunk size manually preserves the requested size for small chunk sizes, while large chunk sizes are capped at the maximum chunk size of each dask array.

Is it the design that chunk sizes are automatically changed following the indexing of rlzns.xParticle[rnum*Ntr:(rnum+1)*Ntr,:]?

@shoyer, I'm assuming you or your team could work through this bug quickly but if not, can you please provide me some high-level guidance on how to sort this out? In the short term, I can just set the chunk size manually to be 100 which I will confirm works in my application.
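The workaround described above can be sketched in plain Python: what a uniform rechunk does to the chunk layout is replace the uneven per-file chunks with fixed-size blocks, capped by the total length. `rechunk_uniform` is an illustrative helper, not xarray or dask internals:

```python
def rechunk_uniform(total_len, size):
    """Chunk sizes for splitting `total_len` rows into blocks of at most `size`."""
    return tuple(min(size, total_len - i) for i in range(0, total_len, size))

uneven = (31, 28, 31, 30)        # e.g. one chunk per monthly file
total = sum(uneven)              # 120

assert rechunk_uniform(total, 30) == (30, 30, 30, 30)
# Consistent with the report that the exact number is unimportant
# (1 and 1000 also work): any uniform layout avoids the bug.
assert rechunk_uniform(total, 1000) == (120,)
```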

mrocklin (MEMBER) · 2016-03-07T23:22:46Z · https://github.com/pydata/xarray/issues/783#issuecomment-193501447

It looks like the issue is in these lines:

(Pdb) pp rlzns.xParticle.data
dask.array<getitem..., shape=(3630, 100), dtype=float64, chunksize=(21, 100)>
(Pdb) pp rlzns.xParticle[rnum*Ntr:(rnum+1)*Ntr,:].data
dask.array<getitem..., shape=(30, 100), dtype=float64, chunksize=(23, 100)>
(Pdb) pp rlzns.xParticle[rnum*Ntr:(rnum+1)*Ntr,:].data.compute().shape
(29, 100)

I'm confused by the chunksize change from 21 to 23.

In straight dask.array I'm unable to reproduce this problem, although obviously I'm doing something differently here than how xarray does things.

``` python
In [1]: import dask.array as da

In [2]: x = da.ones((3630, 100), chunks=(21, 100))

In [3]: y = x[730:830, :]

In [4]: y.shape
Out[4]: (30, 100)

In [5]: y.compute().shape
Out[5]: (30, 100)

In [6]: y.chunks
Out[6]: ((21, 9), (100,))
```

It would be awesome if you all could produce a failing example with just dask.array.

pwolfram (CONTRIBUTOR) · 2016-03-07T15:53:55Z · https://github.com/pydata/xarray/issues/783#issuecomment-193312146

Here is a further simplified script to help with the start of debugging (to avoid using my custom preprocessing wrapper): https://gist.github.com/76dccfed2ff8e33b3a2a

pwolfram (CONTRIBUTOR) · 2016-03-07T15:46:59Z · https://github.com/pydata/xarray/issues/783#issuecomment-193306605

I should also note I've tested this on Linux on our cluster as well as on OS X, so the problem should be platform independent (I haven't tested on Windows, though).

pwolfram (CONTRIBUTOR) · 2016-03-07T15:39:44Z · https://github.com/pydata/xarray/issues/783#issuecomment-193303887

Thanks @shoyer and @mrocklin. You can find a small, reduced demonstration of the problem at https://www.dropbox.com/s/l31jiol5t08lj0s/test_dask.tgz?dl=0. The data is produced via our in-situ Lagrangian particle tracking, i.e., http://journals.ametsoc.org/doi/abs/10.1175/JPO-D-14-0260.1, and then I'm "clipping" the output files via

for i in lagrPartTrack.00*; do ncks -F -d nParticles,1,100,1 -v numTimesReset,xParticle,xtime,buoyancySurfaceValues $i clipped_$i ; done

Once you have downloaded the tgz file in a new directory, run:

tar xzvf test_dask.tgz ; python test_dask_error.py -f 'clipped_lagrPartTrack.00*nc'

which should give you an error of

Traceback (most recent call last):
  File "test_dask_error.py", line 107, in <module>
    aggregate_positions(options.particlefile, options.nrealization, options.rlznrange)
  File "test_dask_error.py", line 70, in aggregate_positions
    str(rlzns.xParticle[rnum*Ntr:(rnum+1)*Ntr,:].values.shape)
AssertionError: for rnum=7 rlzns.xParticle[rnum*Ntr:(rnum+1)*Ntr,:].shape = (30, 100) and rlzns.xParticle[rnum*Ntr:(rnum+1)*Ntr,:].values.shape = (29, 100)

to demonstrate the problem. Assuming the script finishes cleanly (without the assert triggering on an error) I'd say the problem should be resolved for my uses.

Note, this also occurs with dask 0.8.0, and I'm using the latest xarray 0.7.1.

shoyer (MEMBER) · 2016-03-04T01:28:23Z · https://github.com/pydata/xarray/issues/783#issuecomment-192048274

This does look very strange. I'm guessing it's a dask.array bug (cc @mrocklin).

Can you make a reproducible example? If so, we'll probably be able to figure this out. How do you make this data?

Tracking this sort of thing down is a good motivation for an eager-evaluation mode in dask.array... (https://github.com/dask/dask/issues/292)

pwolfram (CONTRIBUTOR) · 2016-03-03T23:51:11Z · https://github.com/pydata/xarray/issues/783#issuecomment-192025306

This is for the following versions:

dask     0.7.5    py27_0
xarray   0.7.1    py27_0


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);