github: issue_comments: 24 rows where issue = 277538485 sorted by updated

24 rows where issue = 277538485 sorted by updated_at descending

id	html_url	issue_url	node_id	user	created_at	updated_at	author_association	body	reactions	issue
356390513	https://github.com/pydata/xarray/issues/1745#issuecomment-356390513	https://api.github.com/repos/pydata/xarray/issues/1745	MDEyOklzc3VlQ29tbWVudDM1NjM5MDUxMw==	shoyer 1217238	2018-01-09T19:36:10Z	2018-01-09T19:36:10Z	MEMBER	Both the warning message and the upstream anaconda issue seem like good ideas to me.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset() memory error in v0.10 277538485
356255726	https://github.com/pydata/xarray/issues/1745#issuecomment-356255726	https://api.github.com/repos/pydata/xarray/issues/1745	MDEyOklzc3VlQ29tbWVudDM1NjI1NTcyNg==	braaannigan 10512793	2018-01-09T11:17:12Z	2018-01-09T11:17:12Z	CONTRIBUTOR	Hi @shoyer Updating netcdf4 to version 1.3.1 solves the problem. I'm trying to think what the potential solutions are. Essentially, we would need to modify the function ds.filepath(). However, this isn't possible inside xarray. Is there anything we can do other than add a warning message with the recommendation to upgrade netcdf4 when the file path has 88 characters and netcdf4 is version 1.2.4? Should we also submit an issue to anaconda to get the default package updates? Happy to prepare these if you think it's the best way to proceed. Liam	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset() memory error in v0.10 277538485
352152392	https://github.com/pydata/xarray/issues/1745#issuecomment-352152392	https://api.github.com/repos/pydata/xarray/issues/1745	MDEyOklzc3VlQ29tbWVudDM1MjE1MjM5Mg==	shoyer 1217238	2017-12-16T01:58:02Z	2017-12-16T01:58:02Z	MEMBER	If upgrading to a newer version of netcdf4-python isn't an option, we might need to figure out a workaround for xarray.... It seems that anaconda is still distributing netCDF4 1.2.4, which doesn't help here.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset() memory error in v0.10 277538485
351944694	https://github.com/pydata/xarray/issues/1745#issuecomment-351944694	https://api.github.com/repos/pydata/xarray/issues/1745	MDEyOklzc3VlQ29tbWVudDM1MTk0NDY5NA==	braaannigan 10512793	2017-12-15T08:30:57Z	2017-12-15T08:30:57Z	CONTRIBUTOR	Hi @shoyer I've tried this print(ds.filepath()) suggestion and it reproduces when I use the full length file path which has 88 characters. Again, the segfault doesn't arise if I add or subtract a character to the file path (after copying the underlying file to a new name). This dependence on 88 characters is consistent with the bug here: https://github.com/Unidata/netcdf4-python/issues/585	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset() memory error in v0.10 277538485
351788352	https://github.com/pydata/xarray/issues/1745#issuecomment-351788352	https://api.github.com/repos/pydata/xarray/issues/1745	MDEyOklzc3VlQ29tbWVudDM1MTc4ODM1Mg==	shoyer 1217238	2017-12-14T17:58:05Z	2017-12-14T17:58:05Z	MEMBER	Can you reproduce this just using netCDF4-python? Try: ``` import netCDF4 ds = netCDF4.Dataset(path) print(ds) print(ds.filepath()) ``` If so, it would be good to file a bug upstream. Actually, it looks like this might be https://github.com/Unidata/netcdf4-python/issues/506	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset() memory error in v0.10 277538485
351786510	https://github.com/pydata/xarray/issues/1745#issuecomment-351786510	https://api.github.com/repos/pydata/xarray/issues/1745	MDEyOklzc3VlQ29tbWVudDM1MTc4NjUxMA==	braaannigan 10512793	2017-12-14T17:51:11Z	2017-12-14T17:51:11Z	CONTRIBUTOR	Interesting. I've tried to look at this a bit more by running the following in netCDF4_.py: `self._filename = self.ds.filepath() print(self.ds) self.is_remote = is_remote_uri(self._filename)` So, all I did was add a print statement `print(self.ds)`. In this case the open_dataset call worked fine.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset() memory error in v0.10 277538485
351783850	https://github.com/pydata/xarray/issues/1745#issuecomment-351783850	https://api.github.com/repos/pydata/xarray/issues/1745	MDEyOklzc3VlQ29tbWVudDM1MTc4Mzg1MA==	shoyer 1217238	2017-12-14T17:41:05Z	2017-12-14T17:41:11Z	MEMBER	I think there is probably a bug buried inside the `netCDF4.Dataset.filepath()` method somewhere. For example, on netCDF4-python 1.2.4, this would crash if you have any non-ASCII characters in the path. But that doesn't seem to be the issue here.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset() memory error in v0.10 277538485
351782302	https://github.com/pydata/xarray/issues/1745#issuecomment-351782302	https://api.github.com/repos/pydata/xarray/issues/1745	MDEyOklzc3VlQ29tbWVudDM1MTc4MjMwMg==	braaannigan 10512793	2017-12-14T17:35:24Z	2017-12-14T17:35:24Z	CONTRIBUTOR	I've also now tried out the re.match approach you suggest above, but it generates the same core dump as the re.search('^...') approach	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset() memory error in v0.10 277538485
351781405	https://github.com/pydata/xarray/issues/1745#issuecomment-351781405	https://api.github.com/repos/pydata/xarray/issues/1745	MDEyOklzc3VlQ29tbWVudDM1MTc4MTQwNQ==	braaannigan 10512793	2017-12-14T17:32:04Z	2017-12-14T17:32:04Z	CONTRIBUTOR	With print(repr(path)) I get: 'grid.nc' '/path/verification/cabl/y2d/mnc_test_0008_1day_restoring/grid.nc' '/path/verification/cabl/y2d/mnc_test_0008_1day_restoring/grid.nc' where I've changed the first part of the filename to "/path/"	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset() memory error in v0.10 277538485
351780487	https://github.com/pydata/xarray/issues/1745#issuecomment-351780487	https://api.github.com/repos/pydata/xarray/issues/1745	MDEyOklzc3VlQ29tbWVudDM1MTc4MDQ4Nw==	shoyer 1217238	2017-12-14T17:28:37Z	2017-12-14T17:28:37Z	MEMBER	@braaannigan can you try adding `print(repr(path))` to `is_remote_uri()` so we can see exactly what these offending strings look like?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset() memory error in v0.10 277538485
351779445	https://github.com/pydata/xarray/issues/1745#issuecomment-351779445	https://api.github.com/repos/pydata/xarray/issues/1745	MDEyOklzc3VlQ29tbWVudDM1MTc3OTQ0NQ==	shoyer 1217238	2017-12-14T17:24:40Z	2017-12-14T17:24:40Z	MEMBER	`re.match(pattern, string)` is equivalent to `re.search('^' + pattern, string)`, so arguably this is a cleaner solution anyways. But ideally I'd like to understand why this is a problem for you, so we can fix the underlying cause and not do it again.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset() memory error in v0.10 277538485
351774833	https://github.com/pydata/xarray/issues/1745#issuecomment-351774833	https://api.github.com/repos/pydata/xarray/issues/1745	MDEyOklzc3VlQ29tbWVudDM1MTc3NDgzMw==	braaannigan 10512793	2017-12-14T17:07:49Z	2017-12-14T17:07:49Z	CONTRIBUTOR	If the ^ isn't strictly necessary I'm happy to put together a PR with it removed.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset() memory error in v0.10 277538485
351768893	https://github.com/pydata/xarray/issues/1745#issuecomment-351768893	https://api.github.com/repos/pydata/xarray/issues/1745	MDEyOklzc3VlQ29tbWVudDM1MTc2ODg5Mw==	braaannigan 10512793	2017-12-14T16:50:43Z	2017-12-14T16:50:43Z	CONTRIBUTOR	Hi @shoyer The crash does not occur when the ^ is removed. When I run `python -c 'import sys; print(sys.getfilesystemencoding())'` The output is: utf-8 The file loads with the scipy engine. I get a module import error with h5netcdf, even though `conda list` shows that I have version 0.5 installed. `xr.show_versions()` gives: INSTALLED VERSIONS commit: None python: 3.6.2.final.0 python-bits: 64 OS: Linux OS-release: 4.4.0-101-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 xarray: 0.10.0 pandas: 0.21.0 numpy: 1.13.1 scipy: 0.19.1 netCDF4: 1.2.4 h5netcdf: None Nio: None bottleneck: 1.2.1 cyordereddict: None dask: 0.16.0 matplotlib: 2.0.2 cartopy: None seaborn: 0.7.1 setuptools: 36.7.1 pip: 9.0.1 conda: None pytest: 3.2.1 IPython: 6.2.1 sphinx: None	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset() memory error in v0.10 277538485
351765967	https://github.com/pydata/xarray/issues/1745#issuecomment-351765967	https://api.github.com/repos/pydata/xarray/issues/1745	MDEyOklzc3VlQ29tbWVudDM1MTc2NTk2Nw==	shoyer 1217238	2017-12-14T16:41:19Z	2017-12-14T16:41:19Z	MEMBER	@braaannigan what about replacing `re.search('^https?\://', path)` with `re.match('https?\://', path)`? Can you share the output of running `python -c 'import sys; print(sys.getfilesystemencoding())'` at the command line? Also, please try `engine='scipy'` or `engine='h5netcdf'` with `open_dataset`. The output of `xarray.show_versions()` would also be helpful.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset() memory error in v0.10 277538485
351733817	https://github.com/pydata/xarray/issues/1745#issuecomment-351733817	https://api.github.com/repos/pydata/xarray/issues/1745	MDEyOklzc3VlQ29tbWVudDM1MTczMzgxNw==	braaannigan 10512793	2017-12-14T14:56:04Z	2017-12-14T16:36:31Z	CONTRIBUTOR	There is also some filename dependence. The file load works for g.nc, gr.nc, gri.nc and then fails for grid.nc. The file load also works for grida.nc	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset() memory error in v0.10 277538485
351732553	https://github.com/pydata/xarray/issues/1745#issuecomment-351732553	https://api.github.com/repos/pydata/xarray/issues/1745	MDEyOklzc3VlQ29tbWVudDM1MTczMjU1Mw==	braaannigan 10512793	2017-12-14T14:51:36Z	2017-12-14T14:51:36Z	CONTRIBUTOR	Hi @shoyer, thanks for getting back to me. That hasn't worked unfortunately. The only difference including the with LOCK statement makes is that the file load seems to work, but then the core dump happens when you try to access the object, e.g. with the `ds` line below: `import xarray as xr ds = xr.open_dataset('grid.nc') ds` As above, removing the `^` avoids the crash when the with LOCK statement is used.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset() memory error in v0.10 277538485
351470450	https://github.com/pydata/xarray/issues/1745#issuecomment-351470450	https://api.github.com/repos/pydata/xarray/issues/1745	MDEyOklzc3VlQ29tbWVudDM1MTQ3MDQ1MA==	shoyer 1217238	2017-12-13T17:54:54Z	2017-12-13T17:54:54Z	MEMBER	@braaannigan Can you share the name of your problematic file? One possibility is that `re.search()` is not thread-safe, even though I don't think we call `is_remote_uri` from multiple threads. We can test that by adding a lock, and seeing if that resolves the issue. Try replacing `is_remote_uri` with: ```python import threading LOCK = threading.Lock() def is_remote_uri(path): with LOCK: return bool(re.search('^https?\://', path)) ```	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset() memory error in v0.10 277538485
351345428	https://github.com/pydata/xarray/issues/1745#issuecomment-351345428	https://api.github.com/repos/pydata/xarray/issues/1745	MDEyOklzc3VlQ29tbWVudDM1MTM0NTQyOA==	braaannigan 10512793	2017-12-13T10:11:50Z	2017-12-13T10:11:50Z	CONTRIBUTOR	I've played around with it a bit more. It seems like it's the ^ character in the re.search term that's causing the issue. If this is removed and the function is simply: `def is_remote_uri(path): return bool(re.search('https?\://', path))` then I can load the file.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset() memory error in v0.10 277538485
351339156	https://github.com/pydata/xarray/issues/1745#issuecomment-351339156	https://api.github.com/repos/pydata/xarray/issues/1745	MDEyOklzc3VlQ29tbWVudDM1MTMzOTE1Ng==	braaannigan 10512793	2017-12-13T09:49:13Z	2017-12-13T09:49:13Z	CONTRIBUTOR	I'm getting a similar error. The file size is very small (Kbs), so I don't think it's the size issue above. Instead, the error I get is due to something strange happening in core.utils.is_remote_uri(path). The error occurs when I'm reading netcdf3 files with the default netcdf4 engine (which should be able to handle netcdf3 of course). There is a workaround in that I can use the scipy reader to read netcdf3 files with no problems. Note that whenever I refer to "error" below it means the error that gives the following output rather than a python exception. The error message is: * Error in `/path/anaconda2/envs/base3/bin/python': corrupted size vs. prev_size: 0x0000000001814930 * Aborted (core dumped) The function where the problem arises is: `def is_remote_uri(path): return bool(re.search('^https?\://', path))` The function is called a few times during the open_dataset (or open_mfdataset, I get the same error). On the third or fourth call it triggers the error. As I'm not using remote datasets, I can hard-code the output of the function to be `return False` and then the file reads with no problems. The `is_remote_uri(path)` call is made a few times. However, it's only on line 233 of netCDF4_.py with `is_remote_uri(self._filename)` that the error is triggered. I've output the argument to the `is_remote_uri()` function for each time it's called. In the first call the argument is the filename, in the second call the argument is the filename with the absolute path and in the third (and fatal) call the argument is also the filename with the absolute path. I can't see any difference between the arguments to the function on the second and third call. 
When I copy them, assign them to variables and check equality in python it evaluates to True. I've added in a simpler call to `re.search` in the function: `def is_remote_uri(path): print((re.search('.nc','.nc'))) return bool(re.search('^https?\://', path))` This also triggers the error on the third call to the function. As such we can rule out something to do with the path name. I've played around with the `print((re.search('.nc','.nc')))` line that I've added in. It only triggers an error on the third call when the first argument of re.search has a dot in the string, so `re.search('.nc','.nc')` causes the error, but `re.search('nc','.nc')` doesn't. The error isn't dependent on .nc in any way, '.AAA' in the arguments will cause the same error. The error doesn't replicate if I simply import re in ipython. The error does not occur in xarray 0.9.6. The same function is called in a similar way and the function evaluates to False each time. I'm not really sure what to do next, though. The obvious workaround is to set engine='scipy' if you're working with netcdf3 files. Can anyone replicate this error?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset() memory error in v0.10 277538485
347946641	https://github.com/pydata/xarray/issues/1745#issuecomment-347946641	https://api.github.com/repos/pydata/xarray/issues/1745	MDEyOklzc3VlQ29tbWVudDM0Nzk0NjY0MQ==	nick-weber 22665917	2017-11-29T18:09:39Z	2017-11-29T18:09:39Z	NONE	Thank you for the responses. Turns out my eyeball estimation of the dropped/kept variables was way off. My `dropvars` list is actually 88 variables and I am keeping 58 variables. Most of these have dimensions (time, y, x) and many are full-dimensional (time, z, y, x). The size of one netcdf file (which only contains one time step) is ~335 MB. You can look at one of these files here. It's a hefty dataset overall.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset() memory error in v0.10 277538485
347856861	https://github.com/pydata/xarray/issues/1745#issuecomment-347856861	https://api.github.com/repos/pydata/xarray/issues/1745	MDEyOklzc3VlQ29tbWVudDM0Nzg1Njg2MQ==	crusaderky 6213168	2017-11-29T13:15:29Z	2017-11-29T13:15:29Z	MEMBER	Only if the coords are tridimensional..	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset() memory error in v0.10 277538485
347819491	https://github.com/pydata/xarray/issues/1745#issuecomment-347819491	https://api.github.com/repos/pydata/xarray/issues/1745	MDEyOklzc3VlQ29tbWVudDM0NzgxOTQ5MQ==	shoyer 1217238	2017-11-29T10:34:25Z	2017-11-29T10:34:25Z	MEMBER	`(405*282*37)*20*8` bytes = 676 MB, so running out of memory here seems plausible to me.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset() memory error in v0.10 277538485
347815737	https://github.com/pydata/xarray/issues/1745#issuecomment-347815737	https://api.github.com/repos/pydata/xarray/issues/1745	MDEyOklzc3VlQ29tbWVudDM0NzgxNTczNw==	crusaderky 6213168	2017-11-29T10:19:52Z	2017-11-29T10:33:15Z	MEMBER	It sounds weird. Even if all the 20 variables he's dropping were coords on the longest dim, and the code was loading them up into memory and then dropping them (that would be wrong - but I didn't check the code yet to verify if that's the case), then we're talking about... `4052073=~690k` points? That's about 5mb of RAM if they're float64? @njweber2 how large are these files? Is it feasible to upload them somewhere? If not, could you write a script that generates equivalent dummy data and reproduce the problem with that?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset() memory error in v0.10 277538485
347811473	https://github.com/pydata/xarray/issues/1745#issuecomment-347811473	https://api.github.com/repos/pydata/xarray/issues/1745	MDEyOklzc3VlQ29tbWVudDM0NzgxMTQ3Mw==	shoyer 1217238	2017-11-29T10:03:51Z	2017-11-29T10:03:51Z	MEMBER	I think this was introduced by https://github.com/pydata/xarray/pull/1551, where we started loading coordinates that are compared for equality into memory. This speeds up `open_mfdataset`, but does increase memory usage. We might consider adding an option for reduced memory usage at the price of speed. @crusaderky @jhamman @rabernat any thoughts?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	open_mfdataset() memory error in v0.10 277538485
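
The comments above discuss replacing `re.search('^https?\://', path)` with `re.match('https?\://', path)` inside `is_remote_uri()`. The following is a minimal sketch of that comparison, not xarray's actual code: the two function names and the sample URL are made up for illustration, and the pattern drops the backslash before the colon because `:` needs no escaping. It only shows that the two forms agree on ordinary inputs. It does not reproduce the crash, which the thread traces to `netCDF4.Dataset.filepath()` in netCDF4 1.2.4.

```python
import re


def is_remote_uri_search(path):
    # Form quoted in the thread: explicit ^ anchor with re.search.
    return bool(re.search(r"^https?://", path))


def is_remote_uri_match(path):
    # Form suggested in the thread: re.match only matches at the start
    # of the string, so the ^ anchor is not needed.
    return bool(re.match(r"https?://", path))


# Sample inputs: the local path is the one reported in the thread,
# the URL is a hypothetical placeholder.
samples = [
    "grid.nc",
    "/path/verification/cabl/y2d/mnc_test_0008_1day_restoring/grid.nc",
    "http://example.com/data/grid.nc",
    "https://example.com/data/grid.nc",
    "cache/http://example.com/grid.nc",  # scheme not at the start
]

results = [(is_remote_uri_search(p), is_remote_uri_match(p)) for p in samples]
for path, (by_search, by_match) in zip(samples, results):
    print(f"{path!r}: search={by_search} match={by_match}")
```

Both functions return the same value for every sample, which matches the statement in the thread that `re.match(pattern, string)` is equivalent to `re.search('^' + pattern, string)` when no flags are passed.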

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
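
As a rough illustration of how the listing above could be reproduced from this schema, the sketch below builds the table in an in-memory SQLite database and runs the "rows where issue = 277538485 sorted by updated_at descending" query. The in-memory database and the two sample rows (copied from the first two comments in the table, with shortened bodies) are assumptions made for the example. The real database file and its `users` and `issues` tables are not shown in this page.

```python
import sqlite3

# In-memory stand-in for the real database (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE [issue_comments] (
       [html_url] TEXT,
       [issue_url] TEXT,
       [id] INTEGER PRIMARY KEY,
       [node_id] TEXT,
       [user] INTEGER REFERENCES [users]([id]),
       [created_at] TEXT,
       [updated_at] TEXT,
       [author_association] TEXT,
       [body] TEXT,
       [reactions] TEXT,
       [performed_via_github_app] TEXT,
       [issue] INTEGER REFERENCES [issues]([id])
    );
    CREATE INDEX [idx_issue_comments_issue]
        ON [issue_comments] ([issue]);
    CREATE INDEX [idx_issue_comments_user]
        ON [issue_comments] ([user]);
    """
)

# Two sample rows taken from the table above; bodies are abbreviated.
sample_rows = [
    (356255726, 10512793, "2018-01-09T11:17:12Z", "2018-01-09T11:17:12Z",
     "CONTRIBUTOR", "Updating netcdf4 to version 1.3.1 solves the problem.", 277538485),
    (356390513, 1217238, "2018-01-09T19:36:10Z", "2018-01-09T19:36:10Z",
     "MEMBER", "Both the warning message and the upstream anaconda issue seem like good ideas to me.", 277538485),
]
conn.executemany(
    "INSERT INTO issue_comments "
    "(id, user, created_at, updated_at, author_association, body, issue) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)",
    sample_rows,
)

# ISO 8601 timestamps stored as TEXT sort chronologically as plain strings.
rows = conn.execute(
    "SELECT id, user, updated_at FROM issue_comments "
    "WHERE issue = ? ORDER BY updated_at DESC",
    (277538485,),
).fetchall()

for row in rows:
    print(row)
```

The `WHERE issue = ?` filter can use `idx_issue_comments_issue`. The sort on `updated_at` has no index in this schema, so SQLite would sort the matching rows after filtering.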
