
issue_comments

11 rows where author_association = "NONE" and issue = 94328498 sorted by updated_at descending

Columns: id, html_url, issue_url, node_id, user, created_at, updated_at (sorted descending), author_association, body, reactions, performed_via_github_app, issue
347165242 https://github.com/pydata/xarray/issues/463#issuecomment-347165242 https://api.github.com/repos/pydata/xarray/issues/463 MDEyOklzc3VlQ29tbWVudDM0NzE2NTI0Mg== sebhahn 5929935 2017-11-27T12:17:17Z 2017-11-27T12:17:17Z NONE

Thanks, I'll test it!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset too many files 94328498
347140117 https://github.com/pydata/xarray/issues/463#issuecomment-347140117 https://api.github.com/repos/pydata/xarray/issues/463 MDEyOklzc3VlQ29tbWVudDM0NzE0MDExNw== sebhahn 5929935 2017-11-27T10:26:56Z 2017-11-27T10:26:56Z NONE

OK, I found my problem: I had to increase `ulimit -n`.
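
For reference, the same limit can be checked and raised from Python with the standard-library `resource` module. This is only a minimal sketch, assuming a Unix-like system: `RLIMIT_NOFILE` is the limit that `ulimit -n` reports, the target value of 8192 is arbitrary, and the soft limit can normally only be raised up to the hard limit without elevated privileges.

``` python
# Sketch: inspect and raise the per-process open-file limit (Linux/macOS).
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("open-file limit: soft=%d, hard=%d" % (soft, hard))

# Raise the soft limit (here to 8192, or to the hard limit if that is lower)
# before opening thousands of files.
new_soft = 8192 if hard == resource.RLIM_INFINITY else min(8192, hard)
resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
```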

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset too many files 94328498
347126256 https://github.com/pydata/xarray/issues/463#issuecomment-347126256 https://api.github.com/repos/pydata/xarray/issues/463 MDEyOklzc3VlQ29tbWVudDM0NzEyNjI1Ng== sebhahn 5929935 2017-11-27T09:33:29Z 2017-11-27T09:33:29Z NONE

@shoyer I just ran into this issue again (with 8000 files, each 50 kB). I'm using xarray 0.9.6 and working on some performance tests. Is there an upper limit on the number of files?

File "/home/shahn/.pyenv/versions/warp_conda/envs/pyraster_env/lib/python2.7/site-packages/xarray/backends/api.py", line 505, in open_mfdataset File "/home/shahn/.pyenv/versions/warp_conda/envs/pyraster_env/lib/python2.7/site-packages/xarray/backends/api.py", line 282, in open_dataset File "/home/shahn/.pyenv/versions/warp_conda/envs/pyraster_env/lib/python2.7/site-packages/xarray/backends/netCDF4_.py", line 210, in __init__ File "/home/shahn/.pyenv/versions/warp_conda/envs/pyraster_env/lib/python2.7/site-packages/xarray/backends/netCDF4_.py", line 185, in _open_netcdf4_group File "netCDF4/_netCDF4.pyx", line 1811, in netCDF4._netCDF4.Dataset.__init__ (netCDF4/_netCDF4.c:13231) IOError: Too many open files

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset too many files 94328498
288868053 https://github.com/pydata/xarray/issues/463#issuecomment-288868053 https://api.github.com/repos/pydata/xarray/issues/463 MDEyOklzc3VlQ29tbWVudDI4ODg2ODA1Mw== ajoros 2615433 2017-03-23T21:37:19Z 2017-03-23T21:37:19Z NONE

Yessir @pwolfram, we are in business!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset too many files 94328498
288835940 https://github.com/pydata/xarray/issues/463#issuecomment-288835940 https://api.github.com/repos/pydata/xarray/issues/463 MDEyOklzc3VlQ29tbWVudDI4ODgzNTk0MA== ajoros 2615433 2017-03-23T19:34:33Z 2017-03-23T19:34:33Z NONE

Thanks @pwolfram ... shot you a follow-up email at your Gmail...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset too many files 94328498
288829145 https://github.com/pydata/xarray/issues/463#issuecomment-288829145 https://api.github.com/repos/pydata/xarray/issues/463 MDEyOklzc3VlQ29tbWVudDI4ODgyOTE0NQ== ajoros 2615433 2017-03-23T19:08:37Z 2017-03-23T19:08:37Z NONE

Not sure whether this is useful feedback, but I just wanted to provide an additional problematic case from my end that triggers this "too many files" problem:

NOTE: I have the latest xarray package. I have about 365 NetCDF files of roughly 1.7 MB each that I am trying to read with open_mfdataset(), and it continuously gives me the "too many files" error and completely hangs Jupyter notebooks to the point where I have to Ctrl+C out of it. Note that each NetCDF file contains a Dataset that is 195x195x1. Obviously it's not a file-size issue, as I'm not dealing with multiple gigs worth of data. Should I increase the OSX open max file limit, or will that not solve anything in my case?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset too many files 94328498
224049602 https://github.com/pydata/xarray/issues/463#issuecomment-224049602 https://api.github.com/repos/pydata/xarray/issues/463 MDEyOklzc3VlQ29tbWVudDIyNDA0OTYwMg== darothen 4992424 2016-06-06T18:42:06Z 2016-06-06T18:42:06Z NONE

@mangecoeur, although it's not an xarray-based solution, I've found that by far the best solution to this problem is to transform your dataset from the "timeslice" format (which is convenient for models to write out - all the data at a given point in time, often in separate files for each time step) to "timeseries" format - a continuous format, where you have all the data for a single variable in a single (or much smaller collection of) files.

NCAR published a great utility for converting batches of NetCDF output from timeslice to timeseries format here; it's significantly faster than any shell-script/CDO/NCO solution I've ever encountered, and it parallelizes extremely easily.

Adding a simple post-processing step to convert my simulation output to timeseries format dramatically reduced my overall work time. Before, I had a separate handler which re-implemented open_mfdataset(), performed an intermediate reduction (usually extracting a variable), and then concatenated within xarray. This could get around the open file limit, but it wasn't fast. My pre-processed data is often still big - barely fitting within memory - but it's far easier to handle, and you can throw dask at it no problem to get huge speedups in analysis.
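
A minimal sketch of that kind of handler, under assumed names (the glob pattern and the variable name are placeholders): open one timeslice file at a time, extract the variable of interest, close the file, and concatenate at the end, so only a single file is ever open.

``` python
# Sketch of the "open, extract, close, concatenate" workaround described above.
# "output/timeslice_*.nc" and "temperature" are placeholder names.
import glob

import xarray as xr

paths = sorted(glob.glob("output/timeslice_*.nc"))
pieces = []
for path in paths:
    with xr.open_dataset(path) as ds:
        # .load() reads the values into memory so the file handle can be closed.
        pieces.append(ds["temperature"].load())

combined = xr.concat(pieces, dim="time")
```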

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset too many files 94328498
143373357 https://github.com/pydata/xarray/issues/463#issuecomment-143373357 https://api.github.com/repos/pydata/xarray/issues/463 MDEyOklzc3VlQ29tbWVudDE0MzM3MzM1Nw== cpaulik 380927 2015-09-25T23:11:39Z 2015-09-25T23:11:39Z NONE

OK, I'll try. Thanks.

But I originally tested whether netCDF4 can work with a closed/reopened variable, like this:

``` python
In [1]: import netCDF4

In [2]: a = netCDF4.Dataset("temp.nc", mode="w")

In [3]: a.createDimension("lon")
Out[3]: <class 'netCDF4._netCDF4.Dimension'> (unlimited): name = 'lon', size = 0

In [4]: a.createVariable("lon", "f8", dimensions=("lon"))
Out[4]:
<class 'netCDF4._netCDF4.Variable'>
float64 lon(lon)
unlimited dimensions: lon
current shape = (0,)
filling on, default _FillValue of 9.969209968386869e+36 used

In [5]: v = a.variables['lon']

In [6]: v
Out[6]:
<class 'netCDF4._netCDF4.Variable'>
float64 lon(lon)
unlimited dimensions: lon
current shape = (0,)
filling on, default _FillValue of 9.969209968386869e+36 used

In [7]: a.close()

In [8]: v
Out[8]:
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/home/cp/.pyenv/versions/miniconda3-3.16.0/envs/xray-3.5.0/lib/python3.5/site-packages/IPython/core/formatters.py in __call__(self, obj)
    695                 type_pprinters=self.type_printers,
    696                 deferred_pprinters=self.deferred_printers)
--> 697             printer.pretty(obj)
    698             printer.flush()
    699             return stream.getvalue()

/home/cp/.pyenv/versions/miniconda3-3.16.0/envs/xray-3.5.0/lib/python3.5/site-packages/IPython/lib/pretty.py in pretty(self, obj)
    381                 if callable(meth):
    382                     return meth(obj, self, cycle)
--> 383             return _default_pprint(obj, self, cycle)
    384         finally:
    385             self.end_group()

/home/cp/.pyenv/versions/miniconda3-3.16.0/envs/xray-3.5.0/lib/python3.5/site-packages/IPython/lib/pretty.py in _default_pprint(obj, p, cycle)
    501     if _safe_getattr(klass, '__repr__', None) not in _baseclass_reprs:
    502         # A user-provided repr. Find newlines and replace them with p.break()
--> 503         _repr_pprint(obj, p, cycle)
    504         return
    505     p.begin_group(1, '<')

/home/cp/.pyenv/versions/miniconda3-3.16.0/envs/xray-3.5.0/lib/python3.5/site-packages/IPython/lib/pretty.py in _repr_pprint(obj, p, cycle)
    683     """A pprint that just redirects to the normal repr function."""
    684     # Find newlines and replace them with p.break()
--> 685     output = repr(obj)
    686     for idx,output_line in enumerate(output.splitlines()):
    687         if idx:

netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Variable.__repr__ (netCDF4/_netCDF4.c:25045)()

netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Variable.__unicode__ (netCDF4/_netCDF4.c:25243)()

netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Variable.dimensions.__get__ (netCDF4/_netCDF4.c:27486)()

netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Variable._getdims (netCDF4/_netCDF4.c:26297)()

RuntimeError: NetCDF: Not a valid ID

In [9]: a = netCDF4.Dataset("temp.nc")

In [10]: v
Out[10]:
<class 'netCDF4._netCDF4.Variable'>
float64 lon(lon)
unlimited dimensions: lon
current shape = (0,)
filling on, default _FillValue of 9.969209968386869e+36 used
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset too many files 94328498
143338384 https://github.com/pydata/xarray/issues/463#issuecomment-143338384 https://api.github.com/repos/pydata/xarray/issues/463 MDEyOklzc3VlQ29tbWVudDE0MzMzODM4NA== cpaulik 380927 2015-09-25T20:02:42Z 2015-09-25T20:02:42Z NONE

I only put the try/except there to conditionally set the breakpoint. How does it make a difference whether self.store.close is called? If it is not called, the dataset remains open, which should not cause the weird behaviour reported above, should it?

Nevertheless, I have updated my branch to use a context manager because it is a better solution, but I still see the strange behaviour where merely printing the variable alters the test outcome.
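
The context-manager pattern referred to here is roughly the following; this is an illustrative sketch only, not the code from the branch, and the helper name `opened` is made up.

``` python
# Illustrative sketch: open the underlying netCDF file for the duration of an
# operation and guarantee it is closed afterwards, whatever happens.
import contextlib

import netCDF4


@contextlib.contextmanager
def opened(filename):
    ds = netCDF4.Dataset(filename, mode="r")
    try:
        yield ds
    finally:
        ds.close()


# Usage: the file is only held open inside the with-block.
# with opened("temp.nc") as ds:
#     lon = ds.variables["lon"][:]
```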

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset too many files 94328498
143222580 https://github.com/pydata/xarray/issues/463#issuecomment-143222580 https://api.github.com/repos/pydata/xarray/issues/463 MDEyOklzc3VlQ29tbWVudDE0MzIyMjU4MA== cpaulik 380927 2015-09-25T13:27:59Z 2015-09-25T13:27:59Z NONE

I've pushed a few commits trying this out to https://github.com/cpaulik/xray/tree/closing_netcdf_backend . I can open a WIP PR if this would be easier to discuss there.

There are, however, a few tests that keep failing, and I cannot figure out why.

e.g.: test_backends.py::NetCDF4ViaDaskDataTest::test_compression_encoding:

If I set a breakpoint at line 941 of dataset.py and just continue, the test fails.

If, however, I evaluate self.variables.items() or even self.variables at the breakpoint, I get the correct output and the test passes when continued. I cannot really see the difference between evaluating this in ipdb and the code that is on that line.

The error I get when running the test without interference is:

``` shell
test_backends.py::NetCDF4ViaDaskDataTest::test_compression_encoding FAILED

====================================================== FAILURES =======================================================
______ NetCDF4ViaDaskDataTest.test_compression_encoding _________

self = <xray.test.test_backends.NetCDF4ViaDaskDataTest testMethod=test_compression_encoding>

    def test_compression_encoding(self):
        data = create_test_data()
        data['var2'].encoding.update({'zlib': True,
                                      'chunksizes': (5, 5),
                                      'fletcher32': True})
        with self.roundtrip(data) as actual:

test_backends.py:502:

/usr/lib/python2.7/contextlib.py:17: in __enter__
    return self.gen.next()
test_backends.py:596: in roundtrip
    yield ds.chunk()
../core/dataset.py:942: in chunk
    for k, v in self.variables.items()])
../core/dataset.py:935: in maybe_chunk
    token2 = tokenize(name, token if token else var._data)
/home/cpa/.virtualenvs/xray/local/lib/python2.7/site-packages/dask/base.py:152: in tokenize
    return md5(str(tuple(map(normalize_token, args))).encode()).hexdigest()
../core/indexing.py:301: in __repr__
    (type(self).__name__, self.array, self.key))
../core/utils.py:377: in __repr__
    return '%s(array=%r)' % (type(self).__name__, self.array)
../core/indexing.py:301: in __repr__
    (type(self).__name__, self.array, self.key))
../core/utils.py:377: in __repr__
    return '%s(array=%r)' % (type(self).__name__, self.array)
netCDF4/_netCDF4.pyx:2931: in netCDF4._netCDF4.Variable.__repr__ (netCDF4/_netCDF4.c:25068)
    ???
netCDF4/_netCDF4.pyx:2938: in netCDF4._netCDF4.Variable.__unicode__ (netCDF4/_netCDF4.c:25243)
    ???
netCDF4/_netCDF4.pyx:3059: in netCDF4._netCDF4.Variable.dimensions.__get__ (netCDF4/_netCDF4.c:27486)
    ???

    ???
E   RuntimeError: NetCDF: Not a valid ID

netCDF4/_netCDF4.pyx:2994: RuntimeError
============================================== 1 failed in 0.50 seconds ===============================================
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset too many files 94328498
142637232 https://github.com/pydata/xarray/issues/463#issuecomment-142637232 https://api.github.com/repos/pydata/xarray/issues/463 MDEyOklzc3VlQ29tbWVudDE0MjYzNzIzMg== cpaulik 380927 2015-09-23T15:19:36Z 2015-09-23T15:19:36Z NONE

I've run into the same problem and have been looking at the netCDF backend. A solution does not seem to be as easy as opening and closing the file in the __getitem__ method, since that closes the file for any other access as well, e.g. attributes like shape or dtype.

Short of decorating all the functions of the netCDF4 package, I cannot think of a workable solution to this. But maybe I'm overlooking something fundamental.
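
One possible direction, sketched below with illustrative names (this is not xarray's actual backend code), is to cache the metadata that attribute access needs up front and reopen the file only while data is actually being read.

``` python
# Sketch: cache shape/dtype once, reopen the file only inside __getitem__,
# so a handle is open only for the duration of a read.
import netCDF4
import numpy as np


class ReopeningVariable(object):
    def __init__(self, filename, varname):
        self.filename = filename
        self.varname = varname
        # Cache metadata so .shape/.dtype work while the file is closed.
        with netCDF4.Dataset(self.filename) as ds:
            var = ds.variables[self.varname]
            self.shape = var.shape
            self.dtype = var.dtype

    def __getitem__(self, key):
        # Reopen for each read and close immediately afterwards.
        with netCDF4.Dataset(self.filename) as ds:
            return np.asarray(ds.variables[self.varname][key])
```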

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset too many files 94328498

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);