issue_comments

11 rows where author_association = "MEMBER", issue = 189817033 and user = 1217238 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
265878280 https://github.com/pydata/xarray/pull/1128#issuecomment-265878280 https://api.github.com/repos/pydata/xarray/issues/1128 MDEyOklzc3VlQ29tbWVudDI2NTg3ODI4MA== shoyer 1217238 2016-12-08T22:44:12Z 2016-12-08T22:44:12Z MEMBER

@mangecoeur You still need to use lock=False (or lock=dask.utils.SerializableLock() with the dev version of dask) and use a spawning process pool (https://github.com/pydata/xarray/pull/1128#issuecomment-261936849).

The former should be updated internally, and the latter should be a documentation note.
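
As a rough illustration of the "spawning process pool" mentioned above, the sketch below builds a pool from the standard library's `spawn` start method, which launches each worker as a fresh interpreter instead of forking the parent. The helper name `make_spawn_pool` is hypothetical, and the xarray/dask calls appear only as comments copied from this thread; they are not run here and assume the library versions of that time.

```python
import multiprocessing

# A "spawn" context starts every worker as a new Python interpreter,
# so workers do not inherit the parent's in-memory state through fork.
SPAWN = multiprocessing.get_context("spawn")


def make_spawn_pool(processes=4):
    """Return a process pool whose workers are spawned, not forked.

    Hypothetical helper for illustration only.
    """
    return SPAWN.Pool(processes)


# Calls quoted from this thread (not executed in this sketch):
#   ds = xr.open_dataset('big-random.nc', lock=False, chunks={'x': 2500})
#   dask.set_options(pool=make_spawn_pool(4))
```

When a script uses a spawned pool, the code that creates it generally needs to sit under an `if __name__ == "__main__":` guard, because each worker re-imports the main module.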

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Remove caching logic from xarray.Variable 189817033
264033283 https://github.com/pydata/xarray/pull/1128#issuecomment-264033283 https://api.github.com/repos/pydata/xarray/issues/1128 MDEyOklzc3VlQ29tbWVudDI2NDAzMzI4Mw== shoyer 1217238 2016-11-30T23:44:54Z 2016-11-30T23:44:54Z MEMBER

OK, in it goes!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Remove caching logic from xarray.Variable 189817033
263927223 https://github.com/pydata/xarray/pull/1128#issuecomment-263927223 https://api.github.com/repos/pydata/xarray/issues/1128 MDEyOklzc3VlQ29tbWVudDI2MzkyNzIyMw== shoyer 1217238 2016-11-30T16:50:48Z 2016-11-30T16:50:48Z MEMBER

@kynan @crusaderky Do you have concerns about merging this in the current state?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Remove caching logic from xarray.Variable 189817033
263926969 https://github.com/pydata/xarray/pull/1128#issuecomment-263926969 https://api.github.com/repos/pydata/xarray/issues/1128 MDEyOklzc3VlQ29tbWVudDI2MzkyNjk2OQ== shoyer 1217238 2016-11-30T16:49:53Z 2016-11-30T16:49:53Z MEMBER

I decided that between the choices of not running these tests on Windows and leaking a few temp files, I would rather leak some temp files. So that's exactly what I've done in the latest commit, for explicitly whitelisted tests.
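
A minimal sketch of what "leaking a few temp files" for whitelisted tests could look like, assuming a context manager with an opt-in flag. The name `tmp_file_path` and the `allow_cleanup_failure` parameter are illustrative and not necessarily what the commit uses.

```python
import contextlib
import os
import shutil
import tempfile


@contextlib.contextmanager
def tmp_file_path(suffix=".nc", allow_cleanup_failure=False):
    """Yield a path in a fresh temp directory and remove it afterwards.

    Hypothetical helper: with allow_cleanup_failure=True, a failed
    removal is ignored and the temp directory is left behind.
    """
    temp_dir = tempfile.mkdtemp()
    path = os.path.join(temp_dir, "data" + suffix)
    try:
        yield path
    finally:
        try:
            shutil.rmtree(temp_dir)
        except OSError:
            # On Windows, deleting a file that is still open raises
            # PermissionError, which is a subclass of OSError.
            if not allow_cleanup_failure:
                raise
```

Only tests that explicitly pass `allow_cleanup_failure=True` would tolerate a leftover file; every other test still fails loudly if cleanup breaks.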

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Remove caching logic from xarray.Variable 189817033
263345757 https://github.com/pydata/xarray/pull/1128#issuecomment-263345757 https://api.github.com/repos/pydata/xarray/issues/1128 MDEyOklzc3VlQ29tbWVudDI2MzM0NTc1Nw== shoyer 1217238 2016-11-28T18:04:17Z 2016-11-28T18:04:17Z MEMBER

@mrocklin OK, so one option is to just ignore the permission errors and not remove the files on Windows. But is it really better to make the test suite leak temp files?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Remove caching logic from xarray.Variable 189817033
263344764 https://github.com/pydata/xarray/pull/1128#issuecomment-263344764 https://api.github.com/repos/pydata/xarray/issues/1128 MDEyOklzc3VlQ29tbWVudDI2MzM0NDc2NA== shoyer 1217238 2016-11-28T18:00:38Z 2016-11-28T18:00:38Z MEMBER

OK, I'm ready to give up on the remaining test failures and merge this anyway (marking them as expected failures). They are specific to our test suite and occur only on Windows, due to the inability to delete files that are not closed.

If these manifest themselves as issues for real users, I am happy to revisit, especially if someone who uses Windows can help debug. The 5-minute feedback cycle of pushing a commit and then seeing what Appveyor says is too painful.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Remove caching logic from xarray.Variable 189817033
261980869 https://github.com/pydata/xarray/pull/1128#issuecomment-261980869 https://api.github.com/repos/pydata/xarray/issues/1128 MDEyOklzc3VlQ29tbWVudDI2MTk4MDg2OQ== shoyer 1217238 2016-11-21T16:04:14Z 2016-11-21T16:04:14Z MEMBER

> Does your failure work with the following spawning pool in Python 3?

Why, yes it does -- and it shows a nice speedup, as well! What was I missing here?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Remove caching logic from xarray.Variable 189817033
261841025 https://github.com/pydata/xarray/pull/1128#issuecomment-261841025 https://api.github.com/repos/pydata/xarray/issues/1128 MDEyOklzc3VlQ29tbWVudDI2MTg0MTAyNQ== shoyer 1217238 2016-11-21T04:36:02Z 2016-11-21T04:36:02Z MEMBER

This isn't yet working with dask multiprocessing for reading a netCDF4 file with in-memory compression. I'm not quite sure why:

```
In [5]: from multiprocessing.pool import Pool

In [7]: ds = xr.open_dataset('big-random.nc', lock=False, chunks={'x': 2500})

In [8]: dask.set_options(pool=Pool(4))
Out[8]: <dask.context.set_options at 0x1087c3898>

In [9]: %time ds.sum().compute()

RuntimeError                              Traceback (most recent call last)
<ipython-input-9-4c43356c48db> in <module>()
----> 1 get_ipython().magic('time ds.sum().compute()')

/Users/shoyer/conda/envs/xarray-dev/lib/python3.5/site-packages/IPython/core/interactiveshell.py in magic(self, arg_s)
   2156 magic_name, _, magic_arg_s = arg_s.partition(' ')
   2157 magic_name = magic_name.lstrip(prefilter.ESC_MAGIC)
-> 2158 return self.run_line_magic(magic_name, magic_arg_s)
   2159
   2160 #-------------------------------------------------------------------------

/Users/shoyer/conda/envs/xarray-dev/lib/python3.5/site-packages/IPython/core/interactiveshell.py in run_line_magic(self, magic_name, line)
   2077 kwargs['local_ns'] = sys._getframe(stack_depth).f_locals
   2078 with self.builtin_trap:
-> 2079 result = fn(args,*kwargs)
   2080 return result
   2081

<decorator-gen-59> in time(self, line, cell, local_ns)

/Users/shoyer/conda/envs/xarray-dev/lib/python3.5/site-packages/IPython/core/magic.py in <lambda>(f, a, k)
    186 # but it's overkill for just that one bit of state.
    187 def magic_deco(arg):
--> 188 call = lambda f, a, k: f(*a, k)
    189
    190 if callable(arg):

/Users/shoyer/conda/envs/xarray-dev/lib/python3.5/site-packages/IPython/core/magics/execution.py in time(self, line, cell, local_ns)
   1174 if mode=='eval':
   1175 st = clock2()
-> 1176 out = eval(code, glob, local_ns)
   1177 end = clock2()
   1178 else:

<timed eval> in <module>()

/Users/shoyer/dev/xarray/xarray/core/dataset.py in compute(self)
    348 """
    349 new = self.copy(deep=False)
--> 350 return new.load()
    351
    352 @classmethod

/Users/shoyer/dev/xarray/xarray/core/dataset.py in load(self)
    325
    326 # evaluate all the dask arrays simultaneously
--> 327 evaluated_data = da.compute(*lazy_data.values())
    328
    329 for k, data in zip(lazy_data, evaluated_data):

/Users/shoyer/conda/envs/xarray-dev/lib/python3.5/site-packages/dask/base.py in compute(args, kwargs)
    176 dsk = merge(var.dask for var in variables)
    177 keys = [var._keys() for var in variables]
--> 178 results = get(dsk, keys, *kwargs)
    179
    180 results_iter = iter(results)

/Users/shoyer/conda/envs/xarray-dev/lib/python3.5/site-packages/dask/threaded.py in get(dsk, result, cache, num_workers, kwargs)
     67 results = get_async(pool.apply_async, len(pool._pool), dsk, result,
     68 cache=cache, get_id=_thread_get_id,
---> 69 kwargs)
     70
     71 # Cleanup pools associated to dead threads

/Users/shoyer/conda/envs/xarray-dev/lib/python3.5/site-packages/dask/async.py in get_async(apply_async, num_workers, dsk, result, cache, get_id, raise_on_exception, rerun_exceptions_locally, callbacks, dumps, loads, **kwargs)
    500 _execute_task(task, data) # Re-execute locally
    501 else:
--> 502 raise(remote_exception(res, tb))
    503 state['cache'][key] = res
    504 finish_task(dsk, key, state, results, keyorder.get)

RuntimeError: NetCDF: HDF error

Traceback

  File "/Users/shoyer/conda/envs/xarray-dev/lib/python3.5/site-packages/dask/async.py", line 268, in execute_task
    result = execute_task(task, data)
  File "/Users/shoyer/conda/envs/xarray-dev/lib/python3.5/site-packages/dask/async.py", line 248, in _execute_task
    args2 = [_execute_task(a, cache) for a in args]
  File "/Users/shoyer/conda/envs/xarray-dev/lib/python3.5/site-packages/dask/async.py", line 248, in <listcomp>
    args2 = [_execute_task(a, cache) for a in args]
  File "/Users/shoyer/conda/envs/xarray-dev/lib/python3.5/site-packages/dask/async.py", line 245, in _execute_task
    return [_execute_task(a, cache) for a in arg]
  File "/Users/shoyer/conda/envs/xarray-dev/lib/python3.5/site-packages/dask/async.py", line 245, in <listcomp>
    return [_execute_task(a, cache) for a in arg]
  File "/Users/shoyer/conda/envs/xarray-dev/lib/python3.5/site-packages/dask/async.py", line 249, in _execute_task
    return func(*args2)
  File "/Users/shoyer/conda/envs/xarray-dev/lib/python3.5/site-packages/dask/array/core.py", line 51, in getarray
    c = np.asarray(c)
  File "/Users/shoyer/conda/envs/xarray-dev/lib/python3.5/site-packages/numpy/core/numeric.py", line 482, in asarray
    return array(a, dtype, copy=False, order=order)
  File "/Users/shoyer/dev/xarray/xarray/core/indexing.py", line 417, in __array__
    return np.asarray(self.array, dtype=dtype)
  File "/Users/shoyer/conda/envs/xarray-dev/lib/python3.5/site-packages/numpy/core/numeric.py", line 482, in asarray
    return array(a, dtype, copy=False, order=order)
  File "/Users/shoyer/dev/xarray/xarray/core/indexing.py", line 392, in __array__
    return np.asarray(array[self.key], dtype=None)
  File "/Users/shoyer/conda/envs/xarray-dev/lib/python3.5/site-packages/numpy/core/numeric.py", line 482, in asarray
    return array(a, dtype, copy=False, order=order)
  File "/Users/shoyer/dev/xarray/xarray/core/indexing.py", line 392, in __array__
    return np.asarray(array[self.key], dtype=None)
  File "/Users/shoyer/dev/xarray/xarray/backends/netCDF4.py", line 56, in getitem
    data = getitem(self.array, key)
  File "netCDF4/_netCDF4.pyx", line 3695, in netCDF4._netCDF4.Variable.getitem (netCDF4/_netCDF4.c:37914)
  File "netCDF4/_netCDF4.pyx", line 4376, in netCDF4._netCDF4.Variable._get (netCDF4/_netCDF4.c:47134)
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Remove caching logic from xarray.Variable 189817033
261837981 https://github.com/pydata/xarray/pull/1128#issuecomment-261837981 https://api.github.com/repos/pydata/xarray/issues/1128 MDEyOklzc3VlQ29tbWVudDI2MTgzNzk4MQ== shoyer 1217238 2016-11-21T04:08:22Z 2016-11-21T04:11:30Z MEMBER

I added pickle support to DataStores. This should solve the basic serialization issue for dask.distributed (#798), but does not yet resolve the "too many open files" issue.

@mrocklin this could use your review.
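
To illustrate the general idea of making a store that wraps an open file picklable, here is a minimal sketch that pickles only the arguments needed to reopen the file. `ReopeningStore` and `roundtrip_demo` are hypothetical names; this is not the code added in the PR.

```python
import os
import pickle
import tempfile


class ReopeningStore:
    """Hypothetical stand-in for a data store wrapping an open file."""

    def __init__(self, path, mode="r"):
        self.path = path
        self.mode = mode
        self._file = open(path, mode)  # open file handles cannot be pickled

    def read(self):
        self._file.seek(0)
        return self._file.read()

    def close(self):
        self._file.close()

    def __getstate__(self):
        # Serialize only what is needed to reopen the file later.
        return {"path": self.path, "mode": self.mode}

    def __setstate__(self, state):
        # Reopen the file in the process that unpickles the store.
        self.__init__(**state)


def roundtrip_demo():
    """Pickle and unpickle a store, then read through the copy."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.txt")
        with open(path, "w") as f:
            f.write("hello")
        store = ReopeningStore(path)
        clone = pickle.loads(pickle.dumps(store))
        try:
            return clone.read()
        finally:
            store.close()
            clone.close()
```

In this sketch every unpickled copy opens its own file handle, so by itself the pattern does nothing to reduce the number of open files.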

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Remove caching logic from xarray.Variable 189817033
261755551 https://github.com/pydata/xarray/pull/1128#issuecomment-261755551 https://api.github.com/repos/pydata/xarray/issues/1128 MDEyOklzc3VlQ29tbWVudDI2MTc1NTU1MQ== shoyer 1217238 2016-11-20T03:13:30Z 2016-11-20T03:13:30Z MEMBER

I removed the custom pickle override on Dataset/DataArray -- the issue I was working around was actually an indirect manifestation of a bug in IndexVariable.load() (introduced in this PR).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Remove caching logic from xarray.Variable 189817033
261433336 https://github.com/pydata/xarray/pull/1128#issuecomment-261433336 https://api.github.com/repos/pydata/xarray/issues/1128 MDEyOklzc3VlQ29tbWVudDI2MTQzMzMzNg== shoyer 1217238 2016-11-18T02:36:21Z 2016-11-18T02:36:21Z MEMBER

> In the long run I think it would be more robust to check for attributes (duck type style) rather than types in the various places.

Indeed, in particular I'm not very happy with the isinstance check for indexing.MemoryCachedArray in Variable.copy() -- it's rather poor separation of concerns.

It exists so that variable.compute() does not cache data in-memory on variable but only on the computed variable. Otherwise, there's basically no point to the separate compute method: if you use cache=True, you are stuck with caching on the original object. Likewise, it ensures that .copy() creates an array with a new cache, which is consistent with the current behavior of .copy().

As for type checking for dask arrays in .data: yes, it would be nice to have a well defined array interface layer that other array types could plug into. That would entail a significant amount of further work, however.
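
The caching behavior described above can be sketched with toy classes. `LazySource`, `MemoryCached`, and `Var` are made-up stand-ins, not xarray's classes; the sketch only shows why `copy()` swaps in a fresh cache so that `compute()` caches on the new object rather than the original.

```python
class LazySource:
    """Hypothetical lazily loaded array that counts its reads."""

    def __init__(self, values):
        self._values = list(values)
        self.reads = 0

    def load(self):
        self.reads += 1
        return list(self._values)


class MemoryCached:
    """Hypothetical wrapper that keeps the first load in memory."""

    def __init__(self, source):
        self.source = source
        self._cache = None

    def load(self):
        if self._cache is None:
            self._cache = self.source.load()
        return self._cache

    def is_cached(self):
        return self._cache is not None

    def fresh_copy(self):
        # Share the underlying source but start with an empty cache.
        return MemoryCached(self.source)


class Var:
    """Toy variable; copy() gives the copy its own cache."""

    def __init__(self, data):
        self.data = data

    def copy(self):
        data = self.data
        if isinstance(data, MemoryCached):  # the kind of type check discussed
            data = data.fresh_copy()
        return Var(data)

    def compute(self):
        new = self.copy()
        new.data.load()  # populates the cache on the copy only
        return new
```

After `computed = var.compute()`, the data is cached on `computed` while `var` still holds an empty cache, which is what makes a separate `compute` method worthwhile alongside caching.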

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Remove caching logic from xarray.Variable 189817033

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · About: xarray-datasette