issue_comments

30 rows where author_association = "CONTRIBUTOR" and user = 7799184, sorted by updated_at descending

issue 23

  • Problem with checking in Variable._parse_dimensions() (xray.core.variable) 5
  • Cannot inherit DataArray anymore in 0.7 release 2
  • xr.concat consuming too much resources 2
  • dataset info in .json format 2
  • time slice cannot be list 1
  • to_netcdf on Python 3: "string" qualifier on attributes 1
  • to_netcdf: not able to set dtype encoding with netCDF4 backend 1
  • Subclassing Dataset and DataArray 1
  • Make import error of tokenize more explicit 1
  • coordinate variable not written in netcdf file in some cases 1
  • Decorators for registering custom accessors in xarray 1
  • Transpose some but not all dimensions 1
  • Choose time units in output netcdf 1
  • Setting attributes to multi-index coordinate 1
  • (trivial) xarray.quantile silently resolves dask arrays 1
  • open_mfdataset usage and limitations. 1
  • Array indexing with dask arrays 1
  • Dataset global attributes dropped when performing operations against numpy data type 1
  • Time dtype encoding defaulting to `int64` when writing netcdf or zarr 1
  • Support parallel writes to regions of zarr stores 1
  • Allow fsspec/zarr/mfdataset 1
  • `xarray.open_zarr()` takes too long to lazy load when the data arrays contain a large number of Dask chunks. 1
  • 2D extrapolation not working 1

user 1

  • rafa-guedes · 30

author_association 1

  • CONTRIBUTOR · 30
id html_url issue_url node_id user created_at updated_at author_association body reactions performed_via_github_app issue
1153302528 https://github.com/pydata/xarray/issues/6688#issuecomment-1153302528 https://api.github.com/repos/pydata/xarray/issues/6688 IC_kwDOAMm_X85EvgAA rafa-guedes 7799184 2022-06-12T21:56:10Z 2022-06-12T21:56:10Z CONTRIBUTOR

That works, thanks. I just checked the example in the docs and it uses `kwargs={"fill_value": None}` in the 2D example, with the result evaluating to NaNs. That one also works and returns actual values when using `"extrapolate"` instead, so it looks like something might have changed in xarray or scipy.
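
A hedged sketch of the two options being compared (the array and target points are hypothetical; `kwargs` is forwarded to the underlying scipy interpolator):

```python
import numpy as np
import xarray as xr

# Hypothetical 2D array standing in for the docs example discussed above.
da = xr.DataArray(
    np.arange(12.0).reshape(4, 3),
    coords={"x": np.arange(4.0), "y": np.arange(3.0)},
    dims=("x", "y"),
)

# Target points outside the original coordinate range.
nan_result = da.interp(x=[4.5], y=[3.5], kwargs={"fill_value": None})
ext_result = da.interp(x=[4.5], y=[3.5], kwargs={"fill_value": "extrapolate"})

# Which of these extrapolates, returns NaN, or even errors has varied across
# xarray/scipy versions -- that discrepancy is what this comment reports.
print(nan_result.values, ext_result.values)
```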

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  2D extrapolation not working 1268630439
1010549000 https://github.com/pydata/xarray/issues/6036#issuecomment-1010549000 https://api.github.com/repos/pydata/xarray/issues/6036 IC_kwDOAMm_X848O8EI rafa-guedes 7799184 2022-01-12T01:49:52Z 2022-01-12T01:49:52Z CONTRIBUTOR

Related issue in dask: https://github.com/dask/dask/issues/6363

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  `xarray.open_zarr()` takes too long to lazy load when the data arrays contain a large number of Dask chunks. 1068225524
748554375 https://github.com/pydata/xarray/pull/4461#issuecomment-748554375 https://api.github.com/repos/pydata/xarray/issues/4461 MDEyOklzc3VlQ29tbWVudDc0ODU1NDM3NQ== rafa-guedes 7799184 2020-12-20T02:35:40Z 2020-12-20T09:10:27Z CONTRIBUTOR

@rabernat, awesome! I was stunned by the difference -- I guess the async loading of coordinate data is the big win, right?

@rsignell-usgs one other thing that can greatly speed up loading of metadata / coordinates is ensuring coordinate variables are stored in one single chunk. For this particular dataset, the chunk size for the time coordinate is 672, yielding 339 chunks, which can take a while to load from remote bucket stores. If you rewrite the time coordinate setting `dset.time.encoding["chunks"] = (227904,)`, you should see a very large performance increase. One thing we have been doing for zarr archives that are appended in time is defining the time coordinate with a very large chunk size (e.g., `dset.time.encoding["chunks"] = (10000000,)`) when we first write the store. This ensures the time coordinate will still fit in one single chunk after appending over the time dimension, and it does not affect the chunking of the actual data variables.

One thing we are still having performance issues with is loading coordinates / metadata from zarr archives that have too many chunks (millions), even when metadata is consolidated and coordinates are in one single chunk. There is an open issue in dask about this.
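
A hedged sketch of the single-chunk-coordinate trick described above (paths and sizes are hypothetical):

```python
import xarray as xr

dset = xr.open_dataset("input.nc")  # hypothetical input

# Store the time coordinate as one single zarr chunk, big enough to remain
# a single chunk even after many appends along the "time" dimension.
dset.time.encoding["chunks"] = (10_000_000,)

# Only the coordinate's chunking is affected; data variables keep theirs.
dset.to_zarr("store.zarr", mode="w", consolidated=True)
```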

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow fsspec/zarr/mfdataset 709187212
721504192 https://github.com/pydata/xarray/pull/4035#issuecomment-721504192 https://api.github.com/repos/pydata/xarray/issues/4035 MDEyOklzc3VlQ29tbWVudDcyMTUwNDE5Mg== rafa-guedes 7799184 2020-11-04T04:23:58Z 2020-11-04T04:23:58Z CONTRIBUTOR

@shoyer thanks for implementing this, it is going to be very useful. I am trying to write the dataset below:

dsregion:

```
<xarray.Dataset>
Dimensions:    (latitude: 2041, longitude: 4320, time: 31)
Coordinates:
  * latitude   (latitude) float32 -80.0 -79.916664 -79.833336 ... 89.916664 90.0
  * time       (time) datetime64[ns] 2008-10-01T12:00:00 ... 2008-10-31T12:00:00
  * longitude  (longitude) float32 -180.0 -179.91667 ... 179.83333 179.91667
Data variables:
    vo         (time, latitude, longitude) float32 dask.array<chunksize=(30, 510, 1080), meta=np.ndarray>
    uo         (time, latitude, longitude) float32 dask.array<chunksize=(30, 510, 1080), meta=np.ndarray>
    sst        (time, latitude, longitude) float32 dask.array<chunksize=(30, 510, 1080), meta=np.ndarray>
    ssh        (time, latitude, longitude) float32 dask.array<chunksize=(30, 510, 1080), meta=np.ndarray>
```

As a region of this other dataset:

dset:

```
<xarray.Dataset>
Dimensions:    (latitude: 2041, longitude: 4320, time: 9490)
Coordinates:
  * latitude   (latitude) float32 -80.0 -79.916664 -79.833336 ... 89.916664 90.0
  * longitude  (longitude) float32 -180.0 -179.91667 ... 179.83333 179.91667
  * time       (time) datetime64[ns] 1993-01-01T12:00:00 ... 2018-12-25T12:00:00
Data variables:
    ssh        (time, latitude, longitude) float64 dask.array<chunksize=(30, 510, 1080), meta=np.ndarray>
    sst        (time, latitude, longitude) float64 dask.array<chunksize=(30, 510, 1080), meta=np.ndarray>
    uo         (time, latitude, longitude) float64 dask.array<chunksize=(30, 510, 1080), meta=np.ndarray>
    vo         (time, latitude, longitude) float64 dask.array<chunksize=(30, 510, 1080), meta=np.ndarray>
```

Using the following call:

```python
dsregion.to_zarr(dset_url, region={"time": slice(5752, 5783)})
```

But I got stuck on the conditional below within xarray/backends/api.py:

```python
   1347     non_matching_vars = [
   1348         k
   1349         for k, v in ds_to_append.variables.items()
   1350         if not set(region).intersection(v.dims)
   1351     ]
   1352     import ipdb; ipdb.set_trace()
-> 1353     if non_matching_vars:
   1354         raise ValueError(
   1355             f"when setting `region` explicitly in to_zarr(), all "
   1356             f"variables in the dataset to write must have at least "
   1357             f"one dimension in common with the region's dimensions "
   1358             f"{list(region.keys())}, but that is not "
   1359             f"the case for some variables here. To drop these variables "
   1360             f"from this dataset before exporting to zarr, write: "
   1361             f".drop({non_matching_vars!r})"
   1362         )
```

Apparently because `time` is not a dimension of the coordinate variables `["longitude", "latitude"]`:

```
ipdb> p non_matching_vars
['latitude', 'longitude']
ipdb> p set(region)
{'time'}
```

Should this check be performed on all variables, or only on data variables?
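
For what it's worth, a sketch of the workaround the error message itself suggests, reusing the names from this comment (`drop_vars` is the newer spelling of `.drop` for variables):

```python
# Drop the coordinate variables that share no dimension with the region,
# then write just the "time" slab (indices as in the comment above).
dsregion.drop_vars(["latitude", "longitude"]).to_zarr(
    dset_url, region={"time": slice(5752, 5783)}
)
```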

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support parallel writes to regions of zarr stores 613012939
610615621 https://github.com/pydata/xarray/issues/3942#issuecomment-610615621 https://api.github.com/repos/pydata/xarray/issues/3942 MDEyOklzc3VlQ29tbWVudDYxMDYxNTYyMQ== rafa-guedes 7799184 2020-04-07T20:55:29Z 2020-04-07T21:07:31Z CONTRIBUTOR

Yep, I managed to overcome this by manually setting encoding parameters. I am just wondering if there would be any downside to preferring float64 over int64 when automatically defining these? That seems to fix the issue. I guess it could result in some precision loss due to floating-point errors, but that should be small.
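
A minimal sketch of the manual-encoding workaround mentioned above (file names and the units string are illustrative):

```python
import xarray as xr

ds = xr.open_dataset("input.nc")  # hypothetical file

# Pick the on-disk time representation explicitly instead of relying on the
# automatic (integer) choice.
ds.time.encoding.update({"dtype": "float64", "units": "seconds since 1970-01-01"})
ds.to_netcdf("output.nc")
```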

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Time dtype encoding defaulting to `int64` when writing netcdf or zarr 595492608
572293244 https://github.com/pydata/xarray/issues/2656#issuecomment-572293244 https://api.github.com/repos/pydata/xarray/issues/2656 MDEyOklzc3VlQ29tbWVudDU3MjI5MzI0NA== rafa-guedes 7799184 2020-01-08T22:42:01Z 2020-01-08T22:43:25Z CONTRIBUTOR

Pandas has a `date_format` option in `to_json` to serialize dates as either iso8601 or epoch. An `encode_times` option to `to_dict` could also be useful...
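
A hedged sketch of what such a helper could look like, mirroring pandas' iso8601 option on top of `Dataset.to_dict()` (the helper name `dataset_to_json` is hypothetical):

```python
import json

import pandas as pd
import xarray as xr


def dataset_to_json(ds: xr.Dataset) -> str:
    """Hypothetical helper: JSON-encode a Dataset dict with ISO-8601 dates."""

    def encode(obj):
        # Mirror pandas' to_json(date_format="iso") for datetime-like values.
        try:
            return pd.Timestamp(obj).isoformat()
        except (TypeError, ValueError):
            return str(obj)

    # json.dumps calls `encode` only for objects it cannot serialise itself,
    # which is where the datetimes from ds.to_dict() end up.
    return json.dumps(ds.to_dict(), default=encode)
```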

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  dataset info in .json format 396285440
572054942 https://github.com/pydata/xarray/issues/2656#issuecomment-572054942 https://api.github.com/repos/pydata/xarray/issues/2656 MDEyOklzc3VlQ29tbWVudDU3MjA1NDk0Mg== rafa-guedes 7799184 2020-01-08T13:36:41Z 2020-01-08T13:36:41Z CONTRIBUTOR

Would it make sense to have `to_json` / `from_json` methods that take care of datetime serialisation?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  dataset info in .json format 396285440
563330352 https://github.com/pydata/xarray/issues/2511#issuecomment-563330352 https://api.github.com/repos/pydata/xarray/issues/2511 MDEyOklzc3VlQ29tbWVudDU2MzMzMDM1Mg== rafa-guedes 7799184 2019-12-09T16:53:38Z 2019-12-09T16:53:38Z CONTRIBUTOR

I'm having a similar issue; here is an example:

```python
import numpy as np
import dask.array as da
import xarray as xr

darr = xr.DataArray(data=[0.2, 0.4, 0.6], coords={"z": range(3)}, dims=("z",))
good_indexer = xr.DataArray(
    data=np.random.randint(0, 3, 8).reshape(4, 2).astype(int),
    coords={"y": range(4), "x": range(2)},
    dims=("y", "x"),
)
bad_indexer = xr.DataArray(
    data=da.random.randint(0, 3, 8).reshape(4, 2).astype(int),
    coords={"y": range(4), "x": range(2)},
    dims=("y", "x"),
)
```

```ipython
In [5]: darr
Out[5]:
<xarray.DataArray (z: 3)>
array([0.2, 0.4, 0.6])
Coordinates:
  * z        (z) int64 0 1 2

In [6]: good_indexer
Out[6]:
<xarray.DataArray (y: 4, x: 2)>
array([[0, 1],
       [2, 2],
       [1, 2],
       [1, 0]])
Coordinates:
  * y        (y) int64 0 1 2 3
  * x        (x) int64 0 1

In [7]: bad_indexer
Out[7]:
<xarray.DataArray 'reshape-417766b2035dcb1227ddde8505297039' (y: 4, x: 2)>
dask.array<reshape, shape=(4, 2), dtype=int64, chunksize=(4, 2), chunktype=numpy.ndarray>
Coordinates:
  * y        (y) int64 0 1 2 3
  * x        (x) int64 0 1

In [8]: darr[good_indexer]
Out[8]:
<xarray.DataArray (y: 4, x: 2)>
array([[0.2, 0.4],
       [0.6, 0.6],
       [0.4, 0.6],
       [0.4, 0.2]])
Coordinates:
    z        (y, x) int64 0 1 2 2 1 2 1 0
  * y        (y) int64 0 1 2 3
  * x        (x) int64 0 1

In [9]: darr[bad_indexer]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-8-2a57c1a2eade> in <module>
----> 1 darr[bad_indexer]

~/.virtualenvs/py3/local/lib/python3.7/site-packages/xarray/core/dataarray.py in __getitem__(self, key)
    638         else:
    639             # xarray-style array indexing
--> 640             return self.isel(indexers=self._item_key_to_dict(key))
    641
    642     def __setitem__(self, key: Any, value: Any) -> None:

~/.virtualenvs/py3/local/lib/python3.7/site-packages/xarray/core/dataarray.py in isel(self, indexers, drop, **indexers_kwargs)
   1012         """
   1013         indexers = either_dict_or_kwargs(indexers, indexers_kwargs, "isel")
-> 1014         ds = self._to_temp_dataset().isel(drop=drop, indexers=indexers)
   1015         return self._from_temp_dataset(ds)
   1016

~/.virtualenvs/py3/local/lib/python3.7/site-packages/xarray/core/dataset.py in isel(self, indexers, drop, **indexers_kwargs)
   1920             if name in self.indexes:
   1921                 new_var, new_index = isel_variable_and_index(
-> 1922                     name, var, self.indexes[name], var_indexers
   1923                 )
   1924                 if new_index is not None:

~/.virtualenvs/py3/local/lib/python3.7/site-packages/xarray/core/indexes.py in isel_variable_and_index(name, variable, index, indexers)
     79         )
     80
---> 81     new_variable = variable.isel(indexers)
     82
     83     if new_variable.dims != (name,):

~/.virtualenvs/py3/local/lib/python3.7/site-packages/xarray/core/variable.py in isel(self, indexers, **indexers_kwargs)
   1052
   1053         key = tuple(indexers.get(dim, slice(None)) for dim in self.dims)
-> 1054         return self[key]
   1055
   1056     def squeeze(self, dim=None):

~/.virtualenvs/py3/local/lib/python3.7/site-packages/xarray/core/variable.py in __getitem__(self, key)
    700         array x.values directly.
    701         """
--> 702         dims, indexer, new_order = self._broadcast_indexes(key)
    703         data = as_indexable(self._data)[indexer]
    704         if new_order:

~/.virtualenvs/py3/local/lib/python3.7/site-packages/xarray/core/variable.py in _broadcast_indexes(self, key)
    557             if isinstance(k, Variable):
    558                 if len(k.dims) > 1:
--> 559                     return self._broadcast_indexes_vectorized(key)
    560                 dims.append(k.dims[0])
    561             elif not isinstance(k, integer_types):

~/.virtualenvs/py3/local/lib/python3.7/site-packages/xarray/core/variable.py in _broadcast_indexes_vectorized(self, key)
    685             new_order = None
    686
--> 687         return out_dims, VectorizedIndexer(tuple(out_key)), new_order
    688
    689     def __getitem__(self: VariableType, key) -> VariableType:

~/.virtualenvs/py3/local/lib/python3.7/site-packages/xarray/core/indexing.py in __init__(self, key)
    447             else:
    448                 raise TypeError(
--> 449                     f"unexpected indexer type for {type(self).__name__}: {k!r}"
    450                 )
    451             new_key.append(k)

TypeError: unexpected indexer type for VectorizedIndexer: dask.array<reshape, shape=(4, 2), dtype=int64, chunksize=(4, 2), chunktype=numpy.ndarray>

In [10]: xr.__version__
Out[10]: '0.14.1'

In [11]: import dask; dask.__version__
Out[11]: '2.9.0'
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Array indexing with dask arrays 374025325
551963613 https://github.com/pydata/xarray/issues/3490#issuecomment-551963613 https://api.github.com/repos/pydata/xarray/issues/3490 MDEyOklzc3VlQ29tbWVudDU1MTk2MzYxMw== rafa-guedes 7799184 2019-11-08T19:40:23Z 2019-11-08T19:40:23Z CONTRIBUTOR

Perhaps reflected operators (e.g., `__rmul__`) could be defined differently somewhere? I cannot see anything obvious within xarray.
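
A minimal repro sketch of the attrs-dropping behaviour under discussion; the `keep_attrs` option shown is the knob newer xarray versions expose and is not part of the original report:

```python
import numpy as np
import xarray as xr

da = xr.DataArray([1.0, 2.0], dims="x", attrs={"units": "m"})

# Binary ops against numpy scalars drop attrs by default...
print((np.float32(2) * da).attrs)  # {}

# ...while newer xarray versions let you keep them globally:
with xr.set_options(keep_attrs=True):
    print((np.float32(2) * da).attrs)  # {'units': 'm'}
```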

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Dataset global attributes dropped when performing operations against numpy data type 518966560
513996346 https://github.com/pydata/xarray/issues/1524#issuecomment-513996346 https://api.github.com/repos/pydata/xarray/issues/1524 MDEyOklzc3VlQ29tbWVudDUxMzk5NjM0Ng== rafa-guedes 7799184 2019-07-22T23:47:13Z 2019-07-22T23:47:13Z CONTRIBUTOR

@shoyer does https://github.com/dask/dask/pull/4677 solve those accuracy concerns?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  (trivial) xarray.quantile silently resolves dask arrays 252548859
512663861 https://github.com/pydata/xarray/issues/2501#issuecomment-512663861 https://api.github.com/repos/pydata/xarray/issues/2501 MDEyOklzc3VlQ29tbWVudDUxMjY2Mzg2MQ== rafa-guedes 7799184 2019-07-18T04:51:06Z 2019-07-18T04:52:17Z CONTRIBUTOR

Hi guys, I'm having an issue that looks similar to @rsignell-usgs's. I am trying to open 413 netCDF files using `open_mfdataset` with `parallel=True`. The dataset (successfully opened with `parallel=False`) is ~300G on disk and looks like:

```ipython
In [1]: import xarray as xr

In [2]: dset = xr.open_mfdataset("./bom-ww3/bom-ww3_*.nc", chunks={'time': 744, 'latitude': 100, 'longitude': 100}, parallel=False)

In [3]: dset
Out[3]:
<xarray.Dataset>
Dimensions:    (latitude: 190, longitude: 289, time: 302092)
Coordinates:
  * longitude  (longitude) float32 70.0 70.4 70.8 71.2 ... 184.4 184.8 185.2
  * latitude   (latitude) float32 -55.6 -55.2 -54.8 -54.4 ... 19.2 19.6 20.0
  * time       (time) datetime64[ns] 1979-01-01 ... 2013-05-31T23:00:00.000013440
Data variables:
    hs         (time, latitude, longitude) float32 dask.array<shape=(302092, 190, 289), chunksize=(745, 100, 100)>
    fp         (time, latitude, longitude) float32 dask.array<shape=(302092, 190, 289), chunksize=(745, 100, 100)>
    dp         (time, latitude, longitude) float32 dask.array<shape=(302092, 190, 289), chunksize=(745, 100, 100)>
    wl         (time, latitude, longitude) float32 dask.array<shape=(302092, 190, 289), chunksize=(745, 100, 100)>
    U10        (time, latitude, longitude) float32 dask.array<shape=(302092, 190, 289), chunksize=(745, 100, 100)>
    V10        (time, latitude, longitude) float32 dask.array<shape=(302092, 190, 289), chunksize=(745, 100, 100)>
    hs1        (time, latitude, longitude) float32 dask.array<shape=(302092, 190, 289), chunksize=(745, 100, 100)>
    hs2        (time, latitude, longitude) float32 dask.array<shape=(302092, 190, 289), chunksize=(745, 100, 100)>
    tp1        (time, latitude, longitude) float32 dask.array<shape=(302092, 190, 289), chunksize=(745, 100, 100)>
    tp2        (time, latitude, longitude) float32 dask.array<shape=(302092, 190, 289), chunksize=(745, 100, 100)>
    lp0        (time, latitude, longitude) float32 dask.array<shape=(302092, 190, 289), chunksize=(745, 100, 100)>
    lp1        (time, latitude, longitude) float32 dask.array<shape=(302092, 190, 289), chunksize=(745, 100, 100)>
    lp2        (time, latitude, longitude) float32 dask.array<shape=(302092, 190, 289), chunksize=(745, 100, 100)>
    th0        (time, latitude, longitude) float32 dask.array<shape=(302092, 190, 289), chunksize=(745, 100, 100)>
    th1        (time, latitude, longitude) float32 dask.array<shape=(302092, 190, 289), chunksize=(745, 100, 100)>
    th2        (time, latitude, longitude) float32 dask.array<shape=(302092, 190, 289), chunksize=(745, 100, 100)>
    hs0        (time, latitude, longitude) float32 dask.array<shape=(302092, 190, 289), chunksize=(745, 100, 100)>
    tp0        (time, latitude, longitude) float32 dask.array<shape=(302092, 190, 289), chunksize=(745, 100, 100)>
```

Trying to read it in a standard Python session gives me a core dump:

```ipython
In [1]: import xarray as xr

In [2]: dset = xr.open_mfdataset("./bom-ww3/bom-ww3_*.nc", chunks={'time': 744, 'latitude': 100, 'longitude': 100}, parallel=True)
Bus error (core dumped)
```

Trying to read it on a dask cluster, I get:

```ipython
In [1]: from dask.distributed import Client

In [2]: import xarray as xr

In [3]: client = Client()

In [4]: dset = xr.open_mfdataset("./bom-ww3/bom-ww3_*.nc", chunks={'time': 744, 'latitude': 100, 'longitude': 100}, parallel=True)
free(): double free detected in tcache 2
free(): double free detected in tcache 2
free(): double free detected in tcache 2
distributed.nanny - WARNING - Worker process 18744 was killed by signal 11
distributed.nanny - WARNING - Restarting worker
distributed.nanny - WARNING - Worker process 18740 was killed by signal 6
distributed.nanny - WARNING - Restarting worker
distributed.nanny - WARNING - Worker process 18742 was killed by signal 7
distributed.nanny - WARNING - Worker process 18738 was killed by signal 6
distributed.nanny - WARNING - Restarting worker
distributed.nanny - WARNING - Restarting worker
free(): double free detected in tcache 2
munmap_chunk(): invalid pointer
free(): double free detected in tcache 2
free(): double free detected in tcache 2
distributed.nanny - WARNING - Worker process 19082 was killed by signal 6
distributed.nanny - WARNING - Restarting worker
distributed.nanny - WARNING - Worker process 19073 was killed by signal 6
distributed.nanny - WARNING - Restarting worker

---------------------------------------------------------------------------
KilledWorker                              Traceback (most recent call last)
<ipython-input-4-740561b80fec> in <module>()
----> 1 dset = xr.open_mfdataset("./bom-ww3/bom-ww3_*.nc", chunks={'time': 744, 'latitude': 100, 'longitude': 100}, parallel=True)

/usr/local/lib/python3.7/dist-packages/xarray/backends/api.py in open_mfdataset(paths, chunks, concat_dim, compat, preprocess, engine, lock, data_vars, coords, combine, autoclose, parallel, **kwargs)
    772         # calling compute here will return the datasets/file_objs lists,
    773         # the underlying datasets will still be stored as dask arrays
--> 774         datasets, file_objs = dask.compute(datasets, file_objs)
    775
    776     # Combine all datasets, closing them in case of a ValueError

/usr/local/lib/python3.7/dist-packages/dask/base.py in compute(*args, **kwargs)
    444     keys = [x.__dask_keys__() for x in collections]
    445     postcomputes = [x.__dask_postcompute__() for x in collections]
--> 446     results = schedule(dsk, keys, **kwargs)
    447     return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
    448

/home/oceanum/.local/lib/python3.7/site-packages/distributed/client.py in get(self, dsk, keys, restrictions, loose_restrictions, resources, sync, asynchronous, direct, retries, priority, fifo_timeout, actors, **kwargs)
   2525             should_rejoin = False
   2526         try:
-> 2527             results = self.gather(packed, asynchronous=asynchronous, direct=direct)
   2528         finally:
   2529             for f in futures.values():

/home/oceanum/.local/lib/python3.7/site-packages/distributed/client.py in gather(self, futures, errors, direct, asynchronous)
   1821                 direct=direct,
   1822                 local_worker=local_worker,
-> 1823                 asynchronous=asynchronous,
   1824             )
   1825

/home/oceanum/.local/lib/python3.7/site-packages/distributed/client.py in sync(self, func, asynchronous, callback_timeout, *args, **kwargs)
    761         else:
    762             return sync(
--> 763                 self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
    764             )
    765

/home/oceanum/.local/lib/python3.7/site-packages/distributed/utils.py in sync(loop, func, callback_timeout, *args, **kwargs)
    330             e.wait(10)
    331     if error[0]:
--> 332         six.reraise(*error[0])
    333     else:
    334         return result[0]

/usr/lib/python3/dist-packages/six.py in reraise(tp, value, tb)
    691             if value.__traceback__ is not tb:
    692                 raise value.with_traceback(tb)
--> 693             raise value
    694         finally:
    695             value = None

/home/oceanum/.local/lib/python3.7/site-packages/distributed/utils.py in f()
    315             if callback_timeout is not None:
    316                 future = gen.with_timeout(timedelta(seconds=callback_timeout), future)
--> 317             result[0] = yield future
    318         except Exception as exc:
    319             error[0] = sys.exc_info()

/home/oceanum/.local/lib/python3.7/site-packages/tornado/gen.py in run(self)
    733
    734                     try:
--> 735                         value = future.result()
    736                     except Exception:
    737                         exc_info = sys.exc_info()

/home/oceanum/.local/lib/python3.7/site-packages/tornado/gen.py in run(self)
    740                     if exc_info is not None:
    741                         try:
--> 742                             yielded = self.gen.throw(*exc_info)  # type: ignore
    743                         finally:
    744                             # Break up a reference to itself

/home/oceanum/.local/lib/python3.7/site-packages/distributed/client.py in _gather(self, futures, errors, direct, local_worker)
   1678                             exc = CancelledError(key)
   1679                         else:
-> 1680                             six.reraise(type(exception), exception, traceback)
   1681                         raise exc
   1682                     if errors == "skip":

/usr/lib/python3/dist-packages/six.py in reraise(tp, value, tb)
    691             if value.__traceback__ is not tb:
    692                 raise value.with_traceback(tb)
--> 693             raise value
    694         finally:
    695             value = None

KilledWorker: ('open_dataset-e7916acb-6d9f-4532-ab76-5b9c1b1a39c2', <Worker 'tcp://10.240.0.5:36019', memory: 0, processing: 63>)
```

Is there anything obviously wrong with what I am trying here, please?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset usage and limitations. 372848074
323880231 https://github.com/pydata/xarray/issues/1081#issuecomment-323880231 https://api.github.com/repos/pydata/xarray/issues/1081 MDEyOklzc3VlQ29tbWVudDMyMzg4MDIzMQ== rafa-guedes 7799184 2017-08-21T23:44:30Z 2017-08-21T23:56:54Z CONTRIBUTOR

I have also hit this issue; this method could be useful. I'm putting my workaround below in case it is helpful:

```python
def reorder_dims(darray, dim1, dim2):
    """Interchange two dimensions of a DataArray in a similar way as numpy's swapaxes."""
    dims = list(darray.dims)
    assert set([dim1, dim2]).issubset(dims), 'dim1 and dim2 must be existing dimensions in darray'
    ind1, ind2 = dims.index(dim1), dims.index(dim2)
    dims[ind2], dims[ind1] = dims[ind1], dims[ind2]
    return darray.transpose(*dims)
```
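
For example, applied to a hypothetical three-dimensional array:

```python
import numpy as np
import xarray as xr

darray = xr.DataArray(np.zeros((2, 3, 4)), dims=("time", "latitude", "longitude"))
swapped = reorder_dims(darray, "latitude", "longitude")
print(swapped.dims)  # ('time', 'longitude', 'latitude')
```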

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Transpose some but not all dimensions 187393785
295993132 https://github.com/pydata/xarray/issues/1379#issuecomment-295993132 https://api.github.com/repos/pydata/xarray/issues/1379 MDEyOklzc3VlQ29tbWVudDI5NTk5MzEzMg== rafa-guedes 7799184 2017-04-21T00:54:28Z 2017-04-21T10:05:27Z CONTRIBUTOR

I realised that some of the Datasets I was trying to concatenate had different coordinate values (for coordinates that I was assuming to be the same), so I guess `xr.concat` was trying to align these coordinates before concatenating, and the resulting Dataset ended up much larger than it should have been. When I ensure I only concatenate Datasets with consistent coordinates, it works.

However, resource consumption is still quite high compared to doing the same thing with numpy arrays: memory increased by 42% using `xr.concat` (against 6% using `np.concatenate`), and the whole processing took about four times longer.
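
A hedged note for later readers: newer xarray releases expose `concat` options that skip the alignment work when the coordinates are already known to agree (`datasets` below is a hypothetical list of such Datasets):

```python
import xarray as xr

# "override" reuses the first dataset's coordinate values instead of
# checking/aligning them, and "minimal" avoids concatenating coordinates
# that do not carry the "time" dimension.
dset_concat = xr.concat(
    datasets, dim="time", coords="minimal", join="override", compat="override"
)
```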

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xr.concat consuming too much resources 223231729
295970641 https://github.com/pydata/xarray/issues/1379#issuecomment-295970641 https://api.github.com/repos/pydata/xarray/issues/1379 MDEyOklzc3VlQ29tbWVudDI5NTk3MDY0MQ== rafa-guedes 7799184 2017-04-20T23:41:38Z 2017-04-20T23:41:38Z CONTRIBUTOR

Also, reading all Datasets into a list and then concatenating the whole list at once blows up memory as well.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xr.concat consuming too much resources 223231729
292853553 https://github.com/pydata/xarray/issues/1366#issuecomment-292853553 https://api.github.com/repos/pydata/xarray/issues/1366 MDEyOklzc3VlQ29tbWVudDI5Mjg1MzU1Mw== rafa-guedes 7799184 2017-04-10T05:32:29Z 2017-04-10T05:32:29Z CONTRIBUTOR

That makes sense, thanks for explaining @shoyer.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Setting attributes to multi-index coordinate 220533356
289321422 https://github.com/pydata/xarray/issues/1324#issuecomment-289321422 https://api.github.com/repos/pydata/xarray/issues/1324 MDEyOklzc3VlQ29tbWVudDI4OTMyMTQyMg== rafa-guedes 7799184 2017-03-26T22:25:25Z 2017-03-26T22:25:25Z CONTRIBUTOR

Thanks!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Choose time units in output netcdf 216626776
202631361 https://github.com/pydata/xarray/pull/806#issuecomment-202631361 https://api.github.com/repos/pydata/xarray/issues/806 MDEyOklzc3VlQ29tbWVudDIwMjYzMTM2MQ== rafa-guedes 7799184 2016-03-28T23:52:52Z 2016-03-28T23:52:52Z CONTRIBUTOR

:+1: nice one

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Decorators for registering custom accessors in xarray 143877458
177056825 https://github.com/pydata/xarray/issues/733#issuecomment-177056825 https://api.github.com/repos/pydata/xarray/issues/733 MDEyOklzc3VlQ29tbWVudDE3NzA1NjgyNQ== rafa-guedes 7799184 2016-01-30T03:25:03Z 2016-01-30T03:25:03Z CONTRIBUTOR

I personally find it useful, though it is perhaps not very intuitive that the behaviour changes depending on whether attrs are defined for that coordinate variable. I agree some documentation on this would definitely be helpful!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  coordinate variable not written in netcdf file in some cases 129630652
176542303 https://github.com/pydata/xarray/issues/728#issuecomment-176542303 https://api.github.com/repos/pydata/xarray/issues/728 MDEyOklzc3VlQ29tbWVudDE3NjU0MjMwMw== rafa-guedes 7799184 2016-01-29T02:48:17Z 2016-01-29T02:48:17Z CONTRIBUTOR

Thanks @shoyer, that works (:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Cannot inherit DataArray anymore in 0.7 release 128980804
176485011 https://github.com/pydata/xarray/issues/728#issuecomment-176485011 https://api.github.com/repos/pydata/xarray/issues/728 MDEyOklzc3VlQ29tbWVudDE3NjQ4NTAxMQ== rafa-guedes 7799184 2016-01-28T23:44:58Z 2016-01-28T23:44:58Z CONTRIBUTOR

Thanks @shoyer, what do you mean by preserving the signature of `DataArray.__init__`, please?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Cannot inherit DataArray anymore in 0.7 release 128980804
175528287 https://github.com/pydata/xarray/pull/726#issuecomment-175528287 https://api.github.com/repos/pydata/xarray/issues/726 MDEyOklzc3VlQ29tbWVudDE3NTUyODI4Nw== rafa-guedes 7799184 2016-01-27T10:16:40Z 2016-01-27T10:16:40Z CONTRIBUTOR

Good point, done.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Make import error of tokenize more explicit 128749355
170173475 https://github.com/pydata/xarray/issues/706#issuecomment-170173475 https://api.github.com/repos/pydata/xarray/issues/706 MDEyOklzc3VlQ29tbWVudDE3MDE3MzQ3NQ== rafa-guedes 7799184 2016-01-09T00:59:14Z 2016-01-09T00:59:14Z CONTRIBUTOR

Cool, thanks @shoyer. Yes @rabernat, I totally agree with you and would be very keen to collaborate on a library like that; I think it would be useful for many people.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Subclassing Dataset and DataArray 124915222
169860884 https://github.com/pydata/xarray/issues/682#issuecomment-169860884 https://api.github.com/repos/pydata/xarray/issues/682 MDEyOklzc3VlQ29tbWVudDE2OTg2MDg4NA== rafa-guedes 7799184 2016-01-08T01:27:52Z 2016-01-08T01:27:52Z CONTRIBUTOR

See #709

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  to_netcdf: not able to set dtype encoding with netCDF4 backend 123384529
165520642 https://github.com/pydata/xarray/issues/681#issuecomment-165520642 https://api.github.com/repos/pydata/xarray/issues/681 MDEyOklzc3VlQ29tbWVudDE2NTUyMDY0Mg== rafa-guedes 7799184 2015-12-17T17:24:11Z 2015-12-17T17:24:11Z CONTRIBUTOR

I had that happening with Python 2 as well, though just for netCDF4 files, because of the new string type I guess. When writing as netCDF4-classic, that string output was not shown.
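
For reference, a hedged sketch of forcing the classic data model when writing (file names are hypothetical):

```python
import xarray as xr

ds = xr.open_dataset("input.nc")  # hypothetical file

# NETCDF4_CLASSIC keeps the classic data model (no variable-length string
# type) inside an HDF5 container, which avoids the "string" qualifier above.
ds.to_netcdf("output.nc", format="NETCDF4_CLASSIC")
```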

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  to_netcdf on Python 3: "string" qualifier on attributes  122776511
157576363 https://github.com/pydata/xarray/issues/660#issuecomment-157576363 https://api.github.com/repos/pydata/xarray/issues/660 MDEyOklzc3VlQ29tbWVudDE1NzU3NjM2Mw== rafa-guedes 7799184 2015-11-18T02:24:52Z 2015-11-18T02:24:52Z CONTRIBUTOR

Yes it is, @shoyer!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  time slice cannot be list  117262604
157572531 https://github.com/pydata/xarray/issues/662#issuecomment-157572531 https://api.github.com/repos/pydata/xarray/issues/662 MDEyOklzc3VlQ29tbWVudDE1NzU3MjUzMQ== rafa-guedes 7799184 2015-11-18T02:00:07Z 2015-11-18T02:00:07Z CONTRIBUTOR

Awesome, works here too with netCDF4==1.2.1

Thanks!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Problem with checking in Variable._parse_dimensions() (xray.core.variable) 117478779
157570185 https://github.com/pydata/xarray/issues/662#issuecomment-157570185 https://api.github.com/repos/pydata/xarray/issues/662 MDEyOklzc3VlQ29tbWVudDE1NzU3MDE4NQ== rafa-guedes 7799184 2015-11-18T01:43:01Z 2015-11-18T01:43:01Z CONTRIBUTOR

Hum... OK, I will try that on another machine too. The versions are:

```
pandas==0.17.0
netCDF4==1.1.1
scipy==0.15.1
numpy==1.10.1
xray==0.6.1-15-g5109f4f
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Problem with checking in Variable._parse_dimensions() (xray.core.variable) 117478779
157567446 https://github.com/pydata/xarray/issues/662#issuecomment-157567446 https://api.github.com/repos/pydata/xarray/issues/662 MDEyOklzc3VlQ29tbWVudDE1NzU2NzQ0Ng== rafa-guedes 7799184 2015-11-18T01:26:01Z 2015-11-18T01:26:01Z CONTRIBUTOR

@shoyer I'm sending you by email (I was not able to attach it here) a stripped-down version of one of the files I was using. The code below should reproduce the issue:

```python
import xray

dset = xray.open_dataset('hycom_example.nc', decode_times=False)
ncvar = 'water_u'
dset_sliced = xray.Dataset()
slice_dict = {u'lat': [-30], u'lon': [0]}
dset_sliced[ncvar] = dset[ncvar].sel(method='nearest', **slice_dict)
dset_sliced.to_netcdf()
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Problem with checking in Variable._parse_dimensions() (xray.core.variable) 117478779
157564974 https://github.com/pydata/xarray/issues/662#issuecomment-157564974 https://api.github.com/repos/pydata/xarray/issues/662 MDEyOklzc3VlQ29tbWVudDE1NzU2NDk3NA== rafa-guedes 7799184 2015-11-18T01:09:00Z 2015-11-18T01:09:00Z CONTRIBUTOR

@maximilianr I have managed to reproduce this with a different file with a different number of dimensions (time, latitude, longitude). So I believe the example below should give the same problem if you run it on some other file and change the variable / dimension names accordingly:

```python
ncvar = 'hs'
dset_sliced1 = xray.Dataset()
dset_sliced2 = xray.Dataset()
dset = xray.open_dataset(filename, decode_times=False)
slice_dict1 = {u'latitude': [-30], u'longitude': [0], u'time': [2.83996800e+08, 2.84007600e+08]}
dset_sliced1[ncvar] = dset[ncvar].sel(method='nearest', **slice_dict1)
slice_dict2 = {u'latitude': [-30], u'longitude': [0], u'time': [2.84018400e+08, 2.84029200e+08]}
dset_sliced2[ncvar] = dset[ncvar].sel(method='nearest', **slice_dict2)

dset_sliced1.to_netcdf('test.nc')  # This fails
xray.concat([dset_sliced1, dset_sliced2], dim='time')  # This also fails, same error
```

Traceback:

```
----> 1 xray.concat([dset_sliced1, dset_sliced2], dim='time')  # This also fails

/source/xray/xray/core/combine.pyc in concat(objs, dim, data_vars, coords, compat, positions, indexers, mode, concat_over)
    113         raise TypeError('can only concatenate xray Dataset and DataArray '
    114                         'objects')
--> 115     return f(objs, dim, data_vars, coords, compat, positions)
    116
    117

/source/xray/xray/core/combine.pyc in _dataset_concat(datasets, dim, data_vars, coords, compat, positions)
    265     for k in concat_over:
    266         vars = ensure_common_dims([ds.variables[k] for ds in datasets])
--> 267         combined = Variable.concat(vars, dim, positions)
    268         insert_result_variable(k, combined)
    269

/source/xray/xray/core/variable.pyc in concat(cls, variables, dim, positions, shortcut)
    711             utils.remove_incompatible_items(attrs, var.attrs)
    712
--> 713         return cls(dims, data, attrs)
    714
    715     def _data_equals(self, other):

/source/xray/xray/core/variable.pyc in __init__(self, dims, data, attrs, encoding, fastpath)
    194         """
    195         self._data = _as_compatible_data(data, fastpath=fastpath)
--> 196         self._dims = self._parse_dimensions(dims)
    197         self._attrs = None
    198         self._encoding = None

/source/xray/xray/core/variable.pyc in _parse_dimensions(self, dims)
    302             raise ValueError('dimensions %s must have the same length as the '
    303                              'number of data dimensions, ndim=%s'
--> 304                              % (dims, self.ndim))
    305         return dims
    306

ValueError: dimensions (u'time', u'latitude', u'longitude') must have the same length as the number of data dimensions, ndim=2
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Problem with checking in Variable._parse_dimensions() (xray.core.variable) 117478779
157559691 https://github.com/pydata/xarray/issues/662#issuecomment-157559691 https://api.github.com/repos/pydata/xarray/issues/662 MDEyOklzc3VlQ29tbWVudDE1NzU1OTY5MQ== rafa-guedes 7799184 2015-11-18T00:43:32Z 2015-11-18T00:43:32Z CONTRIBUTOR

I was concatenating them as:

```python
dset_concat = xray.concat([ds1, ds2], dim='time')
```

Trying to dump any of them to netcdf:

```python
ds1.to_netcdf('test.nc')
```

would also yield the same problem.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Problem with checking in Variable._parse_dimensions() (xray.core.variable) 117478779

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · About: xarray-datasette