issues

8 rows where user = 23618263 sorted by updated_at descending


id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
690518703 MDU6SXNzdWU2OTA1MTg3MDM= 4399 Dask gufunc kwarg "output_sizes" is not deep copied griverat 23618263 closed 0     2 2020-09-01T23:41:47Z 2020-09-04T15:57:19Z 2020-09-04T15:57:19Z CONTRIBUTOR      

What happened:

Defining the kwargs used in xr.apply_ufunc in a separate dictionary and reusing it across multiple calls of the method, while using dask="parallelized", results in an error because the dimension names in output_sizes (inside dask_gufunc_kwargs) are modified internally.

What you expected to happen:

The kwargs dictionary should remain unmodified between calls.

Minimal Complete Verifiable Example:

```python
import numpy as np
import xarray as xr


def dummy1(data, nfft):
    return data[..., (nfft // 2) + 1 :] * 2


def dummy2(data, nfft):
    return data[..., (nfft // 2) + 1 :] / 2


def xoperations(xarr, **kwargs):
    ufunc_kwargs = dict(
        kwargs=kwargs,
        input_core_dims=[["time"]],
        output_core_dims=[["freq"]],
        dask="parallelized",
        output_dtypes=[np.float],
        dask_gufunc_kwargs=dict(output_sizes={"freq": int(kwargs["nfft"] / 2) + 1}),
    )

    ans1 = xr.apply_ufunc(dummy1, xarr, **ufunc_kwargs)
    ans2 = xr.apply_ufunc(dummy2, xarr, **ufunc_kwargs)

    return ans1, ans2


test = xr.DataArray(
    4, coords=[("time", np.arange(1000)), ("lon", np.arange(160, 300, 10))]
).chunk({"time": -1, "lon": 10})

xoperations(test, nfft=1024)
```

This returns

```
ValueError                                Traceback (most recent call last)
<ipython-input-1-822bd3b2d4da> in <module>
     32 ).chunk({"time": -1, "lon": 10})
     33
---> 34 xoperations(test, nfft=1024)

<ipython-input-1-822bd3b2d4da> in xoperations(xarr, **kwargs)
     23
     24     ans1 = xr.apply_ufunc(dummy1, xarr, **ufunc_kwargs)
---> 25     ans2 = xr.apply_ufunc(dummy2, xarr, **ufunc_kwargs)
     26
     27     return ans1, ans2

~/GitLab/xarray_test/xarray/xarray/core/computation.py in apply_ufunc(func, input_core_dims, output_core_dims, exclude_dims, vectorize, join, dataset_join, dataset_fill_value, keep_attrs, kwargs, dask, output_dtypes, output_sizes, meta, dask_gufunc_kwargs, *args)
   1086             join=join,
   1087             exclude_dims=exclude_dims,
-> 1088             keep_attrs=keep_attrs,
   1089         )
   1090     # feed Variables directly through apply_variable_ufunc

~/GitLab/xarray_test/xarray/xarray/core/computation.py in apply_dataarray_vfunc(func, signature, join, exclude_dims, keep_attrs, *args)
    260
    261     data_vars = [getattr(a, "variable", a) for a in args]
--> 262     result_var = func(*data_vars)
    263
    264     if signature.num_outputs > 1:

~/GitLab/xarray_test/xarray/xarray/core/computation.py in apply_variable_ufunc(func, signature, exclude_dims, dask, output_dtypes, vectorize, keep_attrs, dask_gufunc_kwargs, *args)
    632             if key not in signature.all_output_core_dims:
    633                 raise ValueError(
--> 634                     f"dimension '{key}' in 'output_sizes' must correspond to output_core_dims"
    635                 )
    636             output_sizes_renamed[signature.dims_map[key]] = value

ValueError: dimension 'dim0' in 'output_sizes' must correspond to output_core_dims
```

This is easy to verify by sneaking a print statement before and after the first call to apply_ufunc. Everything stays the same except the dimension names in output_sizes:

```python
{'kwargs': {'nfft': 1024}, 'input_core_dims': [['time']], 'output_core_dims': [['freq']], 'dask': 'parallelized', 'output_dtypes': [<class 'float'>], 'dask_gufunc_kwargs': {'output_sizes': {'freq': 513}}}
{'kwargs': {'nfft': 1024}, 'input_core_dims': [['time']], 'output_core_dims': [['freq']], 'dask': 'parallelized', 'output_dtypes': [<class 'float'>], 'dask_gufunc_kwargs': {'output_sizes': {'dim0': 513}}}
```

Anything else we need to know?:

I have a fork with a fix ready to be sent as a PR. I just imported the copy module and used deepcopy like this:

```python
dask_gufunc_kwargs = copy.deepcopy(dask_gufunc_kwargs)
```

around here:

https://github.com/pydata/xarray/blob/2acd0fc6563c3ad57f16e6ee804d592969419938/xarray/core/computation.py#L1013-L1020

If it's good enough then I can send the PR.
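For context, here is a minimal standalone sketch of the pattern behind this fix. The `_rename_output_sizes` helper is hypothetical and simplified, not xarray's actual code; it only illustrates how copying the incoming dict before renaming its keys keeps the caller's dask_gufunc_kwargs untouched across repeated apply_ufunc calls.

```python
import copy


def _rename_output_sizes(dask_gufunc_kwargs, dims_map):
    # Hypothetical, simplified stand-in for what apply_variable_ufunc does:
    # rewrite the user-facing dim names in ``output_sizes`` to the internal
    # signature names (e.g. "freq" -> "dim0").
    dask_gufunc_kwargs = copy.deepcopy(dask_gufunc_kwargs)  # the proposed fix
    output_sizes = dask_gufunc_kwargs.get("output_sizes", {})
    dask_gufunc_kwargs["output_sizes"] = {
        dims_map[dim]: size for dim, size in output_sizes.items()
    }
    return dask_gufunc_kwargs


user_kwargs = {"output_sizes": {"freq": 513}}
_rename_output_sizes(user_kwargs, {"freq": "dim0"})
print(user_kwargs)  # {'output_sizes': {'freq': 513}} -- unchanged thanks to the copy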

Environment:

Output of <tt>xr.show_versions()</tt>

INSTALLED VERSIONS
------------------
commit: 2acd0fc6563c3ad57f16e6ee804d592969419938
python: 3.7.8 | packaged by conda-forge | (default, Jul 31 2020, 02:25:08) [GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 3.12.74-60.64.40-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.7.4

xarray: 0.14.2.dev337+g2acd0fc6
pandas: 1.1.1
numpy: 1.19.1
scipy: 1.5.2
netCDF4: 1.5.3
pydap: installed
h5netcdf: 0.8.1
h5py: 2.10.0
Nio: 1.5.5
zarr: 2.4.0
cftime: 1.2.1
nc_time_axis: 1.2.0
PseudoNetCDF: installed
rasterio: 1.1.5
cfgrib: 0.9.8.4
iris: 2.4.0
bottleneck: 1.3.2
dask: 2.25.0
distributed: 2.25.0
matplotlib: 3.3.1
cartopy: 0.18.0
seaborn: 0.10.1
numbagg: installed
pint: 0.15
setuptools: 49.6.0.post20200814
pip: 20.2.2
conda: None
pytest: 6.0.1
IPython: 7.18.1
sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4399/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
690769335 MDExOlB1bGxSZXF1ZXN0NDc3NjEzNTQ5 4402 Use a copy of dask_gufuc_kwargs griverat 23618263 closed 0     1 2020-09-02T06:52:07Z 2020-09-04T15:57:19Z 2020-09-04T15:57:19Z CONTRIBUTOR   0 pydata/xarray/pulls/4402
  • [x] Closes #4399
  • [ ] Tests added
  • [x] Passes isort . && black . && mypy . && flake8
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [ ] New functions/methods are listed in api.rst

Following the suggestion of @kmuehlbauer, I am using just a shallow copy since a deep one is not required.
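As a hedged illustration of why a shallow copy is enough here (assuming, as in the sketch above, that the renamed output_sizes mapping is built as a new dict and assigned back rather than mutated in place):

```python
user_kwargs = {"output_sizes": {"freq": 513}}

internal = user_kwargs.copy()              # shallow copy of the outer dict
internal["output_sizes"] = {"dim0": 513}   # new inner dict is assigned, not mutated

print(user_kwargs)  # {'output_sizes': {'freq': 513}} -- still intact
print(internal)     # {'output_sizes': {'dim0': 513}}
```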

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4402/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
537144956 MDExOlB1bGxSZXF1ZXN0MzUyNTUzNDkz 3615 Minor docstring fixes griverat 23618263 closed 0     1 2019-12-12T18:33:27Z 2019-12-12T19:13:41Z 2019-12-12T18:48:50Z CONTRIBUTOR   0 pydata/xarray/pulls/3615

Really minor docstring fixes: added an 's' at the end of "kwarg" and deleted "Default n = 5" from the thin method's docstring, since it doesn't have a default value.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3615/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
493765130 MDExOlB1bGxSZXF1ZXN0MzE3NjU2NzEz 3309 Fix DataArray api doc griverat 23618263 closed 0     2 2019-09-15T17:36:46Z 2019-09-15T21:22:30Z 2019-09-15T20:27:31Z CONTRIBUTOR   0 pydata/xarray/pulls/3309

It seems I forgot to point head, tail and thin to the right place in the DataArray API documentation.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3309/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
491324262 MDExOlB1bGxSZXF1ZXN0MzE1NzE0MjYz 3298 Accept int value in head, thin and tail griverat 23618263 closed 0     7 2019-09-09T21:00:41Z 2019-09-15T07:05:58Z 2019-09-14T21:46:16Z CONTRIBUTOR   0 pydata/xarray/pulls/3298

Related #3278. This PR makes the methods head, thin and tail for both DataArray and Dataset accept a single integer value as a parameter. If no parameter is given, it defaults to 5 (see the usage sketch after the checklist).

  • [x] Tests added
  • [x] Passes black . && mypy . && flake8
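A quick usage sketch of the behaviour this PR enables; the array and argument values are illustrative only:

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(100), dims="time")

da.head(3)       # first 3 elements along every dimension (single-int form from this PR)
da.tail(3)       # last 3 elements
da.thin(10)      # every 10th element
da.head(time=3)  # the per-dimension keyword form still works
da.head()        # no argument: defaults to 5
```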

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3298/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
488812619 MDExOlB1bGxSZXF1ZXN0MzEzNzU3MjIw 3278 Add head, tail and thin methods griverat 23618263 closed 0     6 2019-09-03T20:41:42Z 2019-09-05T05:49:36Z 2019-09-05T04:22:24Z CONTRIBUTOR   0 pydata/xarray/pulls/3278

I feel like there's room for improvement in the docstrings, any change or suggestion is welcome!

  • [x] Closes #319
  • [x] Tests added
  • [x] Passes black . && mypy . && flake8
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3278/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
486153978 MDExOlB1bGxSZXF1ZXN0MzExNjYzOTg2 3271 Raise proper error for scalar array when coords is a dict griverat 23618263 closed 0     3 2019-08-28T04:29:02Z 2019-08-29T17:23:20Z 2019-08-29T17:09:00Z CONTRIBUTOR   0 pydata/xarray/pulls/3271

As explained in https://github.com/pydata/xarray/pull/3159#discussion_r316230281, when a user builds a DataArray from a scalar array with coords given as a dictionary, the error is not self-explanatory:

```python
xr.DataArray(np.array(1), coords={'x': np.arange(4), 'y': 'a'}, dims=['x'])
...
KeyError: 'x'
```

This PR makes sure that when the data is a scalar array and dims is not empty, the inferred shape is set to (0,) so that construction fails with the proper error message:

```python
xr.DataArray(np.array(1), coords={'x': np.arange(4), 'y': 'a'}, dims=['x'])
...
ValueError: conflicting sizes for dimension 'x': length 0 on the data but length 4 on coordinate 'x'
```
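A rough sketch of the idea, using a hypothetical `_infer_shape` helper rather than xarray's actual constructor code: if the data is 0-dimensional but dims were requested, report a zero-length shape so the downstream size-consistency check raises the informative ValueError instead of a bare KeyError.

```python
import numpy as np


def _infer_shape(data, dims):
    # Hypothetical helper mirroring the PR's trick: treat a scalar array with
    # non-empty dims as shape (0,) so the size check fires with a clear message.
    data = np.asarray(data)
    if data.ndim == 0 and len(dims) != 0:
        return (0,)
    return data.shape


print(_infer_shape(np.array(1), dims=["x"]))   # (0,) -> triggers the size check
print(_infer_shape(np.ones((4,)), dims=["x"])) # (4,) -> normal path
```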

  • [x] Test updated
  • [x] Passes black . && mypy . && flake8
  • [ ] Fully documented, including whats-new.rst for all changes and api.rst for new API (is this needed for a change like this?)
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3271/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
472100381 MDExOlB1bGxSZXF1ZXN0MzAwNTc2Nzc4 3159 Initialize empty or full DataArray griverat 23618263 closed 0     20 2019-07-24T06:21:50Z 2019-08-27T16:28:04Z 2019-08-26T20:36:36Z CONTRIBUTOR   0 pydata/xarray/pulls/3159

I attempted to implement what was asked for in #277 as an effort to contribute to this project. This PR adds the ability to initialize a DataArray with a constant value, including np.nan. Also, if data=None it is initialized via np.empty to take advantage of its speed for big arrays (a minimal sketch of this logic follows the example below).

```python
foo = xr.DataArray(None, coords=[range(3), range(4)])
foo
<xarray.DataArray (dim_0: 3, dim_1: 4)>
array([[4.673257e-310, 0.000000e+000, 0.000000e+000, 0.000000e+000],
       [0.000000e+000, 0.000000e+000, 0.000000e+000, 0.000000e+000],
       [0.000000e+000, 0.000000e+000, 0.000000e+000, 0.000000e+000]])
Coordinates:
  * dim_0    (dim_0) int64 0 1 2
  * dim_1    (dim_1) int64 0 1 2 3
```
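A hedged, standalone sketch of the fill logic described above; `_fill_from_shape` is a hypothetical helper, not the actual DataArray constructor. A scalar fill value (including np.nan) is broadcast with np.full, while None falls back to np.empty.

```python
import numpy as np


def _fill_from_shape(value, shape):
    # Hypothetical helper mirroring the PR's description: None -> uninitialized
    # np.empty (fast for big arrays), anything else -> constant-filled array.
    if value is None:
        return np.empty(shape)
    return np.full(shape, value)


print(_fill_from_shape(np.nan, (3, 4)))       # 3x4 array of NaN
print(_fill_from_shape(None, (3, 4)).shape)   # (3, 4), contents unspecified
```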

  • [x] Closes #878, #277
  • [x] Tests added
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API

Regarding the tests, I am not sure how to test the creation of an empty DataArray with data=None, since the values change between calls of np.empty. That is why I only added a test for the constant value.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3159/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);