issues

12 rows where comments = 4, repo = 13221727 (xarray) and user = 6213168 (crusaderky), sorted by updated_at descending

Summary: 6 issues and 6 pull requests; 11 closed, 1 open; all 12 rows in the xarray repo.

issue #2027 · square-bracket slice a Dataset with a DataArray
open · crusaderky (MEMBER) · 4 comments · created 2018-03-29T09:39:57Z · updated 2022-04-18T03:51:25Z

Given this:

```python
ds = xarray.Dataset(
    data_vars={
        'vote': ('pupil', [5, 7, 8]),
        'age': ('pupil', [15, 14, 16])
    },
    coords={
        'pupil': ['Alice', 'Bob', 'Charlie']
    })

<xarray.Dataset>
Dimensions:  (pupil: 3)
Coordinates:
  * pupil    (pupil) <U7 'Alice' 'Bob' 'Charlie'
Data variables:
    vote     (pupil) int64 5 7 8
    age      (pupil) int64 15 14 16
```

Why does this work:

```python
ds.age[ds.vote >= 6]

<xarray.DataArray 'age' (pupil: 2)>
array([14, 16])
Coordinates:
  * pupil    (pupil) <U7 'Bob' 'Charlie'
```

But this doesn't?

```python
ds[ds.vote >= 6]

KeyError: False
```

`ds.vote >= 6` is a DataArray with dims=('pupil',) and dtype=bool, so I can't think of any ambiguity in what I want to achieve.

Workaround:

```python
ds.sel(pupil=ds.vote >= 6)

<xarray.Dataset>
Dimensions:  (pupil: 2)
Coordinates:
  * pupil    (pupil) <U7 'Bob' 'Charlie'
Data variables:
    vote     (pupil) int64 7 8
    age      (pupil) int64 14 16
```
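
As an aside, a similar result can be spelled with boolean masking via where(); a minimal sketch, reusing the ds defined above:

```python
# Boolean masking with drop=True removes the filtered-out pupils entirely,
# similar to the sel() workaround above (though where() may promote dtypes).
ds.where(ds.vote >= 6, drop=True)
```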

pull #4297 · Lazily load resource files (pydata/xarray/pulls/4297)
closed · crusaderky (MEMBER) · assignee: crusaderky · 4 comments · created 2020-08-01T21:31:36Z · updated 2020-09-22T05:32:38Z · closed 2020-08-02T07:05:15Z
  • Marginal speed-up and RAM footprint reduction when not running in Jupyter Notebook
  • Closes #4294
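
For reference, the lazy-loading pattern involved, as a rough sketch (the file names and function name are assumptions, not necessarily the PR's exact diff):

```python
from functools import lru_cache
import pkg_resources

# Assumed resource paths for the HTML repr's icons/CSS; illustrative only.
STATIC_FILES = ("static/html/icons-svg-inline.html", "static/css/style.css")

@lru_cache(None)
def _load_static_files():
    """Read the resource files on first use instead of at import time."""
    return [
        pkg_resources.resource_string("xarray", fname).decode("utf8")
        for fname in STATIC_FILES
    ]
```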

pull #3703 · hardcoded xarray.__all__ (pydata/xarray/pulls/3703)
closed · crusaderky (MEMBER) · 4 comments · created 2020-01-17T17:09:45Z · updated 2020-01-18T00:58:06Z · closed 2020-01-17T20:42:25Z

Closes #3695

pull #3705 · One-off isort run (pydata/xarray/pulls/3705)
closed · crusaderky (MEMBER) · 4 comments · created 2020-01-17T17:36:10Z · updated 2020-01-17T22:59:26Z · closed 2020-01-17T21:00:24Z

pull #3429 · minor lint tweaks (pydata/xarray/pulls/3429)
closed · crusaderky (MEMBER) · 4 comments · created 2019-10-22T09:15:03Z · updated 2019-10-24T12:53:24Z · closed 2019-10-24T12:53:21Z
  • Ran pyflakes 2.1.1
  • Some f-string tweaks
  • Ran black -t py36
  • Ran mypy 0.740. We'll need to skip it and jump directly to 0.750 once it's released because of https://github.com/python/mypy/issues/7735

issue #3397 · "How Do I..." formatting issues
closed (completed) · crusaderky (MEMBER) · 4 comments · created 2019-10-14T21:32:27Z · updated 2019-10-16T21:41:06Z · closed 2019-10-16T21:41:06Z

@dcherian The new page http://xarray.pydata.org/en/stable/howdoi.html (#3357) is somewhat painful to read on readthedocs: the table overflows the screen, and one is forced to scroll left and right non-stop.

Maybe a better alternative would be the Sphinx definition list syntax, which allows for automatic reflowing?

```rst
How do I ...
============

Add variables from other datasets to my dataset?
    :py:meth:`Dataset.merge`
```

(the answer line is indented by 4 spaces)

pull #3202 · chunk sparse arrays (pydata/xarray/pulls/3202)
closed · crusaderky (MEMBER) · assignee: crusaderky · 4 comments · created 2019-08-11T11:19:16Z · updated 2019-08-12T21:02:31Z · closed 2019-08-12T21:02:25Z

Closes #3191

@shoyer I completely disabled wrapping in ImplicitToExplicitIndexingAdapter for sparse arrays, cupy arrays, etc. I'm not sure if that's desirable; the chief problem is that I don't think I understand the purpose of ImplicitToExplicitIndexingAdapter to begin with. Some enlightenment would be appreciated.
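
For context, a minimal sketch of what chunking a sparse array looks like from the user side (assumes the pydata/sparse package; illustrative, not taken from the PR's tests):

```python
import numpy as np
import sparse
import xarray

# Wrap a sparse COO array in a DataArray, then chunk it with dask; the goal
# of the PR is for the dask chunks to stay sparse rather than being
# densified on access.
s = sparse.COO.from_numpy(np.eye(4))
da = xarray.DataArray(s, dims=["x", "y"]).chunk(2)
print(type(da.data))  # dask.array.Array
```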

issue #926 · stack() on dask array produces inefficient chunking
closed (completed) · crusaderky (MEMBER) · 4 comments · created 2016-07-30T14:12:34Z · updated 2019-02-01T16:04:43Z · closed 2019-02-01T16:04:43Z

When the stack() method is used on an xarray object with a dask backend, one would expect every output chunk to be produced by exactly one input chunk.

This is not the case, as stack() actually produces an extremely fragmented dask array: https://gist.github.com/crusaderky/07991681d49117bfbef7a8870e3cba67
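
A small illustration of the report (not the gist itself):

```python
import numpy as np
import xarray

# Two 50x50 input chunks per dimension...
a = xarray.DataArray(np.zeros((100, 100)), dims=["x", "y"]).chunk(50)
print(a.chunks)  # ((50, 50), (50, 50))
# ...but, per the report, the stacked result is fragmented into many small
# chunks instead of one output chunk per input chunk.
print(a.stack(z=("x", "y")).chunks)
```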

issue #979 · align() should align chunks
closed (completed) · crusaderky (MEMBER) · 4 comments · created 2016-08-20T21:25:01Z · updated 2019-01-24T17:19:30Z · closed 2019-01-24T17:19:30Z

In the xarray docs I read:

> With the current version of dask, there is no automatic alignment of chunks when performing operations between dask arrays with different chunk sizes. If your computation involves multiple dask arrays with different chunks, you may need to explicitly rechunk each array to ensure compatibility.

While chunk auto-alignment could be done within the dask library, that would be limited to arrays with the same dimensionality and the same dims order. For example, it would not be possible for a dask library call to align the chunks of xarrays with the following dims, even though it makes perfect sense in xarray:

  • (time, latitude, longitude)
  • (time,)
  • (longitude, latitude)

I think xarray.align() should take care of it automatically.

A safe algorithm would be to always scale down the chunk size when in conflict. This prevents producing chunks larger than expected and should minimise (in a greedy way) the number of operations. It's also a good idea on dask.distributed, where merging two chunks could cause one of them to travel over the network, which is very expensive.

E.g. to reconcile the chunk sizes a: (5, 10, 6) and b: (5, 7, 9), the algorithm would rechunk both arrays to (5, 7, 3, 6).

Finally, when served with a mix of numpy-based and dask-based arrays, align() should convert the numpy arrays to dask. The critical use case that would benefit from this behaviour is when align() is invoked inside a broadcast() between a tiny constant you just loaded from csv/pandas/pure python list/whatever (e.g. dims=(time,), shape=(100,)) and a huge dask-backed array (e.g. dims=(time, scenario), shape=(100, 2**30), chunks=(25, 2**20)).
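
A minimal sketch of the scale-down reconciliation for a single dimension (a hypothetical helper, not xarray API): split every chunk at the union of both chunkings' boundaries.

```python
from itertools import accumulate

def reconcile_chunks(a_chunks, b_chunks):
    # Union of the cumulative chunk boundaries of both arrays...
    boundaries = sorted(set(accumulate(a_chunks)) | set(accumulate(b_chunks)))
    # ...turned back into chunk sizes: no chunk ever grows, some get split.
    return tuple(b - a for a, b in zip([0] + boundaries[:-1], boundaries))

print(reconcile_chunks((5, 10, 6), (5, 7, 9)))  # (5, 7, 3, 6), as above
```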

issue #2273 · to_netcdf uses deprecated and unnecessary dask call
closed (completed) · crusaderky (MEMBER) · 4 comments · created 2018-07-09T21:20:20Z · updated 2018-07-31T20:03:41Z · closed 2018-07-31T19:42:20Z

```python
>>> ds = xarray.Dataset({'x': 1})
>>> ds.to_netcdf('foo.nc')
dask/utils.py:1010: UserWarning: Deprecated, see dask.base.get_scheduler instead
```

Stack trace:

```
xarray/backends/common.py(44)get_scheduler()
     43         from dask.utils import effective_get
---> 44         actual_get = effective_get(get, collection)
```

There are two separate problems here:

  • dask recently changed its API from get(get=callable) to get(scheduler=str). Should we:
      • just increase the minimum version of dask (I doubt anybody will complain),
      • go through the hoops of dynamically invoking a different API depending on the dask version :sweat: (a rough sketch of this option follows the list), or
      • silence the warning now, and then increase the minimum version of dask the day that dask removes the old API entirely (risky)?
  • xarray is calling dask even when it's unnecessary, as none of the variables in the example Dataset has a dask backend. I don't think there are any CI suites for NetCDF without dask. I'm also wondering if they would bring any actual added value, as dask is small, has no exotic dependencies, and is pure Python, so I doubt anybody will have problems installing it whatever their setup is.
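
A rough sketch of the version-dispatch option (the 0.18.0 threshold for when dask.base.get_scheduler appeared is my assumption):

```python
import dask
from distutils.version import LooseVersion

# Pick the scheduler-introspection helper based on the installed dask
# version; note the two functions take slightly different arguments.
if LooseVersion(dask.__version__) >= LooseVersion("0.18.0"):
    from dask.base import get_scheduler
else:
    from dask.utils import effective_get as get_scheduler
```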

@shoyer opinion?

issue #978 · broadcast() broken on dask backend
closed (completed) · crusaderky (MEMBER) · 4 comments · created 2016-08-20T20:56:33Z · updated 2016-12-09T20:28:42Z · closed 2016-12-09T20:28:42Z

```python
>>> a = xarray.DataArray([1, 2]).chunk(1)
>>> a
<xarray.DataArray (dim_0: 2)>
dask.array<xarray-..., shape=(2,), dtype=int64, chunksize=(1,)>
Coordinates:
  * dim_0    (dim_0) int64 0 1
>>> xarray.broadcast(a)
(<xarray.DataArray (dim_0: 2)>
array([1, 2])
Coordinates:
  * dim_0    (dim_0) int64 0 1,)
```

The problem is actually somewhere in the constructor of DataArray. In alignment.py:362 we have `return DataArray(data, ...)`, where `data` is a Variable with a dask backend, yet the returned DataArray object has a numpy backend. As a workaround, changing that line to `return DataArray(data.data, ...)` (thus passing a dask array) fixes the problem.

After that, however, there's a new issue: whenever broadcast() adds a dimension to an array, it creates it as a single chunk, as opposed to copying the chunking of the other arrays. This can easily cause a host to go out of memory, and makes it harder to work with the arrays afterwards because chunks won't match.
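
A quick illustration of the second problem, reusing the a defined above (illustrative, assuming the behaviour reported here):

```python
# b brings a new, finely chunked dimension; per the report above, the
# dim_1 axis that broadcast() adds to `a` comes back as one monolithic
# chunk instead of copying b's chunking.
b = xarray.DataArray(list(range(1000)), dims=['dim_1']).chunk(10)
a2, b2 = xarray.broadcast(a, b)
print(a2.chunks)
```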

pull #963 · Align broadcast (pydata/xarray/pulls/963)
closed · crusaderky (MEMBER) · 4 comments · created 2016-08-11T20:55:29Z · updated 2016-08-14T23:25:02Z · closed 2016-08-14T23:24:15Z
  • Removed partial_align()
  • Added exclude and indexes optional parameters to align() public API
  • Added exclude optional parameter to broadcast() public API
  • Added various unit tests to check that align() and broadcast() do not perform needless data copies
  • Changed broadcast() to automatically align its inputs

Note: there is a failing unit test, TestDataset.test_broadcast_nocopy, which shows broadcast() on a Dataset performing a data copy when it shouldn't. Could you look into it?
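
For reference, a sketch of the new keyword arguments in use (the arrays and argument values are illustrative):

```python
import numpy as np
import xarray

a = xarray.DataArray(np.zeros(3), dims=['x'], coords={'x': [0, 1, 2]})
b = xarray.DataArray(np.zeros(2), dims=['x'], coords={'x': [1, 2]})

# exclude: dimensions to skip when aligning;
# indexes: explicit target indexes to align to, keyed by dimension.
a2, b2 = xarray.align(a, b, exclude=['y'], indexes={'x': [1, 2]})
a3, b3 = xarray.broadcast(a, b)  # broadcast() now aligns its inputs itself
```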


CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);