
issues


8 rows where comments = 6 and user = 35968931 sorted by updated_at descending

Table columns: id, node_id, number, title, user, state, locked, assignee, milestone, comments, created_at, updated_at, closed_at, author_association, active_lock_reason, draft, pull_request, body, reactions, performed_via_github_app, state_reason, repo, type
2267803218 PR_kwDOAMm_X85t8pSN 8980 Complete deprecation of Dataset.dims returning dict TomNicholas 35968931 open 0     6 2024-04-28T20:32:29Z 2024-05-01T15:40:44Z   MEMBER   0 pydata/xarray/pulls/8980
  • [x] Completes deprecation cycle described in #8496, and started in #8500
  • [ ] ~~Tests added~~
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [ ] ~~New functions/methods are listed in api.rst~~
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8980/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2120030667 PR_kwDOAMm_X85mGm4g 8712 Only use CopyOnWriteArray wrapper on BackendArrays TomNicholas 35968931 open 0     6 2024-02-06T06:05:53Z 2024-02-07T17:09:56Z   MEMBER   0 pydata/xarray/pulls/8712

This makes sure we only use the CopyOnWriteArray wrapper on arrays that have been explicitly marked to be lazily-loaded (through being subclasses of BackendArray). Without this change we are implicitly assuming that any array type obtained through the BackendEntrypoint system should be treated as if it points to an on-disk array.

Motivated by https://github.com/pydata/xarray/issues/8699, which is a counterexample to that assumption.
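The rule described above (wrap only arrays explicitly marked as lazy, i.e. subclasses of BackendArray) can be sketched with stand-in classes; none of these are xarray's real implementations, and the names are illustrative only:

```python
# Minimal sketch of the gating described above, with stand-in classes.

class BackendArray:
    """Marker base class: arrays that lazily point at on-disk data."""

class CopyOnWriteArray:
    """Wrapper that defers copying until first write."""
    def __init__(self, array):
        self.array = array

def maybe_wrap(array):
    # Only wrap arrays explicitly marked as lazy backend arrays;
    # any other array type returned by a backend is left untouched.
    if isinstance(array, BackendArray):
        return CopyOnWriteArray(array)
    return array

class LazyNetCDFArray(BackendArray):
    pass

class InMemoryArray:
    pass

print(type(maybe_wrap(LazyNetCDFArray())).__name__)  # CopyOnWriteArray
print(type(maybe_wrap(InMemoryArray())).__name__)    # InMemoryArray
```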

  • [ ] Closes #xxxx
  • [ ] Tests added
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [ ] New functions/methods are listed in api.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8712/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1974681146 PR_kwDOAMm_X85edMm- 8404 Hypothesis strategy for generating Variable objects TomNicholas 35968931 closed 0     6 2023-11-02T17:04:03Z 2023-12-05T22:45:57Z 2023-12-05T22:45:57Z MEMBER   0 pydata/xarray/pulls/8404

Breaks out just the part of #6908 needed for generating arbitrary xarray.Variable objects. (so ignore the ginormous number of commits)

EDIT: Check out this test which performs a mean on any subset of any Variable object!

```python
In [36]: from xarray.testing.strategies import variables

In [37]: variables().example()
<xarray.Variable (ĭ: 3)>
array([-2.22507386e-313-6.62447795e+016j, nan-6.46207519e+185j, -2.22507386e-309+3.33333333e-001j])
```

@andersy005 @maxrjones @jhamman I thought this might be useful for the NamedArray testing. (xref #8370 and #8244)

@keewis and @Zac-HD sorry for letting that PR languish for literally a year :sweat_smile: This PR addresses your feedback about accepting a callable that returns a strategy generating arrays. That suggestion makes some things a bit more complex in user code but actually allows me to simplify the internals of the variables strategy significantly. I'm actually really happy with this PR - I think it solves what we were discussing, and is a sensible checkpoint to merge before going back to making strategies for generating composite objects like DataArrays/Datasets work.
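The design point in that last paragraph, accepting a callable that *returns* a strategy rather than a fixed strategy, can be sketched without hypothesis at all. Everything below is an illustrative stand-in, not xarray's actual API: a "strategy" is modelled as a zero-argument callable that draws one example.

```python
import random

# Stand-in "strategy": a zero-argument callable that draws one example.
def default_array_strategy(shape):
    def draw():
        n = 1
        for s in shape:
            n *= s
        return [random.random() for _ in range(n)]
    return draw

def variables(array_strategy_fn=default_array_strategy):
    """Build a (dims, data) example, deferring array generation to the
    user-supplied callable so custom array types can be plugged in
    without complicating the internals."""
    shape = (3,)
    draw_array = array_strategy_fn(shape)  # callable -> strategy
    return {"dims": ("x",), "data": draw_array()}

# A user can swap in their own array generator without touching the internals:
example = variables(lambda shape: (lambda: [0.0] * shape[0]))
```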

  • [x] Closes part of #6911
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [x] New functions/methods are listed in api.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8404/reactions",
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1200309334 PR_kwDOAMm_X842BOIk 6471 Support **kwargs form in `.chunk()` TomNicholas 35968931 closed 0     6 2022-04-11T17:37:38Z 2022-04-12T03:34:49Z 2022-04-11T19:36:40Z MEMBER   0 pydata/xarray/pulls/6471

Also adds some explicit tests (and type hinting) for Variable.chunk(), as I don't think it had dedicated tests before.

  • [x] Closes #6459
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6471/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1020282789 I_kwDOAMm_X8480Eel 5843 Why are `da.chunks` and `ds.chunks` properties inconsistent? TomNicholas 35968931 closed 0     6 2021-10-07T17:21:01Z 2021-10-29T18:12:22Z 2021-10-29T18:12:22Z MEMBER      

Basically the title, but what I'm referring to is this:

```python
In [2]: da = xr.DataArray([[0, 1], [2, 3]], name='foo').chunk(1)

In [3]: ds = da.to_dataset()

In [4]: da.chunks
Out[4]: ((1, 1), (1, 1))

In [5]: ds.chunks
Out[5]: Frozen({'dim_0': (1, 1), 'dim_1': (1, 1)})
```

Why does DataArray.chunks return a tuple and Dataset.chunks return a frozen dictionary?

This seems a bit silly, for a few reasons:

1) it means that some perfectly reasonable code might fail unnecessarily if passed a DataArray instead of a Dataset or vice versa, such as

```python
def is_core_dim_chunked(obj, core_dim):
    return len(obj.chunks[core_dim]) > 1
```
which will work as intended for a dataset but raises a `TypeError` for a dataarray.

2) it breaks the pattern we use for .sizes, where

```python
In [14]: da.sizes
Out[14]: Frozen({'dim_0': 2, 'dim_1': 2})

In [15]: ds.sizes
Out[15]: Frozen({'dim_0': 2, 'dim_1': 2})
```

3) if you want the chunks as a tuple they are always accessible via da.data.chunks, which is a more sensible place to look to find the chunks without dimension names.

4) It's an undocumented difference, as the docstrings for ds.chunks and da.chunks both only say

`"""Block dimensions for this dataset’s data or None if it’s not a dask array."""`

which doesn't tell me anything about the return type, or warn me that the return types are different.

EDIT: In fact `DataArray.chunk` doesn't even appear to be listed on the API docs page at all.

In our codebase this difference is mostly washed out by us using ._to_temp_dataset() all the time, and also by the way that the .chunk() method accepts both the tuple and dict form, so both of these invariants hold (but in different ways):

```python
ds == ds.chunk(ds.chunks)
da == da.chunk(da.chunks)
```

I'm not sure whether making this consistent is worth the effort of a significant breaking change though :confused:

(Sort of related to https://github.com/pydata/xarray/issues/2103)
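One way to paper over the difference in user code is to normalise the tuple form to a dict using the object's dimension names. A hedged sketch, assuming only the `.dims` and `.chunks` attributes shown in the examples above (the Fake* classes are stand-ins, not xarray types):

```python
from collections.abc import Mapping

def chunks_as_dict(obj):
    """Return chunk sizes keyed by dimension name for either return type."""
    chunks = obj.chunks
    if chunks is None:
        return {}
    if isinstance(chunks, Mapping):       # Dataset-style: already a mapping
        return dict(chunks)
    return dict(zip(obj.dims, chunks))    # DataArray-style: tuple aligned with .dims

# Stand-in objects mimicking only the two attributes used above:
class FakeDataArray:
    dims = ("dim_0", "dim_1")
    chunks = ((1, 1), (1, 1))

class FakeDataset:
    dims = ("dim_0", "dim_1")
    chunks = {"dim_0": (1, 1), "dim_1": (1, 1)}

assert chunks_as_dict(FakeDataArray()) == chunks_as_dict(FakeDataset())
```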

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5843/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1033884661 PR_kwDOAMm_X84tkKtA 5886 Use .to_numpy() for quantified facetgrids TomNicholas 35968931 closed 0     6 2021-10-22T19:25:24Z 2021-10-28T22:42:43Z 2021-10-28T22:41:59Z MEMBER   0 pydata/xarray/pulls/5886

Follows on from https://github.com/pydata/xarray/pull/5561 by replacing .values with .to_numpy() in more places in the plotting code. This allows pint.Quantity arrays to be plotted without issuing a UnitStrippedWarning (and will generalise better to other duck arrays later).

I noticed the need for this when trying out this example (but trying it without the .dequantify() call first).

(@Illviljan in theory .values should be replaced with .to_numpy() everywhere in the plotting code by the way)

  • [ ] Closes #xxxx
  • [x] Tests added
  • [x] Passes pre-commit run --all-files
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5886/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
935317034 MDExOlB1bGxSZXF1ZXN0NjgyMjU1NDE5 5561 Plots get labels from pint arrays TomNicholas 35968931 closed 0     6 2021-07-02T00:44:28Z 2021-07-21T23:06:21Z 2021-07-21T22:38:34Z MEMBER   0 pydata/xarray/pulls/5561

Stops you needing to call .pint.dequantify() before plotting.

Builds on top of #5568, so that should be merged first.

  • [x] Closes (1) from https://github.com/pydata/xarray/issues/3245#issue-484240082
  • [x] Tests added
  • [x] Tests passing
  • [x] Passes pre-commit run --all-files
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5561/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
367763373 MDU6SXNzdWUzNjc3NjMzNzM= 2473 Recommended way to extend xarray Datasets using accessors? TomNicholas 35968931 closed 0     6 2018-10-08T12:19:21Z 2018-10-31T09:58:05Z 2018-10-31T09:58:05Z MEMBER      

Hi,

I'm now regularly using xarray (& dask) for organising and analysing the output of the simulation code I use (BOUT++), and it's very helpful, thank you!

However, my current approach is quite clunky at dealing with the extra information and functionality that's specific to the simulation code I'm using, and I have questions about what the recommended way to extend the xarray Dataset class is. This seems like a general enough problem that I thought I would make an issue for it.

Desired

What I ideally want to do is extend the xarray.Dataset class to accommodate extra attributes and methods, while retaining as much xarray functionality as possible but avoiding reimplementing any of the API. This might not be possible, but ideally I want to make a BoutDataset class which contains extra attributes to hold information about the run which doesn't naturally fit into the xarray data model, and extra methods to perform analysis/plotting which only users of this code would require, while still being able to use xarray-specific methods and top-level functions:

```python
bd = BoutDataset('/path/to/data')

ds = bd.data  # access the wrapped xarray dataset
extra_data = bd.extra_data  # access the BOUT-specific data

bd.isel(time=-1)  # use xarray dataset methods

bd2 = BoutDataset('/path/to/other/data')
concatenated_bd = xr.concat([bd, bd2])  # apply top-level xarray functions to the data

bd.plot_tokamak()  # methods implementing bout-specific functionality
```

Problems with my current approach

I have read the documentation about extending xarray, and the issue threads about subclassing Datasets (#706) and accessors (#1080), but I wanted to check that what I'm doing is the recommended approach.

Right now I'm trying to do something like

```python
@xr.register_dataset_accessor('bout')
class BoutDataset:
    def __init__(self, path):
        self.data = collect_data(path)  # collect all my numerical data from output files
        self.extra_data = read_extra_data(path)  # collect extra data about the simulation

    def plot_tokamak(self):
        plot_in_bout_specific_way(self.data, self.extra_data)
```

which works in the sense that I can do

```python
bd = BoutDataset('/path/to/data')

ds = bd.bout.data  # access the wrapped xarray dataset
extra_data = bd.bout.extra_data  # access the BOUT-specific data
bd.bout.plot_tokamak()  # methods implementing bout-specific functionality
```

but not so well with

```python
bd.isel(time=-1)  # AttributeError: 'BoutDataset' object has no attribute 'isel'
bd.bout.data.isel(time=-1)  # have to do this instead, but this returns an xr.Dataset not a BoutDataset

concatenated_bd = xr.concat([bd1, bd2])  # TypeError: can only concatenate xarray Dataset and DataArray objects, got <class 'BoutDataset'>
concatenated_ds = xr.concat([bd1.bout.data, bd2.bout.data])  # again have to do this instead, which again returns an xr.Dataset not a BoutDataset
```

If I have to reimplement the API for methods like .isel() and top-level functions like concat(), then why should I not just subclass xr.Dataset?

There aren't very many top-level xarray functions, so reimplementing them would be okay, but there are loads of Dataset methods. However, I think I know how I want my BoutDataset class to behave when an xr.Dataset method is called on it: I want it to apply that method to the underlying dataset and return the full BoutDataset with extra data and attributes still attached.

Is it possible to do something like: "if calling an xr.Dataset method on an instance of BoutDataset, call the corresponding method on the wrapped dataset and return a BoutDataset that has the extra BOUT-specific data propagated through"?
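The behaviour asked about in that question can be sketched in plain Python with `__getattr__` delegation. Everything below is illustrative: `FakeDataset` is a dummy stand-in for `xr.Dataset`, not xarray's real class, and the re-wrapping check would need to be broader in practice:

```python
# Sketch of "delegate to the wrapped dataset and re-wrap the result",
# using a dummy stand-in for xr.Dataset.

class FakeDataset:
    def __init__(self, values):
        self.values = values
    def isel(self, n):
        return FakeDataset(self.values[:n])

class BoutDataset:
    def __init__(self, data, extra_data):
        self.data = data              # the wrapped dataset
        self.extra_data = extra_data  # BOUT-specific metadata

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails: forward to the
        # wrapped dataset, and re-wrap any dataset-valued result so the
        # extra data is propagated through method calls.
        attr = getattr(self.data, name)
        if callable(attr):
            def wrapped(*args, **kwargs):
                result = attr(*args, **kwargs)
                if isinstance(result, FakeDataset):
                    return BoutDataset(result, self.extra_data)
                return result
            return wrapped
        return attr

bd = BoutDataset(FakeDataset([1, 2, 3]), extra_data={"code": "BOUT++"})
sub = bd.isel(2)      # delegates to the wrapped dataset, then re-wraps
print(sub.extra_data) # extra data survived the method call
```

Top-level functions like `xr.concat` would still need explicit handling, since they dispatch on the argument's type rather than going through attribute lookup.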

Thanks in advance, apologies if this is either impossible or relatively trivial, I just thought other xarray users might have the same questions.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2473/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
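The schema above can be exercised directly with Python's built-in sqlite3 module; the query mirrors the filter shown at the top of the page (`comments = 6 and user = 35968931`, sorted by `updated_at` descending). Only a subset of the columns is created here, and the inserted rows are placeholders, not real data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Subset of the [issues] schema above, enough to run the page's query.
conn.execute("""
CREATE TABLE issues (
   id INTEGER PRIMARY KEY,
   number INTEGER,
   title TEXT,
   user INTEGER,
   comments INTEGER,
   updated_at TEXT
)
""")
# Placeholder rows (not real data) to exercise the filter.
conn.executemany(
    "INSERT INTO issues (id, number, title, user, comments, updated_at) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 8980, "example A", 35968931, 6, "2024-05-01T15:40:44Z"),
        (2, 8712, "example B", 35968931, 6, "2024-02-07T17:09:56Z"),
        (3, 9999, "other user", 12345, 6, "2024-01-01T00:00:00Z"),
    ],
)
# ISO-8601 timestamps sort correctly as plain text, so ORDER BY works directly.
rows = conn.execute(
    "SELECT number FROM issues WHERE comments = 6 AND user = 35968931 "
    "ORDER BY updated_at DESC"
).fetchall()
print(rows)  # [(8980,), (8712,)]
```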
Powered by Datasette · About: xarray-datasette