issues

2 rows where comments = 11, type = "issue" and user = 5635139, sorted by updated_at descending

issue #4539: Failing main branch — test_save_mfdataset_compute_false_roundtrip
id: 729117202 · node_id: MDU6SXNzdWU3MjkxMTcyMDI= · user: max-sixty (5635139) · state: closed · locked: 0 · comments: 11
created_at: 2020-10-25T21:22:36Z · updated_at: 2023-09-21T06:48:03Z · closed_at: 2023-09-20T19:57:17Z · author_association: MEMBER

We had the main branch passing for a while, but unfortunately we have another test failure, now in our new Linux py38-backend-api-v2 test case, in test_save_mfdataset_compute_false_roundtrip.

link

```
self = <xarray.tests.test_backends.TestDask object at 0x7f821a0d6190>

    def test_save_mfdataset_compute_false_roundtrip(self):
        from dask.delayed import Delayed

        original = Dataset({"foo": ("x", np.random.randn(10))}).chunk()
        datasets = [original.isel(x=slice(5)), original.isel(x=slice(5, 10))]
        with create_tmp_file(allow_cleanup_failure=ON_WINDOWS) as tmp1:
            with create_tmp_file(allow_cleanup_failure=ON_WINDOWS) as tmp2:
                delayed_obj = save_mfdataset(
                    datasets, [tmp1, tmp2], engine=self.engine, compute=False
                )
                assert isinstance(delayed_obj, Delayed)
                delayed_obj.compute()
                with open_mfdataset(
                    [tmp1, tmp2], combine="nested", concat_dim="x"
                ) as actual:
>                   assert_identical(actual, original)
E                   AssertionError: Left and right Dataset objects are not identical
E
E                   Differing data variables:
E                   L   foo      (x) float64 dask.array<chunksize=(5,), meta=np.ndarray>
E                   R   foo      (x) float64 dask.array<chunksize=(10,), meta=np.ndarray>

/home/vsts/work/1/s/xarray/tests/test_backends.py:3274: AssertionError
```
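
The difference reported above is in the dask chunk structure rather than anything obviously wrong with the values: the two output files each hold 5 elements, so the reopened array comes back as two (5,)-chunks, while the original `.chunk()` call produced a single (10,) chunk. A minimal sketch of that mismatch in plain dask (illustrative only, not part of the test suite):

```python
# Illustrative only: mimic the chunk layouts from the failure above.
import numpy as np
import dask.array as da

# One 10-element array chunked as a single block, like `original`.
original = da.from_array(np.random.randn(10), chunks=10)

# Two 5-element pieces concatenated, like the re-read tmp1 + tmp2 files.
reloaded = da.concatenate(
    [da.from_array(np.random.randn(5), chunks=5) for _ in range(2)]
)

print(original.chunks)  # ((10,),)
print(reloaded.chunks)  # ((5, 5),)
```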

@aurghs & @alexamici — are you familiar with this? Thanks in advance

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4539/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue
issue #884: Iterating over a Dataset iterates only over its data_vars
id: 160505403 · node_id: MDU6SXNzdWUxNjA1MDU0MDM= · user: max-sixty (5635139) · state: closed · locked: 0 · milestone: 0.11 (2856429) · comments: 11
created_at: 2016-06-15T19:35:50Z · updated_at: 2018-10-25T15:26:59Z · closed_at: 2018-10-25T15:26:59Z · author_association: MEMBER

This has been a small but persistent issue for me for a while. I suspect my perspective may be shaped by my current use case, but I'm socializing it here to test whether it's a more general problem...

Currently Dataset.keys() returns both variables and coordinates (but not its attrs keys):

```python
In [5]: ds = xr.Dataset({'a': (('x', 'y'), np.random.rand(10, 2))})

In [12]: list(ds.keys())
Out[12]: ['a', 'x', 'y']
```

Is this conceptually correct? I would posit that a Dataset is a mapping of keys to variables, and the coordinates contain values that label that data.

So should Dataset.keys() instead return just the keys of the Variables?

We often pass a Dataset around as a Mapping of keys to values - but when we then run a function across its keys, it operates on both the Variables' keys and the Coordinates' keys.

In pandas, DataFrame.keys() returns just the columns, which matches what we need. While I think xarray's design is generally better in these areas, this is one area pandas gets right. Because of this inconsistency between pandas and xarray, we have to coerce our objects to pandas DataFrames before handing them off to functions that pull out their keys (this is also why we can't just use ds.data_vars.keys() - that breaks the duck-typing). A sketch of the pattern is below.
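
As an illustration (mean_by_key is a hypothetical helper, and pandas is assumed): a function written against the Mapping duck type sees only the columns on a DataFrame, but on the Dataset above it would also hit the coordinate keys:

```python
import numpy as np
import pandas as pd

def mean_by_key(mapping):
    # Written against the Mapping duck type: iterate keys, pull values.
    return {key: mapping[key].mean() for key in mapping.keys()}

df = pd.DataFrame(np.random.rand(10, 2), columns=["a", "b"])
print(mean_by_key(df))  # just the columns: {'a': ..., 'b': ...}

# Passing the Dataset from the example above would also compute means for
# the coordinate keys 'x' and 'y', which is the inconsistency at issue.
```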

Does that make sense?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/884/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
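
For reference, a sketch (using Python's sqlite3 module, with a hypothetical database file name) of the filter described at the top of this page, run against this schema:

```python
import sqlite3

# Hypothetical file name; any github-to-sqlite export with this schema works.
conn = sqlite3.connect("github.db")
rows = conn.execute(
    """
    select id, number, title, state, updated_at
    from issues
    where comments = 11 and type = 'issue' and [user] = 5635139
    order by updated_at desc
    """
).fetchall()
for row in rows:
    print(row)  # the two issues shown above: #4539 and #884
```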