issues


5 rows where repo = 13221727 and user = 6042212 sorted by updated_at descending




Facets: type — pull 3, issue 2 · state — closed 5 · repo — xarray 5
Columns: id, node_id, number, title, user, state, locked, assignee, milestone, comments, created_at, updated_at, closed_at, author_association, active_lock_reason, draft, pull_request, body, reactions, performed_via_github_app, state_reason, repo, type

1952739368 · PR_kwDOAMm_X85dTMKv · #8339 · Reduce dask tokenization time · user: martindurant (6042212) · state: closed · comments: 6 · created: 2023-10-19T17:22:06Z · updated: 2023-10-20T23:13:44Z · closed: 2023-10-20T23:13:43Z · CONTRIBUTOR · draft: 0 · pull_request: pydata/xarray/pulls/8339

When using dask (e.g., chunks={} with a zarr dataset), each dask.array gets a token. Calculating this token currently hits a recursive path within dask and is relatively slow (~10ms), which adds up for many variables. This PR makes a simpler but still unique token.
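The trade-off the PR describes — replacing a recursive hash with a cheap but still unique token — can be sketched in plain Python. The function names `recursive_token` and `cheap_token` below are illustrative, not dask's or xarray's actual implementation:

```python
import pickle
import uuid
from hashlib import sha256

def recursive_token(obj):
    # Deterministic deep hash: serializes the whole object graph,
    # so the cost grows with the size of the underlying structures
    # (analogous to dask's recursive tokenization path).
    return sha256(pickle.dumps(obj)).hexdigest()

def cheap_token(name):
    # Cheap alternative: constant-time and unique per call, at the
    # price of determinism (equal inputs no longer share a token).
    return f"{name}-{uuid.uuid4().hex}"

nested = {"chunks": [list(range(1000)) for _ in range(100)]}
deep = recursive_token(nested)   # walks ~100k integers
fast = cheap_token("precip")     # constant time, always unique
```

The determinism difference is the key design choice: deterministic tokens let identical graphs share work, while per-call unique tokens trade that away for speed, which is acceptable when each variable is opened once anyway.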

An example profile of open_dataset, before and after the change, was attached (profile screenshots not included here).

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8339/reactions",
    "total_count": 2,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 2,
    "eyes": 0
}
repo: xarray (13221727) · type: pull
839823306 · MDU6SXNzdWU4Mzk4MjMzMDY= · #5070 · requires io.IOBase subclass rather than duck file-like · user: martindurant (6042212) · state: closed · comments: 3 · created: 2021-03-24T15:06:39Z · updated: 2021-03-29T16:22:35Z · closed: 2021-03-29T16:22:35Z · CONTRIBUTOR

If you open_dataset on a file-like object, you get the error `TypeError: cannot read the magic number form` [sic]. This is because `xarray.core.utils.read_magic_number` checks `isinstance(filename_or_obj, io.IOBase)`, which is not necessarily true for a file-like object. I recommend changing this to `hasattr(filename_or_obj, "read")`.

Example:

```
In [1]: import fsspec

In [2]: fs = fsspec.filesystem('file')

In [3]: f = fs.open("201704010000.CHRTOUT_DOMAIN1.comp")

In [5]: import xarray as xr

In [6]: xr.open_dataset(f)
TypeError
```

Here `f` is a `LocalFileOpener`, made to be lazy and defer file opening, so that it can be pickled.
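The recommended change can be sketched with the standard library alone. This is a hypothetical illustration of the duck-typed check, not xarray's actual `read_magic_number`; `LazyFile` stands in for fsspec's `LocalFileOpener`:

```python
import io

def read_magic_number(filename_or_obj, count=8):
    # Duck-typed check: accept anything with a .read method,
    # instead of requiring an io.IOBase subclass.
    if not hasattr(filename_or_obj, "read"):
        raise TypeError(f"cannot read the magic number from {type(filename_or_obj)}")
    pos = filename_or_obj.tell()
    filename_or_obj.seek(0)
    magic = filename_or_obj.read(count)
    filename_or_obj.seek(pos)  # restore the caller's file position
    return magic

class LazyFile:
    """Duck file-like wrapper: behaves like a file but is not an
    io.IOBase subclass, so an isinstance check would reject it."""
    def __init__(self, data):
        self._buf = io.BytesIO(data)
    def read(self, n=-1):
        return self._buf.read(n)
    def seek(self, offset, whence=0):
        return self._buf.seek(offset, whence)
    def tell(self):
        return self._buf.tell()
```

With the `hasattr` check, `read_magic_number(LazyFile(...))` succeeds even though `isinstance(LazyFile(...), io.IOBase)` is False.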

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5070/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue
788398518 · MDExOlB1bGxSZXF1ZXN0NTU2OTE3MDIx · #4823 · Allow fsspec URLs in open_(mf)dataset · user: martindurant (6042212) · state: closed · comments: 20 · created: 2021-01-18T16:22:35Z · updated: 2021-02-16T21:26:53Z · closed: 2021-02-16T21:18:05Z · CONTRIBUTOR · draft: 0 · pull_request: pydata/xarray/pulls/4823
  • [x] Closes #4461 and related
  • [x] Tests added
  • [x] Passes pre-commit run --all-files
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [x] New functions/methods are listed in api.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4823/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: xarray (13221727) · type: pull
709187212 · MDExOlB1bGxSZXF1ZXN0NDkzMjkyOTIw · #4461 · Allow fsspec/zarr/mfdataset · user: martindurant (6042212) · state: closed · comments: 18 · created: 2020-09-25T18:14:38Z · updated: 2021-02-16T15:36:54Z · closed: 2021-02-16T15:36:54Z · CONTRIBUTOR · draft: 0 · pull_request: pydata/xarray/pulls/4461

Requires https://github.com/zarr-developers/zarr-python/pull/606

  • [ ] ~Closes #xxxx~
  • [x] Tests added
  • [x] Passes isort . && black . && mypy . && flake8
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [x] New functions/methods are listed in api.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4461/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: xarray (13221727) · type: pull
202260275 · MDU6SXNzdWUyMDIyNjAyNzU= · #1223 · zarr as persistent store for xarray · user: martindurant (6042212) · state: closed · comments: 12 · created: 2017-01-20T22:37:20Z · updated: 2017-12-14T02:11:36Z · closed: 2017-12-14T02:11:36Z · CONTRIBUTOR

netCDF and HDF are good legacy archival formats handled by xarray and the wider numerical python ecosystem, but they don't play nicely with parallel access across a cluster or from an archive store like s3. zarr is certainly non-standard, but would make a very nice internal store for intermediates.

The gist below is a simple motivator: we could use zarr not only for dask but for xarray too, without much effort. https://gist.github.com/martindurant/dc27a072da47fab8d63117488f1fd7f1
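The appeal described above rests on zarr's store abstraction: a store is essentially a mapping from string keys to opaque byte values, which is why a directory, a dict, or an s3 bucket can all back one. A minimal stdlib sketch of the idea (the `DictStore` class and the keys used here are illustrative, not zarr's actual classes or layout):

```python
from collections.abc import MutableMapping

class DictStore(MutableMapping):
    """Toy key -> bytes store. zarr writes chunk data and metadata as
    byte values under string keys, so anything implementing this
    MutableMapping interface can serve as a persistent store."""
    def __init__(self):
        self._data = {}
    def __getitem__(self, key):
        return self._data[key]
    def __setitem__(self, key, value):
        self._data[key] = value
    def __delitem__(self, key):
        del self._data[key]
    def __iter__(self):
        return iter(self._data)
    def __len__(self):
        return len(self._data)

store = DictStore()
store[".zarray"] = b'{"shape": [10], "chunks": [5]}'  # metadata key (illustrative)
store["0"] = b"\x00" * 40                             # one chunk of raw bytes
```

Because every operation is an independent key read or write, concurrent access from a cluster or an object store like s3 maps onto this interface naturally — the property netCDF/HDF lack.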

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1223/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue

```sql
CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
```
Powered by Datasette · About: xarray-datasette