issue_comments


11 rows where issue = 207021356 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
751361296 https://github.com/pydata/xarray/issues/1262#issuecomment-751361296 https://api.github.com/repos/pydata/xarray/issues/1262 MDEyOklzc3VlQ29tbWVudDc1MTM2MTI5Ng== keewis 14808389 2020-12-26T14:25:57Z 2020-12-26T14:25:57Z MEMBER

There is now a series of NEPs, starting with NEP-40, discussing this, so we should be able to wait until numpy releases a version that supports custom dtypes. Should we close this?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Logical DTypes 207021356
751238151 https://github.com/pydata/xarray/issues/1262#issuecomment-751238151 https://api.github.com/repos/pydata/xarray/issues/1262 MDEyOklzc3VlQ29tbWVudDc1MTIzODE1MQ== stale[bot] 26384082 2020-12-25T11:49:53Z 2020-12-25T11:49:53Z NONE

In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity.

If this issue remains relevant, please comment here or remove the stale label; otherwise it will be marked as closed automatically.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Logical DTypes 207021356
456858311 https://github.com/pydata/xarray/issues/1262#issuecomment-456858311 https://api.github.com/repos/pydata/xarray/issues/1262 MDEyOklzc3VlQ29tbWVudDQ1Njg1ODMxMQ== mhvk 2789820 2019-01-23T16:01:02Z 2019-01-23T16:01:02Z NONE

See https://github.com/numpy/numpy/pull/12630 for a numpy enhancement proposal that would end up making dtype more easily subclassable.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Logical DTypes 207021356
456849614 https://github.com/pydata/xarray/issues/1262#issuecomment-456849614 https://api.github.com/repos/pydata/xarray/issues/1262 MDEyOklzc3VlQ29tbWVudDQ1Njg0OTYxNA== stale[bot] 26384082 2019-01-23T15:40:33Z 2019-01-23T15:40:33Z NONE

In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity. If this issue remains relevant, please comment here; otherwise it will be marked as closed automatically.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Logical DTypes 207021356
281679396 https://github.com/pydata/xarray/issues/1262#issuecomment-281679396 https://api.github.com/repos/pydata/xarray/issues/1262 MDEyOklzc3VlQ29tbWVudDI4MTY3OTM5Ng== mhvk 2789820 2017-02-22T14:11:14Z 2017-02-22T14:11:14Z NONE

Just as a heads-up, there is indeed the realisation within numpy that a subclassable dtype would be great -- see https://github.com/numpy/numpy/issues/2899. If you have something like a design, I would certainly be interested (as maintainer of astropy's Quantity -- physical units should really be supported everywhere!), and I'd suggest sending a note to numpy-dev to get possible feedback/help.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Logical DTypes 207021356
279861287 https://github.com/pydata/xarray/issues/1262#issuecomment-279861287 https://api.github.com/repos/pydata/xarray/issues/1262 MDEyOklzc3VlQ29tbWVudDI3OTg2MTI4Nw== clarkfitzg 5356122 2017-02-14T22:47:28Z 2017-02-14T22:47:28Z MEMBER

Other datatypes would be extremely useful. But I think it would be better to start as a separate project and build some confidence in a system first.

@MaximilianR I was just typing nearly the same thing... :+1:

we might consider lightly wrapping NumPy arrays in a new object that also includes extra dtype information

Pandas seems to be moving away from this approach now.

Any other existing alternatives? datashape?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Logical DTypes 207021356
279854154 https://github.com/pydata/xarray/issues/1262#issuecomment-279854154 https://api.github.com/repos/pydata/xarray/issues/1262 MDEyOklzc3VlQ29tbWVudDI3OTg1NDE1NA== max-sixty 5635139 2017-02-14T22:17:40Z 2017-02-14T22:17:40Z MEMBER

Worth considering pandas 2.0 discussions around types https://github.com/pandas-dev/pandas2/issues/24, and some of their rejected considerations, such as https://github.com/libdynd/libdynd

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Logical DTypes 207021356
279846111 https://github.com/pydata/xarray/issues/1262#issuecomment-279846111 https://api.github.com/repos/pydata/xarray/issues/1262 MDEyOklzc3VlQ29tbWVudDI3OTg0NjExMQ== shoyer 1217238 2017-02-14T21:46:09Z 2017-02-14T21:46:09Z MEMBER

CC @pydata/xarray in case anyone else has opinions here

Benefits in favor would be that I suspect XArray already has mechanisms for coercion and such and it would reduce the number of total libraries.

We really don't have much existing machinery. Two things we have that might be useful:

  • a couple of mixin classes for easily defining custom array types. This could be a nice building block, but it's self-contained and only a few dozen lines of code.
  • some existing code for function dispatch to either numpy or dask.array. This is quite messy, somewhat xarray-specific and not worth copying.
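As a rough illustration of the mixin idea in the first bullet, here is a sketch built on NumPy's public `numpy.lib.mixins.NDArrayOperatorsMixin`. The thread does not say whether this matches the xarray classes referred to above, and `TaggedArray` and its `logical_dtype` label are hypothetical names used only for this example.

```python
import numpy as np
from numpy.lib.mixins import NDArrayOperatorsMixin


class TaggedArray(NDArrayOperatorsMixin):
    """Hypothetical wrapper: a NumPy array plus a logical dtype label."""

    def __init__(self, values, logical_dtype):
        self.values = np.asarray(values)
        self.logical_dtype = logical_dtype

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Keep the sketch small: only plain single-output ufunc calls are handled.
        if method != "__call__" or ufunc.nout != 1 or kwargs.get("out") is not None:
            return NotImplemented
        # Unwrap TaggedArray inputs, apply the ufunc to the raw arrays,
        # then re-wrap the result with the same label.
        raw = [x.values if isinstance(x, TaggedArray) else x for x in inputs]
        return TaggedArray(ufunc(*raw, **kwargs), self.logical_dtype)


lengths = TaggedArray([1, 2, 3], "meters")
shifted = lengths + 1  # the mixin routes "+" through __array_ufunc__
print(shifted.values, shifted.logical_dtype)
```

The mixin supplies the arithmetic and comparison operators, so the wrapper itself only has to define how a ufunc call is unwrapped and re-wrapped.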

Fewer libraries is definitely nice, but I see this as more of a secondary rather than primary goal.

More broadly, doing this project right will need strong separation of concerns from xarray's handling of labeled arrays. So there's not a huge amount to be gained by doing it in the same repository.

Argument against is that XArray is currently only focused on indexed and labeled arrays, and possibly it doesn't want to deal with the dtype mess.

I would love to see this project be successful and integrated with xarray. But better dtypes is tangential to our current focus, and project maintenance is already stretched pretty thin -- there's still a lot of core functionality to build out for manipulation of labeled arrays.

So I'm not comfortable with building this in xarray at this time. But I would be happy to revisit this decision when you have a design document, prototype and someone committed to developing and maintaining the module.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Logical DTypes 207021356
279515654 https://github.com/pydata/xarray/issues/1262#issuecomment-279515654 https://api.github.com/repos/pydata/xarray/issues/1262 MDEyOklzc3VlQ29tbWVudDI3OTUxNTY1NA== mrocklin 306380 2017-02-13T20:41:50Z 2017-02-13T20:41:50Z MEMBER

To be clear, my original question was more ambitious. It may be interpreted as "should such a system be integrated directly into the XArray codebase?"

The answer of "No, it should be a standalone library that XArray wraps much in the same way it wraps around numpy or dask.array" is fine with me. Just asking. Benefits in favor would be that I suspect XArray already has mechanisms for coercion and such and it would reduce the number of total libraries. Argument against is that XArray is currently only focused on indexed and labeled arrays, and possibly it doesn't want to deal with the dtype mess. So, more broadly, the question is "What is the scope of XArray?"

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Logical DTypes 207021356
279509836 https://github.com/pydata/xarray/issues/1262#issuecomment-279509836 https://api.github.com/repos/pydata/xarray/issues/1262 MDEyOklzc3VlQ29tbWVudDI3OTUwOTgzNg== shoyer 1217238 2017-02-13T20:18:42Z 2017-02-13T20:18:42Z MEMBER

One major API design challenge to solve with such a package (unresolved in NumPy) is how to handle dtype-specific methods/properties, e.g., year, month and day properties for a custom datetime dtype, or a .keys() method for a structured dtype (https://github.com/numpy/numpy/pull/8615).

Fitting these into a generic NDArray type is not very natural. So perhaps the solution is to use subclasses (fixed for each dtype) with some very strict design constraints (e.g., only add new methods/properties, don't override functionality). The contract would still be that the dtype defines all valid extension points for overriding functionality.
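A minimal sketch of the "subclass per dtype, only add properties" idea, assuming plain NumPy `datetime64` data. `DatetimeArray` is a hypothetical name, not an existing NumPy or xarray class, and the sketch does not attempt the dtype-defined extension points mentioned above.

```python
import numpy as np


class DatetimeArray(np.ndarray):
    """Hypothetical per-dtype subclass: adds read-only properties, overrides nothing."""

    @property
    def year(self):
        # Truncate to whole years; datetime64[Y] counts years since 1970.
        return np.asarray(self).astype("datetime64[Y]").astype(int) + 1970

    @property
    def month(self):
        # datetime64[M] counts months since 1970-01, so take it modulo 12.
        return np.asarray(self).astype("datetime64[M]").astype(int) % 12 + 1

    @property
    def day(self):
        # Day of month = offset in days from the first day of that month, plus one.
        days = np.asarray(self).astype("datetime64[D]")
        return (days - days.astype("datetime64[M]")).astype(int) + 1


# View existing datetime64 data through the subclass; no data is copied.
dates = np.array(["2017-02-13", "2020-12-26"], dtype="datetime64[D]").view(DatetimeArray)
print(dates.year, dates.month, dates.day)
```

Because the subclass only adds properties, indexing, arithmetic and reductions all behave exactly as they do for the underlying `ndarray`, which is the design constraint the comment proposes.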

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Logical DTypes 207021356
279505323 https://github.com/pydata/xarray/issues/1262#issuecomment-279505323 https://api.github.com/repos/pydata/xarray/issues/1262 MDEyOklzc3VlQ29tbWVudDI3OTUwNTMyMw== shoyer 1217238 2017-02-13T20:00:53Z 2017-02-13T20:09:43Z MEMBER

So question: Is it sensible to add logical dtype information to XArray?

Sure, this would be pretty sensible, especially if there is a nice story for wrapping upstream libraries providing alternate physical arrays such as dask.array and bolt (cc @freeman-lab).

There are certainly plenty of use-cases. A few more examples that would be particularly relevant for xarray:

  • a generic optional dtype for handling missing values (e.g., for integers)
  • a generic wrapper for 1D pandas dtypes into N-dimensional arrays
  • physical units (https://github.com/pydata/xarray/issues/525)
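To make the first use case concrete, here is a small sketch of the problem an optional integer dtype would solve. It uses NumPy's existing masked arrays as a stand-in, which is an assumption made for illustration and not the design proposed in this thread.

```python
import numpy as np

values = np.array([1, 2, 3, 4], dtype=np.int64)
missing = np.array([False, True, False, False])

# A masked array stores the mask next to the data, so the dtype stays integer.
optional_ints = np.ma.masked_array(values, mask=missing)
total = optional_ints.sum()  # masked entries are skipped

# The usual workaround without such a dtype: cast to float and mark gaps with NaN,
# which silently changes the dtype of the data.
as_float = values.astype(float)
as_float[missing] = np.nan

print(optional_ints.dtype, total, as_float.dtype)
```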

Can this be done with only moderate effort and maintenance costs to the XArray project?

If we have a well defined interface that defines the right operations, my guess is indeed "yes, probably". See https://github.com/bolt-project/bolt/issues/58 for a list of operations worth considering wrapping (obviously some of these, like arithmetic, are not needed for all dtypes).

If the answer is "yes, probably", then what is the right way to go about this?

I think it should start as a separate package to ensure a cleanly separated interface and because there are definitely other clients than xarray. We can quickly add it as an optional dependency to xarray for testing purposes.
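One common way to treat a package as an optional dependency is to probe for it at import time and fall back when it is absent. The sketch below uses only the standard library; the package name `logical_dtypes` is hypothetical, since no such package is named in this thread.

```python
import importlib.util

# Hypothetical package name, used only for illustration.
OPTIONAL_PACKAGE = "logical_dtypes"


def has_optional_package(name: str = OPTIONAL_PACKAGE) -> bool:
    """Return True if a top-level package with this name can be imported."""
    return importlib.util.find_spec(name) is not None


def describe_backend() -> str:
    """Pick a code path depending on whether the optional package is installed."""
    if has_optional_package():
        return "logical dtypes available"
    return "falling back to plain NumPy dtypes"


print(describe_backend())
```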

I'm excited about this, but I'm unlikely to have much time available to work on this directly.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Logical DTypes 207021356


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
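For readers who want to reproduce the filter shown at the top of this page (rows where issue = 207021356, sorted by updated_at descending), here is a sketch using Python's built-in `sqlite3` module. It assumes a reduced copy of the schema above in an in-memory database and loads only two of the rows from this page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced version of the schema above: only the columns the query needs.
conn.execute(
    "CREATE TABLE issue_comments ("
    " id INTEGER PRIMARY KEY, user INTEGER, created_at TEXT,"
    " updated_at TEXT, author_association TEXT, issue INTEGER)"
)
conn.execute("CREATE INDEX idx_issue_comments_issue ON issue_comments (issue)")

rows = [
    (279505323, 1217238, "2017-02-13T20:00:53Z", "2017-02-13T20:09:43Z", "MEMBER", 207021356),
    (751361296, 14808389, "2020-12-26T14:25:57Z", "2020-12-26T14:25:57Z", "MEMBER", 207021356),
]
conn.executemany("INSERT INTO issue_comments VALUES (?, ?, ?, ?, ?, ?)", rows)

# ISO 8601 timestamps stored as TEXT sort correctly as plain strings.
result = conn.execute(
    "SELECT id, updated_at FROM issue_comments"
    " WHERE issue = ? ORDER BY updated_at DESC",
    (207021356,),
).fetchall()
print(result)
```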
Powered by Datasette · Queries took 48.565ms · About: xarray-datasette