issue_comments

5 rows where issue = 562075354 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
583903530 https://github.com/pydata/xarray/issues/3763#issuecomment-583903530 https://api.github.com/repos/pydata/xarray/issues/3763 MDEyOklzc3VlQ29tbWVudDU4MzkwMzUzMA== shoyer 1217238 2020-02-09T22:48:38Z 2020-02-09T22:48:38Z MEMBER

Could you share a small example of what you’d like to do, ideally on synthetic data?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Suggestion: interpolation of non-numerical data 562075354
583874997 https://github.com/pydata/xarray/issues/3763#issuecomment-583874997 https://api.github.com/repos/pydata/xarray/issues/3763 MDEyOklzc3VlQ29tbWVudDU4Mzg3NDk5Nw== DancingQuanta 8419157 2020-02-09T18:07:00Z 2020-02-09T18:07:00Z NONE

To convince the xarray developers to help you, I suggest providing example data, showing what you have tried with your string-encoding solution, and describing applications for the method. You should also check out pandas, which xarray extends and which is more widely used than xarray. Hopefully someone has had a similar problem with pandas, and you can write here how to apply their solution.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Suggestion: interpolation of non-numerical data 562075354
583869461 https://github.com/pydata/xarray/issues/3763#issuecomment-583869461 https://api.github.com/repos/pydata/xarray/issues/3763 MDEyOklzc3VlQ29tbWVudDU4Mzg2OTQ2MQ== scottcanoe 19554926 2020-02-09T17:12:54Z 2020-02-09T17:12:54Z NONE

Hi all, thanks for the reply. Just to clarify, I'm making the suggestion that any one (or more) of these categorical interpolation techniques be incorporated into the internals of xarray so that any categorical arrays present in the dataset (properly aligned to a given dimension, of course) are interpolated automatically. As it stands, resampling such "mixed" datasets requires manually partitioning the numerical arrays from the categorical arrays and handling their interpolation separately. What makes xarray so appealing to me is how much of the laborious, error-prone, and not-so-extensible coding I've had to do in order to maintain relationships between various objects. It just seems to me like there is an opportunity here to push more into the background.

Forgive me if I'm mistaken or if this view is naive or possibly just a bad idea. I've only been working with xarray for a couple of days. Thanks again.
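
The manual partitioning described in this comment can be sketched as follows. This is a minimal, pure-Python illustration rather than xarray code: the dataset is a plain dict of lists, and the names `resample`, `linear`, `nearest`, `fluorescence`, and `stimulus` are hypothetical. Numeric variables are interpolated linearly, and everything else falls back to nearest-neighbour lookup.

```python
from bisect import bisect_right
from numbers import Number


def linear(x, t, values):
    """Piecewise-linear interpolation; t must be sorted, x within [t[0], t[-1]]."""
    i = min(max(bisect_right(t, x) - 1, 0), len(t) - 2)
    frac = (x - t[i]) / (t[i + 1] - t[i])
    return values[i] + frac * (values[i + 1] - values[i])


def nearest(x, t, values):
    """Value at the sample closest to x (ties go to the earlier sample)."""
    i = min(range(len(t)), key=lambda k: abs(t[k] - x))
    return values[i]


def resample(dataset, t, t_new):
    """Interpolate numeric variables linearly, all others by nearest neighbour."""
    out = {}
    for name, values in dataset.items():
        is_numeric = all(isinstance(v, Number) for v in values)
        method = linear if is_numeric else nearest
        out[name] = [method(x, t, values) for x in t_new]
    return out


# Synthetic data: one measured variable and one label per timepoint.
t = [0.0, 1.0, 2.0]
dataset = {
    "fluorescence": [1.0, 3.0, 2.0],
    "stimulus": ["A", "A", "B"],
}
t_new = [0.0, 0.5, 1.0, 1.5, 2.0]
resampled = resample(dataset, t, t_new)
```

The dispatch on value type is the part that the suggestion would move into the library, so that both kinds of variable stay aligned to the same new time axis after a single call.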

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Suggestion: interpolation of non-numerical data 562075354
583840504 https://github.com/pydata/xarray/issues/3763#issuecomment-583840504 https://api.github.com/repos/pydata/xarray/issues/3763 MDEyOklzc3VlQ29tbWVudDU4Mzg0MDUwNA== DancingQuanta 8419157 2020-02-09T12:39:08Z 2020-02-09T12:39:08Z NONE

Sounds like a technique from data science: encoding strings, which is actually a number of different techniques.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Suggestion: interpolation of non-numerical data 562075354
583783815 https://github.com/pydata/xarray/issues/3763#issuecomment-583783815 https://api.github.com/repos/pydata/xarray/issues/3763 MDEyOklzc3VlQ29tbWVudDU4Mzc4MzgxNQ== crusaderky 6213168 2020-02-08T22:39:54Z 2020-02-08T22:39:54Z MEMBER

Hi Scott,

I can't think of a generic situation where text labels have a numerical weight that is hardcoded to their position in the alphabet, e.g. mean("A", "C") = "B". What one typically does is map the labels (any string) to their (arbitrary) weights, interpolate the weights, and then do a nearest-neighbour interpolation (or floor or ceil, depending on preference) back to the label. This is what you described, with the special caveat that your weights are the ASCII codes of your labels.
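
A minimal sketch of the weight-mapping approach described above, in pure Python and not taken from the thread. The function names and the `weights` table are hypothetical. The weights are interpolated linearly, and each result is then snapped back to the label whose weight is closest.

```python
from bisect import bisect_right


def interp_linear(x, xs, ys):
    """Piecewise-linear interpolation; xs must be sorted, x within [xs[0], xs[-1]]."""
    i = min(max(bisect_right(xs, x) - 1, 0), len(xs) - 2)
    frac = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + frac * (ys[i + 1] - ys[i])


def interp_labels(t_new, t, labels, weights):
    """Map labels to weights, interpolate the weights, snap back to the nearest label."""
    label_of = {w: lab for lab, w in weights.items()}
    numeric = [weights[lab] for lab in labels]
    out = []
    for x in t_new:
        w = interp_linear(x, t, numeric)
        closest = min(label_of, key=lambda cand: abs(cand - w))
        out.append(label_of[closest])
    return out


# Arbitrary, user-chosen weights (an assumption for this example).
weights = {"A": 0.0, "B": 1.0, "C": 2.0}
t = [0.0, 1.0, 2.0]
labels = ["A", "C", "C"]
t_new = [0.0, 0.5, 1.0, 1.5, 2.0]
result = interp_labels(t_new, t, labels, weights)
# result == ["A", "B", "C", "C", "C"]
```

With these weights the midpoint between "A" and "C" comes out as "B", a label that never occurs in the input. That is exactly why the choice of weights matters and why alphabetical position is rarely a meaningful default.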

On Sat, 8 Feb 2020 at 20:43, scottcanoe notifications@github.com wrote:

I'd like to suggest an improvement to enable a repeat-based interpolation mechanism for non-numerical data. In my use case, I have time series data (dim='t'), where each timepoint is associated with a measured variable (e.g., fluorescence) as well as a label indicating the stimulus being presented (e.g., "A"). However, if and when I need to upsample my data, the string-valued stimulus information is lost, and it's imperative that the stimulus information is still present when working on the resampled data.

My solution to this problem has been to map the labels to integers, use nearest-neighbor interpolation on the integer-valued representation, and finally map the integers back to labels. (I'm willing to bet there's a name for this technique, but I wasn't able to find it by googling around for it.)
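
The workaround described in this paragraph might look roughly like the sketch below. It is pure Python, and the helper name `upsample_labels_nearest` and the sample data are hypothetical. Labels are encoded as integer codes, the codes are resampled by nearest neighbour, and the codes are decoded back to labels.

```python
def upsample_labels_nearest(t_new, t, labels):
    """Encode labels as integers, resample by nearest neighbour, decode again."""
    categories = sorted(set(labels))
    code_of = {lab: i for i, lab in enumerate(categories)}
    codes = [code_of[lab] for lab in labels]

    new_codes = []
    for x in t_new:
        # Index of the original timepoint closest to x.
        k = min(range(len(t)), key=lambda i: abs(t[i] - x))
        new_codes.append(codes[k])

    return [categories[c] for c in new_codes]


t = [0.0, 1.0, 2.0]
stimulus = ["A", "A", "B"]
t_new = [0.0, 0.4, 0.8, 1.2, 1.6, 2.0]
upsampled = upsample_labels_nearest(t_new, t, stimulus)
# upsampled == ["A", "A", "A", "A", "B", "B"]
```

Because only nearest-neighbour lookup is applied to the codes, the integer values themselves never influence the result, so any one-to-one encoding gives the same labels.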

I'm new to xarray, but so far as I can tell this functionality is not provided. More specifically, calling DataArray.interp on a string-valued array results in a type error (<builtins.TypeError: interp only works for a numeric type array. Given <U1.>).

Finally, I'd like to applaud you for your work on xarray. I only wish I had found it sooner!


{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Suggestion: interpolation of non-numerical data 562075354

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
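
As a rough illustration of how the schema above supports the filter shown at the top of this page (rows where issue = 562075354, sorted by updated_at descending), the following sketch loads the schema into an in-memory SQLite database using Python's standard `sqlite3` module. The exact SQL that the page runs is not shown here, so the query is an assumption. Only the id, user, updated_at, and issue values listed on this page are inserted.

```python
import sqlite3

SCHEMA = """
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
"""

conn = sqlite3.connect(":memory:")
# Foreign keys are not enforced by default, so the missing
# [users] and [issues] tables do not prevent inserts here.
conn.executescript(SCHEMA)

# (id, user, updated_at, issue), inserted oldest first.
rows = [
    (583783815, 6213168, "2020-02-08T22:39:54Z", 562075354),
    (583840504, 8419157, "2020-02-09T12:39:08Z", 562075354),
    (583869461, 19554926, "2020-02-09T17:12:54Z", 562075354),
    (583874997, 8419157, "2020-02-09T18:07:00Z", 562075354),
    (583903530, 1217238, "2020-02-09T22:48:38Z", 562075354),
]
conn.executemany(
    "INSERT INTO [issue_comments] ([id], [user], [updated_at], [issue]) "
    "VALUES (?, ?, ?, ?)",
    rows,
)

# ISO 8601 timestamps stored as TEXT sort correctly as strings.
query = (
    "SELECT [id] FROM [issue_comments] "
    "WHERE [issue] = ? ORDER BY [updated_at] DESC"
)
ids = [row[0] for row in conn.execute(query, (562075354,))]
```

The index on [issue] is what lets SQLite answer the WHERE clause without scanning the whole table.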