
issue_comments


7 rows where user = 22542812 sorted by updated_at descending


issue 6

  • Add writing complex data to docs 2
  • Support non-string dimension/variable names 1
  • Performance: numpy indexes small amounts of data 1000 faster than xarray 1
  • DataArray.transpose cannot handle Ellipsis 1
  • Dropping of unaligned Data at assignment to Dataset 1
  • Boolean confusion 1

user 1

  • DerWeh · 7

author_association 1

  • NONE 7
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
832454582 https://github.com/pydata/xarray/issues/5254#issuecomment-832454582 https://api.github.com/repos/pydata/xarray/issues/5254 MDEyOklzc3VlQ29tbWVudDgzMjQ1NDU4Mg== DerWeh 22542812 2021-05-05T06:48:46Z 2021-05-05T06:48:46Z NONE

@mathause Indeed, I am using `engine="h5netcdf", invalid_netcdf=True`.

I would also agree that expanding the `isinstance` check is not the best solution. Duck typing makes reliable instance checks quite painful.
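The pain point can be sketched with a toy wrapper (the `DuckArray` class below is hypothetical, purely for illustration):

```python
import numpy as np

class DuckArray:
    """Hypothetical wrapper that forwards the attributes array code inspects."""

    def __init__(self, data):
        self._data = np.asarray(data)

    @property
    def dtype(self):
        return self._data.dtype

    @property
    def shape(self):
        return self._data.shape

duck = DuckArray([True, False])
# the wrapper quacks like a boolean array where it matters...
assert duck.dtype == np.bool_
# ...yet an isinstance check against np.ndarray still rejects it
assert not isinstance(duck, np.ndarray)
```

Every new duck type would have to be added to the `isinstance` tuple by hand, which is why expanding it does not scale.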

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Boolean confusion 874695249
722565840 https://github.com/pydata/xarray/issues/2292#issuecomment-722565840 https://api.github.com/repos/pydata/xarray/issues/2292 MDEyOklzc3VlQ29tbWVudDcyMjU2NTg0MA== DerWeh 22542812 2020-11-05T18:41:24Z 2020-11-05T18:41:24Z NONE

I just came across this question as I tried something similar to @joshburkart. Using a string enum instead, the code works in principle:

```python
import enum

import numpy as np
import pandas as pd
import xarray as xr

class CoordId(str, enum.Enum):
    LAT = 'lat'
    LON = 'lon'

pd.DataFrame({CoordId.LAT: [1, 2, 3]}).to_csv()
# Returns: ',CoordId.LAT\n0,1\n1,2\n2,3\n'

xr.DataArray(
    data=np.arange(3 * 2).reshape(3, 2),
    coords={CoordId.LAT: [1, 2, 3], CoordId.LON: [7, 8]},
    dims=[CoordId.LAT, CoordId.LON],
)
```

Output:

```
<xarray.DataArray (lat: 3, lon: 2)>
array([[0, 1],
       [2, 3],
       [4, 5]])
Coordinates:
  * lat      (CoordId.LAT) int64 1 2 3
  * lon      (CoordId.LON) int64 7 8
```

The results are somewhat ambivalent, however: the dimensions are still enum members (`dims = (<CoordId.LAT: 'lat'>, <CoordId.LON: 'lon'>)`), while the coordinate names are plain strings. After writing and reading the DataArray, everything is a plain string, but the elements can still be accessed via the enum members, as they compare equal to the strings.
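The equality behaviour that makes this work can be checked with the standard library alone (a minimal sketch, independent of xarray):

```python
import enum

class CoordId(str, enum.Enum):
    LAT = 'lat'
    LON = 'lon'

# members compare equal to their plain-string values, in both directions
assert CoordId.LAT == 'lat' and 'lat' == CoordId.LAT
# so membership tests against plain-string dims still succeed
assert CoordId.LAT in ('lat', 'lon')
# inherited str methods operate on the value, not the member name
assert CoordId.LAT.upper() == 'LAT'
```

Note that equality, not identity, is what survives the round-trip: once the names come back as plain strings, only comparisons like the above keep the enum usable.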

  Support non-string dimension/variable names 341643235
707800003 https://github.com/pydata/xarray/issues/4507#issuecomment-707800003 https://api.github.com/repos/pydata/xarray/issues/4507 MDEyOklzc3VlQ29tbWVudDcwNzgwMDAwMw== DerWeh 22542812 2020-10-13T14:59:36Z 2020-10-13T14:59:36Z NONE

I agree that the given example problem is related to tolerance.

In principle, I see the problem in the current practice of just dropping data that doesn't align. If I perform an assignment with `=`, I do not expect to lose any data.

Another example would be assigning:

```python
dataset['data2'] = xr.DataArray(np.random.random(50), dims=['x'], coords={'x': np.linspace(2, 12)})
```

This line of code effectively does nothing: I generate data, and upon assignment it is dropped.

But this might be a bit of a philosophical question about the governing design principle. Personally, I think an assignment should only be possible if the assigned coordinates are a subset of the dataset's coordinates.
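A minimal sketch of that silent drop (coordinate values chosen so that only the label `x == 2.0` overlaps with the integer index `0..9`; assuming current xarray alignment semantics):

```python
import numpy as np
import xarray as xr

# a dataset whose index is x = 0..9
dataset = xr.Dataset(coords={'x': np.arange(10)})

# 50 new values at labels x = 2.0 .. 12.0; only x == 2.0 matches the index
dataset['data2'] = xr.DataArray(np.random.random(50), dims=['x'],
                                coords={'x': np.linspace(2, 12)})

# the assignment aligns to the existing index: all other values are dropped
print(int(dataset['data2'].notnull().sum()))
```

Of the 50 generated values, everything but the single overlapping label disappears without any warning.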

  Dropping of unaligned Data at assignment to Dataset 720315478
560338657 https://github.com/pydata/xarray/issues/3583#issuecomment-560338657 https://api.github.com/repos/pydata/xarray/issues/3583 MDEyOklzc3VlQ29tbWVudDU2MDMzODY1Nw== DerWeh 22542812 2019-12-02T10:40:23Z 2019-12-02T10:40:23Z NONE

I am very sorry, I didn't realize that there had been a new release. Everything is fine after updating.

  DataArray.transpose cannot handle Ellipsis 530448473
530680614 https://github.com/pydata/xarray/issues/3297#issuecomment-530680614 https://api.github.com/repos/pydata/xarray/issues/3297 MDEyOklzc3VlQ29tbWVudDUzMDY4MDYxNA== DerWeh 22542812 2019-09-12T06:11:02Z 2019-09-12T06:11:02Z NONE

Sorry for the slow response, I have little time at the moment. The option `invalid_netcdf=True` is not yet in the latest release, is it? I get a `TypeError`. I would have to use a manually installed version of xarray to use it, right?

  Add writing complex data to docs 491215043
529689580 https://github.com/pydata/xarray/issues/3297#issuecomment-529689580 https://api.github.com/repos/pydata/xarray/issues/3297 MDEyOklzc3VlQ29tbWVudDUyOTY4OTU4MA== DerWeh 22542812 2019-09-09T22:20:08Z 2019-09-09T22:20:08Z NONE

I agree that including it in NetCDF is the 'most sane' approach. I don't really know how much work expanding the standard would be.

To be honest, I don't really care about NetCDF; for me, xarray is just an incredibly good way to make code more stable and readable (though it still has several usability issues). In my community everyone uses HDF5 anyway, so dropping compatibility is no big issue. I just want a way to persist data as it is and conveniently load it for plotting and post-processing.

I would still encourage you to push forward on saving complex data. In most fields people use complex data, and it is hard to convince them that they benefit from this great library if saving simple data takes complicated keyword arguments and annoys you with warnings, compared to a simple `np.savez` on regular ndarrays.
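For comparison, the plain-numpy path alluded to above really is a one-liner round-trip (a minimal sketch, using an in-memory buffer instead of a file):

```python
import io

import numpy as np

z = np.exp(1j * np.linspace(0, np.pi, 5))  # complex-valued samples

buf = io.BytesIO()
np.savez(buf, z=z)        # complex dtypes save without any special handling
buf.seek(0)
restored = np.load(buf)['z']

assert restored.dtype == np.complex128
assert np.allclose(restored, z)
```

No keyword arguments, no warnings; the complex dtype simply round-trips.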

  Add writing complex data to docs 491215043
529569885 https://github.com/pydata/xarray/issues/2799#issuecomment-529569885 https://api.github.com/repos/pydata/xarray/issues/2799 MDEyOklzc3VlQ29tbWVudDUyOTU2OTg4NQ== DerWeh 22542812 2019-09-09T16:53:20Z 2019-09-09T16:53:20Z NONE

It might be interesting to see whether Pythran is an alternative to Cython. It seems to handle high-level NumPy quite well and would retain the readability of Python. Of course, it has its own issues...

But it seems other libraries, e.g. scikit-image, have had good experiences with it.

Sadly, I can't be of much help, as I lack experience (and, most importantly, time).

  Performance: numpy indexes small amounts of data 1000 faster than xarray 416962458

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 14.765ms · About: xarray-datasette