issue_comments
25 rows where author_association = "NONE" and user = 12912489 sorted by updated_at descending
id | html_url | issue_url | node_id | user | created_at | updated_at | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
1370563894 | https://github.com/pydata/xarray/issues/1475#issuecomment-1370563894 | https://api.github.com/repos/pydata/xarray/issues/1475 | IC_kwDOAMm_X85RsSU2 | SimonHeybrock 12912489 | 2023-01-04T07:20:36Z | 2023-01-04T07:20:36Z | NONE | Recently I experimented with an (incomplete) duck-array prototype, wrapping an array of length N+1 in a duck array of length N (such that you can use it as a coordinate for a […]). See https://github.com/scipp/scippx/blob/main/src/scippx/bin_edge_array.py (there is a bunch of unrelated stuff in the repo; you can mostly ignore that). |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Allow DataArray to hold cell boundaries as coordinate variables 242181620 | |
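The bin-edge idea in the comment above (an array of N+1 edges presenting itself as a length-N coordinate) can be sketched minimally like this. This is a toy illustration, not the actual scippx implementation:

```python
import numpy as np

class BinEdgeArray:
    """Toy duck array: stores N+1 bin edges but reports length N,
    so it can act as a coordinate for N data values (one per bin)."""

    def __init__(self, edges):
        self.edges = np.asarray(edges, dtype=float)

    def __len__(self):
        return len(self.edges) - 1

    @property
    def shape(self):
        return (len(self),)

    @property
    def left(self):
        return self.edges[:-1]   # left edge of each bin

    @property
    def right(self):
        return self.edges[1:]    # right edge of each bin

    def centers(self):
        return 0.5 * (self.left + self.right)

coord = BinEdgeArray([0.0, 1.0, 2.0, 4.0])  # 4 edges -> 3 bins
assert len(coord) == 3
```

With such a wrapper, a data array of length 3 and this coordinate have matching lengths, even though 4 numbers are stored for the edges.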
1288374461 | https://github.com/pydata/xarray/issues/4285#issuecomment-1288374461 | https://api.github.com/repos/pydata/xarray/issues/4285 | IC_kwDOAMm_X85Mywi9 | SimonHeybrock 12912489 | 2022-10-24T03:44:44Z | 2022-11-03T17:04:15Z | NONE | Also note the Ragged Array Summit on Scientific Python. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Awkward array backend? 667864088 | |
1283416324 | https://github.com/pydata/xarray/issues/4285#issuecomment-1283416324 | https://api.github.com/repos/pydata/xarray/issues/4285 | IC_kwDOAMm_X85Mf2EE | SimonHeybrock 12912489 | 2022-10-19T04:39:06Z | 2022-10-19T04:39:06Z | NONE | A possibly relevant distinction that had not occurred to me previously is the example by @milancurcic: if I understand it correctly, this type of data is essentially an array of variable-length time series (a list of lists?), i.e., there is an order within each inner list. This is conceptually different from the data I am typically dealing with, where each inner list is a list of records without specific ordering. |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Awkward array backend? 667864088 | |
1251836989 | https://github.com/pydata/xarray/issues/7045#issuecomment-1251836989 | https://api.github.com/repos/pydata/xarray/issues/7045 | IC_kwDOAMm_X85KnYQ9 | SimonHeybrock 12912489 | 2022-09-20T04:48:07Z | 2022-09-20T06:13:32Z | NONE | This suggestion looks roughly like what we are discussing in https://github.com/pydata/xarray/discussions/7041#discussioncomment-3662179, i.e., using a custom index that avoids this? So maybe the question here is whether such an […]. Aside from that, with my outside perspective (having used Xarray extremely little, looking at the docs and code occasionally, but developing a similar library that does not have indexes): indexes (including alignment behavior) feel like a massive complication of Xarray, both conceptually (which includes documentation and teaching efforts) and in code. If all you require is the […] |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Should Xarray stop doing automatic index-based alignment? 1376109308 | |
1243222416 | https://github.com/pydata/xarray/issues/3981#issuecomment-1243222416 | https://api.github.com/repos/pydata/xarray/issues/3981 | IC_kwDOAMm_X85KGhGQ | SimonHeybrock 12912489 | 2022-09-12T04:59:42Z | 2022-09-12T04:59:42Z | NONE | I note that […] |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
[Proposal] Expose Variable without Pandas dependency 602256880 | |
1243218951 | https://github.com/pydata/xarray/issues/3981#issuecomment-1243218951 | https://api.github.com/repos/pydata/xarray/issues/3981 | IC_kwDOAMm_X85KGgQH | SimonHeybrock 12912489 | 2022-09-12T04:51:23Z | 2022-09-12T04:55:13Z | NONE | This is something I am getting more and more interested in. We (scipp) currently have a C++ implementation (with Python bindings) of a simpler version of […]. While I am still far from having reached a conclusion (or convincing anyone here to support this), investing in technology that is adopted and carried by the community is considered important here. In other words, we may in principle be able to help out and invest some time into this. One important precondition would be full compatibility with other custom array containers: for our applications we do not just need to add labelled axes, but also units, masks, bin edges, and ragged-data support. I am currently toying with the idea of a "stack" of Python array libraries (I guess you would call them duck arrays?) that add these features one by one, selectively, but can also all be used independently --- unlike Scipp, where you get all or nothing, and lose the ability to use NumPy (or other) array libraries under the hood. Each of those libraries could be small and simple, focusing on just one specific aspect, but everything should be composable. For example, we can imagine a […] |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
[Proposal] Expose Variable without Pandas dependency 602256880 | |
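The composable "stack" described in the comment above could look roughly like this. All class names are hypothetical; each layer adds exactly one feature and delegates the rest to the wrapped array:

```python
import numpy as np

class UnitArray:
    """Hypothetical layer adding only physical units."""
    def __init__(self, values, unit):
        self.values = np.asarray(values, dtype=float)
        self.unit = unit

    def __add__(self, other):
        if self.unit != other.unit:
            raise ValueError(f"unit mismatch: {self.unit} vs {other.unit}")
        return UnitArray(self.values + other.values, self.unit)

class MaskedArray:
    """Hypothetical layer adding only masks; wraps any array-like,
    including a UnitArray, so the features compose."""
    def __init__(self, data, mask):
        self.data = data
        self.mask = np.asarray(mask, dtype=bool)

    def __add__(self, other):
        # An element is masked in the result if it was masked in either input.
        return MaskedArray(self.data + other.data, self.mask | other.mask)

a = MaskedArray(UnitArray([1.0, 2.0], 'm'), [False, True])
b = MaskedArray(UnitArray([3.0, 4.0], 'm'), [False, False])
c = a + b  # values [4.0, 6.0], unit 'm', mask [False, True]
```

The point of the layering is that `MaskedArray` never needs to know about units: swapping `UnitArray` for a plain NumPy array (or adding a further layer for bin edges) requires no change to the mask layer.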
552756428 | https://github.com/pydata/xarray/issues/3509#issuecomment-552756428 | https://api.github.com/repos/pydata/xarray/issues/3509 | MDEyOklzc3VlQ29tbWVudDU1Mjc1NjQyOA== | SimonHeybrock 12912489 | 2019-11-12T06:37:20Z | 2022-09-09T13:08:45Z | NONE | @jthielen Thanks for your reply! I am not familiar with […]
Units: I do not see any advantage using scipp. The current unit system in scipp is based on […]
Uncertainties: There are two routes to take here:
1. Store a single array of value/variance pairs
2. Store two arrays (values array and uncertainties array)
Other aspects: Scipp supports a generic […]
Other: […]
Questions: […]
 |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
NEP 18, physical units, uncertainties, and the scipp library? 520815068 | |
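The two storage routes for uncertainties listed in the comment above can be illustrated with plain NumPy. This is illustrative only, not scipp's implementation:

```python
import numpy as np

# Route 1: a single array of value/variance pairs (record layout).
# Each element keeps its value and its variance adjacent in memory.
pairs = np.array([(1.0, 0.1), (2.0, 0.2)],
                 dtype=[('value', 'f8'), ('variance', 'f8')])

# Route 2: two parallel arrays. Plain NumPy ops apply directly to the
# values, and the variances follow the usual propagation rules, e.g.
# for the sum of independent quantities: var(a + b) = var(a) + var(b).
values = np.array([1.0, 2.0])
variances = np.array([0.1, 0.2])

summed_values = values + values
summed_variances = variances + variances
```

Route 1 keeps each value/variance pair contiguous (good locality per element); route 2 keeps the values a plain contiguous array that any NumPy-consuming code can use unchanged.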
1222318201 | https://github.com/pydata/xarray/issues/6591#issuecomment-1222318201 | https://api.github.com/repos/pydata/xarray/issues/6591 | IC_kwDOAMm_X85I2xh5 | SimonHeybrock 12912489 | 2022-08-22T12:53:22Z | 2022-08-22T12:53:22Z | NONE | Note duplicate (or related): #5750 |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Xarray ignores the underlying unit of "datetime64" types. 1232587833 | |
1216208075 | https://github.com/pydata/xarray/issues/4285#issuecomment-1216208075 | https://api.github.com/repos/pydata/xarray/issues/4285 | IC_kwDOAMm_X85IfdzL | SimonHeybrock 12912489 | 2022-08-16T06:38:32Z | 2022-08-16T06:42:28Z | NONE | @jpivarski
You are right that "sparse" is misleading. Since it is indeed most commonly used for sparse matrix/array representations we are now usually avoiding this term (and refer to it as binned data, or ragged data instead). Obviously our title page needs an update 😬 .
This does actually apply to Scipp's binned data. A […] |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Awkward array backend? 667864088 | |
1216107702 | https://github.com/pydata/xarray/issues/4285#issuecomment-1216107702 | https://api.github.com/repos/pydata/xarray/issues/4285 | IC_kwDOAMm_X85IfFS2 | SimonHeybrock 12912489 | 2022-08-16T03:43:29Z | 2022-08-16T05:11:50Z | NONE |
Anecdotal evidence that this is indeed not a good solution: scipp's "ragged data" implementation was originally implemented with such variable-length dimension support. This led to a whole series of problems, including significantly complicating […] |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Awkward array backend? 667864088 | |
1216144957 | https://github.com/pydata/xarray/issues/4285#issuecomment-1216144957 | https://api.github.com/repos/pydata/xarray/issues/4285 | IC_kwDOAMm_X85IfOY9 | SimonHeybrock 12912489 | 2022-08-16T04:54:25Z | 2022-08-16T04:54:25Z | NONE | Is anyone here going to EuroScipy (two weeks from now) and interested in having a chat/discussion about ragged data? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Awkward array backend? 667864088 | |
1216125098 | https://github.com/pydata/xarray/issues/4285#issuecomment-1216125098 | https://api.github.com/repos/pydata/xarray/issues/4285 | IC_kwDOAMm_X85IfJiq | SimonHeybrock 12912489 | 2022-08-16T04:17:52Z | 2022-08-16T04:17:52Z | NONE | @danielballan mentioned that the photon community (synchrotrons/X-ray scattering) is starting to talk more and more about ragged data related to "event mode" data collection as well. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Awkward array backend? 667864088 | |
1216123818 | https://github.com/pydata/xarray/issues/4285#issuecomment-1216123818 | https://api.github.com/repos/pydata/xarray/issues/4285 | IC_kwDOAMm_X85IfJOq | SimonHeybrock 12912489 | 2022-08-16T04:15:24Z | 2022-08-16T04:15:24Z | NONE |
Partially, but the bigger challenge may be the related algorithms, e.g., for getting data into this layout, and for switching to other ragged layouts. For context, one of the main reasons for our data layout is the ability to make cuts/slices quickly. We frequently deal with 2-D, 3-D, and 4-D data. For example, a 3-D case may be the momentum transfer $\vec Q$ in a scattering process, with a "record" for every detected neutron. The desired final resolution may exceed 1000 per dimension (for the 3 components of $\vec Q$). On top of this there may be additional dimensions relating to environment parameters of the sample under study, such as temperature, pressure, or strain. This would lead to bin counts that cannot be handled easily (in single-node memory). A naive solution could be to simply work with something like […]. Scipp's ragged data can be considered a "partial sorting", used to build a sort of "index". Based on all this we can then, e.g., quickly compute high-resolution cuts. Say we are in 3-D (Qx, Qy, Qz). We would not have bin sizes that match the final resolution required by the science. Instead we could use 50x50x50 bins. Then we can very quickly produce a high-res 2-D plot (say 1000x1000, Qx vs Qz, or whatever), since our binned data format reduces the data/memory you have to load and consider by a factor of up to 50 (in this example). |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Awkward array backend? 667864088 | |
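The "partial sorting as an index" idea from the comment above can be sketched with plain NumPy. This is a simplified 1-D version; the names, sizes, and the 50-bin choice are purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
qx, qz = rng.uniform(0, 1, (2, 100_000))   # toy "event" coordinates

# Coarse binning along qx acts as a partial sort, i.e. an index:
# events are grouped by coarse bin, so a cut only has to touch the
# coarse bins overlapping the requested window.
n_coarse = 50
coarse = np.floor(qx * n_coarse).astype(int)
order = np.argsort(coarse, kind='stable')
qx_sorted, qz_sorted = qx[order], qz[order]
# starts[i]:starts[i+1] is the slice of events in coarse bin i
starts = np.searchsorted(coarse[order], np.arange(n_coarse + 1))

# High-resolution 1-D cut over the window covered by coarse bin 20,
# i.e. qx in [0.40, 0.42): only ~1/50 of the events are considered.
sel = slice(starts[20], starts[21])
hist, _ = np.histogram(qz_sorted[sel], bins=1000, range=(0.0, 1.0))
```

The coarse bins need not match the final resolution: the histogram at the end can use arbitrarily fine bins, while the coarse grouping limits how much data must be loaded for the cut.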
634558423 | https://github.com/pydata/xarray/issues/3213#issuecomment-634558423 | https://api.github.com/repos/pydata/xarray/issues/3213 | MDEyOklzc3VlQ29tbWVudDYzNDU1ODQyMw== | SimonHeybrock 12912489 | 2020-05-27T10:00:25Z | 2021-10-15T04:38:25Z | NONE | @pnsaevik If the approach we adopt in scipp could be ported to xarray, you would be able to do something like the following (assuming that the ragged array representation you have in mind is "list of lists"):

```python
data = my_load_netcdf(...)  # list of lists

# assume 'x' is the dimension of the nested lists
bin_edges = sc.Variable(dims=['x'], values=[0.1, 0.3, 0.5, 0.7, 0.9])
realigned = sc.realign(data, {'x': bin_edges})
filtered = realigned['x', 1:3].copy()
my_store_netcdf(filtered.unaligned, ...)
```

Basically, we have slicing for the "realigned" wrapper. It performs a filter operation when copied. Edit 2021: the above example is very outdated; we have cleaned up the mechanism, see https://scipp.github.io/user-guide/binned-data/binned-data.html. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
How should xarray use/support sparse arrays? 479942077 | |
632536798 | https://github.com/pydata/xarray/issues/3213#issuecomment-632536798 | https://api.github.com/repos/pydata/xarray/issues/3213 | MDEyOklzc3VlQ29tbWVudDYzMjUzNjc5OA== | SimonHeybrock 12912489 | 2020-05-22T07:20:35Z | 2021-10-15T04:36:17Z | NONE | I am not familiar with the details of the various applications people in this discussion have, but here is an approach we are taking, trying to solve variations of the problem "data scattered in multi-dimensional space", or irregular time-series data. See https://scipp.github.io/user-guide/binned-data/binned-data.html for an illustrated description. The basic idea is to keep data in a linear representation and wrap it in a "realigned" wrapper.

One reason for this development was to provide a pathway to use dask with our type of data (independent time series at a large number of points in space, with chunking along the "time-series", which is not a dimension since every time series has a different length). With the linked approach we could use dask to distribute the linear underlying representation, keeping the lightweight realigned wrapper on all workers. We are still in early experimentation with this (the dask part is not actually in development yet). It probably has performance issues if more than "millions" of points are realigned --- our case is millions of time series with thousands/millions of time points in each, but the two do not mix (not both are realigned, and if they are, it is independently), so we do not run into the performance issue in most cases.

In principle I could imagine this non-destructive realignment approach could be mapped to xarray, so it may be of interest to people here. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
How should xarray use/support sparse arrays? 479942077 | |
890459710 | https://github.com/pydata/xarray/issues/5648#issuecomment-890459710 | https://api.github.com/repos/pydata/xarray/issues/5648 | IC_kwDOAMm_X841E1Y- | SimonHeybrock 12912489 | 2021-08-01T06:12:19Z | 2021-08-01T06:12:19Z | NONE |
Thanks! I am definitely interested. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Duck array compatibility meeting 956103236 | |
872054936 | https://github.com/pydata/xarray/pull/5201#issuecomment-872054936 | https://api.github.com/repos/pydata/xarray/issues/5201 | MDEyOklzc3VlQ29tbWVudDg3MjA1NDkzNg== | SimonHeybrock 12912489 | 2021-07-01T08:49:04Z | 2021-07-01T08:49:04Z | NONE | Before: [screenshot] After: [screenshot] On the top band, I have used the screenshot timeline to zoom onto the time window where the cell is being executed (marked with […] |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Fix lag in Jupyter caused by CSS in `_repr_html_` 863506023 | |
872037862 | https://github.com/pydata/xarray/pull/5201#issuecomment-872037862 | https://api.github.com/repos/pydata/xarray/issues/5201 | MDEyOklzc3VlQ29tbWVudDg3MjAzNzg2Mg== | SimonHeybrock 12912489 | 2021-07-01T08:25:12Z | 2021-07-01T08:25:12Z | NONE |
I think this is also a problem, but I believe it is independent and not improved by the CSS changes in this branch. Maybe a Jupyter issue, not related to the libraries in use?
I had used Chrome and opened "Developer Tools" > "Performance" tab:
- start recording a profile
- run a cell that displays HTML output
- stop the profile

I think I had observed a difference in the "Render" part of the profile, but I cannot check now (I may be able to later today when I am back at my main computer). |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Fix lag in Jupyter caused by CSS in `_repr_html_` 863506023 | |
872013761 | https://github.com/pydata/xarray/pull/5201#issuecomment-872013761 | https://api.github.com/repos/pydata/xarray/issues/5201 | MDEyOklzc3VlQ29tbWVudDg3MjAxMzc2MQ== | SimonHeybrock 12912489 | 2021-07-01T07:54:15Z | 2021-07-01T07:54:15Z | NONE | Indeed, such timings do not include the CSS timings. The only way I was able to see it was to use the Web Dev tools that come as part of Firefox or Chrome. You should be able to see the timings included there when recording a profile. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Fix lag in Jupyter caused by CSS in `_repr_html_` 863506023 | |
871996744 | https://github.com/pydata/xarray/pull/5201#issuecomment-871996744 | https://api.github.com/repos/pydata/xarray/issues/5201 | MDEyOklzc3VlQ29tbWVudDg3MTk5Njc0NA== | SimonHeybrock 12912489 | 2021-07-01T07:26:50Z | 2021-07-01T07:26:50Z | NONE | @fujiisoup Maybe I missed it in the video, but did you try whether there are differences when running an individual cell, not just when loading the page for the first time? My point is:
- When a page is first loaded, the CSS for everything (all cells) obviously has to be processed. That cannot be changed.
- When updating a single cell, prior to this branch, CSS changes were triggered for all cells.
- With this branch, only the current cell should be affected. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Fix lag in Jupyter caused by CSS in `_repr_html_` 863506023 | |
828259218 | https://github.com/pydata/xarray/pull/5201#issuecomment-828259218 | https://api.github.com/repos/pydata/xarray/issues/5201 | MDEyOklzc3VlQ29tbWVudDgyODI1OTIxOA== | SimonHeybrock 12912489 | 2021-04-28T08:27:14Z | 2021-04-28T08:27:14Z | NONE | Or not quite: the DOM seems to end at […] |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Fix lag in Jupyter caused by CSS in `_repr_html_` 863506023 | |
828236723 | https://github.com/pydata/xarray/pull/5201#issuecomment-828236723 | https://api.github.com/repos/pydata/xarray/issues/5201 | MDEyOklzc3VlQ29tbWVudDgyODIzNjcyMw== | SimonHeybrock 12912489 | 2021-04-28T07:55:23Z | 2021-04-28T07:55:23Z | NONE | Cheers, that was what I was looking for! |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Fix lag in Jupyter caused by CSS in `_repr_html_` 863506023 | |
828216637 | https://github.com/pydata/xarray/pull/5201#issuecomment-828216637 | https://api.github.com/repos/pydata/xarray/issues/5201 | MDEyOklzc3VlQ29tbWVudDgyODIxNjYzNw== | SimonHeybrock 12912489 | 2021-04-28T07:26:41Z | 2021-04-28T07:28:06Z | NONE | Ok, I tried, but got stuck: I can reproduce the issue in VS Code. However, I cannot find a way to inspect the CSS in VS Code's Jupyter console. The theme itself is a […] We somehow need to detect the theme within […] |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Fix lag in Jupyter caused by CSS in `_repr_html_` 863506023 | |
826492174 | https://github.com/pydata/xarray/pull/5201#issuecomment-826492174 | https://api.github.com/repos/pydata/xarray/issues/5201 | MDEyOklzc3VlQ29tbWVudDgyNjQ5MjE3NA== | SimonHeybrock 12912489 | 2021-04-26T04:29:00Z | 2021-04-26T04:29:00Z | NONE | I don't have VS Code so I can't try, but looking at the CSS I feel that this would actually break the colors there, since I moved the general settings from […] So I would recommend not merging this unless someone is able to try it out. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Fix lag in Jupyter caused by CSS in `_repr_html_` 863506023 | |
526541433 | https://github.com/pydata/xarray/pull/1820#issuecomment-526541433 | https://api.github.com/repos/pydata/xarray/issues/1820 | MDEyOklzc3VlQ29tbWVudDUyNjU0MTQzMw== | SimonHeybrock 12912489 | 2019-08-30T09:55:16Z | 2019-08-30T09:55:16Z | NONE |
We have done something similar using inline SVG (see, e.g., https://scipp.readthedocs.io/en/latest/user-guide/data-structures.html#Dataset). It is basically a hack for testing right now, but is sufficient for auto-generated illustrations in the documentation. I am pretty impressed by the HTML representation previewed in https://github.com/pydata/xarray/issues/1627. Since our data structures are very similar, I would be happy to contribute to this output rendering somehow, since we could then also benefit from it (with a few tweaks, probably). So let me know if I can help out somehow (unfortunately I do not know much HTML and CSS, just C++ and a bit of Python). |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
WIP: html repr 287844110 |
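For context, the mechanism behind such an HTML repr is Jupyter's rich-display hook: any object with a `_repr_html_` method returning an HTML string is rendered as HTML in the notebook. A minimal sketch with a toy class (this is not xarray's implementation):

```python
class Dataset:
    """Toy container illustrating the `_repr_html_` hook that Jupyter
    uses to render rich HTML output (names and structure illustrative)."""

    def __init__(self, **arrays):
        self.arrays = arrays

    def _repr_html_(self):
        # One table row per variable: name and number of values.
        rows = ''.join(
            f'<tr><td>{name}</td><td>{len(values)} values</td></tr>'
            for name, values in self.arrays.items()
        )
        return f'<table><tr><th>name</th><th>size</th></tr>{rows}</table>'

ds = Dataset(temperature=[1.0, 2.0], pressure=[3.0, 4.0, 5.0])
html = ds._repr_html_()
```

In a notebook, evaluating `ds` in a cell would display this table; the CSS discussion in the PR above concerns the stylesheet shipped alongside such generated HTML.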
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);