issue_comments
7 rows where issue = 667864088 and user = 12912489 sorted by updated_at descending
id | html_url | issue_url | node_id | user | created_at | updated_at | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
1288374461 | https://github.com/pydata/xarray/issues/4285#issuecomment-1288374461 | https://api.github.com/repos/pydata/xarray/issues/4285 | IC_kwDOAMm_X85Mywi9 | SimonHeybrock 12912489 | 2022-10-24T03:44:44Z | 2022-11-03T17:04:15Z | NONE | Also note the Ragged Array Summit on Scientific Python. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } | | Awkward array backend? 667864088 |
1283416324 | https://github.com/pydata/xarray/issues/4285#issuecomment-1283416324 | https://api.github.com/repos/pydata/xarray/issues/4285 | IC_kwDOAMm_X85Mf2EE | SimonHeybrock 12912489 | 2022-10-19T04:39:06Z | 2022-10-19T04:39:06Z | NONE | A possibly relevant distinction that had not occurred to me previously is the example by @milancurcic: If I understand this correctly then this type of data is essentially an array of variable-length time-series (essentially a list of lists?), i.e., there is an order within each inner list. This is conceptually different from the data I am typically dealing with, where each inner list is a list of records without specific ordering. | { "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } | | Awkward array backend? 667864088 |
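The ordered-vs-unordered distinction drawn in the comment above can be made concrete with a small sketch in plain Python. This is a hypothetical illustration (the data and the `row` helper are invented here); the flat-buffer-plus-offsets layout it uses is the one Awkward Array calls a ListOffsetArray:

```python
# Ragged data stored as one flat buffer plus offsets: row i is
# content[offsets[i]:offsets[i+1]].
content = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
offsets = [0, 3, 4, 6]  # three rows of lengths 3, 1, 2

def row(i):
    return content[offsets[i]:offsets[i + 1]]

# Time-series reading (as in @milancurcic's example): each row is an
# ordered sequence, so reversing a row would change its meaning.
# Record-bag reading (as in the scattering data described later in
# this thread): each row is an unordered collection of records, so
# any permutation of a row is an equally valid representation.
assert row(0) == [1.0, 2.0, 3.0]
assert row(1) == [4.0]
assert sorted(row(2)) == [5.0, 6.0]
```

Both interpretations share the identical storage layout; the difference is purely in which operations (e.g. reversing or sorting within a row) are meaning-preserving.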
1216208075 | https://github.com/pydata/xarray/issues/4285#issuecomment-1216208075 | https://api.github.com/repos/pydata/xarray/issues/4285 | IC_kwDOAMm_X85IfdzL | SimonHeybrock 12912489 | 2022-08-16T06:38:32Z | 2022-08-16T06:42:28Z | NONE | @jpivarski You are right that "sparse" is misleading. Since it is indeed most commonly used for sparse matrix/array representations we are now usually avoiding this term (and refer to it as binned data, or ragged data instead). Obviously our title page needs an update 😬. This does actually apply to Scipp's binned data. A | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } | | Awkward array backend? 667864088 |
1216107702 | https://github.com/pydata/xarray/issues/4285#issuecomment-1216107702 | https://api.github.com/repos/pydata/xarray/issues/4285 | IC_kwDOAMm_X85IfFS2 | SimonHeybrock 12912489 | 2022-08-16T03:43:29Z | 2022-08-16T05:11:50Z | NONE | Anecdotal evidence that this is indeed not a good solution: scipp's "ragged data" implementation was originally implemented with such variable-length dimension support. This led to a whole series of problems, including significantly complicating | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } | | Awkward array backend? 667864088 |
1216144957 | https://github.com/pydata/xarray/issues/4285#issuecomment-1216144957 | https://api.github.com/repos/pydata/xarray/issues/4285 | IC_kwDOAMm_X85IfOY9 | SimonHeybrock 12912489 | 2022-08-16T04:54:25Z | 2022-08-16T04:54:25Z | NONE | Is anyone here going to EuroScipy (two weeks from now) and interested in having a chat/discussion about ragged data? | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } | | Awkward array backend? 667864088 |
1216125098 | https://github.com/pydata/xarray/issues/4285#issuecomment-1216125098 | https://api.github.com/repos/pydata/xarray/issues/4285 | IC_kwDOAMm_X85IfJiq | SimonHeybrock 12912489 | 2022-08-16T04:17:52Z | 2022-08-16T04:17:52Z | NONE | @danielballan mentioned that the photon community (synchrotrons/X-ray scattering) is starting to talk more and more about ragged data related to "event mode" data collection as well. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } | | Awkward array backend? 667864088 |
1216123818 | https://github.com/pydata/xarray/issues/4285#issuecomment-1216123818 | https://api.github.com/repos/pydata/xarray/issues/4285 | IC_kwDOAMm_X85IfJOq | SimonHeybrock 12912489 | 2022-08-16T04:15:24Z | 2022-08-16T04:15:24Z | NONE | Partially, but the bigger challenge may be the related algorithms, e.g., for getting data into this layout, and for switching to other ragged layouts. For context, one of the main reasons for our data layout is the ability to make cuts/slices quickly. We frequently deal with 2-D, 3-D, and 4-D data. For example, a 3-D case may be the momentum transfer $\vec Q$ in a scattering process, with a "record" for every detected neutron. The desired final resolution may exceed 1000 per dimension (for the 3 components of $\vec Q$). On top of this there may be additional dimensions relating to environment parameters of the sample under study, such as temperature, pressure, or strain. This would lead to bin counts that cannot be handled easily (in single-node memory). A naive solution could be to simply work with a flat list of records; Scipp's ragged data can instead be considered a "partial sorting" of such records, used to build a sort of "index". Based on all this we can then, e.g., quickly compute high-resolution cuts. Say we are in 3-D (Qx, Qy, Qz). We would not use bin sizes that match the final resolution required by the science; instead we could use 50x50x50 bins. Then we can very quickly produce a high-res 2-D plot (say 1000x1000, in Qx and Qz, or whatever), since our binned data format reduces the data/memory you have to load and consider by a factor of up to 50 (in this example). | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } | | Awkward array backend? 667864088 |
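The coarse-binning scheme described in the comment above can be sketched in a few lines of NumPy. This is a hypothetical illustration, not Scipp's actual implementation: random records stand in for detected neutrons, and the names `q`, `ijk`, `order`, and `in_layer` are invented here. The point it demonstrates is that binning records into a 50x50x50 grid amounts to a partial sort, after which a cut at one coarse layer touches only about 1/50 of the data:

```python
import numpy as np

# Hypothetical stand-in for detected-neutron records: each row is a
# momentum transfer Q = (Qx, Qy, Qz), here drawn uniformly from [0, 1).
rng = np.random.default_rng(0)
q = rng.uniform(0.0, 1.0, size=(100_000, 3))

# Coarse 50x50x50 binning, far below the ~1000-per-dimension final
# resolution mentioned in the comment.
nbins = 50
ijk = np.minimum((q * nbins).astype(np.intp), nbins - 1)
flat = np.ravel_multi_index((ijk[:, 0], ijk[:, 1], ijk[:, 2]),
                            (nbins, nbins, nbins))

# "Partial sorting": order the records by coarse bin once, up front;
# every coarse bin is then a contiguous run of the reordered data.
order = np.argsort(flat, kind="stable")
assert np.all(np.diff(flat[order]) >= 0)

# A high-resolution 2-D cut at one coarse Qy layer only needs records
# from that layer: roughly a 50x reduction in data considered.
layer = 25
in_layer = order[ijk[order, 1] == layer]
assert abs(len(in_layer) / len(q) - 1 / nbins) < 0.01
```

Within the selected layer one could then histogram (Qx, Qz) at the full 1000x1000 resolution; the coarse bins serve purely as the index that limits how much data the fine histogram has to load.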
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
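The schema above can be exercised with Python's built-in sqlite3 module. A minimal sketch (in-memory database; the REFERENCES clauses are dropped since the users/issues tables are not shown on this page) reproducing the query this page describes, comments where issue = 667864088 and user = 12912489 sorted by updated_at descending, against two toy rows whose ids and timestamps are taken from the table above:

```python
import sqlite3

# In-memory copy of the issue_comments schema (foreign keys omitted).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE [issue_comments] (
   [html_url] TEXT, [issue_url] TEXT, [id] INTEGER PRIMARY KEY,
   [node_id] TEXT, [user] INTEGER, [created_at] TEXT, [updated_at] TEXT,
   [author_association] TEXT, [body] TEXT, [reactions] TEXT,
   [performed_via_github_app] TEXT, [issue] INTEGER
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
""")

# Two toy rows using ids/timestamps from the table above.
conn.executemany(
    "INSERT INTO issue_comments (id, user, updated_at, issue) "
    "VALUES (?, ?, ?, ?)",
    [
        (1288374461, 12912489, "2022-11-03T17:04:15Z", 667864088),
        (1283416324, 12912489, "2022-10-19T04:39:06Z", 667864088),
    ],
)

# The query behind this page: filter by issue and user, newest first.
# ISO 8601 timestamps sort correctly as plain strings.
rows = conn.execute(
    "SELECT id FROM issue_comments "
    "WHERE issue = ? AND user = ? ORDER BY updated_at DESC",
    (667864088, 12912489),
).fetchall()
assert [r[0] for r in rows] == [1288374461, 1283416324]
```

The idx_issue_comments_issue and idx_issue_comments_user indexes exist precisely so that this WHERE clause does not require a full table scan.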