issue_comments

6 rows where issue = 902009258 (Multi-scale datasets and custom indexes), sorted by updated_at descending

benbovy (MEMBER) · 2021-06-02T08:07:38Z · https://github.com/pydata/xarray/issues/5376#issuecomment-852833408

> What would be other examples like ImagePyramidIndex, outside of the multi-scale context?

There can be many examples: spatial indexes, complex grid indexes (e.g., to select cell centers or faces of a staggered grid), distributed indexes, etc. Some of them are illustrated in a presentation I gave a couple of weeks ago (slides here), although all of those examples do perform actual data indexing.

In the multi-scale context, I admit that the name "index" may sound confusing, since an ImagePyramidIndex would not really perform any data indexing based on coordinate labels. Perhaps ImageRescaler would be a better name?

Such an ImageRescaler might still fit the broad purpose of Xarray indexes well, IMHO, since it would enable efficient data visualization through extraction and resampling.

The goal with Xarray custom indexes is to allow (many) kinds of objects whose scope may be much narrower than, e.g., pandas.Index, and which could be reused in a broad range of operations such as data selection, resampling, alignment, etc. Xarray indexes will be an explicit part of Xarray's Dataset/DataArray data model, alongside data variables, coordinates, and attributes. Unlike the latter, though, they are not intended to wrap any (meta)data; instead, they could wrap any structure or object that is built from the (meta)data and that enables efficient operations on the data (a priori based on coordinate labels, although in some contexts like multi-scale this might be more accessory?).
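
To make this concrete, here is a minimal, purely hypothetical sketch of the kind of object described above; the custom-index API was not finalized at the time of this discussion, so the class and method names below are illustrative assumptions, not Xarray's actual interface:

class ImagePyramidIndex:
    """Hypothetical index-like object wrapping precomputed pyramid levels.

    Unlike pandas.Index, it does not map coordinate labels to positions;
    it maps a requested scale to the pyramid level best suited to serve it.
    """

    def __init__(self, levels):
        # levels: mapping of downsampling factor -> dataset for that level,
        # e.g. {1: full_res, 2: half_res, 4: quarter_res}
        self.levels = dict(sorted(levels.items()))

    def select_level(self, scale):
        # Return the coarsest level that still resolves the requested
        # scale, i.e. the largest factor not exceeding `scale`.
        usable = [f for f in self.levels if f <= scale]
        return self.levels[max(usable)] if usable else self.levels[min(self.levels)]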

shoyer (MEMBER) · 2021-06-02T03:25:31Z · https://github.com/pydata/xarray/issues/5376#issuecomment-852686461

I do think multi-scale datasets are common enough across different scientific fields (remote sensing, bio-imaging, simulation output, etc) that this could be worth considering.

thewtex (CONTRIBUTOR) · 2021-06-01T20:41:03Z (updated 2021-06-01T20:41:12Z) · https://github.com/pydata/xarray/issues/5376#issuecomment-852431061

@benbovy I also agree that a data structure that encapsulates the scales behind a nice API would be valuable: you set the currently desired scale, the same Xarray Dataset/DataArray API is available, and that scale can optionally be loaded lazily. Maybe an Index as proposed could be a good API, but I do not have a good enough understanding of how the interface is used in general. What would be other examples like ImagePyramidIndex, outside of the multi-scale context? Should something like Scale be used instead?

Regarding dynamic multi-scale, etc., one use case of interest is interactively processing a larger-than-memory dataset, where you want to visualize the result over a limited domain at an intermediate scale.
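
For illustration, a rough sketch of such a scale-setting wrapper; the class name, the one-Zarr-group-per-level layout, and the store path are assumptions, not an existing API:

import xarray as xr

class MultiscaleDataset:
    """Hypothetical wrapper: set the desired scale, get back a plain
    xarray.Dataset exposing the usual Dataset API. Levels are opened
    lazily, so a level's data is only read when actually accessed."""

    def __init__(self, store, n_levels):
        self._store = store
        self._n_levels = n_levels
        self.scale = 0  # currently selected pyramid level

    @property
    def dataset(self):
        # open_zarr is lazy by default; the one-group-per-level layout
        # (groups "0", "1", ...) is an assumption here.
        return xr.open_zarr(self._store, group=str(self.scale))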

d-v-b (NONE) · 2021-05-28T17:04:27Z · https://github.com/pydata/xarray/issues/5376#issuecomment-850552195

> Are there cases in practice where on-demand downsampling computation would be preferred over pre-computing and storing all pyramid levels for the full dataset? I admit it's probably a very naive question since most workflows on the client side would likely start by loading the top level (lowest resolution) dataset at full extent, which would require pre-computing the whole thing?

I'm not sure when dynamic downsampling would be preferred over loading previously downsampled images from disk. In my usage, the application consuming the multiresolution images is an interactive data visualization tool, and the goal is to minimize latency / maximize responsiveness of the visualization; this would be difficult if the multiresolution images were generated dynamically from the full image -- under a dynamic scheme the lowest-resolution image, i.e. the one that should be fastest to load, would instead require the most I/O and compute to generate.

> Are there cases where it makes sense to pre-compute all the pyramid levels in-memory (could be, e.g., chunked dask arrays persisted on a distributed cluster) without the need to store them?

Although I do not do this today, I can think of a lot of uses for this functionality -- a data processing pipeline could expose intermediate data over HTTP via xpublish, but this would require a good caching layer to prevent re-computing the same region of the data repeatedly.
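
For the second quoted question, a rough sketch, assuming block-mean downsampling is acceptable: pyramid levels can be built lazily with xarray's coarsen on top of dask and optionally persisted in (distributed) memory without ever being written to disk (the shape, chunking, and level count here are arbitrary):

import dask.array as da
import xarray as xr

# A lazily chunked 2-D image; in practice this would come from
# open_zarr, open_dataset, etc.
image = xr.DataArray(
    da.random.random((8192, 8192), chunks=2048),
    dims=("y", "x"),
    name="image",
)

# Level 0 is the full-resolution array; each further level halves the
# resolution via block-mean downsampling. Nothing is computed yet.
levels = [image]
for _ in range(4):
    levels.append(levels[-1].coarsen(y=2, x=2).mean())

# Optionally materialize all levels in memory (e.g. on a dask cluster)
# without storing them:
# levels = [level.persist() for level in levels]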

benbovy (MEMBER) · 2021-05-28T10:04:52Z · https://github.com/pydata/xarray/issues/5376#issuecomment-850307092

> I think there's certainly something to be won just by having a data structure which says these arrays/datasets represent a multiscale series.

I agree, but I'm wondering whether the multiscale series couldn't also be viewed as something that can be abstracted away, i.e., the original dataset (level 0) is the "real" dataset while all the other levels are derived datasets that are convenient for some specific applications (e.g., visualization) but not very useful in general.

Having a single xarray.Dataset with a custom index (+ a custom Dataset extension) taking care of all the multiscale stuff may have benefits too. For example, it would be pretty straightforward to reuse a tool like https://github.com/xarray-contrib/xpublish to interactively (pre)fetch data to web-based clients (via some custom API endpoints). More generally, I guess it's easier to integrate with existing tools built on top of Xarray than to add support for a new data structure.
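
As a minimal illustration of that idea (the store path and the one-group-per-level layout are made-up assumptions; custom multiscale endpoints would be added on top):

import xarray as xr
import xpublish  # noqa: F401 -- registers the .rest accessor on Dataset

# Serve one pyramid level over HTTP so that web-based clients can
# (pre)fetch chunks on demand.
ds = xr.open_zarr("pyramid.zarr", group="0")
ds.rest.serve(host="0.0.0.0", port=9000)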

Some related questions (out of curiosity):

  • Are there cases in practice where on-demand downsampling computation would be preferred over pre-computing and storing all pyramid levels for the full dataset? I admit it's probably a very naive question since most workflows on the client side would likely start by loading the top level (lowest resolution) dataset at full extent, which would require pre-computing the whole thing?
  • Are there cases where it makes sense to pre-compute all the pyramid levels in-memory (could be, e.g., chunked dask arrays persisted on a distributed cluster) without the need to store them?

joshmoore (NONE) · 2021-05-26T16:23:13Z · https://github.com/pydata/xarray/issues/5376#issuecomment-848914165

I don't think I'm familiar enough to really judge between the suggestions, @benbovy, but I'm intrigued. I think there's certainly something to be won just by having a data structure which says these arrays/datasets represent a multiscale series. One real benefit, though, will be when access to that structure can simplify the client code needed to interactively load that data, e.g. with prefetching.


Table schema:
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);