issues: 267628781
field | value
---|---
id | 267628781
node_id | MDU6SXNzdWUyNjc2Mjg3ODE=
number | 1650
title | Low memory/out-of-core index?
user | 703554
state | open
locked | 0
assignee |
milestone |
comments | 17
created_at | 2017-10-23T11:13:06Z
updated_at | 2023-08-29T11:43:19Z
closed_at |
author_association | CONTRIBUTOR
active_lock_reason |
draft |
pull_request |
performed_via_github_app |
state_reason |
repo | 13221727
type | issue

body:

Has anyone considered implementing an index for monotonic data that does not require loading all values into main memory?

Motivation: we have data where the first dimension can be ~100,000,000 in length, and the coordinates for this dimension are stored as 32-bit integers. Currently, if we used a pandas Index, these would be cast to 64-bit integers and the index would require ~1 GB of RAM. This isn't enormous, but it isn't negligible for people working on modest computers.

Our use cases are simple: typically we only ever need to locate a slice of this dimension from a pair of coordinates, i.e., we only need to do a binary search (bisect) on the coordinates. For binary search there is in fact no need to load the coordinate values into memory at all; they could be left on disk (e.g., in an HDF5 or Zarr dataset) and still give perfectly adequate performance for our needs.

This is of course also relevant to pandas, but I thought I'd post here as I know there have been some discussions about how to handle indexes when working with larger datasets via dask.

reactions:

{ "url": "https://api.github.com/repos/pydata/xarray/issues/1650/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
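
The issue body is essentially asking for bisect over a sorted coordinate array that stays on disk. Below is a minimal sketch of that idea, assuming a sorted 1-D Zarr array of coordinates; the `OutOfCoreIndex` class and its `slice_locs` method are illustrative names, not part of any xarray or pandas API.

```python
# Sketch: binary search over a sorted, on-disk coordinate array
# (e.g. a Zarr or HDF5 dataset). Each lookup reads only O(log n)
# single elements rather than loading the whole index into memory.
import bisect

import numpy as np
import zarr


class OutOfCoreIndex:
    """Locate slices in a sorted 1-D array without materialising it in RAM."""

    def __init__(self, values):
        # `values` only needs __len__ and integer __getitem__,
        # so a zarr.Array or h5py.Dataset can be passed in directly.
        self._values = values

    def __len__(self):
        return len(self._values)

    def __getitem__(self, i):
        # One small read from disk per probe.
        return self._values[i]

    def slice_locs(self, start, stop):
        """Return positional bounds of coordinates lying in [start, stop]."""
        lo = bisect.bisect_left(self, start)
        hi = bisect.bisect_right(self, stop)
        return lo, hi


if __name__ == "__main__":
    # Toy example: 10 million sorted int32 coordinates stored in Zarr.
    coords = np.arange(0, 20_000_000, 2, dtype="int32")
    z = zarr.array(coords, chunks=(1_000_000,))

    idx = OutOfCoreIndex(z)
    lo, hi = idx.slice_locs(1_000_000, 1_000_100)
    print(lo, hi)          # positional bounds of the requested coordinate range
    print(coords[lo:hi])   # the matching slice of coordinate values
```

Note that with a chunked store each probe decompresses the chunk containing the requested element, so in practice smaller chunks or a chunk cache would help; either way only O(log n) chunks are touched per lookup, not the whole index.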