html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/7816#issuecomment-1535931753,https://api.github.com/repos/pydata/xarray/issues/7816,1535931753,IC_kwDOAMm_X85bjHVp,56827,2023-05-05T08:46:42Z,2023-05-05T08:46:42Z,NONE,"Hi, I forgot to rebuild the package after removing the BACKEND_... line. With only the line in pyproject.toml it works as it should! My mistake. Thanks for the patience. Regards, Gaute","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1695809136
https://github.com/pydata/xarray/issues/7816#issuecomment-1535716950,https://api.github.com/repos/pydata/xarray/issues/7816,1535716950,IC_kwDOAMm_X85biS5W,56827,2023-05-05T05:29:10Z,2023-05-05T05:29:10Z,NONE,"Hi, Yes, I tried that, but I then got the same error as if I kept that line in the old format. I'll do a few tests and post the proper error here. Gaute ","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1695809136
https://github.com/pydata/xarray/issues/7816#issuecomment-1535432806,https://api.github.com/repos/pydata/xarray/issues/7816,1535432806,IC_kwDOAMm_X85bhNhm,56827,2023-05-04T21:23:31Z,2023-05-04T21:23:31Z,NONE,"If I do not manually add the backend to the array, but only have this line in https://github.com/gauteh/hidefix/blob/main/pyproject.toml#L29:
```
[project.entry-points.""xarray.backends""]
hidefix = ""hidefix.xarray:HidefixBackendEntrypoint""
```
which is the only form supported by pyproject.toml/maturin, I get an error where xarray expected a tuple and cannot parse the entrypoint, not just the address to the entrypoint - as it used to be (back in January at least). 
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1695809136
https://github.com/pydata/xarray/issues/7446#issuecomment-1396560033,https://api.github.com/repos/pydata/xarray/issues/7446,1396560033,IC_kwDOAMm_X85TPdCh,56827,2023-01-19T07:44:30Z,2023-01-19T07:44:30Z,NONE,"On Tue, Jan 17, 2023 at 5:23 PM Ryan Abernathey ***@***.***> wrote: > Hi @gauteh ! This is very cool! Thanks for > sharing. I'm really excited about way that Rust can be used to optimized > different parts of our stack. > > A couple of questions: > > - > > Can your reader read over HTTP / S3 protocol? Or is it just local > files? > > It is built to do this, but I haven't implemented it. I initially wrote it for an OpenDAP server (dars: https://github.com/gauteh/dars), where the plan is to also support files stored in the cloud. So the hidefix-reader can read from any interface that supports ReadAt or Read + Seek. It would probably be beneficial to index the files beforehand. I submitted a patch to HDF5 that allows it to iterate over the chunks quickly, so indexing a 5-6 GB file takes only a couple of hundred ms - so I no longer store the index for local files. It is still faster than native HDF5 including the indexing. > > - > - > > Do you know about kerchunk ? The > approach you described: > > The reader works by indexing the chunks of a dataset so that chunks > can be accessed independently. > > ...is identical to the approach taken by Kerchunk (although the > implementation is different). I'm curious what specification you use to > store your indexes. Could we make your implementation interoperable with > kerchunk, such that a kerchunk reference specification could be read by > your reader? It would be great to reach for some degree of alignment here. > > The index is serializable using the rust serde system, so it can be stored in any format supported by that. 
A fair amount of effort went into making the deserialization _zero-copy_. That means I can read, e.g., the 10 MB index for a 5-6 GB file very quickly: it requires very little deserialization, since the read buffers are already memory-mapped to the structures. I don't have a specific format at the moment, but I have used bincode a lot in e.g. dars. > > - > - > > Do you know about hdf5-coro - http://icesat2sliderule.org/h5coro/ - > they have similar goals, but focused on cloud-based access > > I hope this can be of general interest, and if it would be of interest to > move the hidefix xarray backend into xarray that would be very cool. > > This is definitely of general interest! However, it is not necessary to > add a new backend directly into xarray. We support entry points which allow > packages to implement their own readers, as you have apparently already > discovered: > https://docs.xarray.dev/en/stable/internals/how-to-add-new-backend.html > > Installing your package should be enough to enable the new engine. > > We would, however, welcome a documentation PR that described how to use > this package on the I/O page. > Great, the package should already register itself with xarray. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1536004355