issue_comments
92 rows where user = 6042212 sorted by updated_at descending
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
1466255394 | https://github.com/pydata/xarray/issues/7574#issuecomment-1466255394 | https://api.github.com/repos/pydata/xarray/issues/7574 | IC_kwDOAMm_X85XZUgi | martindurant 6042212 | 2023-03-13T14:32:53Z | 2023-03-13T14:32:53Z | CONTRIBUTOR | Sorry, I really don't know what goes inside xarray's cache layers. It seems that fsspec is doing the right thing if it opens via one route, and parallel=True shouldn't require any serialisation for the in-process threaded scheduler. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xr.open_mfdataset doesn't work with fsspec and dask 1605108888 | |
1453911083 | https://github.com/pydata/xarray/issues/4122#issuecomment-1453911083 | https://api.github.com/repos/pydata/xarray/issues/4122 | IC_kwDOAMm_X85WqOwr | martindurant 6042212 | 2023-03-03T18:12:01Z | 2023-03-03T18:12:01Z | CONTRIBUTOR |
No compression, encoding or chunking except for the one "append" dimension. |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Document writing netcdf from xarray directly to S3 631085856 | |
1453902381 | https://github.com/pydata/xarray/issues/4122#issuecomment-1453902381 | https://api.github.com/repos/pydata/xarray/issues/4122 | IC_kwDOAMm_X85WqMot | martindurant 6042212 | 2023-03-03T18:04:29Z | 2023-03-03T18:04:29Z | CONTRIBUTOR | scipy only reads/writes netcdf2/3 ( https://docs.scipy.org/doc/scipy/reference/generated/scipy.io.netcdf_file.html ), which is a very different and simpler format than netcdf4. The latter uses HDF5 as a container, and h5netcdf as the xarray engine. I guess "to_netcdf" is ambiguous. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Document writing netcdf from xarray directly to S3 631085856 | |
1453898602 | https://github.com/pydata/xarray/issues/4122#issuecomment-1453898602 | https://api.github.com/repos/pydata/xarray/issues/4122 | IC_kwDOAMm_X85WqLtq | martindurant 6042212 | 2023-03-03T18:01:30Z | 2023-03-03T18:01:30Z | CONTRIBUTOR |
This is netCDF3, in that case. If that's fine for you, no problem. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Document writing netcdf from xarray directly to S3 631085856 | |
1453558039 | https://github.com/pydata/xarray/issues/4122#issuecomment-1453558039 | https://api.github.com/repos/pydata/xarray/issues/4122 | IC_kwDOAMm_X85Wo4kX | martindurant 6042212 | 2023-03-03T13:48:09Z | 2023-03-03T13:48:09Z | CONTRIBUTOR | Maybe it is netCDF3? xarray is supposed to be able to determine the file type
|
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Document writing netcdf from xarray directly to S3 631085856 | |
1450727551 | https://github.com/pydata/xarray/issues/7522#issuecomment-1450727551 | https://api.github.com/repos/pydata/xarray/issues/7522 | IC_kwDOAMm_X85WeFh_ | martindurant 6042212 | 2023-03-01T19:22:54Z | 2023-03-01T19:22:54Z | CONTRIBUTOR | I generally recommend cache_type="first" for reading HDF5 files, because they tend to keep most of the metadata in the header area of the file, with short pieces of metadata "elsewhere", so the default readahead cache doesn't perform very well. As to what the two writers might be doing differently, I only have guesses. I imagine xarray leaves it entirely to HDF to make whatever choices it likes. Dask does not write in parallel, since HDF does not support that, but it may order the writes more logically. It does set up the whole set of variables as an initialisation stage before writing any data; I don't know if xarray does this. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Differences in `to_netcdf` for dask and numpy backed arrays 1581046647 | |
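The cache_type="first" recommendation above can be sketched as follows. This is a hedged, minimal example: "memory://" stands in for a real remote URL so it runs without credentials, and the bytes written are a stand-in, not a real HDF5 file; with s3fs installed the same kwargs apply to an "s3://..." URL passed to an engine such as h5netcdf.

```python
import fsspec

# Create a stand-in "file" on the in-process memory filesystem.
with fsspec.open("memory://data.h5", "wb") as f:
    f.write(b"\x89HDF" + b"\x00" * 100)  # placeholder bytes, not real HDF5

# cache_type="first" pins the start of the file in cache, which suits
# HDF5's header-heavy layout better than the default readahead cache.
with fsspec.open("memory://data.h5", "rb", cache_type="first") as f:
    header = f.read(4)
```

In practice the open file `f` would be handed to `xr.open_dataset(f, engine="h5netcdf")` rather than read directly.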
1400583499 | https://github.com/pydata/xarray/issues/4122#issuecomment-1400583499 | https://api.github.com/repos/pydata/xarray/issues/4122 | IC_kwDOAMm_X85TezVL | martindurant 6042212 | 2023-01-23T15:57:24Z | 2023-01-23T15:57:24Z | CONTRIBUTOR | Would you mind writing out long-hand the version that worked and the version that didn't? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Document writing netcdf from xarray directly to S3 631085856 | |
1400545067 | https://github.com/pydata/xarray/issues/4122#issuecomment-1400545067 | https://api.github.com/repos/pydata/xarray/issues/4122 | IC_kwDOAMm_X85Tep8r | martindurant 6042212 | 2023-01-23T15:31:16Z | 2023-01-23T15:31:16Z | CONTRIBUTOR | I can confirm that something like the following does work, basically automating the "write local and then push" workflow:
Unfortunately, directly writing to the remote file without a local cached file is not supported, because HDF5 does not write in a linear way. |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Document writing netcdf from xarray directly to S3 631085856 | |
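The "write local and then push" workflow mentioned above can be sketched with fsspec's chained "simplecache::" protocol, which writes to a local temporary file and uploads it on close, exactly what HDF5's non-linear writes require. This is a hedged sketch: "memory://" stands in for a real s3:// URL, and the bytes written are a placeholder for an actual `ds.to_netcdf(f, engine="h5netcdf")` call.

```python
import fsspec

# Writing through simplecache:: buffers to a local temp file; the upload to
# the target store happens when the context closes.
with fsspec.open("simplecache::memory://bucket/out.nc", mode="wb") as f:
    f.write(b"stand-in bytes")  # in practice: ds.to_netcdf(f, engine="h5netcdf")

# After the context exits, the bytes are present in the target store.
with fsspec.open("memory://bucket/out.nc", mode="rb") as f:
    pushed = f.read()
```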
1381232564 | https://github.com/pydata/xarray/issues/7430#issuecomment-1381232564 | https://api.github.com/repos/pydata/xarray/issues/7430 | IC_kwDOAMm_X85SU--0 | martindurant 6042212 | 2023-01-13T02:24:24Z | 2023-01-13T02:24:24Z | CONTRIBUTOR | I recommend turning on logging in the HTTP file system
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Missing Blocks when loading zarr file 1525802030 | |
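Turning on logging in the HTTP file system, as recommended above, can be done with the standard library. The logger name "fsspec.http" is the one used by fsspec's HTTP implementation; at DEBUG level each request (including byte ranges) is shown as it happens.

```python
import logging

# Route fsspec's HTTP filesystem logger to stderr at DEBUG level.
logger = logging.getLogger("fsspec.http")
logger.setLevel(logging.DEBUG)
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
logger.addHandler(handler)
```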
1330736962 | https://github.com/pydata/xarray/pull/7304#issuecomment-1330736962 | https://api.github.com/repos/pydata/xarray/issues/7304 | IC_kwDOAMm_X85PUW9C | martindurant 6042212 | 2022-11-29T14:30:43Z | 2022-11-29T14:30:43Z | CONTRIBUTOR | It looks reasonable to me. I'm not sure whether the warning is needed: we don't expect anyone to see it, or, if they do, necessarily do anything about it. It's not unusual for code interacting with a file-like object to move the file pointer. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Reset file pointer to 0 when reading file stream 1458347938 | |
1244150155 | https://github.com/pydata/xarray/issues/6809#issuecomment-1244150155 | https://api.github.com/repos/pydata/xarray/issues/6809 | IC_kwDOAMm_X85KKDmL | martindurant 6042212 | 2022-09-12T18:45:09Z | 2022-09-12T18:45:09Z | CONTRIBUTOR | I agree. @nestabur, if you pass the s3 path directly to xarray/zarr rather than making your own FSMap, you should get an FSStore storage layer (similar but different!). Does this have the same behaviour? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Checking whether there is a chunk_store passed iterates over all files 1309500528 | |
1204310218 | https://github.com/pydata/xarray/issues/6813#issuecomment-1204310218 | https://api.github.com/repos/pydata/xarray/issues/6813 | IC_kwDOAMm_X85HyFDK | martindurant 6042212 | 2022-08-03T18:10:57Z | 2022-08-03T18:10:57Z | CONTRIBUTOR | Yes, it is reasonable to always… I am mildly against subclassing from RawIOBase, since some file-likes might choose to implement text mode directly in the class (as opposed to a text wrapper layered on top). I'm pretty surprised that it doesn't have read()/write(), though, since all the derived classes do. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Opening fsspec s3 file twice results in invalid start byte 1310058435 | |
1146159832 | https://github.com/pydata/xarray/issues/6662#issuecomment-1146159832 | https://api.github.com/repos/pydata/xarray/issues/6662 | IC_kwDOAMm_X85EUQLY | martindurant 6042212 | 2022-06-03T16:34:44Z | 2022-06-03T16:34:44Z | CONTRIBUTOR | Can you please explicitly check the |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Obscure h5netcdf http serialization issue with python's http.server 1260047355 | |
1146079311 | https://github.com/pydata/xarray/issues/6662#issuecomment-1146079311 | https://api.github.com/repos/pydata/xarray/issues/6662 | IC_kwDOAMm_X85ET8hP | martindurant 6042212 | 2022-06-03T15:30:17Z | 2022-06-03T15:30:17Z | CONTRIBUTOR | Python's HTTP server does not normally provide content lengths without some extra work, that might be the difference. |
{ "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 1 } |
Obscure h5netcdf http serialization issue with python's http.server 1260047355 | |
1085091126 | https://github.com/pydata/xarray/pull/5879#issuecomment-1085091126 | https://api.github.com/repos/pydata/xarray/issues/5879 | IC_kwDOAMm_X85ArS02 | martindurant 6042212 | 2022-03-31T20:45:54Z | 2022-03-31T20:45:54Z | CONTRIBUTOR | OK, I get you - so the real problem is that OpenFile can look path-like, but isn't really. OpenFile is really a file-like factory, a proxy for open file-likes that you can make (and serialise for Dask). Its main purpose is to be used in a context:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Check for path-like objects rather than Path type, use os.fspath 1031275532 | |
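The context-manager usage described above can be sketched as follows. This is a minimal, hedged example: "memory://" is used so it runs without a remote store; the same pattern applies to s3://, gcs://, etc.

```python
import fsspec

# fsspec.open returns an OpenFile: a serialisable factory, not yet a file.
of = fsspec.open("memory://example.bin", mode="wb")
with of as f:            # the real file-like only exists inside the context
    f.write(b"hello")

# Reading back the same path confirms the write was committed on close.
with fsspec.open("memory://example.bin", mode="rb") as f:
    data = f.read()
```

Because an OpenFile holds only the URL and storage options, it can be pickled and shipped to Dask workers, which then open the real file locally.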
1085037801 | https://github.com/pydata/xarray/pull/5879#issuecomment-1085037801 | https://api.github.com/repos/pydata/xarray/issues/5879 | IC_kwDOAMm_X85ArFzp | martindurant 6042212 | 2022-03-31T19:54:26Z | 2022-03-31T19:54:26Z | CONTRIBUTOR | "s3://noaa-nwm-retrospective-2-1-zarr-pds/lakeout.zarr" is a directory, right? You cannot open that as a file, or maybe there is no equivalent key at all (because s3 is magic like that). No, you should not be able to do this directly - zarr requires a path which fsspec can turn into a mapper, or an instantiated mapper. To make a bare mapper (i.e., dict-like):
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Check for path-like objects rather than Path type, use os.fspath 1031275532 | |
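The bare-mapper construction mentioned above can be sketched like this. A hedged example: "memory://" stands in for the s3:// bucket path so it runs without credentials, and the ".zgroup" key is just an illustrative zarr-style entry.

```python
import fsspec

# get_mapper turns a URL into a dict-like key/value view of the store.
m = fsspec.get_mapper("memory://lakeout.zarr")
m[".zgroup"] = b'{"zarr_format": 2}'   # mappers are mutable dicts of bytes
keys = list(m)                         # keys are paths relative to the root
```

This mapper (or the URL itself) is what zarr expects, rather than an open file.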
1085022939 | https://github.com/pydata/xarray/pull/5879#issuecomment-1085022939 | https://api.github.com/repos/pydata/xarray/issues/5879 | IC_kwDOAMm_X85ArCLb | martindurant 6042212 | 2022-03-31T19:37:49Z | 2022-03-31T19:37:49Z | CONTRIBUTOR |
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Check for path-like objects rather than Path type, use os.fspath 1031275532 | |
1020190813 | https://github.com/pydata/xarray/issues/6033#issuecomment-1020190813 | https://api.github.com/repos/pydata/xarray/issues/6033 | IC_kwDOAMm_X848zuBd | martindurant 6042212 | 2022-01-24T15:00:53Z | 2022-01-24T15:00:53Z | CONTRIBUTOR | It would be worth turning on s3fs logging to see the access pattern, if you are interested.
The dask version is interesting:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Threadlocking in DataArray calculations for zarr data depending on where it's loaded from (S3 vs local) 1064837571 | |
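Turning on s3fs logging, as suggested above, can be done with fsspec's own helper. A hedged sketch: "s3fs" is the logger name used by the s3fs package, and `fsspec.utils.setup_logging` attaches a DEBUG-level handler so every S3 request (and hence the access pattern) is printed.

```python
import fsspec.utils

# Attach a DEBUG handler to the "s3fs" logger; subsequent S3 calls are logged.
fsspec.utils.setup_logging(logger_name="s3fs")
```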
970365001 | https://github.com/pydata/xarray/issues/5426#issuecomment-970365001 | https://api.github.com/repos/pydata/xarray/issues/5426 | IC_kwDOAMm_X8451phJ | martindurant 6042212 | 2021-11-16T15:08:03Z | 2021-11-16T15:08:03Z | CONTRIBUTOR | The conversation here seems to have stalled, but I feel like it was useful. Did we gather any useful actions? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Implement dask.sizeof for xarray.core.indexing.ImplicitToExplicitIndexingAdapter 908971901 | |
961211401 | https://github.com/pydata/xarray/issues/5918#issuecomment-961211401 | https://api.github.com/repos/pydata/xarray/issues/5918 | IC_kwDOAMm_X845SuwJ | martindurant 6042212 | 2021-11-04T16:30:33Z | 2021-11-04T16:30:33Z | CONTRIBUTOR | Some thoughts:
- fsspec's mapper (or even the filesystem instance) could hold default kwargs to be applied to any open() function, perhaps separate ones for reading and writing. In this case, that would mean supplying |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Reading zarr gives unspecific PermissionError: Access Denied when public data has been consolidated after being written to S3 1039844354 | |
880695337 | https://github.com/pydata/xarray/issues/5600#issuecomment-880695337 | https://api.github.com/repos/pydata/xarray/issues/5600 | MDEyOklzc3VlQ29tbWVudDg4MDY5NTMzNw== | martindurant 6042212 | 2021-07-15T13:28:53Z | 2021-07-15T13:28:53Z | CONTRIBUTOR |
Perhaps so? We are releasing pretty frequently, though, and if there is a problem here, we'd be happy to put out a bugfix. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
⚠️ Nightly upstream-dev CI failed ⚠️ 943923579 | |
880685334 | https://github.com/pydata/xarray/issues/5600#issuecomment-880685334 | https://api.github.com/repos/pydata/xarray/issues/5600 | MDEyOklzc3VlQ29tbWVudDg4MDY4NTMzNA== | martindurant 6042212 | 2021-07-15T13:15:14Z | 2021-07-15T13:15:14Z | CONTRIBUTOR | There was a release of fsspec, but I don't see why anything would have changed here. Can you see whether the failure is associated with the new version? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
⚠️ Nightly upstream-dev CI failed ⚠️ 943923579 | |
870777725 | https://github.com/pydata/xarray/issues/4591#issuecomment-870777725 | https://api.github.com/repos/pydata/xarray/issues/4591 | MDEyOklzc3VlQ29tbWVudDg3MDc3NzcyNQ== | martindurant 6042212 | 2021-06-29T17:20:43Z | 2021-06-29T17:20:43Z | CONTRIBUTOR | I only have vague thoughts. To be sure: you can pickle the file-system and any mapper. The question here is why msgpack is being invoked. Those items, as well as any internal xarray stuff, should only appear in tasks, and so be pickled. Is there a high-level-graph layer encapsulating things that were previously pickled? The only things that appear in any HLG layer should be the paths and storage options needed to open a file-system, not the file-system itself. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Serialization issue with distributed, h5netcdf, and fsspec (ImplicitToExplicitIndexingAdapter) 745801652 | |
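The pickling claim above can be checked directly. A hedged sketch: "memory" stands in for a remote protocol such as "s3" or "http" so it runs locally; both the filesystem instance and an FSMap round-trip through pickle.

```python
import pickle

import fsspec

# Filesystems and mappers are designed to survive pickling (e.g. for Dask).
fs = fsspec.filesystem("memory")
m = fsspec.get_mapper("memory://store")

fs2 = pickle.loads(pickle.dumps(fs))
m2 = pickle.loads(pickle.dumps(m))
```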
809493007 | https://github.com/pydata/xarray/issues/5070#issuecomment-809493007 | https://api.github.com/repos/pydata/xarray/issues/5070 | MDEyOklzc3VlQ29tbWVudDgwOTQ5MzAwNw== | martindurant 6042212 | 2021-03-29T15:52:28Z | 2021-03-29T15:52:28Z | CONTRIBUTOR |
Agree, that's fine. An
This is in general a bad idea, since we are wanting to deal with large files, and we have random access capabilities.
It does, but this is an edge case of using fsspec for local files; these are normally passed as the filename. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
requires io.IOBase subclass rather than duck file-like 839823306 | |
805900745 | https://github.com/pydata/xarray/issues/5070#issuecomment-805900745 | https://api.github.com/repos/pydata/xarray/issues/5070 | MDEyOklzc3VlQ29tbWVudDgwNTkwMDc0NQ== | martindurant 6042212 | 2021-03-24T15:07:16Z | 2021-03-24T15:07:16Z | CONTRIBUTOR | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
requires io.IOBase subclass rather than duck file-like 839823306 | ||
797547241 | https://github.com/pydata/xarray/pull/4659#issuecomment-797547241 | https://api.github.com/repos/pydata/xarray/issues/4659 | MDEyOklzc3VlQ29tbWVudDc5NzU0NzI0MQ== | martindurant 6042212 | 2021-03-12T15:04:34Z | 2021-03-12T15:04:34Z | CONTRIBUTOR | Ping, can I please ask what the current status is here? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xr.DataArray.from_dask_dataframe feature 758606082 | |
780127931 | https://github.com/pydata/xarray/pull/4823#issuecomment-780127931 | https://api.github.com/repos/pydata/xarray/issues/4823 | MDEyOklzc3VlQ29tbWVudDc4MDEyNzkzMQ== | martindurant 6042212 | 2021-02-16T21:26:52Z | 2021-02-16T21:26:52Z | CONTRIBUTOR | Thank you, @dcherian |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Allow fsspec URLs in open_(mf)dataset 788398518 | |
779870336 | https://github.com/pydata/xarray/pull/4823#issuecomment-779870336 | https://api.github.com/repos/pydata/xarray/issues/4823 | MDEyOklzc3VlQ29tbWVudDc3OTg3MDMzNg== | martindurant 6042212 | 2021-02-16T14:26:12Z | 2021-02-16T14:26:12Z | CONTRIBUTOR | Can someone please explain the minimum version policy that is failing
```
Package      Required             Policy               Status
aiobotocore  1.1.2 (2020-08-18)   0.12 (2020-02-24)    > (!) (w)
```
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Allow fsspec URLs in open_(mf)dataset 788398518 | |
778353067 | https://github.com/pydata/xarray/pull/4823#issuecomment-778353067 | https://api.github.com/repos/pydata/xarray/issues/4823 | MDEyOklzc3VlQ29tbWVudDc3ODM1MzA2Nw== | martindurant 6042212 | 2021-02-12T18:06:49Z | 2021-02-12T18:06:49Z | CONTRIBUTOR | @raybellwaves , might I paraphrase to "this PR is useful, please merge!" ? |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Allow fsspec URLs in open_(mf)dataset 788398518 | |
769247752 | https://github.com/pydata/xarray/pull/4823#issuecomment-769247752 | https://api.github.com/repos/pydata/xarray/issues/4823 | MDEyOklzc3VlQ29tbWVudDc2OTI0Nzc1Mg== | martindurant 6042212 | 2021-01-28T17:27:50Z | 2021-01-28T17:27:50Z | CONTRIBUTOR | I have decided, on reflection, to scale back the scope here and only implement this for zarr for now, since, frankly, I am confused about what should happen for the other backends, and they are not tested. Yes, some of them are happy to accept file-like objects, but others either don't do that at all, or want the URL passed through. My code would have changed how things were handled depending on whether it went through open_dataset or open_mfdataset. Best would be to set up a set of expectations as tests. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Allow fsspec URLs in open_(mf)dataset 788398518 | |
768393609 | https://github.com/pydata/xarray/pull/4823#issuecomment-768393609 | https://api.github.com/repos/pydata/xarray/issues/4823 | MDEyOklzc3VlQ29tbWVudDc2ODM5MzYwOQ== | martindurant 6042212 | 2021-01-27T16:10:39Z | 2021-01-27T16:10:39Z | CONTRIBUTOR | Thanks, @kmuehlbauer |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Allow fsspec URLs in open_(mf)dataset 788398518 | |
768385226 | https://github.com/pydata/xarray/pull/4823#issuecomment-768385226 | https://api.github.com/repos/pydata/xarray/issues/4823 | MDEyOklzc3VlQ29tbWVudDc2ODM4NTIyNg== | martindurant 6042212 | 2021-01-27T15:58:15Z | 2021-01-27T15:58:15Z | CONTRIBUTOR | The RTD failure appears to be:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Allow fsspec URLs in open_(mf)dataset 788398518 | |
768362931 | https://github.com/pydata/xarray/pull/4823#issuecomment-768362931 | https://api.github.com/repos/pydata/xarray/issues/4823 | MDEyOklzc3VlQ29tbWVudDc2ODM2MjkzMQ== | martindurant 6042212 | 2021-01-27T15:26:57Z | 2021-01-27T15:26:57Z | CONTRIBUTOR | I am marking this PR as ready, but please ask me for specific test cases that might be relevant and should be included. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Allow fsspec URLs in open_(mf)dataset 788398518 | |
764649377 | https://github.com/pydata/xarray/pull/4823#issuecomment-764649377 | https://api.github.com/repos/pydata/xarray/issues/4823 | MDEyOklzc3VlQ29tbWVudDc2NDY0OTM3Nw== | martindurant 6042212 | 2021-01-21T13:40:33Z | 2021-01-21T13:40:33Z | CONTRIBUTOR | (please definitely do not merge until I've added documentation) |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Allow fsspec URLs in open_(mf)dataset 788398518 | |
762858956 | https://github.com/pydata/xarray/pull/4823#issuecomment-762858956 | https://api.github.com/repos/pydata/xarray/issues/4823 | MDEyOklzc3VlQ29tbWVudDc2Mjg1ODk1Ng== | martindurant 6042212 | 2021-01-19T14:04:05Z | 2021-01-19T14:04:05Z | CONTRIBUTOR | Next open question: aside from zarr, few of the other backends will know what to do with fsspec's dict-like mappers. Should we prevent them from passing through? Should we attempt to distinguish between directories and files, and make fsspec file-like objects? We could just allow the backends to fail later on incorrect input. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Allow fsspec URLs in open_(mf)dataset 788398518 | |
762428604 | https://github.com/pydata/xarray/pull/4461#issuecomment-762428604 | https://api.github.com/repos/pydata/xarray/issues/4461 | MDEyOklzc3VlQ29tbWVudDc2MjQyODYwNA== | martindurant 6042212 | 2021-01-18T19:15:25Z | 2021-01-18T19:15:25Z | CONTRIBUTOR | All interested parties, please see new attempt at https://github.com/pydata/xarray/pull/4823 |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Allow fsspec/zarr/mfdataset 709187212 | |
762394713 | https://github.com/pydata/xarray/pull/4823#issuecomment-762394713 | https://api.github.com/repos/pydata/xarray/issues/4823 | MDEyOklzc3VlQ29tbWVudDc2MjM5NDcxMw== | martindurant 6042212 | 2021-01-18T17:52:47Z | 2021-01-18T17:55:19Z | CONTRIBUTOR |
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Allow fsspec URLs in open_(mf)dataset 788398518 | |
762393744 | https://github.com/pydata/xarray/issues/4691#issuecomment-762393744 | https://api.github.com/repos/pydata/xarray/issues/4691 | MDEyOklzc3VlQ29tbWVudDc2MjM5Mzc0NA== | martindurant 6042212 | 2021-01-18T17:50:32Z | 2021-01-18T17:50:32Z | CONTRIBUTOR | https://github.com/pydata/xarray/pull/4823 working on this. Please try and comment. |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Non-HTTPS remote URLs no longer work as input for open_zarr 766826777 | |
762367350 | https://github.com/pydata/xarray/pull/4823#issuecomment-762367350 | https://api.github.com/repos/pydata/xarray/issues/4823 | MDEyOklzc3VlQ29tbWVudDc2MjM2NzM1MA== | martindurant 6042212 | 2021-01-18T16:54:21Z | 2021-01-18T16:54:21Z | CONTRIBUTOR | Question: should HTTP URLs be passed through unprocessed as before? I think that might be required by some of the netCDF engines, but we probably don't test this. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Allow fsspec URLs in open_(mf)dataset 788398518 | |
762350678 | https://github.com/pydata/xarray/pull/4823#issuecomment-762350678 | https://api.github.com/repos/pydata/xarray/issues/4823 | MDEyOklzc3VlQ29tbWVudDc2MjM1MDY3OA== | martindurant 6042212 | 2021-01-18T16:22:53Z | 2021-01-18T16:22:53Z | CONTRIBUTOR | Docs to be added |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Allow fsspec URLs in open_(mf)dataset 788398518 | |
761147346 | https://github.com/pydata/xarray/issues/4691#issuecomment-761147346 | https://api.github.com/repos/pydata/xarray/issues/4691 | MDEyOklzc3VlQ29tbWVudDc2MTE0NzM0Ng== | martindurant 6042212 | 2021-01-15T19:30:21Z | 2021-01-15T19:30:21Z | CONTRIBUTOR | I believe https://github.com/pydata/xarray/pull/4461 fixes this. Note that you can still use the "old" method of opening the mapper (e.g., fsspec.get_mapper) beforehand and passing that |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Non-HTTPS remote URLs no longer work as input for open_zarr 766826777 | |
747453674 | https://github.com/pydata/xarray/issues/4704#issuecomment-747453674 | https://api.github.com/repos/pydata/xarray/issues/4704 | MDEyOklzc3VlQ29tbWVudDc0NzQ1MzY3NA== | martindurant 6042212 | 2020-12-17T13:56:40Z | 2020-12-17T13:56:40Z | CONTRIBUTOR | As far as I can tell, this has only been happening in gcsfs - so my suggestion, to try to collect the set of conditions that should be considered "retryable" but currently aren't, still holds. However, it is also worthwhile discussing where else in the stack retries might be applied, which would affect multiple storage backends. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Retries for rare failures 770006670 | |
743287803 | https://github.com/pydata/xarray/pull/4461#issuecomment-743287803 | https://api.github.com/repos/pydata/xarray/issues/4461 | MDEyOklzc3VlQ29tbWVudDc0MzI4NzgwMw== | martindurant 6042212 | 2020-12-11T16:19:26Z | 2020-12-11T16:19:26Z | CONTRIBUTOR |
I'm not sure, it's been a while now... |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Allow fsspec/zarr/mfdataset 709187212 | |
741881966 | https://github.com/pydata/xarray/pull/4461#issuecomment-741881966 | https://api.github.com/repos/pydata/xarray/issues/4461 | MDEyOklzc3VlQ29tbWVudDc0MTg4MTk2Ng== | martindurant 6042212 | 2020-12-09T16:20:33Z | 2020-12-09T16:20:33Z | CONTRIBUTOR | ping again |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Allow fsspec/zarr/mfdataset 709187212 | |
739959248 | https://github.com/pydata/xarray/issues/4478#issuecomment-739959248 | https://api.github.com/repos/pydata/xarray/issues/4478 | MDEyOklzc3VlQ29tbWVudDczOTk1OTI0OA== | martindurant 6042212 | 2020-12-07T14:39:57Z | 2020-12-07T14:39:57Z | CONTRIBUTOR | Please try with fsspec master. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Dataset to zarr not working with newest s3fs Storage (s3fs > 0.5.0) 712782711 | |
730396711 | https://github.com/pydata/xarray/issues/4556#issuecomment-730396711 | https://api.github.com/repos/pydata/xarray/issues/4556 | MDEyOklzc3VlQ29tbWVudDczMDM5NjcxMQ== | martindurant 6042212 | 2020-11-19T14:03:47Z | 2020-11-19T14:03:47Z | CONTRIBUTOR | Looks like a special case of a numpy scalar. I can catch this in fsspec - please wait. |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
quick overview example not working with `to_zarr` function with gcs store 733201109 | |
729863434 | https://github.com/pydata/xarray/issues/4591#issuecomment-729863434 | https://api.github.com/repos/pydata/xarray/issues/4591 | MDEyOklzc3VlQ29tbWVudDcyOTg2MzQzNA== | martindurant 6042212 | 2020-11-18T18:14:28Z | 2020-11-18T18:14:28Z | CONTRIBUTOR | The |
{ "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Serialization issue with distributed, h5netcdf, and fsspec (ImplicitToExplicitIndexingAdapter) 745801652 | |
729803257 | https://github.com/pydata/xarray/issues/4591#issuecomment-729803257 | https://api.github.com/repos/pydata/xarray/issues/4591 | MDEyOklzc3VlQ29tbWVudDcyOTgwMzI1Nw== | martindurant 6042212 | 2020-11-18T16:42:30Z | 2020-11-18T16:42:30Z | CONTRIBUTOR | OK, I can see a thing after all... please stand by |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Serialization issue with distributed, h5netcdf, and fsspec (ImplicitToExplicitIndexingAdapter) 745801652 | |
729795030 | https://github.com/pydata/xarray/issues/4591#issuecomment-729795030 | https://api.github.com/repos/pydata/xarray/issues/4591 | MDEyOklzc3VlQ29tbWVudDcyOTc5NTAzMA== | martindurant 6042212 | 2020-11-18T16:29:18Z | 2020-11-18T16:29:18Z | CONTRIBUTOR | I don't think it's fsspec; the HTTPFileSystem and file objects are known to serialise. However:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Serialization issue with distributed, h5netcdf, and fsspec (ImplicitToExplicitIndexingAdapter) 745801652 | |
721365827 | https://github.com/pydata/xarray/pull/4461#issuecomment-721365827 | https://api.github.com/repos/pydata/xarray/issues/4461 | MDEyOklzc3VlQ29tbWVudDcyMTM2NTgyNw== | martindurant 6042212 | 2020-11-03T20:46:57Z | 2020-11-03T20:46:57Z | CONTRIBUTOR | One completely unrelated failure (test_polyfit_warnings). Can I please get a final say here (@max-sixty @alexamici ?) |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Allow fsspec/zarr/mfdataset 709187212 | |
712194464 | https://github.com/pydata/xarray/pull/4461#issuecomment-712194464 | https://api.github.com/repos/pydata/xarray/issues/4461 | MDEyOklzc3VlQ29tbWVudDcxMjE5NDQ2NA== | martindurant 6042212 | 2020-10-19T14:22:23Z | 2020-10-19T14:22:23Z | CONTRIBUTOR | (failures look like something in pandas dev) |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Allow fsspec/zarr/mfdataset 709187212 | |
704353239 | https://github.com/pydata/xarray/issues/4478#issuecomment-704353239 | https://api.github.com/repos/pydata/xarray/issues/4478 | MDEyOklzc3VlQ29tbWVudDcwNDM1MzIzOQ== | martindurant 6042212 | 2020-10-06T15:30:50Z | 2020-10-06T15:30:50Z | CONTRIBUTOR | That's a lot of data! |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Dataset to zarr not working with newest s3fs Storage (s3fs > 0.5.0) 712782711 | |
704285976 | https://github.com/pydata/xarray/issues/4478#issuecomment-704285976 | https://api.github.com/repos/pydata/xarray/issues/4478 | MDEyOklzc3VlQ29tbWVudDcwNDI4NTk3Ng== | martindurant 6042212 | 2020-10-06T13:55:34Z | 2020-10-06T13:55:34Z | CONTRIBUTOR | Can you confirm that this works ok with fsspec and s3fs master? |
{ "total_count": 2, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 2, "rocket": 0, "eyes": 0 } |
Dataset to zarr not working with newest s3fs Storage (s3fs > 0.5.0) 712782711 | |
702846089 | https://github.com/pydata/xarray/issues/4478#issuecomment-702846089 | https://api.github.com/repos/pydata/xarray/issues/4478 | MDEyOklzc3VlQ29tbWVudDcwMjg0NjA4OQ== | martindurant 6042212 | 2020-10-02T16:59:45Z | 2020-10-02T16:59:45Z | CONTRIBUTOR | I have reproduced it locally (also with moto). Indeed, many threads are trying to call the event loop at once. This will take a little finesse. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Dataset to zarr not working with newest s3fs Storage (s3fs > 0.5.0) 712782711 | |
702816676 | https://github.com/pydata/xarray/issues/4478#issuecomment-702816676 | https://api.github.com/repos/pydata/xarray/issues/4478 | MDEyOklzc3VlQ29tbWVudDcwMjgxNjY3Ng== | martindurant 6042212 | 2020-10-02T15:59:59Z | 2020-10-02T15:59:59Z | CONTRIBUTOR | Thanks for the digging, I'll look into it |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Dataset to zarr not working with newest s3fs Storage (s3fs > 0.5.0) 712782711 | |
702213899 | https://github.com/pydata/xarray/issues/4478#issuecomment-702213899 | https://api.github.com/repos/pydata/xarray/issues/4478 | MDEyOklzc3VlQ29tbWVudDcwMjIxMzg5OQ== | martindurant 6042212 | 2020-10-01T15:27:00Z | 2020-10-01T15:27:00Z | CONTRIBUTOR |
This looks like it may be a race condition where multiple threads are calling the event loop at once. I wonder if you could list the event loops in use and the threads (perhaps best run with base python rather than ipython/jupyter). |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Dataset to zarr not working with newest s3fs Storage (s3fs > 0.5.0) 712782711 | |
702124268 | https://github.com/pydata/xarray/issues/4478#issuecomment-702124268 | https://api.github.com/repos/pydata/xarray/issues/4478 | MDEyOklzc3VlQ29tbWVudDcwMjEyNDI2OA== | martindurant 6042212 | 2020-10-01T13:11:32Z | 2020-10-01T13:11:32Z | CONTRIBUTOR | The following code, modified to the style of the s3fs test suite, works OK:

```python
def test_with_xzarr(s3):
    da = pytest.importorskip("dask.array")
    xr = pytest.importorskip("xarray")
    name = "sample"
```
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Dataset to zarr not working with newest s3fs Storage (s3fs > 0.5.0) 712782711 | |
699155033 | https://github.com/pydata/xarray/pull/4461#issuecomment-699155033 | https://api.github.com/repos/pydata/xarray/issues/4461 | MDEyOklzc3VlQ29tbWVudDY5OTE1NTAzMw== | martindurant 6042212 | 2020-09-25T21:05:42Z | 2020-09-25T21:05:42Z | CONTRIBUTOR | Question: to eventually get tests to pass, this will need changes that are only just now going into zarr. Those may be released some time soon, but in the meantime is it reasonable to install from master? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Allow fsspec/zarr/mfdataset 709187212 | |
696766963 | https://github.com/pydata/xarray/pull/4187#issuecomment-696766963 | https://api.github.com/repos/pydata/xarray/issues/4187 | MDEyOklzc3VlQ29tbWVudDY5Njc2Njk2Mw== | martindurant 6042212 | 2020-09-22T14:41:41Z | 2020-09-22T14:41:41Z | CONTRIBUTOR | Note that zarr.open* now works with fsspec URLs (in master) |
{ "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Xarray open_mfdataset with engine Zarr 647804004 | |
639777701 | https://github.com/pydata/xarray/issues/4122#issuecomment-639777701 | https://api.github.com/repos/pydata/xarray/issues/4122 | MDEyOklzc3VlQ29tbWVudDYzOTc3NzcwMQ== | martindurant 6042212 | 2020-06-05T20:17:38Z | 2020-06-05T20:17:38Z | CONTRIBUTOR | The write feature for simplecache isn't released yet, of course. It would be interesting if someone could subclass file and write locally with h5netcdf to see what kind of seeks it does. Is it popping back to some file header to update array sizes? Presumably it would need a fixed-size header to do that. Parquet and other cloud formats have the metadata in the footer exactly for this reason, so you only write once you know everything and you only ever move forward in the file. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Document writing netcdf from xarray directly to S3 631085856 | |
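The simplecache write path referred to above looks roughly like this sketch: bytes go to a local temporary file first and are uploaded whole on close, so the remote store never sees seeks. "memory://" stands in for a remote URL, and the path is illustrative:

```python
import fsspec

# writing through simplecache: data is buffered in a local temp file
# and uploaded in one piece when the file is closed
with fsspec.open("simplecache::memory://bucket/out.nc", mode="wb") as f:
    f.write(b"netcdf bytes here")

# read back from the target filesystem to confirm the upload happened
with fsspec.open("memory://bucket/out.nc", mode="rb") as f:
    data = f.read()
```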
620151178 | https://github.com/pydata/xarray/pull/4003#issuecomment-620151178 | https://api.github.com/repos/pydata/xarray/issues/4003 | MDEyOklzc3VlQ29tbWVudDYyMDE1MTE3OA== | martindurant 6042212 | 2020-04-27T18:19:54Z | 2020-04-27T18:19:54Z | CONTRIBUTOR |
IF we can push on https://github.com/zarr-developers/zarr-python/pull/546 ; but here is also an opportunity to get the behaviour out of the zarr/fsspec interaction most convenient for this work. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray.open_mzar: open multiple zarr files (in parallel) 606683601 | |
605222008 | https://github.com/pydata/xarray/issues/3831#issuecomment-605222008 | https://api.github.com/repos/pydata/xarray/issues/3831 | MDEyOklzc3VlQ29tbWVudDYwNTIyMjAwOA== | martindurant 6042212 | 2020-03-27T19:11:59Z | 2020-03-27T19:11:59Z | CONTRIBUTOR | Note that s3fs and gcsfs now expose the kwargs (although the new releases for both already include the change that accessing a file, contents or metadata, does not require a directory listing, which is the right thing for zarr, where the full paths are known) |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Errors using to_zarr for an s3 store 576337745 | |
595379998 | https://github.com/pydata/xarray/issues/3831#issuecomment-595379998 | https://api.github.com/repos/pydata/xarray/issues/3831 | MDEyOklzc3VlQ29tbWVudDU5NTM3OTk5OA== | martindurant 6042212 | 2020-03-05T18:32:38Z | 2020-03-05T18:32:38Z | CONTRIBUTOR | https://github.com/intake/filesystem_spec/pull/243 is where my attempt to fix this kind of thing will live. However, writing or deleting keys should invalidate the appropriate part of the cache as it currently stands, so I don't know why the problem has arisen. If it is a cache problem, then |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Errors using to_zarr for an s3 store 576337745 | |
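If a stale listings cache were the culprit, it can be dropped by hand: all fsspec filesystems expose `invalidate_cache()` (a no-op where nothing is cached; remote filesystems like s3fs override it to clear directory listings). A sketch with the in-memory filesystem, paths illustrative:

```python
import fsspec

fs = fsspec.filesystem("memory")
fs.pipe_file("/bucket/key", b"data")   # write a small object
fs.ls("/bucket")                       # may populate a listings cache
fs.invalidate_cache("/bucket")         # drop cached listing for one path...
fs.invalidate_cache()                  # ...or for everything
out = fs.cat_file("/bucket/key")
```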
524910496 | https://github.com/pydata/xarray/issues/3251#issuecomment-524910496 | https://api.github.com/repos/pydata/xarray/issues/3251 | MDEyOklzc3VlQ29tbWVudDUyNDkxMDQ5Ng== | martindurant 6042212 | 2019-08-26T15:38:16Z | 2019-08-26T15:38:16Z | CONTRIBUTOR | Note that get_mapper is implemented for all file systems, so there should be no need for any gcsfs-specific code. On August 26, 2019 11:21:00 AM EDT, Justin Minsk notifications@github.com wrote:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
to_zarr append with gcsmap does not work properly 484592018 | |
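The generic route looks like this sketch, with "memory://" standing in for "gs://bucket/path"; the resulting mapping can be handed straight to zarr/xarray:

```python
import fsspec

# get_mapper is implemented for every fsspec filesystem, so no
# gcsfs-specific code is needed
mapper = fsspec.get_mapper("memory://bucket/store")
mapper["key"] = b"value"   # behaves as a mutable mapping of bytes
```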
524663267 | https://github.com/pydata/xarray/issues/3251#issuecomment-524663267 | https://api.github.com/repos/pydata/xarray/issues/3251 | MDEyOklzc3VlQ29tbWVudDUyNDY2MzI2Nw== | martindurant 6042212 | 2019-08-25T21:00:37Z | 2019-08-25T21:00:37Z | CONTRIBUTOR | I am not sure why str should ever be called on the mapping. For sure, what it returns is not the same as before (perhaps you could go back a version and check?), but I don't know what the string would have been used for. I am on leave at the moment and unlikely to be able to get time to investigate. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
to_zarr append with gcsmap does not work properly 484592018 | |
460257484 | https://github.com/pydata/xarray/issues/2740#issuecomment-460257484 | https://api.github.com/repos/pydata/xarray/issues/2740 | MDEyOklzc3VlQ29tbWVudDQ2MDI1NzQ4NA== | martindurant 6042212 | 2019-02-04T13:54:31Z | 2019-02-04T13:54:31Z | CONTRIBUTOR | Do you have any idea what is taking the extra time? s3fs ought to, in theory, treat URLs with and with the |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
`open_zarr` hangs if 's3://' at front of root s3fs string 406178487 | |
444514608 | https://github.com/pydata/xarray/pull/2559#issuecomment-444514608 | https://api.github.com/repos/pydata/xarray/issues/2559 | MDEyOklzc3VlQ29tbWVudDQ0NDUxNDYwOA== | martindurant 6042212 | 2018-12-05T14:58:58Z | 2018-12-05T14:58:58Z | CONTRIBUTOR | I like those timings. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Zarr consolidated 382497709 | |
443804859 | https://github.com/pydata/xarray/pull/2559#issuecomment-443804859 | https://api.github.com/repos/pydata/xarray/issues/2559 | MDEyOklzc3VlQ29tbWVudDQ0MzgwNDg1OQ== | martindurant 6042212 | 2018-12-03T17:55:51Z | 2018-12-03T17:55:51Z | CONTRIBUTOR | LGTM. Do you think there should be more explicit text on how to add consolidation to existing zarr/xarray data-sets, rather than creating them with consolidation turned on? We may also need some text around updating consolidated data-sets, but that can maybe wait to see what kind of usage people try. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Zarr consolidated 382497709 | |
442581092 | https://github.com/pydata/xarray/pull/2559#issuecomment-442581092 | https://api.github.com/repos/pydata/xarray/issues/2559 | MDEyOklzc3VlQ29tbWVudDQ0MjU4MTA5Mg== | martindurant 6042212 | 2018-11-28T19:49:43Z | 2018-11-28T19:49:43Z | CONTRIBUTOR | Glad to see this happening, by the way. Once in, catalogs using intake-xarray can be updated and I don't think the code will need to change. |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Zarr consolidated 382497709 | |
442580432 | https://github.com/pydata/xarray/pull/2559#issuecomment-442580432 | https://api.github.com/repos/pydata/xarray/issues/2559 | MDEyOklzc3VlQ29tbWVudDQ0MjU4MDQzMg== | martindurant 6042212 | 2018-11-28T19:47:43Z | 2018-11-28T19:47:43Z | CONTRIBUTOR | Will the default for both options be |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Zarr consolidated 382497709 | |
396930603 | https://github.com/pydata/xarray/pull/2228#issuecomment-396930603 | https://api.github.com/repos/pydata/xarray/issues/2228 | MDEyOklzc3VlQ29tbWVudDM5NjkzMDYwMw== | martindurant 6042212 | 2018-06-13T13:07:58Z | 2018-06-13T13:07:58Z | CONTRIBUTOR | Right now, |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
fix zarr chunking bug 331752926 | |
365412033 | https://github.com/pydata/xarray/pull/1528#issuecomment-365412033 | https://api.github.com/repos/pydata/xarray/issues/1528 | MDEyOklzc3VlQ29tbWVudDM2NTQxMjAzMw== | martindurant 6042212 | 2018-02-13T21:35:03Z | 2018-02-13T21:35:03Z | CONTRIBUTOR | Yeah, ideally when adding a variable like
On the other hand, implementing
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
WIP: Zarr backend 253136694 | |
364817111 | https://github.com/pydata/xarray/pull/1528#issuecomment-364817111 | https://api.github.com/repos/pydata/xarray/issues/1528 | MDEyOklzc3VlQ29tbWVudDM2NDgxNzExMQ== | martindurant 6042212 | 2018-02-12T02:43:43Z | 2018-02-12T03:47:48Z | CONTRIBUTOR | OK, so the way to do this in pure-zarr appears to be to simply create the appropriate zarr array and set its dimensions attribute:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
WIP: Zarr backend 253136694 | |
364804697 | https://github.com/pydata/xarray/pull/1528#issuecomment-364804697 | https://api.github.com/repos/pydata/xarray/issues/1528 | MDEyOklzc3VlQ29tbWVudDM2NDgwNDY5Nw== | martindurant 6042212 | 2018-02-12T00:19:55Z | 2018-02-12T00:19:55Z | CONTRIBUTOR | It might be enough, in this case, to provide some helper function in zarr to create and fetch arrays that will show up as variables in xarray - this need not be specific to being used via dask. I am assuming with the work done in this PR, that there is an unambiguous way to determine if a zarr group can be interpreted as an xarray dataset, and that zarr then knows how to add things that look like variables (which generally in the zarr case don't involve writing any actual data until the parts of the array are filled in). |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
WIP: Zarr backend 253136694 | |
364803984 | https://github.com/pydata/xarray/pull/1528#issuecomment-364803984 | https://api.github.com/repos/pydata/xarray/issues/1528 | MDEyOklzc3VlQ29tbWVudDM2NDgwMzk4NA== | martindurant 6042212 | 2018-02-12T00:12:36Z | 2018-02-12T00:12:36Z | CONTRIBUTOR | @jhamman , that partially solves what I mean, I can probably turn my data into dask arrays with some difficulty; but really I was hoping for something like the following:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
WIP: Zarr backend 253136694 | |
364801073 | https://github.com/pydata/xarray/pull/1528#issuecomment-364801073 | https://api.github.com/repos/pydata/xarray/issues/1528 | MDEyOklzc3VlQ29tbWVudDM2NDgwMTA3Mw== | martindurant 6042212 | 2018-02-11T23:35:34Z | 2018-02-11T23:35:34Z | CONTRIBUTOR | Question: how would one build a zarr-xarray dataset? With zarr you can open an array that contains no data, and use set-slice notation to fill in the values (which is what dask's store essentially does). If I have some pre-known coordinates and bigger-than-memory data arrays, how would I go about getting the values into the zarr structure? If this can't be done directly with the xarray interface, is there a way to call zarr's open/create/zeros such that the corresponding array will appear as a variable when the same dataset is opened with xarray? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
WIP: Zarr backend 253136694 | |
351106449 | https://github.com/pydata/xarray/issues/1770#issuecomment-351106449 | https://api.github.com/repos/pydata/xarray/issues/1770 | MDEyOklzc3VlQ29tbWVudDM1MTEwNjQ0OQ== | martindurant 6042212 | 2017-12-12T16:31:55Z | 2017-12-12T16:31:55Z | CONTRIBUTOR | Yes, |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
slow performance when storing datasets in gcsfs-backed zarr stores 280626621 | |
351100872 | https://github.com/pydata/xarray/issues/1770#issuecomment-351100872 | https://api.github.com/repos/pydata/xarray/issues/1770 | MDEyOklzc3VlQ29tbWVudDM1MTEwMDg3Mg== | martindurant 6042212 | 2017-12-12T16:15:43Z | 2017-12-12T16:15:43Z | CONTRIBUTOR | I am puzzled that serializing the mapping is pulling the data. GCSMap does not have get/set_state, but the only attributes are the GCSFileSystem and path. Perhaps the |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
slow performance when storing datasets in gcsfs-backed zarr stores 280626621 | |
345770374 | https://github.com/pydata/xarray/pull/1528#issuecomment-345770374 | https://api.github.com/repos/pydata/xarray/issues/1528 | MDEyOklzc3VlQ29tbWVudDM0NTc3MDM3NA== | martindurant 6042212 | 2017-11-20T17:37:01Z | 2017-11-20T17:37:01Z | CONTRIBUTOR | This is, of course, by design :) I imagine there is much that could be done to optimise performance, but for fewer, larger chunks, it should be pretty good. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
WIP: Zarr backend 253136694 | |
345104440 | https://github.com/pydata/xarray/pull/1528#issuecomment-345104440 | https://api.github.com/repos/pydata/xarray/issues/1528 | MDEyOklzc3VlQ29tbWVudDM0NTEwNDQ0MA== | martindurant 6042212 | 2017-11-17T00:10:19Z | 2017-11-17T00:10:19Z | CONTRIBUTOR |
|
{ "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
WIP: Zarr backend 253136694 | |
333400272 | https://github.com/pydata/xarray/pull/1528#issuecomment-333400272 | https://api.github.com/repos/pydata/xarray/issues/1528 | MDEyOklzc3VlQ29tbWVudDMzMzQwMDI3Mg== | martindurant 6042212 | 2017-10-01T19:26:22Z | 2017-10-01T19:26:22Z | CONTRIBUTOR | I have not done anything, I'm afraid, since posting my commit, the content of which is just an example of how you might pass parameters down to zarr, and a test-case which shows that the basic data is round-tripping properly, but actually the dataset does not come back with the same structure as it started off. We can loop back and decide where to go from here. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
WIP: Zarr backend 253136694 | |
327901739 | https://github.com/pydata/xarray/pull/1528#issuecomment-327901739 | https://api.github.com/repos/pydata/xarray/issues/1528 | MDEyOklzc3VlQ29tbWVudDMyNzkwMTczOQ== | martindurant 6042212 | 2017-09-07T19:36:15Z | 2017-09-07T19:36:15Z | CONTRIBUTOR | @shoyer , is https://github.com/martindurant/xarray/commit/6c1fb6b76ebba862a1c5831210ce026160da0065 a reasonable start ? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
WIP: Zarr backend 253136694 | |
327833777 | https://github.com/pydata/xarray/pull/1528#issuecomment-327833777 | https://api.github.com/repos/pydata/xarray/issues/1528 | MDEyOklzc3VlQ29tbWVudDMyNzgzMzc3Nw== | martindurant 6042212 | 2017-09-07T15:23:31Z | 2017-09-07T15:23:31Z | CONTRIBUTOR | @rabernat , is there anything I can do to help push this along? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
WIP: Zarr backend 253136694 | |
325728378 | https://github.com/pydata/xarray/pull/1528#issuecomment-325728378 | https://api.github.com/repos/pydata/xarray/issues/1528 | MDEyOklzc3VlQ29tbWVudDMyNTcyODM3OA== | martindurant 6042212 | 2017-08-29T17:00:29Z | 2017-08-29T17:00:29Z | CONTRIBUTOR | A further rather big advantage in zarr that I'm not aware of in cdf/hdf (I may be wrong) is not just null values, but not having a given block be written to disc at all if it only contains null data. This probably meshes perfectly well with most users' understanding of missing data/fill value. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
WIP: Zarr backend 253136694 | |
325727354 | https://github.com/pydata/xarray/pull/1528#issuecomment-325727354 | https://api.github.com/repos/pydata/xarray/issues/1528 | MDEyOklzc3VlQ29tbWVudDMyNTcyNzM1NA== | martindurant 6042212 | 2017-08-29T16:57:10Z | 2017-08-29T16:57:10Z | CONTRIBUTOR | Worth pointing out here, that the zarr filter-set is extensible (I suppose hdf5 is too, but I don't think this is ever done in practice), but I don't think it makes any particular claims to performance. I think both of the options above are reasonable, and there is no particular reason to exclude either: a zarr variable could look to xarray like floats but actually be stored as ints (i.e., arguments are passed to zarr), or it could look like ints which xarray expects to inflate to floats (i.e., stored as an attribute). I mean, if a user stores a float variable, but includes kwargs to zarr for scale/filter (or any other filter arguments), we should make no attempt to interrupt that. The only question is, if the user wishes to apply scale/offset in xarray, which is their most likely intention? I would guess the latter, compute in xarray and use attributes, since xarray users probably don't know about zarr and its filters. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
WIP: Zarr backend 253136694 | |
325390391 | https://github.com/pydata/xarray/pull/1528#issuecomment-325390391 | https://api.github.com/repos/pydata/xarray/issues/1528 | MDEyOklzc3VlQ29tbWVudDMyNTM5MDM5MQ== | martindurant 6042212 | 2017-08-28T15:41:08Z | 2017-08-28T15:41:08Z | CONTRIBUTOR | @rabernat : on actually looking through your code :) Happy to see you doing exactly as I felt I was not knowledgeable to do and poke xarray's guts. If I can help in any way, please let me know, although I don't have a lot of spare hours right now. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
WIP: Zarr backend 253136694 | |
325220001 | https://github.com/pydata/xarray/pull/1528#issuecomment-325220001 | https://api.github.com/repos/pydata/xarray/issues/1528 | MDEyOklzc3VlQ29tbWVudDMyNTIyMDAwMQ== | martindurant 6042212 | 2017-08-27T19:46:31Z | 2017-08-27T19:46:31Z | CONTRIBUTOR | Sorry that I let this slide - there was not a huge upswell of interest around what I had done, and I was not ready to dive into xarray internals. Could you comment more on the difference between your approach and mine? Is the aim to reduce the number of metadata files hanging around? zarr has made an effort with the groups interface to parallel netCDF, which is, after all, what xarray essentially expects of all its data sources. As in this comment I have come to the realisation that although nice to/from zarr methods can be made relatively easily, they will not get traction unless they can be put within a class that mimics the existing xarray infrastructure, i.e., the user would never know, except that magically they have extra encoding/compression options, the file-path can be an S3 URL (say), and dask parallel computation suddenly works on a cluster and/or with out-of-core processing. That would raise some eyebrows! |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
WIP: Zarr backend 253136694 | |
281990573 | https://github.com/pydata/xarray/issues/1223#issuecomment-281990573 | https://api.github.com/repos/pydata/xarray/issues/1223 | MDEyOklzc3VlQ29tbWVudDI4MTk5MDU3Mw== | martindurant 6042212 | 2017-02-23T13:25:36Z | 2017-02-23T13:25:36Z | CONTRIBUTOR | @alimanfoo , do you think this work would make more sense as part of zarr rather than as part of xarray? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
zarr as persistent store for xarray 202260275 | |
281860859 | https://github.com/pydata/xarray/issues/1223#issuecomment-281860859 | https://api.github.com/repos/pydata/xarray/issues/1223 | MDEyOklzc3VlQ29tbWVudDI4MTg2MDg1OQ== | martindurant 6042212 | 2017-02-23T01:25:52Z | 2017-02-23T01:25:52Z | CONTRIBUTOR | True, xarray_to_zarr is unchanged from before. The dataset functions could supersede it, since a single xarray is just a special case of a dataset; or we could decide that for the special case it is worth having short-cut functions. I was worried about the number of metadata files being created, since on a remote system like S3, there is a large overhead to reading many small files. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
zarr as persistent store for xarray 202260275 | |
281813651 | https://github.com/pydata/xarray/issues/1223#issuecomment-281813651 | https://api.github.com/repos/pydata/xarray/issues/1223 | MDEyOklzc3VlQ29tbWVudDI4MTgxMzY1MQ== | martindurant 6042212 | 2017-02-22T21:42:49Z | 2017-02-22T21:43:05Z | CONTRIBUTOR | @alimanfoo , in the new dataset save function, I do exactly as you suggest, with everything getting put as a dict into the main zarr group attributes, with special attribute names "attrs" for the data-set root, "coords" for the set of coordinate objects and "variables" for the set of variables objects (all of these have their own attributes in xarray). |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
zarr as persistent store for xarray 202260275 | |
279181938 | https://github.com/pydata/xarray/issues/1223#issuecomment-279181938 | https://api.github.com/repos/pydata/xarray/issues/1223 | MDEyOklzc3VlQ29tbWVudDI3OTE4MTkzOA== | martindurant 6042212 | 2017-02-11T22:56:56Z | 2017-02-11T22:56:56Z | CONTRIBUTOR | I have developed my example a little to sidestep the subclassing you suggest, which seemed tricky to implement. Please see https://gist.github.com/martindurant/06a1e98c91f0033c4649a48a2f943390 (dataset_to/from_zarr functions). I can use the zarr groups structure to mirror at least typical use of xarrays: variables, coordinates and sets of attributes on each. I have tested this with s3 too, stealing a little code from dask to show the idea. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
zarr as persistent store for xarray 202260275 | |
274202189 | https://github.com/pydata/xarray/issues/1223#issuecomment-274202189 | https://api.github.com/repos/pydata/xarray/issues/1223 | MDEyOklzc3VlQ29tbWVudDI3NDIwMjE4OQ== | martindurant 6042212 | 2017-01-20T22:57:07Z | 2017-01-20T22:57:07Z | CONTRIBUTOR | 3: a json-like representation such as used by the hidden .xarray item would also do. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
zarr as persistent store for xarray 202260275 |