
issues: 406178487

id: 406178487
node_id: MDU6SXNzdWU0MDYxNzg0ODc=
number: 2740
title: `open_zarr` hangs if 's3://' at front of root s3fs string
user: 1796208
state: closed
locked: 0
comments: 2
created_at: 2019-02-04T04:34:16Z
updated_at: 2019-04-29T06:32:01Z
closed_at: 2019-04-29T06:32:01Z
author_association: NONE
(empty fields: assignee, milestone, active_lock_reason, draft, pull_request, performed_via_github_app)

The following code has an error in it:

```python
import s3fs
import xarray as xr

S3_DIR = 's3://my_bucket'

# storage_options is a dict of s3fs settings (credentials etc.) defined elsewhere
s3 = s3fs.S3FileSystem(**storage_options)
store = s3fs.S3Map(root=f'{S3_DIR}/my_zarr_store', s3=s3)
array = xr.open_zarr(store)['data']
```

The presence of "s3://" at the beginning of the root string causes the call to take a very long time (I don't have an exact timing offhand, but it was over 10 minutes) before it fails with a KeyError saying there is nothing at 'data', which is often a sign of a permissions error.

Without the "s3://" this returns quickly with my data.
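As a stopgap on the caller's side, the prefix can be removed before the root is handed to `s3fs.S3Map`. The sketch below is only an illustration of that workaround: `strip_s3_scheme` is a hypothetical helper name, not part of s3fs or xarray, and it assumes the only problem is a leading 's3://'.

```python
def strip_s3_scheme(root: str) -> str:
    """Return root without a leading 's3://', leaving other strings unchanged.

    Hypothetical helper for illustration; not an s3fs or xarray function.
    """
    prefix = "s3://"
    return root[len(prefix):] if root.startswith(prefix) else root


S3_DIR = "s3://my_bucket"

# The same constant can still be used as-is for dask, and stripped for S3Map.
zarr_root = strip_s3_scheme(f"{S3_DIR}/my_zarr_store")
print(zarr_root)  # my_bucket/my_zarr_store
```

The result would then be passed as `root=zarr_root` when building the `S3Map`.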

This error occurred for me as I was opening other files with dask with code such as `df = dd.read_parquet(f'{S3_DIR}/my_data.parquet', storage_options=storage_options)`.

I know that this is not technically an xarray issue. However, it is the xarray call where the user experiences the problem, because s3fs just returns the mapping without any checking.

I was wondering whether the open_zarr function could be generous and inspect the root argument in the case of s3fs access and warn if 's3://' is detected.
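To make the request concrete, here is a rough sketch of the kind of check I have in mind. It is not existing xarray code: `warn_if_s3_scheme` is a hypothetical name, and it assumes the store object exposes its root string as an attribute called `root` (a `SimpleNamespace` stands in for the real mapping here so the example runs without S3 access).

```python
import warnings
from types import SimpleNamespace


def warn_if_s3_scheme(store) -> bool:
    """Warn if the store's root starts with 's3://'; return True if it warned.

    Assumption: the store has a string attribute named `root`.
    Stores without one are silently skipped.
    """
    root = getattr(store, "root", None)
    if isinstance(root, str) and root.startswith("s3://"):
        warnings.warn(
            "store root starts with 's3://'; with s3fs mappings this may "
            "lead to a slow failure. Try passing 'bucket/path' instead.",
            UserWarning,
            stacklevel=2,
        )
        return True
    return False


# Stand-ins for an S3Map-like object, for illustration only.
bad_store = SimpleNamespace(root="s3://my_bucket/my_zarr_store")
good_store = SimpleNamespace(root="my_bucket/my_zarr_store")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warned_bad = warn_if_s3_scheme(bad_store)
    warned_good = warn_if_s3_scheme(good_store)
```

A warning rather than an exception seems safer to me, since I don't know whether some setups rely on the prefix being accepted.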

I am also wondering what the interaction issue is that causes it to take so long for the permission type error to be returned.

ping @martindurant in case you have thoughts from the s3fs side.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2740/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
repo: 13221727
type: issue
