id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
404119432,MDU6SXNzdWU0MDQxMTk0MzI=,2724,Support variable length string arrays in xarray/zarr,1796208,closed,0,,,3,2019-01-29T04:38:35Z,2019-07-05T18:02:36Z,2019-07-05T18:02:36Z,NONE,,,,"I ran into a problem writing my xarray DataArray to zarr, and @jhamman helped me figure out the source of the error. I had set up a DataArray (from a chunked dask array) ``` dask.array Coordinates: * snippets (snippets) object '0.gravatar.com||gprofiles.js||Gravatar.init' ... 'подолÑ\x8cÑ\x81к-админиÑ\x81Ñ\x82Ñ\x80аÑ\x86иÑ\x8f.Ñ\x80Ñ\x84||wp-embed.min.js||c' * symbols (symbols) object 'AnalyserNode.channelCount' ... 'window.sessionStorage' ``` When I tried to write it ```python array.to_dataset(name='data').to_zarr('test.zarr') ``` I got a memory error ```python-traceback --------------------------------------------------------------------------- MemoryError Traceback (most recent call last) in ----> 1 array.to_dataset(name='data').to_zarr('test.zarr') ~/miniconda3/envs/ovscrptd/lib/python3.6/site-packages/xarray/core/dataset.py in to_zarr(self, store, mode, synchronizer, group, encoding, compute, consolidated) 1275 return to_zarr(self, store=store, mode=mode, synchronizer=synchronizer, 1276 group=group, encoding=encoding, compute=compute, -> 1277 consolidated=consolidated) 1278 1279 def __unicode__(self): ~/miniconda3/envs/ovscrptd/lib/python3.6/site-packages/xarray/backends/api.py in to_zarr(dataset, store, mode, synchronizer, group, encoding, compute, consolidated) 915 writer = ArrayWriter() 916 # TODO: figure out how to properly handle unlimited_dims --> 917 dump_to_store(dataset, zstore, writer, encoding=encoding) 918 writes = writer.sync(compute=compute) 919 ~/miniconda3/envs/ovscrptd/lib/python3.6/site-packages/xarray/backends/api.py in dump_to_store(dataset, store,
writer, encoder, encoding, unlimited_dims) 790 791 store.store(variables, attrs, check_encoding, writer, --> 792 unlimited_dims=unlimited_dims) 793 794 ~/miniconda3/envs/ovscrptd/lib/python3.6/site-packages/xarray/backends/zarr.py in store(self, variables, attributes, *args, **kwargs) 343 def store(self, variables, attributes, *args, **kwargs): 344 AbstractWritableDataStore.store(self, variables, attributes, --> 345 *args, **kwargs) 346 347 def sync(self): ~/miniconda3/envs/ovscrptd/lib/python3.6/site-packages/xarray/backends/common.py in store(self, variables, attributes, check_encoding_set, writer, unlimited_dims) 259 writer = ArrayWriter() 260 --> 261 variables, attributes = self.encode(variables, attributes) 262 263 self.set_attributes(attributes) ~/miniconda3/envs/ovscrptd/lib/python3.6/site-packages/xarray/backends/common.py in encode(self, variables, attributes) 203 """""" 204 variables = OrderedDict([(k, self.encode_variable(v)) --> 205 for k, v in variables.items()]) 206 attributes = OrderedDict([(k, self.encode_attribute(v)) 207 for k, v in attributes.items()]) ~/miniconda3/envs/ovscrptd/lib/python3.6/site-packages/xarray/backends/common.py in (.0) 203 """""" 204 variables = OrderedDict([(k, self.encode_variable(v)) --> 205 for k, v in variables.items()]) 206 attributes = OrderedDict([(k, self.encode_attribute(v)) 207 for k, v in attributes.items()]) ~/miniconda3/envs/ovscrptd/lib/python3.6/site-packages/xarray/backends/zarr.py in encode_variable(self, variable) 308 309 def encode_variable(self, variable): --> 310 variable = encode_zarr_variable(variable) 311 return variable 312 ~/miniconda3/envs/ovscrptd/lib/python3.6/site-packages/xarray/backends/zarr.py in encode_zarr_variable(var, needs_copy, name) 214 # TODO: allow toggling this explicitly via dtype in encoding. 
215 coder = coding.strings.EncodedStringCoder(allows_unicode=False) --> 216 var = coder.encode(var, name=name) 217 var = coding.strings.ensure_fixed_length_bytes(var) 218 ~/miniconda3/envs/ovscrptd/lib/python3.6/site-packages/xarray/coding/strings.py in encode(self, variable, name) 60 safe_setitem(attrs, '_Encoding', string_encoding, name=name) 61 # TODO: figure out how to handle this in a lazy way with dask ---> 62 data = encode_string_array(data, string_encoding) 63 64 return Variable(dims, data, attrs, encoding) ~/miniconda3/envs/ovscrptd/lib/python3.6/site-packages/xarray/coding/strings.py in encode_string_array(string_array, encoding) 85 string_array = np.asarray(string_array) 86 encoded = [x.encode(encoding) for x in string_array.ravel()] ---> 87 return np.array(encoded, dtype=bytes).reshape(string_array.shape) 88 89 MemoryError: ``` My 'snippets' coordinate has 800k entries and includes strings like ``` inassets1-internationsgmbh.netdna-ssl.com||gn1MljtM.11d8186d87588f8fe848.js||[""./app-new/src/InterNations/Bundle/LayoutBundle/Resources/public/frontend/js/vendor/fingerprint2.js""]/ ``` xarray is converting your list of strings into a numpy array of bytestrings. The dimensions of this new array will be len(orig_list) x len(longest_string), so the resulting array is potentially much larger than the 88 MB. So, this issue is a request to support my bizarre indexing.
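
To illustrate the size blow-up described above, here is a minimal sketch with made-up data. The strings and counts are assumptions for illustration, not the real coordinates. It compares the bytes needed for the text itself with the size of the fixed-width array produced by the same `np.array(encoded, dtype=bytes)` call that appears at the bottom of the traceback:

```python
import numpy as np

# Hypothetical data: many short strings plus a single very long one.
strings = ['a.js'] * 999 + ['x' * 1000]

# Bytes needed to hold the text itself.
encoded = [s.encode('utf-8') for s in strings]
raw_bytes = sum(len(b) for b in encoded)

# Fixed-width bytes array: every element is padded to the longest string.
fixed = np.array(encoded, dtype=bytes)

print(fixed.dtype.itemsize)  # 1000, the length of the longest string
print(raw_bytes)             # 4996
print(fixed.nbytes)          # 1000000, i.e. len(strings) * longest
```

With 800k entries and one long outlier, the same padding is a plausible cause of the MemoryError, although the exact numbers depend on the real data.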
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2724/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 438166604,MDU6SXNzdWU0MzgxNjY2MDQ=,2927,Data variables empty with to_zarr / from_zarr on s3 if 's3://' in root s3fs string,1796208,closed,0,,,3,2019-04-29T06:31:34Z,2019-06-27T20:07:20Z,2019-06-27T20:07:20Z,NONE,,,,"I apparently have a bad habit of prepending my s3 strings with 's3://' (see #2740) To reproduce: ```python import numpy as np import xarray as xr import s3fs s3 = s3fs.S3FileSystem(**options) store = s3fs.S3Map(root='bucket/store.zarr', s3=s3) # Create and write a = xr.DataArray(np.random.randn(2, 3)) a.to_dataset(name='data').to_zarr(store) # Read back in a = xr.open_zarr(store) print(a) ``` Actual Result: ``` Dimensions: () Data variables: *empty* ``` Expected Result: my dataset, which I can then get the DataArray with `a['data']` If the 's3://' is not there everything works as expected. I will add that the fact that it at least opens now means I believe we can close #2740. 
xarray version 0.12.1
zarr version 2.3.1
s3fs version 0.2.1","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2927/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
406178487,MDU6SXNzdWU0MDYxNzg0ODc=,2740,`open_zarr` hangs if 's3://' at front of root s3fs string,1796208,closed,0,,,2,2019-02-04T04:34:16Z,2019-04-29T06:32:01Z,2019-04-29T06:32:01Z,NONE,,,,"The following code has an error in it:

```python
import s3fs
import xarray as xr

S3_DIR = 's3://my_bucket'
# storage_options: s3fs credentials/settings, defined elsewhere
s3 = s3fs.S3FileSystem(**storage_options)
store = s3fs.S3Map(root=f'{S3_DIR}/my_zarr_store', s3=s3)
array = xr.open_zarr(store)['data']
```

The presence of ""s3://"" at the beginning of the string causes the call to take a very long time (I don't have the exact time offhand, but over 10 minutes) before returning with a key error saying that there is nothing at 'data', which is often a sign of a permissions error. Without the ""s3://"" this returns quickly with my data.

I made this mistake because I was opening other files with dask using code such as

```python
import dask.dataframe as dd

df = dd.read_parquet(f'{S3_DIR}/my_data.parquet', storage_options=storage_options)
```

I know that this is not technically an xarray issue. However, it is the xarray call where the user experience suffers, since s3fs just returns without any checking. I was wondering whether the open_zarr function could be generous and inspect the root argument in the case of s3fs access and warn if 's3://' is detected. I am also wondering what interaction causes it to take so long for the permission-type error to be returned.

ping @martindurant in case you have thoughts from the s3fs side.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2740/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue