html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/4180#issuecomment-650091343,https://api.github.com/repos/pydata/xarray/issues/4180,650091343,MDEyOklzc3VlQ29tbWVudDY1MDA5MTM0Mw==,7360639,2020-06-26T09:45:39Z,2020-06-26T09:45:39Z,NONE,"Ah, that is a much better compromise - it's still slower for my own much larger dataset, but it is definitely manageable now. I think this is what I was trying to find originally when I ended up using |S1. Since the problem was my use of encoding / netCDF4's slow variable-length strings, and you've given me a good workaround, I'll close this. Thanks for your help!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,645443880
https://github.com/pydata/xarray/issues/4180#issuecomment-649883875,https://api.github.com/repos/pydata/xarray/issues/4180,649883875,MDEyOklzc3VlQ29tbWVudDY0OTg4Mzg3NQ==,1217238,2020-06-26T00:31:36Z,2020-06-26T00:31:36Z,MEMBER,"The profile shows that all the time is spent in the netCDF4 library.

By default, xarray writes string dtypes as variable-length strings. That appears to be rather slow in netCDF4, for reasons that aren't clear to me.

One workaround is to save the data as fixed-width character data instead, e.g.,
```
ds.to_netcdf('somefilename', encoding={'tester': {'dtype': 'S1'}})
```
Unlike `astype('|S1')`, this version safely encodes the data as UTF-8, so it can handle arbitrary Python strings.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,645443880
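
The workaround discussed in the second comment can be sketched as a small round-trip script. This is a minimal illustration, not code from the issue: the dataset contents and the `.nc` file path are made up here, while the variable name `tester` and the `encoding={'tester': {'dtype': 'S1'}}` argument come from the comment itself.

```python
# Sketch of the fixed-width-string workaround from the comment above.
# Assumes xarray and a netCDF backend (e.g. netCDF4 or scipy) are installed.
import os
import tempfile

import numpy as np
import xarray as xr

# Hypothetical dataset with a string variable named 'tester' (name from the issue).
ds = xr.Dataset({"tester": ("x", np.array(["foo", "bar", "baz"], dtype=object))})

path = os.path.join(tempfile.mkdtemp(), "somefilename.nc")

# Default behavior writes string dtypes as variable-length strings, which the
# comment reports as slow in netCDF4. Requesting dtype 'S1' stores the variable
# as fixed-width character data instead, with the strings encoded as UTF-8.
ds.to_netcdf(path, encoding={"tester": {"dtype": "S1"}})

# On read, xarray decodes the character array back into Python strings.
roundtrip = xr.load_dataset(path)
```

Note the trade-off the thread settles on: `encoding={'tester': {'dtype': 'S1'}}` keeps the in-memory data as ordinary Python strings and only changes the on-disk representation, whereas calling `astype('|S1')` yourself would truncate or mangle non-ASCII data.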