html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/5070#issuecomment-808605422,https://api.github.com/repos/pydata/xarray/issues/5070,808605422,MDEyOklzc3VlQ29tbWVudDgwODYwNTQyMg==,2067093,2021-03-27T00:39:26Z,2021-03-27T00:43:35Z,NONE,"Just ran into this. Unsure whether checking `hasattr` is better than just trying to read the object and catching an error - someone could implement a non-compliant `read` method, which would create other errors.

As a workaround, you could read it into a BytesIO and pass the BytesIO instance:

```python
import fsspec
import xarray as xr
from io import BytesIO

of = fsspec.open(""example.nc"")
with of as f:
    xr.load_dataset(BytesIO(f.read()))
```

Also, [here's the link](https://github.com/pydata/xarray/blob/master/xarray/core/utils.py#L655) to the code referenced above.

Ideally xarray would work with `fsspec` or `pyfilesystem2` out of the box (to parse access URLs, for example). I've had to fall back to using BytesIO buffers too many times. 😛

Edit: You don't even need BytesIO; it works even with plain `bytes`:

```python
import fsspec
import xarray as xr

of = fsspec.open(""example.nc"")
with of as f:
    xr.load_dataset(f.read())
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,839823306
https://github.com/pydata/xarray/issues/2059#issuecomment-730263703,https://api.github.com/repos/pydata/xarray/issues/2059,730263703,MDEyOklzc3VlQ29tbWVudDczMDI2MzcwMw==,2067093,2020-11-19T10:02:35Z,2020-11-19T10:02:35Z,NONE,"This may be relevant here, maybe not, but it appears the HDF5 backend is also at odds with all of the above serialization. Our internal project's dependencies changed, which moved the `h5py` version from 2.10 to 3.1; apparently there was a breaking change that means unicode strings are now either encoded or decoded as `bytes`. Thankfully we had a test for that, but figuring out what was wrong was difficult. Essentially, netCDF4 files that were round-tripped to a BytesIO (via an HDF5 backend) had unicode strings converted to bytes.

I'm not sure whether it was the encoding or the decoding part - likely decoding, judging by the docs:

https://docs.h5py.org/en/stable/strings.html
https://docs.h5py.org/en/stable/whatsnew/3.0.html#breaking-changes-deprecations

This might require even more special-casing to achieve consistent behavior for xarray users who don't really want to go into backend details (like me 😋).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,314444743
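A minimal sketch of the h5py ≥ 3.0 behavior described in the comment above (the file name and data are hypothetical; `Dataset.asstr()` is the h5py 3.x API for opting back into decoded `str`):

```python
import h5py

# Write a variable-length string dataset (hypothetical file and contents).
with h5py.File("strings.h5", "w") as f:
    f.create_dataset("labels", data=["alpha", "beta"], dtype=h5py.string_dtype())

with h5py.File("strings.h5", "r") as f:
    raw = f["labels"][:]              # h5py >= 3.0: elements come back as bytes (b"alpha")
    decoded = f["labels"].asstr()[:]  # explicit opt-in to decoded str ("alpha")
```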
https://github.com/pydata/xarray/issues/2995#issuecomment-657798184,https://api.github.com/repos/pydata/xarray/issues/2995,657798184,MDEyOklzc3VlQ29tbWVudDY1Nzc5ODE4NA==,2067093,2020-07-13T21:17:06Z,2020-07-13T21:17:06Z,NONE,"I ran into this issue; here's a simple workaround that seems to work:

```python
import netCDF4
import xarray as xr
from xarray.backends import NetCDF4DataStore
from xarray.backends.api import dump_to_store


def dataset_to_bytes(ds: xr.Dataset, name: str = ""my-dataset"") -> bytes:
    """"""Converts a dataset to bytes via an in-memory (diskless) netCDF4 file.""""""
    nc4_ds = netCDF4.Dataset(name, mode=""w"", diskless=True, memory=ds.nbytes)
    nc4_store = NetCDF4DataStore(nc4_ds)
    dump_to_store(ds, nc4_store)
    res_mem = nc4_ds.close()
    res_bytes = res_mem.tobytes()
    return res_bytes
```

I tested this using the following:

```python
from io import BytesIO

fname = ""REDACTED.nc""
ds = xr.load_dataset(fname)
ds_bytes = dataset_to_bytes(ds)
ds2 = xr.load_dataset(BytesIO(ds_bytes))
assert ds2.equals(ds) and all(ds2.attrs[k] == ds.attrs[k] for k in set(ds2.attrs).union(ds.attrs))
```

The assertion holds; however, the file size on disk is different. It's possible the files were saved using different netCDF4 versions; I haven't had time to test that.

I tried using just `ds.to_netcdf()` but got the following error: `ValueError: NetCDF 3 does not support type |S32`. That's because it falls back to the `'scipy'` engine.

Would be nice to have a non-hacky way to write netCDF4 files to byte streams. :smiley:","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,449706080
https://github.com/pydata/xarray/issues/1603#issuecomment-557579503,https://api.github.com/repos/pydata/xarray/issues/1603,557579503,MDEyOklzc3VlQ29tbWVudDU1NzU3OTUwMw==,2067093,2019-11-22T15:34:57Z,2019-11-22T15:34:57Z,NONE,"> Thanks @NowanIlfideme for your feedback.
>
> Could you perhaps share a gist of code related to your use case?

The first example in this comment is similar to my use case: https://github.com/pydata/xarray/issues/3213#issuecomment-520741706. There are several ""core"" dimensions, but some part of the coordinates may be hierarchical or cross-defined (e.g. country > province > city > building, but also country > province > voting district > building). We might have a full or nearly-full panel in the MultiIndex representation, yet a huge cross product (even if we keep strictly hierarchical dimensions out).

Meanwhile, using a true COO sparse representation (as I understand it) will likely end up with slower operations overall, since nearly all machine learning models (think: linear regression) require a dense array input anyway. I'll make an example of this when I find some free time, along with a contrasting one in Pandas. :)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,262642978
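A minimal sketch of the ""dense representation of a sparse subspace"" idea from the comment above (country/city names are hypothetical; this relies on xarray's existing `pandas.MultiIndex` coordinate support):

```python
import pandas as pd
import xarray as xr

# Index only the (country, city) pairs that actually exist, not the full cross product.
midx = pd.MultiIndex.from_tuples(
    [("DE", "Berlin"), ("DE", "Munich"), ("FR", "Paris")],
    names=("country", "city"),
)
da = xr.DataArray([1.0, 2.0, 3.0], dims="location", coords={"location": midx})

# Selecting on one level works, but drops that level from the result -
# the behavior the comments below describe.
da.sel(country="DE")
```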
:)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,262642978 https://github.com/pydata/xarray/issues/1603#issuecomment-557563566,https://api.github.com/repos/pydata/xarray/issues/1603,557563566,MDEyOklzc3VlQ29tbWVudDU1NzU2MzU2Ng==,2067093,2019-11-22T14:59:29Z,2019-11-22T14:59:29Z,NONE,"I've noticed that basically all my current troubles with xarray lead to this issue (lack of MultiIndex support). I use xarray for machine learning/data science/econometrics. My current problem requires a semi-hierarchical indexing on one of the dimensions, and slicing/aggregation along some levels of those dimensions. My first attempt was to just assume each dimension was orthogonal, which resulted in out-of-memory errors. I ended up using a MultiIndex for the hierarchy dimension to have a ""dense"" representation of a sparse subspace. Unfortunately, currently `.sel()` and such will cut out MultiIndex dimensions, and I've had to do boolean masking to keep all the dimensions I need. Multidimensional groupby, especially within the MultiIndex, is a headache as it currently stands. I had to resort to making auxilliary dimensions with one-hot encoded levels (dummy variables) and doing multiply-aggregate operations by hand. `xarray` is really beautiful and should be used more by data scientists, but it's really difficult to recommend it to colleagues when not all the familiar `pandas`-style operations are supported.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,262642978 https://github.com/pydata/xarray/issues/3458#issuecomment-557476617,https://api.github.com/repos/pydata/xarray/issues/3458,557476617,MDEyOklzc3VlQ29tbWVudDU1NzQ3NjYxNw==,2067093,2019-11-22T10:21:08Z,2019-11-22T10:21:08Z,NONE,"Note that this doesn't work on MultiIndex levels, since vectorized operations on them are not currently supported. Meanwhile, using `sel(multiindex_level_name=""a"")` drops the level from the multiindex entirely. The running theme is that this is dependent on #1603, it seems. :)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,514077742