html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue https://github.com/pydata/xarray/pull/7862#issuecomment-1578775636,https://api.github.com/repos/pydata/xarray/issues/7862,1578775636,IC_kwDOAMm_X85eGjRU,5821660,2023-06-06T13:30:15Z,2023-06-06T13:30:15Z,MEMBER,"> > Might be worth an issue over at numpy with the example from the test. > > [numpy/numpy#23886](https://github.com/numpy/numpy/issues/23886) The issue is already resolved over at numpy which is really great! It was also marked as backport. @headtr1ck How are these issues resolved currently or how do we track removing the ignore?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1720045908 https://github.com/pydata/xarray/pull/7862#issuecomment-1578248748,https://api.github.com/repos/pydata/xarray/issues/7862,1578248748,IC_kwDOAMm_X85eEios,5821660,2023-06-06T09:04:39Z,2023-06-06T09:04:39Z,MEMBER,"> Might be worth an issue over at numpy with the example from the test. https://github.com/numpy/numpy/issues/23886","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1720045908 https://github.com/pydata/xarray/issues/7866#issuecomment-1576080083,https://api.github.com/repos/pydata/xarray/issues/7866,1576080083,IC_kwDOAMm_X85d8RLT,5821660,2023-06-05T05:45:30Z,2023-06-05T05:45:30Z,MEMBER,"@vrishk Sorry for the delay here and thanks for bringing this to attention. We now have at least two requests which might move this forward (moving `ensure_dtype_not_object` into the backends). 
But this would need some discussion first, how to do this.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1720924071 https://github.com/pydata/xarray/issues/7892#issuecomment-1576074048,https://api.github.com/repos/pydata/xarray/issues/7892,1576074048,IC_kwDOAMm_X85d8PtA,5821660,2023-06-05T05:37:32Z,2023-06-05T05:37:32Z,MEMBER,@mktippett Thanks for raising this. The issue should be cleared after #7888 is merged.,"{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1740685974 https://github.com/pydata/xarray/pull/7862#issuecomment-1572021301,https://api.github.com/repos/pydata/xarray/issues/7862,1572021301,IC_kwDOAMm_X85dsyQ1,5821660,2023-06-01T13:06:32Z,2023-06-01T13:06:32Z,MEMBER,"@tomwhite I've added tests to check the backend code for vlen string dtype metadadata. Also had to add specific check for the h5py vlen string metadata. I think we've covered everything for the proposed change to allow empty vlen strings dtype metadata. I'm looking at the mypy error and do not have the slightest clue what and where to change. Any help appreciated. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1720045908 https://github.com/pydata/xarray/issues/7868#issuecomment-1561584592,https://api.github.com/repos/pydata/xarray/issues/7868,1561584592,IC_kwDOAMm_X85dE-PQ,5821660,2023-05-24T16:50:34Z,2023-05-24T16:50:34Z,MEMBER,"Thanks @ghiggi for your comment. The problem is we have at least two contradicting user requests here, see #7328 and #7862. I'm sure there is a solution to accommodate both sides. 
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1722417436 https://github.com/pydata/xarray/pull/7862#issuecomment-1561285499,https://api.github.com/repos/pydata/xarray/issues/7862,1561285499,IC_kwDOAMm_X85dD1N7,5821660,2023-05-24T14:37:58Z,2023-05-24T14:37:58Z,MEMBER,"Thanks for trying. I can't think of any downsides for the netcdf4-fix, as it just adds the needed metadata to the object-dtype. But you never know, so it would be good to get another set of eyes on it. So it looks like the changes here with the fix in my branch will get your issue resolved @tomwhite, right? I'm a bit worried, that this might break other users workflows, if they depend on the current conversion to floating point for some reason. Also other backends might rely on this feature. Especially because this has been there since the early days when xarray was known as xray. @dcherian What would be the way to go here? There is also a somehow contradicting issue in #7868.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1720045908 https://github.com/pydata/xarray/issues/7868#issuecomment-1561214028,https://api.github.com/repos/pydata/xarray/issues/7868,1561214028,IC_kwDOAMm_X85dDjxM,5821660,2023-05-24T13:58:16Z,2023-05-24T13:58:16Z,MEMBER,"My main question here is, why is dask not trying to retrieve the object types from dtype.metadata? 
Or does it and fail for some reason?.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1722417436 https://github.com/pydata/xarray/pull/7862#issuecomment-1561195832,https://api.github.com/repos/pydata/xarray/issues/7862,1561195832,IC_kwDOAMm_X85dDfU4,5821660,2023-05-24T13:52:04Z,2023-05-24T13:52:04Z,MEMBER,"@tomwhite I've put a commit with changes to zarr/netcdf4-backends which should preserve the dtype metadata here: https://github.com/kmuehlbauer/xarray/tree/preserve-vlen-string-dtype. I'm not really sure if that is the right location, but as it was already present that location at netcdf4-backend I think it will do.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1720045908 https://github.com/pydata/xarray/pull/7862#issuecomment-1561162311,https://api.github.com/repos/pydata/xarray/issues/7862,1561162311,IC_kwDOAMm_X85dDXJH,5821660,2023-05-24T13:32:26Z,2023-05-24T13:32:57Z,MEMBER,"@tomwhite Special casing on netcdf4 backend should be possible, too. But it might need fixing at zarr backend, too: ```python ds = xr.Dataset({""a"": np.array([], dtype=xr.coding.strings.create_vlen_dtype(str))}) print(f""dtype: {ds['a'].dtype}"") print(f""metadata: {ds['a'].dtype.metadata}"") ds.to_zarr(""a.zarr"") print(""\n### Loading ###"") with xr.open_dataset(""a.zarr"", engine=""zarr"") as ds: print(f""dtype: {ds['a'].dtype}"") print(f""metadata: {ds['a'].dtype.metadata}"") ``` ```python dtype: object metadata: {'element_type': } ### Loading ### dtype: object metadata: None ``` Could you verify the above example, please? 
I'm relatively new to `zarr` :grimacing: ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1720045908 https://github.com/pydata/xarray/issues/7868#issuecomment-1560674198,https://api.github.com/repos/pydata/xarray/issues/7868,1560674198,IC_kwDOAMm_X85dBf-W,5821660,2023-05-24T08:27:11Z,2023-05-24T08:27:11Z,MEMBER,"@ghiggi Glad it works, but we still have to check if that is the correct location for the fix, as it's not CF specific. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1722417436 https://github.com/pydata/xarray/pull/7862#issuecomment-1560559426,https://api.github.com/repos/pydata/xarray/issues/7862,1560559426,IC_kwDOAMm_X85dBD9C,5821660,2023-05-24T07:01:44Z,2023-05-24T07:01:44Z,MEMBER,"Thanks @tomwhite for the PR. I've only quickly checked the approach, which looks reasonable. But those changes have implications on several locations of the backend code, which we would have to sort out. 
Considering this example: ```python import numpy as np import xarray as xr print(f""creating dataset with empty string array"") print(""-----------------------------------------"") dtype = xr.coding.strings.create_vlen_dtype(str) ds = xr.Dataset({""a"": np.array([], dtype=dtype)}) print(f""dtype: {ds['a'].dtype}"") print(f""metadata: {ds['a'].dtype.metadata}"") ds.to_netcdf(""a.nc"", engine=""netcdf4"") print(""\nncdump"") print(""-------"") !ncdump a.nc engines = [""netcdf4"", ""h5netcdf""] for engine in engines: with xr.open_dataset(""a.nc"", engine=engine) as ds: print(f""\nloading with {engine}"") print(""-------------------"") print(f""dtype: {ds['a'].dtype}"") print(f""metadata: {ds['a'].dtype.metadata}"") ``` ```python creating dataset with empty string array ----------------------------------------- dtype: object metadata: {'element_type': } ncdump ------- netcdf a { dimensions: a = UNLIMITED ; // (0 currently) variables: string a(a) ; data: } loading with netcdf4 ------------------- dtype: object metadata: None loading with h5netcdf ------------------- dtype: object metadata: {'vlen': } ``` Engine `netcdf4` does not roundtrip here, losing the dtype metadata information. There is special casing for h5netcdf backend, though. The source is actually located in `open_store_variable` of `netcdf4` backend, when the underlying data is converted to `Variable` (which does some object dtype twiddling). Unfortunately I do not have an immediate solution here. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1720045908 https://github.com/pydata/xarray/issues/7328#issuecomment-1560534067,https://api.github.com/repos/pydata/xarray/issues/7328,1560534067,IC_kwDOAMm_X85dA9wz,5821660,2023-05-24T06:37:39Z,2023-05-24T06:37:39Z,MEMBER,"@tomwhite Sorry for the delay here. 
I'll respond shortly on your PR #7862, but we might have to reiterate here later","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1466586967 https://github.com/pydata/xarray/issues/7868#issuecomment-1559959581,https://api.github.com/repos/pydata/xarray/issues/7868,1559959581,IC_kwDOAMm_X85c-xgd,5821660,2023-05-23T18:42:55Z,2023-05-23T19:01:00Z,MEMBER,@ghiggi Thanks for getting this back into action. I got dragged away from the one string object issue in #7654. I'll split this out and add a PR.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1722417436 https://github.com/pydata/xarray/issues/7868#issuecomment-1559973194,https://api.github.com/repos/pydata/xarray/issues/7868,1559973194,IC_kwDOAMm_X85c-01K,5821660,2023-05-23T18:55:46Z,2023-05-23T18:55:46Z,MEMBER,@ghiggi I'd appreciate if you could test your workflows against #7869. Your example and the one over in #7652 are working AFAICT.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1722417436 https://github.com/pydata/xarray/pull/7827#issuecomment-1556891860,https://api.github.com/repos/pydata/xarray/issues/7827,1556891860,IC_kwDOAMm_X85czEjU,5821660,2023-05-22T09:40:04Z,2023-05-22T09:40:04Z,MEMBER,"The example below is only based on Variable and the cf encode/decode variable functions. 
```python import xarray as xr import numpy as np # create DataArray times = [np.datetime64(""2000-01-01"", ""ns""), np.datetime64(""NaT"")] da = xr.DataArray(times, dims=[""time""], name=""foo"") da.encoding[""dtype""] = np.float64 da.encoding[""_FillValue""] = 20.0 # extract Variable source_var = da.variable print(""---------- source_var ------------------"") print(source_var) print(source_var.encoding) # encode Variable encoded_var = xr.conventions.encode_cf_variable(source_var) print(""\n---------- encoded_var ------------------"") print(encoded_var) # decode Variable decoded_var = xr.conventions.decode_cf_variable(""foo"", encoded_var) print(""\n---------- decoded_var ------------------"") print(decoded_var.load()) ``` ```python /home/kai/miniconda/envs/xarray_311/lib/python3.11/site-packages/xarray/coding/times.py:618: RuntimeWarning: invalid value encountered in cast int_num = np.asarray(num, dtype=np.int64) /home/kai/miniconda/envs/xarray_311/lib/python3.11/site-packages/xarray/coding/times.py:254: RuntimeWarning: invalid value encountered in cast flat_num_dates_ns_int = (flat_num_dates * _NS_PER_TIME_DELTA[delta]).astype( /home/kai/miniconda/envs/xarray_311/lib/python3.11/site-packages/xarray/coding/times.py:254: RuntimeWarning: invalid value encountered in cast flat_num_dates_ns_int = (flat_num_dates * _NS_PER_TIME_DELTA[delta]).astype( ---------- source_var ------------------ array(['2000-01-01T00:00:00.000000000', 'NaT'], dtype='datetime64[ns]') {'dtype': , '_FillValue': 20.0} dtype num float64 ---------- encoded_var ------------------ array([ 0., 20.]) Attributes: units: days since 2000-01-01 00:00:00 calendar: proleptic_gregorian _FillValue: 20.0 ---------- decoded_var ------------------ array(['2000-01-01T00:00:00.000000000', 'NaT'], dtype='datetime64[ns]') {'_FillValue': 20.0, 'units': 'days since 2000-01-01 00:00:00', 'calendar': 'proleptic_gregorian', 'dtype': dtype('float64')} ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, 
""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1700227455 https://github.com/pydata/xarray/pull/7827#issuecomment-1556869361,https://api.github.com/repos/pydata/xarray/issues/7827,1556869361,IC_kwDOAMm_X85cy_Dx,5821660,2023-05-22T09:24:47Z,2023-05-22T09:24:47Z,MEMBER,"@spencerkclark With current master I get the following `RuntimeWarning` running your code example: - on encoding (calling `to_netcdf()`): ```python /home/kai/miniconda/envs/xarray_311/lib/python3.11/site-packages/xarray/coding/times.py:618: RuntimeWarning: invalid value encountered in cast int_num = np.asarray(num, dtype=np.int64) ``` - on decoding (calling `open_dataset()`): ```python /home/kai/miniconda/envs/xarray_311/lib/python3.11/site-packages/xarray/coding/times.py:254: RuntimeWarning: invalid value encountered in cast flat_num_dates_ns_int = (flat_num_dates * _NS_PER_TIME_DELTA[delta]).astype( /home/kai/miniconda/envs/xarray_311/lib/python3.11/site-packages/xarray/coding/times.py:254: RuntimeWarning: invalid value encountered in cast flat_num_dates_ns_int = (flat_num_dates * _NS_PER_TIME_DELTA[delta]).astype( ``` The latter was discussed in #7098 (casting float64 to int64), the former was aimed to be resolved with this PR. I'll try to create a test case using `Variable` and the respective encoding/decoding functions without involving IO (per your suggestion @spencerkclark). ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1700227455 https://github.com/pydata/xarray/pull/7827#issuecomment-1554532844,https://api.github.com/repos/pydata/xarray/issues/7827,1554532844,IC_kwDOAMm_X85cqEns,5821660,2023-05-19T12:57:31Z,2023-05-19T12:57:31Z,MEMBER,Thanks @spencerkclark for taking the time. NaN has been written to disk (as you assumed). 
Let's have another try next week.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1700227455 https://github.com/pydata/xarray/pull/7788#issuecomment-1545446155,https://api.github.com/repos/pydata/xarray/issues/7788,1545446155,IC_kwDOAMm_X85cHaML,5821660,2023-05-12T09:23:13Z,2023-05-12T09:23:13Z,MEMBER,"@maxhollmann I'm sorry, I'm still finding my way into Xarray. I've taken a closer look at #2377, especially https://github.com/pydata/xarray/issues/2377#issuecomment-415074188. There @shoyer suggested to just use: ```python data = duck_array_ops.where_method(data, ~mask, fill_value) ``` instead of ```python data[mask] = fill_value ``` I've checked and it works nicely with your test. That way we would get away without the flags test and the special handling will take place in duck_array_ops. Would be great if someone can double check. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1685422501 https://github.com/pydata/xarray/issues/4220#issuecomment-1545408039,https://api.github.com/repos/pydata/xarray/issues/4220,1545408039,IC_kwDOAMm_X85cHQ4n,5821660,2023-05-12T08:55:09Z,2023-05-12T08:55:09Z,MEMBER,`combine_first` uses `fillna` under the hood -> #3570,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,656089264 https://github.com/pydata/xarray/issues/5706#issuecomment-1545346823,https://api.github.com/repos/pydata/xarray/issues/5706,1545346823,IC_kwDOAMm_X85cHB8H,5821660,2023-05-12T08:06:06Z,2023-05-12T08:06:06Z,MEMBER,This is resolved in recent `netcdf-c`/`netcdf4-python` and works with recent Xarray.,"{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 1, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,970619131 
https://github.com/pydata/xarray/pull/7788#issuecomment-1545337724,https://api.github.com/repos/pydata/xarray/issues/7788,1545337724,IC_kwDOAMm_X85cG_t8,5821660,2023-05-12T07:59:19Z,2023-05-12T07:59:19Z,MEMBER,"@maxhollmann We might get at least some more views on this. There have been discussions on handling masked arrays and we should make sure this is exactly the solution we want to have. @dcherian This changes `as_compatible_data`. Could you please have another look here? I'm a bit unclear about the implications.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1685422501 https://github.com/pydata/xarray/pull/7834#issuecomment-1543526954,https://api.github.com/repos/pydata/xarray/issues/7834,1543526954,IC_kwDOAMm_X85cAFoq,5821660,2023-05-11T08:03:01Z,2023-05-11T08:03:01Z,MEMBER,"@mx-moth Yes, this casting should be fixed. I'm adding a bit of context here, as this might need to be solved in combination with #7098 and #7827. #7098 removes undefined casting for decoding. In #7827 there are efforts to do this for encoding, too. As `cast_to_int_if_safe` is called for encoding as well as decoding, I'm not sure if all cases have been caught by these two PRs. One issue on decoding is that at least for datetime64 based times the calculated `time_deltas` are currently converted to float64 in the presence of `NaT` (although `NaT` can perfectly be expressed as int64).
It would be great if you could try your PR on top of #7827 (which includes #7098) to see if that fixes the errors in this PR.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1705163672 https://github.com/pydata/xarray/issues/7833#issuecomment-1543285629,https://api.github.com/repos/pydata/xarray/issues/7833,1543285629,IC_kwDOAMm_X85b_Kt9,5821660,2023-05-11T03:39:29Z,2023-05-11T03:39:29Z,MEMBER,"@alimanfoo The slow code stems from my changes in #7400. Obviously the performance drop did not manifest in the tests/benchmarks. In #7824 @Illviljan is tackling concat performance.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1704950804 https://github.com/pydata/xarray/pull/7827#issuecomment-1542767369,https://api.github.com/repos/pydata/xarray/issues/7827,1542767369,IC_kwDOAMm_X85b9MMJ,5821660,2023-05-10T20:27:08Z,2023-05-10T20:27:08Z,MEMBER,"@dcherian You were right from the beginning, changing order for decoding and handling `_FillValue` in `CFDatetimeCoder` seems to be one working solution with minimal code changes. If the CI is happy I'll add tests to cover for the nanosecond issues in #7817. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1700227455 https://github.com/pydata/xarray/issues/7831#issuecomment-1541410601,https://api.github.com/repos/pydata/xarray/issues/7831,1541410601,IC_kwDOAMm_X85b4A8p,5821660,2023-05-10T06:13:20Z,2023-05-10T06:13:39Z,MEMBER,Yet another idea would be to add an `Engines` heading on https://docs.xarray.dev/en/stable/ecosystem.html where engines/backends and their respective packages can be listed. The error could include a link to that page.
,"{""total_count"": 3, ""+1"": 3, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1702025553 https://github.com/pydata/xarray/issues/7831#issuecomment-1540845511,https://api.github.com/repos/pydata/xarray/issues/7831,1540845511,IC_kwDOAMm_X85b12_H,5821660,2023-05-09T20:26:32Z,2023-05-09T20:26:32Z,MEMBER,"Maybe it would also help to rephrase the error, something along the lines ""Engine `rasterio` is not available. Please install the needed package. Engines [xxx, yyy, zzz] are available."" ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1702025553 https://github.com/pydata/xarray/pull/7827#issuecomment-1539356386,https://api.github.com/repos/pydata/xarray/issues/7827,1539356386,IC_kwDOAMm_X85bwLbi,5821660,2023-05-09T03:51:39Z,2023-05-09T03:51:39Z,MEMBER,"Thanks for the heads-up, @spencerkclark. No worries, I need to apply some changes anyway as it turns out.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1700227455 https://github.com/pydata/xarray/pull/7827#issuecomment-1538998850,https://api.github.com/repos/pydata/xarray/issues/7827,1538998850,IC_kwDOAMm_X85bu0JC,5821660,2023-05-08T20:22:28Z,2023-05-08T20:22:28Z,MEMBER,All tests have passed. Rebased now on latest main. The issue described in #7817 is resolved. Ready for first reviews. ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1700227455 https://github.com/pydata/xarray/pull/7827#issuecomment-1538966366,https://api.github.com/repos/pydata/xarray/issues/7827,1538966366,IC_kwDOAMm_X85busNe,5821660,2023-05-08T20:01:17Z,2023-05-08T20:01:17Z,MEMBER,"I've reset the order of coders to the initial behaviour. Instead the times are special cased in the CFMaskCoder. 
Locally it works, but I'll only trust the CI. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1700227455 https://github.com/pydata/xarray/pull/7771#issuecomment-1538819904,https://api.github.com/repos/pydata/xarray/issues/7771,1538819904,IC_kwDOAMm_X85buIdA,5821660,2023-05-08T18:11:00Z,2023-05-08T18:11:00Z,MEMBER,"Setting status back to draft for now, still evaluating solutions for the CF encoding/decoding.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1676309093 https://github.com/pydata/xarray/pull/7654#issuecomment-1538818465,https://api.github.com/repos/pydata/xarray/issues/7654,1538818465,IC_kwDOAMm_X85buIGh,5821660,2023-05-08T18:09:59Z,2023-05-08T18:09:59Z,MEMBER,"I've converted to draft for now, as I'm still evaluating solutions for the CF encoding/decoding.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1633623916 https://github.com/pydata/xarray/pull/7827#issuecomment-1538364933,https://api.github.com/repos/pydata/xarray/issues/7827,1538364933,IC_kwDOAMm_X85bsZYF,5821660,2023-05-08T13:29:07Z,2023-05-08T13:29:07Z,MEMBER,"@spencerkclark I'd appreciate if you could have a look here. All but one test pass, but I can't immediately see what that test is doing. Looks like mismatched dtypes on the attributes. If you have any suggestions how to possibly improve, please let me know. 
I've not added tests here, yet.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1700227455 https://github.com/pydata/xarray/issues/7817#issuecomment-1538354499,https://api.github.com/repos/pydata/xarray/issues/7817,1538354499,IC_kwDOAMm_X85bsW1D,5821660,2023-05-08T13:22:22Z,2023-05-08T13:22:52Z,MEMBER,"@dcherian Yes, I've setup a prototype in #7827. But the overall solution doesn't look that nice. The handling of fill_value has still to be done in CFMaskCoder. Also #7098 is needed for this.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1696097756 https://github.com/pydata/xarray/issues/7816#issuecomment-1535941525,https://api.github.com/repos/pydata/xarray/issues/7816,1535941525,IC_kwDOAMm_X85bjJuV,5821660,2023-05-05T08:55:42Z,2023-05-05T08:55:42Z,MEMBER,"@gauteh No worries, glad it works now!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1695809136 https://github.com/pydata/xarray/issues/7814#issuecomment-1535776861,https://api.github.com/repos/pydata/xarray/issues/7814,1535776861,IC_kwDOAMm_X85bihhd,5821660,2023-05-05T06:31:20Z,2023-05-05T06:31:20Z,MEMBER,"@paul0207 Thanks for providing the datafiles. I can't reproduce on my machine. Please provide more information, the output of `xr.show_versions()` would help and a complete traceback of the error you are experiencing. A complete list of installed Python Packages would be nice (eg. by `pip list`), too. Another couple of questions to get some more insight: - Does this happen only with these special files, or do you experience this every time? - Does the problem persists when specifying `engine=""netcdf4""` or `engine=""h5netcdf""` in the call to `open_mfdataset`? 
- Does this also happen if you open the files one-by-one (with `xr.open_dataset`) and combine the Datasets with `xr.concat`?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1695028906 https://github.com/pydata/xarray/issues/7816#issuecomment-1535724636,https://api.github.com/repos/pydata/xarray/issues/7816,1535724636,IC_kwDOAMm_X85biUxc,5821660,2023-05-05T05:46:46Z,2023-05-05T05:46:46Z,MEMBER,"@gauteh Yes, please provide as much information as possible. It is also of interest how you installed the package and what Python environment you are using (e.g. system python, conda, venv etc.)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1695809136 https://github.com/pydata/xarray/issues/7816#issuecomment-1535596259,https://api.github.com/repos/pydata/xarray/issues/7816,1535596259,IC_kwDOAMm_X85bh1bj,5821660,2023-05-05T01:46:12Z,2023-05-05T01:46:12Z,MEMBER,"@gauteh You would probably have to delete this line: https://github.com/gauteh/hidefix/blob/main/python/hidefix/xarray.py#L192 As @headtr1ck already explained, it is all handled via the plugin system to be able to handle duplicate engine names on discovery by the python metadata. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1695809136 https://github.com/pydata/xarray/issues/7817#issuecomment-1534855008,https://api.github.com/repos/pydata/xarray/issues/7817,1534855008,IC_kwDOAMm_X85bfAdg,5821660,2023-05-04T14:11:26Z,2023-05-04T14:11:26Z,MEMBER,"cc @spencerkclark @DocOtak I've tried to at least find one example which manifests as a bug. Nevertheless the transformation from int to float in CFMaskCoder should be avoided.
We might think about special casing time data in CFMaskCoder, or handle masking of time data in CFDatetimeCoder/CFTimedeltaCoder.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1696097756 https://github.com/pydata/xarray/issues/7790#issuecomment-1532441433,https://api.github.com/repos/pydata/xarray/issues/7790,1532441433,IC_kwDOAMm_X85bVzNZ,5821660,2023-05-03T04:25:50Z,2023-05-03T04:25:50Z,MEMBER,"@christine-e-smit Great this works on you side with the proposed patch in #7098. Nevertheless, we've identified three more issues here in the debugging process which can now be handled one by one. So again, thanks for your contribution here.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1685803922 https://github.com/pydata/xarray/issues/7790#issuecomment-1531050846,https://api.github.com/repos/pydata/xarray/issues/7790,1531050846,IC_kwDOAMm_X85bQfte,5821660,2023-05-02T08:04:45Z,2023-05-03T04:20:11Z,MEMBER,"As in #7098, citing @dcherian: > I think the real solution here is to explicitly handle NaNs during the decoding step. We do want these to be NaT in the output. There are three more issues revealed here when using datetime64: - if _FillValue is set in encoding, it has to be of same type/resolution as the times in the array - If _FillValue is provided, we need to provide `dtype` and `units` to which fit our data, eg. 
if the _FillValue is referenced to unix-epoch the units should be equivalent - when encoding in the presence of NaT the data array is converted to floating point with NaN, which is problematic for the subsequent conversion to int64","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1685803922 https://github.com/pydata/xarray/issues/5490#issuecomment-1531496369,https://api.github.com/repos/pydata/xarray/issues/5490,1531496369,IC_kwDOAMm_X85bSMex,5821660,2023-05-02T13:38:49Z,2023-05-02T13:38:49Z,MEMBER,"This is indeed an issue with `scale_factor` and `add_offset` as @d70-t has already mentioned. That is not a problem per se, but those attributes are obviously different for different files. When concatenating, only the first file's attributes survive. That might already be the source of the above problem, as it might slightly change values. An even bigger problem arises when the dynamic range of the decoded data (min/max) doesn't overlap. Then the data might be folded from the lower border to the upper border or vice versa. I've put an example into #5739. The suggestion for now is, as @keewis commented, to drop encoding in such cases and use floating point values for writing. You might use the available compression options for floating point data.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,924676925 https://github.com/pydata/xarray/issues/5490#issuecomment-1531465011,https://api.github.com/repos/pydata/xarray/issues/5490,1531465011,IC_kwDOAMm_X85bSE0z,5821660,2023-05-02T13:20:46Z,2023-05-02T13:20:46Z,MEMBER,"Xref: #5739 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,924676925 https://github.com/pydata/xarray/issues/7790#issuecomment-1530991257,https://api.github.com/repos/pydata/xarray/issues/7790,1530991257,IC_kwDOAMm_X85bQRKZ,5821660,2023-05-02T07:09:38Z,2023-05-02T08:14:36Z,MEMBER,"@christine-e-smit I've created an fresh environment with only xarray and zarr and it still works on my machine. I've then followed the Darwin idea and digged up #6191 (I've got those casting warnings from exactly the line you were referring to). Comment https://github.com/pydata/xarray/issues/6191#issuecomment-1209567966 should explain what happens here. tl;dr citing @DocOtak > The short explanation is that the time conversion functions do an `astype(np.int64)` or equivalent cast on arrays that contain nans. This is [undefined behavior](https://github.com/numpy/numpy/issues/13101#issuecomment-740058842) and very soon, doing this will[ start to emit RuntimeWarnings](https://github.com/numpy/numpy/pull/21437). There is also an open PR #7098. Thanks @christine-e-smit for sticking with me to find the root-cause here by providing detailed information and code examples. 
:+1: ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1685803922 https://github.com/pydata/xarray/issues/7790#issuecomment-1530141083,https://api.github.com/repos/pydata/xarray/issues/7790,1530141083,IC_kwDOAMm_X85bNBmb,5821660,2023-05-01T20:01:50Z,2023-05-01T20:01:50Z,MEMBER,"@christine-e-smit One more idea, you might delete the zarr folder before re-creating (if you are not doing that already). I've removed the complete folder before any new write (by putting eg. `!rm -rf xarray_and_units.zarr` at the beginning of the notebook-cell). It would also be great if you could run the code from https://github.com/pydata/xarray/issues/7790#issuecomment-1529894939 and post the output here, just for the sake of comparison (please delete the zarr-folder before if it exists). Thanks! ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1685803922 https://github.com/pydata/xarray/issues/7790#issuecomment-1530131533,https://api.github.com/repos/pydata/xarray/issues/7790,1530131533,IC_kwDOAMm_X85bM_RN,5821660,2023-05-01T19:53:53Z,2023-05-01T19:53:53Z,MEMBER,"@christine-e-smit I've plugged your code into a fresh notebook, here is my output: ```python ********************** xarray created with NaT fill value ---------------------- array([ 'NaT', '2023-01-02T00:00:00.000000000'], dtype='datetime64[ns]') Coordinates: * time (time) datetime64[ns] NaT 2023-01-02 ********************** xarray created read with NaT fill value ---------------------- array([ 'NaT', '2023-01-02T00:00:00.000000000'], dtype='datetime64[ns]') Coordinates: * time (time) datetime64[ns] NaT 2023-01-02 {} {'chunks': (2,), 'preferred_chunks': {'time': 2}, 'compressor': Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=0), 'filters': None, '_FillValue': -9223372036854775808, 'units': 'nanoseconds since 1970-01-01', 'calendar': 
'proleptic_gregorian', 'dtype': dtype('int64')} ``` The output seems OK on my side. I've no idea why the data isn't correctly decoded as NaT on your side. I've checked that my environment is comparable to yours. The only difference remaining is you are on Darwin arm64 whereas I'm on Linux. ``` INSTALLED VERSIONS ------------------ commit: None python: 3.11.2 | packaged by conda-forge | (main, Mar 31 2023, 17:51:05) [GCC 11.3.0] python-bits: 64 OS: Linux OS-release: 5.4.0-144-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: de_DE.UTF-8 LOCALE: ('de_DE', 'UTF-8') libhdf5: 1.14.0 libnetcdf: None xarray: 2023.4.2 pandas: 2.0.1 numpy: 1.24.3 scipy: 1.10.1 netCDF4: None pydap: None h5netcdf: 1.1.0 h5py: 3.8.0 Nio: None zarr: 2.14.2 cftime: None nc_time_axis: None PseudoNetCDF: None iris: None bottleneck: None dask: 2023.3.2 distributed: 2023.3.2 matplotlib: None cartopy: None seaborn: None numbagg: None fsspec: 2023.3.0 cupy: None pint: None sparse: None flox: None numpy_groupies: None setuptools: 67.6.1 pip: 23.0.1 conda: None pytest: 7.2.2 mypy: 0.982 IPython: 8.12.0 sphinx: None ``` ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1685803922 https://github.com/pydata/xarray/issues/7790#issuecomment-1530111912,https://api.github.com/repos/pydata/xarray/issues/7790,1530111912,IC_kwDOAMm_X85bM6eo,5821660,2023-05-01T19:30:22Z,2023-05-01T19:30:22Z,MEMBER,"> Unfortunately, I think you may have also gotten some wires crossed? You set the time fill value to 1900-01-01, but then use NaT in the actual array? Yes, I use NaT because I want to check if the encoder does correctly translate NaT to the provided _FillValue on write. So from your last example I'm assuming you would like to have the int64 representation of NaT as _FillValue, right? 
I'll try to adapt this, and see what I get ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1685803922 https://github.com/pydata/xarray/issues/7790#issuecomment-1529894939,https://api.github.com/repos/pydata/xarray/issues/7790,1529894939,IC_kwDOAMm_X85bMFgb,5821660,2023-05-01T16:05:19Z,2023-05-01T16:05:19Z,MEMBER,"So, after some debugging I think I've found two issues here with the current code. First, we need to give the fillvalue with a fitting resolution. Second, we have an issue with inferring the units from the data (if not given). Here is some workaround code which (finally, :crossed_fingers:) should at least write and read correct data (added comments below): ```python # Create a numpy array of type np.datetime64 with one fill value and one date # FIRST ISSUE WITH _FillValue # we need to provide ns resolution here too, otherwise we get wrong fillvalues (day-reference) time_fill_value = np.datetime64(""1900-01-01 00:00:00.00000000"", ""ns"") time = np.array([np.datetime64(""NaT"", ""ns""), '2023-01-02 00:00:00.00000000'], dtype='M8[ns]') # Create a dataset with this one array xr_time_array = xr.DataArray(data=time,dims=['time'],name='time') xr_ds = xr.Dataset(dict(time=xr_time_array)) print(""******************"") print(""Created with fill value 1900-01-01"") print(xr_ds[""time""]) # Save the dataset to zarr location_new_fill = ""from_xarray_new_fill.zarr"" # SECOND ISSUE with inferring units from data # We need to specify ""dtype"" and ""units"" which fit our data # Note: as we provide a _FillValue with a reference to unix-epoch # we need to provide a fitting units too encoding = { ""time"":{""_FillValue"":time_fill_value, ""dtype"":np.int64, ""units"":""nanoseconds since 1970-01-01""} } xr_ds.to_zarr(location_new_fill, mode=""w"", encoding=encoding) xr_read = xr.open_zarr(location_new_fill) print(""******************"") print(""Read back out of the zarr store with 
xarray"") print(xr_read[""time""]) print(xr_read[""time""].attrs) print(xr_read[""time""].encoding) z_new_fill = zarr.open('from_xarray_new_fill.zarr','r', ) print(""******************"") print(""Read back out of the zarr store with zarr"") print(z_new_fill[""time""]) print(z_new_fill[""time""].attrs) print(z_new_fill[""time""][:]) ``` ```python ****************** Created with fill value 1900-01-01 array([ 'NaT', '2023-01-02T00:00:00.000000000'], dtype='datetime64[ns]') Coordinates: * time (time) datetime64[ns] NaT 2023-01-02 ****************** Read back out of the zarr store with xarray array([ 'NaT', '2023-01-02T00:00:00.000000000'], dtype='datetime64[ns]') Coordinates: * time (time) datetime64[ns] NaT 2023-01-02 {} {'chunks': (2,), 'preferred_chunks': {'time': 2}, 'compressor': Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=0), 'filters': None, '_FillValue': -2208988800000000000, 'units': 'nanoseconds since 1970-01-01', 'calendar': 'proleptic_gregorian', 'dtype': dtype('int64')} ****************** Read back out of the zarr store with zarr [-2208988800000000000 1672617600000000000] ``` @christine-e-smit Please let me know, if the above workaround gives you correct results in your workflow. 
If so, then we can think about how to automatically align fillvalue-resolution with data-resolution and what needs to be done to correctly deduce the units.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1685803922 https://github.com/pydata/xarray/issues/7790#issuecomment-1529076482,https://api.github.com/repos/pydata/xarray/issues/7790,1529076482,IC_kwDOAMm_X85bI9sC,5821660,2023-04-30T16:52:25Z,2023-04-30T16:52:25Z,MEMBER,"> ```python > xr_ds.to_zarr(location_new_fill,encoding=encoding) > > xr_read = xr.open_zarr(location) > print(""******************"") > print(""Read back out of the zarr store with xarray"") > print(xr_read[""time""]) > print(xr_read[""time""].encoding) > ``` @christine-e-smit Is this just a remnant of copy&paste? The above code writes to `location_new_fill`, but reads from `location`. Here is my code and output for comparison (using latest zarr/xarray): ```python # Create a numpy array of type np.datetime64 with one fill value and one date time_fill_value = np.datetime64(""1900-01-01"") time = np.array([np.datetime64(""NaT""), '2023-01-02'], dtype='M8[ns]') # Create a dataset with this one array xr_time_array = xr.DataArray(data=time,dims=['time'],name='time') xr_ds = xr.Dataset(dict(time=xr_time_array)) print(""******************"") print(""Created with fill value 1900-01-01"") print(xr_ds[""time""]) # Save the dataset to zarr location_new_fill = ""from_xarray_new_fill.zarr"" encoding = { ""time"":{""_FillValue"":time_fill_value,""dtype"":np.int64} } xr_ds.to_zarr(location_new_fill, encoding=encoding) xr_read = xr.open_zarr(location_new_fill) print(""******************"") print(""Read back out of the zarr store with xarray"") print(xr_read[""time""]) print(xr_read[""time""].encoding) ``` ```python ****************** Created with fill value 1900-01-01 array([ 'NaT', '2023-01-02T00:00:00.000000000'], dtype='datetime64[ns]') Coordinates: * time (time) 
datetime64[ns] NaT 2023-01-02 ****************** Read back out of the zarr store with xarray array([ 'NaT', '2023-01-02T00:00:00.000000000'], dtype='datetime64[ns]') Coordinates: * time (time) datetime64[ns] NaT 2023-01-02 {'chunks': (2,), 'preferred_chunks': {'time': 2}, 'compressor': Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=0), 'filters': None, '_FillValue': -25567, 'units': 'days since 2023-01-02 00:00:00', 'calendar': 'proleptic_gregorian', 'dtype': dtype('int64')} ``` This doesn't look correct either. At least the decoded `_FillValue` or the `units` are wrong. So -25567 is 1900-01-01 when referenced to the unix-epoch (Question: Is zarr time based on unix epoch?). When read back via zarr only this would decode into: ```python array(['1953-01-02T00:00:00.000000000', '2023-01-02T00:00:00.000000000'], dtype='datetime64[ns]') ``` I totally agree with @christine-e-smit, this is all very confusing. As said at the beginning, I have little knowledge of zarr. I'm currently digging into cf encoding/decoding which made me jump on here. AFAICT, it looks like the encoding already has a problem, at least the data on disk is already not what we expect. It seems that somehow the xarray cf_encoding/decoding is not well aligned with the zarr writing/reading of datetimes. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1685803922 https://github.com/pydata/xarray/issues/2478#issuecomment-1527527029,https://api.github.com/repos/pydata/xarray/issues/2478,1527527029,IC_kwDOAMm_X85bDDZ1,5821660,2023-04-28T12:59:04Z,2023-04-28T15:46:09Z,MEMBER,"@sbiner Sorry for the massive delay here. Not much has changed since the creation of your issue. 
Xarray doesn't take the netcdf default fill values into account (there are reasons, which @shoyer has explained in https://github.com/pydata/xarray/pull/5680#issuecomment-895455163 and https://github.com/pydata/xarray/pull/5680#issuecomment-895508489). On write it just uses `NaN` as `_FillValue` (in case no specific `encoding` is given). Xref: #2374, #7723, #5680 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,368833116 https://github.com/pydata/xarray/issues/7713#issuecomment-1527605739,https://api.github.com/repos/pydata/xarray/issues/7713,1527605739,IC_kwDOAMm_X85bDWnr,5821660,2023-04-28T13:55:17Z,2023-04-28T13:55:17Z,MEMBER,"The code is there since #867 by @shoyer which was committed almost 7 years ago. I've no idea what's the purpose for packing tuples into 0d arrays but as there are also tests for it in the above PR I'm assuming there is one real reason. Maybe @shoyer can chime in here to shed some light?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1652227927 https://github.com/pydata/xarray/issues/7647#issuecomment-1527544656,https://api.github.com/repos/pydata/xarray/issues/7647,1527544656,IC_kwDOAMm_X85bDHtQ,5821660,2023-04-28T13:12:08Z,2023-04-28T13:12:08Z,MEMBER,@wangshuaicumt Did you get along with this issue? If this is still unresolved it would be great if you could provide the data or a MCVE.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1631491844 https://github.com/pydata/xarray/issues/7630#issuecomment-1527541305,https://api.github.com/repos/pydata/xarray/issues/7630,1527541305,IC_kwDOAMm_X85bDG45,5821660,2023-04-28T13:09:22Z,2023-04-28T13:09:22Z,MEMBER,@AlxndrLhr I suppose your original issue is resolved. 
Please reopen or create a new issue if you still have problems with this.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1624560934 https://github.com/pydata/xarray/issues/6429#issuecomment-1527537064,https://api.github.com/repos/pydata/xarray/issues/6429,1527537064,IC_kwDOAMm_X85bDF2o,5821660,2023-04-28T13:06:14Z,2023-04-28T13:06:14Z,MEMBER,"It looks like this is no issue any more with recent versions of the stack. At least I can't reproduce this. @mjwillson Please reopen, if you still encounter problems while plotting.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1188262115 https://github.com/pydata/xarray/issues/7092#issuecomment-1527498384,https://api.github.com/repos/pydata/xarray/issues/7092,1527498384,IC_kwDOAMm_X85bC8aQ,5821660,2023-04-28T12:34:03Z,2023-04-28T12:34:03Z,MEMBER,"@leicunxing-rs Sorry for the delay here. Your issue might be connected with concatenation/merge of several files containing packed data with different `scale_factor`/`add_offset`. See issue #5739 for more details (there they also merge different ERA5 datasets, hence the idea).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1387341095 https://github.com/pydata/xarray/issues/5739#issuecomment-1527461082,https://api.github.com/repos/pydata/xarray/issues/5739,1527461082,IC_kwDOAMm_X85bCzTa,5821660,2023-04-28T12:00:15Z,2023-04-28T12:00:15Z,MEMBER,"@dougrichardson Sorry for the delay. If you are still interested in the source of this issue here is what I found: The root cause is different `scale_factor` and `add_offset` in the source files. When merging only the `.encoding` of the first dataset survives. This leads to wrongly encoded file for the may-dates. But why is this so? 
The issue is with the packed dtype (""int16"") and the particular values of `scale_factor`/`add_offset`. For feb the dynamic range is (228.96394336525748, 309.9690856933594) K whereas for may it is (205.7644192729947, 311.7797088623047) K. Now we can clearly see that all our values which are above 309.969 K will be folded to the lower end (>229 K). To circumvent that you have at least two options: - change `scale_factor` and `add_offset` values in the variable's `.encoding` before writing to appropriate values which cover your whole dynamic range - drop `scale_factor`/`add_offset` (and other CF related attributes) from .encoding to write floating point values It might be nice to have checks for that in the encoding steps, to prevent writing erroneous values. So this is not really a bug, but might be less impactful when encoding is dropped on operations (see discussion in #6323). ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,979916914 https://github.com/pydata/xarray/issues/5170#issuecomment-1527376059,https://api.github.com/repos/pydata/xarray/issues/5170,1527376059,IC_kwDOAMm_X85bCei7,5821660,2023-04-28T10:47:38Z,2023-04-28T10:47:38Z,MEMBER,"@floriankrb Sorry for the long delay. If you are still interested in the source of the issue, here is what I found: By default Xarray will promote any data variable which shares its name with a dimension to a coordinate. That accounts for ['number', 'time', 'step', 'heightAboveGround', 'latitude', 'longitude']. `valid_time` is a two dimensional coordinate (by CF standard) and is a coordinate here because `t2m` data variable has a corresponding `coordinates`-attribute containing `valid_time`. In the decoding-step `valid_time` gets added to the `.coords`. The attribute is removed from `t2m`'s attrs and kept in `t2m.encoding`. So far so good. By renaming `number` to `n` that coordinates attribute (in encoding) does **not** change as well. 
So when the data is written, `t2m` will still hold `number` in its `coordinates`-attribute (on disk). The issue manifests on subsequent read as now the decoding-step tries to align the found `coordinates` with the available data variables. As `number` is not available, no coordinate from that string will be taken into account as coordinate (note the `all` on line 444): https://github.com/pydata/xarray/blob/0f4e99d036b0d6d76a3271e6191eacbc9922662f/xarray/conventions.py#L439-L447 This can easily be observed by looking into `t2m.attrs` where the `coordinates` remains instead of being preserved in `.encoding`. So the source of all problems here is that the renaming `number` -> `n` was missed for `coordinates`-attribute of `t2m`'s `.encoding`. ","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,859772411 https://github.com/pydata/xarray/issues/2192#issuecomment-1527234694,https://api.github.com/repos/pydata/xarray/issues/2192,1527234694,IC_kwDOAMm_X85bB8CG,5821660,2023-04-28T09:06:22Z,2023-04-28T09:06:22Z,MEMBER,"Can't reproduce with recent xarray/matplotlib/cartopy. Looks like this has been resolved. 
```python import xarray as xr import cartopy.crs as ccrs ds = xr.tutorial.load_dataset('air_temperature') ds = ds.sel(lon = slice(250, 300)) air = ds['air'] transform = ccrs.PlateCarree() projection = ccrs.Mercator(air.lon.values.mean(), air.lat.values.min(), air.lat.values.max()) p = air.isel(time=[0,1]).plot(transform = transform, aspect = ds.dims['lon']/ds.dims['lat'], col = 'time', col_wrap = 1, subplot_kws = {'projection': projection}) for ax in p.axs.flat: ax.set_extent((air.lon.values.min(), air.lon.values.max(), air.lat.values.min(), air.lat.values.max()), crs = transform) ax.set_aspect('equal', 'box') ``` ![subplot_issue](https://user-images.githubusercontent.com/5821660/235105075-117e54da-6902-42a9-8607-efb736fc0693.png) Please reopen, if this is still an issue.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,327101646 https://github.com/pydata/xarray/issues/7790#issuecomment-1527050493,https://api.github.com/repos/pydata/xarray/issues/7790,1527050493,IC_kwDOAMm_X85bBPD9,5821660,2023-04-28T06:21:38Z,2023-04-28T06:21:38Z,MEMBER,"Thanks @dcherian for filling in the details. I've dug up some more related issues: #2265, #3942, #4045 IIUC, #4684 did a great job to iron out much of these issues, but it looks like only in the case when no `NaT` is within the time array (cc @spencerkclark). @christine-e-smit If you have no `NaT` in your time array then you can just omit `encoding` completely and Xarray will use int64 per default and your data should be fine on disk. In the presence of `NaT` it looks like one workaround to circumvent that issue for the time being is to add the `dtype` in addition to `_FillValue` when writing out to zarr : ```python encoding = { ""time"":{""_FillValue"": time_fill_value, ""dtype"": np.int64} } xr_ds.to_zarr(location, encoding=encoding) ``` One note to this: Xarray is deducing the `units` from the current time data. 
So for the above example it will result in `'days since 2023-01-02 00:00:00'` where `days` would now be the resolution in the file. If you want the resolution to be nanoseconds on disk `units` would need to be added to the encoding. ```python encoding = { ""time"":{""_FillValue"": time_fill_value, ""dtype"": np.int64, 'units': 'nanoseconds since 2023-01-02'} } xr_ds.to_zarr(location, encoding=encoding) ``` @christine-e-smit It would be great if you could confirm that from your side (some sanity check needed on my side). ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1685803922 https://github.com/pydata/xarray/issues/7790#issuecomment-1525790614,https://api.github.com/repos/pydata/xarray/issues/7790,1525790614,IC_kwDOAMm_X85a8beW,5821660,2023-04-27T14:23:16Z,2023-04-27T14:23:16Z,MEMBER,"@christine-e-smit I see, thanks for the details. AFAICT from the code it looks like `zarr` is special-cased in some ways compared to other backends. I'd really rely on some zarr-expert shedding light here and over at #7776.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1685803922 https://github.com/pydata/xarray/issues/7713#issuecomment-1525780533,https://api.github.com/repos/pydata/xarray/issues/7713,1525780533,IC_kwDOAMm_X85a8ZA1,5821660,2023-04-27T14:17:26Z,2023-04-27T14:17:26Z,MEMBER,"@zoj613 Thanks for raising this. The root-cause is that the tuple is returned from `as_compatible_data` as single element array: ```python import xarray as xr print(xr.core.variable.as_compatible_data((2, 3, 4))) ``` ```python array((2, 3, 4), dtype=object) ``` This then breaks with the error you are seeing. I'm not quite sure if this is a bug in the code, a bug in the doc or no bug at all. 
But as a tuple is easily wrapped by `np.array` there should be a reason why Xarray is currently not able to digest tuples.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1652227927 https://github.com/pydata/xarray/issues/7782#issuecomment-1525705799,https://api.github.com/repos/pydata/xarray/issues/7782,1525705799,IC_kwDOAMm_X85a8GxH,5821660,2023-04-27T13:33:50Z,2023-04-27T13:33:50Z,MEMBER,"> > As we can see from the above output, in netCDF4-python scaling is adapting the dtype to unsigned, not masking. This is also reflected in the docs [unidata.github.io/netcdf4-python/#Variable](https://unidata.github.io/netcdf4-python/#Variable). > > Do we know why this is so? TL;DR: NETCDF3 detail to allow (signal) unsigned integer, still used in recent formats - more discussion details on this over at https://github.com/Unidata/netcdf4-python/issues/656 - at NetCDF Users Guide on [packed data](https://docs.unidata.ucar.edu/nug/current/best_practices.html#bp_Packed-Data-Values): _A conventional way to indicate whether a byte, short, or int variable is meant to be interpreted as unsigned, even for the netCDF-3 classic model that has no external unsigned integer type, is by providing the special variable attribute \_Unsigned with value ""true"". However, most existing data for which packed values are intended to be interpreted as unsigned are stored without this attribute, so readers must be aware of packing assumptions in this case. In the enhanced netCDF-4 data model, packed integers may be declared to be of the appropriate unsigned type._ My suggestion would be to nudge the user by issuing warnings and link to new to be added documentation on the topic. This could be in line with the cf-coding conformance checks which have been discussed yesterday in the dev-meeting. 
","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1681353195 https://github.com/pydata/xarray/issues/7790#issuecomment-1525524428,https://api.github.com/repos/pydata/xarray/issues/7790,1525524428,IC_kwDOAMm_X85a7afM,5821660,2023-04-27T11:26:15Z,2023-04-27T11:26:15Z,MEMBER,"Xref: discussion #7776, which got no attention up to now.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1685803922 https://github.com/pydata/xarray/issues/7790#issuecomment-1525513525,https://api.github.com/repos/pydata/xarray/issues/7790,1525513525,IC_kwDOAMm_X85a7X01,5821660,2023-04-27T11:19:24Z,2023-04-27T11:19:24Z,MEMBER,"@christine-e-smit So, I'm no expert for `zarr`, but it turns out that your `NaT` was converted to `-9.223372036854776e+18` in the encoding step. I'm assuming that `zarr` is converting `NaT` as the format doesn't allow to use `NaT` directly, so it chooses a (default) value. 
The `_FillValue` is not lost, but it will be preserved in the `.encoding`-dict of the underlying Variable: ```python xr_read = xr.open_zarr(location) print(""******************"") print(""No fill value"") print(xr_read[""time""]) print(xr_read[""time""].encoding) ``` ```python ****************** No fill value array([ 'NaT', '2023-01-02T00:00:00.000000000'], dtype='datetime64[ns]') Coordinates: * time (time) datetime64[ns] NaT 2023-01-02 {'chunks': (2,), 'preferred_chunks': {'time': 2}, 'compressor': Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=0), 'filters': None, '_FillValue': -9.223372036854776e+18, 'units': 'days since 2023-01-02 00:00:00', 'calendar': 'proleptic_gregorian', 'dtype': dtype('float64')} ``` You might also check this without decoding (`decode_cf=False`): ```python with xr.open_zarr(location, decode_cf=False) as xr_read: print(""******************"") print(""No fill value"") print(xr_read[""time""]) print(xr_read[""time""].encoding) ``` ```python ****************** No fill value array([-9.223372e+18, 0.000000e+00]) Coordinates: * time (time) float64 -9.223e+18 0.0 Attributes: calendar: proleptic_gregorian units: days since 2023-01-02 00:00:00 _FillValue: -9.223372036854776e+18 {'chunks': (2,), 'preferred_chunks': {'time': 2}, 'compressor': Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=0), 'filters': None, 'dtype': dtype('float64')} ``` Maybe a zarr-expert can chime in here, what's the best practice for time-fill_values. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1685803922 https://github.com/pydata/xarray/pull/7788#issuecomment-1524805132,https://api.github.com/repos/pydata/xarray/issues/7788,1524805132,IC_kwDOAMm_X85a4q4M,5821660,2023-04-27T06:13:23Z,2023-04-27T07:19:47Z,MEMBER,"@maxhollmann I've checked and memory served well, the following issue might be related: #2377. 
It looks like your use-case is at least connected to @gerritholl's. It would be great if you could add your original use case (as MCVE, if possible) to get more details. A special case (masked integer arrays) is discussed in #3955. As this might give additional information, it might not exactly fit your problem.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1685422501 https://github.com/pydata/xarray/pull/7788#issuecomment-1523829332,https://api.github.com/repos/pydata/xarray/issues/7788,1523829332,IC_kwDOAMm_X85a08pU,5821660,2023-04-26T17:55:13Z,2023-04-26T17:55:13Z,MEMBER,"@maxhollmann I'll have a look into this, I think I've seen something like this some time ago. Maybe you can add the tests to the PR or as comment? This might get more attention and will really help to debug. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1685422501 https://github.com/pydata/xarray/pull/7788#issuecomment-1523786065,https://api.github.com/repos/pydata/xarray/issues/7788,1523786065,IC_kwDOAMm_X85a0yFR,5821660,2023-04-26T17:18:44Z,2023-04-26T17:18:44Z,MEMBER,"I've marked this by accident, sorry @maxhollmann. 
Let us know when you feel this is ready ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1685422501 https://github.com/pydata/xarray/issues/7782#issuecomment-1522997083,https://api.github.com/repos/pydata/xarray/issues/7782,1522997083,IC_kwDOAMm_X85axxdb,5821660,2023-04-26T08:28:39Z,2023-04-26T08:28:39Z,MEMBER,"This is how netCDF4-python handles this data with different parameters: ```python import netCDF4 as nc with nc.Dataset(""http://dap.ceda.ac.uk/thredds/dodsC/neodc/esacci/snow/data/scfv/MODIS/v2.0/2010/01/20100101-ESACCI-L3C_SNOW-SCFV-MODIS_TERRA-fv2.0.nc"") as ds_dap: v = ds_dap[""scfv""] print(v) print(""\n- default"") print(f""variable dtype: {v.dtype}"") print(f""first 2 elements: {v[0, 0, :2].dtype} {v[0, 0, :2]}"") print(f""last 2 elements: {v[0, 0, -2:].dtype} {v[0, 0, -2:]}"") print(""\n- maskandscale False"") ds_dap.set_auto_maskandscale(False) v = ds_dap[""scfv""] print(f""variable dtype: {v.dtype}"") print(f""first 2 elements: {v[0, 0, :2].dtype} {v[0, 0, :2]}"") print(f""last 2 elements: {v[0, 0, -2:].dtype} {v[0, 0, -2:]}"") print(""\n- mask/scale False"") ds_dap.set_auto_mask(False) ds_dap.set_auto_scale(False) v = ds_dap[""scfv""] print(f""variable dtype: {v.dtype}"") print(f""first 2 elements: {v[0, 0, :2].dtype} {v[0, 0, :2]}"") print(f""last 2 elements: {v[0, 0, -2:].dtype} {v[0, 0, -2:]}"") print(""\n- mask True / scale False"") ds_dap.set_auto_mask(True) ds_dap.set_auto_scale(False) v = ds_dap[""scfv""] print(f""variable dtype: {v.dtype}"") print(f""first 2 elements: {v[0, 0, :2].dtype} {v[0, 0, :2]}"") print(f""last 2 elements: {v[0, 0, -2:].dtype} {v[0, 0, -2:]}"") print(""\n- mask False / scale True"") ds_dap.set_auto_mask(False) ds_dap.set_auto_scale(True) v = ds_dap[""scfv""] print(f""variable dtype: {v.dtype}"") print(f""first 2 elements: {v[0, 0, :2].dtype} {v[0, 0, :2]}"") print(f""last 2 elements: {v[0, 0, -2:].dtype} {v[0, 0, -2:]}"") 
print(""\n- mask True / scale True"") ds_dap.set_auto_mask(True) ds_dap.set_auto_scale(True) v = ds_dap[""scfv""] print(f""variable dtype: {v.dtype}"") print(f""first 2 elements: {v[0, 0, :2].dtype} {v[0, 0, :2]}"") print(f""last 2 elements: {v[0, 0, -2:].dtype} {v[0, 0, -2:]}"") print(""\n- maskandscale True"") ds_dap.set_auto_mask(False) ds_dap.set_auto_scale(False) ds_dap.set_auto_maskandscale(True) v = ds_dap[""scfv""] print(f""variable dtype: {v.dtype}"") print(f""first 2 elements: {v[0, 0, :2].dtype} {v[0, 0, :2]}"") print(f""last 2 elements: {v[0, 0, -2:].dtype} {v[0, 0, -2:]}"") ``` ```python int8 scfv(time, lat, lon) _Unsigned: true _FillValue: -1 standard_name: snow_area_fraction_viewable_from_above long_name: Snow Cover Fraction Viewable units: percent valid_range: [ 0 -2] actual_range: [ 0 100] flag_values: [-51 -50 -46 -41 -4 -3 -2] flag_meanings: Cloud Polar_Night_or_Night Water Permanent_Snow_and_Ice Classification_failed Input_Data_Error No_Satellite_Acquisition missing_value: -1 ancillary_variables: scfv_unc grid_mapping: spatial_ref _ChunkSizes: [ 1 1385 2770] unlimited dimensions: time current shape = (1, 18000, 36000) filling off - default variable dtype: int8 first 2 elements: uint8 [215 215] last 2 elements: uint8 [215 215] - maskandscale False variable dtype: int8 first 2 elements: int8 [-41 -41] last 2 elements: int8 [-41 -41] - mask/scale False variable dtype: int8 first 2 elements: int8 [-41 -41] last 2 elements: int8 [-41 -41] - mask True / scale False variable dtype: int8 first 2 elements: int8 [-- --] last 2 elements: int8 [-- --] - mask False / scale True variable dtype: int8 first 2 elements: uint8 [215 215] last 2 elements: uint8 [215 215] - mask True / scale True variable dtype: int8 first 2 elements: uint8 [215 215] last 2 elements: uint8 [215 215] - maskandscale True variable dtype: int8 first 2 elements: uint8 [215 215] last 2 elements: uint8 [215 215] ``` First, the dataset was created with `filling off` (read more about that in 
the netcdf file format specs https://docs.unidata.ucar.edu/netcdf-c/current/file_format_specifications.html). This should not be a problem for the analysis, but it tells us that all data points should have been written to somehow. As we can see from the above output, in netCDF4-python `scaling` is adapting the dtype to unsigned, not masking. This is also reflected in the docs https://unidata.github.io/netcdf4-python/#Variable. If Xarray is trying to align with netCDF4-python it should separate `mask` and `scale` as netCDF4-python is doing. It does this already by using different coders but it doesn't separate it API-wise. We would need a similar approach here for Xarray with additional kwargs `scale` and `mask` in addition to `mask_and_scale`. We cannot just move the UnsignedCoder out of mask_and_scale and apply it unconditionally. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1681353195 https://github.com/pydata/xarray/issues/7782#issuecomment-1520804745,https://api.github.com/repos/pydata/xarray/issues/7782,1520804745,IC_kwDOAMm_X85apaOJ,5821660,2023-04-24T20:47:43Z,2023-04-24T20:47:43Z,MEMBER,"@dcherian The main issue here is that we have two different CF things which are applied, Unsigned and _FillValue/missing_value. For netcdf4-python the values would just be masked and the dtype would be preserved. For xarray it will be cast to float32 because of the _FillValue/missing_value. I agree, moving the Unsigned Coder out of mask_and_scale should help in that particular case.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1681353195 https://github.com/pydata/xarray/issues/7782#issuecomment-1520514792,https://api.github.com/repos/pydata/xarray/issues/7782,1520514792,IC_kwDOAMm_X85aoTbo,5821660,2023-04-24T16:52:30Z,2023-04-24T16:52:30Z,MEMBER,"@dcherian Yes, that would work. 
We would want to check the different attributes and apply the coders only as needed. That might need some refactoring. I've been wrapping my head around this for several weeks now. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1681353195 https://github.com/pydata/xarray/issues/7782#issuecomment-1520363622,https://api.github.com/repos/pydata/xarray/issues/7782,1520363622,IC_kwDOAMm_X85anuhm,5821660,2023-04-24T15:10:24Z,2023-04-24T15:11:00Z,MEMBER,"Then you are somewhat deadlocked. `mask_and_scale=False` will also deactivate the Unsigned decoding. You might be able to achieve what you want by using `decode_cf=False` (completely deactivate cf decoding). Then you would have to remove _FillValue attribute as well as missing_value attribute from the variables. Finally, you can run `xr.decode_cf(ds)` to correctly decode your data. I'll add a code example tomorrow if no one beats me to it.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1681353195 https://github.com/pydata/xarray/issues/7782#issuecomment-1520277594,https://api.github.com/repos/pydata/xarray/issues/7782,1520277594,IC_kwDOAMm_X85anZha,5821660,2023-04-24T14:31:00Z,2023-04-24T14:31:00Z,MEMBER,"@Articoking As both variables have a _FillValue attached, xarray converts these values to NaN, effectively casting to float32 in this case. You might inspect the `.encoding`-property of the respective variables to get information of the source dtype. You can deactivate the automatic conversion by adding kwarg `mask_and_scale=False`. 
There is more information in the docs https://docs.xarray.dev/en/stable/user-guide/io.html","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1681353195 https://github.com/pydata/xarray/pull/7771#issuecomment-1516573065,https://api.github.com/repos/pydata/xarray/issues/7771,1516573065,IC_kwDOAMm_X85aZRGJ,5821660,2023-04-20T15:53:58Z,2023-04-20T15:53:58Z,MEMBER,"OK it seems this is ready for a first round of reviews. A bit of added context: Currently there is no dedicated function for checking for CF standard conformance. The idea is to read as much as possible also non-standard conforming data files, but restrict writing non-standard conforming files. The implemented function `ensure_scale_offset_conformance` takes a `strict` keyword argument, which is `True` when encoding and `False` when decoding. If `strict=True` it will raise errors if there is a mismatch with the standard and when `strict=False` it will issue warnings. I've only had to adapt a few tests which where not conforming to standard on encoding to align with that. I've observed some warnings in the test suite which we might to have a look into. One idea would be to fix erroneous `scale_factor`/`add_offset` with our best fitting estimate. This is already done for list-type `scale_factor`/`add_offset`. I will follow-up with checks for CFMaskCoder. 
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1676309093 https://github.com/pydata/xarray/issues/7770#issuecomment-1515146820,https://api.github.com/repos/pydata/xarray/issues/7770,1515146820,IC_kwDOAMm_X85aT05E,5821660,2023-04-19T17:59:00Z,2023-04-19T17:59:00Z,MEMBER,It's also possible to use the custom BackendEntrypoint-class directly in the call to `xr.open_dataset` with the `engine` keyword.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1675299031 https://github.com/pydata/xarray/issues/7767#issuecomment-1514437541,https://api.github.com/repos/pydata/xarray/issues/7767,1514437541,IC_kwDOAMm_X85aRHul,5821660,2023-04-19T09:42:29Z,2023-04-19T09:42:29Z,MEMBER,"I think the equivalent incantation would be (note the different order of arguments in `xr.where`): ```python da = xr.DataArray(np.arange(10)) print(xr.where(da < 5, da, 0).values) print(da.where(da < 5, 0).values) ``` ``` [0 1 2 3 4 0 0 0 0 0] [0 1 2 3 4 0 0 0 0 0] ``` ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1674532233 https://github.com/pydata/xarray/issues/7742#issuecomment-1501070685,https://api.github.com/repos/pydata/xarray/issues/7742,1501070685,IC_kwDOAMm_X85ZeIVd,5821660,2023-04-09T08:03:18Z,2023-04-09T08:03:18Z,MEMBER,"@ChristmasZCY Please have a look at the documentation about string encoding https://docs.xarray.dev/en/stable/user-guide/io.html#string-encoding Good chance that this gives you the needed information.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1659786592 
https://github.com/pydata/xarray/pull/7720#issuecomment-1500333865,https://api.github.com/repos/pydata/xarray/issues/7720,1500333865,IC_kwDOAMm_X85ZbUcp,5821660,2023-04-07T14:21:02Z,2023-04-07T14:21:21Z,MEMBER,Rebased on top of main after merge of #7719. This is ready for review. It's a one-liner actually :grin: ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1655000231 https://github.com/pydata/xarray/issues/4826#issuecomment-1498799474,https://api.github.com/repos/pydata/xarray/issues/4826,1498799474,IC_kwDOAMm_X85ZVd1y,5821660,2023-04-06T09:59:42Z,2023-04-06T09:59:42Z,MEMBER,@JoerivanEngelen Thanks for taking the time. Much appreciated.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,789410367 https://github.com/pydata/xarray/pull/7719#issuecomment-1498794212,https://api.github.com/repos/pydata/xarray/issues/7719,1498794212,IC_kwDOAMm_X85ZVcjk,5821660,2023-04-06T09:55:25Z,2023-04-06T09:55:25Z,MEMBER,This looks like it is ready to go. This will surely help further refactoring `encode_cf_variable`/`decode_cf_variable`. At least while working on it I spotted several locations where inconsistencies can be ironed out. A neat mostly flaw-free encoding/decoding is needed especially with regard to #6323.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1654988876 https://github.com/pydata/xarray/issues/7723#issuecomment-1498647087,https://api.github.com/repos/pydata/xarray/issues/7723,1498647087,IC_kwDOAMm_X85ZU4ov,5821660,2023-04-06T08:00:09Z,2023-04-06T08:00:09Z,MEMBER,"> > I'm still convinced this could be fixed for floating point data. > > Generally its worse if we obey some default fill values but not others, because it becomes quite confusing to a user. 
I think this depends from which side you look at it :-) My point here is, we do not have to submissively obey to default fill values, but just use them when decoding. This only need to happen if no `_FillValue` is attached to the variable. By doing this we ensure that these missing values are mapped to `np.nan` (as it is expected by users). In further course we can just apply the xarray standard `np.nan` when writing out. We need to document that in that case exact roundtrip isn't possible (it also isn't currently possible, in this example). Consider this example: ```python dtype = ""f4"" with nc.Dataset(""test-fillvalues-01.nc"", mode=""w"") as ds: x = ds.createDimension(""x"", 10) test_fillval_fillon = ds.createVariable(""test_fillval_fillon"", dtype, (""x"",), fill_value=nc.default_fillvals[dtype]) test_fillval_fillon[:5] = np.array([0.0, nc.default_fillvals[dtype], np.nan, 1.0, 8.0], dtype=dtype) test_nofillval_fillon = ds.createVariable(""test_nofillval_fillon"", dtype, (""x"",), fill_value=None) test_nofillval_fillon[:5] = np.array([0.0, nc.default_fillvals[dtype], np.nan, 1.0, 8.0], dtype=dtype) with nc.Dataset(""test-fillvalues-01.nc"") as ds: print(""\n read with netCDF4-python"") print(""---------------------------"") print(ds[""test_fillval_fillon""]) print(ds[""test_fillval_fillon""][:]) print(ds[""test_nofillval_fillon""]) print(ds[""test_nofillval_fillon""][:]) with xr.open_dataset(""test-fillvalues-01.nc"").load() as ds: print(""\n read with xarray"") print(""---------------------------"") print(ds[""test_fillval_fillon""]) print(ds[""test_fillval_fillon""][:]) print(ds[""test_nofillval_fillon""]) print(ds[""test_nofillval_fillon""][:]) ``` ```python read with netCDF4-python --------------------------- float32 test_fillval_fillon(x) _FillValue: 9.96921e+36 unlimited dimensions: current shape = (10,) filling on [0.0 -- nan 1.0 8.0 -- -- -- -- --] float32 test_nofillval_fillon(x) unlimited dimensions: current shape = (10,) filling on, default 
_FillValue of 9.969209968386869e+36 used [0.0 -- nan 1.0 8.0 -- -- -- -- --] read with xarray-python --------------------------- array([ 0., nan, nan, 1., 8., nan, nan, nan, nan, nan], dtype=float32) Dimensions without coordinates: x array([0.00000e+00, 9.96921e+36, nan, 1.00000e+00, 8.00000e+00, 9.96921e+36, 9.96921e+36, 9.96921e+36, 9.96921e+36, 9.96921e+36], dtype=float32) Dimensions without coordinates: x ``` The only difference between these two variables is that on the first the `_FillValue` is declared, and on the other the default `_FillValue` is used. So if xarray obeys (by CF standard) the first it should also obey the second. This might just work, if these cases the default fillvalue is used for decoding to `np.nan`, and declared that `np.nan` will be the new `_FillValue`. Does that make sense?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1655569401 https://github.com/pydata/xarray/pull/7719#issuecomment-1498540636,https://api.github.com/repos/pydata/xarray/issues/7719,1498540636,IC_kwDOAMm_X85ZUepc,5821660,2023-04-06T06:07:50Z,2023-04-06T06:23:40Z,MEMBER,"Now, this is interesting! It looks like those FillValue issues are following me. What did change that this now materializes here, all of a sudden. Update: Small change - big issue. Checked for `fv_exists` instead of `not fv_exists` :grimacing: ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1654988876 https://github.com/pydata/xarray/issues/7722#issuecomment-1498490570,https://api.github.com/repos/pydata/xarray/issues/7722,1498490570,IC_kwDOAMm_X85ZUSbK,5821660,2023-04-06T04:55:02Z,2023-04-06T04:55:02Z,MEMBER,"The recommendation is to use `_FillValue` if there is only one value describing missing/fillvalue. 
https://cfconventions.org/Data/cf-conventions/cf-conventions-1.10/cf-conventions.html#missing-data It's also written that `missing_value` is > This attribute is not treated in any special way by the library or conforming generic applications, but is often useful documentation and may be used by specific applications. https://docs.unidata.ucar.edu/netcdf-c/current/attribute_conventions.html Not sure, if xarray is a conforming generic application or a specific application. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1655483374 https://github.com/pydata/xarray/issues/7723#issuecomment-1498464352,https://api.github.com/repos/pydata/xarray/issues/7723,1498464352,IC_kwDOAMm_X85ZUMBg,5821660,2023-04-06T04:09:11Z,2023-04-06T04:09:11Z,MEMBER,"@dcherian Great, a duplicate. :-( Sorry I must have overlooked that one. It's somewhat counter-intuitive to get differing results when using netcdf4-python and xarray. Would be a good idea to document this behaviour. It looks like it might at least be resolved for floating point source data: Let's take the above simple example. We have np.nan written to the file, but the netcdf representation on disk uses a default (undeclared by attribute) `_FillValue` for unwritten parts. For the netcdf4-python user the np.nan will not be masked, but the unfilled parts will be masked. For xarray the default fillvalue won't be masked, appearing as valid data, which it is not. On subsequent writes np.nan will be introduced as the new fillvalue (by attribute), effectively changing the meaning of the default fillvalues. Wouldn't it make sense then, to transform these default fill values to np.nan on read too, instead of giving the a seemingly meaningful value? Maybe yet another keyword switch, `use_default_fillvalues`? 
There should be at least a warning on read, in these situations, that there are undefined values in the dataset which were never written and which will not be masked. If the dataset contains unwritten parts, and a default fillvalue is used, in turn meaning the data creator did this by purpose (by not setting a `_FillValue`) it can mean several things: - The creators data does actually not have missing values which need declaring, but it means, that his data will get masked for default fillvalue entries (maybe they doesn't know about this, but that might be unlikely). - The creator doesn't care at all, with same conclusion as above. - The creator purposefully uses default fillvalue as missing value, since they use this as a means of saving disk space. But this could also be done, by just defining that as `_FillValue` attribute at creation time, if I`m not mistaken. I'm still convinced this could be fixed for floating point data. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1655569401 https://github.com/pydata/xarray/issues/4826#issuecomment-1497971459,https://api.github.com/repos/pydata/xarray/issues/4826,1497971459,IC_kwDOAMm_X85ZSTsD,5821660,2023-04-05T18:56:23Z,2023-04-05T18:56:23Z,MEMBER,Please check #7720 if that fixes the conversion problems. Thanks.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,789410367 https://github.com/pydata/xarray/issues/7573#issuecomment-1497866542,https://api.github.com/repos/pydata/xarray/issues/7573,1497866542,IC_kwDOAMm_X85ZR6Eu,5821660,2023-04-05T17:31:05Z,2023-04-05T17:31:05Z,MEMBER,"If it helps to minimize interoperability issues I'm all in for the change. One thing I would maybe do is wait for the next version. With the current PR we would end up with two different build numbers with differing behaviour, which might confuse folks. 
But I'd rely on @ocefpaf's expertise.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1603957501 https://github.com/pydata/xarray/pull/7654#issuecomment-1496973403,https://api.github.com/repos/pydata/xarray/issues/7654,1496973403,IC_kwDOAMm_X85ZOgBb,5821660,2023-04-05T06:15:58Z,2023-04-05T06:15:58Z,MEMBER,"As explained I've created two PR (#7719 and #7720) for the ""easy"" changes from this PR. Would be great, if those could go in fast. Thanks!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1633623916 https://github.com/pydata/xarray/pull/7654#issuecomment-1496950962,https://api.github.com/repos/pydata/xarray/issues/7654,1496950962,IC_kwDOAMm_X85ZOaiy,5821660,2023-04-05T05:46:15Z,2023-04-05T05:46:15Z,MEMBER,@dcherian Just a heads-up: I find this PR getting more and more involved at different parts of the machinery and hard to follow for reviewers. I'll split this up and start with the more or less undisputed changes.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1633623916 https://github.com/pydata/xarray/pull/7654#issuecomment-1496044623,https://api.github.com/repos/pydata/xarray/issues/7654,1496044623,IC_kwDOAMm_X85ZK9RP,5821660,2023-04-04T14:10:33Z,2023-04-04T14:10:33Z,MEMBER,"Still hunting for corner cases and issues inside encode_cf_variable/decode_cf_variable. It looks like I already see some light again. Not sure, if this is the last iteration, but the testsuite is still running green with added and enhanced tests, which is not that bad. Unfortunately https://github.com/pydata/xarray/issues/2304 is still an issue for now. 
I'll clarify that later with an added test.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1633623916 https://github.com/pydata/xarray/pull/7654#issuecomment-1493930592,https://api.github.com/repos/pydata/xarray/issues/7654,1493930592,IC_kwDOAMm_X85ZC5Jg,5821660,2023-04-03T08:53:17Z,2023-04-03T08:53:17Z,MEMBER,While trying to create a test which specifically tests `_choose_float_dtype` I've found some issues with checking for availability of scale_factor/add_offset. Now testing for `None`.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1633623916 https://github.com/pydata/xarray/pull/7654#issuecomment-1493296175,https://api.github.com/repos/pydata/xarray/issues/7654,1493296175,IC_kwDOAMm_X85ZAeQv,5821660,2023-04-02T10:47:21Z,2023-04-02T10:47:21Z,MEMBER,"This is now ready for another round of reviews, @dcherian, @Illviljan and @mankoff. As @mankoff already pointed out, xarray is very generous to try to encode/decode non CF conforming data. This makes things a bit complicated as some issues only surface in rare corner cases. I've tried to be as explicit in `_choose_float_dtype`, also added comments/tests where needed. I'm finding the typing a bit hard. It seems that mypy can't derive the correct types from return types in certain cases. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1633623916 https://github.com/pydata/xarray/pull/7654#issuecomment-1493127898,https://api.github.com/repos/pydata/xarray/issues/7654,1493127898,IC_kwDOAMm_X85Y_1La,5821660,2023-04-01T21:23:40Z,2023-04-01T21:23:40Z,MEMBER,"If at first you don't succeed... It looks like we have something working here. 
Some more typing and maybe some more tests covering the cases with scale_factor/add_offset/_FillValue non-conforming CF and we should be good to go. Or am I missing something?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1633623916 https://github.com/pydata/xarray/pull/7654#issuecomment-1493084805,https://api.github.com/repos/pydata/xarray/issues/7654,1493084805,IC_kwDOAMm_X85Y_qqF,5821660,2023-04-01T19:34:18Z,2023-04-01T19:34:18Z,MEMBER,"The latest changes break #1840 again. We have two contradicting forces here, which need to be aligned. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1633623916 https://github.com/pydata/xarray/issues/5597#issuecomment-1492937244,https://api.github.com/repos/pydata/xarray/issues/5597,1492937244,IC_kwDOAMm_X85Y_Goc,5821660,2023-04-01T11:03:02Z,2023-04-01T11:03:02Z,MEMBER," > To fix this, I think logic in `_choose_float_dtype` should be updated to look at `encoding['dtype']` (if available) instead of `dtype`, in order to understand how the data was originally stored. This is aimed at in #7654 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,942738904 https://github.com/pydata/xarray/pull/7654#issuecomment-1492895855,https://api.github.com/repos/pydata/xarray/issues/7654,1492895855,IC_kwDOAMm_X85Y-8hv,5821660,2023-04-01T09:48:57Z,2023-04-01T09:48:57Z,MEMBER,"@Illviljan I'm not able to figure out the typing if I want to use Data-types as functions to convert python numbers to array scalars. 
If you have any suggestion how to fix this, please let me know.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1633623916 https://github.com/pydata/xarray/pull/7654#issuecomment-1492880874,https://api.github.com/repos/pydata/xarray/issues/7654,1492880874,IC_kwDOAMm_X85Y-43q,5821660,2023-04-01T08:46:49Z,2023-04-01T09:28:16Z,MEMBER,"@dcherian @Illviljan Thanks for the first round of review. I've rebased everything on latest main. Now the code moving from `conventions.py` to `coding.variable.py` is correct. I've also removed the functions which have been converted to `VariableCoders` and adapted the tests. To sum up this PR, it does: - convert functions to `VariableCoders` along @shoyer's TODO: https://github.com/pydata/xarray/blob/1c81162755457b3f4dc1f551f0321c75ec9daf6c/xarray/conventions.py#L298-L302 https://github.com/pydata/xarray/blob/1c81162755457b3f4dc1f551f0321c75ec9daf6c/xarray/conventions.py#L393-L405 - preserve boolean dtype within `encoding`: https://github.com/pydata/xarray/issues/7652#issuecomment-1476956975 - determine cf packed dtype from `scale_factor`/`add_offset` #7691, #2304 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1633623916 https://github.com/pydata/xarray/issues/7691#issuecomment-1492078304,https://api.github.com/repos/pydata/xarray/issues/7691,1492078304,IC_kwDOAMm_X85Y707g,5821660,2023-03-31T15:05:17Z,2023-03-31T15:05:17Z,MEMBER,"> , the PR seems to solve my specific issue without changing the encoding Great, thanks for testing. 
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1643408278 https://github.com/pydata/xarray/issues/7691#issuecomment-1491915288,https://api.github.com/repos/pydata/xarray/issues/7691,1491915288,IC_kwDOAMm_X85Y7NIY,5821660,2023-03-31T13:19:01Z,2023-03-31T13:19:01Z,MEMBER,"@euronion There is a potential fix for your issue in #7654. It would be great, if you could have a closer look and test against that PR.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1643408278 https://github.com/pydata/xarray/pull/7654#issuecomment-1491760266,https://api.github.com/repos/pydata/xarray/issues/7654,1491760266,IC_kwDOAMm_X85Y6nSK,5821660,2023-03-31T11:13:49Z,2023-03-31T11:13:49Z,MEMBER,"@dcherian @basnijholt After the dev-meeting I've taken a step back and first implemented the coders as mentioned in @shoyer's ToDo. I've fixed the one bool->int issue and it now derives the dtype for ScaleOffset coding from scale_factor add_offset. I've improved some test with regard to the scale/offset issue. I'll concentrate on the string fillvalue issues in a follow up PR. 
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1633623916 https://github.com/pydata/xarray/issues/7691#issuecomment-1486870845,https://api.github.com/repos/pydata/xarray/issues/7691,1486870845,IC_kwDOAMm_X85Yn9k9,5821660,2023-03-28T13:16:31Z,2023-03-28T13:31:46Z,MEMBER,"MCVE: ```python fname = ""test-7691.nc"" import netCDF4 as nc with nc.Dataset(fname, ""w"") as ds0: ds0.createDimension(""t"", 5) ds0.createVariable(""x"", ""int16"", (""t"",), fill_value=-32767) v = ds0.variables[""x""] v.set_auto_maskandscale(False) v.add_offset = 278.297319296597 v.scale_factor = 1.16753614203674e-05 v[:] = np.array([-32768, -32767, -32766, 32767, 0]) with nc.Dataset(fname) as ds1: x1 = ds1[""x""][:] print(""netCDF4-python:"", x1.dtype, x1) with xr.open_dataset(fname) as ds2: x2 = ds2[""x""].values ds2.to_netcdf(""test-7691-01.nc"") print(""xarray first read:"", x2.dtype, x2) with xr.open_dataset(""test-7691-01.nc"") as ds3: x3 = ds3[""x""].values print(""xarray roundtrip:"", x3.dtype, x3) ``` ```python netCDF4-python: float64 [277.9147410535744 -- 277.9147644042972 278.67988586425815 278.297319296597] xarray first read: float32 [277.91476 nan 277.91476 278.6799 278.29733] xarray roundtrip: float32 [ nan nan nan 278.6799 278.29733] ``` I've confirmed that correctly promoting to `float64` in `CFMaskCoder` solves this issue. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1643408278 https://github.com/pydata/xarray/issues/7691#issuecomment-1486817329,https://api.github.com/repos/pydata/xarray/issues/7691,1486817329,IC_kwDOAMm_X85Ynwgx,5821660,2023-03-28T12:41:43Z,2023-03-28T12:41:43Z,MEMBER,"> As this doesn't surface that often it might just happen here by accident. If the `_FillValue`/`missing_value` would be `-32768` then the issue would not manifest. 
So for NetCDF the default fillvalue for NC_SHORT (`int16`) is `-32767`. That means the promotion to `float32` instead of the needed `float64` is the problem here (floating point precision).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1643408278