html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/2857#issuecomment-807126680,https://api.github.com/repos/pydata/xarray/issues/2857,807126680,MDEyOklzc3VlQ29tbWVudDgwNzEyNjY4MA==,2418513,2021-03-25T17:17:48Z,2021-03-25T17:18:21Z,NONE,"> OK, we might check if that depends on the data size or on the number of groups, or both.
It seems to scale with data size, but even if you reduce the data size to 1 element, a single write already takes ~150 ms after 50 iterations (whereas it's a few milliseconds in an empty file). Those 150 ms are the pure 'file traversal' part; the rest (of the 2 seconds) is where it seemingly reads data, which scales with data size. Ideally it should just stay at <10 ms the whole time.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,427410885
https://github.com/pydata/xarray/issues/2857#issuecomment-806863336,https://api.github.com/repos/pydata/xarray/issues/2857,806863336,MDEyOklzc3VlQ29tbWVudDgwNjg2MzMzNg==,2418513,2021-03-25T14:35:28Z,2021-03-25T17:15:06Z,NONE,"> I wonder if it would help to use the same underlying `h5py.File` or `h5netcdf.File` when appending.
I don't think it's about what's happening in the current Python process (which instances are cached or not); it's about the general logic.
For instance, take the example above: run it once (e.g. with the range set to 50), then run it again with the file-clearing block commented out and the range set to 50-100. The very first dataset written on the second run is already very slow - slower than the last dataset written on the first run - which means it's not about reusing the same `File` instance.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,427410885
https://github.com/pydata/xarray/issues/2857#issuecomment-806776909,https://api.github.com/repos/pydata/xarray/issues/2857,806776909,MDEyOklzc3VlQ29tbWVudDgwNjc3NjkwOQ==,2418513,2021-03-25T13:48:04Z,2021-03-25T13:48:29Z,NONE,"Without digging into implementation details, my logic as a library user would be this:
- If I write one dataset to file1 and another dataset to file2 using to_netcdf(), to different groups
- And then I simply combine the two hdf5 files using some external tools (again, datasets stored in different groups)
- I will be able to read them both perfectly well using `open_dataset()` or `load_dataset()`
- This implies that the datasets can be written just fine independently without knowing about each other
- Why, then, do the writing functions (flush in particular) traverse and read the entire file every time?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,427410885
https://github.com/pydata/xarray/issues/2857#issuecomment-806767981,https://api.github.com/repos/pydata/xarray/issues/2857,806767981,MDEyOklzc3VlQ29tbWVudDgwNjc2Nzk4MQ==,2418513,2021-03-25T13:44:22Z,2021-03-25T13:45:04Z,NONE,"Just checked it out.
| Number of datasets in file | netCDF4 (ms/write) | h5netcdf (ms/write) |
| --- | --- | --- |
| 1 | 4 | 11 |
| 250 | 142 | 1933 |","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,427410885
https://github.com/pydata/xarray/issues/2857#issuecomment-806740965,https://api.github.com/repos/pydata/xarray/issues/2857,806740965,MDEyOklzc3VlQ29tbWVudDgwNjc0MDk2NQ==,2418513,2021-03-25T13:27:17Z,2021-03-25T13:27:17Z,NONE,"Here's a minimal example; try running this:
```python
import time
import xarray as xr
import numpy as np
import h5py
# 50,000 x 3 array of random ints, seeded for reproducibility
arr = xr.DataArray(np.random.RandomState(0).randint(-100, 100, (50_000, 3)), dims=['x', 'y'])
ds = xr.Dataset({'arr': arr})
filename = 'test.h5'
# append the dataset into its own group of the file
save = lambda group: ds.to_netcdf(filename, engine='h5netcdf', mode='a', group=str(group))
# start from an empty file
with h5py.File(filename, 'w') as _:
    pass
for i in range(250):
    t0 = time.time()
    save(i)
    # each write should take roughly constant time, but it keeps growing
    print(time.time() - t0)
```","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,427410885
https://github.com/pydata/xarray/issues/2857#issuecomment-806713825,https://api.github.com/repos/pydata/xarray/issues/2857,806713825,MDEyOklzc3VlQ29tbWVudDgwNjcxMzgyNQ==,2418513,2021-03-25T13:10:13Z,2021-03-25T13:10:13Z,NONE,"Is it possible to use `.to_netcdf()` without `h5netcdf.File` touching **any** of the pre-existing data, or attempting to read or traverse it? Touching it all inevitably causes quadratic slowdowns as you write multiple datasets to the file - and that's what seems to be happening.
Or at least, don't traverse anything above the current root group that the dataset is being written into.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,427410885
https://github.com/pydata/xarray/issues/2857#issuecomment-806711702,https://api.github.com/repos/pydata/xarray/issues/2857,806711702,MDEyOklzc3VlQ29tbWVudDgwNjcxMTcwMg==,2418513,2021-03-25T13:08:46Z,2021-03-25T13:08:46Z,NONE,"@kmuehlbauer Just installed h5netcdf=0.10.0; here are the timings when there are 200 groups in the file - `store.close()` takes 92.6% of the time again:
```
1078 1 1.0 1.0 0.0 try:
1079 # TODO: allow this work (setting up the file for writing array data)
1080 # to be parallelized with dask
1081 2 221642.0 110821.0 4.2 dump_to_store(
1082 1 2.0 2.0 0.0 dataset, store, writer, encoding=encoding, unlimited_dims=unlimited_dims
1083 )
1084 1 3.0 3.0 0.0 if autoclose:
1085 store.close()
1086
1087 1 1.0 1.0 0.0 if multifile:
1088 return writer, store
1089
1090 1 6.0 6.0 0.0 writes = writer.sync(compute=compute)
1091
1092 1 1.0 1.0 0.0 if path_or_file is None:
1093 store.sync()
1094 return target.getvalue()
1095 finally:
1096 1 2.0 2.0 0.0 if not multifile and compute:
1097 1 4857912.0 4857912.0 92.6 store.close()
```
And here's `_lookup_dimensions()` (note that it accounts for only **half** of the time; there's a ton of other time spent in `File.flush()` that I don't understand):
```
Timer unit: 1e-06 s
Total time: 2.44857 s
File: .../python3.8/site-packages/h5netcdf/core.py
Function: _lookup_dimensions at line 92
Line # Hits Time Per Hit % Time Line Contents
==============================================================
92 def _lookup_dimensions(self):
93 400 65513.0 163.8 2.7 attrs = self._h5ds.attrs
94 400 6175.0 15.4 0.3 if ""_Netcdf4Coordinates"" in attrs:
95 order_dim = _reverse_dict(self._parent._dim_order)
96 return tuple(
97 order_dim[coord_id] for coord_id in attrs[""_Netcdf4Coordinates""]
98 )
99
100 400 44938.0 112.3 1.8 child_name = self.name.split(""/"")[-1]
101 400 5006.0 12.5 0.2 if child_name in self._parent.dimensions:
102 return (child_name,)
103
104 400 350.0 0.9 0.0 dims = []
105 400 781.0 2.0 0.0 phony_dims = defaultdict(int)
106 1400 166093.0 118.6 6.8 for axis, dim in enumerate(self._h5ds.dims):
107 # get current dimension
108 1000 119507.0 119.5 4.9 dimsize = self.shape[axis]
109 1000 2459.0 2.5 0.1 phony_dims[dimsize] += 1
110 1000 34345.0 34.3 1.4 if len(dim):
111 1000 2001071.0 2001.1 81.7 name = _name_from_dimension(dim)
112 else:
113 # if unlabeled dimensions are found
114 if self._root._phony_dims_mode is None:
115 raise ValueError(
116 ""variable %r has no dimension scale ""
117 ""associated with axis %s. \n""
118 ""Use phony_dims=%r for sorted naming or ""
119 ""phony_dims=%r for per access naming.""
120 % (self.name, axis, ""sort"", ""access"")
121 )
122 else:
123 # get dimension name
124 name = self._parent._phony_dims[(dimsize, phony_dims[dimsize] - 1)]
125 1000 1820.0 1.8 0.1 dims.append(name)
126 400 512.0 1.3 0.0 return tuple(dims)
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,427410885
https://github.com/pydata/xarray/issues/2857#issuecomment-806680140,https://api.github.com/repos/pydata/xarray/issues/2857,806680140,MDEyOklzc3VlQ29tbWVudDgwNjY4MDE0MA==,2418513,2021-03-25T12:48:23Z,2021-03-25T12:49:19Z,NONE,"There are some absolutely obscure things here, e.g. `h5netcdf.core.BaseVariable._lookup_dimensions`:
For 0 datasets:
```
Timer unit: 1e-06 s
Total time: 0.005034 s
File: .../python3.8/site-packages/h5netcdf/core.py
Function: _lookup_dimensions at line 86
Line # Hits Time Per Hit % Time Line Contents
==============================================================
86 def _lookup_dimensions(self):
87 2 633.0 316.5 12.6 attrs = self._h5ds.attrs
88 2 53.0 26.5 1.1 if '_Netcdf4Coordinates' in attrs:
89 order_dim = _reverse_dict(self._parent._dim_order)
90 return tuple(order_dim[coord_id]
91 for coord_id in attrs['_Netcdf4Coordinates'])
92
93 2 471.0 235.5 9.4 child_name = self.name.split('/')[-1]
94 2 51.0 25.5 1.0 if child_name in self._parent.dimensions:
95 return (child_name,)
96
97 2 4.0 2.0 0.1 dims = []
98 7 1671.0 238.7 33.2 for axis, dim in enumerate(self._h5ds.dims):
99 # TODO: read dimension labels even if there is no associated
100 # scale? it's not netCDF4 spec, but it is unambiguous...
101 # Also: the netCDF lib can read HDF5 datasets with unlabeled
102 # dimensions.
103 5 355.0 71.0 7.1 if len(dim) == 0:
104 raise ValueError('variable %r has no dimension scale '
105 'associated with axis %s'
106 % (self.name, axis))
107 5 1772.0 354.4 35.2 name = _name_from_dimension(dim)
108 5 18.0 3.6 0.4 dims.append(name)
109 2 6.0 3.0 0.1 return tuple(dims)
```
For 200 datasets:
```
Timer unit: 1e-06 s
Total time: 2.34179 s
File: .../python3.8/site-packages/h5netcdf/core.py
Function: _lookup_dimensions at line 86
Line # Hits Time Per Hit % Time Line Contents
==============================================================
86 def _lookup_dimensions(self):
87 400 66185.0 165.5 2.8 attrs = self._h5ds.attrs
88 400 6106.0 15.3 0.3 if '_Netcdf4Coordinates' in attrs:
89 order_dim = _reverse_dict(self._parent._dim_order)
90 return tuple(order_dim[coord_id]
91 for coord_id in attrs['_Netcdf4Coordinates'])
92
93 400 45176.0 112.9 1.9 child_name = self.name.split('/')[-1]
94 400 5006.0 12.5 0.2 if child_name in self._parent.dimensions:
95 return (child_name,)
96
97 400 317.0 0.8 0.0 dims = []
98 1400 168708.0 120.5 7.2 for axis, dim in enumerate(self._h5ds.dims):
99 # TODO: read dimension labels even if there is no associated
100 # scale? it's not netCDF4 spec, but it is unambiguous...
101 # Also: the netCDF lib can read HDF5 datasets with unlabeled
102 # dimensions.
103 1000 35653.0 35.7 1.5 if len(dim) == 0:
104 raise ValueError('variable %r has no dimension scale '
105 'associated with axis %s'
106 % (self.name, axis))
107 1000 2012597.0 2012.6 85.9 name = _name_from_dimension(dim)
108 1000 1640.0 1.6 0.1 dims.append(name)
109 400 400.0 1.0 0.0 return tuple(dims)
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,427410885
https://github.com/pydata/xarray/issues/2857#issuecomment-806667029,https://api.github.com/repos/pydata/xarray/issues/2857,806667029,MDEyOklzc3VlQ29tbWVudDgwNjY2NzAyOQ==,2418513,2021-03-25T12:40:18Z,2021-03-25T12:49:00Z,NONE,"- All of the time in `store.close()` is, in turn, spent in `CachingFileManager.close()`
- That time is spent in `h5netcdf.File.close()`
- All of which is spent in `h5netcdf.File.flush()`
`h5netcdf.File.flush()` when there are 0 datasets in the file:
```
Timer unit: 1e-06 s
Total time: 0.006862 s
File: .../python3.8/site-packages/h5netcdf/core.py
Function: flush at line 689
Line # Hits Time Per Hit % Time Line Contents
==============================================================
689 def flush(self):
690 1 4.0 4.0 0.1 if 'r' not in self._mode:
691 1 111.0 111.0 1.6 self._set_unassigned_dimension_ids()
692 1 3521.0 3521.0 51.3 self._create_dim_scales()
693 1 3224.0 3224.0 47.0 self._attach_dim_scales()
694 1 2.0 2.0 0.0 if not self._preexisting_file and self._write_ncproperties:
695 self.attrs._h5attrs['_NCProperties'] = _NC_PROPERTIES
```
`h5netcdf.File.flush()` when there are 200 datasets in the file (**~660 times slower**):
```
Timer unit: 1e-06 s
Total time: 4.55295 s
File: .../python3.8/site-packages/h5netcdf/core.py
Function: flush at line 689
Line # Hits Time Per Hit % Time Line Contents
==============================================================
689 def flush(self):
690 1 3.0 3.0 0.0 if 'r' not in self._mode:
691 1 1148237.0 1148237.0 25.2 self._set_unassigned_dimension_ids()
692 1 462926.0 462926.0 10.2 self._create_dim_scales()
693 1 2941779.0 2941779.0 64.6 self._attach_dim_scales()
694 1 2.0 2.0 0.0 if not self._preexisting_file and self._write_ncproperties:
695 self.attrs._h5attrs['_NCProperties'] = _NC_PROPERTIES
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,427410885
https://github.com/pydata/xarray/issues/2857#issuecomment-806651823,https://api.github.com/repos/pydata/xarray/issues/2857,806651823,MDEyOklzc3VlQ29tbWVudDgwNjY1MTgyMw==,2418513,2021-03-25T12:30:39Z,2021-03-25T12:46:26Z,NONE,"@shoyer This problem has persisted all this time, but since I ran into it again, I did a bit of digging. (It's strange that no one else has noticed it so far, as it's pretty bad.)
I've line-profiled this snippet for various numbers of datasets already written to the file (`xarray.backends.api.to_netcdf`):
https://github.com/pydata/xarray/blob/8452120e52862df564a6e629d1ab5a7d392853b0/xarray/backends/api.py#L1075-L1094
| Number of datasets in file | `dump_to_store()` | `store_open()` | `store.close()` |
| --- | --- | --- | --- |
| 0 | 88% | 1% | 10% |
| 50 | 18% | 2% | 80% |
| 200 | 4% | 2% | 94% |
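For reference, here's roughly the kind of `test_func` being profiled (a simplified stand-in for my actual writing code; compression settings omitted):
```python
import numpy as np
import xarray as xr

ds = xr.Dataset({'arr': xr.DataArray(np.zeros((1000, 3)), dims=['x', 'y'])})
filename = 'test.h5'

def test_func(group=0):
    # append the dataset into its own group of the file
    ds.to_netcdf(filename, engine='h5netcdf', mode='a', group=str(group))
```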
The above can be measured simply in a notebook via `%lprun -f xarray.backends.api.to_netcdf test_func()`. The writing was done in `mode='a'`, with blosc:zstd compression. All datasets are written into *different groups* (i.e. by passing `group=...`).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,427410885
https://github.com/pydata/xarray/pull/4684#issuecomment-745366696,https://api.github.com/repos/pydata/xarray/issues/4684,745366696,MDEyOklzc3VlQ29tbWVudDc0NTM2NjY5Ng==,2418513,2020-12-15T15:29:30Z,2020-12-15T15:29:30Z,NONE,"Looks great, thanks! Do I understand this correctly - you won't have to specify `encoding` manually, as `int64` encoding will be picked by default for `M8[ns]` dtype?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,764440458
https://github.com/pydata/xarray/issues/4045#issuecomment-735851973,https://api.github.com/repos/pydata/xarray/issues/4045,735851973,MDEyOklzc3VlQ29tbWVudDczNTg1MTk3Mw==,2418513,2020-11-30T15:22:09Z,2020-11-30T15:22:09Z,NONE,"> Can we use the encoding[""dtype""] field to solve this? i.e. use int64 when encoding[""dtype""] is not set and use the specified value when available?
I think a lot of the logic needs to be reshuffled, because as of right now it complains along the lines of ""you can't store a float64 in int64"" when you try this with a nanosecond timestamp.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,614275938
https://github.com/pydata/xarray/issues/4045#issuecomment-735849936,https://api.github.com/repos/pydata/xarray/issues/4045,735849936,MDEyOklzc3VlQ29tbWVudDczNTg0OTkzNg==,2418513,2020-11-30T15:18:55Z,2020-11-30T15:21:02Z,NONE,"> In principle we should be able to handle this (contributions are welcome)
I don't mind contributing, but not knowing the netcdf stuff inside out, I'm not sure I have a clear vision of the proper way to do it. My use case is very simple: I have an in-memory xr.Dataset that I want to save() and then load() without losses.
Should it just be an `xr.save(..., m8=True)` (or whatever that flag would be called), so that all of numpy's `M8[...]` and `m8[...]` dtypes are serialized transparently (as int64, that is) without passing through the whole cftime pipeline? It would then be nice, of course, if `xr.load` were also aware of this convention (via some special attribute or otherwise) and could convert them back, e.g. via `.view('M8[ns]')`, when loading. I also think xarray should throw an exception if it detects nanosecond-precision timestamps/timedeltas that it can't serialize without going through the int-float-int routine (or automatically fall back to this transparent but netcdf-incompatible mode).
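Roughly what I have in mind, as a hand-rolled sketch (the `_m8_unit` marker attribute is hypothetical, not an existing convention):
```python
import numpy as np
import xarray as xr

ds = xr.Dataset({'t': ('x', np.array([0, 1, 2], dtype='M8[ns]'))})

# save: view the nanosecond timestamps as plain int64 (lossless)
ds_int = ds.copy()
ds_int['t'] = ('x', ds['t'].values.view('int64'))
ds_int['t'].attrs['_m8_unit'] = 'ns'  # hypothetical marker attribute
ds_int.to_netcdf('test.nc')

# load: convert back based on the marker
loaded = xr.load_dataset('test.nc')
t = loaded['t'].values.view('M8[ns]')  # bit-exact round trip
```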
Maybe this is not the proper way to do it - ideas welcome. (There's also an open PR - #4400 - mind checking that out?)","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,614275938
https://github.com/pydata/xarray/pull/4400#issuecomment-735777126,https://api.github.com/repos/pydata/xarray/issues/4400,735777126,MDEyOklzc3VlQ29tbWVudDczNTc3NzEyNg==,2418513,2020-11-30T13:12:47Z,2020-11-30T13:12:47Z,NONE,"Yeah, well, in this case it's not about Python... the `M8[ns]` datatype is simply an `int64` underneath - why not just store it as that? No bells and whistles required, no corruption possible, no funky conversions.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,690546795
https://github.com/pydata/xarray/pull/4400#issuecomment-735431187,https://api.github.com/repos/pydata/xarray/issues/4400,735431187,MDEyOklzc3VlQ29tbWVudDczNTQzMTE4Nw==,2418513,2020-11-29T17:52:37Z,2020-11-29T17:52:37Z,NONE,"I'm working on an application where nanosecond resolution is critical, and it took me days to figure out why my timestamps were all scrambled or off by 1 after writing them with xarray and then reading them back... I would much rather it threw an exception instead of silently corrupting the data.
Non-standard netcdf or not, if it were possible to just store them as plain int64s and read them back as-is, that would help a ton...","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,690546795
https://github.com/pydata/xarray/pull/4400#issuecomment-735430231,https://api.github.com/repos/pydata/xarray/issues/4400,735430231,MDEyOklzc3VlQ29tbWVudDczNTQzMDIzMQ==,2418513,2020-11-29T17:45:14Z,2020-11-29T17:45:14Z,NONE,"I think netcdf lists ""nanoseconds"" as a valid unit though?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,690546795
https://github.com/pydata/xarray/pull/4400#issuecomment-734963454,https://api.github.com/repos/pydata/xarray/issues/4400,734963454,MDEyOklzc3VlQ29tbWVudDczNDk2MzQ1NA==,2418513,2020-11-27T19:38:47Z,2020-11-27T19:38:47Z,NONE,But the test already passes (i.e. you can at least do `.encoding={... 'nanoseconds'}` and avoid the float conversion)?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,690546795
https://github.com/pydata/xarray/pull/4400#issuecomment-734962866,https://api.github.com/repos/pydata/xarray/issues/4400,734962866,MDEyOklzc3VlQ29tbWVudDczNDk2Mjg2Ng==,2418513,2020-11-27T19:36:02Z,2020-11-27T19:36:02Z,NONE,"Oh, that requires `cftime._cftime` support first? :/","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,690546795
https://github.com/pydata/xarray/pull/4400#issuecomment-734962563,https://api.github.com/repos/pydata/xarray/issues/4400,734962563,MDEyOklzc3VlQ29tbWVudDczNDk2MjU2Mw==,2418513,2020-11-27T19:34:48Z,2020-11-27T19:34:48Z,NONE,Is there anything preventing this from being merged?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,690546795
https://github.com/pydata/xarray/issues/4045#issuecomment-734951187,https://api.github.com/repos/pydata/xarray/issues/4045,734951187,MDEyOklzc3VlQ29tbWVudDczNDk1MTE4Nw==,2418513,2020-11-27T18:47:26Z,2020-11-27T18:51:00Z,NONE,"Just stumbled upon this as well. Internally, `datetime64[ns]` is simply an 8-byte int. Why on earth would it be serialized in a lossy way as a float64?...
Simply telling it to `encoding={...: {'dtype': 'int64'}}` won't work, since it then complains about serializing a float as an int.
Is there a way out of this, other than not using `M8[ns]` dtypes at all with xarray?
This is a huge issue, as anyone using nanosecond-precision timestamps with xarray would unknowingly and silently read wrong data after deserializing.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,614275938
https://github.com/pydata/xarray/issues/1626#issuecomment-687267764,https://api.github.com/repos/pydata/xarray/issues/1626,687267764,MDEyOklzc3VlQ29tbWVudDY4NzI2Nzc2NA==,2418513,2020-09-04T16:55:48Z,2020-09-04T16:55:48Z,NONE,"This is an ancient issue, but still - has anyone here managed to hack together a workaround?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,264582338
https://github.com/pydata/xarray/pull/3703#issuecomment-575835942,https://api.github.com/repos/pydata/xarray/issues/3703,575835942,MDEyOklzc3VlQ29tbWVudDU3NTgzNTk0Mg==,2418513,2020-01-17T23:39:39Z,2020-01-17T23:39:39Z,NONE,"Wondering, would it be possible to release a minor version with this stuff anytime soon, or is the plan to wait for the next big 0.15?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,551532886
https://github.com/pydata/xarray/pull/3703#issuecomment-575835720,https://api.github.com/repos/pydata/xarray/issues/3703,575835720,MDEyOklzc3VlQ29tbWVudDU3NTgzNTcyMA==,2418513,2020-01-17T23:38:20Z,2020-01-17T23:38:20Z,NONE,Thanks a million!,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,551532886
https://github.com/pydata/xarray/issues/3695#issuecomment-575371718,https://api.github.com/repos/pydata/xarray/issues/3695,575371718,MDEyOklzc3VlQ29tbWVudDU3NTM3MTcxOA==,2418513,2020-01-16T22:13:55Z,2020-01-16T22:13:55Z,NONE,Any thoughts?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,549712566
https://github.com/pydata/xarray/issues/3695#issuecomment-574555353,https://api.github.com/repos/pydata/xarray/issues/3695,574555353,MDEyOklzc3VlQ29tbWVudDU3NDU1NTM1Mw==,2418513,2020-01-15T08:43:10Z,2020-01-15T08:43:10Z,NONE,https://mypy.readthedocs.io/en/latest/command_line.html#cmdoption-mypy-no-implicit-reexport,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,549712566
https://github.com/pydata/xarray/issues/277#issuecomment-491231541,https://api.github.com/repos/pydata/xarray/issues/277,491231541,MDEyOklzc3VlQ29tbWVudDQ5MTIzMTU0MQ==,2418513,2019-05-10T09:52:35Z,2019-05-10T09:53:36Z,NONE,"It might then also make sense to implement all the numpy-like constructors for `DataArray`, plus `empty()`, which is typically faster for large arrays (see the sketch after the list):
- `.full()` (kind of what's suggested here)
- `.ones()`
- `.zeros()`
- `.empty()`
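For illustration, a minimal sketch of what one of these could look like (`data_array_full` here is a hypothetical free function, not existing API):
```python
import numpy as np
import xarray as xr

def data_array_full(sizes, fill_value, dtype=None, **kwargs):
    # would-be DataArray.full(); .ones()/.zeros()/.empty()
    # would just swap in np.ones/np.zeros/np.empty for np.full
    dims, shape = zip(*sizes.items())
    return xr.DataArray(np.full(shape, fill_value, dtype=dtype), dims=list(dims), **kwargs)

arr = data_array_full({'x': 100, 'y': 3}, np.nan)
```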
This should be trivial to implement.","{""total_count"": 9, ""+1"": 9, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,48301141
https://github.com/pydata/xarray/issues/1603#issuecomment-491229992,https://api.github.com/repos/pydata/xarray/issues/1603,491229992,MDEyOklzc3VlQ29tbWVudDQ5MTIyOTk5Mg==,2418513,2019-05-10T09:47:39Z,2019-05-10T09:47:39Z,NONE,"There are now a good few dozen issues that reference this PR.
Wondering if there's any particular help needed (coding, discussion, or otherwise) to speed it up and unblock those issues?
(I'm personally interested in resolving problems like #934 - allowing selection on non-dim coords - which seems to be a major hassle for a lot of use cases.)","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,262642978
https://github.com/pydata/xarray/issues/2836#issuecomment-475605323,https://api.github.com/repos/pydata/xarray/issues/2836,475605323,MDEyOklzc3VlQ29tbWVudDQ3NTYwNTMyMw==,2418513,2019-03-22T12:36:48Z,2019-03-22T12:36:48Z,NONE,"> Ooh I missed that too! This probably wont serialize well to netcdf, would it?
Probably not, with n-d attrs? It would serialize just fine to plain HDF5, though...","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,423749397
https://github.com/pydata/xarray/issues/2837#issuecomment-475284043,https://api.github.com/repos/pydata/xarray/issues/2837,475284043,MDEyOklzc3VlQ29tbWVudDQ3NTI4NDA0Mw==,2418513,2019-03-21T15:43:56Z,2019-03-21T15:58:23Z,NONE,"> matplotlib only knows about numpy arrays so plt.plot(arr, ...) will act like plt.plot(arr.values, ...) by design.
How does it (matplotlib) preserve the Series index, then?
> `style` is pandas-only kwarg (xarray lightly wraps matplotlib)
Would it make sense to make it (the DataArray plotting interface) a bit more pandas-compatible by supporting `style`, given that it copies pandas syntax like `arr.plot.line()` anyway?
Also, if `plot()` is meant to be a thin wrapper around matplotlib, it should support positional arguments: you can do `plt.plot(x, y, '.-')` just fine, but `da.plot('.-')` fails, complaining about unexpected positional arguments.
Currently, neither of the two options above works, making the DataArray plot interface inferior to both raw matplotlib and pandas.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,423774214
https://github.com/pydata/xarray/issues/2837#issuecomment-475289244,https://api.github.com/repos/pydata/xarray/issues/2837,475289244,MDEyOklzc3VlQ29tbWVudDQ3NTI4OTI0NA==,2418513,2019-03-21T15:55:13Z,2019-03-21T15:55:13Z,NONE,"> I think it plots assuming that the index is [0:len(da.values)].
Nope - it plots a datetime index just fine.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,423774214
https://github.com/pydata/xarray/issues/2836#issuecomment-475285050,https://api.github.com/repos/pydata/xarray/issues/2836,475285050,MDEyOklzc3VlQ29tbWVudDQ3NTI4NTA1MA==,2418513,2019-03-21T15:46:13Z,2019-03-21T15:46:13Z,NONE,"I could try; what's the most stable way to check equality? Do we want to enforce that the types are the same, that shape/ndim are the same (dtypes?), plus element-wise comparison? What if one is a DataArray and the other a numpy array?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,423749397
https://github.com/pydata/xarray/issues/2836#issuecomment-475264613,https://api.github.com/repos/pydata/xarray/issues/2836,475264613,MDEyOklzc3VlQ29tbWVudDQ3NTI2NDYxMw==,2418513,2019-03-21T14:59:28Z,2019-03-21T14:59:28Z,NONE,"@dcherian In the second example that fails, the attr in question is 1-D - aren't one-dimensional attributes fine?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,423749397
https://github.com/pydata/xarray/issues/2825#issuecomment-474909166,https://api.github.com/repos/pydata/xarray/issues/2825,474909166,MDEyOklzc3VlQ29tbWVudDQ3NDkwOTE2Ng==,2418513,2019-03-20T16:16:43Z,2019-03-20T16:16:43Z,NONE,"IIRC the workaround is to use a slice with neighbouring dates, which is unintuitive and ugly.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,423023519
https://github.com/pydata/xarray/issues/2825#issuecomment-474908707,https://api.github.com/repos/pydata/xarray/issues/2825,474908707,MDEyOklzc3VlQ29tbWVudDQ3NDkwODcwNw==,2418513,2019-03-20T16:15:47Z,2019-03-20T16:15:47Z,NONE,Oh God! Classic pandas...,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,423023519
https://github.com/pydata/xarray/issues/2170#issuecomment-474786687,https://api.github.com/repos/pydata/xarray/issues/2170,474786687,MDEyOklzc3VlQ29tbWVudDQ3NDc4NjY4Nw==,2418513,2019-03-20T11:13:40Z,2019-03-20T11:13:40Z,NONE,"Please!
It's really painful in the cases where the `keepdims` option is not available; tons of unneeded boilerplate is required to mimic the same thing.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,325436508
https://github.com/pydata/xarray/issues/2824#issuecomment-474654983,https://api.github.com/repos/pydata/xarray/issues/2824,474654983,MDEyOklzc3VlQ29tbWVudDQ3NDY1NDk4Mw==,2418513,2019-03-20T02:05:55Z,2019-03-20T02:05:55Z,NONE,"I guess I expected it to “just work”, since it’s part of numpy core functionality (the same way you can just pass a recarray to the pandas DataFrame constructor and it infers the rest, without having to create a dict of columns manually - there’s only one way to do it, so it can be done automatically).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,423016453
https://github.com/pydata/xarray/issues/1434#issuecomment-474637401,https://api.github.com/repos/pydata/xarray/issues/1434,474637401,MDEyOklzc3VlQ29tbWVudDQ3NDYzNzQwMQ==,2418513,2019-03-20T00:34:12Z,2019-03-20T00:34:12Z,NONE,"Looks like this is still a problem; just tested on 0.11.3 and it still results in `object`...","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,232350436