
issue_comments


73 rows where author_association = "CONTRIBUTOR" and user = 500246 sorted by updated_at descending




issue 27

  • Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 9
  • Segmentation fault reading many groups from many files 7
  • Hooks for custom attribute handling in xarray operations 6
  • da.plot.pcolormesh fails when there is a datetime coordinate 6
  • ds.notnull() fails with AttributeError on pandas 0.21.0rc1 6
  • Use masked arrays while preserving int 4
  • Document the new __repr__ 3
  • Many methods are broken (e.g., concat/stack/sortby) when using repeated dimensions 3
  • `set_index` converts string-dtype to object-dtype 2
  • Cannot use xarrays own times for indexing 2
  • Encoding lost upon concatenation 2
  • BUG/TST: Retain encoding upon concatenation 2
  • Cannot open NetCDF file if dimension with time coordinate has length 0 (`ValueError` when decoding CF datetime) 2
  • BUG: Allow unsigned integer indexing, fixes #1405 2
  • Respect PEP 440 2
  • Comparing scalar xarray with ma.masked fails with ValueError: assignment destination is read-only 2
  • Cache root netCDF4.Dataset objects instead of groups 2
  • Handle scale_factor and add_offset as scalar 2
  • Save to netCDF with record dimension? 1
  • Towards a (temporary?) workaround for datetime issues at the xarray-level 1
  • opening NetCDF file fails with ValueError when time variable is multidimensional 1
  • `where` grows new dimensions for unrelated variables 1
  • Rules for propagating attrs and encoding 1
  • Comparison with masked array yields object-array with nans for masked values 1
  • passing unlimited_dims to to_netcdf triggers RuntimeError: NetCDF: Invalid argument 1
  • Context manager `AttributeError` when engine='h5netcdf' 1
  • Lazy saving to NetCDF4 fails randomly if an array is used multiple times 1

user 1

  • gerritholl · 73

author_association 1

  • CONTRIBUTOR · 73
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1275687277 https://github.com/pydata/xarray/issues/6300#issuecomment-1275687277 https://api.github.com/repos/pydata/xarray/issues/6300 IC_kwDOAMm_X85MCXFt gerritholl 500246 2022-10-12T07:03:09Z 2022-10-12T07:03:09Z CONTRIBUTOR

I experience the same problem under the same circumstances. My versions:

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:35:26) [GCC 10.4.0]
python-bits: 64
OS: Linux
OS-release: 4.18.0-305.12.1.el8_4.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.8.1

xarray: 0.19.0
pandas: 1.5.0
numpy: 1.23.3
scipy: 1.9.1
netCDF4: 1.6.1
pydap: None
h5netcdf: None
h5py: 3.7.0
Nio: None
zarr: 2.13.3
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.3.2
cfgrib: None
iris: None
bottleneck: None
dask: 2021.12.0
distributed: 2022.9.2
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 65.4.1
pip: 22.2.2
conda: None
pytest: None
IPython: 8.5.0
sphinx: None
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Lazy saving to NetCDF4 fails randomly if an array is used multiple times 1149364539
704161345 https://github.com/pydata/xarray/pull/4485#issuecomment-704161345 https://api.github.com/repos/pydata/xarray/issues/4485 MDEyOklzc3VlQ29tbWVudDcwNDE2MTM0NQ== gerritholl 500246 2020-10-06T09:54:19Z 2020-10-06T09:54:19Z CONTRIBUTOR

If this makes more sense as an integration test than as a unit test (for which I need help, see other comment), should I mark the current test in some way and/or move it to a different source file?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Handle scale_factor and add_offset as scalar 714844298
703635258 https://github.com/pydata/xarray/pull/4485#issuecomment-703635258 https://api.github.com/repos/pydata/xarray/issues/4485 MDEyOklzc3VlQ29tbWVudDcwMzYzNTI1OA== gerritholl 500246 2020-10-05T13:33:38Z 2020-10-05T13:33:38Z CONTRIBUTOR

Is this bugfix notable enough to need a whats-new.rst entry?

For the unit test, I tried to construct an object that would emulate what is produced when reading a NetCDF4 file with the h5netcdf engine, but I gave up and settled for a temporary file instead. If this is an undesired approach, I could use some guidance in how to construct the appropriate object that will expose the problem.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Handle scale_factor and add_offset as scalar 714844298
703058067 https://github.com/pydata/xarray/issues/4471#issuecomment-703058067 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMzA1ODA2Nw== gerritholl 500246 2020-10-03T06:59:07Z 2020-10-03T06:59:07Z CONTRIBUTOR

I can try to fix this in a PR; I just need to be sure what the fix should look like: either change the dimensionality of the attributes (which has the potential to break backward compatibility), or adapt the other components to handle either scalars or length-1 arrays (the safer alternative, but the problem may occur in more locations both inside and outside xarray, so in that case a note in the documentation may be in order as well). I don't know whether xarray strives for consistency between what the different engines expose when opening the same file.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
702708138 https://github.com/pydata/xarray/issues/4471#issuecomment-702708138 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMjcwODEzOA== gerritholl 500246 2020-10-02T12:32:40Z 2020-10-02T12:32:40Z CONTRIBUTOR

According to The NetCDF User's Guide, attributes are supposed to be vectors:

The current version treats all attributes as vectors; scalar values are treated as single-element vectors.

That suggests that, strictly speaking, the h5netcdf engine is right and the netcdf4 engine is wrong, and that other components (such as where the scale factor and add_offset are applied) need to be adapted to handle arrays of length 1 for those values.
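If the "adapt other components" route were taken, consumers of scale_factor and add_offset could normalize the attribute before applying it. A minimal sketch, assuming a helper named `as_scalar` (an invented name, not xarray's actual code):

```python
import numpy as np

def as_scalar(value):
    # Hypothetical helper (invented name, not part of xarray):
    # collapse a length-1 attribute array to a plain scalar,
    # leave everything else untouched.
    arr = np.asarray(value)
    if arr.ndim == 1 and arr.size == 1:
        return arr[0]
    return value

# Both engine conventions then yield the same scalar:
h5netcdf_style = np.array([0.001564351])  # attribute stored as a length-1 vector
netcdf4_style = 0.001564351               # attribute stored as a scalar
print(as_scalar(h5netcdf_style))
print(as_scalar(netcdf4_style))
```

Either way, applying the scale factor would then be independent of which convention the backend uses.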

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
702671253 https://github.com/pydata/xarray/issues/4471#issuecomment-702671253 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMjY3MTI1Mw== gerritholl 500246 2020-10-02T11:07:33Z 2020-10-02T11:07:33Z CONTRIBUTOR

The ds.load() prevents the traceback because it means the entire n-d data variable is multiplied with the 1-d scale factor. Similarly, requesting a slice (ds["Rad"][400:402, 300:302]) also prevents the traceback. The traceback occurs only when a single value is requested, because then numpy complains about an in-place multiplication of a 0-d scalar with a 1-d array. I'm not entirely sure why, but it would be a numpy issue:

```
In [7]: a = np.array(0)

In [8]: b = np.array([0])

In [9]: a * b
Out[9]: array([0])

In [10]: a *= b
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-10-0d04f348f081> in <module>
----> 1 a *= b

ValueError: non-broadcastable output operand with shape () doesn't match the broadcast shape (1,)
```
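This also explains why the slice and full-load cases work: broadcasting an n-d block against a length-1 array is fine in-place, and only the 0-d single-element case fails. A quick check (plain numpy, for illustration; variable names are invented):

```python
import numpy as np

scale = np.array([0.001564351])  # 1-d, length-1 attribute (h5netcdf style)
full = np.zeros((2, 2))          # an n-d data block, as when slicing or loading
single = np.array(0.0)           # a single requested element, 0-d

out_full = full * scale          # broadcasts to (2, 2): fine
print(out_full.shape)

full *= scale                    # in-place also fine: (2, 2) can hold the result
print(full.shape)

try:
    single *= scale              # 0-d output cannot hold the (1,)-shaped result
except ValueError as e:
    print("ValueError:", e)
```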

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
702645270 https://github.com/pydata/xarray/issues/4471#issuecomment-702645270 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMjY0NTI3MA== gerritholl 500246 2020-10-02T10:10:45Z 2020-10-02T10:10:45Z CONTRIBUTOR

Interestingly, the problem is prevented if one adds

ds.load()

before the print statement.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
702643539 https://github.com/pydata/xarray/issues/4471#issuecomment-702643539 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMjY0MzUzOQ== gerritholl 500246 2020-10-02T10:07:17Z 2020-10-02T10:07:34Z CONTRIBUTOR

My last comment was inaccurate. Although the open succeeds, the non-scalar scale factor does trigger a failure upon accessing the data (due to lazy loading), even without passing an open file:

```python
import xarray
fn = "OR_ABI-L1b-RadF-M3C07_G16_s20170732006100_e20170732016478_c20170732016514.nc"
with xarray.open_dataset(fn, engine="h5netcdf") as ds:
    print(ds["Rad"][400, 300])
```

The data file is publicly available at:

s3://noaa-goes16/ABI-L1b-RadF/2017/073/20/OR_ABI-L1b-RadF-M3C07_G16_s20170732006100_e20170732016478_c20170732016514.nc

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
702021297 https://github.com/pydata/xarray/issues/4471#issuecomment-702021297 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMjAyMTI5Nw== gerritholl 500246 2020-10-01T09:47:00Z 2020-10-01T09:47:00Z CONTRIBUTOR

However, a simple `xarray.open_dataset(fn, engine="h5netcdf")` still fails with the ValueError only if passed an open file, so there appear to still be other backend-dependent differences apart from the dimensionality of the variable attributes.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
702018925 https://github.com/pydata/xarray/issues/4471#issuecomment-702018925 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMjAxODkyNQ== gerritholl 500246 2020-10-01T09:42:35Z 2020-10-01T09:42:35Z CONTRIBUTOR

Some further digging shows it's due to differences between the h5netcdf and netcdf4 backends:

```python
import xarray
fn = "/data/gholl/cache/fogtools/abi/2017/03/14/20/06/7/OR_ABI-L1b-RadF-M3C07_G16_s20170732006100_e20170732016478_c20170732016514.nc"
with xarray.open_dataset(fn, decode_cf=False, mask_and_scale=False, engine="netcdf4") as ds:
    print(ds["esun"].attrs["_FillValue"])
    print(ds["Rad"].attrs["scale_factor"])
with xarray.open_dataset(fn, decode_cf=False, mask_and_scale=False, engine="h5netcdf") as ds:
    print(ds["esun"].attrs["_FillValue"])
    print(ds["Rad"].attrs["scale_factor"])
```

Results in:

```
-999.0
0.001564351
[-999.]
[0.00156435]
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
701973665 https://github.com/pydata/xarray/issues/4471#issuecomment-701973665 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMTk3MzY2NQ== gerritholl 500246 2020-10-01T08:20:20Z 2020-10-01T08:20:20Z CONTRIBUTOR

Probably related: when reading an open file through a file system instance, the _FillValue, scale_factor, and add_offset are arrays of length one. When opening by passing a filename, those are all scalar (as expected):

```python
import xarray
from fsspec.implementations.local import LocalFileSystem
fn = "/data/gholl/cache/fogtools/abi/2017/03/14/20/06/7/OR_ABI-L1b-RadF-M3C07_G16_s20170732006100_e20170732016478_c20170732016514.nc"
ds1 = xarray.open_dataset(fn, decode_cf=True, mask_and_scale=False)
print(ds1["esun"].attrs["_FillValue"])
print(ds1["Rad"].attrs["scale_factor"])
with LocalFileSystem().open(fn) as of:
    ds2 = xarray.open_dataset(of, decode_cf=True, mask_and_scale=False)
    print(ds2["esun"].attrs["_FillValue"])
    print(ds2["Rad"].attrs["scale_factor"])
```

Result:

```
-999.0
0.001564351
[-999.]
[0.00156435]
```

I strongly suspect that this is what causes the ValueError, and in any case it also causes downstream problems even if opening succeeds as per the previous comment.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
701948369 https://github.com/pydata/xarray/issues/4471#issuecomment-701948369 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMTk0ODM2OQ== gerritholl 500246 2020-10-01T07:33:11Z 2020-10-01T07:33:11Z CONTRIBUTOR

I just tested this with some more combinations:

  • decode_cf=True, mask_and_scale=False, everything seems fine.
  • decode_cf=False, mask_and_scale=True, everything seems fine.
  • decode_cf=True, mask_and_scale=True results in the ValueError and associated traceback.
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
682435326 https://github.com/pydata/xarray/issues/1240#issuecomment-682435326 https://api.github.com/repos/pydata/xarray/issues/1240 MDEyOklzc3VlQ29tbWVudDY4MjQzNTMyNg== gerritholl 500246 2020-08-28T09:48:18Z 2020-08-28T09:48:56Z CONTRIBUTOR

I fixed my conda environment now (something was wrong as I appeared to have two xarray installations in parallel). I still get the KeyError with latest xarray master and latest pandas master:

```
$ conda list | egrep -w '(pandas|xarray)'
pandas                    1.2.0.dev0+167.g1f35b0621  pypi_0  pypi
xarray                    0.16.1.dev65+g13caf96e     pypi_0  pypi
$ python mwe83.py
Traceback (most recent call last):
  File "mwe83.py", line 5, in <module>
    da.sel(time=da.coords["time"][0])
  File "/data/gholl/miniconda3/envs/py38/lib/python3.8/site-packages/xarray/core/dataarray.py", line 1142, in sel
    ds = self._to_temp_dataset().sel(
  File "/data/gholl/miniconda3/envs/py38/lib/python3.8/site-packages/xarray/core/dataset.py", line 2096, in sel
    pos_indexers, new_indexes = remap_label_indexers(
  File "/data/gholl/miniconda3/envs/py38/lib/python3.8/site-packages/xarray/core/coordinates.py", line 395, in remap_label_indexers
    pos_indexers, new_indexes = indexing.remap_label_indexers(
  File "/data/gholl/miniconda3/envs/py38/lib/python3.8/site-packages/xarray/core/indexing.py", line 270, in remap_label_indexers
    idxr, new_idx = convert_label_indexer(index, label, dim, method, tolerance)
  File "/data/gholl/miniconda3/envs/py38/lib/python3.8/site-packages/xarray/core/indexing.py", line 189, in convert_label_indexer
    indexer = index.get_loc(
  File "/data/gholl/miniconda3/envs/py38/lib/python3.8/site-packages/pandas/core/indexes/datetimes.py", line 622, in get_loc
    raise KeyError(key)
KeyError: 0
$ cat mwe83.py
import xarray as xr
import numpy as np
da = xr.DataArray([0, 1], dims=("time",), coords={"time": np.array([0, 1], dtype="M8[s]")})
da.sel(time=slice(da.coords["time"][0], da.coords["time"][1]))
da.sel(time=da.coords["time"][0])
```

Oops, by "already have" you meant it's already been reported, I thought you meant it had already been fixed. All clear then.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Cannot use xarrays own times for indexing 204071440
682393298 https://github.com/pydata/xarray/issues/1240#issuecomment-682393298 https://api.github.com/repos/pydata/xarray/issues/1240 MDEyOklzc3VlQ29tbWVudDY4MjM5MzI5OA== gerritholl 500246 2020-08-28T08:14:26Z 2020-08-28T08:14:26Z CONTRIBUTOR

This was closed and was solved for slicing, but not for element indexing:

```python
import xarray as xr
import numpy as np
da = xr.DataArray([0, 1], dims=("time",), coords={"time": np.array([0, 1], dtype="M8[s]")})
da.sel(time=da.coords["time"][0])
```

results in

```
Traceback (most recent call last):
  File "mwe83.py", line 4, in <module>
    da.sel(time=da.coords["time"][0])
  File "/data/gholl/miniconda3/envs/py38/lib/python3.8/site-packages/xarray/core/dataarray.py", line 1142, in sel
    ds = self._to_temp_dataset().sel(
  File "/data/gholl/miniconda3/envs/py38/lib/python3.8/site-packages/xarray/core/dataset.py", line 2096, in sel
    pos_indexers, new_indexes = remap_label_indexers(
  File "/data/gholl/miniconda3/envs/py38/lib/python3.8/site-packages/xarray/core/coordinates.py", line 395, in remap_label_indexers
    pos_indexers, new_indexes = indexing.remap_label_indexers(
  File "/data/gholl/miniconda3/envs/py38/lib/python3.8/site-packages/xarray/core/indexing.py", line 270, in remap_label_indexers
    idxr, new_idx = convert_label_indexer(index, label, dim, method, tolerance)
  File "/data/gholl/miniconda3/envs/py38/lib/python3.8/site-packages/xarray/core/indexing.py", line 189, in convert_label_indexer
    indexer = index.get_loc(
  File "/data/gholl/miniconda3/envs/py38/lib/python3.8/site-packages/pandas/core/indexes/datetimes.py", line 622, in get_loc
    raise KeyError(key)
KeyError: 0
```

using xarray 0.15.2.dev64+g2542a63f (latest master). I think it would be desirable for it to work in both cases. Should we reopen this issue, or should I open a new one?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Cannot use xarrays own times for indexing 204071440
662848438 https://github.com/pydata/xarray/issues/2377#issuecomment-662848438 https://api.github.com/repos/pydata/xarray/issues/2377 MDEyOklzc3VlQ29tbWVudDY2Mjg0ODQzOA== gerritholl 500246 2020-07-23T06:56:10Z 2020-07-23T06:56:10Z CONTRIBUTOR

This issue is still relevant.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Comparing scalar xarray with ma.masked fails with ValueError: assignment destination is read-only 352999600
580761178 https://github.com/pydata/xarray/issues/1194#issuecomment-580761178 https://api.github.com/repos/pydata/xarray/issues/1194 MDEyOklzc3VlQ29tbWVudDU4MDc2MTE3OA== gerritholl 500246 2020-01-31T14:42:36Z 2020-01-31T14:42:36Z CONTRIBUTOR

Pandas 1.0 uses pd.NA for integers, boolean, and string dtypes: https://pandas.pydata.org/pandas-docs/stable/whatsnew/v1.0.0.html#experimental-na-scalar-to-denote-missing-values

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use masked arrays while preserving int 199188476
558225971 https://github.com/pydata/xarray/issues/3572#issuecomment-558225971 https://api.github.com/repos/pydata/xarray/issues/3572 MDEyOklzc3VlQ29tbWVudDU1ODIyNTk3MQ== gerritholl 500246 2019-11-25T16:12:37Z 2019-11-25T16:12:37Z CONTRIBUTOR

You are right. Reported at https://github.com/shoyer/h5netcdf/issues/63 .

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Context manager `AttributeError` when engine='h5netcdf' 528154893
509525636 https://github.com/pydata/xarray/pull/3082#issuecomment-509525636 https://api.github.com/repos/pydata/xarray/issues/3082 MDEyOklzc3VlQ29tbWVudDUwOTUyNTYzNg== gerritholl 500246 2019-07-09T07:31:09Z 2019-07-09T07:31:09Z CONTRIBUTOR

@shoyer I'm afraid I don't understand well enough what is going on to say much useful...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Cache root netCDF4.Dataset objects instead of groups 464787713
509134555 https://github.com/pydata/xarray/pull/3082#issuecomment-509134555 https://api.github.com/repos/pydata/xarray/issues/3082 MDEyOklzc3VlQ29tbWVudDUwOTEzNDU1NQ== gerritholl 500246 2019-07-08T08:39:59Z 2019-07-08T08:39:59Z CONTRIBUTOR

As the original reporter of #2954, I can confirm that both of my test scripts that were previously segfaulting are running as expected with this PR.

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 1,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Cache root netCDF4.Dataset objects instead of groups 464787713
509134252 https://github.com/pydata/xarray/issues/2954#issuecomment-509134252 https://api.github.com/repos/pydata/xarray/issues/2954 MDEyOklzc3VlQ29tbWVudDUwOTEzNDI1Mg== gerritholl 500246 2019-07-08T08:39:01Z 2019-07-08T08:39:01Z CONTRIBUTOR

And I can confirm that the problem I reported originally on May 10 is also gone with #3082.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Segmentation fault reading many groups from many files 442617907
509132581 https://github.com/pydata/xarray/issues/2954#issuecomment-509132581 https://api.github.com/repos/pydata/xarray/issues/2954 MDEyOklzc3VlQ29tbWVudDUwOTEzMjU4MQ== gerritholl 500246 2019-07-08T08:34:11Z 2019-07-08T08:34:38Z CONTRIBUTOR

@shoyer I checked out your branch and the latter test example runs successfully - no segmentation fault and no files left open.

I will test the former test example now.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Segmentation fault reading many groups from many files 442617907
508900470 https://github.com/pydata/xarray/issues/2954#issuecomment-508900470 https://api.github.com/repos/pydata/xarray/issues/2954 MDEyOklzc3VlQ29tbWVudDUwODkwMDQ3MA== gerritholl 500246 2019-07-06T06:09:04Z 2019-07-06T06:09:04Z CONTRIBUTOR

There are some files triggering the problem at ftp://ftp.eumetsat.int/pub/OPS/out/test-data/Test-data-for-External-Users/MTG_FCI_Test-Data/FCI_L1C_24hr_Test_Data_for_Users/1.0/UNCOMPRESSED/ . I will test the PR later (on Monday at the latest).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Segmentation fault reading many groups from many files 442617907
508772044 https://github.com/pydata/xarray/issues/2954#issuecomment-508772044 https://api.github.com/repos/pydata/xarray/issues/2954 MDEyOklzc3VlQ29tbWVudDUwODc3MjA0NA== gerritholl 500246 2019-07-05T14:13:20Z 2019-07-05T14:14:20Z CONTRIBUTOR

This triggers a segmentation fault (in the .persist() call) on my system, which may be related:

```python
import xarray
import os
import subprocess
xarray.set_options(file_cache_maxsize=1)
f = "/path/to/netcdf/file.nc"
ds1 = xarray.open_dataset(f, "/group1", chunks=1024)
ds2 = xarray.open_dataset(f, "/group2", chunks=1024)
ds_cat = xarray.concat([ds1, ds2])
ds_cat.persist()
subprocess.run(fr"lsof | grep {os.getpid():d} | grep '\.nc$'", shell=True)
```

But there's something going on with the specific netcdf file: when I create artificial groups, it does not segfault.

```
Fatal Python error: Segmentation fault

Thread 0x00007f542bfff700 (most recent call first):
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/multiprocessing/pool.py", line 470 in _handle_results
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 865 in run
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 917 in _bootstrap_inner
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 885 in _bootstrap

Thread 0x00007f5448ff9700 (most recent call first):
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/multiprocessing/pool.py", line 422 in _handle_tasks
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 865 in run
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 917 in _bootstrap_inner
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 885 in _bootstrap

Thread 0x00007f54497fa700 (most recent call first):
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/multiprocessing/pool.py", line 413 in _handle_workers
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 865 in run
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 917 in _bootstrap_inner
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 885 in _bootstrap

Thread 0x00007f5449ffb700 (most recent call first):
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/multiprocessing/pool.py", line 110 in worker
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 865 in run
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 917 in _bootstrap_inner
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 885 in _bootstrap

Thread 0x00007f544a7fc700 (most recent call first):
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/multiprocessing/pool.py", line 110 in worker
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 865 in run
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 917 in _bootstrap_inner
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 885 in _bootstrap

Thread 0x00007f544affd700 (most recent call first):
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/multiprocessing/pool.py", line 110 in worker
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 865 in run
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 917 in _bootstrap_inner
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 885 in _bootstrap

Thread 0x00007f544b7fe700 (most recent call first):
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/multiprocessing/pool.py", line 110 in worker
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 865 in run
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 917 in _bootstrap_inner
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 885 in _bootstrap

Thread 0x00007f544bfff700 (most recent call first):
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/multiprocessing/pool.py", line 110 in worker
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 865 in run
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 917 in _bootstrap_inner
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 885 in _bootstrap

Thread 0x00007f5458a75700 (most recent call first):
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/multiprocessing/pool.py", line 110 in worker
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 865 in run
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 917 in _bootstrap_inner
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 885 in _bootstrap

Thread 0x00007f5459276700 (most recent call first):
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/multiprocessing/pool.py", line 110 in worker
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 865 in run
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 917 in _bootstrap_inner
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 885 in _bootstrap

Thread 0x00007f5459a77700 (most recent call first):
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/multiprocessing/pool.py", line 110 in worker
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 865 in run
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 917 in _bootstrap_inner
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/threading.py", line 885 in _bootstrap

Current thread 0x00007f54731236c0 (most recent call first):
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/site-packages/xarray/backends/netCDF4_.py", line 244 in open_netcdf4_group
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/site-packages/xarray/backends/file_manager.py", line 173 in acquire
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/site-packages/xarray/backends/netCDF4.py", line 56 in get_array
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/site-packages/xarray/backends/netCDF4_.py", line 74 in getitem
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/site-packages/xarray/core/indexing.py", line 778 in explicit_indexing_adapter
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/site-packages/xarray/backends/netCDF4.py", line 64 in getitem
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/site-packages/xarray/core/indexing.py", line 510 in array
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/site-packages/numpy/core/numeric.py", line 538 in asarray
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/site-packages/xarray/core/indexing.py", line 604 in array
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/site-packages/numpy/core/numeric.py", line 538 in asarray
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/site-packages/xarray/core/variable.py", line 213 in _as_array_or_item
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/site-packages/xarray/core/variable.py", line 392 in values
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/site-packages/xarray/core/variable.py", line 297 in data
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/site-packages/xarray/core/variable.py", line 1204 in set_dims
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/site-packages/xarray/core/combine.py", line 298 in ensure_common_dims
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/site-packages/xarray/core/variable.py", line 2085 in concat
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/site-packages/xarray/core/combine.py", line 305 in _dataset_concat
  File "/media/nas/x21324/miniconda3/envs/py37d/lib/python3.7/site-packages/xarray/core/combine.py", line 120 in concat
  File "mwe13.py", line 19 in <module>
Segmentation fault (core dumped)
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Segmentation fault reading many groups from many files 442617907
508728959 https://github.com/pydata/xarray/issues/2954#issuecomment-508728959 https://api.github.com/repos/pydata/xarray/issues/2954 MDEyOklzc3VlQ29tbWVudDUwODcyODk1OQ== gerritholl 500246 2019-07-05T11:29:50Z 2019-07-05T11:29:50Z CONTRIBUTOR

This can also be triggered by a .persist(...) call, although I don't yet understand the precise circumstances.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Segmentation fault reading many groups from many files 442617907
491866549 https://github.com/pydata/xarray/issues/2954#issuecomment-491866549 https://api.github.com/repos/pydata/xarray/issues/2954 MDEyOklzc3VlQ29tbWVudDQ5MTg2NjU0OQ== gerritholl 500246 2019-05-13T15:18:33Z 2019-05-13T15:18:33Z CONTRIBUTOR

In our code, this problem gets triggered by xarray's lazy file handling. If we have

```
with xr.open_dataset('file.nc') as ds:
    val = ds["field"]
    return val
```

then when a caller tries to use val, xarray reopens the dataset and does not close it again. This makes the context manager effectively useless: we use it to close the file as soon as we have read the value, but later the file gets opened again anyway. This is against the intention of the code.

We can avoid this by calling val.load() from within the context manager, as the linked satpy PR above does. But what is the intention of xarray's design here? Should lazy reading close the file again after opening it and reading the value? I would say it should probably do something like

```
if file_was_not_open:
    open file
    get value
    close file  # this step currently omitted
    return value
else:
    get value
    return value
```

Is not closing the file after it has been opened to retrieve a "lazy" value by design, or might this be considered a wart/bug?
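The desired behaviour can be illustrated with a small stub (names and classes here are hypothetical, not xarray's API): a file-backed store that, when a value is requested while the file is closed, opens, reads, and closes again.

```python
class StubFile:
    """Toy stand-in for a file-backed store, illustrating the intended
    open/read/close behaviour (hypothetical; not xarray's API)."""

    def __init__(self, value):
        self._value = value
        self.is_open = False
        self.open_count = 0

    def open(self):
        self.is_open = True
        self.open_count += 1

    def close(self):
        self.is_open = False

    def get_value(self):
        if not self.is_open:
            self.open()
            value = self._value
            self.close()  # the step the comment says is currently omitted
            return value
        # File was already open: just read, leave it open.
        return self._value


f = StubFile(42)
val = f.get_value()
print(val, f.is_open)  # 42 False -- the file does not stay open
```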

{
    "total_count": 2,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 2
}
  Segmentation fault reading many groups from many files 442617907
491221266 https://github.com/pydata/xarray/issues/2954#issuecomment-491221266 https://api.github.com/repos/pydata/xarray/issues/2954 MDEyOklzc3VlQ29tbWVudDQ5MTIyMTI2Ng== gerritholl 500246 2019-05-10T09:18:28Z 2019-05-10T09:18:28Z CONTRIBUTOR

Note that if I close every file neatly, there is no segmentation fault.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Segmentation fault reading many groups from many files 442617907
457220076 https://github.com/pydata/xarray/issues/1194#issuecomment-457220076 https://api.github.com/repos/pydata/xarray/issues/1194 MDEyOklzc3VlQ29tbWVudDQ1NzIyMDA3Ng== gerritholl 500246 2019-01-24T14:40:33Z 2019-01-24T14:40:33Z CONTRIBUTOR

@max-sixty Interesting! I wonder what it would take to make use of this "nullable integer data type" in xarray. It wouldn't work to convert it to a standard numpy array (da.values) while retaining the dtype, but one could add a new .to_maskedarray() method returning a numpy masked array; that would probably be easier than adding full support for masked arrays.
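For reference, numpy's existing masked arrays already preserve both the integer dtype and the original value under the mask, which is the kind of object a hypothetical .to_maskedarray() could return:

```python
import numpy as np

# A masked integer array: the dtype stays integer and the masked value
# is only hidden, not destroyed.
a = np.ma.masked_array([1, 2, 3], mask=[False, True, False])
print(a.dtype.kind)    # 'i' -- still an integer array
print(a.data[1])       # 2   -- the original value survives under the mask
print(a.filled(-999))  # [   1 -999    3]
```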

{
    "total_count": 4,
    "+1": 4,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use masked arrays while preserving int 199188476
457159560 https://github.com/pydata/xarray/issues/1194#issuecomment-457159560 https://api.github.com/repos/pydata/xarray/issues/1194 MDEyOklzc3VlQ29tbWVudDQ1NzE1OTU2MA== gerritholl 500246 2019-01-24T11:10:46Z 2019-01-24T11:10:46Z CONTRIBUTOR

I think this issue should remain open. I think it would still be highly desirable to implement support for true masked arrays, such that any value can be masked without throwing away the original value.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use masked arrays while preserving int 199188476
457121127 https://github.com/pydata/xarray/issues/1234#issuecomment-457121127 https://api.github.com/repos/pydata/xarray/issues/1234 MDEyOklzc3VlQ29tbWVudDQ1NzEyMTEyNw== gerritholl 500246 2019-01-24T09:08:42Z 2019-01-24T09:08:42Z CONTRIBUTOR

Maybe this just needs a note in the documentation then?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  `where` grows new dimensions for unrelated variables 203630267
457120865 https://github.com/pydata/xarray/issues/1238#issuecomment-457120865 https://api.github.com/repos/pydata/xarray/issues/1238 MDEyOklzc3VlQ29tbWVudDQ1NzEyMDg2NQ== gerritholl 500246 2019-01-24T09:07:52Z 2019-01-24T09:07:52Z CONTRIBUTOR

This behaviour appears to be still current.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  `set_index` converts string-dtype to object-dtype 203999231
434787209 https://github.com/pydata/xarray/issues/1614#issuecomment-434787209 https://api.github.com/repos/pydata/xarray/issues/1614 MDEyOklzc3VlQ29tbWVudDQzNDc4NzIwOQ== gerritholl 500246 2018-10-31T17:56:41Z 2018-10-31T17:56:41Z CONTRIBUTOR

Another one to decide is xarray.zeros_like(...) and friends.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Rules for propagating attrs and encoding 264049503
415070680 https://github.com/pydata/xarray/issues/1792#issuecomment-415070680 https://api.github.com/repos/pydata/xarray/issues/1792 MDEyOklzc3VlQ29tbWVudDQxNTA3MDY4MA== gerritholl 500246 2018-08-22T15:20:08Z 2018-08-22T15:20:08Z CONTRIBUTOR

See also: #2377.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Comparison with masked array yields object-array with nans for masked values 283345586
415070591 https://github.com/pydata/xarray/issues/2377#issuecomment-415070591 https://api.github.com/repos/pydata/xarray/issues/2377 MDEyOklzc3VlQ29tbWVudDQxNTA3MDU5MQ== gerritholl 500246 2018-08-22T15:19:56Z 2018-08-22T15:19:56Z CONTRIBUTOR

See also: #1792.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Comparing scalar xarray with ma.masked fails with ValueError: assignment destination is read-only 352999600
376106248 https://github.com/pydata/xarray/issues/1378#issuecomment-376106248 https://api.github.com/repos/pydata/xarray/issues/1378 MDEyOklzc3VlQ29tbWVudDM3NjEwNjI0OA== gerritholl 500246 2018-03-26T09:38:00Z 2018-03-26T09:38:00Z CONTRIBUTOR

This also affects the stack method.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Many methods are broken (e.g., concat/stack/sortby) when using repeated dimensions 222676855
367153633 https://github.com/pydata/xarray/issues/1378#issuecomment-367153633 https://api.github.com/repos/pydata/xarray/issues/1378 MDEyOklzc3VlQ29tbWVudDM2NzE1MzYzMw== gerritholl 500246 2018-02-20T23:10:13Z 2018-02-20T23:10:13Z CONTRIBUTOR

@jhamman Ok, good to hear it's not slated to be removed. I would love to work on this, I wish I had the time! I'll keep it in mind if I do find some spare time.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Many methods are broken (e.g., concat/stack/sortby) when using repeated dimensions 222676855
367147759 https://github.com/pydata/xarray/issues/1378#issuecomment-367147759 https://api.github.com/repos/pydata/xarray/issues/1378 MDEyOklzc3VlQ29tbWVudDM2NzE0Nzc1OQ== gerritholl 500246 2018-02-20T22:46:27Z 2018-02-20T22:46:27Z CONTRIBUTOR

> I cannot see a use case in which repeated dims actually make sense.

I use repeated dimensions to store a covariance matrix. The data variable containing the covariance matrix has 4 dimensions, of which the last 2 are repeated. For example, I have a data variable with dimensions (channel, scanline, element, element), storing an element-element covariance matrix for every scanline in satellite data.

This is valid NetCDF and should be valid in xarray. It would be a significant problem for me if they became disallowed.
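A sketch of how such a variable arises (all sizes made up for illustration): estimating an element-element covariance per channel and per scanline from samples naturally produces the repeated trailing dimension.

```python
import numpy as np

# Made-up sizes: 2 channels, 4 scanlines, 10 samples per estimate, 3 elements.
rng = np.random.default_rng(0)
samples = rng.standard_normal((2, 4, 10, 3))

# Per-(channel, scanline) sample covariance over the sample axis.
dev = samples - samples.mean(axis=2, keepdims=True)
cov = np.einsum("csne,csnf->csef", dev, dev) / (samples.shape[2] - 1)
print(cov.shape)  # (2, 4, 3, 3): dims (channel, scanline, element, element)
```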

{
    "total_count": 6,
    "+1": 6,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Many methods are broken (e.g., concat/stack/sortby) when using repeated dimensions 222676855
359523925 https://github.com/pydata/xarray/issues/1849#issuecomment-359523925 https://api.github.com/repos/pydata/xarray/issues/1849 MDEyOklzc3VlQ29tbWVudDM1OTUyMzkyNQ== gerritholl 500246 2018-01-22T18:45:35Z 2018-01-22T18:45:35Z CONTRIBUTOR

Not sure if the attachment came through. Trying again:

sample.nc.gz

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  passing unlimited_dims to to_netcdf triggers RuntimeError: NetCDF: Invalid argument 290572700
355914496 https://github.com/pydata/xarray/issues/678#issuecomment-355914496 https://api.github.com/repos/pydata/xarray/issues/678 MDEyOklzc3VlQ29tbWVudDM1NTkxNDQ5Ng== gerritholl 500246 2018-01-08T09:13:59Z 2018-01-08T09:13:59Z CONTRIBUTOR

Is this fixed by #1170?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Save to netCDF with record dimension? 121740837
351209309 https://github.com/pydata/xarray/pull/1777#issuecomment-351209309 https://api.github.com/repos/pydata/xarray/issues/1777 MDEyOklzc3VlQ29tbWVudDM1MTIwOTMwOQ== gerritholl 500246 2017-12-12T22:01:18Z 2017-12-12T22:01:18Z CONTRIBUTOR

I'm aware that the longer term plan should be to use versioneer, but I think this fix is useful until we make such a transition.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Respect PEP 440 281552158
351208780 https://github.com/pydata/xarray/pull/1777#issuecomment-351208780 https://api.github.com/repos/pydata/xarray/issues/1777 MDEyOklzc3VlQ29tbWVudDM1MTIwODc4MA== gerritholl 500246 2017-12-12T21:59:32Z 2017-12-12T21:59:59Z CONTRIBUTOR

Does this need tests or a whats-new.rst entry? I'm not sure how I would write a test for it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Respect PEP 440 281552158
340093042 https://github.com/pydata/xarray/issues/1663#issuecomment-340093042 https://api.github.com/repos/pydata/xarray/issues/1663 MDEyOklzc3VlQ29tbWVudDM0MDA5MzA0Mg== gerritholl 500246 2017-10-27T21:30:08Z 2017-10-27T21:30:08Z CONTRIBUTOR

Oh, I missed that. I should have tried with xarray master.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ds.notnull() fails with AttributeError on pandas 0.21.0rc1 269143043
340032712 https://github.com/pydata/xarray/issues/1663#issuecomment-340032712 https://api.github.com/repos/pydata/xarray/issues/1663 MDEyOklzc3VlQ29tbWVudDM0MDAzMjcxMg== gerritholl 500246 2017-10-27T17:20:58Z 2017-10-27T17:20:58Z CONTRIBUTOR

I'm not sure if I understand correctly, but it appears xarray has a hardcoded list of names of pandas functions/methods that need to be treated in a particular way. I might be on the wrong track though.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ds.notnull() fails with AttributeError on pandas 0.21.0rc1 269143043
340030194 https://github.com/pydata/xarray/issues/1663#issuecomment-340030194 https://api.github.com/repos/pydata/xarray/issues/1663 MDEyOklzc3VlQ29tbWVudDM0MDAzMDE5NA== gerritholl 500246 2017-10-27T17:10:54Z 2017-10-27T17:10:54Z CONTRIBUTOR

I think we'd need to change PANDAS_UNARY_FUNCTIONS = ['isnull', 'notnull'] in ops.py; I'll have a try.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ds.notnull() fails with AttributeError on pandas 0.21.0rc1 269143043
340024686 https://github.com/pydata/xarray/issues/1663#issuecomment-340024686 https://api.github.com/repos/pydata/xarray/issues/1663 MDEyOklzc3VlQ29tbWVudDM0MDAyNDY4Ng== gerritholl 500246 2017-10-27T16:48:39Z 2017-10-27T16:48:39Z CONTRIBUTOR

What I still don't know: is this a bug in xarray or a bug in pandas? Or neither?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ds.notnull() fails with AttributeError on pandas 0.21.0rc1 269143043
340023539 https://github.com/pydata/xarray/issues/1663#issuecomment-340023539 https://api.github.com/repos/pydata/xarray/issues/1663 MDEyOklzc3VlQ29tbWVudDM0MDAyMzUzOQ== gerritholl 500246 2017-10-27T16:44:01Z 2017-10-27T16:44:01Z CONTRIBUTOR

The offending commit is https://github.com/pandas-dev/pandas/commit/793020293ee1e5fa023f45c12943a4ac51cc23d0

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ds.notnull() fails with AttributeError on pandas 0.21.0rc1 269143043
340006322 https://github.com/pydata/xarray/issues/1663#issuecomment-340006322 https://api.github.com/repos/pydata/xarray/issues/1663 MDEyOklzc3VlQ29tbWVudDM0MDAwNjMyMg== gerritholl 500246 2017-10-27T15:36:11Z 2017-10-27T15:36:11Z CONTRIBUTOR

Just confirmed this is caused by a change in pandas somewhere between 0.20.3 and 0.21.0rc1. I don't know if that is a bug in pandas, or a deliberate change that xarray will somehow need to handle, in particular after 0.21.0 final is released.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ds.notnull() fails with AttributeError on pandas 0.21.0rc1 269143043
339515375 https://github.com/pydata/xarray/issues/1661#issuecomment-339515375 https://api.github.com/repos/pydata/xarray/issues/1661 MDEyOklzc3VlQ29tbWVudDMzOTUxNTM3NQ== gerritholl 500246 2017-10-26T00:40:20Z 2017-10-26T00:40:20Z CONTRIBUTOR

@TomAugspurger Is it this one?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  da.plot.pcolormesh fails when there is a datetime coordinate 268487752
339468720 https://github.com/pydata/xarray/issues/1084#issuecomment-339468720 https://api.github.com/repos/pydata/xarray/issues/1084 MDEyOklzc3VlQ29tbWVudDMzOTQ2ODcyMA== gerritholl 500246 2017-10-25T20:56:24Z 2017-10-25T20:56:24Z CONTRIBUTOR

Not sure if this is related, but pandas commit https://github.com/pandas-dev/pandas/commit/2310faa109bdfd9ff3ef4fc19a163d790d60c645 triggers xarray issue https://github.com/pydata/xarray/issues/1661 . Not sure if there exists an easy workaround for that one.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Towards a (temporary?) workaround for datetime issues at the xarray-level 187591179
339467595 https://github.com/pydata/xarray/issues/1661#issuecomment-339467595 https://api.github.com/repos/pydata/xarray/issues/1661 MDEyOklzc3VlQ29tbWVudDMzOTQ2NzU5NQ== gerritholl 500246 2017-10-25T20:52:30Z 2017-10-25T20:52:30Z CONTRIBUTOR

This happens after: https://github.com/pandas-dev/pandas/commit/2310faa109bdfd9ff3ef4fc19a163d790d60c645

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  da.plot.pcolormesh fails when there is a datetime coordinate 268487752
339451381 https://github.com/pydata/xarray/issues/1661#issuecomment-339451381 https://api.github.com/repos/pydata/xarray/issues/1661 MDEyOklzc3VlQ29tbWVudDMzOTQ1MTM4MQ== gerritholl 500246 2017-10-25T19:54:34Z 2017-10-25T19:54:34Z CONTRIBUTOR

The problem is triggered by a recent change in pandas. I'm currently bisecting pandas to see where it is but it's a little slow due to the compilation at every step.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  da.plot.pcolormesh fails when there is a datetime coordinate 268487752
339430955 https://github.com/pydata/xarray/issues/1661#issuecomment-339430955 https://api.github.com/repos/pydata/xarray/issues/1661 MDEyOklzc3VlQ29tbWVudDMzOTQzMDk1NQ== gerritholl 500246 2017-10-25T18:44:37Z 2017-10-25T18:44:37Z CONTRIBUTOR

Actually, it isn't in matplotlib really. It's xarray's responsibility after all. To plot with pcolormesh, one needs to convert the date axis using date2num, see https://stackoverflow.com/a/27918586/974555 . When plotting with xarray, that is out of the user's control, so there must be some step within xarray to prepare this. What I still don't know is why my code (not this MWE, but my actual code) worked several months ago but not now.
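The conversion in question maps datetimes onto a plain float axis. A minimal stdlib version of the idea (the days-since-epoch convention here is an assumption for illustration; matplotlib's own date2num handles this for real plots):

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def dt_to_num(dt: datetime) -> float:
    """Convert a datetime to a float day count since an epoch: the kind of
    numeric axis pcolormesh needs in place of raw datetimes (illustrative
    helper, not matplotlib's API)."""
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return (dt - EPOCH).total_seconds() / 86400.0

print(dt_to_num(datetime(1970, 1, 2)))      # 1.0
print(dt_to_num(datetime(1970, 1, 1, 12)))  # 0.5
```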

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  da.plot.pcolormesh fails when there is a datetime coordinate 268487752
339416758 https://github.com/pydata/xarray/issues/1661#issuecomment-339416758 https://api.github.com/repos/pydata/xarray/issues/1661 MDEyOklzc3VlQ29tbWVudDMzOTQxNjc1OA== gerritholl 500246 2017-10-25T17:58:34Z 2017-10-25T17:58:34Z CONTRIBUTOR

Never mind, this is in matplotlib. See https://github.com/matplotlib/matplotlib/issues/9577.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  da.plot.pcolormesh fails when there is a datetime coordinate 268487752
339413389 https://github.com/pydata/xarray/issues/1661#issuecomment-339413389 https://api.github.com/repos/pydata/xarray/issues/1661 MDEyOklzc3VlQ29tbWVudDMzOTQxMzM4OQ== gerritholl 500246 2017-10-25T17:47:19Z 2017-10-25T17:47:19Z CONTRIBUTOR

I'm quite sure it worked in the past, but trying old versions of xarray yields the same error, so either my memory is wrong, or this started failing due to changes in dependencies.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  da.plot.pcolormesh fails when there is a datetime coordinate 268487752
318086000 https://github.com/pydata/xarray/issues/1329#issuecomment-318086000 https://api.github.com/repos/pydata/xarray/issues/1329 MDEyOklzc3VlQ29tbWVudDMxODA4NjAwMA== gerritholl 500246 2017-07-26T15:17:54Z 2017-07-26T15:17:54Z CONTRIBUTOR

I'd still like to fix this but I have too much workload at the moment. However, I've noticed it's also triggered if the time axis is not empty, but we subselect data such that it becomes empty, then run ds.load().

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Cannot open NetCDF file if dimension with time coordinate has length 0 (`ValueError` when decoding CF datetime) 217216935
315210673 https://github.com/pydata/xarray/pull/1406#issuecomment-315210673 https://api.github.com/repos/pydata/xarray/issues/1406 MDEyOklzc3VlQ29tbWVudDMxNTIxMDY3Mw== gerritholl 500246 2017-07-13T21:44:15Z 2017-07-13T21:44:15Z CONTRIBUTOR

I may make time for it later

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  BUG: Allow unsigned integer indexing, fixes #1405 228036180
315203022 https://github.com/pydata/xarray/pull/1299#issuecomment-315203022 https://api.github.com/repos/pydata/xarray/issues/1299 MDEyOklzc3VlQ29tbWVudDMxNTIwMzAyMg== gerritholl 500246 2017-07-13T21:09:10Z 2017-07-13T21:09:10Z CONTRIBUTOR

Sorry, I've been really busy, but I'll get around to it eventually!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  BUG/TST: Retain encoding upon concatenation 212471682
300832042 https://github.com/pydata/xarray/pull/1406#issuecomment-300832042 https://api.github.com/repos/pydata/xarray/issues/1406 MDEyOklzc3VlQ29tbWVudDMwMDgzMjA0Mg== gerritholl 500246 2017-05-11T15:47:34Z 2017-05-11T15:47:34Z CONTRIBUTOR

Perhaps I was too fast; I edited it directly on GitHub.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  BUG: Allow unsigned integer indexing, fixes #1405 228036180
289496452 https://github.com/pydata/xarray/issues/1329#issuecomment-289496452 https://api.github.com/repos/pydata/xarray/issues/1329 MDEyOklzc3VlQ29tbWVudDI4OTQ5NjQ1Mg== gerritholl 500246 2017-03-27T15:51:08Z 2017-03-27T15:51:16Z CONTRIBUTOR

I might try it out but most likely not before the end of the week.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Cannot open NetCDF file if dimension with time coordinate has length 0 (`ValueError` when decoding CF datetime) 217216935
285087750 https://github.com/pydata/xarray/issues/1297#issuecomment-285087750 https://api.github.com/repos/pydata/xarray/issues/1297 MDEyOklzc3VlQ29tbWVudDI4NTA4Nzc1MA== gerritholl 500246 2017-03-08T16:19:10Z 2017-03-08T16:19:10Z CONTRIBUTOR

Mine always retains it upon concatenation, but if you prefer we could add a keep_encoding argument, in analogy with keep_attrs. In that case we'd want to add it wherever we have keep_attrs.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Encoding lost upon concatenation 212177054
284787562 https://github.com/pydata/xarray/pull/1299#issuecomment-284787562 https://api.github.com/repos/pydata/xarray/issues/1299 MDEyOklzc3VlQ29tbWVudDI4NDc4NzU2Mg== gerritholl 500246 2017-03-07T17:02:23Z 2017-03-07T17:02:23Z CONTRIBUTOR

Concatenation of DataArray and Dataset use the same underlying code, do you mean specifically for the test?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  BUG/TST: Retain encoding upon concatenation 212471682
284751866 https://github.com/pydata/xarray/issues/1297#issuecomment-284751866 https://api.github.com/repos/pydata/xarray/issues/1297 MDEyOklzc3VlQ29tbWVudDI4NDc1MTg2Ng== gerritholl 500246 2017-03-07T15:21:25Z 2017-03-07T15:21:25Z CONTRIBUTOR

This is more serious when we are concatenating datasets, because then the encoding is lost for each contained data array…

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Encoding lost upon concatenation 212177054
283531258 https://github.com/pydata/xarray/issues/988#issuecomment-283531258 https://api.github.com/repos/pydata/xarray/issues/988 MDEyOklzc3VlQ29tbWVudDI4MzUzMTI1OA== gerritholl 500246 2017-03-02T01:51:08Z 2017-03-02T01:51:08Z CONTRIBUTOR

We do often deal with those in my line of work as well, I just happen not to right now. But time is the one thing that already carries units, doesn't it? One can convert between various datetime64 objects, and adding, subtracting, and dividing timedelta64 with different units mostly works as expected (except integer division; and I haven't tried indexing with timedelta64). But I take your point about unit coordinates, and I still like the idea of provisionally adding such functionality on top of an optional dependency on pint, which already has the ability to write out siunitx LaTeX code that can then be incorporated into plotting tools (I haven't grasped xarray's plotting tools well enough yet to know how easy or difficult that part would be, though). I don't see custom dtypes with unit incorporation coming to numpy any time soon, and I'm not even sure it would be the right way to go (any real dtype can have any unit; but here is not the right place to discuss that).
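The datetime64/timedelta64 behaviour mentioned above, for reference:

```python
import numpy as np

# datetime64 arithmetic carries units along.
t = np.datetime64("2017-03-01T00:00")
dt = np.timedelta64(90, "m")
print(t + dt)  # 2017-03-01T01:30

# True division of timedelta64 values with different units yields a float:
ratio = np.timedelta64(1, "h") / np.timedelta64(30, "m")
print(ratio)  # 2.0
# (integer division across units is where behaviour gets surprising,
# as noted above; stick to true division)
```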

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Hooks for custom attribute handling in xarray operations 173612265
283515941 https://github.com/pydata/xarray/issues/988#issuecomment-283515941 https://api.github.com/repos/pydata/xarray/issues/988 MDEyOklzc3VlQ29tbWVudDI4MzUxNTk0MQ== gerritholl 500246 2017-03-02T00:22:18Z 2017-03-02T00:22:18Z CONTRIBUTOR

Good point. I didn't think of that; my coordinates happen to be either time or unitless, I think. How common is it, though, that the full power of a unit library is needed for coordinates? I suppose it arises with indexing, i.e. the ability to write da.sel[x=1.5 km] = value (to borrow PEP 472 syntax ;-), more so than with operations between different data arrays. With a Dataset, the coordinates would correspond to variables with their own attributes, would they not (or how else would a CF-compliant NetCDF file store units for coordinates?), so it would only require a slight expansion of the DataArray class to carry along attributes on coordinates.

When it's a bit more polished I intend to publish it somewhere, but currently several things are missing (.to(...), __rsub__, __rmul__ and friends, unit tests, some other things). I currently don't have time to add features I don't need myself (such as units on coordinates) though.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Hooks for custom attribute handling in xarray operations 173612265
282273509 https://github.com/pydata/xarray/issues/988#issuecomment-282273509 https://api.github.com/repos/pydata/xarray/issues/988 MDEyOklzc3VlQ29tbWVudDI4MjI3MzUwOQ== gerritholl 500246 2017-02-24T11:49:42Z 2017-02-24T11:49:42Z CONTRIBUTOR

I wrote a small recipe that appears to contain basic functionality I'm looking for. There's plenty of caveats but it could be a start, if such an approach is deemed desirable at all.

```
from common import ureg  # or ureg = pint.UnitRegistry()

import operator
import xarray


class UnitsAwareDataArray(xarray.DataArray):
    """Like xarray.DataArray, but transfers units"""

    def __array_wrap__(self, obj, context=None):
        new_var = super().__array_wrap__(obj, context)
        if self.attrs.get("units"):
            new_var.attrs["units"] = context[0](ureg(self.attrs.get("units"))).u
        return new_var

    def _apply_binary_op_to_units(self, func, other, x):
        if self.attrs.get("units"):
            x.attrs["units"] = func(ureg.Quantity(1, self.attrs["units"]),
                                    ureg.Quantity(1, getattr(other, "units", "1"))).u
        return x

    # pow is different because resulting unit depends on argument, not on
    # unit of argument (which must be unitless)
    def __pow__(self, other):
        x = super().__pow__(other)
        if self.attrs.get("units"):
            x.attrs["units"] = pow(ureg.Quantity(1, self.attrs["units"]),
                                   ureg.Quantity(other, getattr(other, "units", "1"))).u
        return x


for tp in ("add", "sub", "mul", "matmul", "truediv", "floordiv", "mod", "divmod"):
    meth = "__{:s}__".format(tp)
    def func(self, other, meth=meth, tp=tp):
        x = getattr(super(UnitsAwareDataArray, self), meth)(other)
        return self._apply_binary_op_to_units(getattr(operator, tp), other, x)
    func.__name__ = meth
    print(func, id(func))
    setattr(UnitsAwareDataArray, meth, func)
del func
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Hooks for custom attribute handling in xarray operations 173612265
282081462 https://github.com/pydata/xarray/issues/988#issuecomment-282081462 https://api.github.com/repos/pydata/xarray/issues/988 MDEyOklzc3VlQ29tbWVudDI4MjA4MTQ2Mg== gerritholl 500246 2017-02-23T18:41:19Z 2017-02-23T18:41:19Z CONTRIBUTOR

Is it not? The documentation says it's new in numpy 1.11 and we're at 1.12 now.

I tried to make a small units-aware subclass of DataArray for myself. I managed to get the right behaviour for ufuncs (I think), but somehow my subclassed _binary_op is not getting called. I guess there is some logic somewhere that means overriding _binary_op in a subclass doesn't work (see below). But overall, how would you feel about an optional dependency on pint with a thin layer of code in the right place?

```
class UnitsAwareDataArray(xarray.DataArray):
    """Like xarray.DataArray, but transfers units"""

    def __array_wrap__(self, obj, context=None):
        new_var = super().__array_wrap__(obj, context)
        if self.attrs.get("units"):
            new_var.attrs["units"] = context[0](ureg(self.attrs.get("units"))).u
        return new_var

    @staticmethod
    def _binary_op(f, reflexive=False, join=None, **ignored_kwargs):
        # NB: http://stackoverflow.com/a/26807879/974555
        x = super(UnitsAwareDataArray, UnitsAwareDataArray)._binary_op(
            f, reflexive, join, **ignored_kwargs)
        # do stuff
        return x
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Hooks for custom attribute handling in xarray operations 173612265
282070342 https://github.com/pydata/xarray/issues/988#issuecomment-282070342 https://api.github.com/repos/pydata/xarray/issues/988 MDEyOklzc3VlQ29tbWVudDI4MjA3MDM0Mg== gerritholl 500246 2017-02-23T18:00:32Z 2017-02-23T18:00:46Z CONTRIBUTOR

Apparently __numpy_ufunc__ is too new for xarray, but it would appear that adding the right code to __array_wrap__ should work, i.e. if a units attribute is present and units are enabled through pint, evaluate something like new_var.attrs["units"] = context[0](1*ureg(self.attrs["units"])).u.
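On plain numpy (without xarray or pint), the __array_wrap__ mechanism looks roughly like this: an ndarray subclass whose units attribute survives ufunc calls. The unit handling here is deliberately naive attribute propagation, a sketch of the hook rather than real unit algebra:

```python
import numpy as np

class UnitArray(np.ndarray):
    """ndarray subclass carrying a plain-string `units` attribute through
    ufunc calls (sketch of the __array_wrap__ idea; no unit algebra)."""

    def __new__(cls, data, units=None):
        obj = np.asarray(data).view(cls)
        obj.units = units
        return obj

    def __array_finalize__(self, obj):
        # Called on view casting and new-from-template; copy the attribute.
        if obj is not None:
            self.units = getattr(obj, "units", None)

    def __array_wrap__(self, obj, context=None, return_scalar=False):
        # `context` is (ufunc, args, out_index); a units-aware version
        # would inspect it here, as suggested in the comment above.
        result = obj.view(type(self))
        result.units = self.units
        return result


a = UnitArray([1.0, 2.0], units="m")
b = np.negative(a)
print(type(b).__name__, b.units)  # UnitArray m
```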

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Hooks for custom attribute handling in xarray operations 173612265
282063849 https://github.com/pydata/xarray/issues/988#issuecomment-282063849 https://api.github.com/repos/pydata/xarray/issues/988 MDEyOklzc3VlQ29tbWVudDI4MjA2Mzg0OQ== gerritholl 500246 2017-02-23T17:37:18Z 2017-02-23T17:37:18Z CONTRIBUTOR

I would say using the units attribute is the most natural way to go. It could be optional and then built on top of pint, which would make it rather easy to implement:

```
# ureg is a pint unit registry
y = a/b
y.attrs["units"] = ureg(a.attrs["units"]) / ureg(b.attrs["units"])
```

which, if I understand the codebase correctly, could be added to DataArray._binary_op. I'm not sure whether it is similarly easy for ufuncs; is that what __numpy_ufunc__ would be for?
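Stripped of pint, the core of that binary-op rule is just exponent bookkeeping. A hypothetical minimal version (helper names invented here, not pint's API), representing a unit as a dict of base-unit exponents:

```python
def unit_mul(u, v):
    """Combine two units (dicts of base-unit exponents) under multiplication."""
    out = dict(u)
    for base, exp in v.items():
        out[base] = out.get(base, 0) + exp
        if out[base] == 0:
            del out[base]  # drop cancelled base units
    return out

def unit_div(u, v):
    """Combine two units under division: negate the divisor's exponents."""
    return unit_mul(u, {base: -exp for base, exp in v.items()})

# a in metres, b in seconds: a/b carries m/s
print(unit_div({"m": 1}, {"s": 1}))  # {'m': 1, 's': -1}
print(unit_div({"m": 1}, {"m": 1}))  # {} -- dimensionless
```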

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Hooks for custom attribute handling in xarray operations 173612265
276141026 https://github.com/pydata/xarray/issues/1238#issuecomment-276141026 https://api.github.com/repos/pydata/xarray/issues/1238 MDEyOklzc3VlQ29tbWVudDI3NjE0MTAyNg== gerritholl 500246 2017-01-30T18:05:36Z 2017-01-30T18:05:36Z CONTRIBUTOR

Is this a bug in pandas? I'm probably a rare breed in being an xarray user who has not used pandas.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  `set_index` converts string-dtype to object-dtype 203999231
275650861 https://github.com/pydata/xarray/issues/1199#issuecomment-275650861 https://api.github.com/repos/pydata/xarray/issues/1199 MDEyOklzc3VlQ29tbWVudDI3NTY1MDg2MQ== gerritholl 500246 2017-01-27T12:02:46Z 2017-01-27T12:02:46Z CONTRIBUTOR

Perhaps more broadly, documentation-wise it might be good to add a terminology list. For example, that could clarify the difference and relation between dimensions, labels, indices, coordinates, etc. There are dimensions without coordinates, dimensions that are labelled or unlabelled, coordinates that are indices, and coordinates that are not indices. I'm still figuring out how all of those relate to each other and how to use them.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Document the new __repr__ 200125945
275518064 https://github.com/pydata/xarray/issues/1199#issuecomment-275518064 https://api.github.com/repos/pydata/xarray/issues/1199 MDEyOklzc3VlQ29tbWVudDI3NTUxODA2NA== gerritholl 500246 2017-01-26T21:23:32Z 2017-01-26T21:23:32Z CONTRIBUTOR

With any kind of marking (such as with *) the problem is that the user might not know what the marking is for, and syntax is hard to google. When I see *x without whitespace I think of iterable unpacking...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Document the new __repr__ 200125945
275475809 https://github.com/pydata/xarray/issues/1199#issuecomment-275475809 https://api.github.com/repos/pydata/xarray/issues/1199 MDEyOklzc3VlQ29tbWVudDI3NTQ3NTgwOQ== gerritholl 500246 2017-01-26T18:49:41Z 2017-01-26T18:49:41Z CONTRIBUTOR

I think "Dimensions without coordinates" is clearer than "Unindexed dimensions", and only marginally more verbose (30 characters instead of 20). Any dimension can be indexed; it's just that the lookup is by position rather than by coordinate/label. I don't think marking the dimension/coordinate matches makes it any clearer, as this matching is by name anyway, and my confusion was due to none of the dimensions having coordinates. I would support simply changing the label.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Document the new __repr__ 200125945
275174484 https://github.com/pydata/xarray/issues/1229#issuecomment-275174484 https://api.github.com/repos/pydata/xarray/issues/1229 MDEyOklzc3VlQ29tbWVudDI3NTE3NDQ4NA== gerritholl 500246 2017-01-25T17:28:51Z 2017-01-25T17:28:51Z CONTRIBUTOR

That was quick! I was just studying the test suite to see where I would add a test for a fix :)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  opening NetCDF file fails with ValueError when time variable is multidimensional 203159853
271077863 https://github.com/pydata/xarray/issues/1194#issuecomment-271077863 https://api.github.com/repos/pydata/xarray/issues/1194 MDEyOklzc3VlQ29tbWVudDI3MTA3Nzg2Mw== gerritholl 500246 2017-01-07T11:24:49Z 2017-01-07T11:32:06Z CONTRIBUTOR

I don't see how an integer dtype could ever support missing values; float missing values are specifically defined by IEEE 754, but for ints every sequence of bits corresponds to a valid value. OTOH, NetCDF does have a `_FillValue` attribute that works for any type including int. If we view xarray as "NetCDF in memory" that could be an approach to follow, but for numpy in general it would fairly heavily break existing code (see also http://www.numpy.org/NA-overview.html), in particular for 8-bit types. If I understand correctly, R uses INT_MAX (which would be 127 for int8, though apparently R ints are always 32 bits). I'm new to xarray so I don't have a good idea of how much work adding support for masked arrays would be, but I'll take your word that it's not straightforward.
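The two routes the comment contrasts can be shown side by side: a NetCDF-style `_FillValue` sentinel kept in an integer array via `numpy.ma`, versus NaN, which forces promotion to float. The sentinel value below is an arbitrary choice for the example:

```python
import numpy as np

fill = -999  # arbitrary fill sentinel for this example
raw = np.array([1, 2, fill, 4], dtype=np.int32)

# Masked-array route: the dtype stays int32, the fill value is masked out.
masked = np.ma.masked_equal(raw, fill)

# NaN route: np.where with np.nan promotes the result to float64.
as_float = np.where(raw == fill, np.nan, raw)

print(masked.dtype, masked.sum())   # int32 7 (masked entry is skipped)
print(as_float.dtype)               # float64
```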

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use masked arrays while preserving int 199188476

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);