issues


14 rows where state = "closed" and user = 1200058 sorted by updated_at descending


type 1

  • issue 14

state 1

  • closed · 14 ✖

repo 1

  • xarray 14
id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
884649380 MDU6SXNzdWU4ODQ2NDkzODA= 5287 Support for pandas Extension Arrays Hoeze 1200058 closed 0     8 2021-05-10T17:00:17Z 2024-04-18T12:52:04Z 2024-04-18T12:52:04Z NONE      

Is your feature request related to a problem? Please describe. I started writing an ExtensionArray which is basically a Tuple[Array[str], Array[int], Array[int], Array[str], Array[str]]. Its scalar type is a Tuple[str, int, int, str, str].

This is working great in Pandas, I can read and write Parquet as well as csv with it. However, as soon as I'm using any .to_xarray() method, it gets converted to a NumPy array of objects. Also, converting back to Pandas keeps a Series of objects instead of my extension type.

Describe the solution you'd like Would it be possible to support Pandas Extension Types on coordinates? It's not necessary to compute anything on them, I'd just like to use them for dimensions.

Describe alternatives you've considered I was thinking over implementing a NumPy duck array, but I have never tried this and it looks quite complicated compared to the Pandas Extension types.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5287/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
384002323 MDU6SXNzdWUzODQwMDIzMjM= 2570 np.clip() executes eagerly Hoeze 1200058 closed 0     4 2018-11-24T16:25:03Z 2023-12-03T05:29:17Z 2023-12-03T05:29:17Z NONE      

Example:

    x = xr.DataArray(np.random.uniform(size=[100, 100])).chunk(10)
    x

    <xarray.DataArray (dim_0: 100, dim_1: 100)>
    dask.array<shape=(100, 100), dtype=float64, chunksize=(10, 10)>
    Dimensions without coordinates: dim_0, dim_1

    np.clip(x, 0, 0.5)

    <xarray.DataArray (dim_0: 100, dim_1: 100)>
    array([[0.264276, 0.32227 , 0.336396, ..., 0.110182, 0.28255 , 0.399041],
           [0.5     , 0.030289, 0.5     , ..., 0.428923, 0.262249, 0.5     ],
           [0.5     , 0.5     , 0.280971, ..., 0.427334, 0.026649, 0.5     ],
           ...,
           [0.5     , 0.5     , 0.294943, ..., 0.053143, 0.5     , 0.488239],
           [0.5     , 0.341485, 0.5     , ..., 0.5     , 0.250441, 0.5     ],
           [0.5     , 0.156285, 0.179123, ..., 0.5     , 0.076242, 0.319699]])
    Dimensions without coordinates: dim_0, dim_1

    x.clip(0, 0.5)

    <xarray.DataArray (dim_0: 100, dim_1: 100)>
    dask.array<shape=(100, 100), dtype=float64, chunksize=(10, 10)>
    Dimensions without coordinates: dim_0, dim_1

Problem description

Calling np.clip() on a dask-backed DataArray computes the result immediately, while xr.DataArray.clip() keeps it lazy.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2570/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  not_planned xarray 13221727 issue
481838855 MDU6SXNzdWU0ODE4Mzg4NTU= 3224 Add "on"-parameter to "merge" method Hoeze 1200058 closed 0     2 2019-08-17T02:44:46Z 2022-04-18T15:57:09Z 2022-04-18T15:57:09Z NONE      

I'd like to propose a change to the merge method.

I often run into cases where I'd like to merge subsets of the same dataset. However, this currently requires renaming all dimensions, changing indices and merging them by hand.

As an example, please consider the following dataset:

    Dimensions:       (genes: 8787, observations: 8166)
    Coordinates:
      * observations  (observations) object 'GTEX-111CU-1826-SM-5GZYN' ... 'GTEX-ZXG5-0005-SM-57WCN'
      * genes         (genes) object 'ENSG00000227232' ... 'ENSG00000198727'
        individual    (observations) object 'GTEX-111CU' ... 'GTEX-ZXG5'
        subtissue     (observations) object 'Adipose_Subcutaneous' ... 'Whole_Blood'
    Data variables:
        cdf           (observations, genes) float32 0.18883839 ... 0.4876754
        l2fc          (observations, genes) float32 -0.21032093 ... -0.032540113
        padj          (observations, genes) float32 1.0 1.0 1.0 ... 1.0 1.0 1.0

There is at most one observation for each subtissue and individual.

Now, I'd like to plot all values in subtissue == "Whole_Blood" against subtissue == "Adipose_Subcutaneous". Therefore, I have to join all "Whole_Blood" observations with all "Adipose_Subcutaneous" observations by the "individual" coordinate.

To simplify this task, I'd like to have the following abstraction:
```python3
# select tissues
tissue_1 = ds.sel(observations = (ds.subtissue == "Whole_Blood"))
tissue_2 = ds.sel(observations = (ds.subtissue == "Adipose_Subcutaneous"))

# inner join by individual
merged = tissue_1.merge(tissue_2, on="individual", newdim="merge_dim", join="inner")

print(merged)
```
The result should look like this:
```
Dimensions:         ("genes": 8787, "individual": 286)
Coordinates:
  * genes           (genes) object 'ENSG00000227232' ... 'ENSG00000198727'
  * merge_dim       (merge_dim) object 'GTEX-111CU' ... 'GTEX-ZXG5'
    observations:1  (merge_dim) object 'GTEX-111CU-1826-SM-5GZYN' ... 'GTEX-ZXG5-1826-SM-5GZYN'
    observations:2  (merge_dim) object 'GTEX-111CU-0005-SM-57WCN' ... 'GTEX-ZXG5-0005-SM-57WCN'
    subtissue:1     (merge_dim) object 'Whole_Blood' ... 'Whole_Blood'
    subtissue:2     (merge_dim) object 'Adipose_Subcutaneous' ... 'Adipose_Subcutaneous'
Data variables:
    cdf:1           (merge_dim, genes) float32 0.18883839 ... 0.4876754
    cdf:2           (merge_dim, genes) float32 ...
    l2fc:1          (merge_dim, genes) float32 -0.21032093 ... -0.032540113
    l2fc:2          (merge_dim, genes) float32 ...
    padj:1          (merge_dim, genes) float32 1.0 1.0 1.0 ... 1.0 1.0 1.0
    padj:2          (merge_dim, genes) float32 ...
```


To summarize, I'd propose the following changes:
- Add parameter on: Union[str, List[str], Tuple[str], Dict[str, str]]. This should specify one or multiple coordinates which should be merged.
  - Simple merge: string => merge by left[str] and right[str]
  - Merge of multiple coords: list or tuple of strings => merge by left[str1, str2, ...] and right[str1, str2, ...]
  - To merge differently named coords: dict, e.g. {"str_left": "str_right"} => merge by left[str_left] and right[str_right]
- Add some parameter like newdim to specify the newly created index dimension. If on specifies multiple coords, this new index dimension should be a multi-index of these coords.
- Rename all duplicate coordinates not specified in on to some unique name, e.g. left["cdf"] => merged["cdf:1"] and right["cdf"] => merged["cdf:2"]

If the coordinates given in the on parameter do not unambiguously identify each data point, they should be combined in a cross-product manner. However, since this could cause quadratic runtime and memory requirements, I am not sure how this can be handled safely.

What do you think about this addition?
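
For comparison, pandas already offers this kind of join for tabular data. The following is only a rough sketch of the proposed semantics using `pandas.DataFrame.merge`, not an xarray API. The two small tables and their values are made up for illustration; only the `:1` / `:2` suffix convention is taken from the proposal above.

```python
import pandas as pd

# Hypothetical per-tissue tables; the values are invented for illustration.
tissue_1 = pd.DataFrame({
    "individual": ["A", "B", "C"],
    "observations": ["A-blood", "B-blood", "C-blood"],
    "cdf": [0.1, 0.2, 0.3],
})
tissue_2 = pd.DataFrame({
    "individual": ["B", "C", "D"],
    "observations": ["B-adipose", "C-adipose", "D-adipose"],
    "cdf": [0.5, 0.6, 0.7],
})

# Inner join on "individual"; column names present on both sides
# (other than the join key) receive the ":1" / ":2" suffixes.
merged = tissue_1.merge(
    tissue_2, on="individual", how="inner", suffixes=(":1", ":2")
)
print(merged)
```

If `individual` is not unique on either side, pandas pairs every matching left row with every matching right row, which is the cross-product behaviour (and cost) mentioned above.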

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3224/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
510844652 MDU6SXNzdWU1MTA4NDQ2NTI= 3432 Scalar slice of MultiIndex is turned to tuples Hoeze 1200058 closed 0     5 2019-10-22T18:55:52Z 2022-03-17T17:11:41Z 2022-03-17T17:11:41Z NONE      

Today I updated to v0.14 of xarray and it broke some of my code.

I tried to select one observation of the following dataset:

    <xarray.Dataset>
    Dimensions:       (genes: 31523, observations: 236)
    Coordinates:
      * genes         (genes) object 'ENSG00000227232' ... 'ENSG00000232254'
      * observations  (observations) MultiIndex
      - individual    (observations) object 'GTEX-111YS' ... 'GTEX-ZXG5'
      - subtissue     (observations) object 'Whole_Blood' ... 'Whole_Blood'
    Data variables:
        [...]

ds.isel(observations=1):

    <xarray.Dataset>
    Dimensions:       (genes: 31523)
    Coordinates:
      * genes         (genes) object 'ENSG00000227232' ... 'ENSG00000232254'
        observations  object ('GTEX-1122O', 'Whole_Blood')
    Data variables:
        [...]

As you can see, observations is now a tuple of ('GTEX-1122O', 'Whole_Blood'). However, the individual and the subtissue should be kept as coordinates.

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.7.4 (default, Aug 13 2019, 20:35:49) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: 3.10.0-514.16.1.el7.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.10.4 libnetcdf: 4.6.1 xarray: 0.14.0 pandas: 0.25.1 numpy: 1.17.2 scipy: 1.3.1 netCDF4: 1.4.2 pydap: None h5netcdf: 0.7.4 h5py: 2.9.0 Nio: None zarr: 2.3.2 cftime: 1.0.3.4 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2.5.2 distributed: 2.5.2 matplotlib: 3.1.1 cartopy: None seaborn: 0.9.0 numbagg: None setuptools: 41.4.0 pip: 19.2.3 conda: None pytest: 5.0.1 IPython: 7.8.0 sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3432/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
325475880 MDU6SXNzdWUzMjU0NzU4ODA= 2173 Formatting error in conjunction with pandas.DataFrame Hoeze 1200058 closed 0     6 2018-05-22T21:49:24Z 2021-04-13T15:04:51Z 2021-04-13T15:04:51Z NONE      

Code Sample, a copy-pastable example if possible

```python
import pandas as pd
import numpy as np
import xarray as xr

sample_data = np.random.uniform(size=[2,2000,10000])
x = xr.Dataset({"sample_data": (sample_data.shape, sample_data)})
print(x)

df = pd.DataFrame({"x": [1,2,3], "y": [2,4,6]})
x["df"] = df
print(x)
```

Problem description

Printing an xarray.Dataset raises an error when it contains a pandas.DataFrame.

Expected Output

Should print string representation of Dataset

Output of xr.show_versions()

/usr/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`. from ._conv import register_converters as _register_converters INSTALLED VERSIONS ------------------ commit: None python: 3.6.5.final.0 python-bits: 64 OS: Linux OS-release: 4.16.9-1-ARCH machine: x86_64 processor: byteorder: little LC_ALL: None LANG: de_DE.UTF-8 LOCALE: de_DE.UTF-8 xarray: 0.10.4 pandas: 0.22.0 numpy: 1.14.3 scipy: 1.0.1 netCDF4: None h5netcdf: 0.5.1 h5py: 2.7.1 Nio: None zarr: None bottleneck: 1.2.1 cyordereddict: None dask: None distributed: None matplotlib: 2.2.2 cartopy: None seaborn: None setuptools: 39.2.0 pip: 10.0.1 conda: None pytest: None IPython: 6.3.1 sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2173/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
511498714 MDU6SXNzdWU1MTE0OTg3MTQ= 3438 Re-indexing causes coordinates to be dropped Hoeze 1200058 closed 0     2 2019-10-23T18:31:18Z 2020-01-09T01:46:46Z 2020-01-09T01:46:46Z NONE      

Hi, I encounter a problem with the index being dropped when I rename a dimension and stack it afterwards:

MCVE Code Sample

```python
ds = xr.Dataset({
    "test": xr.DataArray(
        [[[1,2],[3,4]], [[1,2],[3,4]]],
        dims=("genes", "observations", "subtissues"),
        coords={
            "observations": xr.DataArray(["x-1", "y-1"], dims=("observations",)),
            "individuals": xr.DataArray(["x", "y"], dims=("observations",)),
            "genes": xr.DataArray(["a", "b"], dims=("genes",)),
            "subtissues": xr.DataArray(["c", "d"], dims=("subtissues",)),
        }
    )
})
```

`individuals` is set here:
```python3
print(ds.rename_dims(observations="individuals"))
<xarray.Dataset>
Dimensions:       (genes: 2, individuals: 2, subtissues: 2)
Coordinates:
    observations  (individuals) <U3 'x-1' 'y-1'
  * individuals   (individuals) <U1 'x' 'y'
  * genes         (genes) <U1 'a' 'b'
  * subtissues    (subtissues) <U1 'c' 'd'
Data variables:
    test          (genes, individuals, subtissues) int64 1 2 3 4 1 2 3 4
```

Stacking caused `individuals` to disappear and be replaced with integers:
```python3
print(ds.rename_dims(observations="individuals").stack(observations=["individuals", "genes"]))
<xarray.Dataset>
Dimensions:       (observations: 4, subtissues: 2)
Coordinates:
  * observations  (observations) MultiIndex
  - individuals   (observations) int64 0 0 1 1
  - genes         (observations) object 'a' 'b' 'a' 'b'
  * subtissues    (subtissues) <U1 'c' 'd'
Data variables:
    test          (subtissues, observations) int64 1 1 3 3 2 2 4 4
```

Explicitly setting `individuals` keeps them correctly after stacking:
```python3
print(ds.rename_dims(observations="individuals").set_index({"individuals": "individuals"}).set_coords("individuals").stack(observations=["individuals", "genes"]))
<xarray.Dataset>
Dimensions:       (observations: 4, subtissues: 2)
Coordinates:
  * observations  (observations) MultiIndex
  - individuals   (observations) object 'x' 'x' 'y' 'y'
  - genes         (observations) object 'a' 'b' 'a' 'b'
  * subtissues    (subtissues) <U1 'c' 'd'
Data variables:
    test          (subtissues, observations) int64 1 1 3 3 2 2 4 4
```

Is this intentional?

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.7.4 (default, Aug 13 2019, 20:35:49) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: 3.10.0-957.10.1.el7.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.10.4 libnetcdf: 4.6.1 xarray: 0.14.0 pandas: 0.25.1 numpy: 1.17.2 scipy: 1.3.1 netCDF4: 1.4.2 pydap: None h5netcdf: 0.7.4 h5py: 2.9.0 Nio: None zarr: 2.3.2 cftime: 1.0.3.4 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2.5.2 distributed: 2.5.2 matplotlib: 3.1.1 cartopy: None seaborn: 0.9.0 numbagg: None setuptools: 41.4.0 pip: 19.2.3 conda: None pytest: 5.0.1 IPython: 7.8.0 sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3438/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
522238536 MDU6SXNzdWU1MjIyMzg1MzY= 3518 Have "unstack" return a boolean mask? Hoeze 1200058 closed 0     1 2019-11-13T13:54:49Z 2019-11-16T14:36:43Z 2019-11-16T14:36:43Z NONE      

MCVE Code Sample

    arr = xr.DataArray(np.arange(6).reshape(2, 3),
                       coords=[('x', ['a', 'b']), ('y', [0, 1, 2])])
    arr
    stacked = arr.stack(z=('x', 'y'))
    stacked[:4].unstack().dtype

Expected Output

```python
>>> arr = xr.DataArray(np.arange(6).reshape(2, 3),
...                    coords=[('x', ['a', 'b']), ('y', [0, 1, 2])])
>>> arr
<xarray.DataArray (x: 2, y: 3)>
array([[0, 1, 2],
       [3, 4, 5]])
Coordinates:
  * x        (x) <U1 'a' 'b'
  * y        (y) int64 0 1 2
>>> stacked = arr.stack(z=('x', 'y'))
>>> stacked[:4].unstack().dtype
dtype('float64')
```

Problem Description

Unstacking changes the data type to float so that missing entries can be represented as NaN. Are there thoughts on alternative options, e.g. fill_value=0 or return_boolean_mask, in order to retain the original data type?

Currently, I obtain a boolean mask of the missing entries by checking for isnan. Then I call fillna(0) and convert the data type back to integer. However, this is quite inefficient.
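
The workaround described above can be sketched roughly as follows. To stay self-contained, this uses a plain NumPy array as a stand-in for the unstacked result (an assumption; the real object would be an xarray.DataArray), and the names `unstacked`, `missing` and `restored` are purely illustrative.

```python
import numpy as np

# Stand-in for stacked[:4].unstack(): the two missing entries became NaN,
# which forced the whole array to float64.
unstacked = np.array([[0.0, 1.0, 2.0],
                      [3.0, np.nan, np.nan]])

# Boolean mask marking the missing entries.
missing = np.isnan(unstacked)

# Fill the gaps with 0 and cast back to the original integer dtype.
restored = np.where(missing, 0, unstacked).astype(np.int64)

print(missing)
print(restored, restored.dtype)
```

This needs one pass to build the mask and further temporary arrays for the fill and the cast, which is the overhead the issue is about.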

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.7.4 (default, Aug 13 2019, 20:35:49) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: 3.10.0-957.10.1.el7.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.10.4 libnetcdf: 4.6.1 xarray: 0.14.0 pandas: 0.25.1 numpy: 1.17.2 scipy: 1.3.1 netCDF4: 1.4.2 pydap: None h5netcdf: 0.7.4 h5py: 2.9.0 Nio: None zarr: 2.3.2 cftime: 1.0.3.4 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2.5.2 distributed: 2.5.2 matplotlib: 3.1.1 cartopy: None seaborn: 0.9.0 numbagg: None setuptools: 41.4.0 pip: 19.2.3 conda: None pytest: 5.0.1 IPython: 7.8.0 sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3518/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
491172429 MDU6SXNzdWU0OTExNzI0Mjk= 3296 [Docs] parameters + data type broken Hoeze 1200058 closed 0     2 2019-09-09T15:36:40Z 2019-09-09T15:41:36Z 2019-09-09T15:40:29Z NONE      

Hi, this has been present for some time and I could not find a corresponding issue:

The documentation format seems to be broken. Parameter name and data type are rendered without any separation.
Source: http://xarray.pydata.org/en/stable/generated/xarray.apply_ufunc.html

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3296/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
338226520 MDU6SXNzdWUzMzgyMjY1MjA= 2267 Some simple broadcast_dim method? Hoeze 1200058 closed 0     9 2018-07-04T10:48:27Z 2019-07-06T13:06:45Z 2019-07-06T13:06:45Z NONE      

I've already found xr.broadcast(arrays). However, I'd like to just add a new dimension with a specific size to one DataArray. I could not find any simple option to do this.

If there is no such option:
- add a size parameter to DataArray.expand_dims?
- DataArray.broadcast_dims({"a": M, "b": N})?
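
As a point of reference, this is roughly what the requested operation looks like on a bare NumPy array, using `np.broadcast_to`. It is only a sketch: the sizes `M` and `N` and the variable names are placeholders, and dimension names are not handled because plain NumPy arrays do not carry them.

```python
import numpy as np

M, N = 4, 3          # placeholder sizes for the new dimensions "a" and "b"
arr = np.arange(5)   # existing 1-D data of shape (5,)

# Prepend two new axes of sizes M and N. The result is a read-only view
# onto the original buffer, so no data is copied.
expanded = np.broadcast_to(arr, (M, N) + arr.shape)

print(expanded.shape)
```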

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2267/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
337619718 MDU6SXNzdWUzMzc2MTk3MTg= 2263 [bug] Exception ignored in generator object Variable Hoeze 1200058 closed 0     9 2018-07-02T18:30:57Z 2019-01-23T00:56:19Z 2019-01-23T00:56:18Z NONE      

X is an xr.DataArray of shape (2000, 100), idx is a np.ndarray vector of shape (500,). Now I'd like to fetch:
```python
X[idx].values
```

Problem description

During this, the following warning pops up:
```
Exception ignored in: <generator object Variable._broadcast_indexes.<locals>.<genexpr> at 0x7fcdd479f1a8>
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/xarray/core/variable.py", line 470, in <genexpr>
    if all(isinstance(k, BASIC_INDEXING_TYPES) for k in key):
SystemError: error return without exception set
```

Expected Output

No error

Possible solution:

Each time I execute: all(not isinstance(k, Variable) for k in key) the error pops up at a Tuple Iterator. Changing this to all([not isinstance(k, Variable) for k in key]) solves the problem.
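
For reference, both forms compute the same boolean; the only difference is that the first hands a lazy generator to `all`, while the second builds the complete list first. A minimal sketch, where `Variable` is a dummy stand-in class and `key` is a made-up indexing tuple (neither is xarray's actual object):

```python
class Variable:
    """Dummy stand-in for xarray's Variable, for illustration only."""


key = (slice(None), 3, [0, 1, 2])  # made-up indexing key

# Generator expression: evaluated lazily, stops at the first falsy item.
lazy = all(not isinstance(k, Variable) for k in key)

# List comprehension: every item is checked before all() runs.
eager = all([not isinstance(k, Variable) for k in key])

print(lazy, eager)
```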

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.6.5.final.0 python-bits: 64 OS: Linux OS-release: 4.15.0-23-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: de_DE.UTF-8 LOCALE: de_DE.UTF-8 xarray: 0.10.7 pandas: 0.23.1 numpy: 1.14.5 scipy: 1.1.0 netCDF4: 1.3.1 h5netcdf: 0.5.1 h5py: 2.8.0 Nio: None zarr: None bottleneck: None cyordereddict: None dask: 0.17.5 distributed: None matplotlib: 2.2.2 cartopy: None seaborn: 0.8.1 setuptools: 39.2.0 pip: 9.0.1 conda: None pytest: None IPython: 6.4.0 sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2263/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
336220647 MDU6SXNzdWUzMzYyMjA2NDc= 2253 autoclose=True is not implemented for the h5netcdf backend Hoeze 1200058 closed 0     2 2018-06-27T13:03:44Z 2019-01-13T01:38:24Z 2019-01-13T01:38:24Z NONE      

Hi, are there any plans to enable autoclose=True for h5netcdf? I'd like to use "open_mfdataset" for 15,000 files, which is not possible with autoclose=False.

Error message:

    Traceback (most recent call last):
      File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2963, in run_code
        exec(code_obj, self.user_global_ns, self.user_ns)
      File "<ipython-input-7-43f56b648d8a>", line 16, in <module>
        autoclose=True
      File "/usr/local/lib/python3.6/dist-packages/xarray/backends/api.py", line 624, in open_mfdataset
        datasets = [open_(p, **open_kwargs) for p in paths]
      File "/usr/local/lib/python3.6/dist-packages/xarray/backends/api.py", line 624, in <listcomp>
        datasets = [open_(p, **open_kwargs) for p in paths]
      File "/usr/local/lib/python3.6/dist-packages/xarray/backends/api.py", line 331, in open_dataset
        **backend_kwargs)
      File "/usr/local/lib/python3.6/dist-packages/xarray/backends/h5netcdf_.py", line 84, in __init__
        raise NotImplementedError('autoclose=True is not implemented '
    NotImplementedError: autoclose=True is not implemented for the h5netcdf backend pending further exploration, e.g., bug fixes (in h5netcdf?)

Output of xr.show_versions()

>>> xr.show_versions() INSTALLED VERSIONS ------------------ commit: None python: 3.6.5.final.0 python-bits: 64 OS: Linux OS-release: 4.15.0-23-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: de_DE.UTF-8 LOCALE: de_DE.UTF-8 xarray: 0.10.7 pandas: 0.23.1 numpy: 1.14.5 scipy: 1.1.0 netCDF4: 1.4.0 h5netcdf: 0.5.1 h5py: 2.8.0 Nio: None zarr: None bottleneck: None cyordereddict: None dask: 0.17.5 distributed: None matplotlib: 2.2.2 cartopy: None seaborn: 0.8.1 setuptools: 39.2.0 pip: 9.0.1 conda: None pytest: None IPython: 6.4.0 sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2253/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
378326194 MDU6SXNzdWUzNzgzMjYxOTQ= 2549 to_dask_dataframe for xr.DataArray Hoeze 1200058 closed 0     2 2018-11-07T15:02:22Z 2018-11-07T16:27:56Z 2018-11-07T16:27:56Z NONE      

Is there some xr.DataArray.to_dask_dataframe() method?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2549/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
325470877 MDU6SXNzdWUzMjU0NzA4Nzc= 2172 Errors on pycharm completion Hoeze 1200058 closed 0     2 2018-05-22T21:31:42Z 2018-05-27T20:48:30Z 2018-05-27T20:48:30Z NONE      

Code Sample, a copy-pastable example if possible

```python
# execute:
import numpy as np
import xarray as xr

sample_data = np.random.uniform(size=[2,2000,10000])
x = xr.Dataset({"sample_data": (sample_data.shape, sample_data)})

x2 = x["sample_data"]

# now type by hand:
x2.
```

Problem description

I'm not completely sure if it's an xarray problem, but each time I enter [some dataset]. (note the trailing dot) inside PyCharm's Python console, I get a Python exception instead of autocompletion.

Expected Output

Output of xr.show_versions()

/usr/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`. from ._conv import register_converters as _register_converters INSTALLED VERSIONS ------------------ commit: None python: 3.6.5.final.0 python-bits: 64 OS: Linux OS-release: 4.16.9-1-ARCH machine: x86_64 processor: byteorder: little LC_ALL: None LANG: de_DE.UTF-8 LOCALE: de_DE.UTF-8 xarray: 0.10.4 pandas: 0.22.0 numpy: 1.14.3 scipy: 1.0.1 netCDF4: None h5netcdf: 0.5.1 h5py: 2.7.1 Nio: None zarr: None bottleneck: 1.2.1 cyordereddict: None dask: None distributed: None matplotlib: 2.2.2 cartopy: None seaborn: None setuptools: 39.2.0 pip: 10.0.1 conda: None pytest: None IPython: 6.3.1 sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2172/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
325889600 MDU6SXNzdWUzMjU4ODk2MDA= 2177 Dataset.to_netcdf() cannot create group with engine="h5netcdf" Hoeze 1200058 closed 0     1 2018-05-23T22:03:07Z 2018-05-25T00:52:07Z 2018-05-25T00:52:07Z NONE      

Code Sample, a copy-pastable example if possible

```python
import pandas as pd
import numpy as np
import xarray as xr

sample_data = np.random.uniform(size=[2,2000,10000])
x = xr.Dataset({"sample_data": (("x", "y", "z"), sample_data)})

df = pd.DataFrame({"x": [1,2,3], "y": [2,4,6]})
x["df"] = df
print(x)

# not working:
x.to_netcdf("test.h5", group="asdf", engine="h5netcdf")

# working:
x.to_netcdf("test.h5", group="asdf", engine="netcdf4")
```

Problem description

h5netcdf does not allow creating groups

Expected Output

should save data to "test.h5"

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.6.5.final.0 python-bits: 64 OS: Linux OS-release: 4.16.9-1-ARCH machine: x86_64 processor: byteorder: little LC_ALL: None LANG: de_DE.UTF-8 LOCALE: de_DE.UTF-8 xarray: 0.10.4 pandas: 0.22.0 numpy: 1.14.3 scipy: 1.0.1 netCDF4: 1.3.1 h5netcdf: 0.5.1 h5py: 2.7.1 Nio: None zarr: None bottleneck: 1.2.1 cyordereddict: None dask: None distributed: None matplotlib: 2.2.2 cartopy: None seaborn: None setuptools: 39.2.0 pip: 10.0.1 conda: None pytest: None IPython: 6.3.1 sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2177/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
Powered by Datasette · Queries took 6717.12ms · About: xarray-datasette