
issue_comments


37 rows where author_association = "NONE" and user = 1200058 sorted by updated_at descending




issue 25

  • [bug] Exception ignored in generator object Variable 4
  • Some simple broadcast_dim method? 4
  • Sparse arrays 3
  • Boolean indexing with multi-dimensional key arrays 2
  • [Feature Request] Visualizing dimensions 2
  • [proposal] concatenate by axis, ignore dimension names 2
  • N-dimensional boolean indexing 2
  • Support __matmul__ operator (@) 1
  • Use masked arrays while preserving int 1
  • Many methods are broken (e.g., concat/stack/sortby) when using repeated dimensions 1
  • Explicit indexes in xarray's data-model (Future of MultiIndex) 1
  • Append along an unlimited dimension to an existing netCDF file 1
  • Slow performance of isel 1
  • autoclose=True is not implemented for the h5netcdf backend 1
  • to_dask_dataframe for xr.DataArray 1
  • np.clip() executes eagerly 1
  • cov() and corr() 1
  • Stack() & unstack() issues on Multindex 1
  • [Docs] parameters + data type broken 1
  • Scalar slice of MultiIndex is turned to tuples 1
  • [feature request] __iter__() for rolling-window on datasets 1
  • Error when writing string coordinate variables to zarr 1
  • cov() and corr() - finalization 1
  • Implement `value_counts` method 1
  • Make creating a MultiIndex in stack optional 1

user 1

  • Hoeze · 37

author_association 1

  • NONE · 37
id html_url issue_url node_id user created_at updated_at author_association body reactions performed_via_github_app issue
844486483 https://github.com/pydata/xarray/issues/5179#issuecomment-844486483 https://api.github.com/repos/pydata/xarray/issues/5179 MDEyOklzc3VlQ29tbWVudDg0NDQ4NjQ4Mw== Hoeze 1200058 2021-05-19T21:27:17Z 2021-05-19T21:27:17Z NONE

fyi, I updated the boolean indexing to support additional or missing dimensions: https://gist.github.com/Hoeze/96616ef9d179180b0b7de97c97e00a27 I'm using this on a 4D array with >300 GB of data to flatten three of the four dimensions, and it works even on a machine with 64 GB of RAM.
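
A minimal sketch of the underlying idea (not the gist's actual implementation; the array, mask, and dimension names here are made up):

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.random.rand(2, 3, 4, 5), dims=("a", "b", "c", "d"))
mask = da.sum("d") > 2.0                        # boolean key over three of the four dims

flat = da.stack(sample=("a", "b", "c"))         # flatten the masked dimensions
flat_mask = mask.stack(sample=("a", "b", "c"))  # same flattening for the key
selected = flat.isel(sample=flat_mask.values)   # keep only the True positions
```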

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  N-dimensional boolean indexing  860418546
841692487 https://github.com/pydata/xarray/issues/3476#issuecomment-841692487 https://api.github.com/repos/pydata/xarray/issues/3476 MDEyOklzc3VlQ29tbWVudDg0MTY5MjQ4Nw== Hoeze 1200058 2021-05-15T16:56:00Z 2021-05-15T17:03:00Z NONE

Hi, I also keep running into this issue all the time. Right now, there is no way of round-tripping xr.open_zarr().to_zarr(), also because of https://github.com/pydata/xarray/issues/5219.

The only workaround that seems to help is the following:

```python
to_store = xrds.copy()
for var in to_store.variables:
    to_store[var].encoding.clear()
```

{
    "total_count": 3,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 1
}
  Error when writing string coordinate variables to zarr 516306758
825488224 https://github.com/pydata/xarray/issues/5202#issuecomment-825488224 https://api.github.com/repos/pydata/xarray/issues/5202 MDEyOklzc3VlQ29tbWVudDgyNTQ4ODIyNA== Hoeze 1200058 2021-04-23T08:21:44Z 2021-04-23T08:22:34Z NONE

It's a large problem when working with Dask/Zarr:
- First, it loads all indices into memory
- Then, it computes the MultiIndex in a single thread

I had cases where stacking the dimensions took ~15 minutes, while computing and saving the dataset finished in under a minute.
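
For reference, a minimal sketch of the requested opt-out (assuming xarray >= 2022.06, which added a create_index flag to stack(); dimension names are illustrative):

```python
import numpy as np
import xarray as xr

ds = xr.Dataset({"x": (("dim_0", "dim_1"), np.zeros((1000, 1000)))})
# skips building the expensive pandas MultiIndex entirely
stacked = ds.stack(sample=("dim_0", "dim_1"), create_index=False)
```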

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Make creating a MultiIndex in stack optional 864249974
824782830 https://github.com/pydata/xarray/issues/1887#issuecomment-824782830 https://api.github.com/repos/pydata/xarray/issues/1887 MDEyOklzc3VlQ29tbWVudDgyNDc4MjgzMA== Hoeze 1200058 2021-04-22T12:08:45Z 2021-04-22T12:11:55Z NONE

Current proposal ("stack"): da[key] returns the flattened selection, with a dimension named after the key (and probably no MultiIndex):

```python
In [86]: da.values[key.values]  # but the xarray version of this
Out[86]: array([0, 3, 6, 9])
```

The part about this new proposal that is most annoying is that the key needs a name, which we can use to name the new dimension. That's not too hard to do, but it is a little annoying -- in practice you would have to write something like da[key.rename('key_name')] much of the time to make this work.

IMO, the perfect solution would be masking support, i.e. da[key] would return the same array with an additional variable da.mask == key:

```python
In [87]: da[key]
Out[87]:
<xarray.DataArray (a: 3, b: 4)>
array([[   0, <NA>, <NA>,    3],
       [<NA>, <NA>,    6, <NA>],
       [<NA>,    9, <NA>, <NA>]])
dtype: int
Dimensions without coordinates: a, b
```

Then we could have something like da[key].stack(new_dim=["a", "b"], dropna=True):

```python
In [88]: da[key].stack(new_dim=["a", "b"], dropna=True)
Out[88]:
<xarray.DataArray (new_dim: 4)>
array([0, 3, 6, 9])
Coordinates:
    a    (new_dim) int64 0 0 1 2
    b    (new_dim) int64 0 3 2 1
Dimensions without coordinates: new_dim
```

Here, dropna=True would avoid creating the cross-product of a and b.

Also, that would avoid all those unnecessary float casts for free.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Boolean indexing with multi-dimensional key arrays 294241734
822122172 https://github.com/pydata/xarray/issues/1603#issuecomment-822122172 https://api.github.com/repos/pydata/xarray/issues/1603 MDEyOklzc3VlQ29tbWVudDgyMjEyMjE3Mg== Hoeze 1200058 2021-04-19T02:18:58Z 2021-04-19T02:19:24Z NONE

Many array types do have implicit indices. For example, sparse arrays have their coordinates / CSR representation as the primary index (.sel()), while a dense array's primary index is the position (.isel()). Every labeled dimension is therefore just a separate mapping from a string to the index position in the array.

Going one step further, one could have continuous dimensions where positional indexing (.isel()) does not really make sense. TileDB's dimensions provide an example of this.

=> Having explicit and implicit indices on arrays would be awesome, even if they don't support all xarray features!
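
A tiny illustration of the two kinds of index discussed here (values made up):

```python
import xarray as xr

da = xr.DataArray([10, 20, 30], dims="x", coords={"x": [2, 5, 7]})
da.isel(x=1)  # positional ("implicit") indexing -> 20
da.sel(x=5)   # label-based ("explicit") indexing -> 20
```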

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Explicit indexes in xarray's data-model (Future of MultiIndex) 262642978
821881984 https://github.com/pydata/xarray/issues/5179#issuecomment-821881984 https://api.github.com/repos/pydata/xarray/issues/5179 MDEyOklzc3VlQ29tbWVudDgyMTg4MTk4NA== Hoeze 1200058 2021-04-17T20:22:13Z 2021-04-17T20:27:25Z NONE

@max-sixty The reason is that my method is basically a special case of point-wise indexing: http://xarray.pydata.org/en/stable/indexing.html#more-advanced-indexing You can get the same result by calling:

```python
core_dim_locs = {key: value for key, value in core_dim_locs_from_cond(mask, new_dim_name="newdim")}

# pointwise selection
data.sel(
    dim_0=outliers_subset["dim_0"],
    dim_1=outliers_subset["dim_1"],
    dim_2=outliers_subset["dim_2"],
)
```

(Note that you lose chunk information with this method; that's why it is less efficient.)

When you want to select random items from an N-dimensional array, you can either model the result as some sparse array or stack the dimensions. (OK, stacking the dimensions also amounts to a sparse COO encoding...)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  N-dimensional boolean indexing  860418546
709289668 https://github.com/pydata/xarray/issues/2933#issuecomment-709289668 https://api.github.com/repos/pydata/xarray/issues/2933 MDEyOklzc3VlQ29tbWVudDcwOTI4OTY2OA== Hoeze 1200058 2020-10-15T12:37:10Z 2020-10-15T12:37:10Z NONE

Is there a way without unstacking as well? (Unstacking can be quite wasteful.)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Stack() & unstack() issues on Multindex 438947247
610407293 https://github.com/pydata/xarray/issues/3945#issuecomment-610407293 https://api.github.com/repos/pydata/xarray/issues/3945 MDEyOklzc3VlQ29tbWVudDYxMDQwNzI5Mw== Hoeze 1200058 2020-04-07T14:06:03Z 2020-04-07T14:17:12Z NONE

First prototype:

```python
import dask
import dask.array as da
import numpy as np
import xarray as xr


def value_counts(v, global_unique_values, newdim: str):
    unique_values, counts = dask.compute(*np.unique(v, return_counts=True))

    # find out where in `global_unique_values` the unique values of `v` are located
    _, idx1, idx2 = np.intersect1d(unique_values, global_unique_values, return_indices=True)

    # assign counts according to `global_unique_values`
    retval = np.zeros_like(global_unique_values)
    retval[idx2] = counts[idx1]

    # ## alternative:
    # counts = xr.DataArray(
    #     counts,
    #     dims=[newdim],
    #     coords={newdim: unique_values},
    # )
    # counts, = xr.align(counts, indexes={newdim: global_unique_values}, fill_value=0)

    return retval


def xr_value_counts(obj, unique_values=None, **kwargs):
    (newdim, apply_dims), = kwargs.items()

    if type(apply_dims) == str:
        # convert scalars to list
        apply_dims = [apply_dims]
    if type(apply_dims) != list:
        # cast iterables to list
        apply_dims = [*apply_dims]

    if unique_values is None:
        # map(np.unique) and reduce(np.unique); da.asarray also accepts numpy-backed data
        flat = da.asarray(obj.data).flatten()
        unique_values = np.unique(flat.map_blocks(np.unique).compute())
    else:
        unique_values = np.sort(unique_values)

    retval = xr.apply_ufunc(
        lambda v: value_counts(v, global_unique_values=unique_values, newdim=newdim),
        obj,
        input_core_dims=[apply_dims],
        output_core_dims=[[newdim]],
        dask="allowed",
        vectorize=True,
    )
    retval.coords[newdim] = unique_values

    return retval


test_da = xr.DataArray(
    [
        [0, 1, 1, 1, 3, 4],
        [0, 6, 1, 1, 3, 4],
    ],
    dims=["dim_0", "dim_1"],
    coords={"dim_1": [2, 5, 7, 4, 3, 6]},
)

test_values = xr_value_counts(test_da, value_counts="dim_1")

assert np.all(
    test_values.values == np.array([
        [1, 3, 1, 1, 0],
        [1, 2, 1, 1, 1],
    ])
)

assert np.all(test_values.value_counts == np.array([0, 1, 3, 4, 6]))
```

Example:

```python
test_da = xr.DataArray(
    [
        [0, 1, 1, 1, 3, 4],
        [0, 6, 1, 1, 3, 4],
    ],
    dims=["dim_0", "dim_1"],
    coords={"dim_1": [2, 5, 7, 4, 3, 6]},
)

print(test_da)
# <xarray.DataArray (dim_0: 2, dim_1: 6)>
# array([[0, 1, 1, 1, 3, 4],
#        [0, 6, 1, 1, 3, 4]])
# Coordinates:
#   * dim_1    (dim_1) int64 2 5 7 4 3 6
# Dimensions without coordinates: dim_0

print(xr_value_counts(test_da, value_counts="dim_1"))
# <xarray.DataArray (dim_0: 2, value_counts: 5)>
# array([[1, 3, 1, 1, 0],
#        [1, 2, 1, 1, 1]])
# Coordinates:
#   * value_counts  (value_counts) int64 0 1 3 4 6
# Dimensions without coordinates: dim_0
```

Probably not the fastest solution, and it executes eagerly, but it works. What do you think?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement `value_counts` method 595784008
605632224 https://github.com/pydata/xarray/issues/1194#issuecomment-605632224 https://api.github.com/repos/pydata/xarray/issues/1194 MDEyOklzc3VlQ29tbWVudDYwNTYzMjIyNA== Hoeze 1200058 2020-03-29T13:00:29Z 2020-03-29T13:03:46Z NONE

Currently I keep carrying a "<arrayname>_missing" mask alongside all of my unstacked arrays to work around this issue. A clean solution that keeps arrays from being converted to float would be very desirable. Also, NaN does not necessarily mean NA, which has already caused me quite some head-scratching in the past. Further, it would be a very useful indicator of which values of a dense array should be converted into a sparse array.
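
A minimal sketch of the mask-carrying workaround described above (variable names are illustrative):

```python
import numpy as np
import xarray as xr

ds = xr.Dataset({
    "counts":         (("obs",), np.array([3, 0, 7], dtype=np.int64)),
    "counts_missing": (("obs",), np.array([False, True, False])),
})
ds["counts"].dtype                                  # int64 stays intact
masked = ds["counts"].where(~ds["counts_missing"])  # casts to float64, NaN where missing
```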

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use masked arrays while preserving int 199188476
558693816 https://github.com/pydata/xarray/issues/2227#issuecomment-558693816 https://api.github.com/repos/pydata/xarray/issues/2227 MDEyOklzc3VlQ29tbWVudDU1ODY5MzgxNg== Hoeze 1200058 2019-11-26T15:54:25Z 2019-11-26T15:54:25Z NONE

Hi, I'd like to understand how isel works exactly in conjunction with dask arrays. As it seems, #3481 propagates the isel operation onto each dask chunk for lazy evaluation. Is this correct?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Slow performance of isel 331668890
557071354 https://github.com/pydata/xarray/pull/3550#issuecomment-557071354 https://api.github.com/repos/pydata/xarray/issues/3550 MDEyOklzc3VlQ29tbWVudDU1NzA3MTM1NA== Hoeze 1200058 2019-11-21T12:52:21Z 2019-11-21T12:52:21Z NONE

Awesome, thanks a lot @r-beer!

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  cov() and corr() - finalization 525685973
548790428 https://github.com/pydata/xarray/issues/3452#issuecomment-548790428 https://api.github.com/repos/pydata/xarray/issues/3452 MDEyOklzc3VlQ29tbWVudDU0ODc5MDQyOA== Hoeze 1200058 2019-11-01T13:38:10Z 2019-11-01T13:38:47Z NONE

Thanks for your suggestion @dcherian

> You should look at rolling.construct. It could be a lot more efficient than iterating.

I think iterating over rolling.construct() should then be equally efficient compared to some rolling.__iter__() function?
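
For context, a minimal sketch of what iterating via rolling.construct() looks like (made-up data, not from the issue):

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(6), dims="time")
windows = da.rolling(time=3).construct("window")  # dims ("time", "window"), a strided view
for t in range(windows.sizes["time"]):
    w = windows.isel(time=t)  # one length-3 window per step (NaN-padded at the start)
```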

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  [feature request] __iter__() for rolling-window on datasets 512879550
545134181 https://github.com/pydata/xarray/issues/3432#issuecomment-545134181 https://api.github.com/repos/pydata/xarray/issues/3432 MDEyOklzc3VlQ29tbWVudDU0NTEzNDE4MQ== Hoeze 1200058 2019-10-22T20:14:38Z 2019-10-22T20:17:26Z NONE

@max-sixty here you go:

```python
import xarray as xr

print(xr.__version__)

ds = xr.Dataset({
    "test": xr.DataArray(
        [[[1, 2], [3, 4]], [[1, 2], [3, 4]]],
        dims=("genes", "individuals", "subtissues"),
        coords={
            "genes": ["a", "b"],
            "individuals": ["c", "d"],
            "subtissues": ["e", "f"],
        },
    )
})
print(ds)

stacked = ds.stack(observations=["individuals", "subtissues"])
print(stacked)

print(stacked.isel(observations=1))
```

result:

```
<xarray.Dataset>
Dimensions:       (genes: 2)
Coordinates:
  * genes         (genes) <U1 'a' 'b'
    observations  object ('c', 'f')
Data variables:
    test          (genes) int64 2 2
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Scalar slice of MultiIndex is turned to tuples 510844652
544693024 https://github.com/pydata/xarray/issues/1887#issuecomment-544693024 https://api.github.com/repos/pydata/xarray/issues/1887 MDEyOklzc3VlQ29tbWVudDU0NDY5MzAyNA== Hoeze 1200058 2019-10-21T20:27:14Z 2019-10-21T20:27:14Z NONE

Since https://github.com/pydata/xarray/issues/3206 has been implemented now: Maybe fancy boolean indexing (da[boolean_mask]) could return a sparse array as well.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Boolean indexing with multi-dimensional key arrays 294241734
529539234 https://github.com/pydata/xarray/issues/3296#issuecomment-529539234 https://api.github.com/repos/pydata/xarray/issues/3296 MDEyOklzc3VlQ29tbWVudDUyOTUzOTIzNA== Hoeze 1200058 2019-09-09T15:41:36Z 2019-09-09T15:41:36Z NONE

Thanks @dcherian!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  [Docs] parameters + data type broken 491172429
529390440 https://github.com/pydata/xarray/issues/3281#issuecomment-529390440 https://api.github.com/repos/pydata/xarray/issues/3281 MDEyOklzc3VlQ29tbWVudDUyOTM5MDQ0MA== Hoeze 1200058 2019-09-09T09:48:46Z 2019-09-09T09:48:46Z NONE

@TomNicholas No, not really. Stack can only be used to combine multiple coordinates into one coordinate, e.g. data[x, y, z] -> stacked_data[a, z], with a as a multi-index of x and y.

In this case, we do not have shared data with coordinates to combine.
Instead, multiple independent DataArrays should be concatenated along some dimension.

The most similar methods to this one are xr.concat and xr.combine_nested. However, they do not allow implicitly renaming dimensions and force-deleting non-shared metadata.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  [proposal] concatenate by axis, ignore dimension names 489825483
528528962 https://github.com/pydata/xarray/issues/3281#issuecomment-528528962 https://api.github.com/repos/pydata/xarray/issues/3281 MDEyOklzc3VlQ29tbWVudDUyODUyODk2Mg== Hoeze 1200058 2019-09-05T19:05:20Z 2019-09-05T19:06:13Z NONE

Thanks for your answer @shoyer. OK, then this can be closed, since this function should actually remove metadata for me :)

For example, let's consider a dataset with:
- dimensions: ("obs", "features_1", "features_2", ..., "features_n")
- variables: x1 ("obs", "features_1"), x2 ("obs", "features_2"), ..., xn ("obs", "features_n")

Now I want to stick those side by side to get an array x_combined ("obs", "features") with features = features_1 + ... + features_n.
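
A hedged sketch of that combination with today's API, renaming each features dimension to a shared name before concatenating (n = 2 for brevity; shapes are made up):

```python
import numpy as np
import xarray as xr

x1 = xr.DataArray(np.zeros((5, 3)), dims=("obs", "features_1"))
x2 = xr.DataArray(np.ones((5, 4)), dims=("obs", "features_2"))

x_combined = xr.concat(
    [x1.rename({"features_1": "features"}),
     x2.rename({"features_2": "features"})],
    dim="features",
)  # dims ("obs", "features"); features has size 3 + 4 = 7
```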

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  [proposal] concatenate by axis, ignore dimension names 489825483
511165269 https://github.com/pydata/xarray/pull/2652#issuecomment-511165269 https://api.github.com/repos/pydata/xarray/issues/2652 MDEyOklzc3VlQ29tbWVudDUxMTE2NTI2OQ== Hoeze 1200058 2019-07-14T01:17:54Z 2019-07-14T01:17:54Z NONE

Is this pull request still up to date?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  cov() and corr() 396102183
454164876 https://github.com/pydata/xarray/issues/2570#issuecomment-454164876 https://api.github.com/repos/pydata/xarray/issues/2570 MDEyOklzc3VlQ29tbWVudDQ1NDE2NDg3Ng== Hoeze 1200058 2019-01-14T21:18:03Z 2019-01-14T21:18:03Z NONE

@max-sixty IMHO this issue should be kept open:
1) it is still not fixed
2) it can be fixed with an upcoming version of NumPy

I'd rather add some TODO tag or similar to this issue

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  np.clip() executes eagerly 384002323
436684706 https://github.com/pydata/xarray/issues/2549#issuecomment-436684706 https://api.github.com/repos/pydata/xarray/issues/2549 MDEyOklzc3VlQ29tbWVudDQzNjY4NDcwNg== Hoeze 1200058 2018-11-07T16:26:23Z 2018-11-07T16:26:23Z NONE

Ah ok, I tried this but I got some strange "inconsistent chunks" error...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  to_dask_dataframe for xr.DataArray 378326194
405954547 https://github.com/pydata/xarray/issues/1378#issuecomment-405954547 https://api.github.com/repos/pydata/xarray/issues/1378 MDEyOklzc3VlQ29tbWVudDQwNTk1NDU0Nw== Hoeze 1200058 2018-07-18T14:39:04Z 2018-07-18T14:39:04Z NONE

Annotating distance matrices with xarray is not possible either, due to the duplicate dimension.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Many methods are broken (e.g., concat/stack/sortby) when using repeated dimensions 222676855
405950145 https://github.com/pydata/xarray/issues/2267#issuecomment-405950145 https://api.github.com/repos/pydata/xarray/issues/2267 MDEyOklzc3VlQ29tbWVudDQwNTk1MDE0NQ== Hoeze 1200058 2018-07-18T14:26:58Z 2018-07-18T14:27:35Z NONE

Maybe related:

Consider the following example to calculate pairwise distances:

```python
x = np.array([[1, 2, 3, 4]])
dist = x.T - x
```

numpy automatically broadcasts the length-1 dimensions to get 4x4 matrices and subtracts them.

As far as I can see, this example is really hard to recreate with xarray, since there is nearly no way to add a new dimension to x and broadcast it properly.
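
For comparison, one way to express the same broadcast in xarray via renamed dimensions (a sketch; dimension names are made up):

```python
import numpy as np
import xarray as xr

x = xr.DataArray(np.array([1, 2, 3, 4]), dims="i")
dist = x - x.rename(i="j")  # broadcasts to dims ("i", "j"): a 4x4 matrix of x[i] - x[j]
```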

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Some simple broadcast_dim method? 338226520
405635314 https://github.com/pydata/xarray/issues/1053#issuecomment-405635314 https://api.github.com/repos/pydata/xarray/issues/1053 MDEyOklzc3VlQ29tbWVudDQwNTYzNTMxNA== Hoeze 1200058 2018-07-17T16:00:44Z 2018-07-17T16:04:32Z NONE

How about just keeping the current behavior? Currently a @ b just returns a new numpy array if either a or b is not an xr.DataArray. This makes perfect sense to me.

If both arrays are xr.DataArrays, I get an error, which was rather unexpected. Here, xarray could simply stick to xr.DataArray.dot().
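
A small sketch of that suggestion using the existing xr.DataArray.dot() (shapes are made up):

```python
import numpy as np
import xarray as xr

a = xr.DataArray(np.ones((2, 3)), dims=("x", "y"))
b = xr.DataArray(np.ones((3, 4)), dims=("y", "z"))
c = a.dot(b)  # contracts the shared dim "y" -> dims ("x", "z"); a @ b could do the same
```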

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support __matmul__ operator (@) 184238633
403110781 https://github.com/pydata/xarray/issues/2263#issuecomment-403110781 https://api.github.com/repos/pydata/xarray/issues/2263 MDEyOklzc3VlQ29tbWVudDQwMzExMDc4MQ== Hoeze 1200058 2018-07-06T18:19:59Z 2018-07-06T18:19:59Z NONE

Thank you very much for trying to help :)

It's no big problem at the moment, as it only comes up when I'm debugging. If it should really become a big problem, I'll try to publish a test script here.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  [bug] Exception ignored in generator object Variable 337619718
402700723 https://github.com/pydata/xarray/issues/2263#issuecomment-402700723 https://api.github.com/repos/pydata/xarray/issues/2263 MDEyOklzc3VlQ29tbWVudDQwMjcwMDcyMw== Hoeze 1200058 2018-07-05T12:06:21Z 2018-07-05T12:08:07Z NONE

```
>>> X.variable._data
array([[  3478.,   9079.,  51928., ...,  34312.,  11081.,  79157.],
       [  6976.,  14495.,  64495., ...,  30606.,  12816., 157080.],
       [  4260.,  10066.,  47271., ...,  47332.,  14947., 118562.],
       ...,
       [ 12976.,   9201.,  31295., ...,  22093.,   8846.,  96991.],
       [  7033.,   8238.,  22521., ...,  20476.,   9051.,  67057.],
       [ 13566.,  10308.,  28916., ...,  15529.,   7426.,  84852.]],
      dtype=float32)

>>> type(X.variable._data)
<class 'numpy.ndarray'>

>>> idx
array([ 705,  753,  342,  398,  688,  661,  630,  624,  668,  669,  631,
        430,   54,  828,  478,  912,  772,   78,  627,  164,  557,  393,
       1019,  559,  440,  290,  226,  299,  870,  718,  603,  947,  800,
        483,   66,  453,  485,  919,  671,  213,  877,  126,  684,  600,
        146, 1008,  496, 1028,  196,   51,  749,  971,  779,  232,  918,
        608,   72,  707,   58, 1014, 1043,  370,  185,  818,  378,  315,
       [... i deleted here ...]
         81,  145, 1053,  839,  466,  901, 1017,   48,  521,  162,  773,
        170,  846, 1091, 1361, 1257, 1249, 1163, 1166,  634,  907, 1349,
        432,  304, 1417, 1343,  424,  520, 1206, 1370,   23, 1477, 1047,
       1407, 1314,   69, 1197, 1443,  261, 1479,  539,  675,   47,  619,
        455, 1224,  513,  667,  894], dtype=int32)

>>> type(idx)
<class 'numpy.ndarray'>
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  [bug] Exception ignored in generator object Variable 337619718
402699810 https://github.com/pydata/xarray/issues/1375#issuecomment-402699810 https://api.github.com/repos/pydata/xarray/issues/1375 MDEyOklzc3VlQ29tbWVudDQwMjY5OTgxMA== Hoeze 1200058 2018-07-05T12:02:30Z 2018-07-05T12:02:30Z NONE

How should these sparse arrays get stored in NetCDF4? I know that NetCDF4 has some conventions for storing sparse data, but do we have to implement our own conversion mechanism for each sparse type?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Sparse arrays 221858543
402699290 https://github.com/pydata/xarray/issues/1375#issuecomment-402699290 https://api.github.com/repos/pydata/xarray/issues/1375 MDEyOklzc3VlQ29tbWVudDQwMjY5OTI5MA== Hoeze 1200058 2018-07-05T12:00:15Z 2018-07-05T12:00:15Z NONE

Would it be an option to use dask's sparse support? http://dask.pydata.org/en/latest/array-sparse.html This way xarray could let dask do the main work.

Currently I load everything into a dask array by hand and pass this dask array to xarray. This works pretty well.
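
A hedged sketch of that by-hand approach (assumes the sparse package is installed; shapes and names are made up):

```python
import dask.array as da
import numpy as np
import sparse
import xarray as xr

coo = sparse.COO.from_numpy(np.eye(1000, dtype=np.float32))  # in-memory sparse matrix
chunked = da.from_array(coo, chunks=(250, 250))              # wrap it in a dask array
arr = xr.DataArray(chunked, dims=("row", "col"))             # hand the dask array to xarray
```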

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Sparse arrays 221858543
402528134 https://github.com/pydata/xarray/issues/2267#issuecomment-402528134 https://api.github.com/repos/pydata/xarray/issues/2267 MDEyOklzc3VlQ29tbWVudDQwMjUyODEzNA== Hoeze 1200058 2018-07-04T17:06:51Z 2018-07-04T17:06:51Z NONE

@shoyer so there is no direct xarray equivalent to np.broadcast_to?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Some simple broadcast_dim method? 338226520
402524911 https://github.com/pydata/xarray/issues/2267#issuecomment-402524911 https://api.github.com/repos/pydata/xarray/issues/2267 MDEyOklzc3VlQ29tbWVudDQwMjUyNDkxMQ== Hoeze 1200058 2018-07-04T16:45:39Z 2018-07-04T16:45:39Z NONE

As an explanation: I'd like to change my program to only use lazy / chunked calculations in order to save RAM.

I noticed that np.broadcast_to converts the DataArray into a plain numpy array, so I needed an xarray way to do this.

I tried:

```python
DataArray.expand_dims("new_dim").isel(new_dim=np.repeat(0, target_dim_size))
```

but this really looks ugly and I'm not sure about the performance implications of this.
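
For reference, later xarray versions offer a more direct route; a minimal sketch (assumes xarray >= 0.12 for broadcast_like; names are illustrative):

```python
import numpy as np
import xarray as xr

arr = xr.DataArray(np.arange(3), dims="x")
template = xr.DataArray(np.empty(4), dims="new_dim")
tiled = arr.broadcast_like(template)  # gains "new_dim" with values repeated, no ugly isel
```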

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Some simple broadcast_dim method? 338226520
402459865 https://github.com/pydata/xarray/issues/2267#issuecomment-402459865 https://api.github.com/repos/pydata/xarray/issues/2267 MDEyOklzc3VlQ29tbWVudDQwMjQ1OTg2NQ== Hoeze 1200058 2018-07-04T12:07:49Z 2018-07-04T12:18:54Z NONE

No, I'd need something like np.tile; expand_dims only inserts a dimension of length 1.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Some simple broadcast_dim method? 338226520
402445256 https://github.com/pydata/xarray/issues/2263#issuecomment-402445256 https://api.github.com/repos/pydata/xarray/issues/2263 MDEyOklzc3VlQ29tbWVudDQwMjQ0NTI1Ng== Hoeze 1200058 2018-07-04T11:01:02Z 2018-07-04T11:01:02Z NONE

There is no X._data, but X.values is a simple ndarray.

I pickled X and idx: unit_test.zip

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  [bug] Exception ignored in generator object Variable 337619718
402099108 https://github.com/pydata/xarray/issues/2263#issuecomment-402099108 https://api.github.com/repos/pydata/xarray/issues/2263 MDEyOklzc3VlQ29tbWVudDQwMjA5OTEwOA== Hoeze 1200058 2018-07-03T10:30:45Z 2018-07-03T10:31:45Z NONE

I tried to, but I'm not sure what's the root cause of this problem. It occurs as part of my TensorFlow input pipeline (it's called by a py_func), and it's hard to reproduce without TensorFlow.

Maybe this helps?

It might also be possible that this is some problem with Python itself or some multithreading issue. I just found this issue which exactly describes my problem: https://github.com/ContinuumIO/anaconda-issues/issues/8737

There's also the hint that this problem only occurs in debug mode. I'm always running my unit tests in debug mode; that's why I did not notice it before.

That said, it's maybe not worth changing this in xarray rather than waiting for a fix in Python itself. However, since two brackets would solve this issue, it might not be too invasive to simply fix this one line :)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  [bug] Exception ignored in generator object Variable 337619718
400840681 https://github.com/pydata/xarray/issues/2253#issuecomment-400840681 https://api.github.com/repos/pydata/xarray/issues/2253 MDEyOklzc3VlQ29tbWVudDQwMDg0MDY4MQ== Hoeze 1200058 2018-06-27T21:51:10Z 2018-06-27T21:51:10Z NONE

OK, maybe I'll look into it when I get some spare time. It's currently the only place where I use netcdf4, so if opening a lot of files also worked with h5netcdf, I could drop this dependency.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  autoclose=True is not implemented for the h5netcdf backend 336220647
395009307 https://github.com/pydata/xarray/issues/1375#issuecomment-395009307 https://api.github.com/repos/pydata/xarray/issues/1375 MDEyOklzc3VlQ29tbWVudDM5NTAwOTMwNw== Hoeze 1200058 2018-06-06T09:39:43Z 2018-06-06T09:41:28Z NONE

I know a project which could make perfect use of xarray if it supported sparse tensors: https://github.com/theislab/anndata

Currently I have to work with both xarray and anndata to store counts in sparse arrays separately from other dependent data, which is a little bit annoying :)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Sparse arrays 221858543
391760875 https://github.com/pydata/xarray/issues/2175#issuecomment-391760875 https://api.github.com/repos/pydata/xarray/issues/2175 MDEyOklzc3VlQ29tbWVudDM5MTc2MDg3NQ== Hoeze 1200058 2018-05-24T15:38:51Z 2018-05-24T15:40:47Z NONE

Some weeks ago I tried to solve this using a LaTeX script generator (https://github.com/Hoeze/matrixtolatex), but I gave up as it was too hard for me to create 3D figures with LaTeX. I think building this on top of a sensible plotting framework would be a lot easier.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  [Feature Request] Visualizing dimensions 325661581
391441780 https://github.com/pydata/xarray/issues/2175#issuecomment-391441780 https://api.github.com/repos/pydata/xarray/issues/2175 MDEyOklzc3VlQ29tbWVudDM5MTQ0MTc4MA== Hoeze 1200058 2018-05-23T17:58:55Z 2018-05-24T15:34:27Z NONE

In general, I'd work with data "lego blocks". Visualizations up to three dimensions would be self-explaining. One block = scalar, a row of blocks = vector, a plane of blocks = matrix, a cuboid of blocks = 3D array.

Different variables can then be aligned along each dimension (similar to the red and orange planes aligned to the right side of the pink cuboid)

More than three dimensions could be handled by placing multiple cuboid-blocks (like the blue and pink cuboid in the logo).

The relative sizes of the different dimensions should be chosen carefully, maybe with some non-linear scaling? Or we could separate large dimensions in the middle (just an illustration; drawing what I'd like to have in LibreOffice is hard).

However, I'm not sure how to realize that...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  [Feature Request] Visualizing dimensions 325661581
391758539 https://github.com/pydata/xarray/issues/1672#issuecomment-391758539 https://api.github.com/repos/pydata/xarray/issues/1672 MDEyOklzc3VlQ29tbWVudDM5MTc1ODUzOQ== Hoeze 1200058 2018-05-24T15:32:24Z 2018-05-24T15:32:24Z NONE

Any updates on this?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Append along an unlimited dimension to an existing netCDF file 269700511


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);