issue_comments


116 rows where user = 1386642 sorted by updated_at descending


issue 30

  • Add methods for combining variables of differing dimensionality 28
  • Raise when pcolormesh coordinate is not sorted 10
  • xarray contrib module 6
  • Performance: numpy indexes small amounts of data 1000 faster than xarray 6
  • interp and reindex should work for 1d -> nd indexing 6
  • API for reshaping DataArrays as 2D "data matrices" for use in machine learning 5
  • Automatic parallelization for dask arrays in apply_ufunc 5
  • Unable to run example for xarray.DataArray.to_unstacked_dataset 4
  • [WIP] Implement 1D to ND interpolation 4
  • Add public API for Dataset._copy_listed 4
  • Improve typehints of xr.Dataset.__getitem__ 4
  • Feature plotting 3
  • support for units 3
  • Add trapz to DataArray for mathematical integration 3
  • FacetGrid with independent colorbars 3
  • bug: 2D pcolormesh plots are wrong when coordinate is not ascending order 3
  • Adding CDL Parser/`open_cdl`? 3
  • Plot methods 2
  • Potential error in apply_ufunc docstring for input_core_dims 2
  • Support parallel writes to regions of zarr stores 2
  • TypeError on DataArray.stack() if any of the dimensions to be stacked has a MultiIndex 1
  • Add simple array creation functions for easier unit testing 1
  • Slow performance with isel on stacked coordinates 1
  • zarr and xarray chunking compatibility and `to_zarr` performance 1
  • calculating cumsums on a groupby object 1
  • Document writing netcdf from xarray directly to S3 1
  • Improving typing of `xr.Dataset.__getitem__` 1
  • Lazy concatenation of arrays 1
  • Add mode="r+" for to_zarr and use consolidated writes/reads by default 1
  • Using entry_points to register dataset and dataarray accessors? 1

user 1

  • nbren12 · 116

author_association 1

  • CONTRIBUTOR 116
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1398017683 https://github.com/pydata/xarray/issues/7348#issuecomment-1398017683 https://api.github.com/repos/pydata/xarray/issues/7348 IC_kwDOAMm_X85TVA6T nbren12 1386642 2023-01-20T07:37:12Z 2023-01-20T07:41:06Z CONTRIBUTOR

I see your point, but xarray could do both. Most accessors I've used come in a pip-installable package, and entry points would make that workflow a bit smoother. IMO an advantage of entry points is that they don't require editing source code, just a one-line change to a setup.py, setup.cfg, or pyproject.toml.

I wonder how often "users" define their own accessors... I use Python functions and modules myself. The "black magic" you mention breaks most static analysis tooling (type checking, linting, completion) and saves at most a couple of characters, so I have never felt the need, but that's a discussion for another day.
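A minimal sketch of the entry-point workflow described above, using only the standard library. The group name `xarray.accessors` is a hypothetical choice for illustration, not an existing xarray convention:

```python
from importlib.metadata import entry_points

# Hypothetical entry-point group name; xarray would need to pick one.
ACCESSOR_GROUP = "xarray.accessors"

def load_accessors(group=ACCESSOR_GROUP):
    """Discover and load accessors advertised by installed packages.

    A package would advertise an accessor with one line of packaging
    metadata, e.g. in pyproject.toml:

        [project.entry-points."xarray.accessors"]
        geo = "my_pkg.accessors:GeoAccessor"
    """
    eps = entry_points()
    # importlib.metadata's API changed across Python versions:
    # 3.10+ offers .select(group=...); older versions return a dict.
    selected = eps.select(group=group) if hasattr(eps, "select") else eps.get(group, [])
    return {ep.name: ep.load() for ep in selected}
```

With no packages advertising the group, `load_accessors()` simply returns an empty dict, so discovery degrades gracefully.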

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Using entry_points to register dataset and dataarray accessors? 1473152374
1122601160 https://github.com/pydata/xarray/issues/4628#issuecomment-1122601160 https://api.github.com/repos/pydata/xarray/issues/4628 IC_kwDOAMm_X85C6YjI nbren12 1386642 2022-05-10T16:11:14Z 2022-05-10T16:11:14Z CONTRIBUTOR

@rabernat It seems that great minds think alike ;)

{
    "total_count": 2,
    "+1": 0,
    "-1": 0,
    "laugh": 2,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Lazy concatenation of arrays 753852119
1101553689 https://github.com/pydata/xarray/issues/3894#issuecomment-1101553689 https://api.github.com/repos/pydata/xarray/issues/3894 IC_kwDOAMm_X85BqGAZ nbren12 1386642 2022-04-18T16:41:39Z 2022-04-18T16:41:39Z CONTRIBUTOR

I think the issue is still valid, we just couldn't think of what to name the new API.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add public API for Dataset._copy_listed 588112617
1039346079 https://github.com/pydata/xarray/issues/6269#issuecomment-1039346079 https://api.github.com/repos/pydata/xarray/issues/6269 IC_kwDOAMm_X8498ymf nbren12 1386642 2022-02-14T17:18:38Z 2022-02-14T17:18:38Z CONTRIBUTOR

@jhamman We have a similar schema package https://github.com/ai2cm/fv3net/tree/master/external/synth, cool to see you confronting the same challenges and advertising your solutions more broadly. One problem we had is that our schema objects ended up being quite verbose: https://github.com/ai2cm/fv3net/blob/master/external/loaders/tests/test__batch/one_step_zarr_schema.json. CDL is a lot more concise.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Adding CDL Parser/`open_cdl`? 1132894350
1036516070 https://github.com/pydata/xarray/issues/6269#issuecomment-1036516070 https://api.github.com/repos/pydata/xarray/issues/6269 IC_kwDOAMm_X849x_rm nbren12 1386642 2022-02-11T18:52:12Z 2022-02-11T18:52:12Z CONTRIBUTOR

To be fair, ds.info is not 100% CDL, but it's darn close.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Adding CDL Parser/`open_cdl`? 1132894350
1036502148 https://github.com/pydata/xarray/issues/6269#issuecomment-1036502148 https://api.github.com/repos/pydata/xarray/issues/6269 IC_kwDOAMm_X849x8SE nbren12 1386642 2022-02-11T18:33:42Z 2022-02-11T18:33:49Z CONTRIBUTOR

> Aren't there xarray extension packages around where this would fit into?

I'm not sure. Any suggestions? Just wondering if xarray has left the door open to this kind of contribution, since it (1) already supports other I/O backends and (2) creates CDL using ds.info().

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Adding CDL Parser/`open_cdl`? 1132894350
865554874 https://github.com/pydata/xarray/pull/5252#issuecomment-865554874 https://api.github.com/repos/pydata/xarray/issues/5252 MDEyOklzc3VlQ29tbWVudDg2NTU1NDg3NA== nbren12 1386642 2021-06-22T04:56:09Z 2021-06-22T04:56:09Z CONTRIBUTOR

I'm sorry too! I don't have any good excuse though...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add mode="r+" for to_zarr and use consolidated writes/reads by default 874331538
786764651 https://github.com/pydata/xarray/issues/2799#issuecomment-786764651 https://api.github.com/repos/pydata/xarray/issues/2799 MDEyOklzc3VlQ29tbWVudDc4Njc2NDY1MQ== nbren12 1386642 2021-02-26T16:51:50Z 2021-02-26T16:51:50Z CONTRIBUTOR

@jhamman Weren't you talking about an xarray lite (TM) package?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Performance: numpy indexes small amounts of data 1000 faster than xarray 416962458
747145009 https://github.com/pydata/xarray/pull/3262#issuecomment-747145009 https://api.github.com/repos/pydata/xarray/issues/3262 MDEyOklzc3VlQ29tbWVudDc0NzE0NTAwOQ== nbren12 1386642 2020-12-17T01:29:12Z 2020-12-17T01:29:12Z CONTRIBUTOR

I'm going to close this since I won't be working on it any longer.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  [WIP] Implement 1D to ND interpolation 484863660
675057700 https://github.com/pydata/xarray/issues/3894#issuecomment-675057700 https://api.github.com/repos/pydata/xarray/issues/3894 MDEyOklzc3VlQ29tbWVudDY3NTA1NzcwMA== nbren12 1386642 2020-08-17T19:03:55Z 2020-08-17T19:03:55Z CONTRIBUTOR

NVM, get is already a method. Maybe we could overload it?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add public API for Dataset._copy_listed 588112617
675056880 https://github.com/pydata/xarray/issues/3894#issuecomment-675056880 https://api.github.com/repos/pydata/xarray/issues/3894 MDEyOklzc3VlQ29tbWVudDY3NTA1Njg4MA== nbren12 1386642 2020-08-17T19:02:24Z 2020-08-17T19:02:24Z CONTRIBUTOR

Or maybe "get" since it's a synonym of "select" that isn't overloaded with spatial indexing in the code base.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add public API for Dataset._copy_listed 588112617
675055276 https://github.com/pydata/xarray/issues/3894#issuecomment-675055276 https://api.github.com/repos/pydata/xarray/issues/3894 MDEyOklzc3VlQ29tbWVudDY3NTA1NTI3Ng== nbren12 1386642 2020-08-17T18:59:05Z 2020-08-17T18:59:05Z CONTRIBUTOR

> Is there a way in mypy we could use something like overload to specify the above contract here, as an alternative to another method?

That is correct. The output type is predictable from the input types. With #4144, mypy may have a chance at detecting this kind of error. Still, I don't know how many users use type checking; I expect most will only discover this problem at runtime.

> It would be better to have an explicit method for subsetting Dataset variables.

I agree. sel_vars is clearer IMO, since subset could also apply to the coordinates, e.g. a spatial subset.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add public API for Dataset._copy_listed 588112617
655298190 https://github.com/pydata/xarray/issues/4122#issuecomment-655298190 https://api.github.com/repos/pydata/xarray/issues/4122 MDEyOklzc3VlQ29tbWVudDY1NTI5ODE5MA== nbren12 1386642 2020-07-08T05:39:14Z 2020-07-08T05:39:14Z CONTRIBUTOR

I’ve run into this as well. It’s not pretty, but my usual workaround is to write to a local temporary file and then upload with fsspec. I can never remember exactly which netCDF engine to use...
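The workaround above can be sketched generically. Here `write_fn` and `upload_fn` are hypothetical stand-ins for something like `lambda p: ds.to_netcdf(p)` and an fsspec `fs.put(p, "s3://bucket/out.nc")`; neither xarray nor fsspec is exercised in this sketch:

```python
import os
import tempfile

def write_then_upload(write_fn, upload_fn, suffix=".nc"):
    """Write to a local temporary file, then hand that path to an uploader.

    The temp file is always cleaned up, even if writing or uploading fails.
    """
    fd, path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)  # mkstemp opens the file; we only need the path
    try:
        write_fn(path)   # e.g. ds.to_netcdf(path, engine=...)
        upload_fn(path)  # e.g. fs.put(path, "s3://bucket/out.nc")
    finally:
        os.remove(path)
```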

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Document writing netcdf from xarray directly to S3 631085856
644231473 https://github.com/pydata/xarray/pull/4144#issuecomment-644231473 https://api.github.com/repos/pydata/xarray/issues/4144 MDEyOklzc3VlQ29tbWVudDY0NDIzMTQ3Mw== nbren12 1386642 2020-06-15T16:15:58Z 2020-06-15T16:32:24Z CONTRIBUTOR

@crusaderky Thanks for the re-work. For my own benefit, could you explain why that code worked? I remember writing something very similar and running into mypy errors. My understanding of how mypy interprets overload seems incomplete.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Improve typehints of xr.Dataset.__getitem__ 636611699
643506298 https://github.com/pydata/xarray/pull/4144#issuecomment-643506298 https://api.github.com/repos/pydata/xarray/issues/4144 MDEyOklzc3VlQ29tbWVudDY0MzUwNjI5OA== nbren12 1386642 2020-06-12T22:23:56Z 2020-06-12T22:23:56Z CONTRIBUTOR

No problem! I think I am done with this one, unless you think it's important that I document or test this somehow. Can someone review it?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Improve typehints of xr.Dataset.__getitem__ 636611699
643079280 https://github.com/pydata/xarray/pull/4144#issuecomment-643079280 https://api.github.com/repos/pydata/xarray/issues/4144 MDEyOklzc3VlQ29tbWVudDY0MzA3OTI4MA== nbren12 1386642 2020-06-12T05:49:56Z 2020-06-12T05:49:56Z CONTRIBUTOR

Okay. Assuming the tests pass, I think this is ready for review. I tried adding a test, but mypy didn't seem to find problems even with code that I know doesn't work (e.g. 'a' + 1). Is there some strategy for testing tricky type hints like this?

In any case, this code does work:

```
$ cat test_mypy.py
import xarray as xr

ds = xr.Dataset({"a": ()})

arr = ds['a']
union_obj = ds[['a']]

reveal_locals()

$ mypy test_mypy.py
test_mypy.py:8: note: Revealed local types are:
test_mypy.py:8: note:     arr: xarray.core.dataarray.DataArray
test_mypy.py:8: note:     ds: xarray.core.dataset.Dataset
test_mypy.py:8: note:     union_obj: Union[xarray.core.dataarray.DataArray, xarray.core.dataset.Dataset]
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Improve typehints of xr.Dataset.__getitem__ 636611699
643014353 https://github.com/pydata/xarray/pull/4144#issuecomment-643014353 https://api.github.com/repos/pydata/xarray/issues/4144 MDEyOklzc3VlQ29tbWVudDY0MzAxNDM1Mw== nbren12 1386642 2020-06-12T01:27:55Z 2020-06-12T01:33:41Z CONTRIBUTOR

@mathause On further consideration, I think it might not be possible to get this to work. This method has three behaviors:

  • Mapping -> Dataset
  • Hashable -> DataArray
  • else (List) -> Dataset

With my limited understanding of mypy, I think that any two of these is supported by overload, but I'm not sure it's possible to support all 3. I tried several different options, but maybe I am missing something.

Would a good middle ground be something like this?

  • Hashable -> DataArray
  • Any -> Union[DataArray, Dataset]

I think this would work since both the input/outputs of the first one are subtypes of the second one. It's not a complete solution, but it would solve the most common problem of ds['a'] returning a union type rather than a DataArray.
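The middle ground described above can be sketched with typing.overload. The classes here are stand-ins for illustration, not the real xarray implementations:

```python
from typing import Any, Hashable, Union, overload

class DataArray:
    """Stand-in for xarray.DataArray, just to illustrate the pattern."""

class Dataset:
    """Stand-in for xarray.Dataset with the proposed __getitem__ overloads."""

    @overload
    def __getitem__(self, key: Hashable) -> DataArray: ...
    @overload
    def __getitem__(self, key: Any) -> Union[DataArray, "Dataset"]: ...
    def __getitem__(self, key):
        # Runtime behavior: a single name yields a DataArray; anything
        # else (e.g. a list of names) yields a Dataset.
        return DataArray() if isinstance(key, str) else Dataset()
```

Because DataArray is a member of the Union, the narrower first overload is compatible with the catch-all second one, so mypy can resolve ds['a'] to DataArray while everything else falls back to the Union.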

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Improve typehints of xr.Dataset.__getitem__ 636611699
642315134 https://github.com/pydata/xarray/issues/4125#issuecomment-642315134 https://api.github.com/repos/pydata/xarray/issues/4125 MDEyOklzc3VlQ29tbWVudDY0MjMxNTEzNA== nbren12 1386642 2020-06-10T23:15:43Z 2020-06-10T23:28:51Z CONTRIBUTOR

~~overload was added in Python 3.5. Is this okay with xarray's backwards compatibility? It would be pretty easy to write a mock overload that does nothing if not.~~

Edit: NVM @mathause's example is in the code already.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Improving typing of `xr.Dataset.__getitem__` 631940742
627799236 https://github.com/pydata/xarray/pull/4035#issuecomment-627799236 https://api.github.com/repos/pydata/xarray/issues/4035 MDEyOklzc3VlQ29tbWVudDYyNzc5OTIzNg== nbren12 1386642 2020-05-13T07:22:40Z 2020-05-13T07:22:40Z CONTRIBUTOR

@rabernat I learn something new everyday. sorry for cluttering up this PR with my ignorance haha.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support parallel writes to regions of zarr stores 613012939
627090332 https://github.com/pydata/xarray/pull/4035#issuecomment-627090332 https://api.github.com/repos/pydata/xarray/issues/4035 MDEyOklzc3VlQ29tbWVudDYyNzA5MDMzMg== nbren12 1386642 2020-05-12T03:44:14Z 2020-05-12T03:44:14Z CONTRIBUTOR

@rabernat pointed this PR out to me, and this is great progress towards allowing more database-like CRUD operations on zarr datasets. A similar neat feature would be to read xarray datasets from regions of zarr groups w/o dask arrays.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support parallel writes to regions of zarr stores 613012939
553294966 https://github.com/pydata/xarray/issues/2799#issuecomment-553294966 https://api.github.com/repos/pydata/xarray/issues/2799 MDEyOklzc3VlQ29tbWVudDU1MzI5NDk2Ng== nbren12 1386642 2019-11-13T08:32:05Z 2019-11-13T08:32:16Z CONTRIBUTOR

This variable workaround is awesome @max-sixty. Are there any guidelines on when to use Variable vs DataArray? Some calculations (e.g. fast difference and derivatives/stencil operations) seem cleaner without explicit coordinate labels.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Performance: numpy indexes small amounts of data 1000 faster than xarray 416962458
549085085 https://github.com/pydata/xarray/pull/3262#issuecomment-549085085 https://api.github.com/repos/pydata/xarray/issues/3262 MDEyOklzc3VlQ29tbWVudDU0OTA4NTA4NQ== nbren12 1386642 2019-11-02T22:01:10Z 2019-11-02T22:01:10Z CONTRIBUTOR

Unfortunately, I don’t think I have much time now to contribute to a general-purpose solution leveraging xarray’s built-in indexing. To be successful, I would need to study xarray’s indexing internals more, since I don’t think it is as easily implemented as a routine calling DataArray methods. Some custom numba code I wrote fits in my brain much better, and is general enough for my purposes when wrapped with xr.apply_ufunc. I encourage someone else to pick up where I left off, or we could close this PR.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  [WIP] Implement 1D to ND interpolation 484863660
525157967 https://github.com/pydata/xarray/pull/3262#issuecomment-525157967 https://api.github.com/repos/pydata/xarray/issues/3262 MDEyOklzc3VlQ29tbWVudDUyNTE1Nzk2Nw== nbren12 1386642 2019-08-27T06:26:49Z 2019-08-27T06:26:49Z CONTRIBUTOR

Thanks so much for the help. This is a good learning experience for me.

> That potentially would let you avoid redundant operations on the entire Dataset object.

Yes. This is where I got stuck TBH.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  [WIP] Implement 1D to ND interpolation 484863660
525025890 https://github.com/pydata/xarray/pull/3262#issuecomment-525025890 https://api.github.com/repos/pydata/xarray/issues/3262 MDEyOklzc3VlQ29tbWVudDUyNTAyNTg5MA== nbren12 1386642 2019-08-26T20:47:33Z 2019-08-26T20:48:03Z CONTRIBUTOR

@shoyer Thanks for the comments. I was struggling to incorporate it into Dataset.interp since core.missing is pretty complicated. Would it be worth refactoring that module to clarify how interp calls are mapped to a given function? Also, most of the methods in interp work like Dataset -> Variables -> numpy arrays, but the method you proposed above operates at the Dataset level, so it doesn't quite fit into core.missing.interp.

The interpolation code I was working with doesn't regrid the coordinates appropriately, so we would need to do that too.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  [WIP] Implement 1D to ND interpolation 484863660
524579537 https://github.com/pydata/xarray/issues/3252#issuecomment-524579537 https://api.github.com/repos/pydata/xarray/issues/3252 MDEyOklzc3VlQ29tbWVudDUyNDU3OTUzNw== nbren12 1386642 2019-08-24T20:54:02Z 2019-08-24T20:54:02Z CONTRIBUTOR

Ok. I realized this problem occurs only because x was a dimension of both the new index idx and the original dataset. Perhaps sel should warn the user or raise an error when this happens.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  interp and reindex should work for 1d -> nd indexing 484622545
524578659 https://github.com/pydata/xarray/issues/3252#issuecomment-524578659 https://api.github.com/repos/pydata/xarray/issues/3252 MDEyOklzc3VlQ29tbWVudDUyNDU3ODY1OQ== nbren12 1386642 2019-08-24T20:35:45Z 2019-08-24T20:36:30Z CONTRIBUTOR

Ok. I started playing around with this, but I am getting errors when indexing arrays with ND variables. data.sel(x=nd_x) works, but any subsequent operations complain that IndexVariable objects must be 1-dimensional:

```
>>> import xarray as xr
>>> import numpy as np
>>> npdata = np.tile(np.arange(10), (5, 1))
>>> data = xr.DataArray(npdata, dims=['y', 'x'],
...                     coords={'x': np.r_[:10], 'y': np.r_[:5]})
>>> idx = xr.DataArray(npdata, dims=['z', 'x'])
>>> ans = data.sel(x=idx, method='bfill')
>>> assert set(ans.dims) == {'z', 'y', 'x'}
>>> print(ans)
Traceback (most recent call last):
  File "<ipython-input-4-1d0ac0cac680>", line 8, in <module>
    print(ans)
  File "/Users/noah/workspace/software/xarray/xarray/core/common.py", line 129, in __repr__
    return formatting.array_repr(self)
  File "/Users/noah/workspace/software/xarray/xarray/core/formatting.py", line 463, in array_repr
    summary.append(repr(arr.coords))
  File "/Users/noah/workspace/software/xarray/xarray/core/coordinates.py", line 78, in __repr__
    return formatting.coords_repr(self)
  File "/Users/noah/workspace/software/xarray/xarray/core/formatting.py", line 381, in coords_repr
    coords, title="Coordinates", summarizer=summarize_coord, col_width=col_width
  File "/Users/noah/workspace/software/xarray/xarray/core/formatting.py", line 361, in _mapping_repr
    summary += [summarizer(k, v, col_width) for k, v in mapping.items()]
  File "/Users/noah/workspace/software/xarray/xarray/core/formatting.py", line 361, in <listcomp>
    summary += [summarizer(k, v, col_width) for k, v in mapping.items()]
  File "/Users/noah/workspace/software/xarray/xarray/core/formatting.py", line 307, in summarize_coord
    coord = var.variable.to_index_variable()
  File "/Users/noah/workspace/software/xarray/xarray/core/variable.py", line 440, in to_index_variable
    self.dims, self._data, self._attrs, encoding=self._encoding, fastpath=True
  File "/Users/noah/workspace/software/xarray/xarray/core/variable.py", line 1943, in __init__
    raise ValueError("%s objects must be 1-dimensional" % type(self).__name__)
ValueError: IndexVariable objects must be 1-dimensional
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  interp and reindex should work for 1d -> nd indexing 484622545
524570552 https://github.com/pydata/xarray/issues/3252#issuecomment-524570552 https://api.github.com/repos/pydata/xarray/issues/3252 MDEyOklzc3VlQ29tbWVudDUyNDU3MDU1Mg== nbren12 1386642 2019-08-24T18:13:10Z 2019-08-24T18:13:10Z CONTRIBUTOR

So when would interp use this manual interpolation method? Would it try it only if scipy fails or check the dimensions?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  interp and reindex should work for 1d -> nd indexing 484622545
524515190 https://github.com/pydata/xarray/issues/3252#issuecomment-524515190 https://api.github.com/repos/pydata/xarray/issues/3252 MDEyOklzc3VlQ29tbWVudDUyNDUxNTE5MA== nbren12 1386642 2019-08-24T03:40:32Z 2019-08-24T03:40:32Z CONTRIBUTOR

After reading the method='ffill' docs, I agree with you.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  interp and reindex should work for 1d -> nd indexing 484622545
524448135 https://github.com/pydata/xarray/issues/3252#issuecomment-524448135 https://api.github.com/repos/pydata/xarray/issues/3252 MDEyOklzc3VlQ29tbWVudDUyNDQ0ODEzNQ== nbren12 1386642 2019-08-23T20:17:06Z 2019-08-23T20:17:06Z CONTRIBUTOR

In my experience, computing w efficiently is the tricky part. The function is slightly different, but metpy uses a lot of tricks to make this work efficiently. A manual for-loop is much cleaner for this kind of stencil calculation IMO. What kind of duck arrays were you thinking of?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  interp and reindex should work for 1d -> nd indexing 484622545
524424788 https://github.com/pydata/xarray/issues/3252#issuecomment-524424788 https://api.github.com/repos/pydata/xarray/issues/3252 MDEyOklzc3VlQ29tbWVudDUyNDQyNDc4OA== nbren12 1386642 2019-08-23T18:53:26Z 2019-08-23T18:53:26Z CONTRIBUTOR

I have some numba code which does this for linear interpolation. Does scipy support this pattern?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  interp and reindex should work for 1d -> nd indexing 484622545
519688842 https://github.com/pydata/xarray/issues/2300#issuecomment-519688842 https://api.github.com/repos/pydata/xarray/issues/2300 MDEyOklzc3VlQ29tbWVudDUxOTY4ODg0Mg== nbren12 1386642 2019-08-08T21:10:54Z 2019-08-08T21:10:54Z CONTRIBUTOR

I am getting the same error too.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  zarr and xarray chunking compatibility and `to_zarr` performance 342531772
513901362 https://github.com/pydata/xarray/issues/3148#issuecomment-513901362 https://api.github.com/repos/pydata/xarray/issues/3148 MDEyOklzc3VlQ29tbWVudDUxMzkwMTM2Mg== nbren12 1386642 2019-07-22T18:31:21Z 2019-07-22T18:31:21Z CONTRIBUTOR

Okay, it looks like the docstring for to_unstacked_dataset is in fact incorrect:

https://github.com/pydata/xarray/blob/77a31e56d221245ff7dc10041bf0ca34cab91897/xarray/core/dataarray.py#L1654

I can submit a PR fixing that soon.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Unable to run example for xarray.DataArray.to_unstacked_dataset 470322983
513636179 https://github.com/pydata/xarray/issues/3141#issuecomment-513636179 https://api.github.com/repos/pydata/xarray/issues/3141 MDEyOklzc3VlQ29tbWVudDUxMzYzNjE3OQ== nbren12 1386642 2019-07-22T04:31:40Z 2019-07-22T04:31:40Z CONTRIBUTOR

Xarray has a pretty extensive contributor's guide that you might find helpful. In short, the way to contribute changes is to create your own fork of xarray, commit/push some changes, and finally submit a pull request (PR).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  calculating cumsums on a groupby object 469633509
513635290 https://github.com/pydata/xarray/issues/3148#issuecomment-513635290 https://api.github.com/repos/pydata/xarray/issues/3148 MDEyOklzc3VlQ29tbWVudDUxMzYzNTI5MA== nbren12 1386642 2019-07-22T04:25:49Z 2019-07-22T04:25:49Z CONTRIBUTOR

Could you link the specific page that is wrong?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Unable to run example for xarray.DataArray.to_unstacked_dataset 470322983
513522465 https://github.com/pydata/xarray/issues/3148#issuecomment-513522465 https://api.github.com/repos/pydata/xarray/issues/3148 MDEyOklzc3VlQ29tbWVudDUxMzUyMjQ2NQ== nbren12 1386642 2019-07-21T04:56:26Z 2019-07-21T04:56:26Z CONTRIBUTOR

Actually, the docs seem correct: https://github.com/pydata/xarray/blob/8da3f67ea583e0588291162067229b2f3ce2993e/xarray/core/dataset.py#L2895

Am I missing something here?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Unable to run example for xarray.DataArray.to_unstacked_dataset 470322983
513521687 https://github.com/pydata/xarray/issues/3148#issuecomment-513521687 https://api.github.com/repos/pydata/xarray/issues/3148 MDEyOklzc3VlQ29tbWVudDUxMzUyMTY4Nw== nbren12 1386642 2019-07-21T04:46:05Z 2019-07-21T04:46:05Z CONTRIBUTOR

Huh, it looks like the API change that happened in the last stages of the PR didn't make it into the example docs. I will try to submit a PR fixing that sometime soon.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Unable to run example for xarray.DataArray.to_unstacked_dataset 470322983
508802893 https://github.com/pydata/xarray/pull/1597#issuecomment-508802893 https://api.github.com/repos/pydata/xarray/issues/1597 MDEyOklzc3VlQ29tbWVudDUwODgwMjg5Mw== nbren12 1386642 2019-07-05T15:59:51Z 2019-07-05T15:59:51Z CONTRIBUTOR

Thanks Joe!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add methods for combining variables of differing dimensionality 261131958
508598650 https://github.com/pydata/xarray/pull/1597#issuecomment-508598650 https://api.github.com/repos/pydata/xarray/issues/1597 MDEyOklzc3VlQ29tbWVudDUwODU5ODY1MA== nbren12 1386642 2019-07-05T01:07:21Z 2019-07-05T01:07:21Z CONTRIBUTOR

phew! Thanks for all the reviews and discussion everyone!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add methods for combining variables of differing dimensionality 261131958
508593312 https://github.com/pydata/xarray/pull/1597#issuecomment-508593312 https://api.github.com/repos/pydata/xarray/issues/1597 MDEyOklzc3VlQ29tbWVudDUwODU5MzMxMg== nbren12 1386642 2019-07-05T00:09:43Z 2019-07-05T00:09:43Z CONTRIBUTOR

It looks like the CI errors above aren’t related to this PR. There seems to be an issue with to_pandas.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add methods for combining variables of differing dimensionality 261131958
507822750 https://github.com/pydata/xarray/pull/1597#issuecomment-507822750 https://api.github.com/repos/pydata/xarray/issues/1597 MDEyOklzc3VlQ29tbWVudDUwNzgyMjc1MA== nbren12 1386642 2019-07-02T19:57:59Z 2019-07-02T19:57:59Z CONTRIBUTOR

drat. Should be fixed now.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add methods for combining variables of differing dimensionality 261131958
507512420 https://github.com/pydata/xarray/pull/1597#issuecomment-507512420 https://api.github.com/repos/pydata/xarray/issues/1597 MDEyOklzc3VlQ29tbWVudDUwNzUxMjQyMA== nbren12 1386642 2019-07-02T04:23:10Z 2019-07-02T04:23:10Z CONTRIBUTOR

Okay. I responded to @benbovy's last comments and merged the upstream changes to master. How is this looking?

cc @jhamman

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add methods for combining variables of differing dimensionality 261131958
505533500 https://github.com/pydata/xarray/pull/1597#issuecomment-505533500 https://api.github.com/repos/pydata/xarray/issues/1597 MDEyOklzc3VlQ29tbWVudDUwNTUzMzUwMA== nbren12 1386642 2019-06-25T17:01:52Z 2019-06-25T17:01:52Z CONTRIBUTOR

> is there actual use cases where it is useful to provide multiple dimensions for sample_dims?

Yes! My main use case does. For example, if you have a weather dataset where every lat/lon pair should be considered a separate "sample", you would use sample_dims=['lat', 'lon'].
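A minimal numpy analogue of that stacking, on toy data (the real PR operates on labeled xarray objects, so the dimension handling here is only illustrative):

```python
import numpy as np

# Toy weather field with dims (time, lat, lon). Treating every lat/lon
# pair as a separate "sample" turns it into a 2D data matrix whose rows
# are samples and whose columns are the remaining (feature) dimension.
time, lat, lon = 4, 3, 5
field = np.arange(time * lat * lon, dtype=float).reshape(time, lat, lon)

# Move the sample dims to the front, then flatten them into one axis --
# the numpy analogue of stacking with sample_dims=['lat', 'lon'].
matrix = field.transpose(1, 2, 0).reshape(lat * lon, time)
print(matrix.shape)  # (15, 4)
```

Row 0 of the matrix is the time series at lat=0, lon=0, row 1 the series at lat=0, lon=1, and so on, which is exactly the sample layout scikit-learn-style estimators expect.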

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add methods for combining variables of differing dimensionality 261131958
504623974 https://github.com/pydata/xarray/pull/1597#issuecomment-504623974 https://api.github.com/repos/pydata/xarray/issues/1597 MDEyOklzc3VlQ29tbWVudDUwNDYyMzk3NA== nbren12 1386642 2019-06-22T03:37:35Z 2019-06-22T03:37:35Z CONTRIBUTOR

It looks like the tests passed. @benbovy How does it look now? Did I fix the issues you mentioned?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add methods for combining variables of differing dimensionality 261131958
504621993 https://github.com/pydata/xarray/pull/1597#issuecomment-504621993 https://api.github.com/repos/pydata/xarray/issues/1597 MDEyOklzc3VlQ29tbWVudDUwNDYyMTk5Mw== nbren12 1386642 2019-06-22T03:12:00Z 2019-06-22T03:12:00Z CONTRIBUTOR

It looks like 43834ac8186a851b7 might have fixed this. Let's see if the tests pass.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add methods for combining variables of differing dimensionality 261131958
499915652 https://github.com/pydata/xarray/pull/1597#issuecomment-499915652 https://api.github.com/repos/pydata/xarray/issues/1597 MDEyOklzc3VlQ29tbWVudDQ5OTkxNTY1Mg== nbren12 1386642 2019-06-07T14:51:46Z 2019-06-07T14:54:25Z CONTRIBUTOR

Does anybody have an idea why the py36-dask-dev tests are failing? None of the errors seems related to this PR.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add methods for combining variables of differing dimensionality 261131958
499789587 https://github.com/pydata/xarray/pull/1597#issuecomment-499789587 https://api.github.com/repos/pydata/xarray/issues/1597 MDEyOklzc3VlQ29tbWVudDQ5OTc4OTU4Nw== nbren12 1386642 2019-06-07T07:40:38Z 2019-06-07T07:40:38Z CONTRIBUTOR

@benbovy Paradoxically, I think making to_stacked_array accept the list of dimensions which won't be stacked is clearer. These unstacked dims need to be shared across all variables, which is a very simple requirement. I realized this after writing a bunch of error-prone validation code. See my last commit for more of my reasoning (5ca9a1d868a). I also updated the docs and tests accordingly.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add methods for combining variables of differing dimensionality 261131958
499761734 https://github.com/pydata/xarray/pull/1597#issuecomment-499761734 https://api.github.com/repos/pydata/xarray/issues/1597 MDEyOklzc3VlQ29tbWVudDQ5OTc2MTczNA== nbren12 1386642 2019-06-07T05:28:51Z 2019-06-07T05:28:51Z CONTRIBUTOR

@benbovy

Is it expected behavior? I guess that not every combination is possible because we're not broadcasting here unlike to_array().

I believe so. In your third case, to_stacked_array basically wants to combine a and b into a DataArray with a dimension y. But that is not possible without broadcasting because b does not have a dimension y. I will try to add a more informative error message.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add methods for combining variables of differing dimensionality 261131958
494893796 https://github.com/pydata/xarray/pull/1597#issuecomment-494893796 https://api.github.com/repos/pydata/xarray/issues/1597 MDEyOklzc3VlQ29tbWVudDQ5NDg5Mzc5Ng== nbren12 1386642 2019-05-22T17:21:24Z 2019-05-22T17:21:24Z CONTRIBUTOR

Thanks for your review @benbovy. I'll try to take a look soon.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add methods for combining variables of differing dimensionality 261131958
493249598 https://github.com/pydata/xarray/pull/1597#issuecomment-493249598 https://api.github.com/repos/pydata/xarray/issues/1597 MDEyOklzc3VlQ29tbWVudDQ5MzI0OTU5OA== nbren12 1386642 2019-05-16T22:11:55Z 2019-05-16T22:11:55Z CONTRIBUTOR

Hey @rabernat. Did I resolve your comments?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add methods for combining variables of differing dimensionality 261131958
483807351 https://github.com/pydata/xarray/issues/525#issuecomment-483807351 https://api.github.com/repos/pydata/xarray/issues/525 MDEyOklzc3VlQ29tbWVudDQ4MzgwNzM1MQ== nbren12 1386642 2019-04-16T19:16:19Z 2019-04-16T19:16:19Z CONTRIBUTOR

Would __array_function__ solve the problem with operator precedence? I thought these were separate issues, because __mul__ and __rmul__ need not call any numpy functions, and will therefore not necessarily dispatch to __array_function__.
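A toy example makes the distinction concrete (this `Quantity` class is hypothetical, not pint's actual implementation): `__array_function__` only intercepts numpy *functions*, while `*` goes through Python's operator protocol:

```python
import numpy as np

class Quantity:
    """Toy pint-like wrapper that reports which protocol handled each op."""
    # Opting out of ufunc coercion makes ndarray * Quantity defer to __rmul__
    __array_ufunc__ = None

    def __array_function__(self, func, types, args, kwargs):
        return f"__array_function__ handled {func.__name__}"

    def __mul__(self, other):
        return "__mul__ handled *"

    def __rmul__(self, other):
        return "__rmul__ handled *"

q = Quantity()
print(np.concatenate([q, q]))  # numpy function -> __array_function__
print(q * np.ones(3))          # operator protocol -> __mul__
print(np.ones(3) * q)          # ndarray defers, Python calls __rmul__
```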

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  support for units 100295585
483370027 https://github.com/pydata/xarray/issues/1850#issuecomment-483370027 https://api.github.com/repos/pydata/xarray/issues/1850 MDEyOklzc3VlQ29tbWVudDQ4MzM3MDAyNw== nbren12 1386642 2019-04-15T18:41:22Z 2019-04-15T18:41:22Z CONTRIBUTOR

To be clear, I think there is some optimal middle ground between the "mega xarray-contrib" package and the current situation. I think the "micro-package" approach works when the collection of micro-packages is being maintained by an active/permanent entity (e.g. Ryan's research group). On the other hand, postdocs and grad students are very likely to leave the field entirely within a few years, at which point they will probably stop maintaining their "micro-packages".

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xarray contrib module 290593053
483342686 https://github.com/pydata/xarray/issues/1850#issuecomment-483342686 https://api.github.com/repos/pydata/xarray/issues/1850 MDEyOklzc3VlQ29tbWVudDQ4MzM0MjY4Ng== nbren12 1386642 2019-04-15T17:22:37Z 2019-04-15T17:22:37Z CONTRIBUTOR

I'd also like to thank @teoliphant for weighing in!

Bearing in mind the history of scipy, I agree that the xarray community doesn't need 100% centralization, but there should be some consolidation. IMO, the current situation of "one graduate student/postdoc per package" is not sustainable.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xarray contrib module 290593053
482643700 https://github.com/pydata/xarray/issues/525#issuecomment-482643700 https://api.github.com/repos/pydata/xarray/issues/525 MDEyOklzc3VlQ29tbWVudDQ4MjY0MzcwMA== nbren12 1386642 2019-04-12T16:45:17Z 2019-04-12T16:45:17Z CONTRIBUTOR

One additional issue: it seems like pint has some odd behavior with dask. Multiplication (and I assume addition) is not commutative:

```python
In [42]: da.ones((10,)) * ureg.m
Out[42]: dask.array<mul, shape=(10,), dtype=float64, chunksize=(10,)>

In [43]: ureg.m * da.ones((10,))
Out[43]: dask.array<mul, shape=(10,), dtype=float64, chunksize=(10,)> <Unit('meter')>
```

Note that the first result drops the unit entirely.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  support for units 100295585
482640504 https://github.com/pydata/xarray/pull/1597#issuecomment-482640504 https://api.github.com/repos/pydata/xarray/issues/1597 MDEyOklzc3VlQ29tbWVudDQ4MjY0MDUwNA== nbren12 1386642 2019-04-12T16:35:17Z 2019-04-12T16:35:17Z CONTRIBUTOR

Hey. I just wanted to bump this PR. How does it look @rabernat @jhamman @shoyer?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add methods for combining variables of differing dimensionality 261131958
482639629 https://github.com/pydata/xarray/issues/525#issuecomment-482639629 https://api.github.com/repos/pydata/xarray/issues/525 MDEyOklzc3VlQ29tbWVudDQ4MjYzOTYyOQ== nbren12 1386642 2019-04-12T16:32:25Z 2019-04-12T16:32:25Z CONTRIBUTOR

@rabernat's recent post inspired me to check out this issue. What would this issue entail now that __array_function__ is in numpy? Is there some reason this is more complicated than adding an appropriate __array_function__ to pint's quantity class?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  support for units 100295585
480063323 https://github.com/pydata/xarray/issues/1850#issuecomment-480063323 https://api.github.com/repos/pydata/xarray/issues/1850 MDEyOklzc3VlQ29tbWVudDQ4MDA2MzMyMw== nbren12 1386642 2019-04-04T21:04:37Z 2019-04-04T21:04:37Z CONTRIBUTOR

Thanks @rabernat, that awesome list looks pretty awesome.

However, I would still advocate for a more centralized approach to this problem. For instance, NCL has a huge library of contributed functions which they distribute along with the code. By now, I am sure that xarray users have basically reimplemented equivalents to all of these functions, but without a centralized home it is still too difficult to find or contribute new code.

For instance, I have a useful wrapper to scipy.ndimage that I use all the time, but it seems overkill to release/support a whole package for this one module. I would be much more likely to contribute a PR to a community run repository. I am also much more likely to use such a repo.

I would be more than willing to volunteer for such an effort, but I think it needs to involve multiple people. Various individuals have tried to make such repos on their own, but none seem to have reached critical mass. For example, https://github.com/crusaderky/xarray_extras https://github.com/fujiisoup/xr-scipy I think there should be multiple maintainers, so that if one person drops out, there still appears to be activity on the repo.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xarray contrib module 290593053
479128919 https://github.com/pydata/xarray/pull/1597#issuecomment-479128919 https://api.github.com/repos/pydata/xarray/issues/1597 MDEyOklzc3VlQ29tbWVudDQ3OTEyODkxOQ== nbren12 1386642 2019-04-02T18:15:10Z 2019-04-02T18:15:10Z CONTRIBUTOR

Ok. I added an example which is very similar to the DataArray.unstack example.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add methods for combining variables of differing dimensionality 261131958
478794367 https://github.com/pydata/xarray/pull/1597#issuecomment-478794367 https://api.github.com/repos/pydata/xarray/issues/1597 MDEyOklzc3VlQ29tbWVudDQ3ODc5NDM2Nw== nbren12 1386642 2019-04-02T00:20:24Z 2019-04-02T00:20:24Z CONTRIBUTOR

Okay. It looks like it is passing CI now, so I think it's ready for another look. How does it look? Are you still interested in including this functionality?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add methods for combining variables of differing dimensionality 261131958
478748969 https://github.com/pydata/xarray/pull/1597#issuecomment-478748969 https://api.github.com/repos/pydata/xarray/issues/1597 MDEyOklzc3VlQ29tbWVudDQ3ODc0ODk2OQ== nbren12 1386642 2019-04-01T21:09:31Z 2019-04-01T21:09:31Z CONTRIBUTOR

@shoyer Sorry. I rebased...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add methods for combining variables of differing dimensionality 261131958
478702680 https://github.com/pydata/xarray/pull/1597#issuecomment-478702680 https://api.github.com/repos/pydata/xarray/issues/1597 MDEyOklzc3VlQ29tbWVudDQ3ODcwMjY4MA== nbren12 1386642 2019-04-01T18:58:18Z 2019-04-01T18:58:18Z CONTRIBUTOR

@jhamman I did a little more work on this today. How do you recommend I update this to master? rebase?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add methods for combining variables of differing dimensionality 261131958
469451210 https://github.com/pydata/xarray/issues/2799#issuecomment-469451210 https://api.github.com/repos/pydata/xarray/issues/2799 MDEyOklzc3VlQ29tbWVudDQ2OTQ1MTIxMA== nbren12 1386642 2019-03-04T22:40:07Z 2019-03-04T22:40:07Z CONTRIBUTOR

Sure, I've been using that as a workaround as well. Unfortunately, that approach throws away all the nice info (e.g. metadata, coordinates) that xarray objects have and requires duplicating much of xarray's indexing logic.

{
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Performance: numpy indexes small amounts of data 1000 faster than xarray 416962458
469447632 https://github.com/pydata/xarray/issues/2799#issuecomment-469447632 https://api.github.com/repos/pydata/xarray/issues/2799 MDEyOklzc3VlQ29tbWVudDQ2OTQ0NzYzMg== nbren12 1386642 2019-03-04T22:27:57Z 2019-03-04T22:27:57Z CONTRIBUTOR

@max-sixty I tend to agree this use case could be outside of the scope of xarray. It sounds like significant progress might require re-implementing core xarray objects in C/Cython. Without more than 10x improvement, I would probably just continue using numpy arrays.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Performance: numpy indexes small amounts of data 1000 faster than xarray 416962458
469443856 https://github.com/pydata/xarray/issues/2799#issuecomment-469443856 https://api.github.com/repos/pydata/xarray/issues/2799 MDEyOklzc3VlQ29tbWVudDQ2OTQ0Mzg1Ng== nbren12 1386642 2019-03-04T22:15:49Z 2019-03-04T22:15:49Z CONTRIBUTOR

Thanks so much @shoyer. I didn't realize there was that much overhead for a single function call. OTOH, 2x slower than numpy would be way better than 1000x.

After looking at the profiling info more, I tend to agree with your 10x maximum speed-up. A couple of particularly slow functions (e.g. Dataset._validate_indexers) account for about 75% of run time. However, the remaining 25% is split across several other pure python routines.
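For reference, the kind of microbenchmark behind these numbers looks something like this (a minimal sketch; the exact ratio varies by machine and xarray version):

```python
import timeit

import numpy as np
import xarray as xr

arr = np.zeros(1000)
da = xr.DataArray(arr, dims=["x"])

n = 10_000
t_np = timeit.timeit(lambda: arr[0], number=n)
t_xr = timeit.timeit(lambda: da.isel(x=0), number=n)

# The gap comes from pure-Python machinery (indexer validation,
# coordinate handling) that runs on every call, not from the array math.
print(f"numpy: {t_np:.4f}s  xarray: {t_xr:.4f}s  ratio: {t_xr / t_np:.0f}x")
```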

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Performance: numpy indexes small amounts of data 1000 faster than xarray 416962458
469394020 https://github.com/pydata/xarray/issues/2799#issuecomment-469394020 https://api.github.com/repos/pydata/xarray/issues/2799 MDEyOklzc3VlQ29tbWVudDQ2OTM5NDAyMA== nbren12 1386642 2019-03-04T19:45:11Z 2019-03-04T19:45:11Z CONTRIBUTOR

cc @rabernat

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Performance: numpy indexes small amounts of data 1000 faster than xarray 416962458
445483774 https://github.com/pydata/xarray/pull/1597#issuecomment-445483774 https://api.github.com/repos/pydata/xarray/issues/1597 MDEyOklzc3VlQ29tbWVudDQ0NTQ4Mzc3NA== nbren12 1386642 2018-12-08T19:28:40Z 2018-12-08T19:28:40Z CONTRIBUTOR

I'd be happy to pick this up again if you think it will go through.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add methods for combining variables of differing dimensionality 261131958
399192741 https://github.com/pydata/xarray/issues/2241#issuecomment-399192741 https://api.github.com/repos/pydata/xarray/issues/2241 MDEyOklzc3VlQ29tbWVudDM5OTE5Mjc0MQ== nbren12 1386642 2018-06-21T18:03:05Z 2018-06-21T18:03:05Z CONTRIBUTOR

Thanks for the tips everyone. From the dask issue above, it appears that dask.reshape does not respect the boundaries of chunks. Therefore, pulling the first chunk of the stacked array requires pulling several chunks from the original un-reshaped data. This is a bit odd because the first chunk of the stacked array could just be a reshaped version of the first chunk from the unstacked array.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Slow performance with isel on stacked coordinates 334366223
368201527 https://github.com/pydata/xarray/issues/1850#issuecomment-368201527 https://api.github.com/repos/pydata/xarray/issues/1850 MDEyOklzc3VlQ29tbWVudDM2ODIwMTUyNw== nbren12 1386642 2018-02-24T05:23:04Z 2018-02-24T05:23:04Z CONTRIBUTOR

@maxim-lian There is a very short list of such packages hidden in the xarray documentation.

In general, there are a ton of these awesome-... repos floating around the internet which just list the useful tools/libraries related to ... . For example, there are repos out there like awesome-python and awesome-bash. Maybe someone could start an awesome-xarray repo.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xarray contrib module 290593053
366548976 https://github.com/pydata/xarray/pull/1597#issuecomment-366548976 https://api.github.com/repos/pydata/xarray/issues/1597 MDEyOklzc3VlQ29tbWVudDM2NjU0ODk3Ng== nbren12 1386642 2018-02-18T21:24:28Z 2018-02-18T21:24:28Z CONTRIBUTOR

Sorry for the random activity. I accidentally hard reset the master branch of nbren12/xarray to pydata/xarray.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add methods for combining variables of differing dimensionality 261131958
366540505 https://github.com/pydata/xarray/pull/1885#issuecomment-366540505 https://api.github.com/repos/pydata/xarray/issues/1885 MDEyOklzc3VlQ29tbWVudDM2NjU0MDUwNQ== nbren12 1386642 2018-02-18T19:26:31Z 2018-02-18T19:26:31Z CONTRIBUTOR

cool! Thanks

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Raise when pcolormesh coordinate is not sorted 294089233
366468848 https://github.com/pydata/xarray/pull/1885#issuecomment-366468848 https://api.github.com/repos/pydata/xarray/issues/1885 MDEyOklzc3VlQ29tbWVudDM2NjQ2ODg0OA== nbren12 1386642 2018-02-17T20:24:36Z 2018-02-17T20:24:36Z CONTRIBUTOR

I just rebased onto master.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Raise when pcolormesh coordinate is not sorted 294089233
365372937 https://github.com/pydata/xarray/pull/1885#issuecomment-365372937 https://api.github.com/repos/pydata/xarray/issues/1885 MDEyOklzc3VlQ29tbWVudDM2NTM3MjkzNw== nbren12 1386642 2018-02-13T19:15:30Z 2018-02-13T19:15:47Z CONTRIBUTOR

no problem. I have always preferred putting operators on new lines though. Didn't realize that was against pep8. oh well 🤷‍♂️ .

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Raise when pcolormesh coordinate is not sorted 294089233
365332360 https://github.com/pydata/xarray/pull/1885#issuecomment-365332360 https://api.github.com/repos/pydata/xarray/issues/1885 MDEyOklzc3VlQ29tbWVudDM2NTMzMjM2MA== nbren12 1386642 2018-02-13T17:01:26Z 2018-02-13T17:01:26Z CONTRIBUTOR

Sorry @nbren12 there is still a pep8 error somewhere, and you can freely make it an error as @shoyer suggests.

Hopefully, the commit I just pushed fixes this. I had some global flake8 settings that were messing with my local linting.

Note that currently the test for monotonic coords is for pcolormesh only, while it could be for all 2d plots as well.

I don't think I was having any issues with contour or contourf when I first opened the issue, but it probably does break imshow.

If we start to check input for sanity we might as well raise an error when coordinates are not regularly spaced in the imshow case.

Maybe we could leave this to a later PR.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Raise when pcolormesh coordinate is not sorted 294089233
363533796 https://github.com/pydata/xarray/pull/1885#issuecomment-363533796 https://api.github.com/repos/pydata/xarray/issues/1885 MDEyOklzc3VlQ29tbWVudDM2MzUzMzc5Ng== nbren12 1386642 2018-02-06T19:17:40Z 2018-02-06T19:17:40Z CONTRIBUTOR

Yah. I knew my solution before was probably too cute.

On Tue, Feb 6, 2018 at 11:14 AM, Fabien Maussion notifications@github.com wrote:

Thanks! This looks good.


{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Raise when pcolormesh coordinate is not sorted 294089233
363531759 https://github.com/pydata/xarray/pull/1885#issuecomment-363531759 https://api.github.com/repos/pydata/xarray/issues/1885 MDEyOklzc3VlQ29tbWVudDM2MzUzMTc1OQ== nbren12 1386642 2018-02-06T19:11:04Z 2018-02-06T19:11:04Z CONTRIBUTOR

There. I think this code should work for all dtypes and 2D coords.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Raise when pcolormesh coordinate is not sorted 294089233
362992657 https://github.com/pydata/xarray/pull/1885#issuecomment-362992657 https://api.github.com/repos/pydata/xarray/issues/1885 MDEyOklzc3VlQ29tbWVudDM2Mjk5MjY1Nw== nbren12 1386642 2018-02-05T06:24:40Z 2018-02-05T06:24:40Z CONTRIBUTOR

One reason is that it's not obvious if they would like increasing or decreasing coordinates.

For me at least, plt.pcolormesh automatically displays in increasing order even if one of the input arrays is sorted in descending order. This happens all the time with meteorological data available in pressure coordinates (pressure goes down with height). I usually have to manually call plt.gca().invert_yaxis() to flip the y axis. Does xarray.plot behave differently?
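In plain matplotlib that workflow looks something like this (an illustrative sketch with made-up sounding data):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, just for this sketch
import matplotlib.pyplot as plt

# Pressure decreases with height, so the raw coordinate is descending.
pressure = np.array([1000.0, 850.0, 500.0, 250.0])  # hPa
lon = np.linspace(0.0, 90.0, 8)
temp = np.random.rand(pressure.size, lon.size)

fig, ax = plt.subplots()
ax.pcolormesh(lon, pressure, temp, shading="auto")
ax.invert_yaxis()  # put low pressure (high altitude) at the top
```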

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Raise when pcolormesh coordinate is not sorted 294089233
362990917 https://github.com/pydata/xarray/pull/1885#issuecomment-362990917 https://api.github.com/repos/pydata/xarray/issues/1885 MDEyOklzc3VlQ29tbWVudDM2Mjk5MDkxNw== nbren12 1386642 2018-02-05T06:12:08Z 2018-02-05T06:12:08Z CONTRIBUTOR

@shoyer That would work with me. Is there any chance people would want to make heatmaps involving categorical variables though?

If we do decide to raise an error, why not go one step further and just sort the coordinates automatically?

On Sun, Feb 4, 2018 at 4:00 PM, Stephan Hoyer notifications@github.com wrote:

@shoyer commented on this pull request.

In xarray/plot/plot.py https://github.com/pydata/xarray/pull/1885#discussion_r165863769:

```diff
@@ -750,6 +767,13 @@ def _infer_interval_breaks(coord, axis=0):
            [ 2.5,  3.5,  4.5]])
     """
     coord = np.asarray(coord)
+
+    if not _is_monotonic(coord, axis=axis):
+        warnings.warn("The input coordinate is not sorted in increasing order "
```

Rather an a warning, why not make this an error? I don't see any use-cases for 2d plots with non-monotonic coordinates. With the current version of xarray, these plots always end up wrong in some way, either by not plotting everything or with bad axis labels.


{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Raise when pcolormesh coordinate is not sorted 294089233
362853059 https://github.com/pydata/xarray/pull/1885#issuecomment-362853059 https://api.github.com/repos/pydata/xarray/issues/1885 MDEyOklzc3VlQ29tbWVudDM2Mjg1MzA1OQ== nbren12 1386642 2018-02-03T20:47:30Z 2018-02-03T20:47:30Z CONTRIBUTOR

Ok. I think everything is ready now.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Raise when pcolormesh coordinate is not sorted 294089233
362836782 https://github.com/pydata/xarray/pull/1885#issuecomment-362836782 https://api.github.com/repos/pydata/xarray/issues/1885 MDEyOklzc3VlQ29tbWVudDM2MjgzNjc4Mg== nbren12 1386642 2018-02-03T17:21:16Z 2018-02-03T17:21:16Z CONTRIBUTOR

Thanks for your help @fmaussion. Hopefully the commit I just pushed fixes the failing tests. I will work on adding a test and fixing the formatting.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Raise when pcolormesh coordinate is not sorted 294089233
360232940 https://github.com/pydata/xarray/issues/1852#issuecomment-360232940 https://api.github.com/repos/pydata/xarray/issues/1852 MDEyOklzc3VlQ29tbWVudDM2MDIzMjk0MA== nbren12 1386642 2018-01-24T18:42:58Z 2018-01-24T18:42:58Z CONTRIBUTOR

I think automatically sorting is OK for 1D coordinates at least. I agree it is more complicated for other situations.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  bug: 2D pcolormesh plots are wrong when coordinate is not ascending order 291103680
360230869 https://github.com/pydata/xarray/issues/1852#issuecomment-360230869 https://api.github.com/repos/pydata/xarray/issues/1852 MDEyOklzc3VlQ29tbWVudDM2MDIzMDg2OQ== nbren12 1386642 2018-01-24T18:36:12Z 2018-01-24T18:36:19Z CONTRIBUTOR

pcolormesh expects n+1 coordinates

I agree that pcolormesh ultimately works with the mesh corners, but I am pretty sure passing coordinates of the same length also works.

you'll probably have to sort the values beforehand too

True. But we can do it automatically with xarray because it has the coordinate information.

So the question is how much sanity check xarray should do before sending the data to matplotlib, and maybe a warning of some kind would be useful.

I am not sure there is any circumstance where it would be preferable to plot the scrambled data. Is there some problem with just adding something like the two lines I wrote above to plot.pcolormesh?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  bug: 2D pcolormesh plots are wrong when coordinate is not ascending order 291103680
360039440 https://github.com/pydata/xarray/issues/1852#issuecomment-360039440 https://api.github.com/repos/pydata/xarray/issues/1852 MDEyOklzc3VlQ29tbWVudDM2MDAzOTQ0MA== nbren12 1386642 2018-01-24T07:03:00Z 2018-01-24T07:03:12Z CONTRIBUTOR

This is pretty easily fixed by running:

```python
sort_inds = {dim: np.argsort(z[dim].values) for dim in z.dims}
z = z.isel(**sort_inds)
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  bug: 2D pcolormesh plots are wrong when coordinate is not ascending order 291103680
359963097 https://github.com/pydata/xarray/issues/1850#issuecomment-359963097 https://api.github.com/repos/pydata/xarray/issues/1850 MDEyOklzc3VlQ29tbWVudDM1OTk2MzA5Nw== nbren12 1386642 2018-01-23T23:09:21Z 2018-01-23T23:09:21Z CONTRIBUTOR

I agree that the separate repository model is probably best. However, should it be in just one repository or in many?

Using many repos would solve the domain-specific dependency problem, but the sklearn-contrib packages are not that discoverable IMO. I found two of them via google on separate occasions before realizing that they were part of the same github organization.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xarray contrib module 290593053
359570153 https://github.com/pydata/xarray/issues/1850#issuecomment-359570153 https://api.github.com/repos/pydata/xarray/issues/1850 MDEyOklzc3VlQ29tbWVudDM1OTU3MDE1Mw== nbren12 1386642 2018-01-22T21:25:53Z 2018-01-22T21:26:31Z CONTRIBUTOR

Thanks for starting this issue @shoyer. One thing I would be interested to know is how sklearn and tensorflow balance code-quality and API consistency with low barrier to entry. For instance, most of the sklearn contrib packages provide classes which inherit from sklearn's Transformer, BaseEstimator, or Regressor classes, which ensures that all the contrib packages share a common interface.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xarray contrib module 290593053
359534363 https://github.com/pydata/xarray/issues/1288#issuecomment-359534363 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDM1OTUzNDM2Mw== nbren12 1386642 2018-01-22T19:19:25Z 2018-01-22T19:19:25Z CONTRIBUTOR

I would also be very interested in seeing your codes @lamorton. Overall, I think the xarray community could really benefit from some kind of centralized contrib package which has a low barrier to entry for these kinds of functions. So far, I suspect there has been a large amount of code duplication for routine tasks like the fft, since I have also written a function for that.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
358853933 https://github.com/pydata/xarray/issues/1839#issuecomment-358853933 https://api.github.com/repos/pydata/xarray/issues/1839 MDEyOklzc3VlQ29tbWVudDM1ODg1MzkzMw== nbren12 1386642 2018-01-19T03:07:32Z 2018-01-19T03:09:08Z CONTRIBUTOR

I guess the main barrier for me is initializing the coordinates quickly, which xr.DataArray(np.ones((3, 4)), dims=['x', 'y']) doesn't do.

What would you suggest for the signature of a function like xr.ones?

Something like the following would work for me:

```python
def ones(shape, dims=None, coordinate_initializer=None):
    """Create a DataArray of ones with initialized coordinates

    Parameters
    ----------
    shape : tuple
        shape of the array
    dims : list of str, optional
        list of dimensions with the same length as shape. The default is
        dim_0, dim_1, ..., etc.
    coordinate_initializer : callable, optional
        function which returns the appropriate coordinates. The signature for
        this function must be coordinate_initializer(dim, dim_length). The
        default is to initialize all the coordinates with
        ``np.arange(dim_length)``
    """
    pass
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add simple array creation functions for easier unit testing 289837692
343643422 https://github.com/pydata/xarray/pull/1597#issuecomment-343643422 https://api.github.com/repos/pydata/xarray/issues/1597 MDEyOklzc3VlQ29tbWVudDM0MzY0MzQyMg== nbren12 1386642 2017-11-11T06:01:52Z 2017-11-11T06:02:12Z CONTRIBUTOR

@shoyer If you are okay with it, I think we might want to leave that to a later date if ever. I am not exactly sure what a useful API for that would be ATM. On the other hand, I have been using the original stack_cat and unstack_cat functions for a couple of months, and they have handled my basic uses pretty well.

For more complicated uses (e.g. taking different subsets of each variable and concatenating the output), I have started working on a project which is similar to sklearn-pandas. Since there are a million ways several xarray variables could be processed/subsetted, stacked and then concatenated, I think this functionality should probably remain in a third party package for now.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add methods for combining variables of differing dimensionality 261131958
343312698 https://github.com/pydata/xarray/pull/1597#issuecomment-343312698 https://api.github.com/repos/pydata/xarray/issues/1597 MDEyOklzc3VlQ29tbWVudDM0MzMxMjY5OA== nbren12 1386642 2017-11-09T22:28:22Z 2017-11-09T22:28:22Z CONTRIBUTOR

Okay. I think I'm done with the updates to the documentation.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add methods for combining variables of differing dimensionality 261131958
342665601 https://github.com/pydata/xarray/pull/1597#issuecomment-342665601 https://api.github.com/repos/pydata/xarray/issues/1597 MDEyOklzc3VlQ29tbWVudDM0MjY2NTYwMQ== nbren12 1386642 2017-11-08T00:04:57Z 2017-11-08T00:04:57Z CONTRIBUTOR

I just added some docs, but I still need to add a note to whats-new.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add methods for combining variables of differing dimensionality 261131958
337970838 https://github.com/pydata/xarray/issues/1317#issuecomment-337970838 https://api.github.com/repos/pydata/xarray/issues/1317 MDEyOklzc3VlQ29tbWVudDMzNzk3MDgzOA== nbren12 1386642 2017-10-19T16:56:37Z 2017-10-19T16:56:37Z CONTRIBUTOR

Sorry. I guess I should have made my last comment in the PR.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  API for reshaping DataArrays as 2D "data matrices" for use in machine learning 216215022
337796691 https://github.com/pydata/xarray/issues/1317#issuecomment-337796691 https://api.github.com/repos/pydata/xarray/issues/1317 MDEyOklzc3VlQ29tbWVudDMzNzc5NjY5MQ== nbren12 1386642 2017-10-19T04:32:03Z 2017-10-19T04:32:03Z CONTRIBUTOR

After using my own version of this code for the past month or so, it has occurred to me that this API probably will not support stacking arrays with different sizes along shared dimensions. For instance, I need to "stack" humidity below an altitude of 10 km with temperature between 0 and 16 km. IMO, the easiest way to do this would be to change these methods into top-level functions which can take any dict or iterable of DataArrays. We could leave that for a later PR of course.
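For concreteness, the humidity/temperature case above can be sketched with plain selection plus `np.concatenate` (a minimal illustration under assumed variable names `q` and `T`, not a proposed API):

```python
import numpy as np
import xarray as xr

z = np.arange(20.0)  # altitude in km
ds = xr.Dataset(
    {"q": ("z", np.random.rand(20)), "T": ("z", np.random.rand(20))},
    coords={"z": z},
)

# take humidity below 10 km and temperature between 0 and 16 km;
# label-based slices are inclusive of both endpoints
parts = [ds["q"].sel(z=slice(0, 10)), ds["T"].sel(z=slice(0, 16))]

# the pieces have different lengths along the shared z dimension,
# so they cannot be stacked with the per-Dataset methods directly
vec = np.concatenate([p.values for p in parts])
```

A top-level function taking such an iterable of already-subsetted DataArrays would cover this case naturally.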

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  API for reshaping DataArrays as 2D "data matrices" for use in machine learning 216215022
333337845 https://github.com/pydata/xarray/pull/1597#issuecomment-333337845 https://api.github.com/repos/pydata/xarray/issues/1597 MDEyOklzc3VlQ29tbWVudDMzMzMzNzg0NQ== nbren12 1386642 2017-09-30T21:41:58Z 2017-09-30T21:41:58Z CONTRIBUTOR

@rabernat Your point is well taken. I will add some docs/motivation to the reshaping page.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add methods for combining variables of differing dimensionality 261131958
333275551 https://github.com/pydata/xarray/pull/1597#issuecomment-333275551 https://api.github.com/repos/pydata/xarray/issues/1597 MDEyOklzc3VlQ29tbWVudDMzMzI3NTU1MQ== nbren12 1386642 2017-09-30T02:20:41Z 2017-09-30T02:21:14Z CONTRIBUTOR

Okay. I just changed the names of the methods and wrote a test case for the problem with the dtype of the stacked dimensions not being preserved by to_stacked_array.

At the moment, I am filling in the missing dimensions with None, so the resulting index has dtype `object`. This is then concatenated with the bona fide indices of the variables which are not missing this index, so the index as a whole is cast to the lowest common denominator, which is `object`. Unfortunately we can't just put NaN in because NaN is a floating-point concept, so not all numpy dtypes have an equivalent of NaN. @shoyer Do you have any thoughts about how to resolve this?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add methods for combining variables of differing dimensionality 261131958
333271025 https://github.com/pydata/xarray/pull/1597#issuecomment-333271025 https://api.github.com/repos/pydata/xarray/issues/1597 MDEyOklzc3VlQ29tbWVudDMzMzI3MTAyNQ== nbren12 1386642 2017-09-30T01:06:09Z 2017-09-30T01:06:09Z CONTRIBUTOR

That naming sounds good to me.

Also, I was having an issue with unstack_cat returning an index with dtype object that I would like to sort out.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add methods for combining variables of differing dimensionality 261131958
333023152 https://github.com/pydata/xarray/pull/1597#issuecomment-333023152 https://api.github.com/repos/pydata/xarray/issues/1597 MDEyOklzc3VlQ29tbWVudDMzMzAyMzE1Mg== nbren12 1386642 2017-09-29T03:38:07Z 2017-09-29T03:38:07Z CONTRIBUTOR

Or maybe DataArray.to_dataset_unstacked is good.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add methods for combining variables of differing dimensionality 261131958
333022698 https://github.com/pydata/xarray/pull/1597#issuecomment-333022698 https://api.github.com/repos/pydata/xarray/issues/1597 MDEyOklzc3VlQ29tbWVudDMzMzAyMjY5OA== nbren12 1386642 2017-09-29T03:33:34Z 2017-09-29T03:34:02Z CONTRIBUTOR

Thanks! Yah...I'm not very good at naming things. I think Dataset.to_stacked_array makes sense, but I would typically expect something like from_stacked_array to be a static method of Dataset (e.g. pd.MultiIndex.from_tuples). We could always move unstack_cat to Dataset, but then we miss out on the ability to make sequential method calls.

IMO the behavior of unstack_cat is kind of what I would expect DataArray.to_dataset to do in the case where dim is a stacked coordinate, so maybe we could put it in there.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add methods for combining variables of differing dimensionality 261131958
330598492 https://github.com/pydata/xarray/issues/1577#issuecomment-330598492 https://api.github.com/repos/pydata/xarray/issues/1577 MDEyOklzc3VlQ29tbWVudDMzMDU5ODQ5Mg== nbren12 1386642 2017-09-19T16:37:46Z 2017-09-19T16:37:46Z CONTRIBUTOR

My understanding is that the "core dimensions" are moved to the end of each input, and then singleton axes are inserted to make them broadcastable in a numpy sense. Is that correct?
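That understanding can be checked with a small example (a sketch using only documented `apply_ufunc` arguments): the core dimension is moved to the last axis before the function is called, so the function can reduce over `axis=-1`.

```python
import numpy as np
import xarray as xr

a = xr.DataArray(np.arange(12.0).reshape(3, 4), dims=["x", "time"])

# "time" is declared a core dimension, so apply_ufunc moves it to the
# end of the array before calling the function
result = xr.apply_ufunc(lambda arr: arr.mean(axis=-1), a,
                        input_core_dims=[["time"]])
```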

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Potential error in apply_ufunc docstring for input_core_dims 258640421
330374224 https://github.com/pydata/xarray/issues/1577#issuecomment-330374224 https://api.github.com/repos/pydata/xarray/issues/1577 MDEyOklzc3VlQ29tbWVudDMzMDM3NDIyNA== nbren12 1386642 2017-09-18T22:28:58Z 2017-09-18T22:29:08Z CONTRIBUTOR

Also, it could be clarified what is meant by a core dimension.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Potential error in apply_ufunc docstring for input_core_dims 258640421
330359229 https://github.com/pydata/xarray/issues/1554#issuecomment-330359229 https://api.github.com/repos/pydata/xarray/issues/1554 MDEyOklzc3VlQ29tbWVudDMzMDM1OTIyOQ== nbren12 1386642 2017-09-18T21:18:01Z 2017-09-18T21:18:01Z CONTRIBUTOR

I have been playing around with the MultiIndexes as part of #1317, so I could take a stab at implementing some sort of stack which combines the levels etc.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  TypeError on DataArray.stack() if any of the dimensions to be stacked has a MultiIndex 255597950
330282841 https://github.com/pydata/xarray/issues/1317#issuecomment-330282841 https://api.github.com/repos/pydata/xarray/issues/1317 MDEyOklzc3VlQ29tbWVudDMzMDI4Mjg0MQ== nbren12 1386642 2017-09-18T16:45:55Z 2017-09-18T16:46:37Z CONTRIBUTOR

@shoyer I wrote a class that does this a while ago. It is available here: data_matrix.py. It is used like this:

```python
# D is a dataset
# the signature for DataMatrix.__init__ is
# DataMatrix(feature_dims, sample_dims, variables)

mat = DataMatrix(['z'], ['x'], ['a', 'b'])
y = mat.dataset_to_mat(D)
x = mat.mat_to_dataset(y)
```

One of the problems I had to handle was with concatenating/stacking DataArrays with different numbers of dimensions---`stack` and `unstack` combined with `to_array` can only handle the case where the desired feature variables all have the same dimensionality. ATM my code stacks the desired dimensions for each variable and then manually calls `np.hstack` to produce the final matrix, but I bet it would be easy to create a pandas Index object which can handle this use case.

Would you be open to a PR along these lines?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  API for reshaping DataArrays as 2D "data matrices" for use in machine learning 216215022
330023238 https://github.com/pydata/xarray/pull/1517#issuecomment-330023238 https://api.github.com/repos/pydata/xarray/issues/1517 MDEyOklzc3VlQ29tbWVudDMzMDAyMzIzOA== nbren12 1386642 2017-09-17T05:56:12Z 2017-09-17T05:56:12Z CONTRIBUTOR

Sure. I'd be happy to make a PR once this gets merged. On Sat, Sep 16, 2017 at 10:39 PM Stephan Hoyer notifications@github.com wrote:

Alternatively apply_ufunc could see if the func object has a pre_dask_atop method, and apply it if it does.

This seems like a reasonable option to me. Once we get this merged, want to make a PR?

@jhamman https://github.com/jhamman could you give this a review? I have not included extensive documentation yet, but I am also reluctant to squeeze that into this PR before we make it public API. (Which I'd like to save for another one.)


{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Automatic parallelization for dask arrays in apply_ufunc 252358450

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);