issue_comments
60 rows where issue = 628719058 sorted by updated_at descending
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
1198743015 | https://github.com/pydata/xarray/issues/4118#issuecomment-1198743015 | https://api.github.com/repos/pydata/xarray/issues/4118 | IC_kwDOAMm_X85Hc13n | jakirkham 3019665 | 2022-07-29T00:14:46Z | 2022-07-29T00:14:46Z | NONE | Wanted to note issue ( https://github.com/carbonplan/ndpyramid/issues/10 ) here, which may be of interest to people here. Also we are thinking about a Dask blogpost in this space if people have thoughts on what we should include and/or are interested in being involved. Details in issue ( https://github.com/dask/dask-blog/issues/141 ). |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
1040778284 | https://github.com/pydata/xarray/issues/4118#issuecomment-1040778284 | https://api.github.com/repos/pydata/xarray/issues/4118 | IC_kwDOAMm_X84-CQQs | mraspaud 167802 | 2022-02-15T20:48:51Z | 2022-07-18T13:05:09Z | CONTRIBUTOR | Thanks for launching this discussion @TomNicholas ! I'm a core dev of pytroll/satpy which handles earth observing satellite data. I got interested in DataTree because we have data from the same instruments available at multiple resolutions, hence not fitting into a single Dataset. For us, Option 1 probably feels better. Even when having data at multiple resolutions, it is still a limited number of resolutions and hence splitting them in groups is the natural way of going I would say. We do not use the features you mention in Zarr or GRIB, as a majority of the satellite data we use is provided in netcdf nowadays. Don't hesitate to ask if you want to know more or if something is unclear, we are really interested in these developments, so if we can help that way... |
{ "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
904817641 | https://github.com/pydata/xarray/issues/4118#issuecomment-904817641 | https://api.github.com/repos/pydata/xarray/issues/4118 | IC_kwDOAMm_X8417mvp | TomNicholas 35968931 | 2021-08-24T17:00:24Z | 2022-05-19T16:33:26Z | MEMBER | So I had a crack at making a full It's based on @benbovy's Some limitations of the approach I used are:
- Each dataset in the tree is entirely separate, so doing something like You can create a It's about 70% working, but some things I could do with some help with are:
1) ~Fundamental design questions about the class structure, such as whether There will definitely be many bugs, but any thoughts or input appreciated! |
{ "total_count": 8, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 8, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
1059499222 | https://github.com/pydata/xarray/issues/4118#issuecomment-1059499222 | https://api.github.com/repos/pydata/xarray/issues/4118 | IC_kwDOAMm_X84_JqzW | tacaswell 199813 | 2022-03-04T20:25:47Z | 2022-03-04T20:25:47Z | NONE | @LunarLanding You may also be interested in awkward array. |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
1059382908 | https://github.com/pydata/xarray/issues/4118#issuecomment-1059382908 | https://api.github.com/repos/pydata/xarray/issues/4118 | IC_kwDOAMm_X84_JOZ8 | LunarLanding 4441338 | 2022-03-04T17:46:03Z | 2022-03-04T18:14:19Z | NONE |
I mean that I might have, for instance, a map from 2 variables to data, i.e. (x,y)->c, that I can write as a DataArray XY with two dimensions x and y and the values being c. Then I have a function f so that f(c)->d[g(c)], i.e. it yields an array whose length depends on c. I wish I could say: apply f to XY, building a variable length array as you get the output. It could be stored as a sparse matrix (X,Y,G). This is a bit out of scope for this discussion, but it is related, since creating a differently named group per dimension length is often mentioned as a workaround (which does not scale when you have a 1000x(variable length dimension) data).
The use-case is iteratively adding values to a dataset by mapping functions over multiple variables / dimensions in arbitrary compositions. This happens in the context of data analysis, where you start with some source data and then iteratively create analysis functions, and then want to query / display / do statistics/reductions on the set of original data + analysis. Explicit hierarchical dimensions allow for merging and referring to data with no collisions in a single datatree/group. PS: in netcdf-4 dimensions are seen by children, it matches what I previously posted; in HDF5 nodes are hardlinks to the actual data, this might be exactly the xarray-datagroup posted above.

Example of ideal datastructure

The datastructure that is more useful for this kind of analysis is the one that is an arbitrary graph of n-dimensional arrays; forcing the graph to have a hierarchical access allows optional organization; the graph itself can exist as python objects for nodes and references for edges. If the tree is not necessary/required everything can be placed on the first level, as it is done on a Dataset.
# Example:
## Notation
- `a:b` value `a` has type `b`
- `t[...,n,...]` : type of data array of values of type `t`, with axis of length `n`
- `D(n(,l))` dimension of size `n` with optional labels `l`
- `A(t,*(dims:tuple[D]))` : type of data array of values of type `t`, with dimension `dims`
- a tree node `T` is either:
  - a dict from hashables to tree nodes, `dict[Hashable,T]`
  - a dimension `D`
  - a data array `A`
- `a[*tags]:=a[tag[0]][tag[1]]...[tag[len(tag)-1]]`
- `map(f,*args:A,dims:tuple[D])` maps `f` over `args` broadcasting over `dims`

Start with a 2-dimensional DataArray:
```
d0
(
  Graph : (
    x->D(x_n,float[x_n])
    y->D(y_n)
    v->A(float,x,y)
  )
  Tree : (
    {
      'x':x,
      'y':y,
      'v':v,
    }
  )
)
```
Map a function `f` that introduces a new dimension `w` with constant labels `f_w_l:int[f_w_n]` (through map_blocks or apply_ufunc) and add it to d0:
```
f : x:float->(
  Graph:
    f_w->D(f_w_n,f_w_l)
    a->A(float,f_w)
    b->A(float)
  Tree:
    {
      'w':f_w,
      'a':a,
      'b':b,
    })

d1=d0.copy()
d1['f']=map(
  f,
  d0['v'],
  (d0['x'],d0['y'])
)
d1
(
  Graph :
    x->D(x_n,float[x_n])
    y->D(y_n)
    v->A(float,x,y)
    f_w->D(f_w_n,f_w_l)
    f_a->A(float,x,y,f_w)
    f_b->A(float,x,y)
  Tree :
    {
      'x':x,
      'y':y,
      'v':v,
      'f':{
        'w':f_w,
        'a':f_a,
        'b':f_b,
      }
    }
)
```
Map a function `g`, that has a dimension of the same name but different meaning and therefore possibly different length `g_w_n` and `g_w_l`:
```
g : x:float->(
  Graph:
    g_w->D(g_w_n,g_w_l)
    a->A(float,g_w)
    b->A(float)
  Tree:
    {
      'w':g_w,
      'a':a,
      'b':b,
    })

d2=d1.copy()
d2['g']=map(
  g,
  d1['v'],
  (d1['x'],d1['y'])
)
d2
(
  Graph :
    x->D(x_n,float[x_n])
    y->D(y_n)
    v->A(float,x,y)
    f_w->D(f_w_n,f_w_l)
    f_a->A(float,x,y,f_w)
    f_b->A(float,x,y)
    g_w->D(g_w_n,g_w_l)
    g_a->A(float,x,y,g_w)
    g_b->A(float,x,y)
  Tree :
    {
      'x':x,
      'y':y,
      'v':v,
      'f':{
        'w':f_w,
        'a':f_a,
        'b':f_b,
      },
      'g':{
        'w':g_w,
        'a':g_a,
        'b':g_b,
      }
    }
)
```
Notice that both `f` and `g` output a dimension named 'w' but that they have different lengths and possibly different meanings.
Suppose I now want to run analysis on f's and g's output, with a function that takes two a's and outputs a float. Then d3 looks like:
```
h : a1:float,a2:float->(
  Graph:
    r->A(float)
  Tree:
    r
)

d3=d2.copy()
d3['f_g_aa']=map(
  h,
  d2['f','a'],d2['g','a'],
  (d2['x'],d2['y'],d2['f','w'],d2['g','w'])
)
d3
{
  Graph :
    x->D(x_n,float[x_n])
    y->D(y_n)
    v->A(float,x,y)
    f_w->D(f_w_n,f_w_l)
    f_a->A(float,x,y,f_w)
    f_b->A(float,x,y)
    g_w->D(g_w_n,g_w_l)
    g_a->A(float,x,y,g_w)
    g_b->A(float,x,y)
    f_g_aa->A(float,x,y,f_w,g_w)
  Tree :
    {
      'x':x,
      'y':y,
      'v':v,
      'f':{
        'w':f_w,
        'a':f_a,
        'b':f_b,
      },
      'g':{
        'w':g_w,
        'a':g_a,
        'b':g_b,
      },
      'f_g_aa': f_g_aa
    }
}
```
Compared to what I posted before, I dropped resolving the dimension of an array by its position in the hierarchy, since it would be inapplicable when a variable refers to dimensions in a different branch of the tree. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
1047944213 | https://github.com/pydata/xarray/issues/4118#issuecomment-1047944213 | https://api.github.com/repos/pydata/xarray/issues/4118 | IC_kwDOAMm_X84-dlwV | TomNicholas 35968931 | 2022-02-22T15:58:48Z | 2022-02-22T15:58:48Z | MEMBER | Also thanks @OriolAbril , it's useful to have an ArViz perspective.
I see In either case I imagine all we might need to do is slightly extend |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
1047932340 | https://github.com/pydata/xarray/issues/4118#issuecomment-1047932340 | https://api.github.com/repos/pydata/xarray/issues/4118 | IC_kwDOAMm_X84-di20 | TomNicholas 35968931 | 2022-02-22T15:47:15Z | 2022-02-22T15:50:41Z | MEMBER | Hi @LunarLanding , thanks for your ideas!
It sounds a bit like what you are suggesting is essentially a model in which dimensions are explicit objects, which can be referred to from other groups, like in netCDF. (NetCDF has "dimension IDs".) This would be a bit of a departure from the model that
By "variable" length, do you mean that the length of dimensions differs between variables in the same group, or just that you don't know the length of the dimension in advance? Is there a specific use case which you think would require explicit dimensions to solve? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
1047915016 | https://github.com/pydata/xarray/issues/4118#issuecomment-1047915016 | https://api.github.com/repos/pydata/xarray/issues/4118 | IC_kwDOAMm_X84-deoI | LunarLanding 4441338 | 2022-02-22T15:30:00Z | 2022-02-22T15:38:52Z | NONE | Often I run a function over a dataset, with each call outputting a hierarchical data structure, containing fixed dimensions in the best cases and variable length in the worst.
For this, it would make more sense to be able to have dimensions ( with optional labels and coordinates ) assigned to nodes (and these would be inherited by any descendants). Leaf nodes would hold data.
On merge, dimensions could be bubbled up as long as length (and labels) matched.
Operations with dimensions would then go down to the corresponding dimension level before applying the operator, i.e. Datagroup and Datatree are subcases of this general structure, which could be enforced via flags/checks. Option 1 is where the extremities of the tree are a node with two sets of child nodes, dimension labels and n-dimensional arrays. Option 2 is where the extremities of the tree are a node with a child node for an n-dimensional array A, and a sibling node for each dimension of A, containing the corresponding labels. I'm sure I'm missing some big issue with the mental model I have; for instance, I haven't thought about transformations or coordinates at all. But for clarity I tried to write it down below. The most general structure for a dataset I can think of is a directed graph. Each node A is an n-dimensional (sparse) array, where each dimension D points optionally to a one-dimensional node B with the same length. To get a hierarchical structure, we:
We can resolve D's target by (A) checking for a sibling in T with the same name, and otherwise going up one level and returning to (A). MultiIndexes ( multi-dimensional (sparse) labels ) generalize this model, but require tuple labels in T's edges, i.e. h/j/a[x,y,z] has a sibling h/j/(x,y)[x,y] , with z's labels being one level above, i.e. h/z[z] ( the notation a[b] means map of index b to value a ). |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
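The lookup rule described in the comment above (check the current level for the dimension's name, otherwise go up one level and repeat) can be sketched in plain Python. This is a minimal illustration only: the `Node` class, its `dims` dict, and the `resolve_dim` and `path_of` helpers are hypothetical names invented here, not part of xarray or of any proposed API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node holding dimension labels and child nodes."""

    name: str
    dims: dict[str, list] = field(default_factory=dict)  # dimension name -> labels
    children: dict[str, Node] = field(default_factory=dict)
    parent: Node | None = field(default=None, repr=False)

    def add_child(self, child: Node) -> Node:
        child.parent = self
        self.children[child.name] = child
        return child


def path_of(node: Node | None) -> str:
    """Return the slash-separated path from the root down to `node`."""
    parts = []
    while node is not None:
        parts.append(node.name)
        node = node.parent
    return "/".join(reversed(parts))


def resolve_dim(node: Node, dim: str) -> tuple[str, list]:
    """Walk from `node` towards the root and return (path, labels) of the
    nearest level that defines `dim`."""
    current = node
    while current is not None:
        if dim in current.dims:
            return path_of(current), current.dims[dim]
        current = current.parent
    raise KeyError(f"dimension {dim!r} not found at or above {path_of(node)!r}")


# Mirrors the h/j/a[x,y,z] example: x and y are defined next to the array
# in h/j, while z is defined one level up in h.
h = Node("h", dims={"z": [0, 1, 2]})
j = h.add_child(Node("j", dims={"x": [10, 20], "y": [1.0, 2.0]}))

x_path, x_labels = resolve_dim(j, "x")
z_path, z_labels = resolve_dim(j, "z")
```

Here `x` resolves in the group `h/j` itself, while `z` is only found after stepping up to `h`. A dimension that is defined nowhere on the way to the root raises `KeyError`.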
1044853795 | https://github.com/pydata/xarray/issues/4118#issuecomment-1044853795 | https://api.github.com/repos/pydata/xarray/issues/4118 | IC_kwDOAMm_X84-RzQj | OriolAbril 23738400 | 2022-02-18T17:06:57Z | 2022-02-18T17:06:57Z | CONTRIBUTOR | I am not sure I completely understand option 2, but option 1 seems a better fit to what we are doing at ArviZ (so far we are managing quite well with the InferenceData mentioned above which is a collection of independent xarray datasets). In our case, well defined selection for multiple variables at the same time (i.e. at the dataset level) is very useful. I was also wondering what changes (if any) would each option imply when using |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
1043638105 | https://github.com/pydata/xarray/issues/4118#issuecomment-1043638105 | https://api.github.com/repos/pydata/xarray/issues/4118 | IC_kwDOAMm_X84-NKdZ | TomNicholas 35968931 | 2022-02-17T23:47:44Z | 2022-02-17T23:47:44Z | MEMBER |
@alexamici can you expand on the role of the CF conventions in this statement? Are you talking about CF conventions allowing one variable in one group to refer to dimension present in another group, or something else? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
1042769595 | https://github.com/pydata/xarray/issues/4118#issuecomment-1042769595 | https://api.github.com/repos/pydata/xarray/issues/4118 | IC_kwDOAMm_X84-J2a7 | kmuehlbauer 5821660 | 2022-02-17T09:58:18Z | 2022-02-17T09:58:18Z | MEMBER |
Thanks for clarifying. I'm wondering if that can be a source of misunderstanding. How should the user differentiate that? I mean finally those dimensions which have the same name |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
1042753800 | https://github.com/pydata/xarray/issues/4118#issuecomment-1042753800 | https://api.github.com/repos/pydata/xarray/issues/4118 | IC_kwDOAMm_X84-JykI | alexamici 226037 | 2022-02-17T09:41:29Z | 2022-02-17T09:53:55Z | MEMBER | @kmuehlbauer in the representation I use the fully qualified name for the dimension / coordinate, but the corresponding |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
1042731962 | https://github.com/pydata/xarray/issues/4118#issuecomment-1042731962 | https://api.github.com/repos/pydata/xarray/issues/4118 | IC_kwDOAMm_X84-JtO6 | kmuehlbauer 5821660 | 2022-02-17T09:17:55Z | 2022-02-17T09:17:55Z | MEMBER | @alexamici
I'm having difficulty understanding your above point with respect to the scoping rules from the above CF document. Shouldn't it be impossible to create two arrays (in the same group) having dimensions with exactly the same name from different groups? Looking at the example here https://github.com/alexamici/xarray-datagroup there are coordinates with name "/lat" vs "lat". Aren't those two different names? Maybe I'm missing something essential here. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
1042656377 | https://github.com/pydata/xarray/issues/4118#issuecomment-1042656377 | https://api.github.com/repos/pydata/xarray/issues/4118 | IC_kwDOAMm_X84-Jax5 | alexamici 226037 | 2022-02-17T07:39:15Z | 2022-02-17T08:17:51Z | MEMBER | @TomNicholas (cc @mraspaud)
The two main classes of on-disk formats that I know of which cannot always be represented in the "group is a Dataset" approach are:
- in netCDF following the CF conventions for groups, it is legal for an array to refer to a dimension or a coordinate in a different group, and so arrays in the same group may have dimensions with the same name but different size / coordinate values (this was the original motivation to explore the DataGroup approach)
- the current spec for the Next-generation file formats (NGFF) for bio-imaging has all scales of the same 5D data in the same group. (cc @joshmoore)

I don't have an example at hand, but my impression is that satellite products that use the HDF5 file format also place arrays with inconsistent dimensions / coordinates in the same group. |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
1042664227 | https://github.com/pydata/xarray/issues/4118#issuecomment-1042664227 | https://api.github.com/repos/pydata/xarray/issues/4118 | IC_kwDOAMm_X84-Jcsj | alexamici 226037 | 2022-02-17T07:52:17Z | 2022-02-17T07:53:13Z | MEMBER | @TomNicholas I also have a few comments on the comparison:
This is only true for flat netCDF files, once you introduce groups in a netCDF AND accept CF conventions the DataGroup approach can map 100% of the files, while the DataTree approach fails on a (admittedly small) class of them.
Both points are only true for the DataArray in a single group, once you broadcast any operation to subgroups the two implementations would share the same limitations (dimensions in subgroups can be inconsistent in both cases). In my opinion the advantage for the DataTree is minimal.
The two approaches are identical in this respect: group attributes are mapped in the same way to DataTree and DataGroup. I share your views on all other points. |
{ "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 1, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
1042660100 | https://github.com/pydata/xarray/issues/4118#issuecomment-1042660100 | https://api.github.com/repos/pydata/xarray/issues/4118 | IC_kwDOAMm_X84-JbsE | shoyer 1217238 | 2022-02-17T07:45:24Z | 2022-02-17T07:45:24Z | MEMBER | One thing that came up in our discussion about this in the developer meeting today is that we could also pretty easily expose a "low level" API for IO using dictionaries of xarray.Variable objects. This intermediate representation could be useful for cleaning up data into a form suitable for conversion into Dataset objects. On Wed, Feb 16, 2022 at 11:39 PM Alessandro Amici @.***> wrote:
|
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
1039572760 | https://github.com/pydata/xarray/issues/4118#issuecomment-1039572760 | https://api.github.com/repos/pydata/xarray/issues/4118 | IC_kwDOAMm_X8499p8Y | TomNicholas 35968931 | 2022-02-14T21:19:56Z | 2022-02-14T21:40:21Z | MEMBER | We would like some opinions from the community on two different possible models for a tree-like structure in xarray. A tree contains many groups, but the question is what constraints should be imposed on the contents of those groups.
This is by no means the only question, and we have various choices to make within these options. The questions for the potential users here are:
- Do you have use cases which one of these designs could handle but the other couldn't?
- How important to you is being able to support all valid files of these certain formats?
- Which of these designs is clearer/more intuitive/more appealing to you?

(@alexamici , @shoyer, @jhamman, @aurghs please edit this comment to add anything I've missed) |
{ "total_count": 2, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 2, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
905472692 | https://github.com/pydata/xarray/issues/4118#issuecomment-905472692 | https://api.github.com/repos/pydata/xarray/issues/4118 | IC_kwDOAMm_X841-Gq0 | TomNicholas 35968931 | 2021-08-25T12:50:04Z | 2021-08-25T13:02:10Z | MEMBER | Thanks @benbovy !
I don't know much about HTML, but graphs where you can mouseover nodes to see node information sound awesome!
They aren't separate: The idea was that creating a single node of a tree by specifying only its We could just merge the two signatures into one
They were originally separate (I had
Good to know that other nested structures took a similar approach. I think that as we want to be able to save and load any subtree even after changing parents etc. then we ideally don't want to treat any one node as special. |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
905427176 | https://github.com/pydata/xarray/issues/4118#issuecomment-905427176 | https://api.github.com/repos/pydata/xarray/issues/4118 | IC_kwDOAMm_X84197jo | benbovy 4160723 | 2021-08-25T11:47:10Z | 2021-08-25T11:47:10Z | MEMBER | Great work @TomNicholas! For rich/html reprs, I think that we could take much inspiration from some of the dask reprs shown in this blog post. I haven't looked at your repository in detail yet, but I have one general question about the design: what is the rationale of having two separate classes |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
904987705 | https://github.com/pydata/xarray/issues/4118#issuecomment-904987705 | https://api.github.com/repos/pydata/xarray/issues/4118 | IC_kwDOAMm_X8418QQ5 | TomNicholas 35968931 | 2021-08-24T21:25:17Z | 2021-08-24T21:25:37Z | MEMBER | Thanks @jhamman - expect things to break as I keep realizing certain methods have to be defined differently from in Dataset for things to work. Help with 3 would be especially appreciated, as at the moment whilst I can open and alter a file with groups, I can't save my resulting tree :sweat_smile: |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
904970588 | https://github.com/pydata/xarray/issues/4118#issuecomment-904970588 | https://api.github.com/repos/pydata/xarray/issues/4118 | IC_kwDOAMm_X8418MFc | jhamman 2443309 | 2021-08-24T21:00:33Z | 2021-08-24T21:00:33Z | MEMBER | Thanks @TomNicholas! I've just been starting to look into this. I'm going to give it a spin and would be happy to help with your numbers 3 and 4. |
{ "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 1, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
901954045 | https://github.com/pydata/xarray/issues/4118#issuecomment-901954045 | https://api.github.com/repos/pydata/xarray/issues/4118 | IC_kwDOAMm_X841wrn9 | TomNicholas 35968931 | 2021-08-19T14:16:45Z | 2021-08-19T14:16:45Z | MEMBER | Oh excellent, thanks for the clarification Stephan! On Thu, 19 Aug 2021, 00:23 Stephan Hoyer, @.***> wrote:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
901598698 | https://github.com/pydata/xarray/issues/4118#issuecomment-901598698 | https://api.github.com/repos/pydata/xarray/issues/4118 | IC_kwDOAMm_X841vU3q | shoyer 1217238 | 2021-08-19T04:23:15Z | 2021-08-19T04:23:15Z | MEMBER |
NetCDF does not allow variables and groups with the same name, e.g.,
```python
import netCDF4

# Create a new file, add a variable, then try to add a group with the same name.
nc = netCDF4.Dataset('testing.nc', 'w')
nc.createVariable('foo', float)
nc.createGroup('foo')
# RuntimeError: NetCDF: String match to name in use
```
I'm pretty sure this is also prohibited for all HDF5 files, just like how you can't have a directory and file with the same name on most filesystems. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
901594249 | https://github.com/pydata/xarray/issues/4118#issuecomment-901594249 | https://api.github.com/repos/pydata/xarray/issues/4118 | IC_kwDOAMm_X841vTyJ | TomNicholas 35968931 | 2021-08-19T04:10:30Z | 2021-08-19T04:10:30Z | MEMBER | I think that xarray's current use of both dict-like access and attribute-like access for variables makes representing a general netCDF file in a single Consider a tree with a node structure for a hypothetical
We ideally want to be able to seamlessly access both subtrees and individual variables via chains of keys, e.g.
This particular example is fine, and would correspond to a netCDF file with groups "root", "root/weather", and "root/weather/temperature", plus the four stored DataArray variables. However, if one of the variables has the same name as one of the groups (which I think is permitted in the netCDF format), then there is no easy way to access all the elements whilst retaining the nice syntax. For example consider
Now we have a key collision between the group named "B" and the DataArray named "B", i.e. We can't just forbid this type of tree because then there would be netCDF files that we couldn't represent as a We can't use different types of access (e.g. (We could divide access through The only way I can see around this is to hide a node's data variables behind a It sounds like @emilbiju avoided this by not satisfying
so I'm wondering if anyone else has other suggestions or thoughts? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
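The key collision discussed in the comment above can be reproduced with a toy node that serves both child groups and data variables from a single `__getitem__`. `ToyNode` and its attributes are invented for this sketch and are not the DataTree API; plain lists stand in for DataArrays, and the group and variable names are illustrative only.

```python
class ToyNode:
    """Toy group node: child groups and variables share one key namespace."""

    def __init__(self, name):
        self.name = name
        self.children = {}   # group name -> ToyNode
        self.variables = {}  # variable name -> data (a list stands in for a DataArray)

    def add_child(self, name):
        if name in self.variables:
            raise KeyError(f"{name!r} is already a variable in group {self.name!r}")
        self.children[name] = ToyNode(name)
        return self.children[name]

    def add_variable(self, name, data):
        if name in self.children:
            raise KeyError(f"{name!r} is already a child group of {self.name!r}")
        self.variables[name] = data

    def __getitem__(self, key):
        # Unambiguous only because the add_* methods reject duplicate names.
        if key in self.children:
            return self.children[key]
        return self.variables[key]


root = ToyNode("root")
weather = root.add_child("weather")
weather.add_variable("temperature", [280.0, 281.5])

# Chained key access reaches both subtrees and variables.
temps = root["weather"]["temperature"]

# A variable named like an existing group would make root["weather"]
# ambiguous, so the toy node refuses to store it.
try:
    root.add_variable("weather", [1, 2, 3])
    collided = False
except KeyError:
    collided = True
```

Rejecting the duplicate name is only one possible policy, chosen here to keep the sketch short. It matches the later observation in this thread that netCDF itself refuses a variable and a group with the same name.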
876397215 | https://github.com/pydata/xarray/issues/4118#issuecomment-876397215 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDg3NjM5NzIxNQ== | martinitus 7611856 | 2021-07-08T12:27:58Z | 2021-07-08T12:27:58Z | NONE | As a user who (so far) does not use any netCDF or HDF5 features of xarray I obviously would not like to have an otherwise potentially useful feature blocked by restrictions imposed by netCDF or HDF5 ;-). That said - I think @tacaswell's comment about round trips is very reasonable and such invariants should be maintained! It would be extremely confusing for users if netcdf -> xarray -> netcdf is not a "no-op". The same obviously holds true for any other storage format. As a user I would generally expect something like the following:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
875121115 | https://github.com/pydata/xarray/issues/4118#issuecomment-875121115 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDg3NTEyMTExNQ== | tacaswell 199813 | 2021-07-06T22:21:02Z | 2021-07-06T22:21:02Z | NONE |
hdf5 allows for internal links, so datasets and groups can appear in multiple places in the tree. You can even make cycles where groups are in themselves (or their children). The NeXuS format (the xray/neutron one) makes heavy use of this to let data appear both where it "makes sense" from a science point of view and from an instrumentation point of view. I think it is reasonable to expect that netcdf -> xarray -> netcdf always works; however, I think it is unreasonable to ask that xarray -> netcdf -> xarray will always work. I think it is OK if xarray can express more complex relationships and structures than you can in netcdf (or hdf5 or any existing at-rest format). In an extreme case, consider an interface to a database that returns xarrays 😈 . |
{ "total_count": 4, "+1": 4, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
873492892 | https://github.com/pydata/xarray/issues/4118#issuecomment-873492892 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDg3MzQ5Mjg5Mg== | TomNicholas 35968931 | 2021-07-04T00:51:19Z | 2021-07-04T00:51:19Z | MEMBER | Some other thoughts about tags: 1) Does the definition of tags include variable names of DataArrays? I think it should. 2) As @martinitus mentioned, a 3) Selecting via tags would need to allow a distinction between "get me all leaves with these exact tags" and "get me all leaves whose tags include these ones". Maybe 4) The latter type of tag-based access would make plotting different leaves against one another easier too - given a multi-resolution (or multi-model) datatree like this:
then assuming that the definition of tags included the DataArray variable names, then
would select all leaves with a temperature tag, check that the temperature DataArrays had the same dimensions (but no need for any 5) With a tag-based system you can create cycles of tags, like A&B, B&C, C&A, which you can't really do with hierarchical trees. I don't think that actually causes any problems though... |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
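Point 3 in the comment above distinguishes two kinds of tag-based selection. A minimal sketch of that distinction, assuming leaves are stored in a dict keyed by `frozenset` tags (the `leaves` dict, the tag names, and both function names are hypothetical, chosen only for illustration):

```python
# Each leaf is keyed by its full set of tags; strings stand in for DataArrays.
leaves = {
    frozenset({"coarse", "temperature"}): "coarse temperature data",
    frozenset({"fine", "temperature"}): "fine temperature data",
    frozenset({"fine", "pressure"}): "fine pressure data",
}


def select_exact(leaves, *tags):
    """Return the leaves whose tag set equals `tags` exactly."""
    wanted = frozenset(tags)
    return [data for key, data in leaves.items() if key == wanted]


def select_including(leaves, *tags):
    """Return the leaves whose tag set contains every tag in `tags`."""
    wanted = frozenset(tags)
    return [data for key, data in leaves.items() if wanted <= key]


exact = select_exact(leaves, "fine", "temperature")
all_temperature = select_including(leaves, "temperature")
```

With this layout, `select_exact(leaves, "temperature")` returns nothing because no leaf carries only that tag, whereas `select_including` returns both temperature leaves regardless of resolution.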
873316602 | https://github.com/pydata/xarray/issues/4118#issuecomment-873316602 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDg3MzMxNjYwMg== | shoyer 1217238 | 2021-07-03T00:40:55Z | 2021-07-03T00:40:55Z | MEMBER |
That sounds right to me -- a downside of tags is that they can't be (uniquely) expressed in a hierarchical arrangement like those found in HDF5/netCDF4 files. But if this is a better way to organize data in memory, we could consider how to make an adapter layer for on disk storage. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
873307873 | https://github.com/pydata/xarray/issues/4118#issuecomment-873307873 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDg3MzMwNzg3Mw== | TomNicholas 35968931 | 2021-07-02T23:54:09Z | 2021-07-02T23:54:09Z | MEMBER | @shoyer if you used tags wouldn't you lose the ability to round-trip a netCDF file with groups? When you read in the groups from the file you would be throwing information away by going from a hierarchy A/B to simply tags A&B, and there wouldn't be a way to restore that before calling |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
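The information loss raised in the comment above can be shown in a few lines. This is a sketch under the assumption that a group path is reduced to an unordered set of tags; `path_to_tags` is a hypothetical helper, not an xarray function.

```python
def path_to_tags(path):
    """Collapse a group path such as 'A/B' into an unordered set of tags."""
    return frozenset(part for part in path.split("/") if part)


# Two different hierarchies...
first = path_to_tags("A/B")
second = path_to_tags("B/A")

# ...become indistinguishable once only the tags are kept, so the original
# nesting cannot be reconstructed when writing back to disk.
same_tags = first == second
```

Any round trip through such a mapping would need extra metadata (for example, the original path stored alongside the tags) to restore the hierarchy.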
873231425 | https://github.com/pydata/xarray/issues/4118#issuecomment-873231425 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDg3MzIzMTQyNQ== | TomNicholas 35968931 | 2021-07-02T20:05:06Z | 2021-07-02T20:05:06Z | MEMBER |
That is interesting. I think there is an argument for using a hierarchical model to map onto the full netCDF data model with groups, but perhaps methods to select elements via tags could be included too, for the best of both? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
873227326 | https://github.com/pydata/xarray/issues/4118#issuecomment-873227326 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDg3MzIyNzMyNg== | shoyer 1217238 | 2021-07-02T19:55:31Z | 2021-07-02T19:55:31Z | MEMBER | @martinitus raises a really interesting point about tags vs hierarchical structures over in https://github.com/pydata/xarray/issues/1092#issuecomment-868324949
I think using tags is a really interesting alternative to hierarchies. I don't have a clear sense of the overall tradeoffs, though. |
{ "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
873179375 | https://github.com/pydata/xarray/issues/4118#issuecomment-873179375 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDg3MzE3OTM3NQ== | TomNicholas 35968931 | 2021-07-02T18:22:49Z | 2021-07-02T18:22:49Z | MEMBER | Flagging another possible use case, this time in Magnetic Confinement Fusion: representing the IMAS data model. IMAS is currently closed-source (being part of the ITER project), but there is a big push to make it open-source and the standard data model for tokamak plasma data. I'm not very familiar with IMAS (@smithsp and @orso82 are more so), but it is hierarchical. There is some more information in appendix A3 of this paper, which talks about "taking advantage of the homogeneity of grid sizes that is commonly found across arrays of structures", which sounds very closely related to the This might allow the |
{ "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
846137752 | https://github.com/pydata/xarray/issues/4118#issuecomment-846137752 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDg0NjEzNzc1Mg== | dcherian 2448579 | 2021-05-21T17:58:38Z | 2021-05-21T17:58:38Z | MEMBER | cc @d-v-b and https://github.com/JaneliaSciComp/xarray-multiscale |
{ "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 1, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
845845472 | https://github.com/pydata/xarray/issues/4118#issuecomment-845845472 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDg0NTg0NTQ3Mg== | nbercher 6772352 | 2021-05-21T10:15:13Z | 2021-05-21T10:17:00Z | NONE | A simple comment/question: In xarray.Dataset, why not just use Unix-path notation in a "flat" dict model? Actually, netCDF4 implements this Unix-like path access to groups and variables: All of the hierarchical stuff (e.g., getting a sub-Dataset from a random group) and conventions (e.g., the dimension scoping rule) would then be driven by the parsing of strings only. It's all about symbolic names (like in a file system, right?) and there would no longer be any hierarchical data in memory. My question is then: Are there tricky points that keep xarray.Dataset from going this simple way? Some related remarks:
- About the attribute access to variables: I don't really know why this exists at all, since it mixes unrelated namespaces: (1) the class internals and (2) the user's variables. Mixing namespaces seems very bad to me: it makes some variable names forbidden in order to avoid collisions between the two namespaces, and it usually implies unnecessarily complex code with corner cases to deal with.
- About netCDF4 being a self-described format: xarray API has |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
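A rough sketch of the "flat dict with Unix-style paths" idea from the comment above, using plain Python only. The function names `select_group` and `list_groups` and the sample keys are hypothetical; this is not how xarray or netCDF4 implement group access, only an illustration of deriving hierarchy from string parsing.

```python
import posixpath

# Flat store: every variable is keyed by its absolute Unix-style path.
flat = {
    "/time": [0, 1, 2],
    "/model/temperature": [280.0, 281.5, 279.9],
    "/model/ocean/salinity": [35.1, 35.0, 34.8],
}


def select_group(flat_store, group):
    """Return the variables under `group`, re-keyed relative to that group."""
    prefix = "/" + group.strip("/") + "/"
    if prefix == "//":  # the root group was requested
        prefix = "/"
    return {
        "/" + key[len(prefix):]: value
        for key, value in flat_store.items()
        if key.startswith(prefix)
    }


def list_groups(flat_store):
    """Recover every group path implied by the variable keys."""
    groups = set()
    for key in flat_store:
        parent = posixpath.dirname(key)
        while parent != "/":
            groups.add(parent)
            parent = posixpath.dirname(parent)
    return sorted(groups)


model = select_group(flat, "/model")
groups = list_groups(flat)
```

In this sketch the hierarchy exists only in the key strings, so conventions such as dimension scoping would have to be implemented by walking parent paths, which is part of what the question asks about.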
833574864 | https://github.com/pydata/xarray/issues/4118#issuecomment-833574864 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDgzMzU3NDg2NA== | joshmoore 88113 | 2021-05-06T14:36:37Z | 2021-05-06T14:36:37Z | NONE | Picking up on @dcherian's https://github.com/pydata/xarray/issues/4118#issuecomment-806954634 and @rabernat's https://github.com/ome/ngff/issues/48#issuecomment-833456889, Zarr was also accepted to the second round and certainly references this issue in case we want to sync up. (Apologies if I missed where that discussion moved.) |
{ "total_count": 2, "+1": 0, "-1": 0, "laugh": 0, "hooray": 2, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
833535376 | https://github.com/pydata/xarray/issues/4118#issuecomment-833535376 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDgzMzUzNTM3Ng== | thewtex 25432 | 2021-05-06T13:45:16Z | 2021-05-06T13:45:16Z | CONTRIBUTOR | For scientific imaging, i.e. biomicroscopy, medical imaging, where xarray compatibility is being considered in the NGFF, it would be helpful to avoid unnecessary divergence by ensuring the proposed hierarchical storage is compatible. This would mean:
|
{ "total_count": 3, "+1": 3, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
808694777 | https://github.com/pydata/xarray/issues/4118#issuecomment-808694777 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDgwODY5NDc3Nw== | StanczakDominik 11289391 | 2021-03-27T08:55:26Z | 2021-03-27T08:55:26Z | CONTRIBUTOR | Whoa, that sounds awesome! Thanks for the heads up :) Definitely could be quite handy, looking forward to seeing how this develops. @rocco8773 this should be interesting for you as well :) |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
808366093 | https://github.com/pydata/xarray/issues/4118#issuecomment-808366093 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDgwODM2NjA5Mw== | TomNicholas 35968931 | 2021-03-26T16:47:53Z | 2021-03-26T16:47:53Z | MEMBER | This sounds like an interesting project - I'm also about to be able to work on xarray much more directly (thanks @rabernat ). Should I add this as another xarray project board alongside explicit indexes and so on? I wonder if this could find another domain use case in plasmapy as part of the overall |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
808057690 | https://github.com/pydata/xarray/issues/4118#issuecomment-808057690 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDgwODA1NzY5MA== | aurghs 35919497 | 2021-03-26T09:09:38Z | 2021-03-26T09:09:38Z | COLLABORATOR | We could also provide a use case in remote sensing: it would be really useful in interferometric processing for managing Sentinel-1 IW and EW SLC data, which has multiple tiles (bursts) partially overlapping in one direction (azimuth). |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
807908489 | https://github.com/pydata/xarray/issues/4118#issuecomment-807908489 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDgwNzkwODQ4OQ== | shoyer 1217238 | 2021-03-26T03:24:48Z | 2021-03-26T03:24:48Z | MEMBER | I'm excited to see this coming together! I would be happy to advise as well... Side note: at some point, this would probably be worth adding to Xarray's official roadmap. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
807892921 | https://github.com/pydata/xarray/issues/4118#issuecomment-807892921 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDgwNzg5MjkyMQ== | OriolAbril 23738400 | 2021-03-26T02:39:24Z | 2021-03-26T02:39:24Z | CONTRIBUTOR | Here are some biomedical papers that are using ArviZ and therefore xarray, even if most don't cite xarray and some don't cite ArviZ either. Topics are quite diverse: covid, psychology, biomolecules, oncology... Some recent ArviZ biomedical citations:
* Arroyuelo, A., Vila, J., & Martin, O. A. (2020). Exploring the quality of protein structural models from a Bayesian perspective. bioRxiv.
* Axen, S. D. (2020). Representing Ensembles of Molecules (Doctoral dissertation, UCSF).
* Brauner, J. M., Mindermann, S., Sharma, M., Johnston, D., Salvatier, J., Gavenčiak, T., ... & Kulveit, J. (2021). Inferring the effectiveness of government interventions against COVID-19. Science, 371(6531).
* Busch-Moreno, S., Tuomainen, J., & Vinson, D. (2020). Trait Anxiety Effects on Late Phase Threatening Speech Processing: Evidence from EEG. bioRxiv.
* Busch-Moreno, S., Tuomainen, J., & Vinson, D. (2021). Semantic and prosodic threat processing in trait anxiety: is repetitive thinking influencing responses?. Cognition and Emotion, 35(1), 50-70.
* Dehning, J., Zierenberg, J., Spitzner, F. P., Wibral, M., Neto, J. P., Wilczek, M., & Priesemann, V. (2020). Inferring change points in the spread of COVID-19 reveals the effectiveness of interventions. Science, 369(6500).
* Heilbron, E., Martìn, O., & Fumagalli, E. (2020). Efectos protectores de los alimentos andinos contra el daño producido por el alcohol a nivel del epitelio intestinal, una aproximación estadística. Ciencia, Docencia y Tecnología, 31(61 nov-mar).
* Legrand, N., Nikolova, N., Correa, C., Brændholt, M., Stuckert, A., Kildahl, N., ... & Allen, M. (2021). The heart rate discrimination task: a psychophysical method to estimate the accuracy and precision of interoceptive beliefs. bioRxiv.
* Wang, Y. (2020, September). Data Analysis of Psychological Measurement of Intelligent Internet-assisted Sports Training based on Bio-Sensors. In 2020 International Conference on Smart Electronics and Communication (ICOSEC) (pp. 474-477). IEEE.
* Wasserman, A., Shrager, J., & Shapiro, M. A Multilevel Bayesian Model for Precision Oncology.
* Weindel, G., Anders, R., Alario, F. X., & Burle, B. (2020). Assessing model-based inferences in decision making with single-trial response time decomposition. Journal of Experimental Psychology: General.
* Yamagata, Y. (2020). Simultaneous estimation of the effective reproducing number and the detection rate of COVID-19. arXiv e-prints, arXiv-2005. |
{ "total_count": 3, "+1": 3, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
806954634 | https://github.com/pydata/xarray/issues/4118#issuecomment-806954634 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDgwNjk1NDYzNA== | dcherian 2448579 | 2021-03-25T15:30:53Z | 2021-03-25T15:30:53Z | MEMBER | I can shoulder part of the load and help is definitely needed. LOI is due on Tuesday. I'll take a stab this evening and post a link. |
{ "total_count": 4, "+1": 4, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
806777363 | https://github.com/pydata/xarray/issues/4118#issuecomment-806777363 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDgwNjc3NzM2Mw== | danielballan 2279598 | 2021-03-25T13:48:14Z | 2021-03-25T13:48:14Z | CONTRIBUTOR | I volunteer to contribute writing to this from the condensed matter / synchrotron user facility perspective. |
{ "total_count": 3, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 3, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
806701802 | https://github.com/pydata/xarray/issues/4118#issuecomment-806701802 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDgwNjcwMTgwMg== | rabernat 1197350 | 2021-03-25T13:01:56Z | 2021-03-25T13:05:03Z | MEMBER | So we have:
- Numerous promising prototypes to draw from
- A technical team who can write the proposal and execute the proposed work (@aurghs & @alexamici of B-open)
- Numerous supporting use cases from the bioimaging (@joshmoore), condensed matter (@tacaswell), and Bayesian modeling (ArviZ; @OriolAbril) domains
We are just missing a PI, someone who is willing to put their name on top of the proposal and click submit. I have gone on record as committed to not leading any new proposals this year. And in any case, this is a good opportunity for someone else from the @pydata/xarray core dev team to try on a leadership role. |
{ "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
806494079 | https://github.com/pydata/xarray/issues/4118#issuecomment-806494079 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDgwNjQ5NDA3OQ== | joshmoore 88113 | 2021-03-25T09:21:47Z | 2021-03-25T09:21:47Z | NONE | Happy to provide assistance on the image pyramid (i.e. "multiscale") use case. |
{ "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
806403993 | https://github.com/pydata/xarray/issues/4118#issuecomment-806403993 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDgwNjQwMzk5Mw== | aurghs 35919497 | 2021-03-25T06:41:09Z | 2021-03-25T06:42:58Z | COLLABORATOR | @alexamici and I can write the technical part of the proposal. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
804676315 | https://github.com/pydata/xarray/issues/4118#issuecomment-804676315 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDgwNDY3NjMxNQ== | OriolAbril 23738400 | 2021-03-23T07:16:28Z | 2021-03-23T07:16:28Z | CONTRIBUTOR | Not really sure if there is anything we can do from ArviZ to help with that; if there is, let us know and we'll do our best. cc @percygautam |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
802863863 | https://github.com/pydata/xarray/issues/4118#issuecomment-802863863 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDgwMjg2Mzg2Mw== | tacaswell 199813 | 2021-03-19T14:14:13Z | 2021-03-19T14:14:13Z | NONE | This is related to some very recent work we have been doing at NSLS-II, primarily led by @danielballan. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
801785278 | https://github.com/pydata/xarray/issues/4118#issuecomment-801785278 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDgwMTc4NTI3OA== | benbovy 4160723 | 2021-03-18T09:54:42Z | 2021-03-18T09:54:42Z | MEMBER | FWIW, a while ago I wrote a mock-up (and probably outdated) https://gist.github.com/benbovy/92e7c76220af1aaa4b3a0b65374e233a (nbviewer link) |
{ "total_count": 3, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 3, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
801257666 | https://github.com/pydata/xarray/issues/4118#issuecomment-801257666 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDgwMTI1NzY2Ng== | dcherian 2448579 | 2021-03-17T17:10:41Z | 2021-03-17T17:10:41Z | MEMBER |
No. @emilbiju are you interested in open-sourcing your work? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
801240559 | https://github.com/pydata/xarray/issues/4118#issuecomment-801240559 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDgwMTI0MDU1OQ== | rabernat 1197350 | 2021-03-17T16:47:20Z | 2021-03-17T16:47:20Z | MEMBER | On today's Xarray dev call, we discussed pursuing another CZI grant to support this feature in Xarray. The image pyramid use case would provide a strong link to the bioimaging community. @alexamici and the B-open folks seem enthusiastic. I had to leave the meeting early, so I didn't hear the end of the conversation. But did we decide who might serve as PI for such a proposal? |
{ "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 1, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
776812965 | https://github.com/pydata/xarray/issues/4118#issuecomment-776812965 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDc3NjgxMjk2NQ== | thewtex 25432 | 2021-02-10T15:58:30Z | 2021-02-10T15:58:30Z | CONTRIBUTOR | @jhamman @joshmoore a prototype to bring together XArray and OME-Zarr/NGFF with multiple groups: https://github.com/OpenImaging/miqa/blob/master/server/scripts/compress_encode.py |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
756204582 | https://github.com/pydata/xarray/issues/4118#issuecomment-756204582 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDc1NjIwNDU4Mg== | joshmoore 88113 | 2021-01-07T15:57:03Z | 2021-01-07T15:57:03Z | NONE | Thanks for the link, @jhamman. The most immediate issue I ran into when trying to use xarray with OME-Zarr data does seem similar. A rough representation of one multiscale image is:
but of course the x, y and z dimensions are of different sizes in each volume. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
756012443 | https://github.com/pydata/xarray/issues/4118#issuecomment-756012443 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDc1NjAxMjQ0Mw== | davidbrochart 4711805 | 2021-01-07T09:56:34Z | 2021-01-07T09:56:34Z | CONTRIBUTOR |
Just a note that this link has moved to: https://arviz-devs.github.io/arviz/getting_started/XarrayforArviZ.html |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
755465523 | https://github.com/pydata/xarray/issues/4118#issuecomment-755465523 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDc1NTQ2NTUyMw== | jhamman 2443309 | 2021-01-06T18:08:19Z | 2021-01-06T18:08:19Z | MEMBER | @joshmoore - based on https://github.com/pangeo-forge/pangeo-forge/pull/27#issuecomment-755397835, you may be interested in this issue. One way to do multiscale datasets in Xarray would be to use hierarchical groups (one group per scale). |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
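To illustrate why "one group per scale" fits multiscale data, here is a small pure-Python sketch. It assumes each pyramid level downsamples every axis by a fixed factor; the group names (`scale0`, `scale1`, ...), the base shape, and both helper functions are hypothetical and are not taken from xarray, Zarr, or OME-NGFF.

```python
def pyramid_shapes(base_shape, n_levels, factor=2):
    """Map a hypothetical group name per scale to that level's array shape."""
    shapes = {}
    shape = tuple(base_shape)
    for level in range(n_levels):
        shapes[f"scale{level}"] = shape
        # Downsample every axis by `factor`, never going below size 1.
        shape = tuple(max(1, size // factor) for size in shape)
    return shapes


def conflicting_dims(shapes, dim_names):
    """List dimension names that would need more than one size in a single container."""
    sizes = {}
    for shape in shapes.values():
        for name, size in zip(dim_names, shape):
            sizes.setdefault(name, set()).add(size)
    return sorted(name for name, seen in sizes.items() if len(seen) > 1)


levels = pyramid_shapes((64, 512, 512), n_levels=3)
clashes = conflicting_dims(levels, ("z", "y", "x"))
```

Because every named dimension takes a different size at each level, the levels cannot share one set of dimensions in a single container, whereas keeping each level in its own group gives each one its own `z`, `y`, and `x`.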
638481215 | https://github.com/pydata/xarray/issues/4118#issuecomment-638481215 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDYzODQ4MTIxNQ== | shoyer 1217238 | 2020-06-03T21:52:53Z | 2020-06-03T23:08:47Z | MEMBER | The data model you sketch out here looks very similar to what we discussed in #1092. I agree that the semantics are well defined. The main question in my mind is whether it would make more sense to make an entirely new data structure (e.g., Probably a new data structure would be easier at this point, because would keep |
{ "total_count": 5, "+1": 5, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
638478790 | https://github.com/pydata/xarray/issues/4118#issuecomment-638478790 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDYzODQ3ODc5MA== | shoyer 1217238 | 2020-06-03T21:46:48Z | 2020-06-03T21:46:48Z | MEMBER | I would be open to exploring adding a hierarchical data structure into xarray (on an experimental basis, to start), but it would need someone with serious interest and time to make it happen. Certainly there are plenty of use cases across various fields. |
{ "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
637660689 | https://github.com/pydata/xarray/issues/4118#issuecomment-637660689 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDYzNzY2MDY4OQ== | dcherian 2448579 | 2020-06-02T16:20:09Z | 2020-06-02T16:20:09Z | MEMBER | Thanks for writing this up @emilbiju . These are very interesting ideas
|
{ "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
637382925 | https://github.com/pydata/xarray/issues/4118#issuecomment-637382925 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDYzNzM4MjkyNQ== | emilbiju 39640592 | 2020-06-02T08:33:42Z | 2020-06-02T08:33:42Z | NONE | Thanks @jhamman for sharing the link. Here are my thoughts on the same: For use-cases similar to the one I have mentioned, I think it would be more meaningful to allow the tree structure (calling it Besides, xarray only allows attribute access for getting (and not setting) values, but a separate data structure can allow attribute access for setting values as well. For example, the data structure that I have implemented would allow something like I am currently using attribute-based access for accessing child nodes/data arrays in the Instead of using netCDF4 groups for encoding the Therefore, within the netCDF file, it would exist just as a Dataset. A specially implemented |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
637163506 | https://github.com/pydata/xarray/issues/4118#issuecomment-637163506 | https://api.github.com/repos/pydata/xarray/issues/4118 | MDEyOklzc3VlQ29tbWVudDYzNzE2MzUwNg== | jhamman 2443309 | 2020-06-01T22:42:10Z | 2020-06-01T22:42:10Z | MEMBER | @emilbiju - thanks for opening an issue here. You may want to take a look at the conversation in #1092. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 |
CREATE TABLE [issue_comments] ( [html_url] TEXT, [issue_url] TEXT, [id] INTEGER PRIMARY KEY, [node_id] TEXT, [user] INTEGER REFERENCES [users]([id]), [created_at] TEXT, [updated_at] TEXT, [author_association] TEXT, [body] TEXT, [reactions] TEXT, [performed_via_github_app] TEXT, [issue] INTEGER REFERENCES [issues]([id]) ); CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]); CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);