issue_comments
18 rows where issue = 341643235 sorted by updated_at descending
This data as json, CSV (advanced)
Suggested facets: reactions, created_at (date), updated_at (date)
issue 1
- Support non-string dimension/variable names · 18 ✖
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
979759002 | https://github.com/pydata/xarray/issues/2292#issuecomment-979759002 | https://api.github.com/repos/pydata/xarray/issues/2292 | IC_kwDOAMm_X846Ze-a | derhintze 25172489 | 2021-11-26T07:48:21Z | 2021-11-26T07:52:42Z | NONE | Would like to chime in that we use a similar approach as in the last comment of @DerWeh . But we extend this by overloading the ``` class Dimension(str, enum.Enum): """Base class for all dimension enums
``` Using this the xarray output is more consistent: ```
We then have deserialization code, that re-creates enum members when reading NetCDF files with corresponding dimensions (and coordinates). Access to coordinates works with enum members as well as their string value. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Support non-string dimension/variable names 341643235 | |
722565840 | https://github.com/pydata/xarray/issues/2292#issuecomment-722565840 | https://api.github.com/repos/pydata/xarray/issues/2292 | MDEyOklzc3VlQ29tbWVudDcyMjU2NTg0MA== | DerWeh 22542812 | 2020-11-05T18:41:24Z | 2020-11-05T18:41:24Z | NONE | I just came along this question as I tried something similar to @joshburkart. Using a string-enum instead, the code works in principle: ```python import enum import numpy as np import pandas as pd import xarray as xr class CoordId(str, enum.Enum): LAT = 'lat' LON = 'lon' pd.DataFrame({CoordId.LAT: [1,2,3]}).to_csv() Returns: ',CoordId.LAT\n0,1\n1,2\n2,3\n'xr.DataArray( data=np.arange(3 * 2).reshape(3, 2), coords={CoordId.LAT: [1, 2, 3], CoordId.LON: [7, 8]}, dims=[CoordId.LAT, CoordId.LON], ) output<xarray.DataArray (lat: 3, lon: 2)>array([[0, 1],[2, 3],[4, 5]])Coordinates:* lat (CoordId.LAT) int64 1 2 3* lon (CoordId.LON) int64 7 8``` We however got somewhat ambivalent results, that the dimensions are still enum elements |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Support non-string dimension/variable names 341643235 | |
496435570 | https://github.com/pydata/xarray/issues/2292#issuecomment-496435570 | https://api.github.com/repos/pydata/xarray/issues/2292 | MDEyOklzc3VlQ29tbWVudDQ5NjQzNTU3MA== | crusaderky 6213168 | 2019-05-28T09:18:52Z | 2019-05-28T09:18:52Z | MEMBER | @shoyer the biggest problem I see with your suggestion is that, for DataArrays, you'd likewise need to write |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Support non-string dimension/variable names 341643235 | |
496064041 | https://github.com/pydata/xarray/issues/2292#issuecomment-496064041 | https://api.github.com/repos/pydata/xarray/issues/2292 | MDEyOklzc3VlQ29tbWVudDQ5NjA2NDA0MQ== | shoyer 1217238 | 2019-05-27T03:15:53Z | 2019-05-27T03:15:53Z | MEMBER | From a typing perspective, what if I don't think we need to change the signature of xarray functions to support "hashable or sequence of hashable". It's OK if convenience features (like support for passing in only a single argument as a string) don't work in all cases. I agree that it would be a good idea to use a centralized helper function for this, though. It is unfortunate that there doesn't seem to be a good way to distinguish between "string" and "non-string sequence of strings" in Python's typing system. But I don't know a good way to solve this problem. Certainly the folks who work on typing in Python are aware of this. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Support non-string dimension/variable names 341643235 | |
490824656 | https://github.com/pydata/xarray/issues/2292#issuecomment-490824656 | https://api.github.com/repos/pydata/xarray/issues/2292 | MDEyOklzc3VlQ29tbWVudDQ5MDgyNDY1Ng== | gimperiale 47244312 | 2019-05-09T09:13:22Z | 2019-05-09T09:13:22Z | CONTRIBUTOR | A possible way out would be to open a PEP for "and" and "not" operators in the typing module. That way we could define a "variable-name-like" type and use it throughout the module: xarray.utils:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Support non-string dimension/variable names 341643235 | |
490821558 | https://github.com/pydata/xarray/issues/2292#issuecomment-490821558 | https://api.github.com/repos/pydata/xarray/issues/2292 | MDEyOklzc3VlQ29tbWVudDQ5MDgyMTU1OA== | gimperiale 47244312 | 2019-05-09T09:04:21Z | 2019-05-09T09:05:48Z | CONTRIBUTOR | There are problems with typing. I already mentioned them in #2929 but I'll summarize here. The vast majority of xarray functions/methods allow for "string or sequence of strings, optional". When you move to "hashable or sequence of hashables, optional", however, you want to specifically avoid tuples, which are both Sequence and Hashable instances. Most functions currently look like this:
One way to mitigate it would be to have an helper function, which would be invoked everywhere around the codebase, and then religiously make sure that the helper function is always used.
A completely separate problem with typing is that I expect a huge amount of xarray users to just assume variable names and dims are always strings. They'll have things like
The final problem is that integers are Hashables, and there's a wealth of cases in xarray where there is special logic that dynamically treats ints as positional indices. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Support non-string dimension/variable names 341643235 | |
410121888 | https://github.com/pydata/xarray/issues/2292#issuecomment-410121888 | https://api.github.com/repos/pydata/xarray/issues/2292 | MDEyOklzc3VlQ29tbWVudDQxMDEyMTg4OA== | shoyer 1217238 | 2018-08-03T02:04:33Z | 2018-08-03T02:04:33Z | MEMBER |
I should have said that required base classes don't feel very Pythonic. I'm not opposed to base classes in principle, and I'm definitely sympathetic to a desire to use static typing. See also https://github.com/pydata/xarray/issues/1900 for related discussion. One consideration is what the advantages are of using enums over "dummy enums" like:
You can still refer to these programmatically like I don't love the sound of names that deserialize to different types than their inputs. That seems very error prone, even if you do your best to overload all the special methods like What does seems like potentially a better idea to me would be a library with dedicated loader functions that "destringify" names by turning them back into enum objects. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Support non-string dimension/variable names 341643235 | |
410008899 | https://github.com/pydata/xarray/issues/2292#issuecomment-410008899 | https://api.github.com/repos/pydata/xarray/issues/2292 | MDEyOklzc3VlQ29tbWVudDQxMDAwODg5OQ== | ttung 280924 | 2018-08-02T17:38:52Z | 2018-08-02T17:38:52Z | CONTRIBUTOR | The problem with generic scalar types is that it wouldn't work after serialization and deserialization (which would presumably go to strings). My suggestion has the advantage of being able to create a However, I think (1)/(2) are both reasonable solution (in fact, they seem to be identical except for when you call If that's the direction you'd like to see the project go towards, I'd be happy to take a stab at it. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Support non-string dimension/variable names 341643235 | |
409829431 | https://github.com/pydata/xarray/issues/2292#issuecomment-409829431 | https://api.github.com/repos/pydata/xarray/issues/2292 | MDEyOklzc3VlQ29tbWVudDQwOTgyOTQzMQ== | shoyer 1217238 | 2018-08-02T07:12:17Z | 2018-08-02T07:12:17Z | MEMBER | Most of the places in the code where we do the That said, checking explicitly for strings wasn't take a careful API choice. I could see a case for replacing all these "as sequence" casts with something more generic, e.g., based on checking explicitly for more generic scalar types. Certainly enums should be scalars. If possible, I would rather stick to duck typing for any requirements we put on names. Base classes don't feel terribly Pythonic. |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Support non-string dimension/variable names 341643235 | |
409822333 | https://github.com/pydata/xarray/issues/2292#issuecomment-409822333 | https://api.github.com/repos/pydata/xarray/issues/2292 | MDEyOklzc3VlQ29tbWVudDQwOTgyMjMzMw== | ttung 280924 | 2018-08-02T06:38:32Z | 2018-08-02T06:38:32Z | CONTRIBUTOR | We're using xarray in a project that is encouraging use of python typing, and we too would like to use enums as data dimension names. How do you feel about using a base class that data dimension classes need to subclass? Here's a really simple proof-of-concept (though not very thorough, as it would certainly fail serialization): https://github.com/ttung/xarray/commit/8e623ebebc8f5c1e5615e6d07a82451c0dbe763d ``` In [1]: import xarray as xr In [2]: import numpy as np In [5]: from enum import Enum In [6]: class A(xr.core.dataarray.DimensionBase, Enum): ...: X = "abc" ...: Y = "def" ...: Z = "ghi" ...: In [7]: a = xr.DataArray(np.random.randint(0, 255, size=(4, 3, 5)), dims=[A.X, A.Y, A.Z]) In [8]: a[A.X] Out[8]: <xarray.DataArray \<A.X: 'abc'> (A.X: 4)> array([0, 1, 2, 3]) Dimensions without coordinates: A.X In [9]: a.max(A.X) Out[9]: <xarray.DataArray (A.Y: 3, A.Z: 5)> array([[254, 226, 181, 191, 233], [139, 195, 212, 167, 169], [191, 241, 199, 174, 208]]) Dimensions without coordinates: A.Y, A.Z In [10]: ``` |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Support non-string dimension/variable names 341643235 | |
406639686 | https://github.com/pydata/xarray/issues/2292#issuecomment-406639686 | https://api.github.com/repos/pydata/xarray/issues/2292 | MDEyOklzc3VlQ29tbWVudDQwNjYzOTY4Ng== | shoyer 1217238 | 2018-07-20T15:41:45Z | 2018-07-20T15:41:45Z | MEMBER | Oops, I was reading a little too quickly. I did indeed mean (1) above. The one thing I would emphasize is that we don't actually want to check for something like |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Support non-string dimension/variable names 341643235 | |
406371878 | https://github.com/pydata/xarray/issues/2292#issuecomment-406371878 | https://api.github.com/repos/pydata/xarray/issues/2292 | MDEyOklzc3VlQ29tbWVudDQwNjM3MTg3OA== | shoyer 1217238 | 2018-07-19T18:29:11Z | 2018-07-20T15:39:10Z | MEMBER | ~~(2)~~ (1) seems like a pretty decent option to me. It's compatible with Python's duck-typing philosophy, and we don't really need string variable/dimension names for anything other than various serialization formats like netCDF. So the full requirement for names would be "Hashable, can be coerced with str() and not None" (we use CC @pydata/xarray in case anyone else has opinions. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Support non-string dimension/variable names 341643235 | |
406617353 | https://github.com/pydata/xarray/issues/2292#issuecomment-406617353 | https://api.github.com/repos/pydata/xarray/issues/2292 | MDEyOklzc3VlQ29tbWVudDQwNjYxNzM1Mw== | joshburkart 3888181 | 2018-07-20T14:28:07Z | 2018-07-20T14:28:07Z | NONE | Just to clarify @shoyer, you said (2) sounds best to you, but your other comments (e.g. duck typing, requiring hashable) seem to describe (1)...? Slightly confused... |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Support non-string dimension/variable names 341643235 | |
406356945 | https://github.com/pydata/xarray/issues/2292#issuecomment-406356945 | https://api.github.com/repos/pydata/xarray/issues/2292 | MDEyOklzc3VlQ29tbWVudDQwNjM1Njk0NQ== | joshburkart 3888181 | 2018-07-19T17:37:34Z | 2018-07-19T17:37:34Z | NONE | Some options that come to mind:
I dunno. Whatever the maintainers think is best? (3) seems least complex on the |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Support non-string dimension/variable names 341643235 | |
406345790 | https://github.com/pydata/xarray/issues/2292#issuecomment-406345790 | https://api.github.com/repos/pydata/xarray/issues/2292 | MDEyOklzc3VlQ29tbWVudDQwNjM0NTc5MA== | shoyer 1217238 | 2018-07-19T16:58:41Z | 2018-07-19T16:58:41Z | MEMBER | Another choice would be to intentionally simplify xarray's data model and not allow anything other than strings for variable/dimension names. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Support non-string dimension/variable names 341643235 | |
405396751 | https://github.com/pydata/xarray/issues/2292#issuecomment-405396751 | https://api.github.com/repos/pydata/xarray/issues/2292 | MDEyOklzc3VlQ29tbWVudDQwNTM5Njc1MQ== | shoyer 1217238 | 2018-07-16T22:07:18Z | 2018-07-16T22:07:18Z | MEMBER | It would be an improvement even to clearly document the requirements for dimension/variable names. I suspect we don't actually need them to be sortable, though we do using sorting as part of the current repr() for some xarray objecs. This is mostly to ensure reproducible displays across multiple loads/runs of a file, but it's increasingly less relevant now than Python's dict is ordered by default (since Python 3.6). |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Support non-string dimension/variable names 341643235 | |
405380136 | https://github.com/pydata/xarray/issues/2292#issuecomment-405380136 | https://api.github.com/repos/pydata/xarray/issues/2292 | MDEyOklzc3VlQ29tbWVudDQwNTM4MDEzNg== | joshburkart 3888181 | 2018-07-16T20:59:13Z | 2018-07-16T20:59:13Z | NONE | Thanks @shoyer. I'll see if I can take a look in the near future... |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Support non-string dimension/variable names 341643235 | |
405371789 | https://github.com/pydata/xarray/issues/2292#issuecomment-405371789 | https://api.github.com/repos/pydata/xarray/issues/2292 | MDEyOklzc3VlQ29tbWVudDQwNTM3MTc4OQ== | shoyer 1217238 | 2018-07-16T20:30:16Z | 2018-07-16T20:30:16Z | MEMBER | Hi @joshburkart -- thanks for raising this concern. I agree, it would be nice to support enums (really any hashable value) as dimension names. Our current checks for strings are somewhat inconsistent, e.g., you can actually use these in an xarray.Dataset if you use the fully explicit API for constructing a dataset:
But now if you try to print the dataset, you get an error about sorting: ```python-traceback TypeError Traceback (most recent call last) /usr/local/lib/python3.6/dist-packages/IPython/core/formatters.py in call(self, obj) 697 type_pprinters=self.type_printers, 698 deferred_pprinters=self.deferred_printers) --> 699 printer.pretty(obj) 700 printer.flush() 701 return stream.getvalue() /usr/local/lib/python3.6/dist-packages/IPython/lib/pretty.py in pretty(self, obj) 396 if callable(meth): 397 return meth(obj, self, cycle) --> 398 return _default_pprint(obj, self, cycle) 399 finally: 400 self.end_group() /usr/local/lib/python3.6/dist-packages/IPython/lib/pretty.py in default_pprint(obj, p, cycle) 516 if _safe_getattr(klass, '__repr__', None) not in _baseclass_reprs: 517 # A user-provided repr. Find newlines and replace them with p.break() --> 518 _repr_pprint(obj, p, cycle) 519 return 520 p.begin_group(1, '<') /usr/local/lib/python3.6/dist-packages/IPython/lib/pretty.py in repr_pprint(obj, p, cycle) 707 """A pprint that just redirects to the normal repr function.""" 708 # Find newlines and replace them with p.break() --> 709 output = repr(obj) 710 for idx,output_line in enumerate(output.splitlines()): 711 if idx: /usr/local/lib/python3.6/dist-packages/xarray/core/formatting.py in repr(self) 62 63 def repr(self): ---> 64 return ensure_valid_repr(self.unicode()) 65 66 /usr/local/lib/python3.6/dist-packages/xarray/core/dataset.py in unicode(self) 1188 1189 def unicode(self): -> 1190 return formatting.dataset_repr(self) 1191 1192 def info(self, buf=None): /usr/local/lib/python3.6/dist-packages/xarray/core/formatting.py in dataset_repr(ds) 415 416 dims_start = pretty_print(u'Dimensions:', col_width) --> 417 summary.append(u'%s(%s)' % (dims_start, dim_summary(ds))) 418 419 if ds.coords: /usr/local/lib/python3.6/dist-packages/xarray/core/formatting.py in dim_summary(obj) 324 325 def dim_summary(obj): --> 326 elements = [u'%s: %s' % (k, v) for k, v in obj.sizes.items()] 327 return u', '.join(elements) 328 /usr/local/lib/python3.6/dist-packages/xarray/core/formatting.py in <listcomp>(.0) 324 325 def dim_summary(obj): --> 326 elements = [u'%s: %s' % (k, v) for k, v in obj.sizes.items()] 327 return u', '.join(elements) 328 /usr/lib/python3.6/_collections_abc.py in iter(self) 741 742 def iter(self): --> 743 for key in self._mapping: 744 yield (key, self._mapping[key]) 745 /usr/local/lib/python3.6/dist-packages/xarray/core/utils.py in iter(self) 311 312 def iter(self): --> 313 return iter(self.mapping) 314 315 def len(self): /usr/local/lib/python3.6/dist-packages/xarray/core/utils.py in iter(self) 347 348 def iter(self): --> 349 return iter(sorted(self.mapping)) 350 351 def len(self): TypeError: '<' not supported between instances of 'CoordId' and 'CoordId' ``` I would be open to PRs to improve the situation. |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Support non-string dimension/variable names 341643235 |
Advanced export
JSON shape: default, array, newline-delimited, object
CREATE TABLE [issue_comments] ( [html_url] TEXT, [issue_url] TEXT, [id] INTEGER PRIMARY KEY, [node_id] TEXT, [user] INTEGER REFERENCES [users]([id]), [created_at] TEXT, [updated_at] TEXT, [author_association] TEXT, [body] TEXT, [reactions] TEXT, [performed_via_github_app] TEXT, [issue] INTEGER REFERENCES [issues]([id]) ); CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]); CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
user 7