issue_comments
4 rows where issue = 33637243 and user = 1217238, sorted by updated_at descending
43365791 · shoyer (1217238) · MEMBER
https://github.com/pydata/xarray/issues/131#issuecomment-43365791
created 2014-05-16T18:44:26Z · updated 2014-05-16T18:44:26Z

Module-wide configuration flags are generally a bad idea, because such non-local effects make it harder to predict how code works. This is less of a concern for configuration options that only change how objects are displayed, which I believe is the only way such flags are used in numpy or pandas. But I don't have any objections to adding a method option.

reactions: none · issue: Dataset summary methods (33637243)
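The contrast drawn here is easy to sketch. Modern xarray (the project xray grew into) ended up exposing attribute handling both ways, as a per-method keep_attrs argument and as a module-wide option; a minimal illustration, assuming a recent xarray release:

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(4.0), dims="x", attrs={"units": "m"})

# By default, reductions drop attributes.
print(da.mean().attrs)                 # {}

# Method option: the effect is local and visible at the call site.
print(da.mean(keep_attrs=True).attrs)  # {'units': 'm'}

# Module-wide flag: a non-local effect on every subsequent operation.
with xr.set_options(keep_attrs=True):
    print(da.mean().attrs)             # {'units': 'm'}
```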
43356581 · shoyer (1217238) · MEMBER
https://github.com/pydata/xarray/issues/131#issuecomment-43356581
created 2014-05-16T17:17:16Z · updated 2014-05-16T17:17:16Z

You're right that keeping attributes fully intact under any operation is a perfectly reasonable alternative to dropping them. So what do NCO and CDO do with attributes when you calculate the variance along a dimension of a variable? The choices, as I see them, are:

1. Drop all attributes
2. Keep all attributes
3. Keep all attributes with the exception of "units" (which is dropped)
4. Keep all attributes, but modify "units" according to the mathematical operation

For xray, 2 is out, because it leaves wrong metadata intact. 3 and 4 are out, because we don't want to be in the business of relying on metadata. This leaves 1 -- dropping all attributes.

For consistency, if 1 is the choice we need to make for "variance", then the same rule should apply for all "reduce" operations, including apparently innocuous operations like "mean". Note that this is also consistent with how xray handles attributes in all other mathematical operations -- even adding 0 or multiplying by 1 removes all attributes.

My sense (not being a heavy user of these tools) is that NCO and CDO have a little more freedom to keep metadata around because they maintain a "history" attribute.

Loading files from disk is a little different. Notice that once variables get loaded into xray, any attributes that were used for decoding have been removed from "attributes" and moved to "encoding". The meaningful attributes only exist on files on disk (unavoidable given the limitations of NetCDF).

reactions: none · issue: Dataset summary methods (33637243)
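Choice 1 is what xray shipped, and the behavior persists in modern xarray's defaults; a small sketch (assuming a recent release, where attributes are still dropped unless keep_attrs is set):

```python
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.arange(6.0).reshape(2, 3),
    dims=("x", "y"),
    attrs={"units": "m"},
)

print(da.var(dim="y").attrs)   # {} -- a reduce operation drops all attributes
print(da.mean(dim="y").attrs)  # {} -- even an innocuous mean
print((da + 0).attrs)          # {} -- adding 0 removes attributes too
print((da * 1).attrs)          # {} -- as does multiplying by 1
```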
43294717 · shoyer (1217238) · MEMBER
https://github.com/pydata/xarray/issues/131#issuecomment-43294717
created 2014-05-16T04:07:46Z · updated 2014-05-16T16:43:36Z

As a note on your points (1) and (2): currently, we remove all dataset and array attributes when doing any operations other than (re)indexing. This includes when reduce operations like mean are applied, because it didn't seem safe to assume that the original attributes were still descriptive. In particular, I was worried about units.

I'm willing to reconsider this, but in general I would like to avoid any functionality that is metadata-aware other than dimension and coordinate labels. In my experience, systems that rely on attributes become much more complex and harder to predict, so I would like to avoid that. I don't see a unit system as in scope for xray, at least not at this time.

Your solution 4(b) -- dropping coordinates rather than attempting to summarize them -- would also be my preferred approach. It is consistent with pandas (try ...).

Speaking of non-numerical data, we will need to take an approach like pandas to ignore non-numerical variables when taking the mean. It might be worth taking a look at how pandas handles this, but I imagine using a ...

If you're interested in taking a crack at implementation, take a look at ...

reactions: none · issue: Dataset summary methods (33637243)
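On the non-numerical point, modern xarray did end up with pandas-like behavior: numeric-only Dataset reductions silently skip non-numeric variables. A sketch, assuming a recent xarray release (the variable names are invented for illustration):

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {
        "temperature": ("x", np.array([280.0, 283.5, 281.2])),
        "station": ("x", np.array(["a", "b", "c"])),  # non-numerical
    }
)

# mean() is numeric-only: "station" is skipped, much like pandas
# DataFrame reductions with numeric_only.
print(ds.mean())  # result contains only "temperature"
```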
43282058 · shoyer (1217238) · MEMBER
https://github.com/pydata/xarray/issues/131#issuecomment-43282058
created 2014-05-16T00:29:26Z · updated 2014-05-16T01:46:59Z

Thanks for raising this as a separate issue. Yes, I agree it would be nice to add these summary methods! We can imagine DataArray methods on Datasets mapping over all variables, in a somewhat similar way to how groupby methods map over each group. These methods are very convenient for pandas.DataFrame objects, so it makes sense to have them for xray.Dataset, too.

The only unfortunate aspect is that it is harder to see the values in a Dataset, because they aren't given in the standard string representation. In contrast, methods like ...

reactions: none · issue: Dataset summary methods (33637243)
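A sketch of the mapping idea described above; Dataset.map in modern xarray (a later addition, not part of this 2014 discussion) expresses the pattern directly, and the dict comprehension shows roughly what it amounts to:

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {
        "u": (("x", "y"), np.arange(6.0).reshape(2, 3)),
        "v": (("x", "y"), np.arange(6.0).reshape(2, 3) * 2),
    }
)

# Apply a DataArray method to every data variable, groupby-style.
means = ds.map(lambda da: da.mean())

# Roughly equivalent, spelled out by hand:
by_hand = xr.Dataset({name: da.mean() for name, da in ds.data_vars.items()})
```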
CREATE TABLE [issue_comments] (
    [html_url] TEXT,
    [issue_url] TEXT,
    [id] INTEGER PRIMARY KEY,
    [node_id] TEXT,
    [user] INTEGER REFERENCES [users]([id]),
    [created_at] TEXT,
    [updated_at] TEXT,
    [author_association] TEXT,
    [body] TEXT,
    [reactions] TEXT,
    [performed_via_github_app] TEXT,
    [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
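The filter at the top of this page can be reproduced against this schema with a short query; a sketch using Python's sqlite3, where the file name github.db is an assumption (this schema resembles what github-to-sqlite produces):

```python
import sqlite3

# "github.db" is a hypothetical file name; point this at the actual database.
conn = sqlite3.connect("github.db")
rows = conn.execute(
    """
    select id, created_at, updated_at, body
    from issue_comments
    where issue = ? and [user] = ?
    order by updated_at desc
    """,
    (33637243, 1217238),
).fetchall()

for comment_id, created, updated, body in rows:
    print(comment_id, updated, body[:60])
```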