issue_comments


8 rows where issue = 595492608 sorted by updated_at descending


user (5)

  • dcherian 3
  • JackKelly 2
  • rabernat 1
  • spencerkclark 1
  • rafa-guedes 1

author_association (3)

  • MEMBER 5
  • NONE 2
  • CONTRIBUTOR 1

issue (1)

  • Time dtype encoding defaulting to `int64` when writing netcdf or zarr · 8
spencerkclark (MEMBER) · 2021-11-11T12:30:21Z (edited 2021-11-11T12:32:06Z) · https://github.com/pydata/xarray/issues/3942#issuecomment-966264200

This logic has been around in xarray for a long time (I think it dates back to https://github.com/pydata/xarray/pull/12!), so it predates me. If I had to guess, though, it has to do with the fact that back then a form of cftime.date2num was used to encode all times, even those that started as np.datetime64 values. I think that's significant for two reasons:

1. In the old days, date2num would only return floating-point values, even if the times could in principle be encoded with integers. For accuracy, it was therefore best to keep the encoded values as small as possible to minimize roundoff error.
2. Even if (1) had not been the case back then, date2num did not -- and still does not -- support nanosecond units, because it relies on microsecond-precision datetimes.

This of course is not true anymore. We no longer use date2num to encode np.datetime64 values, and we no longer encode dates with floating point values by default (#4045); we use integers for optimal round-tripping accuracy, and are capable of encoding dates with nanosecond units.

To be honest, it currently seems the only remaining advantage of choosing a larger time encoding unit and a proximate reference date is that it makes the raw encoded values a little more human-readable. However, encoding dates with units of "nanoseconds since 1970-01-01" is objectively optimal for np.datetime64[ns] values, since it guarantees the maximum range of possible encoded times and maximum round-trip accuracy, so it could be worth revisiting our approach in light of the fact that the current default makes appending somewhat dangerous.
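(Editorial aside: the "maximum range" point can be checked directly with numpy. np.datetime64[ns] is itself an int64 count of nanoseconds since 1970-01-01, so encoding with those exact units loses nothing; a quick illustrative check of the bounds, not xarray code:)

```python
import numpy as np

# datetime64[ns] stores an int64 count of nanoseconds since 1970-01-01,
# with the most negative int64 reserved as the NaT sentinel.
i64 = np.iinfo(np.int64)
earliest = np.datetime64(i64.min + 1, "ns")
latest = np.datetime64(i64.max, "ns")
print(earliest)  # 1677-09-21T00:12:43.145224193
print(latest)    # 2262-04-11T23:47:16.854775807
```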

{
    "total_count": 3,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 1
}
dcherian (MEMBER) · 2021-11-10T18:01:46Z · https://github.com/pydata/xarray/issues/3942#issuecomment-965599998

> Please may I ask: Why not default to xarray encoding time as 'units': 'nanoseconds since 1970-01-01' to be consistent with np.datetime64[ns]?

It's choosing the highest resolution that matches the data, which has the benefit of allowing the maximum possible time range given the data's frequency: https://github.com/pydata/xarray/blob/5871637873cd83c3a656ee6f4df86ea6628cf68a/xarray/coding/times.py#L317-L319

I'm not sure if this is why it was originally chosen; but that is one advantage. Perhaps @spencerkclark has some insight here.
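(Editorial aside: the linked logic can be sketched as follows. This is a simplified illustration of the idea, not the actual code in xarray/coding/times.py: pick the coarsest unit that still represents every spacing in the data exactly.)

```python
import numpy as np

def infer_time_units(times):
    # Illustrative sketch only: choose the coarsest unit that divides
    # every spacing in the data exactly, falling back to nanoseconds.
    times = np.sort(np.asarray(times, dtype="datetime64[ns]"))
    deltas = np.diff(times).astype("int64")  # spacings in nanoseconds
    for unit, ns in [("days", 86_400 * 10**9), ("hours", 3_600 * 10**9),
                     ("minutes", 60 * 10**9), ("seconds", 10**9),
                     ("milliseconds", 10**6), ("microseconds", 10**3)]:
        if (deltas % ns == 0).all():
            return f"{unit} since {times[0]}"
    return f"nanoseconds since {times[0]}"

# Daily-spaced data get "days since ..."; half-hourly data would get minutes.
print(infer_time_units(["2020-01-01", "2020-01-02", "2020-01-05"]))
```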

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
JackKelly (NONE) · 2021-11-10T17:56:41Z (edited 2021-11-10T17:57:17Z) · https://github.com/pydata/xarray/issues/3942#issuecomment-965595392

Cool, I agree that an error and a documentation change are likely to be sufficient :slightly_smiling_face: (and I'd be keen to write a PR to help out!)

But, before we commit to that path: Please may I ask: Why not default to xarray encoding time as 'units': 'nanoseconds since 1970-01-01' to be consistent with np.datetime64[ns]? Sorry if I've missed something obvious!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
dcherian (MEMBER) · 2021-11-10T17:52:20Z · https://github.com/pydata/xarray/issues/3942#issuecomment-965591847

> let's definitely add a note to the documentation to say that it might be a good idea for users to manually specify the encoding for datetimes if they wish to append to Zarrs.

:+1:

> However I think xarray should raise an error when trying to append times that cannot be represented by the on-disk encoding.

Adding this error message would make it obvious that this is happening. PRs are very welcome!
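(Editorial aside: a minimal sketch of what such a check could look like. This is hypothetical; the function name and behaviour are assumptions, not current xarray API.)

```python
import numpy as np

def check_append_times(new_times, step_ns, ref):
    # Hypothetical pre-append check: fail loudly if any new timestamp is
    # not an exact multiple of the on-disk resolution relative to the
    # stored reference date.
    offsets = (np.asarray(new_times, dtype="datetime64[ns]") - ref).astype("int64")
    if (offsets % step_ns != 0).any():
        raise ValueError("appended times cannot be represented exactly "
                         "by the on-disk time encoding")

DAY_NS = 86_400 * 10**9
ref = np.datetime64("2020-01-01", "ns")
check_append_times(["2020-01-02"], DAY_NS, ref)  # whole days: fits, no error
try:
    check_append_times(["2020-01-02T12:00"], DAY_NS, ref)  # half a day
except ValueError as err:
    print(err)
```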

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
JackKelly (NONE) · 2021-11-10T17:17:29Z (edited 2021-11-10T17:49:22Z) · https://github.com/pydata/xarray/issues/3942#issuecomment-965562434

I think I've bumped into a symptom of this issue (my issue is described in #5969). And I think #3379 may be another symptom of this issue.

Perhaps I'm biased (because I work with timeseries which only span a few years) but I wonder if xarray should default to encoding time as 'units': 'nanoseconds since 1970-01-01' (to be consistent with np.datetime64[ns]) unless the timeseries includes dates before the year 1677, or after the year 2262 :slightly_smiling_face:? Would that work?

If that's no good, then let's definitely add a note to the documentation to say that it might be a good idea for users to manually specify the encoding for datetimes if they wish to append to Zarrs.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
rafa-guedes (CONTRIBUTOR) · 2020-04-07T20:55:29Z (edited 2020-04-07T21:07:31Z) · https://github.com/pydata/xarray/issues/3942#issuecomment-610615621

Yep, I managed to overcome this by manually setting encoding parameters; I'm just wondering if there would be any downside in preferring float64 over int64 when automatically defining these? That seems to fix the issue. I guess it could result in some other precision losses due to floating-point errors, but these should be small.
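(Editorial aside: one concrete downside of a float64 default is that nanosecond counts for present-day timestamps exceed float64's 2**53 limit for exactly representable integers, so a float encoding can silently round. A quick numpy illustration:)

```python
import numpy as np

# Nanoseconds since 1970 for a 2021 timestamp are ~1.6e18, far above the
# 2**53 (~9e15) threshold beyond which float64 cannot represent every
# integer, so round-tripping through float64 is lossy.
t = np.datetime64("2021-11-10T17:17:29.123456789", "ns").astype("int64")
assert t > 2**53
assert t != int(np.float64(t))  # the float round-trip changed the value
```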

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
rabernat (MEMBER) · 2020-04-07T15:10:40Z · https://github.com/pydata/xarray/issues/3942#issuecomment-610444090

I agree with Deepak. Xarray intelligently chooses its encoding when it writes the initial dataset to make sure it has enough precision to resolve all times. It cannot magically know that, in the future, you plan to append data which requires greater precision. Your options are:

  • If you know from the outset that you will require greater precision in time encoding, you can manually specify your encoding before you write (http://xarray.pydata.org/en/stable/io.html#scaling-and-type-conversions).
  • If you don't know from the outset, you will have to overwrite the full time variable with new encoding.

I also agree that we should definitely be raising a warning (or even an error) in your situation.

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
dcherian (MEMBER) · 2020-04-07T14:46:06Z · https://github.com/pydata/xarray/issues/3942#issuecomment-610429922

I have run into this problem before.

The initial choice to use int64 and days since ... is perfectly valid. However I think xarray should raise an error when trying to append times that cannot be represented by the on-disk encoding.

Note that you can always specify an encoding to make sure that you can append properly.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 14.56ms · About: xarray-datasette