github: issue_comments: 9 rows where issue = 614275938 sorted by updated_at descending

id	html_url	issue_url	node_id	user	created_at	updated_at	author_association	body	reactions	issue
744103639	https://github.com/pydata/xarray/issues/4045#issuecomment-744103639	https://api.github.com/repos/pydata/xarray/issues/4045	MDEyOklzc3VlQ29tbWVudDc0NDEwMzYzOQ==	spencerkclark 6628425	2020-12-14T00:50:46Z	2020-12-14T00:50:46Z	MEMBER	@half-adder I've verified that #4684 fixes your initial issue. Note, however, that outside of the time you referenced, your Dataset contained times that required nanosecond precision, e.g.: ```python data.time.isel(animal=0, timepoint=0, pair=-1, wavelength=0) <xarray.DataArray 'time' ()> array('2017-02-22T16:24:14.722999999', dtype='datetime64[ns]') Coordinates: wavelength <U3 '410' strain object 'HD233' stage_x float64 1.64e+04 stage_y float64 -429.0 stage_z float64 2.155e+04 bin_x float64 4.0 bin_y float64 4.0 exposure float64 90.0 mvmt-anterior uint8 0 mvmt-posterior uint8 0 mvmt-sides_of_tip uint8 0 mvmt-tip uint8 0 experiment_id object '2017_02_22-HD233_SAY47' time datetime64[ns] 2017-02-22T16:24:14.722999999 animal_ uint64 0 ``` So in order for things to be round-tripped accurately you will need to override the original units in the dataset with nanoseconds instead of microseconds. This was not possible before, but now is with #4684. ```python data.time.encoding["units"] = "nanoseconds since 1900-01-01" ``` With #4684 you could also just simply delete the original units, and xarray will now automatically choose the appropriate units so that the datetimes can be serialized with `int64` values (and hence be round-tripped exactly). ```python del data.time.encoding["units"] ```	{ "total_count": 3, "+1": 2, "-1": 0, "laugh": 0, "hooray": 1, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Millisecond precision is lost on datetime64 during IO roundtrip 614275938
735879413	https://github.com/pydata/xarray/issues/4045#issuecomment-735879413	https://api.github.com/repos/pydata/xarray/issues/4045	MDEyOklzc3VlQ29tbWVudDczNTg3OTQxMw==	dcherian 2448579	2020-11-30T16:05:12Z	2020-11-30T16:05:12Z	MEMBER	I would look here: https://github.com/pydata/xarray/blob/255bc8ee9cbe8b212e3262b0d4b2e32088a08064/xarray/coding/times.py#L440-L474	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Millisecond precision is lost on datetime64 during IO roundtrip 614275938
735851973	https://github.com/pydata/xarray/issues/4045#issuecomment-735851973	https://api.github.com/repos/pydata/xarray/issues/4045	MDEyOklzc3VlQ29tbWVudDczNTg1MTk3Mw==	aldanor 2418513	2020-11-30T15:22:09Z	2020-11-30T15:22:09Z	NONE	(Quoting @dcherian: "Can we use the `encoding["dtype"]` field to solve this? i.e. use `int64` when `encoding["dtype"]` is not set and use the specified value when available?") I think a lot of logic needs to be reshuffled, because as of right now it will complain "you can't store a float64 in int64" or something along those lines, when trying to do it with a nanosecond timestamp.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Millisecond precision is lost on datetime64 during IO roundtrip 614275938
735849936	https://github.com/pydata/xarray/issues/4045#issuecomment-735849936	https://api.github.com/repos/pydata/xarray/issues/4045	MDEyOklzc3VlQ29tbWVudDczNTg0OTkzNg==	aldanor 2418513	2020-11-30T15:18:55Z	2020-11-30T15:21:02Z	NONE	(Quoting @spencerkclark: "In principle we should be able to handle this (contributions are welcome)") I don't mind contributing, but not knowing the netcdf stuff inside out, I'm not sure I have a good vision of the proper way to do it. My use case is very simple: I have an in-memory xr.Dataset that I want to save() and then load() without losses. Should it just be an `xr.save(..., m8=True)` (or whatever that flag would be called), so that all of numpy's `M8[...]` and `m8[...]` would be serialized transparently (as int64, that is) without passing them through the whole cftime pipeline? It would then be nice, of course, if `xr.load` was also aware of this convention (via some special attribute or somehow else) and could convert them back like `.view('M8[ns]')` when loading. I think xarray should also throw an exception if it detects timestamps/timedeltas of nanosecond precision that it can't serialize without going through the int-float-int routine (or automatically revert to using this transparent but netcdf-incompatible mode). Maybe this is not the proper way to do it - ideas welcome (there's also an open PR - #4400 - mind checking that out?)	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Millisecond precision is lost on datetime64 during IO roundtrip 614275938
735847525	https://github.com/pydata/xarray/issues/4045#issuecomment-735847525	https://api.github.com/repos/pydata/xarray/issues/4045	MDEyOklzc3VlQ29tbWVudDczNTg0NzUyNQ==	dcherian 2448579	2020-11-30T15:15:13Z	2020-11-30T15:15:13Z	MEMBER	Can we use the `encoding["dtype"]` field to solve this? i.e. use `int64` when `encoding["dtype"]` is not set and use the specified value when available?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Millisecond precision is lost on datetime64 during IO roundtrip 614275938
735789517	https://github.com/pydata/xarray/issues/4045#issuecomment-735789517	https://api.github.com/repos/pydata/xarray/issues/4045	MDEyOklzc3VlQ29tbWVudDczNTc4OTUxNw==	spencerkclark 6628425	2020-11-30T13:35:26Z	2020-11-30T13:40:50Z	MEMBER	(Quoting @aldanor: "Internally, datetime64[ns] is simply an 8-byte int. Why on earth would it be serialized in a lossy way as a float64?...") The short answer is that CF conventions allow for dates to be encoded with floating point values, so we encounter that in data that xarray ingests from other sources (i.e. files that were not even produced with Python, let alone xarray). If we didn't have to worry about roundtripping files that followed those conventions, I agree we would just encode everything with nanosecond units as `int64` values. (Quoting @aldanor: "This is a huge issue, as anyone using nanosecond-precision timestamps with xarray would unknowingly and silently read wrong data after deserializing.") Yes, I can see why this would be quite frustrating. In principle we should be able to handle this (contributions are welcome); it just has not been a priority up to this point. In my experience xarray's current encoding and decoding methods for standard calendar times work well up to at least second precision.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Millisecond precision is lost on datetime64 during IO roundtrip 614275938
734951187	https://github.com/pydata/xarray/issues/4045#issuecomment-734951187	https://api.github.com/repos/pydata/xarray/issues/4045	MDEyOklzc3VlQ29tbWVudDczNDk1MTE4Nw==	aldanor 2418513	2020-11-27T18:47:26Z	2020-11-27T18:51:00Z	NONE	Just stumbled upon this as well. Internally, `datetime64[ns]` is simply an 8-byte int. Why on earth would it be serialized in a lossy way as a float64?... Simply telling it to `encoding={...: {'dtype': 'int64'}}` won't work since then it complains about serializing float as an int. Is there a way out of this, other than not using `M8[ns]` dtypes at all with xarray? This is a huge issue, as anyone using nanosecond-precision timestamps with xarray would unknowingly and silently read wrong data after deserializing.	{ "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Millisecond precision is lost on datetime64 during IO roundtrip 614275938
626257580	https://github.com/pydata/xarray/issues/4045#issuecomment-626257580	https://api.github.com/repos/pydata/xarray/issues/4045	MDEyOklzc3VlQ29tbWVudDYyNjI1NzU4MA==	spencerkclark 6628425	2020-05-10T01:15:53Z	2020-05-10T01:15:53Z	MEMBER	Thanks for the report @half-adder. This indeed is related to times being encoded as floats, but actually is not cftime-related (the times here not being encoded using cftime; we only use cftime for non-standard calendars and out of nanosecond-resolution bounds dates). Here's a minimal working example that illustrates the issue with the current logic in `coding.times.encode_cf_datetime`: ``` In [1]: import numpy as np; import pandas as pd In [2]: times = pd.DatetimeIndex([np.datetime64("2017-02-22T16:27:08.732000000")]) In [3]: reference = pd.Timestamp("1900-01-01") In [4]: units = np.timedelta64(1, "us") In [5]: (times - reference).values[0] Out[5]: numpy.timedelta64(3696769628732000000,'ns') In [6]: ((times - reference) / units).values[0] Out[6]: 3696769628732000.5 ``` In principle, we should be able to represent the difference between this date and the reference date in an integer amount of microseconds, but timedelta division produces a float. We currently try to cast these floats to integers when possible, but that's not always safe to do, e.g. in the case above. It would be great to make roundtripping times -- particularly standard calendar datetimes like these -- more robust. It's possible we could now leverage floor division (i.e. `//`) of timedeltas within NumPy for this (assuming we first check that the unit conversion divisor exactly divides each timedelta; if it doesn't we'd fall back to using floats): `In [7]: ((times - reference) // units).values[0] Out[7]: 3696769628732000` These precision issues can be tricky, however, so we'd need to think things through carefully. Even if we fixed this on the encoding side, things are converted to floats during decoding, so we'd need to make a change there too.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Millisecond precision is lost on datetime64 during IO roundtrip 614275938
625481974	https://github.com/pydata/xarray/issues/4045#issuecomment-625481974	https://api.github.com/repos/pydata/xarray/issues/4045	MDEyOklzc3VlQ29tbWVudDYyNTQ4MTk3NA==	DocOtak 868027	2020-05-07T20:32:22Z	2020-05-07T20:32:22Z	CONTRIBUTOR	This has something to do with the time values at some point being a float: ```python import numpy as np np.datetime64("2017-02-22T16:24:10.586000000").astype("float64").astype(np.dtype('<M8[ns]')) numpy.datetime64('2017-02-22T16:24:10.585999872') ``` It looks like this is happening somewhere in the cftime library.	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Millisecond precision is lost on datetime64 during IO roundtrip 614275938
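
The float round trip discussed in the comments above can be reproduced with plain integer and float arithmetic. The following is a minimal sketch, not xarray's actual encoding code: it reuses the timestamps and the 1900-01-01 reference quoted in the thread, and `nanoseconds_between` is a hypothetical helper introduced only for illustration.

```python
from datetime import datetime

NS_PER_US = 1_000


def nanoseconds_between(t, ref):
    """Hypothetical helper: exact integer nanoseconds from ref to t."""
    delta = t - ref
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 10**9 + delta.microseconds * 1_000


# Offset used in the encode_cf_datetime example from the thread.
delta_ns = nanoseconds_between(
    datetime(2017, 2, 22, 16, 27, 8, 732000), datetime(1900, 1, 1)
)

# Lossy path: the nanosecond count passes through a 64-bit float
# (53-bit significand) before the unit conversion.
float_us = float(delta_ns) / NS_PER_US

# Exact path: check that the divisor divides the offset, then floor-divide.
remainder = delta_ns % NS_PER_US
exact_us = delta_ns // NS_PER_US

# The int -> float -> int round trip from the first comment in the thread,
# using nanoseconds since 1970-01-01 for 2017-02-22T16:24:10.586.
epoch_ns = nanoseconds_between(
    datetime(2017, 2, 22, 16, 24, 10, 586000), datetime(1970, 1, 1)
)
roundtrip_ns = int(float(epoch_ns))

print("float microseconds:", repr(float_us))
print("exact microseconds:", exact_us, "remainder:", remainder)
print("round-trip error in ns:", roundtrip_ns - epoch_ns)
```

Integers of this magnitude are not all representable as 64-bit floats, which is why the integer path is the one that preserves the value; the remainder check mirrors the suggestion in the thread to fall back to floats only when the divisor does not divide the offset exactly.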

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
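
The view above (rows for one issue, sorted by `updated_at` descending) can be approximated against this schema with Python's built-in `sqlite3` module. This is a sketch under stated assumptions: it uses an in-memory database, an abridged copy of the table without the foreign-key references, and only three of the nine rows, with the other columns left empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Abridged copy of the schema: only the columns the query needs,
# and no REFERENCES clauses (an assumption made to keep this self-contained).
conn.executescript(
    """
    CREATE TABLE [issue_comments] (
       [id] INTEGER PRIMARY KEY,
       [user] INTEGER,
       [updated_at] TEXT,
       [issue] INTEGER
    );
    CREATE INDEX [idx_issue_comments_issue]
        ON [issue_comments] ([issue]);
    """
)

# Three rows taken from the table above, inserted out of order on purpose.
rows = [
    (625481974, 868027, "2020-05-07T20:32:22Z", 614275938),
    (744103639, 6628425, "2020-12-14T00:50:46Z", 614275938),
    (735879413, 2448579, "2020-11-30T16:05:12Z", 614275938),
]
conn.executemany(
    "INSERT INTO [issue_comments] ([id], [user], [updated_at], [issue]) "
    "VALUES (?, ?, ?, ?)",
    rows,
)

# ISO 8601 UTC timestamps stored as TEXT sort correctly as plain strings.
ids = [
    row[0]
    for row in conn.execute(
        "SELECT [id] FROM [issue_comments] "
        "WHERE [issue] = ? ORDER BY [updated_at] DESC",
        (614275938,),
    )
]
print(ids)
```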