github: issue_comments: 9 rows where issue = 1310058435 sorted by updated

9 rows where issue = 1310058435 sorted by updated_at descending

Search:

descending

id	html_url	issue_url	node_id	user	created_at	updated_at ▲	author_association	body	reactions	issue
1322443376	https://github.com/pydata/xarray/issues/6813#issuecomment-1322443376	https://api.github.com/repos/pydata/xarray/issues/6813	IC_kwDOAMm_X85O0uJw	weiji14 23487320	2022-11-21T17:55:16Z	2022-11-21T17:56:17Z	CONTRIBUTOR	Just hitting into this same issue mentioned downstream at https://github.com/xarray-contrib/datatree/pull/130 while trying to read ICESat-2 HDF5 files from S3, but realized that the fix should happening in `xarray`, so I've started a PR at #7304 to fix this (thanks @djhoese for the code snippet at https://github.com/pydata/xarray/issues/6813#issuecomment-1204300671)! Will look into the unit test failures and fix them one by one as they pop up on CI.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Opening fsspec s3 file twice results in invalid start byte 1310058435
1205503288	https://github.com/pydata/xarray/issues/6813#issuecomment-1205503288	https://api.github.com/repos/pydata/xarray/issues/6813	IC_kwDOAMm_X85H2oU4	djhoese 1828519	2022-08-04T16:36:43Z	2022-08-04T16:36:43Z	CONTRIBUTOR	@wroberts4 I'd say maybe make a pull request and we'll see what (if any) tests fail and what the people in charge of merging think about it. I think we've gone through the various possibilities and I think if there were any thread-safety issues trying to be protected against with the exception as it was, they weren't actually being protected against (later reading of the file could have caused an issue).	{ "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Opening fsspec s3 file twice results in invalid start byte 1310058435
1204355953	https://github.com/pydata/xarray/issues/6813#issuecomment-1204355953	https://api.github.com/repos/pydata/xarray/issues/6813	IC_kwDOAMm_X85HyQNx	djhoese 1828519	2022-08-03T18:57:20Z	2022-08-03T18:57:20Z	CONTRIBUTOR	Good point. My initial answer was going to be that it isn't a problem because in the second usage of the file we would get the exception about `.tell()` not being at 0, but after the `.seek(0)` that would be true and we wouldn't get that exception. So...I guess maybe it should be documented that xarray doesn't support opening the same file-like object from different threads. In which case, making the changes suggested here would only add usability/functionality and not cause any additional issues...unless we're missing something.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Opening fsspec s3 file twice results in invalid start byte 1310058435
1204344123	https://github.com/pydata/xarray/issues/6813#issuecomment-1204344123	https://api.github.com/repos/pydata/xarray/issues/6813	IC_kwDOAMm_X85HyNU7	wroberts4 38170479	2022-08-03T18:45:04Z	2022-08-03T18:45:04Z	CONTRIBUTOR	@djhoese Is that not already an existing problem since `filename_or_obj.seek(0)` is called in the existing code after reading the magic number?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Opening fsspec s3 file twice results in invalid start byte 1310058435
1204316906	https://github.com/pydata/xarray/issues/6813#issuecomment-1204316906	https://api.github.com/repos/pydata/xarray/issues/6813	IC_kwDOAMm_X85HyGrq	djhoese 1828519	2022-08-03T18:17:41Z	2022-08-03T18:17:41Z	CONTRIBUTOR	I am not certain whether seeking/reading from the same file in multiple places might have unforeseen consequences, such as when doing open_dataset in multiple threads. Oh duh, that's a good point. So it might be fine dask-wise if the assumption is that `open_dataset` is called in the main thread and then dask is used to do computations on the arrays later on. If we're talking regular Python Threads or dask delayed functions that are calling `open_dataset` on the same file-like object (that was passed to the worker function) then it would cause issues. Possibly rare case, but still probably something that xarray wants to support. Yeah I thought the `.read/.write` omission from the IOBase class was odd too. Just wanted to point out that the if block is using `.read` but `IOBase` is not guaranteed to have `.read`.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Opening fsspec s3 file twice results in invalid start byte 1310058435
1204310218	https://github.com/pydata/xarray/issues/6813#issuecomment-1204310218	https://api.github.com/repos/pydata/xarray/issues/6813	IC_kwDOAMm_X85HyFDK	martindurant 6042212	2022-08-03T18:10:57Z	2022-08-03T18:10:57Z	CONTRIBUTOR	Yes, it is reasonable to always `seek(0)` or to copy the file. I am not certain why/where xarray is caching the open file, though - I would have thought that a new file instance is made for each `open_dataset()`. I am not certain whether seeking/reading from the same file in multiple places might have unforeseen consequences, such as when doing open_dataset in multiple threads. I am mildly against subclassing from RawIOBase, since some file-likes might choose to implement text mode right in the class (as opposed to a text wrapper layered on top). Pretty surprised that it doesn't have read()/write(), though, since all the derived classes do.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Opening fsspec s3 file twice results in invalid start byte 1310058435
1204300671	https://github.com/pydata/xarray/issues/6813#issuecomment-1204300671	https://api.github.com/repos/pydata/xarray/issues/6813	IC_kwDOAMm_X85HyCt_	djhoese 1828519	2022-08-03T18:02:20Z	2022-08-03T18:02:20Z	CONTRIBUTOR	I talked with @wroberts4 about this in person and if we're not missing some reason to not `.seek(0)` on a data source then this seems like a simple convenience and user experience improvement. We were thinking maybe it would make more sense to change the function to look like: `python def read_magic_number_from_file(filename_or_obj, count=8) -> bytes: # check byte header to determine file type if isinstance(filename_or_obj, bytes): magic_number = filename_or_obj[:count] elif isinstance(filename_or_obj, io.IOBase): if filename_or_obj.tell() != 0: filename_or_obj.seek(0) # warn about re-seeking? magic_number = filename_or_obj.read(count) filename_or_obj.seek(0) else: raise TypeError(f"cannot read the magic number form {type(filename_or_obj)}") return magic_number` Additionally, the `isinstance` check is for `io.IOBase` but that base class isn't guaranteed to have a `.read` method. The check should probably be for `RawIOBase`: https://docs.python.org/3/library/io.html#class-hierarchy @kmuehlbauer @lamorton I saw you commented on the almost related #3991, do you have any thoughts on this? Should we put a PR together to continue the discussion? Maybe the fsspec folks (@martindurant?) have an opinion on this?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Opening fsspec s3 file twice results in invalid start byte 1310058435
1189589335	https://github.com/pydata/xarray/issues/6813#issuecomment-1189589335	https://api.github.com/repos/pydata/xarray/issues/6813	IC_kwDOAMm_X85G57FX	wroberts4 38170479	2022-07-19T21:55:51Z	2022-07-20T14:56:33Z	CONTRIBUTOR	Note that the second example fails on the first open_dataset, which I assume is expected since it's not mode `mode='rb'`	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Opening fsspec s3 file twice results in invalid start byte 1310058435
1189593287	https://github.com/pydata/xarray/issues/6813#issuecomment-1189593287	https://api.github.com/repos/pydata/xarray/issues/6813	IC_kwDOAMm_X85G58DH	wroberts4 38170479	2022-07-19T22:00:48Z	2022-07-19T22:00:48Z	CONTRIBUTOR	Adding `filename_or_obj.seek(0)` fixes this, but I am not sure if it poses a security risk? def read_magic_number_from_file(filename_or_obj, count=8) -> bytes: # check byte header to determine file type if isinstance(filename_or_obj, bytes): magic_number = filename_or_obj[:count] elif isinstance(filename_or_obj, io.IOBase): filename_or_obj.seek(0) if filename_or_obj.tell() != 0: raise ValueError( "cannot guess the engine, " "file-like object read/write pointer not at the start of the file, " "please close and reopen, or use a context manager" ) magic_number = filename_or_obj.read(count) filename_or_obj.seek(0) else: raise TypeError(f"cannot read the magic number form {type(filename_or_obj)}") return magic_number	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Opening fsspec s3 file twice results in invalid start byte 1310058435

Advanced export

JSON shape: default, array, newline-delimited, object

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);