
issue_comments

6 rows where user = 2067093 sorted by updated_at descending

Columns: id, html_url, issue_url, node_id, user, created_at, updated_at, author_association, body, reactions, performed_via_github_app, issue
808605422 https://github.com/pydata/xarray/issues/5070#issuecomment-808605422 https://api.github.com/repos/pydata/xarray/issues/5070 MDEyOklzc3VlQ29tbWVudDgwODYwNTQyMg== NowanIlfideme 2067093 2021-03-27T00:39:26Z 2021-03-27T00:43:35Z NONE

Just ran into this. Unsure whether checking hasattr is better than just trying to read the object and catching an error - someone could implement a non-compliant read method, which would create other errors.
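
For illustration, here is a minimal sketch of the two checks being weighed; the `looks_like_file` helper is hypothetical, not xarray's actual code:

```python
import io

def looks_like_file(obj) -> bool:
    """Hypothetical helper contrasting the two detection strategies."""
    # Strict check: rejects duck file-likes (e.g. fsspec's file objects)
    # that don't subclass io.IOBase.
    if isinstance(obj, io.IOBase):
        return True
    # Looser duck-typing check: accepts anything exposing a read() method,
    # at the cost of letting non-compliant read() implementations through.
    return callable(getattr(obj, "read", None))
```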

As a workaround, you could read it into BytesIO and pass the BytesIO instance:

```python
import fsspec
import xarray as xr
from io import BytesIO

of = fsspec.open("example.nc")
with of as f:
    xr.load_dataset(BytesIO(f.read()))
```

Also, here's the link to the code referenced above.

Ideally xarray would work with fsspec or pyfilesystem2 out of the box (to parse access URLs, for example). I've had to fall back to using BytesIO buffers too many times. 😛
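
For example, a small wrapper along these lines could hide that fallback behind a URL-based API; the helper name and the example S3 URL below are made up:

```python
from io import BytesIO

import fsspec
import xarray as xr

def load_dataset_from_url(url: str, **storage_options) -> xr.Dataset:
    """Hypothetical helper: load a remote netCDF file via fsspec by
    buffering the whole file into memory first."""
    with fsspec.open(url, mode="rb", **storage_options) as f:
        return xr.load_dataset(BytesIO(f.read()))

# e.g. ds = load_dataset_from_url("s3://some-bucket/example.nc", anon=True)
```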

Edit: You don't even need BytesIO; it even works with raw bytes:

```python
import fsspec
import xarray as xr

of = fsspec.open("example.nc")
with of as f:
    xr.load_dataset(f.read())
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  requires io.IOBase subclass rather than duck file-like 839823306
730263703 https://github.com/pydata/xarray/issues/2059#issuecomment-730263703 https://api.github.com/repos/pydata/xarray/issues/2059 MDEyOklzc3VlQ29tbWVudDczMDI2MzcwMw== NowanIlfideme 2067093 2020-11-19T10:02:35Z 2020-11-19T10:02:35Z NONE

This may be relevant here, maybe not, but it appears the HDF5 backend is also at odds with all the above serialization.

Our internal project's dependencies changed, and that moved the h5py version from 2.10 to 3.1; apparently there was a breaking change that meant unicode strings were either encoded or decoded as bytes. Thankfully we had a test for that, but figuring out what was wrong was difficult.

Essentially, netCDF4 files that were round-tripped to a BytesIO (via an HDF5 backend) had unicode strings converted to bytes. I'm not sure whether it was the encoding or decoding part, likely decoding, judging by the docs:

https://docs.h5py.org/en/stable/strings.html
https://docs.h5py.org/en/stable/whatsnew/3.0.html#breaking-changes-deprecations

This might require even more special-casing to achieve consistent behavior for xarray users who don't really want to go into backend details (like me 😋).
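
As a rough illustration of the kind of special-casing this implies, here is a sketch of a post-load fixup; the helper is hypothetical and only handles fixed-width byte strings:

```python
import numpy as np
import xarray as xr

def decode_byte_strings(ds: xr.Dataset) -> xr.Dataset:
    """Hypothetical fixup: decode fixed-width byte strings (e.g. |S32)
    back to unicode after a round trip through an h5py >= 3 backend."""
    out = ds.copy()
    for name, var in ds.variables.items():
        if var.dtype.kind == "S":  # object-dtype bytes would need similar handling
            out[name] = (var.dims, np.char.decode(var.values, "utf-8"), var.attrs)
    return out
```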

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  How should xarray serialize bytes/unicode strings across Python/netCDF versions? 314444743
657798184 https://github.com/pydata/xarray/issues/2995#issuecomment-657798184 https://api.github.com/repos/pydata/xarray/issues/2995 MDEyOklzc3VlQ29tbWVudDY1Nzc5ODE4NA== NowanIlfideme 2067093 2020-07-13T21:17:06Z 2020-07-13T21:17:06Z NONE

I ran into this issue; here's a simple workaround that seems to work:

```python
import netCDF4
import xarray as xr
from xarray.backends import NetCDF4DataStore
from xarray.backends.api import dump_to_store


def dataset_to_bytes(ds: xr.Dataset, name: str = "my-dataset") -> bytes:
    """Converts dataset to bytes."""
    # Write to an in-memory (diskless) netCDF4 dataset, then grab the buffer.
    nc4_ds = netCDF4.Dataset(name, mode="w", diskless=True, memory=ds.nbytes)
    nc4_store = NetCDF4DataStore(nc4_ds)
    dump_to_store(ds, nc4_store)
    res_mem = nc4_ds.close()
    res_bytes = res_mem.tobytes()
    return res_bytes
```

I tested this using the following:

```python
from io import BytesIO

import xarray as xr

fname = "REDACTED.nc"
ds = xr.load_dataset(fname)
ds_bytes = dataset_to_bytes(ds)
ds2 = xr.load_dataset(BytesIO(ds_bytes))

assert ds2.equals(ds) and all(
    ds2.attrs[k] == ds.attrs[k] for k in set(ds2.attrs).union(ds.attrs)
)
```

The assertion holds true; however, the file size on disk is different. It's possible they were saved using different netCDF4 versions, but I haven't had time to test that.

I tried using just ds.to_netcdf() but got the following error:

`ValueError: NetCDF 3 does not support type |S32`

That's because it falls back to the 'scipy' engine. It would be nice to have a non-hacky way to write netCDF4 files to byte streams. :smiley:
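
For reference, one less fragile route under the same constraint is to let the netcdf4 engine write to a temporary file and read the bytes back; a sketch, with made-up names:

```python
import os
import tempfile

import xarray as xr

def dataset_to_bytes_via_tmpfile(ds: xr.Dataset) -> bytes:
    """Sketch: write with the netcdf4 engine to a temporary file (avoiding
    the scipy/NETCDF3 fallback) and return the raw bytes."""
    fd, path = tempfile.mkstemp(suffix=".nc")
    os.close(fd)
    try:
        ds.to_netcdf(path, engine="netcdf4")
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)
```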

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Remote writing NETCDF4 files to Amazon S3 449706080
557579503 https://github.com/pydata/xarray/issues/1603#issuecomment-557579503 https://api.github.com/repos/pydata/xarray/issues/1603 MDEyOklzc3VlQ29tbWVudDU1NzU3OTUwMw== NowanIlfideme 2067093 2019-11-22T15:34:57Z 2019-11-22T15:34:57Z NONE

> Thanks @NowanIlfideme for your feedback.
>
> Could you perhaps share a gist of code related to your use case?

The first example in this comment is similar to my use case: https://github.com/pydata/xarray/issues/3213#issuecomment-520741706 . There are several "core" dimensions, but some parts of the coordinates may be hierarchical or cross-defined (e.g. country > province > city > building, but also country > province > voting district > building). We might have a full or nearly-full panel in the MultiIndex representation, yet the full cross product would be huge (even if we keep strictly hierarchical dimensions out).

Meanwhile, using a true COO sparse representation (as I understand it) would likely end up with slower operations overall, since nearly all machine learning models (think: linear regression) require dense array input anyway.

I'll make an example of this when I find some free time, along with a contrasting one in Pandas. :)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Explicit indexes in xarray's data-model (Future of MultiIndex) 262642978
557563566 https://github.com/pydata/xarray/issues/1603#issuecomment-557563566 https://api.github.com/repos/pydata/xarray/issues/1603 MDEyOklzc3VlQ29tbWVudDU1NzU2MzU2Ng== NowanIlfideme 2067093 2019-11-22T14:59:29Z 2019-11-22T14:59:29Z NONE

I've noticed that basically all my current troubles with xarray lead to this issue (lack of MultiIndex support). I use xarray for machine learning/data science/econometrics. My current problem requires semi-hierarchical indexing on one of the dimensions, and slicing/aggregation along some levels of that dimension.

My first attempt was to just assume each dimension was orthogonal, which resulted in out-of-memory errors. I ended up using a MultiIndex for the hierarchy dimension to have a "dense" representation of a sparse subspace. Unfortunately, currently .sel() and such will cut out MultiIndex dimensions, and I've had to do boolean masking to keep all the dimensions I need.
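
A tiny made-up example of that behavior and the masking workaround (as of the xarray versions discussed here):

```python
import pandas as pd
import xarray as xr

# Made-up data: a "location" dimension indexed by a (country, city) MultiIndex.
idx = pd.MultiIndex.from_product([["US", "CA"], ["a", "b"]], names=["country", "city"])
da = xr.DataArray(range(4), dims="location", coords={"location": idx})

da.sel(country="US")                     # drops the "country" level from the result
da.where(da.country == "US", drop=True)  # boolean mask keeps the full MultiIndex
```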

Multidimensional groupby, especially within the MultiIndex, is a headache as it currently stands. I had to resort to making auxiliary dimensions with one-hot encoded levels (dummy variables) and doing multiply-aggregate operations by hand.
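
For instance, a minimal sketch of that multiply-aggregate trick, with made-up names and data:

```python
import numpy as np
import pandas as pd
import xarray as xr

idx = pd.MultiIndex.from_product([["US", "CA"], ["a", "b"]], names=["country", "city"])
da = xr.DataArray(np.arange(4.0), dims="location", coords={"location": idx})

# One-hot indicator over (location, country_group); multiply and sum over
# "location" to emulate groupby("country").sum() by hand.
groups = ["US", "CA"]
onehot = xr.DataArray(
    (da.country.values[:, None] == np.array(groups)).astype(float),
    dims=("location", "country_group"),
    coords={"country_group": groups},
)
grouped_sum = (da * onehot).sum("location")
```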

xarray is really beautiful and should be used more by data scientists, but it's really difficult to recommend it to colleagues when not all the familiar pandas-style operations are supported.

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Explicit indexes in xarray's data-model (Future of MultiIndex) 262642978
557476617 https://github.com/pydata/xarray/issues/3458#issuecomment-557476617 https://api.github.com/repos/pydata/xarray/issues/3458 MDEyOklzc3VlQ29tbWVudDU1NzQ3NjYxNw== NowanIlfideme 2067093 2019-11-22T10:21:08Z 2019-11-22T10:21:08Z NONE

Note that this doesn't work on MultiIndex levels, since vectorized operations on them are not currently supported. Meanwhile, using sel(multiindex_level_name="a") drops the level from the multiindex entirely. The running theme is that this is dependent on #1603, it seems. :)
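
Roughly, with made-up data (behavior as of the xarray versions discussed in #1603):

```python
import pandas as pd
import xarray as xr

plain = xr.DataArray(range(3), dims="x", coords={"x": ["a", "b", "c"]})
plain.sel(x="a")     # scalar selection: the "x" dimension is dropped
plain.sel(x=["a"])   # list selection keeps a length-1 "x" dimension

idx = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=["letter", "number"])
da = xr.DataArray(range(4), dims="y", coords={"y": idx})
da.sel(letter="a")      # works, but the "letter" level is dropped entirely
# da.sel(letter=["a"])  # vectorized selection on a level is not supported here
```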

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Keep index dimension when selecting only a single coord 514077742

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);