

issue_comments


5 rows where issue = 187069161 and user = 2448579 sorted by updated_at descending



id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1270514913 https://github.com/pydata/xarray/issues/1077#issuecomment-1270514913 https://api.github.com/repos/pydata/xarray/issues/1077 IC_kwDOAMm_X85LuoTh dcherian 2448579 2022-10-06T18:31:51Z 2022-10-06T18:31:51Z MEMBER

Thanks @lucianopaz! I fixed some errors when I added it to cf-xarray. It would be good to see if that version works for you.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  MultiIndex serialization to NetCDF 187069161
1101505074 https://github.com/pydata/xarray/issues/1077#issuecomment-1101505074 https://api.github.com/repos/pydata/xarray/issues/1077 IC_kwDOAMm_X85Bp6Iy dcherian 2448579 2022-04-18T15:36:19Z 2022-04-18T15:36:19Z MEMBER

I added the "compression by gathering" scheme to cf-xarray:

1. https://cf-xarray.readthedocs.io/en/latest/generated/cf_xarray.encode_multi_index_as_compress.html
2. https://cf-xarray.readthedocs.io/en/latest/generated/cf_xarray.decode_compress_to_multi_index.html

{
    "total_count": 2,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 2,
    "rocket": 0,
    "eyes": 0
}
  MultiIndex serialization to NetCDF 187069161
645416425 https://github.com/pydata/xarray/issues/1077#issuecomment-645416425 https://api.github.com/repos/pydata/xarray/issues/1077 MDEyOklzc3VlQ29tbWVudDY0NTQxNjQyNQ== dcherian 2448579 2020-06-17T14:40:19Z 2020-06-17T14:40:19Z MEMBER

@shoyer I now understand your earlier comment.

I agree that it should work with both sparse and MultiIndex, but as written there's no way to decide whether the data should be decoded to a sparse array or to a MultiIndexed dense array.

Following your comment in https://github.com/pydata/xarray/issues/3213#issuecomment-521533999

> Fortunately, there does seem to be a CF convention that would be a good fit for sparse data in COO format, namely the indexed ragged array representation (example, note the instance_dimension attribute). That's probably the right thing to use for sparse arrays in xarray.

How about using this "compression by gathering" idea for MultiIndexed dense arrays and "indexed ragged arrays" for sparse arrays? I do not know the internals of sparse or the details of the CF conventions well enough to have a strong opinion on which representation to prefer for sparse.COO arrays.

PS: CF convention for "indexed ragged arrays" is here: http://cfconventions.org/cf-conventions/cf-conventions.html#_indexed_ragged_array_representation
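For a concrete picture of that representation, here is a minimal numpy-only sketch of the indexed ragged array idea (variable names are illustrative, not taken from the CF text): each element stores an index into the instance dimension, which is what the `instance_dimension` attribute points at.

```python
import numpy as np

# Hypothetical ragged data: three instances (e.g. profiles) of unequal length.
profiles = [np.array([1.0, 2.0]), np.array([3.0]), np.array([4.0, 5.0, 6.0])]

# Indexed ragged representation: a flat value array plus a per-element
# instance index (the variable an instance_dimension attribute would name).
values = np.concatenate(profiles)
instance_index = np.concatenate(
    [np.full(len(p), i) for i, p in enumerate(profiles)]
)

# Any instance can be recovered by selecting on the index variable.
recovered = [values[instance_index == i] for i in range(len(profiles))]
```

This is the same flat-plus-index layout a sparse COO array would need: values plus integer coordinates.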

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  MultiIndex serialization to NetCDF 187069161
644803374 https://github.com/pydata/xarray/issues/1077#issuecomment-644803374 https://api.github.com/repos/pydata/xarray/issues/1077 MDEyOklzc3VlQ29tbWVudDY0NDgwMzM3NA== dcherian 2448579 2020-06-16T14:31:23Z 2020-06-16T14:31:23Z MEMBER

I may be missing something but @fujiisoup's concern is addressed by the scheme in the CF conventions.

> In your encoded, how can we tell the MultiIndex is [('a', 1), ('b', 1), ('a', 2), ('b', 2)] or [('a', 1), ('a', 2), ('b', 1), ('b', 2)]?

The information about ordering is stored as 1D indices into an ND array, constructed using `np.ravel_multi_index` in the `encode_multiindex` function:

```python
encoded[idxname] = np.ravel_multi_index(ds.indexes[idxname].codes, shape)
```
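A quick pandas/numpy check of this step: the two orderings from the question above produce different raveled index arrays, so the ordering survives encoding.

```python
import numpy as np
import pandas as pd

shape = (2, 2)  # (number of lat levels, number of lon levels)

mi1 = pd.MultiIndex.from_tuples([("a", 1), ("b", 1), ("a", 2), ("b", 2)])
mi2 = pd.MultiIndex.from_tuples([("a", 1), ("a", 2), ("b", 1), ("b", 2)])

# Same levels, different element order -> different 1D index arrays.
idx1 = np.ravel_multi_index(mi1.codes, shape)  # [0, 2, 1, 3]
idx2 = np.ravel_multi_index(mi2.codes, shape)  # [0, 1, 2, 3]
```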

For example, see the dimension coordinate `landpoint` in the encoded form:

```
ds3
<xarray.Dataset>
Dimensions:    (landpoint: 4)
Coordinates:
  * landpoint  (landpoint) MultiIndex
  - lat        (landpoint) object 'a' 'b' 'b' 'a'
  - lon        (landpoint) int64 1 2 1 2
Data variables:
    landsoilt  (landpoint) float64 -0.2699 -1.228 0.4632 0.2287

encode_multiindex(ds3, "landpoint")
<xarray.Dataset>
Dimensions:    (landpoint: 4, lat: 2, lon: 2)
Coordinates:
  * lat        (lat) object 'a' 'b'
  * lon        (lon) int64 1 2
  * landpoint  (landpoint) int64 0 3 2 1
Data variables:
    landsoilt  (landpoint) float64 -0.2699 -1.228 0.4632 0.2287
```

Here is a cleaned-up version of the code for easy testing:

```python
import numpy as np
import pandas as pd
import xarray as xr


def encode_multiindex(ds, idxname):
    encoded = ds.reset_index(idxname)
    coords = dict(zip(ds.indexes[idxname].names, ds.indexes[idxname].levels))
    for coord in coords:
        encoded[coord] = coords[coord].values
    shape = [encoded.sizes[coord] for coord in coords]
    encoded[idxname] = np.ravel_multi_index(ds.indexes[idxname].codes, shape)
    encoded[idxname].attrs["compress"] = " ".join(ds.indexes[idxname].names)
    return encoded


def decode_to_multiindex(encoded, idxname):
    names = encoded[idxname].attrs["compress"].split(" ")
    shape = [encoded.sizes[dim] for dim in names]
    indices = np.unravel_index(encoded[idxname].values, shape)
    arrays = [encoded[dim].values[index] for dim, index in zip(names, indices)]
    mindex = pd.MultiIndex.from_arrays(arrays)

    decoded = xr.Dataset({}, {idxname: mindex})
    for varname in encoded.data_vars:
        if idxname in encoded[varname].dims:
            decoded[varname] = (idxname, encoded[varname].values)
    return decoded


ds1 = xr.Dataset(
    {"landsoilt": ("landpoint", np.random.randn(4))},
    {
        "landpoint": pd.MultiIndex.from_product(
            [["a", "b"], [1, 2]], names=("lat", "lon")
        )
    },
)

ds2 = xr.Dataset(
    {"landsoilt": ("landpoint", np.random.randn(4))},
    {
        "landpoint": pd.MultiIndex.from_arrays(
            [["a", "b", "c", "d"], [1, 2, 4, 10]], names=("lat", "lon")
        )
    },
)

ds3 = xr.Dataset(
    {"landsoilt": ("landpoint", np.random.randn(4))},
    {
        "landpoint": pd.MultiIndex.from_arrays(
            [["a", "b", "b", "a"], [1, 2, 1, 2]], names=("lat", "lon")
        )
    },
)

idxname = "landpoint"
for dataset in [ds1, ds2, ds3]:
    xr.testing.assert_identical(
        decode_to_multiindex(encode_multiindex(dataset, idxname), idxname), dataset
    )
```
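The ravel/unravel core of this scheme can be exercised without xarray at all. A stripped-down sketch of the same roundtrip (the helper names `encode_codes` and `decode_codes` are mine, not from the code above), using only pandas and numpy:

```python
import numpy as np
import pandas as pd


def encode_codes(mindex):
    """Compress a MultiIndex to 1D integer indices plus its levels."""
    shape = [len(level) for level in mindex.levels]
    return np.ravel_multi_index(mindex.codes, shape), list(mindex.levels)


def decode_codes(indices, levels):
    """Invert encode_codes: unravel the indices and rebuild the MultiIndex."""
    shape = [len(level) for level in levels]
    codes = np.unravel_index(indices, shape)
    arrays = [np.asarray(level)[code] for level, code in zip(levels, codes)]
    return pd.MultiIndex.from_arrays(arrays)


# Same index as ds3 above; note the encoded indices come out as 0 3 2 1.
mindex = pd.MultiIndex.from_arrays(
    [["a", "b", "b", "a"], [1, 2, 1, 2]], names=("lat", "lon")
)
indices, levels = encode_codes(mindex)
roundtripped = decode_codes(indices, levels)
```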

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  MultiIndex serialization to NetCDF 187069161
644442679 https://github.com/pydata/xarray/issues/1077#issuecomment-644442679 https://api.github.com/repos/pydata/xarray/issues/1077 MDEyOklzc3VlQ29tbWVudDY0NDQ0MjY3OQ== dcherian 2448579 2020-06-15T23:29:11Z 2020-06-15T23:38:30Z MEMBER

This seems to be possible following http://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html#compression-by-gathering

Here is a quick proof of concept:

```python
import numpy as np
import pandas as pd
import xarray as xr

# example 1
ds = xr.Dataset(
    {"landsoilt": ("landpoint", np.random.randn(4))},
    {
        "landpoint": pd.MultiIndex.from_product(
            [["a", "b"], [1, 2]], names=("lat", "lon")
        )
    },
)

# example 2 (swap in for a non-product MultiIndex)
# ds = xr.Dataset(
#     {"landsoilt": ("landpoint", np.random.randn(4))},
#     {
#         "landpoint": pd.MultiIndex.from_arrays(
#             [["a", "b", "c", "d"], [1, 2, 4, 10]], names=("lat", "lon")
#         )
#     },
# )

# encode step
# detect using isinstance(index, pd.MultiIndex)
idxname = "landpoint"
encoded = ds.reset_index(idxname)
coords = dict(zip(ds.indexes[idxname].names, ds.indexes[idxname].levels))
for coord in coords:
    encoded[coord] = coords[coord].values
shape = [encoded.sizes[coord] for coord in coords]
encoded[idxname] = np.ravel_multi_index(ds.indexes[idxname].codes, shape)
encoded[idxname].attrs["compress"] = " ".join(ds.indexes[idxname].names)

# decode step
# detect using "compress" in var.attrs
idxname = "landpoint"
names = encoded[idxname].attrs["compress"].split(" ")
shape = [encoded.sizes[dim] for dim in names]
indices = np.unravel_index(encoded.landpoint.values, shape)
arrays = [encoded[dim].values[index] for dim, index in zip(names, indices)]
mindex = pd.MultiIndex.from_arrays(arrays)

decoded = xr.Dataset({}, {idxname: mindex})
decoded["landsoilt"] = (idxname, encoded["landsoilt"].values)

xr.testing.assert_identical(decoded, ds)
```

`encoded` can be serialized using our existing code:

```
<xarray.Dataset>
Dimensions:    (landpoint: 4, lat: 2, lon: 2)
Coordinates:
  * lat        (lat) object 'a' 'b'
  * lon        (lon) int64 1 2
  * landpoint  (landpoint) int64 0 1 2 3
Data variables:
    landsoilt  (landpoint) float64 -1.668 -1.003 1.084 1.963
```
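As an aside on why `landpoint` comes out as `0 1 2 3` here: `pd.MultiIndex.from_product` enumerates level combinations in row-major order, so the raveled codes are already sorted. A small standalone check:

```python
import numpy as np
import pandas as pd

# from_product yields (a,1), (a,2), (b,1), (b,2): row-major over the levels,
# so raveling its codes over a (2, 2) grid gives consecutive integers.
mindex = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=("lat", "lon"))
indices = np.ravel_multi_index(mindex.codes, (2, 2))  # [0, 1, 2, 3]
```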

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  MultiIndex serialization to NetCDF 187069161

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 168.808ms · About: xarray-datasette