
issues


2 rows where repo = 13221727 and user = 6582745 sorted by updated_at descending

#5115: `to_zarr()` dramatically alters dask graph

id: 851391441 · node_id: MDU6SXNzdWU4NTEzOTE0NDE= · user: JSKenyon (6582745) · state: closed · locked: 0 · comments: 4 · created_at: 2021-04-06T12:50:04Z · updated_at: 2022-04-19T09:09:58Z · closed_at: 2022-04-19T03:46:55Z · author_association: NONE

What happened: The dask graph before a to_zarr() call differs wildly from the dask graph after a to_zarr() call.

What you expected to happen: I would expect to_zarr() to add layers/dependencies to the graph as normal.

Minimal Complete Verifiable Example:

```python
import xarray
import dask.array as da
from pprint import pprint

if __name__ == "__main__":

    arr = da.ones((2,), chunks=(1,)) + 1

    xds = xarray.Dataset({"arr": (("x",), arr)})

    pprint(xds.arr.data.__dask_graph__().layers)
    pprint(xds.arr.data.__dask_graph__().dependencies)

    xds = xds.to_zarr("out.zarr", mode="w", compute=False)

    pprint(xds.arr.data.__dask_graph__().layers)
    pprint(xds.arr.data.__dask_graph__().dependencies)
```

Anything else we need to know?: On my system the above will print the following before the `to_zarr()` call:

```
layers

{'add-1118924c7d3d06d9d07bcca6afde2c7e': Blockwise<(('ones-76dd1e004518465cc97010eea7a88ebc', ('.0',)), (1, None)) -> add-1118924c7d3d06d9d07bcca6afde2c7e>, 'ones-76dd1e004518465cc97010eea7a88ebc': Blockwise<(('blockwise-create-ones-76dd1e004518465cc97010eea7a88ebc', (0,)),) -> ones-76dd1e004518465cc97010eea7a88ebc>}

deps

{'add-1118924c7d3d06d9d07bcca6afde2c7e': {'ones-76dd1e004518465cc97010eea7a88ebc'}, 'ones-76dd1e004518465cc97010eea7a88ebc': set()}
```

and the following after:

```
layers

Delayed('getattr-bf22b6050bac2d8ef0a78589b04365f3')

deps

{139853652717696: set(), 139853652760176: set(), '_finalize_store-faeab92e-4e8d-4155-a915-cbfe8addae8e': {'store-648c67ef-96d5-11eb-ae7e-fc77746741ed'}, 'getattr-84732ba2ce83b0568edc0dad83f2d611': {'getattr-c0220fc5eded903243bac6f4a8067a7b'}, 'getattr-c0220fc5eded903243bac6f4a8067a7b': {'_finalize_store-faeab92e-4e8d-4155-a915-cbfe8addae8e'}}
```

This seems a little strange, as the layers describing `arr` have disappeared. This is problematic when attempting to do any post-processing/optimization/annotation on the graph.
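The printed `dependencies` mapping is a plain DAG of layer names, which is what makes post-processing passes over the graph feasible in the first place. A minimal, xarray-free sketch of walking such a mapping in dependency order (the `toposort_layers` helper and the shortened layer names are illustrative, not part of dask's API):

```python
def toposort_layers(dependencies):
    """Return layer names so that each layer comes after its dependencies.

    `dependencies` maps layer name -> set of layer names it depends on,
    mirroring the HighLevelGraph.dependencies mapping printed above.
    """
    order, seen = [], set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dep in dependencies[name]:
            visit(dep)
        order.append(name)

    for name in dependencies:
        visit(name)
    return order


# Shortened versions of the layer names from the output above.
deps = {
    "add-1118924c": {"ones-76dd1e00"},
    "ones-76dd1e00": set(),
}
print(toposort_layers(deps))  # ['ones-76dd1e00', 'add-1118924c']
```

Once the layers collapse into an opaque `Delayed('getattr-...')`, this kind of pass no longer sees the original `arr` layers, which is the problem the issue describes.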

Environment:

Output of `xr.show_versions()`:

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.8 (default, Feb 20 2021, 21:09:14) [GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 5.3.0-7648-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: None
libnetcdf: None
xarray: 0.17.0
pandas: 1.2.3
numpy: 1.19.5
scipy: 1.6.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.6.1
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2021.03.0+49.gf4132551
distributed: 2021.03.0+29.g3b8b97e3
matplotlib: 3.3.4
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 54.1.2
pip: 21.0.1
conda: None
pytest: 6.2.2
IPython: 7.21.0
sphinx: None
```
reactions:

```json
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5115/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
```
state_reason: completed · repo: xarray (13221727) · type: issue
#4428: Behaviour change in xarray.Dataset.sortby/sel between dask==2.25.0 and dask==2.26.0

id: 702646191 · node_id: MDU6SXNzdWU3MDI2NDYxOTE= · user: JSKenyon (6582745) · state: closed · locked: 0 · comments: 8 · created_at: 2020-09-16T10:26:38Z · updated_at: 2021-07-04T04:12:34Z · closed_at: 2021-07-04T04:12:34Z · author_association: NONE

What happened: A project of mine suddenly broke with `ValueError: Object has inconsistent chunks along dimension row. This can be fixed by calling unify_chunks().`, where it had previously worked.

What you expected to happen: There should have been no change.

Minimal Complete Verifiable Example: This is very difficult to reproduce. I have tried, but it clearly isn't triggered for relatively simple xarray.Datasets. In my code, the Datasets in question are the result of multiple concatenations, selection and chunking operations. What I shall do instead is attempt to demonstrate the change, in the hopes that someone more knowledgeable has some intuition for what has gone wrong.

dask==2.25.0

I have a dataset, `foo`, with a number of different variables, most indexed by row. I will focus on one variable, `FLAG`, to demonstrate the change in behaviour. This is what `FLAG` looks like prior to a `foo.sortby("row")` call. Note that there is only a single chunk (this is intentional).

```
<xarray.DataArray 'FLAG' (row: 40710, chan: 1024, corr: 4)>
dask.array<rechunk-merge, shape=(40710, 1024, 4), dtype=bool, chunksize=(40710, 1024, 4), chunktype=numpy.ndarray>
Coordinates:
  * row      (row) int64 462991 462993 462994 462996 ... 505074 505075 505076
Dimensions without coordinates: chan, corr
```

After the `foo.sortby("row")` call:

```
<xarray.DataArray 'FLAG' (row: 40710, chan: 1024, corr: 4)>
dask.array<getitem, shape=(40710, 1024, 4), dtype=bool, chunksize=(40710, 1024, 4), chunktype=numpy.ndarray>
Coordinates:
  * row      (row) int64 462991 462993 462994 462996 ... 505076 505077 505078
Dimensions without coordinates: chan, corr
```

Note that the chunksize is unchanged.

dask==2.26.0

Repeating exactly the same experiment, prior to the call:

```
<xarray.DataArray 'FLAG' (row: 40710, chan: 1024, corr: 4)>
dask.array<rechunk-merge, shape=(40710, 1024, 4), dtype=bool, chunksize=(40710, 1024, 4), chunktype=numpy.ndarray>
Coordinates:
  * row      (row) int64 462991 462993 462994 462996 ... 505074 505075 505076
Dimensions without coordinates: chan, corr
```

After the `foo.sortby("row")` call:

```
<xarray.DataArray 'FLAG' (row: 40710, chan: 1024, corr: 4)>
dask.array<getitem, shape=(40710, 1024, 4), dtype=bool, chunksize=(20355, 1024, 4), chunktype=numpy.ndarray>
Coordinates:
  * row      (row) int64 462991 462993 462994 462996 ... 505076 505077 505078
Dimensions without coordinates: chan, corr
```

Note the change in the chunksize.

Anything else we need to know?: I have seen similar behaviour when using xarray.Dataset.sel.
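For illustration, the failure mode behind the original `ValueError` can be sketched in plain Python: once `sortby` leaves one variable rechunked to `(20355, 20355)` while another variable sharing the `row` dimension still has a single chunk, the chunks disagree. The `check_consistent_chunks` helper and the second `DATA` variable here are hypothetical, standing in for xarray's internal chunk-consistency check:

```python
def check_consistent_chunks(dim, chunks_per_var):
    """Raise if variables disagree on chunk sizes along the shared `dim`.

    `chunks_per_var` maps variable name -> tuple of chunk sizes along `dim`.
    The error message mirrors the one xarray reports.
    """
    distinct = {tuple(chunks) for chunks in chunks_per_var.values()}
    if len(distinct) > 1:
        raise ValueError(
            f"Object has inconsistent chunks along dimension {dim}. "
            "This can be fixed by calling unify_chunks()."
        )


# Under dask==2.25.0 the sorted FLAG kept its single 40710-row chunk,
# so all variables along `row` still agree:
check_consistent_chunks("row", {"FLAG": (40710,), "DATA": (40710,)})

# Under dask==2.26.0 the sort rechunks FLAG to (20355, 20355); any
# variable still holding one chunk now triggers the error:
try:
    check_consistent_chunks("row", {"FLAG": (20355, 20355), "DATA": (40710,)})
except ValueError as exc:
    print(exc)
```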

Environment:

dask==2.25.0

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.9 (default, Jul 17 2020, 12:50:27) [GCC 8.4.0]
python-bits: 64
OS: Linux
OS-release: 5.3.0-7648-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: None
libnetcdf: None
xarray: 0.15.1
pandas: 1.1.2
numpy: 1.19.2
scipy: 1.5.2
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.4.0
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.25.0
distributed: 2.26.0
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
setuptools: 50.3.0
pip: 20.2.3
conda: None
pytest: 6.0.2
IPython: None
sphinx: None
```

dask==2.26.0

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.9 (default, Jul 17 2020, 12:50:27) [GCC 8.4.0]
python-bits: 64
OS: Linux
OS-release: 5.3.0-7648-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: None
libnetcdf: None
xarray: 0.15.1
pandas: 1.1.2
numpy: 1.19.2
scipy: 1.5.2
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.4.0
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.26.0
distributed: 2.26.0
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
setuptools: 50.3.0
pip: 20.2.3
conda: None
pytest: 6.0.2
IPython: None
sphinx: None
```

reactions:

```json
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4428/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
```
state_reason: completed · repo: xarray (13221727) · type: issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
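The filter described at the top of the page ("2 rows where repo = 13221727 and user = 6582745 sorted by updated_at descending") is a plain query against this schema. A minimal sketch using Python's `sqlite3` with a trimmed-down version of the table (only the columns the query touches; the inserted rows come from the data above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE issues (id INTEGER PRIMARY KEY, number INTEGER, "
    "title TEXT, [user] INTEGER, repo INTEGER, updated_at TEXT)"
)
rows = [
    (851391441, 5115, "`to_zarr()` dramatically alters dask graph",
     6582745, 13221727, "2022-04-19T09:09:58Z"),
    (702646191, 4428, "Behaviour change in xarray.Dataset.sortby/sel "
     "between dask==2.25.0 and dask==2.26.0",
     6582745, 13221727, "2021-07-04T04:12:34Z"),
]
conn.executemany("INSERT INTO issues VALUES (?, ?, ?, ?, ?, ?)", rows)

# ISO-8601 timestamps sort correctly as text, so ORDER BY works directly.
result = conn.execute(
    "SELECT number FROM issues WHERE repo = ? AND [user] = ? "
    "ORDER BY updated_at DESC",
    (13221727, 6582745),
).fetchall()
print(result)  # [(5115,), (4428,)]
```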