{"database": "github", "table": "issues", "is_view": false, "human_description_en": "where \"updated_at\" is on date 2020-12-24 and user = 1312546 sorted by updated_at descending", "rows": [[770937642, "MDU6SXNzdWU3NzA5Mzc2NDI=", 4708, "Potentially spurious warning in rechunk", 1312546, "closed", 0, null, null, 0, "2020-12-18T14:37:32Z", "2020-12-24T11:32:43Z", "2020-12-24T11:32:43Z", "MEMBER", null, null, null, "**What happened**:\r\n\r\nWhen reading an zarr dataset where the last chunk is smaller than the chunk size, users see a `UserWarning` that this may be inefficient, since the chunking differs from the chunking on disk. In general that's a good warning, but it shouldn't appear when the only difference between the on-disk chunking and the Dataset chunking is the last chunk.\r\n\r\n**What you expected to happen**:\r\n\r\nNo warning.\r\n\r\n**Minimal Complete Verifiable Example**:\r\n\r\n```python\r\n# Create and write the data\r\nimport numpy as np\r\nimport pandas as pd\r\nimport xarray as xr\r\n\r\nnp.random.seed(0)\r\ntemperature = 15 + 8 * np.random.randn(2, 2, 3)\r\nprecipitation = 10 * np.random.rand(2, 2, 3)\r\nlon = [[-99.83, -99.32], [-99.79, -99.23]]\r\nlat = [[42.25, 42.21], [42.63, 42.59]]\r\ntime = pd.date_range(\"2014-09-06\", periods=3)\r\nreference_time = pd.Timestamp(\"2014-09-05\")\r\nds = xr.Dataset(\r\n    data_vars=dict(\r\n        temperature=([\"x\", \"y\", \"time\"], temperature),\r\n        precipitation=([\"x\", \"y\", \"time\"], precipitation),\r\n    ),\r\n    coords=dict(\r\n        lon=([\"x\", \"y\"], lon),\r\n        lat=([\"x\", \"y\"], lat),\r\n        time=time,\r\n        reference_time=reference_time,\r\n    ),\r\n    attrs=dict(description=\"Weather related data.\"),\r\n)\r\nds2 = ds.chunk(chunks=dict(time=(2, 1)))\r\nds2['temperature'].chunks\r\n\r\nds2.to_zarr(\"/tmp/test.zarr\", mode=\"w\")\r\n```\r\n\r\nReading it produces a warning\r\n\r\n```python\r\nxr.open_zarr(\"/tmp/test.zarr\")\r\n/mnt/c/Users/taugspurger/src/xarray/xarray/core/dataset.py:408: UserWarning: Specified Dask chunks (2, 1) would separate on disks chunk shape 2 for dimension time. This could degrade performance. Consider rechunking after loading instead.\r\n  _check_chunks_compatibility(var, output_chunks, preferred_chunks)\r\n```\r\n\r\n**Anything else we need to know?**:\r\n\r\nThe check around https://github.com/pydata/xarray/blob/91318d2ee63149669404489be9198f230d877642/xarray/core/dataset.py#L371-L378 should probably ignore the very last chunk, since Zarr allows it to be different?\r\n\r\n**Environment**:\r\n\r\n<details><summary>Output of <tt>xr.show_versions()</tt></summary>\r\n\r\nINSTALLED VERSIONS\r\n------------------\r\ncommit: None\r\npython: 3.8.6 | packaged by conda-forge | (default, Oct  7 2020, 19:08:05) \r\n[GCC 7.5.0]\r\npython-bits: 64\r\nOS: Linux\r\nOS-release: 4.19.128-microsoft-standard\r\nmachine: x86_64\r\nprocessor: x86_64\r\nbyteorder: little\r\nLC_ALL: None\r\nLANG: C.UTF-8\r\nLOCALE: en_US.UTF-8\r\nlibhdf5: None\r\nlibnetcdf: None\r\n\r\nxarray: 0.16.3.dev21+g96e1aea0\r\npandas: 1.1.4\r\nnumpy: 1.19.4\r\nscipy: 1.5.4\r\nnetCDF4: None\r\npydap: None\r\nh5netcdf: None\r\nh5py: None\r\nNio: None\r\nzarr: 2.6.2.dev9+dirty\r\ncftime: 1.3.0\r\nnc_time_axis: None\r\nPseudoNetCDF: None\r\nrasterio: None\r\ncfgrib: None\r\niris: None\r\nbottleneck: None\r\ndask: 2.30.0\r\ndistributed: None\r\nmatplotlib: None\r\ncartopy: None\r\nseaborn: None\r\nnumbagg: None\r\npint: None\r\nsetuptools: 49.6.0.post20201009\r\npip: 20.2.4\r\nconda: None\r\npytest: 5.4.3\r\nIPython: 7.19.0\r\nsphinx: None\r\n\r\n\r\n</details>\r\n", "{\"url\": \"https://api.github.com/repos/pydata/xarray/issues/4708/reactions\", \"total_count\": 0, \"+1\": 0, \"-1\": 0, \"laugh\": 0, \"hooray\": 0, \"confused\": 0, \"heart\": 0, \"rocket\": 0, \"eyes\": 0}", null, "completed", 13221727, "issue"]], "truncated": false, "filtered_table_rows_count": 1, "expanded_columns": [], "expandable_columns": [[{"column": "repo", "other_table": "repos", "other_column": "id"}, "name"], [{"column": "milestone", "other_table": "milestones", "other_column": "id"}, "title"], [{"column": "assignee", "other_table": "users", "other_column": "id"}, "login"], [{"column": "user", "other_table": "users", "other_column": "id"}, "login"]], "columns": ["id", "node_id", "number", "title", "user", "state", "locked", "assignee", "milestone", "comments", "created_at", "updated_at", "closed_at", "author_association", "active_lock_reason", "draft", "pull_request", "body", "reactions", "performed_via_github_app", "state_reason", "repo", "type"], "primary_keys": ["id"], "units": {}, "query": {"sql": "select id, node_id, number, title, user, state, locked, assignee, milestone, comments, created_at, updated_at, closed_at, author_association, active_lock_reason, draft, pull_request, body, reactions, performed_via_github_app, state_reason, repo, type from issues where date(\"updated_at\") = :p0 and \"user\" = :p1 order by updated_at desc limit 101", "params": {"p0": "2020-12-24", "p1": "1312546"}}, "facet_results": {"state": {"name": "state", "type": "column", "hideable": false, "toggle_url": "/github/issues.json?updated_at__date=2020-12-24&user=1312546", "results": [{"value": "closed", "label": "closed", "count": 1, "toggle_url": "http://xarray-datasette.fly.dev/github/issues.json?updated_at__date=2020-12-24&user=1312546&state=closed", "selected": false}], "truncated": false}, "repo": {"name": "repo", "type": "column", "hideable": false, "toggle_url": "/github/issues.json?updated_at__date=2020-12-24&user=1312546", "results": [{"value": 13221727, "label": "xarray", "count": 1, "toggle_url": "http://xarray-datasette.fly.dev/github/issues.json?updated_at__date=2020-12-24&user=1312546&repo=13221727", "selected": false}], "truncated": false}, "type": {"name": "type", "type": "column", "hideable": false, "toggle_url": "/github/issues.json?updated_at__date=2020-12-24&user=1312546", "results": [{"value": "issue", "label": "issue", "count": 1, "toggle_url": "http://xarray-datasette.fly.dev/github/issues.json?updated_at__date=2020-12-24&user=1312546&type=issue", "selected": false}], "truncated": false}}, "suggested_facets": [{"name": "created_at", "type": "date", "toggle_url": "http://xarray-datasette.fly.dev/github/issues.json?updated_at__date=2020-12-24&user=1312546&_facet_date=created_at"}, {"name": "updated_at", "type": "date", "toggle_url": "http://xarray-datasette.fly.dev/github/issues.json?updated_at__date=2020-12-24&user=1312546&_facet_date=updated_at"}, {"name": "closed_at", "type": "date", "toggle_url": "http://xarray-datasette.fly.dev/github/issues.json?updated_at__date=2020-12-24&user=1312546&_facet_date=closed_at"}], "next": null, "next_url": null, "private": false, "allow_execute_sql": true, "query_ms": 22.767626214772463}