issues

6 rows where comments = 3, state_reason = "completed" and user = 35968931 sorted by updated_at descending

Columns: id, node_id, number, title, user, state, locked, assignee, milestone, comments, created_at, updated_at, closed_at, author_association, active_lock_reason, draft, pull_request, body, reactions, performed_via_github_app, state_reason, repo, type

#8660 dtype encoding ignored during IO?
id: 2098882374 · node_id: I_kwDOAMm_X859GmdG · user: TomNicholas (35968931) · state: closed · locked: 0 · comments: 3 · created_at: 2024-01-24T18:50:47Z · updated_at: 2024-02-05T17:35:03Z · closed_at: 2024-02-05T17:35:02Z · author_association: MEMBER

What happened?

When I set the .encoding['dtype'] attribute before saving to disk, the on-disk representation does appear to record the dtype from the encoding, but when I open the data back up in xarray I get the same dtype I had before, not the one specified in the encoding. Is that what's supposed to happen? How does this work? (This happens with both Zarr and netCDF.)

What did you expect to happen?

I expected that setting .encoding['dtype'] would mean that once I open the data back up, it would be in the new dtype that I set in the encoding.

Minimal Complete Verifiable Example

```python
import xarray as xr

air = xr.tutorial.open_dataset('air_temperature')

air['air'].dtype  # returns dtype('float32')

air['air'].encoding['dtype']  # returns dtype('int16'), which already seems weird

air.to_zarr('air.zarr')  # I would assume here that the encoding actually does something during IO

# If I now check the zarr .zarray metadata for the air variable, it says "dtype": "<i2"

air2 = xr.open_dataset('air.zarr', engine='zarr')  # open it back up

air2['air'].dtype  # returns dtype('float32'), but I expected dtype('int16')

# (the same thing happens when saving to netCDF instead of Zarr)
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
  • [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

No response

Anything else we need to know?

I know I didn't explicitly cast with .astype, but I'm still confused about what the relationship between the dtype and the encoding is supposed to be here.

I am probably just misunderstanding how this is supposed to work, but then this is arguably a docs issue, because here it says "[the encoding dtype field] controls the type of the data written on disk", which I would have thought would also affect the data you get back when you open it up again?
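For what it's worth, here is a minimal sketch of one way to inspect what actually got written, assuming decode_cf=False skips the on-disk-to-in-memory conversion so that any scale_factor / add_offset attributes stored alongside the int16 values become visible:

```python
import xarray as xr

air = xr.tutorial.open_dataset('air_temperature')
air.to_zarr('air.zarr', mode='w')

# Open without CF decoding to see the raw on-disk representation.
raw = xr.open_dataset('air.zarr', engine='zarr', decode_cf=False)
print(raw['air'].dtype)   # int16 here would match encoding['dtype']
print(raw['air'].attrs)   # scale_factor / add_offset, if present, would explain the float round-trip

# Default decoding applies any scale/offset, which is presumably why I get float32 back.
decoded = xr.open_dataset('air.zarr', engine='zarr')
print(decoded['air'].dtype)
```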

Environment

main branch of xarray

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8660/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue
#4243 Manually drop DataArray from memory?
id: 663235664 · node_id: MDU6SXNzdWU2NjMyMzU2NjQ= · user: TomNicholas (35968931) · state: closed · locked: 0 · comments: 3 · created_at: 2020-07-21T18:54:40Z · updated_at: 2023-09-12T16:17:12Z · closed_at: 2023-09-12T16:17:12Z · author_association: MEMBER

Is it possible to deliberately drop data associated with a particular DataArray from memory?

Obviously da.close() exists, but what happens if you do, for example:

```python
ds = open_dataset(file)
da = ds[var]
da.compute()   # something that loads da into memory
da.close()     # is the memory freed up again now?
ds.something() # what about now?
```

Also does calling python's built-in garbage collector (i.e. gc.collect()) do anything in this instance?

The context of this question is that I'm trying to resave some massive variables (~65GB each) that were loaded from thousands of files into just a few files for each variable. I would love to use @rabernat 's new rechunker package but I'm not sure how easily I can convert my current netCDF data to Zarr, and I'm interested in this question no matter how I end up solving the problem.

I don't currently have a particularly good understanding of file I/O and memory management in xarray, but would like to improve it. Can anyone recommend a tool I can use to answer this kind of question myself on my own machine? I suppose it would need to be able to tell me the current memory usage of specific objects, not just the total memory usage.

(@johnomotani you might be interested)
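A rough sketch of the kind of measurement I have in mind, assuming psutil is available and using the tutorial dataset as a stand-in for my real files (process RSS is only a crude proxy for per-object memory, alongside DataArray.nbytes):

```python
import gc
import os

import psutil
import xarray as xr

proc = psutil.Process(os.getpid())

def rss_mb():
    """Resident set size of this process, in MB."""
    return proc.memory_info().rss / 1e6

ds = xr.tutorial.open_dataset('air_temperature')
da = ds['air']

print(rss_mb())                   # before loading
da.load()                         # force the data into memory
print(rss_mb(), da.nbytes / 1e6)  # after loading, plus the array's own size

del da                            # drop the reference
ds.close()
gc.collect()                      # ask the garbage collector to run
print(rss_mb())                   # did the memory actually go down?
```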

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4243/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue
#7996 Stable docs build not showing latest changes after release
id: 1807782455 · node_id: I_kwDOAMm_X85rwJI3 · user: TomNicholas (35968931) · state: closed · locked: 0 · comments: 3 · created_at: 2023-07-17T13:24:58Z · updated_at: 2023-07-17T20:48:19Z · closed_at: 2023-07-17T20:48:19Z · author_association: MEMBER

What happened?

I released xarray version v2023.07.0 last night, but I'm not seeing the documentation changes reflected in the https://docs.xarray.dev/en/stable/ build. (In particular, the Internals section should now have an entire extra page on wrapping chunked arrays.) I can, however, see the newest additions in the https://docs.xarray.dev/en/latest/ build. Is that how it's supposed to work?

What did you expect to happen?

No response

Minimal Complete Verifiable Example

No response

MVCE confirmation

  • [ ] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [ ] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [ ] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [ ] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

No response

Environment

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7996/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue
#3168 apply_ufunc erroneously operating on an empty array when dask used
id: 474247717 · node_id: MDU6SXNzdWU0NzQyNDc3MTc= · user: TomNicholas (35968931) · state: closed · locked: 0 · comments: 3 · created_at: 2019-07-29T20:44:23Z · updated_at: 2020-03-30T15:08:16Z · closed_at: 2020-03-30T15:08:15Z · author_association: MEMBER

Problem description

apply_ufunc with dask='parallelized' appears to be trying to act on an empty numpy array when the computation is specified, but before .compute() is called. In other words, a ufunc which just prints the shape of its argument will print (0,0) then print the correct shape once .compute() is called.

Minimum working example

```python
import numpy as np
import xarray as xr

def example_ufunc(x):
    print(x.shape)
    return np.mean(x, axis=-1)

def new_mean(da, dim):
    result = xr.apply_ufunc(example_ufunc, da, input_core_dims=[[dim]],
                            dask='parallelized', output_dtypes=[da.dtype])
    return result

shape = {'t': 2, 'x': 3}
data = xr.DataArray(data=np.random.rand(*shape.values()), dims=shape.keys())
unchunked = data
chunked = data.chunk(shape)

actual = new_mean(chunked, dim='x')  # raises the warning
print(actual)

print(actual.compute())  # does the computation correctly
```

Result

(0, 0)
/home/tnichol/anaconda3/envs/py36/lib/python3.6/site-packages/numpy/core/fromnumeric.py:3118: RuntimeWarning: Mean of empty slice.
  out=out, **kwargs)
<xarray.DataArray (t: 2)>
dask.array<shape=(2,), dtype=float64, chunksize=(2,)>
Dimensions without coordinates: t
(2, 3)
<xarray.DataArray (t: 2)>
array([0.147205, 0.402913])
Dimensions without coordinates: t

Expected result

Same thing without the (0,0) or the numpy warning.
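If the stray (0, 0) call really is dask probing the function with an empty array to infer output metadata (which is my guess, not something I've confirmed), a sketch of a possible guard inside the ufunc would be:

```python
import numpy as np

def example_ufunc(x):
    if x.size == 0:
        # Probably a metadata-inference call: return an empty result of the
        # expected shape/dtype instead of running (and warning on) np.mean.
        return np.empty(x.shape[:-1], dtype=x.dtype)
    print(x.shape)
    return np.mean(x, axis=-1)
```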

Output of xr.show_versions()

(my xarray is up-to-date with master)

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.6 |Anaconda, Inc.| (default, Oct 9 2018, 12:34:16) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-862.14.4.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8
libhdf5: 1.10.2
libnetcdf: 4.6.1
xarray: 0.12.3+23.g1d7bcbd
pandas: 0.24.2
numpy: 1.16.4
scipy: 1.3.0
netCDF4: 1.4.2
pydap: None
h5netcdf: None
h5py: 2.8.0
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 2.1.0
distributed: 2.1.0
matplotlib: 3.1.0
cartopy: None
seaborn: 0.9.0
numbagg: None
setuptools: 40.6.2
pip: 18.1
conda: None
pytest: 4.0.0
IPython: 7.1.1
sphinx: 1.8.2
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3168/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue
#3334 plot.line fails when plot axis is a 1D coordinate
id: 497184021 · node_id: MDU6SXNzdWU0OTcxODQwMjE= · user: TomNicholas (35968931) · state: closed · locked: 0 · comments: 3 · created_at: 2019-09-23T15:52:48Z · updated_at: 2019-09-26T08:51:59Z · closed_at: 2019-09-26T08:51:59Z · author_association: MEMBER

MCVE Code Sample

```python
import numpy as np
import xarray as xr

x_coord = xr.DataArray(data=[0.1, 0.2], dims=['x'])
t_coord = xr.DataArray(data=[10, 20], dims=['t'])

da = xr.DataArray(data=np.array([[0, 1], [5, 9]]),
                  dims=['x', 't'],
                  coords={'x': x_coord, 'time': t_coord})
print(da)

da.transpose('time', 'x')
```

Output:

<xarray.DataArray (x: 2, t: 2)>
array([[0, 1],
       [5, 9]])
Coordinates:
  * x        (x) float64 0.1 0.2
    time     (t) int64 10 20

Traceback (most recent call last):
  File "mwe.py", line 22, in <module>
    da.transpose('time', 'x')
  File "/home/tegn500/Documents/Work/Code/xarray/xarray/core/dataarray.py", line 1877, in transpose
    "permuted array dimensions (%s)" % (dims, tuple(self.dims))
ValueError: arguments to transpose (('time', 'x')) must be permuted array dimensions (('x', 't'))

As 'time' is a coordinate with only one dimension, this is an unambiguous operation that I want to perform. However, because .transpose() currently only accepts dimensions, this fails with that error.

This causes bugs in other parts of the code. For example, I found this by trying to plot this type of DataArray with da.plot(x='time', hue='x'), which gives the same error.

(You can get a similar error also with da.plot(y='time', hue='x').)

If the code which explicitly checks that the arguments to transpose are dims and not just coordinate dimensions is removed, then both of these examples work as expected.

I would like to generalise the transpose function to also accept dimension coordinates; is there any reason not to do this?
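In the meantime, a sketch of a workaround under the current behaviour, assuming swap_dims works here as documented: promote the 1D 'time' coordinate to a dimension first, then transpose or plot as usual.

```python
import numpy as np
import matplotlib.pyplot as plt
import xarray as xr

x_coord = xr.DataArray(data=[0.1, 0.2], dims=['x'])
t_coord = xr.DataArray(data=[10, 20], dims=['t'])
da = xr.DataArray(data=np.array([[0, 1], [5, 9]]),
                  dims=['x', 't'],
                  coords={'x': x_coord, 'time': t_coord})

# Replace the 't' dimension with the 'time' coordinate along it.
swapped = da.swap_dims({'t': 'time'})

print(swapped.transpose('time', 'x'))
swapped.plot(x='time', hue='x')
plt.show()
```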

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3334/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue
#2725 Line plot with x=coord putting wrong variables on axes
id: 404383025 · node_id: MDU6SXNzdWU0MDQzODMwMjU= · user: TomNicholas (35968931) · state: closed · locked: 0 · comments: 3 · created_at: 2019-01-29T16:43:18Z · updated_at: 2019-01-30T02:02:22Z · closed_at: 2019-01-30T02:02:22Z · author_association: MEMBER

When I try to plot the values in a 1D DataArray against the values in one of its coordinates, it does not behave at all as expected:

```python
import numpy as np
import matplotlib.pyplot as plt
from xarray import DataArray

current = DataArray(name='current',
                    data=np.array([5, 8, 14, 22, 30]),
                    dims=['time'],
                    coords={'time': (['time'], np.array([0.1, 0.2, 0.3, 0.4, 0.5])),
                            'voltage': (['time'], np.array([100, 200, 300, 400, 500]))})

print(current)

# Try to plot current against voltage
current.plot.line(x='voltage')
plt.show()
```

Output:

<xarray.DataArray 'current' (time: 5)>
array([ 5,  8, 14, 22, 30])
Coordinates:
  * time     (time) float64 0.1 0.2 0.3 0.4 0.5
    voltage  (time) int64 100 200 300 400 500

Problem description

Not only is 'voltage' not on the x axis, but 'current' isn't on the y axis either!

Expected Output

Based on the documentation (and common sense) I would have expected it to plot voltage on the x axis and current on the y axis.

(using a branch of xarray which is up-to-date with master)
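For comparison, a matplotlib-only sketch of the plot I expected (this bypasses xarray's plotting entirely, so it is just a reference point rather than the intended API):

```python
import numpy as np
import matplotlib.pyplot as plt
from xarray import DataArray

current = DataArray(name='current',
                    data=np.array([5, 8, 14, 22, 30]),
                    dims=['time'],
                    coords={'time': (['time'], np.array([0.1, 0.2, 0.3, 0.4, 0.5])),
                            'voltage': (['time'], np.array([100, 200, 300, 400, 500]))})

# Expected result of current.plot.line(x='voltage'):
# voltage on the x axis, the DataArray's values (current) on the y axis.
plt.plot(current['voltage'].values, current.values, marker='o')
plt.xlabel('voltage')
plt.ylabel('current')
plt.show()
```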

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2725/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
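The filter behind this page ("comments = 3, state_reason = 'completed' and user = 35968931, sorted by updated_at descending") can be reproduced against this schema with Python's sqlite3 module. A sketch, assuming the underlying database file is named github.db (the filename is an assumption, not something shown on this page):

```python
import sqlite3

# Assumed filename for the database behind this Datasette instance.
conn = sqlite3.connect("github.db")

query = """
    SELECT number, title, updated_at
    FROM issues
    WHERE comments = 3
      AND state_reason = 'completed'
      AND "user" = 35968931
    ORDER BY updated_at DESC;
"""

for number, title, updated_at in conn.execute(query):
    print(number, title, updated_at)

conn.close()
```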