issues

4 rows where type = "issue" and user = 1797906 sorted by updated_at descending

issue #7005: Cannot re-index or align objects with conflicting indexes
opened by jamesstidard (1797906) · state: open · comments: 2 · created: 2022-09-07T16:22:46Z · updated: 2022-09-09T16:04:05Z · author_association: NONE · repo: xarray (13221727) · id: 1364911775 · node_id: I_kwDOAMm_X85RWuaf

What happened?

I'm looking to rename the values of the indices of an existing dataset, for both regular and multi-indexes, i.e. you might start with a dataset with an index [1, 2, 3] and want to rename those values to ["foo", "bar", "baz"].

I can rename a couple of coordinates using the function I've written, but renaming a second multi-index in the same xr.Dataset raises a ValueError that I am struggling to interpret:

cannot re-index or align objects with conflicting indexes
found for the following dimensions: 'x' (2 conflicting indexes)
Conflicting indexes may occur when
- they relate to different sets of coordinate and/or dimension names
- they don't have the same type
- they may be used to reindex data along common dimensions

What did you expect to happen?

I start with this xr.DataArray:

<xarray.DataArray (x: 6, y: 6, z: 3)>
array(...)
Coordinates:
  * x        (x) object MultiIndex
  * x_one    (x) object 'a' 'a' 'b' 'b' 'c' 'c'
  * x_two    (x) int64 0 1 0 1 0 1
  * y        (y) object MultiIndex
  * y_one    (y) object 'a' 'a' 'b' 'b' 'c' 'c'
  * y_two    (y) int64 0 1 0 1 0 1
  * z        (z) int64 0 1 2

And remap the z, x_one, and y_one values to:

<xarray.DataArray (x: 6, y: 6, z: 3)>
array(...)
Coordinates:
  * x        (x) object MultiIndex
  * x_one    (x) object 'aa' 'aa' 'bb' 'bb' 'cc' 'cc'
  * x_two    (x) int64 0 1 0 1 0 1
  * y        (y) object MultiIndex
  * y_one    (y) object 'aa' 'aa' 'bb' 'bb' 'cc' 'cc'
  * y_two    (y) int64 0 1 0 1 0 1
  * z        (z) <U4 'zero' 'one' 'two'

Minimal Complete Verifiable Example

```python
import numpy as np
import pandas as pd
import xarray as xr


def map_coords(ds, *, name, mapping):
    """
    Takes an xarray dataset's coordinate values and updates them
    with the provided mapping.

    Can handle both regular indices and multi-level indices.

    ds: the dataset
    name: name of the coordinate to update
    mapping: dictionary mapping old values to new values.
    """
    coord = ds.coords[name]
    old_values = coord.values.tolist()
    new_values = [mapping[v] for v in old_values]
    ds.coords[name] = xr.DataArray(new_values, coords=coord.coords)
    ds.coords[name].attrs = dict(coord.attrs)


midx = pd.MultiIndex.from_product([list("abc"), [0, 1]], names=("x_one", "x_two"))
midy = pd.MultiIndex.from_product([list("abc"), [0, 1]], names=("y_one", "y_two"))

mda = xr.DataArray(np.random.rand(6, 6, 3), [("x", midx), ("y", midy), ("z", range(3))])

map_coords(mda, name="z", mapping={0: "zero", 1: "one", 2: "two"})        # success
map_coords(mda, name="x_one", mapping={"a": "aa", "b": "bb", "c": "cc"})  # success
map_coords(mda, name="y_one", mapping={"a": "aa", "b": "bb", "c": "cc"})  # ValueError
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

```python
Traceback (most recent call last):
  File "./main.py", line 30, in <module>
    map_coords(mda, name="y_one", mapping={"a": "aa", "b": "bb", "c": "cc"})
  File "./main.py", line 20, in map_coords
    ds.coords[name] = xr.DataArray(new_values, coords=coord.coords)
  File "./.venv/lib/python3.10/site-packages/xarray/core/coordinates.py", line 32, in __setitem__
    self.update({key: value})
  File "./.venv/lib/python3.10/site-packages/xarray/core/coordinates.py", line 162, in update
    coords, indexes = merge_coords(
  File "./.venv/lib/python3.10/site-packages/xarray/core/merge.py", line 561, in merge_coords
    aligned = deep_align(
  File "./.venv/lib/python3.10/site-packages/xarray/core/alignment.py", line 827, in deep_align
    aligned = align(
  File "./.venv/lib/python3.10/site-packages/xarray/core/alignment.py", line 764, in align
    aligner.align()
  File "./.venv/lib/python3.10/site-packages/xarray/core/alignment.py", line 550, in align
    self.assert_no_index_conflict()
  File "./.venv/lib/python3.10/site-packages/xarray/core/alignment.py", line 319, in assert_no_index_conflict
    raise ValueError(
ValueError: cannot re-index or align objects with conflicting indexes
found for the following dimensions: 'x' (2 conflicting indexes)
Conflicting indexes may occur when
- they relate to different sets of coordinate and/or dimension names
- they don't have the same type
- they may be used to reindex data along common dimensions
```

Anything else we need to know?

I may not be doing this remapping in the best way; this is just the easiest approach I've found, so part of the problem may be the approach itself. I'm open to alternative methods as well.
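For comparison, here is a sketch of one possible alternative (an editorial illustration, not from the issue thread; `remap_level` is a hypothetical helper): rebuild the dimension's backing pandas MultiIndex and assign it back in a single step, so xarray never ends up holding two conflicting indexes for the same dimension.

```python
import numpy as np
import pandas as pd
import xarray as xr


def remap_level(da, dim, level, mapping):
    # Hypothetical helper: rebuild the pandas MultiIndex behind `dim`
    # with one level's values remapped, then replace the whole index at once.
    old = da.indexes[dim]  # the pandas MultiIndex backing `dim`
    new_values = [mapping[v] for v in old.levels[old.names.index(level)]]
    # Assigning a pandas MultiIndex works in xarray 2022.6; newer releases
    # may prefer xr.Coordinates.from_pandas_multiindex for the same job.
    return da.assign_coords({dim: old.set_levels(new_values, level=level)})


midx = pd.MultiIndex.from_product([list("abc"), [0, 1]], names=("x_one", "x_two"))
midy = pd.MultiIndex.from_product([list("abc"), [0, 1]], names=("y_one", "y_two"))
mda = xr.DataArray(np.random.rand(6, 6, 3), [("x", midx), ("y", midy), ("z", range(3))])

mda = remap_level(mda, "x", "x_one", {"a": "aa", "b": "bb", "c": "cc"})
mda = remap_level(mda, "y", "y_one", {"a": "aa", "b": "bb", "c": "cc"})  # no ValueError
```

The difference from map_coords above is that the MultiIndex is replaced wholesale rather than one level coordinate at a time, which sidesteps the alignment check that raises the error.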

Thanks.

Environment

INSTALLED VERSIONS
------------------
commit: None
python: 3.10.4 (main, Mar 28 2022, 15:33:01) [Clang 13.1.6 (clang-1316.0.21.2)]
python-bits: 64
OS: Darwin
OS-release: 21.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: None

xarray: 2022.6.0
pandas: 1.4.4
numpy: 1.23.2
scipy: 1.9.1
netCDF4: None
pydap: None
h5netcdf: 1.0.2
h5py: 3.7.0
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 63.4.3
pip: 22.2.2
conda: None
pytest: None
IPython: None
sphinx: None
issue #1854: Drop coordinates on loading large dataset.
opened by jamesstidard (1797906) · state: closed (completed) · comments: 22 · created: 2018-01-24T19:35:46Z · updated: 2020-02-15T14:49:53Z · closed: 2020-02-15T14:49:53Z · author_association: NONE · repo: xarray (13221727) · id: 291332965 · node_id: MDU6SXNzdWUyOTEzMzI5NjU=

I've been struggling for quite a while to load a large dataset, so I thought it best to ask as I think I'm missing a trick. I've also looked through the existing issues; there are a fair few questions that seemed promising, but none quite solved this.

I have a number of *.nc files with variables across the coordinates latitude, longitude and time. Each file has the data for all the latitudes and longitudes of the world over some period of time, about two months.

The goal is to go through that data and get the full history of a single latitude/longitude coordinate, instead of the data for all latitudes and longitudes over short periods.

This is my current few lines of script:

```python
import numpy as np
import xarray as xr

# 127 is normally the size of the time dimension in each file
ds = xr.open_mfdataset('path/to/ncs/*.nc', chunks={'time': 127})
recs = ds.sel(latitude=10, longitude=10).to_dataframe().to_records()
np.savez('location.npz', recs)
```

However, this blows out the memory on my machine on the open_mfdataset call when I use the full dataset. I've tried a bunch of different ways of chunking the data (like {'latitude': 1, 'longitude': 1}) but have not been able to get past this stage.

I was wondering if there's a way to either determine a good chunk size, or to tell open_mfdataset to only keep values from the lat/lng coordinates I care about (the coords kwarg looked like it could've been it).
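One pattern that might help here (an editorial sketch, not from the thread; it relies on open_mfdataset's standard preprocess hook): reduce each file to the single location of interest as it is opened, so the combined dataset only ever holds that point's time series.

```python
import xarray as xr


def select_point(ds):
    # Keep only the one location we care about from each file,
    # before open_mfdataset concatenates the files along time.
    return ds.sel(latitude=10, longitude=10)


# preprocess is applied to each per-file dataset as it is opened,
# so the full lat/lon grid never has to fit in memory at once.
ds = xr.open_mfdataset('path/to/ncs/*.nc', preprocess=select_point)
recs = ds.to_dataframe().to_records()
```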

I'm using version 0.10.0 of xarray.

Would very much appreciate any help.

issue #1572: Modifying data set resulting in much larger file size
opened by jamesstidard (1797906) · state: closed (completed) · comments: 7 · created: 2017-09-13T14:24:06Z · updated: 2017-09-18T08:59:24Z · closed: 2017-09-13T17:12:28Z · author_association: NONE · repo: xarray (13221727) · id: 257400162 · node_id: MDU6SXNzdWUyNTc0MDAxNjI=

I'm loading a 130 MB .nc file and applying a where mask to it to remove a significant number of the floating-point values, replacing them with NaN. However, when I save the result, the file has grown to over 500 MB. If I load the original dataset and immediately save it, the file stays roughly the same size.

Here's how I'm applying the mask:

```python
import os
import xarray as xr

fp = 'ERA20c/swh_2010_01_05_05.nc'
ds = xr.open_dataset(fp)

ds = ds.where(ds.latitude > 50)

head, ext = os.path.splitext(fp)
xr.open_dataset(fp).to_netcdf('{}-duplicate{}'.format(head, ext))
ds.to_netcdf('{}-masked{}'.format(head, ext))
```

Is there a way to reduce the file size of the masked dataset? I'd expect it to be roughly the same size or smaller.
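One thing worth trying (an editorial sketch, not from the thread): where() promotes the data to a float dtype so NaN can be stored, and the masked copy may be written without the original file's compression, so explicitly requesting deflate compression in the encoding (zlib and complevel are standard netCDF4 backend options) can bring the size back down.

```python
import os
import xarray as xr

fp = 'ERA20c/swh_2010_01_05_05.nc'
ds = xr.open_dataset(fp)
masked = ds.where(ds.latitude > 50)

# Request deflate compression for every data variable on write;
# mostly-NaN arrays tend to compress very well.
encoding = {name: {'zlib': True, 'complevel': 4} for name in masked.data_vars}
head, ext = os.path.splitext(fp)
masked.to_netcdf('{}-masked-compressed{}'.format(head, ext), encoding=encoding)
```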

Thanks.

issue #1561: exit code 137 when using xarray.open_mfdataset
opened by jamesstidard (1797906) · state: closed (completed) · comments: 3 · created: 2017-09-07T16:31:50Z · updated: 2017-09-13T14:16:07Z · closed: 2017-09-13T14:16:06Z · author_association: NONE · repo: xarray (13221727) · id: 255997962 · node_id: MDU6SXNzdWUyNTU5OTc5NjI=

While using xarray.open_mfdataset I get an exit code 137 (SIGKILL, signal 9) killing my process. I do not get this while using a subset of the data, though, and I'm also providing a chunks argument.

Does anyone know what might be causing this? Could it be the computer completely running out of memory (RAM + swap + HDD)? I'm unsure what's causing it, as I get no stack trace, just the SIGKILL.
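For context (an editorial aside, not from the thread): exit codes above 128 conventionally mean 128 plus a signal number, so 137 is 128 + 9, i.e. the process was killed by SIGKILL, which on Linux is most often the kernel's out-of-memory killer rather than anything xarray raised. A minimal decoding sketch:

```python
import signal

exit_code = 137
if exit_code > 128:
    # POSIX shells report death-by-signal as 128 + signal number.
    sig = signal.Signals(exit_code - 128)
    print("killed by signal {} ({})".format(sig.value, sig.name))
    # -> killed by signal 9 (SIGKILL)
```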

Thanks.


CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);