issues
4 rows where repo = 13221727, state = "closed" and user = 56583917 sorted by updated_at descending
id: 2108557477
node_id: PR_kwDOAMm_X85lfTRf
number: 8684
title: Enable `numbagg` in calculation of quantiles
user: maawoo 56583917
state: closed
locked: 0
comments: 5
created_at: 2024-01-30T18:59:55Z
updated_at: 2024-02-11T22:31:26Z
closed_at: 2024-02-07T16:28:04Z
author_association: CONTRIBUTOR
draft: 0
pull_request: pydata/xarray/pulls/8684
body: Just saw your message in the related issue @max-sixty. This is what I came up with earlier. I also did a quick test, comparing the calculation with and without using numbagg for a dummy 3D DataArray. I was only wondering if the default usage of numbagg (given that it's available and …
reactions: { "url": "https://api.github.com/repos/pydata/xarray/issues/8684/reactions", "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
repo: xarray 13221727
type: pull
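A rough sketch of the kind of with/without-numbagg comparison the PR body describes, not the author's actual test: it assumes numbagg is installed and an xarray version new enough that `quantile` respects the `use_numbagg` option (i.e., one that includes this PR); the array sizes are illustrative.

```python
import time

import numpy as np
import xarray as xr

# A dummy 3D DataArray with some NaNs, in the spirit of the test described above.
da = xr.DataArray(np.random.rand(20, 500, 500), dims=["time", "x", "y"])
da = da.where(da > 0.2)  # introduce NaNs so that skipna matters

# Toggle numbagg via xarray's option and time the same quantile call; quantile only
# benefits from numbagg in versions that include this PR, with numbagg installed.
for use_numbagg in (False, True):
    with xr.set_options(use_numbagg=use_numbagg):
        t0 = time.perf_counter()
        da.quantile(0.95, dim="time", skipna=True)
        print(f"use_numbagg={use_numbagg}: {time.perf_counter() - t0:.2f} s")
```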
id: 1497031605
node_id: I_kwDOAMm_X85ZOuO1
number: 7377
title: Aggregating a dimension using the Quantiles method with `skipna=True` is very slow
user: maawoo 56583917
state: closed
locked: 0
comments: 17
created_at: 2022-12-14T16:52:35Z
updated_at: 2024-02-07T16:28:05Z
closed_at: 2024-02-07T16:28:05Z
author_association: CONTRIBUTOR
body:

What happened?

Hi all, as the title already summarizes, I'm running into performance issues when aggregating over the time-dimension of a 3D DataArray using the quantile method with `skipna=True`.

[three-column table residue; content not recoverable from this export]

I'm currently using a compute node with 40 CPUs and 180 GB RAM. Here is what the resource utilization looks like. First small bump are 1 and 2. Second longer peak is 3. In this small example, the process at least finishes after a few seconds. With my actual dataset the quantile calculation takes hours... I guess the following issue is relevant and should be revived: https://github.com/numpy/numpy/issues/16575

Are there any possible work-arounds?

What did you expect to happen?

No response

Minimal Complete Verifiable Example

```python
import pandas as pd
import numpy as np
import xarray as xr

# Create dummy data with 20% random NaNs
size_spatial = 2000
size_temporal = 20
n_nan = int(size_spatial**2*0.2)

time = pd.date_range("2000-01-01", periods=size_temporal)
lat = np.random.uniform(low=-90, high=90, size=size_spatial)
lon = np.random.uniform(low=-180, high=180, size=size_spatial)
data = np.random.rand(size_temporal, size_spatial, size_spatial)

index_nan = np.random.choice(data.size, n_nan, replace=False)
data.ravel()[index_nan] = np.nan

# Create DataArray
da = xr.DataArray(data=data,
                  dims=['time', 'x', 'y'],
                  coords={'time': time, 'x': lon, 'y': lat},
                  attrs={'nodata': np.nan})

# Calculate 95th quantile over time-dimension
da.quantile(0.95, dim='time', skipna=True)
```

MVCE confirmation

Relevant log output

No response

Anything else we need to know?

No response

Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:36:39) [GCC 10.4.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-125-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.9.0
xarray: 2022.12.0
pandas: 1.5.0
numpy: 1.23.3
scipy: 1.9.1
netCDF4: 1.6.1
pydap: None
h5netcdf: None
h5py: 3.7.0
Nio: None
zarr: None
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.3.3
cfgrib: None
iris: None
bottleneck: 1.3.5
dask: 2022.10.0
distributed: 2022.10.0
matplotlib: 3.6.1
cartopy: 0.21.0
seaborn: 0.12.0
numbagg: None
fsspec: 2022.8.2
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 65.5.0
pip: 22.3
conda: 4.12.0
pytest: None
mypy: None
IPython: 8.5.0
sphinx: None
reactions: { "url": "https://api.github.com/repos/pydata/xarray/issues/7377/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
state_reason: completed
repo: xarray 13221727
type: issue
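A minimal timing sketch (not part of the issue) that reproduces the reported gap on a smaller array than the MVCE above: with `skipna=True`, xarray of this vintage falls back to `np.nanquantile` (the slow path discussed in the linked numpy issue), while `skipna=False` uses `np.quantile`. Sizes are illustrative.

```python
import time

import numpy as np
import xarray as xr

# Build a small 3D array with roughly 20% NaNs, mirroring the MVCE above.
size_temporal, size_spatial = 20, 500
data = np.random.rand(size_temporal, size_spatial, size_spatial)
n_nan = int(data.size * 0.2)
data.ravel()[np.random.choice(data.size, n_nan, replace=False)] = np.nan
da = xr.DataArray(data, dims=["time", "x", "y"])

# Compare the two code paths; the difference between the timings is the
# slowdown reported in this issue.
for skipna in (False, True):
    t0 = time.perf_counter()
    da.quantile(0.95, dim="time", skipna=skipna)
    print(f"skipna={skipna}: {time.perf_counter() - t0:.2f} s")
```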
id: 2108212331
node_id: PR_kwDOAMm_X85leIWu
number: 8683
title: Docs: Fix url in "Contribute to xarray" guide
user: maawoo 56583917
state: closed
locked: 0
comments: 3
created_at: 2024-01-30T15:59:37Z
updated_at: 2024-01-30T18:13:36Z
closed_at: 2024-01-30T18:13:26Z
author_association: CONTRIBUTOR
draft: 0
pull_request: pydata/xarray/pulls/8683
body: The URL in the section about creating a local development environment was pointing to itself. The new URL is pointing to the (I assume) correct section further down in the same guide.
reactions: { "url": "https://api.github.com/repos/pydata/xarray/issues/8683/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
repo: xarray 13221727
type: pull
id: 2098488235
node_id: I_kwDOAMm_X859FGOr
number: 8654
title: Inconsistent preservation of chunk alignment for groupby-/resample-reduce operations w/o using flox
user: maawoo 56583917
state: closed
locked: 0
comments: 2
created_at: 2024-01-24T15:12:38Z
updated_at: 2024-01-24T16:23:20Z
closed_at: 2024-01-24T15:58:22Z
author_association: CONTRIBUTOR
body:

What happened?

When performing groupby-/resample-reduce operations (e.g., the `resample(...).mean()` call in the example below) without using flox, the chunk alignment is not preserved:

[screenshot residue; not recoverable from this export]

...whereas the alignment is preserved when flox is enabled:

[screenshot residue; not recoverable from this export]

What did you expect to happen?

The alignment of chunks is preserved whether using flox or not.

Minimal Complete Verifiable Example

```python
import pandas as pd
import numpy as np
import xarray as xr

size_spatial = 1000
size_temporal = 200

time = pd.date_range("2000-01-01", periods=size_temporal, freq='h')
lat = np.random.uniform(low=-90, high=90, size=size_spatial)
lon = np.random.uniform(low=-180, high=180, size=size_spatial)
data = np.random.rand(size_temporal, size_spatial, size_spatial)

da = xr.DataArray(data=data,
                  dims=['time', 'x', 'y'],
                  coords={'time': time, 'x': lon, 'y': lat}
                  ).chunk({'time': -1, 'x': 'auto', 'y': 'auto'})

# Chunk alignment not preserved
with xr.set_options(use_flox=False):
    da_1 = da.copy(deep=True)
    da_1 = da_1.resample(time="6h").mean()

# Chunk alignment preserved
with xr.set_options(use_flox=True):
    da_2 = da.copy(deep=True)
    da_2 = da_2.resample(time="6h").mean()
```

MVCE confirmation

Relevant log output

No response

Anything else we need to know?

No response

Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.11.7 | packaged by conda-forge | (main, Dec 23 2023, 14:38:07) [Clang 16.0.6 ]
python-bits: 64
OS: Darwin
OS-release: 22.4.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: None
LOCALE: (None, 'UTF-8')
libhdf5: None
libnetcdf: None
xarray: 2024.1.1
pandas: 2.2.0
numpy: 1.26.3
scipy: 1.12.0
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
iris: None
bottleneck: None
dask: 2024.1.0
distributed: 2024.1.0
matplotlib: None
cartopy: None
seaborn: None
numbagg: 0.7.1
fsspec: 2023.12.2
cupy: None
pint: None
sparse: None
flox: 0.9.0
numpy_groupies: 0.10.2
setuptools: 69.0.3
pip: 23.3.2
conda: None
pytest: None
mypy: None
IPython: 8.20.0
sphinx: None
reactions: { "url": "https://api.github.com/repos/pydata/xarray/issues/8654/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
state_reason: not_planned
repo: xarray 13221727
type: issue
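A self-contained sketch of the chunk check from the MVCE above, shrunk so it runs in seconds: it assumes dask and flox are installed, and uses explicit illustrative chunk sizes rather than the author's `'auto'` chunks.

```python
import numpy as np
import pandas as pd
import xarray as xr

# A smaller array than the MVCE, chunked whole in time and split in space.
time_idx = pd.date_range("2000-01-01", periods=48, freq="h")
da = xr.DataArray(
    np.random.rand(48, 100, 100),
    dims=["time", "x", "y"],
    coords={"time": time_idx},
).chunk({"time": -1, "x": 50, "y": 50})

# Run the same resample-reduce with flox disabled and enabled.
with xr.set_options(use_flox=False):
    chunks_no_flox = da.resample(time="6h").mean().chunks
with xr.set_options(use_flox=True):
    chunks_flox = da.resample(time="6h").mean().chunks

# Compare the resulting chunk layouts against the input.
print("input:         ", da.chunks)
print("use_flox=False:", chunks_no_flox)
print("use_flox=True: ", chunks_flox)
```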
```sql
CREATE TABLE [issues] (
    [id] INTEGER PRIMARY KEY,
    [node_id] TEXT,
    [number] INTEGER,
    [title] TEXT,
    [user] INTEGER REFERENCES [users]([id]),
    [state] TEXT,
    [locked] INTEGER,
    [assignee] INTEGER REFERENCES [users]([id]),
    [milestone] INTEGER REFERENCES [milestones]([id]),
    [comments] INTEGER,
    [created_at] TEXT,
    [updated_at] TEXT,
    [closed_at] TEXT,
    [author_association] TEXT,
    [active_lock_reason] TEXT,
    [draft] INTEGER,
    [pull_request] TEXT,
    [body] TEXT,
    [reactions] TEXT,
    [performed_via_github_app] TEXT,
    [state_reason] TEXT,
    [repo] INTEGER REFERENCES [repos]([id]),
    [type] TEXT
);
CREATE INDEX [idx_issues_repo] ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone] ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee] ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user] ON [issues] ([user]);
```
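For reference, a hedged sketch of how the filter shown at the top of this page could be run against a local SQLite copy of this table; the filename `github.db` is an assumption, not something stated above.

```python
import sqlite3

# "github.db" is a hypothetical local copy of the database behind this page.
conn = sqlite3.connect("github.db")
rows = conn.execute(
    """
    SELECT id, number, title, state, created_at, updated_at, closed_at, type
    FROM issues
    WHERE repo = ? AND state = ? AND [user] = ?
    ORDER BY updated_at DESC
    """,
    (13221727, "closed", 56583917),
).fetchall()

# Print the four matching rows in the same order as the page above.
for row in rows:
    print(row)
```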