issues
4 rows where type = "issue" and user = 56583917 sorted by updated_at descending
Columns: id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type
id: 1497031605 | node_id: I_kwDOAMm_X85ZOuO1 | number: 7377
title: Aggregating a dimension using the Quantiles method with `skipna=True` is very slow
user: maawoo (56583917) | state: closed | locked: 0 | comments: 17 | author_association: CONTRIBUTOR
created_at: 2022-12-14T16:52:35Z | updated_at: 2024-02-07T16:28:05Z | closed_at: 2024-02-07T16:28:05Z

### What happened?

Hi all,

As the title already summarizes, I'm running into performance issues when aggregating over the time dimension of a 3D DataArray using the `quantile` method with `skipna=True`.

I'm currently using a compute node with 40 CPUs and 180 GB RAM. Here is what the resource utilization looks like (screenshot omitted): the first small bump corresponds to steps 1 and 2 of the example below; the second, longer peak is step 3. In this small example, the process at least finishes after a few seconds. With my actual dataset, the quantile calculation takes hours. I guess the following issue is relevant and should be revived: https://github.com/numpy/numpy/issues/16575

Are there any possible work-arounds?

### What did you expect to happen?

No response

### Minimal Complete Verifiable Example

```Python
import pandas as pd
import numpy as np
import xarray as xr

# 1. Create dummy data with 20% random NaNs
size_spatial = 2000
size_temporal = 20
n_nan = int(size_spatial**2 * 0.2)
time = pd.date_range("2000-01-01", periods=size_temporal)
lat = np.random.uniform(low=-90, high=90, size=size_spatial)
lon = np.random.uniform(low=-180, high=180, size=size_spatial)
data = np.random.rand(size_temporal, size_spatial, size_spatial)
index_nan = np.random.choice(data.size, n_nan, replace=False)
data.ravel()[index_nan] = np.nan

# 2. Create DataArray
da = xr.DataArray(data=data,
                  dims=['time', 'x', 'y'],
                  coords={'time': time, 'x': lon, 'y': lat},
                  attrs={'nodata': np.nan})

# 3. Calculate 95th quantile over time-dimension
da.quantile(0.95, dim='time', skipna=True)
```

### MVCE confirmation
### Relevant log output

No response

### Anything else we need to know?

No response

### Environment

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:36:39) [GCC 10.4.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-125-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.9.0
xarray: 2022.12.0
pandas: 1.5.0
numpy: 1.23.3
scipy: 1.9.1
netCDF4: 1.6.1
pydap: None
h5netcdf: None
h5py: 3.7.0
Nio: None
zarr: None
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.3.3
cfgrib: None
iris: None
bottleneck: 1.3.5
dask: 2022.10.0
distributed: 2022.10.0
matplotlib: 3.6.1
cartopy: 0.21.0
seaborn: 0.12.0
numbagg: None
fsspec: 2022.8.2
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 65.5.0
pip: 22.3
conda: 4.12.0
pytest: None
mypy: None
IPython: 8.5.0
sphinx: None
```
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7377/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
state_reason: completed | repo: xarray (13221727) | type: issue
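A quick way to reproduce the reported gap is to time the same reduction with and without NaN skipping. This is a sketch, not part of the original report: it rebuilds a smaller version of the MVCE data so it finishes quickly, and the timings are machine-dependent.

```Python
import timeit

import numpy as np
import pandas as pd
import xarray as xr

# Smaller variant of the MVCE data: 20 time steps, 500x500 spatial grid, 1% NaNs.
size_spatial, size_temporal = 500, 20
data = np.random.rand(size_temporal, size_spatial, size_spatial)
index_nan = np.random.choice(data.size, int(data.size * 0.01), replace=False)
data.ravel()[index_nan] = np.nan
da = xr.DataArray(
    data,
    dims=["time", "x", "y"],
    coords={"time": pd.date_range("2000-01-01", periods=size_temporal)},
)

# skipna=True takes the NaN-aware path (np.nanquantile), which is the slow part
# per the linked numpy issue; skipna=False uses the plain quantile path.
t_skip = timeit.timeit(lambda: da.quantile(0.95, dim="time", skipna=True), number=1)
t_fast = timeit.timeit(lambda: da.quantile(0.95, dim="time", skipna=False), number=1)
print(f"skipna=True:  {t_skip:.2f} s")
print(f"skipna=False: {t_fast:.2f} s")
```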
id: 2098488235 | node_id: I_kwDOAMm_X859FGOr | number: 8654
title: Inconsistent preservation of chunk alignment for groupby-/resample-reduce operations w/o using flox
user: maawoo (56583917) | state: closed | locked: 0 | comments: 2 | author_association: CONTRIBUTOR
created_at: 2024-01-24T15:12:38Z | updated_at: 2024-01-24T16:23:20Z | closed_at: 2024-01-24T15:58:22Z

### What happened?

When performing groupby-/resample-reduce operations (e.g., `resample(time="6h").mean()` as in the example below) with flox disabled, the chunk alignment of the input is not preserved (screenshot omitted), whereas the alignment is preserved when flox is enabled (screenshot omitted).

### What did you expect to happen?

The alignment of chunks is preserved whether using flox or not.

### Minimal Complete Verifiable Example

```Python
import pandas as pd
import numpy as np
import xarray as xr

size_spatial = 1000
size_temporal = 200
time = pd.date_range("2000-01-01", periods=size_temporal, freq='h')
lat = np.random.uniform(low=-90, high=90, size=size_spatial)
lon = np.random.uniform(low=-180, high=180, size=size_spatial)
data = np.random.rand(size_temporal, size_spatial, size_spatial)

da = xr.DataArray(data=data,
                  dims=['time', 'x', 'y'],
                  coords={'time': time, 'x': lon, 'y': lat}).chunk({'time': -1, 'x': 'auto', 'y': 'auto'})

# Chunk alignment not preserved
with xr.set_options(use_flox=False):
    da_1 = da.copy(deep=True)
    da_1 = da_1.resample(time="6h").mean()

# Chunk alignment preserved
with xr.set_options(use_flox=True):
    da_2 = da.copy(deep=True)
    da_2 = da_2.resample(time="6h").mean()
```

### MVCE confirmation
### Relevant log output

No response

### Anything else we need to know?

No response

### Environment

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.11.7 | packaged by conda-forge | (main, Dec 23 2023, 14:38:07) [Clang 16.0.6 ]
python-bits: 64
OS: Darwin
OS-release: 22.4.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: None
LOCALE: (None, 'UTF-8')
libhdf5: None
libnetcdf: None
xarray: 2024.1.1
pandas: 2.2.0
numpy: 1.26.3
scipy: 1.12.0
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
iris: None
bottleneck: None
dask: 2024.1.0
distributed: 2024.1.0
matplotlib: None
cartopy: None
seaborn: None
numbagg: 0.7.1
fsspec: 2023.12.2
cupy: None
pint: None
sparse: None
flox: 0.9.0
numpy_groupies: 0.10.2
setuptools: 69.0.3
pip: 23.3.2
conda: None
pytest: None
mypy: None
IPython: 8.20.0
sphinx: None
```
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8654/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
state_reason: not_planned | repo: xarray (13221727) | type: issue
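To see the behaviour concretely without the screenshots, one can compare the chunk layouts directly. A minimal sketch (not from the report), shrunk from the MVCE; it assumes dask and flox are installed:

```Python
import numpy as np
import pandas as pd
import xarray as xr

# Shrunken MVCE: 48 hourly steps on a 100x100 grid, chunked 50x50 spatially.
time = pd.date_range("2000-01-01", periods=48, freq="h")
da = xr.DataArray(
    np.random.rand(48, 100, 100),
    dims=["time", "x", "y"],
    coords={"time": time},
).chunk({"time": -1, "x": 50, "y": 50})

with xr.set_options(use_flox=False):
    da_1 = da.resample(time="6h").mean()
with xr.set_options(use_flox=True):
    da_2 = da.resample(time="6h").mean()

# DataArray.chunks holds one tuple of block sizes per dimension.
print("input:         ", da.chunks)
print("use_flox=False:", da_1.chunks)
print("use_flox=True: ", da_2.chunks)
```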
id: 1963071630 | node_id: I_kwDOAMm_X851AhiO | number: 8378
title: Extend DatetimeAccessor with `snap`-method
user: maawoo (56583917) | state: open | locked: 0 | comments: 2 | author_association: CONTRIBUTOR
created_at: 2023-10-26T09:16:24Z | updated_at: 2023-10-27T08:08:58Z

### Is your feature request related to a problem?

With satellite remote sensing data, you sometimes end up with a blown-up DataArray/Dataset because individual acquisitions have been saved in slices (screenshot omitted). One could then aggregate these slices with something like this (code omitted). However, this would miss cases where one slice has been acquired before and the other after a specific hour. The `snap` method of `pandas.DatetimeIndex` would cover such cases.

### Describe the solution you'd like

In addition to the existing rounding methods (`floor`, `ceil`, `round`), the `DatetimeAccessor` could be extended with a `snap` method.

### Describe alternatives you've considered

No response

### Additional context

No response
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8378/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
repo: xarray (13221727) | type: issue
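Until a `dt.snap` accessor exists, one possible workaround (a sketch, not from the issue) is to snap the time coordinate through the underlying `pandas.DatetimeIndex` and group on the result:

```Python
import numpy as np
import pandas as pd
import xarray as xr

# Two slices of the same acquisition, recorded just before and after the hour.
times = pd.DatetimeIndex(["2023-01-01 11:58", "2023-01-01 12:01"])
da = xr.DataArray(
    np.random.rand(2, 4, 4),
    dims=["time", "x", "y"],
    coords={"time": times},
)

# Snap both timestamps to the nearest hour, then aggregate the slices.
snapped = da.indexes["time"].snap(freq="h")
da_agg = da.assign_coords(time=snapped).groupby("time").mean()
print(da_agg.sizes)  # {'time': 1, 'x': 4, 'y': 4}
```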
id: 1497131525 | node_id: I_kwDOAMm_X85ZPGoF | number: 7378
title: Improve docstrings for better discoverability
user: maawoo (56583917) | state: open | locked: 0 | comments: 9 | author_association: CONTRIBUTOR
created_at: 2022-12-14T17:59:20Z | updated_at: 2023-04-02T04:26:57Z

### What is your issue?

I noticed that the docstrings of the aggregation methods are mostly written in the same style, e.g.: "Reduce this Dataset's data by applying xy along some dimension(s)." Say a user is interested in calculating the variance and searches the documentation for the appropriate method: neither `xarray.DataArray.var` nor `xarray.Dataset.var` will be returned (see here), because "variance" is not mentioned at all in the docstrings. The same problem exists for other methods.

https://github.com/pydata/xarray/issues/6793 is related, but I guess it already has enough tasks.
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7378/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
repo: xarray (13221727) | type: issue
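To make the request concrete, here is a hypothetical sketch (not xarray's actual source; the wording is illustrative) of how a synonym could be worked into such a docstring so that a search for "variance" matches:

```Python
def var(self, dim=None, skipna=None, **kwargs):
    """Reduce this DataArray's data by applying ``var`` along some dimension(s).

    ``var`` computes the *variance* of the data, so a documentation search
    for "variance" now surfaces this method as well.
    """
    ...
```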
```sql
CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo] ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone] ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee] ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user] ON [issues] ([user]);
```
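The row listing at the top of this page corresponds to a straightforward query against this schema. A sketch, assuming a local copy of the Datasette database saved as `github.db` (the filename is hypothetical):

```Python
import sqlite3

# Reproduce the page's filter: rows where type = 'issue' and user = 56583917,
# sorted by updated_at descending.
conn = sqlite3.connect("github.db")  # hypothetical local database file
rows = conn.execute(
    """
    SELECT number, title, state, updated_at
    FROM issues
    WHERE type = 'issue' AND user = 56583917
    ORDER BY updated_at DESC
    """
).fetchall()
for number, title, state, updated_at in rows:
    print(f"#{number} [{state}] {updated_at}  {title}")
conn.close()
```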