issues
6 rows where state = "closed", type = "issue" and user = 39069044 sorted by updated_at descending
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1648260939 | I_kwDOAMm_X85iPndL | 7702 | Allow passing coordinates in `to_zarr(region=...)` rather than passing indexes | slevang 39069044 | closed | 0 | 3 | 2023-03-30T20:23:00Z | 2023-11-14T18:34:51Z | 2023-11-14T18:34:51Z | CONTRIBUTOR | Is your feature request related to a problem? If I want to write to a region of data in a zarr, I usually have some boilerplate code like this (see the hedged sketch after this row):
Describe the solution you'd like: It would be nice to automate this within `to_zarr`. There may be pitfalls I'm not thinking of, and I don't know exactly what the API would look like.
Describe alternatives you've considered: No response. Additional context: No response. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7702/reactions", "total_count": 7, "+1": 7, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
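A hedged reconstruction of that boilerplate, since the original code block in issue 7702 did not survive the export. The store name, variable names, and the `searchsorted` lookup are illustrative assumptions; only the general pattern (translate coordinate labels into integer slices, then call `to_zarr(region=...)`) comes from the issue.

```python
import numpy as np
import xarray as xr

# an existing zarr store covering the full coordinate range
ds = xr.Dataset({"foo": (("x",), np.zeros(10))}, coords={"x": np.arange(10)})
ds.to_zarr("target.zarr", mode="w")

# new data for a labelled subset of x that we want to write into the store
update = ds.isel(x=slice(2, 5)) + 1

# the boilerplate: look up the integer positions of the labels in the target store
target = xr.open_zarr("target.zarr")
start = int(np.searchsorted(target.x.values, update.x.values[0]))
stop = start + update.sizes["x"]

# what the issue asks to automate: pass the labels themselves instead of this slice
update.drop_vars("x").to_zarr("target.zarr", region={"x": slice(start, stop)})
```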
1060265915 | I_kwDOAMm_X84_Ml-7 | 6013 | Memory leak with `open_zarr` default chunking option | slevang 39069044 | closed | 0 | 3 | 2021-11-22T15:06:33Z | 2023-11-10T03:08:35Z | 2023-11-10T02:32:49Z | CONTRIBUTOR | What happened:
I've been using xarray to open zarr datasets within a Flask app, and spent some time debugging a memory leak. What I found is that the default chunking option is the culprit: for whatever reason it generates dask items that are not easily cleared from memory within the context of a Flask route, and memory usage continues to grow within my app, at least towards some plateau. This memory growth isn't reproducible outside of a Flask route, so it's a bit of a niche problem. First proposal would be to simply align the default …

What you expected to happen: Memory usage should not grow when opening a zarr dataset within a Flask route.

Minimal Complete Verifiable Example:

```python
from flask import Flask
import xarray as xr
import gc
import dask.array as da

# save a test dataset to zarr locally
ds_test = xr.Dataset({"foo": (["x", "y", "z"], da.random.random(size=(300, 300, 300)))})
ds_test.to_zarr('test.zarr', mode='w')

app = Flask(__name__)

# ping this route repeatedly to see memory increase
@app.route('/open_zarr')
def open_zarr():
    # with default chunks='auto', memory grows; with chunks=None, memory is ok
    ds = xr.open_zarr('test.zarr', chunks='auto').compute()
    # Try to explicitly clear memory, but this doesn't help
    del ds
    gc.collect()
    return 'check memory'

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8080, debug=True)
```

Anything else we need to know?:

Environment: Output of xr.show_versions(): INSTALLED VERSIONS ------------------ commit: None python: 3.9.7 | packaged by conda-forge | (default, Sep 29 2021, 19:20:46) [GCC 9.4.0] python-bits: 64 OS: Linux OS-release: 5.11.0-40-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: ('en_US', 'UTF-8') libhdf5: 1.12.1 libnetcdf: 4.8.1 xarray: 0.20.1 pandas: 1.3.4 numpy: 1.19.5 scipy: 1.7.2 netCDF4: 1.5.8 pydap: None h5netcdf: 0.11.0 h5py: 3.1.0 Nio: None zarr: 2.10.1 cftime: 1.5.1.1 nc_time_axis: 1.4.0 PseudoNetCDF: None rasterio: 1.2.10 cfgrib: 0.9.9.1 iris: None bottleneck: 1.3.2 dask: 2021.11.1 distributed: 2021.11.1 matplotlib: 3.4.3 cartopy: 0.20.1 seaborn: None numbagg: None fsspec: 2021.11.0 cupy: None pint: 0.18 sparse: None setuptools: 58.5.3 pip: 21.3.1 conda: None pytest: None IPython: 7.29.0 sphinx: None |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6013/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
1689655334 | I_kwDOAMm_X85kthgm | 7797 | More `groupby` indexing problems | slevang 39069044 | closed | 0 | 1 | 2023-04-29T18:58:11Z | 2023-05-02T14:48:43Z | 2023-05-02T14:48:43Z | CONTRIBUTOR | What happened? There is still something wrong with the groupby indexing changes from …

```python
import numpy as np
import xarray as xr

# monthly timeseries that should return "zero anomalies" everywhere
time = xr.date_range("2023-01-01", "2023-12-31", freq="MS")
data = np.linspace(-1, 1, 12)
x = xr.DataArray(data, coords={"time": time})
clim = xr.DataArray(data, coords={"month": np.arange(1, 13, 1)})

# seems to give the correct result if we use the full x, but not with a slice
x_slice = x.sel(time=["2023-04-01"])

# two typical ways of computing anomalies
anom_gb = x_slice.groupby("time.month") - clim
anom_sel = x_slice - clim.sel(month=x_slice.time.dt.month)

# passes on 2023.3.0, fails on 2023.4.2
# the groupby version is aligning the indexes wrong, giving us something other than 0
assert anom_sel.equals(anom_gb)
```

Related: #7759 #7766 cc @dcherian

What did you expect to happen? No response. Minimal Complete Verifiable Example: No response. MVCE confirmation.
Relevant log output: No response. Anything else we need to know? No response. Environment |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7797/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
1423114234 | I_kwDOAMm_X85U0v_6 | 7220 | `xr.where(..., keep_attrs=True)` overwrites coordinate attributes | slevang 39069044 | closed | 0 | 3 | 2022-10-25T21:17:17Z | 2022-11-30T23:35:30Z | 2022-11-30T23:35:30Z | CONTRIBUTOR | What happened? #6461 had some unintended consequences for `xr.where(..., keep_attrs=True)`; a hedged sketch of the reported behaviour follows this row.
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7220/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
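The example from issue 7220 above did not survive the export, so here is a hedged sketch of the behaviour its title describes. The array, coordinate, and attribute names are illustrative assumptions, not the original report.

```python
import numpy as np
import xarray as xr

# a DataArray whose "x" coordinate carries its own attrs
da = xr.DataArray(
    np.arange(3.0),
    dims="x",
    coords={"x": ("x", np.arange(3), {"units": "m"})},
    attrs={"long_name": "data"},
)

result = xr.where(da > 1, da, 0.0, keep_attrs=True)

# per the issue title, the coordinate attrs come back overwritten rather than as {'units': 'm'}
print(result.x.attrs)
```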
1381294181 | I_kwDOAMm_X85SVOBl | 7062 | Rolling mean on dask array does not preserve dtype | slevang 39069044 | closed | 0 | 2 | 2022-09-21T17:55:30Z | 2022-09-22T22:06:09Z | 2022-09-22T22:06:09Z | CONTRIBUTOR | What happened? Calling … What did you expect to happen? This is a simple enough operation that if you start with … Minimal Complete Verifiable Example: … (a hedged sketch follows this row).
MVCE confirmation
Relevant log output: No response. Anything else we need to know? #5877 is somewhat related. Environment
INSTALLED VERSIONS
------------------
commit: e6791852aa7ec0b126048b0986e205e158ab9601
python: 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:10)
[GCC 10.3.0]
python-bits: 64
OS: Linux
OS-release: 5.15.0-46-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1
xarray: 2022.6.1.dev63+ge6791852.d20220921
pandas: 1.4.2
numpy: 1.21.6
scipy: 1.8.1
netCDF4: 1.5.8
pydap: installed
h5netcdf: 1.0.2
h5py: 3.6.0
Nio: None
zarr: 2.12.0
cftime: 1.6.0
nc_time_axis: 1.4.1
PseudoNetCDF: 3.2.2
rasterio: 1.2.10
cfgrib: 0.9.10.1
iris: 3.2.1
bottleneck: 1.3.4
dask: 2022.04.1
distributed: 2022.4.1
matplotlib: 3.5.2
cartopy: 0.20.2
seaborn: 0.11.2
numbagg: 0.2.1
fsspec: 2022.8.2
cupy: None
pint: 0.19.2
sparse: 0.13.0
flox: 0.5.9
numpy_groupies: 0.9.19
setuptools: 62.0.0
pip: 22.2.2
conda: None
pytest: 7.1.3
IPython: None
sphinx: None
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7062/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
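The MVCE for issue 7062 above was also lost in export; below is a hedged reconstruction of the kind of example the title implies. The array size and chunk size are illustrative assumptions.

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.zeros(100, dtype="float32"), dims="x")

# numpy-backed: the rolling mean keeps float32
print(da.rolling(x=5).mean().dtype)

# dask-backed: per the issue title, the result is reported to come back as float64
print(da.chunk({"x": 10}).rolling(x=5).mean().dtype)
```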
859218255 | MDU6SXNzdWU4NTkyMTgyNTU= | 5165 | Poor memory management with dask=2021.4.0 | slevang 39069044 | closed | 0 | 4 | 2021-04-15T20:19:05Z | 2021-04-21T12:16:31Z | 2021-04-21T10:17:40Z | CONTRIBUTOR | What happened:
With the latest dask release …

What you expected to happen:
Dask would intelligently manage chunks and not fill up memory. This works fine in …

Minimal Complete Verifiable Example: Generate a synthetic dataset with time/lat/lon variable and associated climatology stored to disk, then calculate the anomaly:

```python
import xarray as xr
import pandas as pd
import numpy as np
import dask.array as da

dates = pd.date_range('1980-01-01', '2019-12-31', freq='D')
ds = xr.Dataset(
    data_vars={
        'x': (('time', 'lat', 'lon'),
              da.random.random(size=(dates.size, 360, 720), chunks=(1, -1, -1))),
        'clim': (('dayofyear', 'lat', 'lon'),
                 da.random.random(size=(366, 360, 720), chunks=(1, -1, -1))),
    },
    coords={
        'time': dates,
        'dayofyear': np.arange(1, 367, 1),
        'lat': np.arange(-90, 90, .5),
        'lon': np.arange(-180, 180, .5),
    },
)

# My original use case was pulling this data from disk, but it doesn't actually seem to matter
ds.to_zarr('test-data', mode='w')
ds = xr.open_zarr('test-data')

ds['anom'] = ds.x.groupby('time.dayofyear') - ds.clim
ds[['anom']].to_zarr('test-anom', mode='w')
```

Anything else we need to know?: Distributed vs local scheduler and file backend (e.g. zarr vs netcdf) don't seem to affect this. Dask graphs look the same for both. 2021.3.0: …

Environment: Output of xr.show_versions(): INSTALLED VERSIONS ------------------ commit: None python: 3.8.6 | packaged by conda-forge | (default, Dec 26 2020, 05:05:16) [GCC 9.3.0] python-bits: 64 OS: Linux OS-release: 5.8.0-48-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.10.6 libnetcdf: 4.7.4 xarray: 0.17.1.dev52+ge5690588 pandas: 1.2.1 numpy: 1.19.5 scipy: 1.6.0 netCDF4: 1.5.5.1 pydap: None h5netcdf: 0.8.1 h5py: 2.10.0 Nio: None zarr: 2.6.1 cftime: 1.3.1 nc_time_axis: 1.2.0 PseudoNetCDF: None rasterio: 1.1.8 cfgrib: 0.9.8.5 iris: None bottleneck: 1.3.2 dask: 2021.04.0 distributed: 2021.04.0 matplotlib: 3.3.3 cartopy: 0.18.0 seaborn: None numbagg: None pint: 0.16.1 setuptools: 49.6.0.post20210108 pip: 20.3.3 conda: None pytest: None IPython: 7.20.0 sphinx: None |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5165/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue |
CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo] ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone] ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee] ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user] ON [issues] ([user]);
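Given this schema, the filter shown at the top of the page ("state = 'closed', type = 'issue', user = 39069044, sorted by updated_at descending") can be reproduced against a local copy of the database. A minimal sketch; the filename "github.db" is an assumption:

```python
import sqlite3

# open a local copy of the database this page was exported from (filename assumed)
conn = sqlite3.connect("github.db")
rows = conn.execute(
    """
    SELECT number, title, updated_at
    FROM issues
    WHERE state = 'closed' AND type = 'issue' AND user = 39069044
    ORDER BY updated_at DESC
    """
).fetchall()

for number, title, updated_at in rows:
    print(number, updated_at, title)

conn.close()
```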