issues
14 rows where state = "closed" and user = 39069044 sorted by updated_at descending
| id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at ▲ | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2171912634 | PR_kwDOAMm_X85o3Ify | 8809 | Pass variable name to `encode_zarr_variable` | slevang 39069044 | closed | 0 | 6 | 2024-03-06T16:21:53Z | 2024-04-03T14:26:49Z | 2024-04-03T14:26:48Z | CONTRIBUTOR | 0 | pydata/xarray/pulls/8809 |
The change from https://github.com/pydata/xarray/pull/8672 mostly fixed the issue of serializing a reset multiindex in the backends, but there was an additional niche issue that turned up in xeofs and was still causing serialization to fail on the zarr backend. The issue is that zarr is the only backend that uses a custom version of `encode_cf_variable` (`encode_zarr_variable`). As a minimal fix, this PR just passes the variable name through to `encode_zarr_variable`. The exact workflow this turned up in involves DataTree and looks like this: ```python
import numpy as np
import xarray as xr
from datatree import DataTree

# ND DataArray that gets stacked along a multiindex
da = xr.DataArray(np.ones((3, 3)), coords={"dim1": [1, 2, 3], "dim2": [4, 5, 6]})
da = da.stack(feature=["dim1", "dim2"])

# Extract just the stacked coordinates for saving in a dataset
ds = xr.Dataset(data_vars={"feature": da.feature})

# Reset the multiindex, which should make things serializable
ds = ds.reset_index("feature")

dt1 = DataTree()
dt2 = DataTree(name="feature", data=ds)
dt1["foo"] = dt2

# Somehow in this step, dt1.foo.feature.dim1.variable becomes an IndexVariable again
print(type(dt1.foo.feature.dim1.variable))

# Works
dt1.to_netcdf("test.nc", mode="w")

# Fails
dt1.to_zarr("test.zarr", mode="w")
```
But we can reproduce in xarray with the test added here (sketched below). |
{
"url": "https://api.github.com/repos/pydata/xarray/issues/8809/reactions",
"total_count": 0,
"+1": 0,
"-1": 0,
"laugh": 0,
"hooray": 0,
"confused": 0,
"heart": 0,
"rocket": 0,
"eyes": 0
} |
xarray 13221727 | pull | |||||
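A plain-xarray sketch of the same failure, without DataTree, might look like the following (this mirrors the workflow above rather than quoting the actual test; store paths illustrative):

```python
import numpy as np
import xarray as xr

# Stack to create a multiindex, keep only the stacked coordinate, then reset it
da = xr.DataArray(np.ones((3, 3)), coords={"dim1": [1, 2, 3], "dim2": [4, 5, 6]})
stacked = da.stack(feature=["dim1", "dim2"])
ds = xr.Dataset(data_vars={"feature": stacked.feature}).reset_index("feature")

ds.to_netcdf("test.nc", mode="w")   # works
ds.to_zarr("test.zarr", mode="w")   # failed on the zarr backend before this fix
```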
| 1648260939 | I_kwDOAMm_X85iPndL | 7702 | Allow passing coordinates in `to_zarr(region=...)` rather than passing indexes | slevang 39069044 | closed | 0 | 3 | 2023-03-30T20:23:00Z | 2023-11-14T18:34:51Z | 2023-11-14T18:34:51Z | CONTRIBUTOR | **Is your feature request related to a problem?** If I want to write to a region of data in a zarr store, I usually have some boilerplate code like this (see the sketch after this row).

**Describe the solution you'd like** It would be nice to automate this within `to_zarr`. There may be pitfalls I'm not thinking of, and I don't know exactly what the API would look like.

**Describe alternatives you've considered** No response

**Additional context** No response |
{
"url": "https://api.github.com/repos/pydata/xarray/issues/7702/reactions",
"total_count": 7,
"+1": 7,
"-1": 0,
"laugh": 0,
"hooray": 0,
"confused": 0,
"heart": 0,
"rocket": 0,
"eyes": 0
} |
completed | xarray 13221727 | issue | ||||||
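The boilerplate in question might look like this sketch (dataset, coordinate name, and store path are illustrative, not from the issue):

```python
import numpy as np
import pandas as pd
import xarray as xr

# A full store on disk, plus a label-selected subset we want to write back
times = pd.date_range("2000-01-01", periods=100, freq="D")
ds = xr.Dataset({"x": ("time", np.zeros(100))}, coords={"time": times})
ds.to_zarr("store.zarr", mode="w")

sub = ds.isel(time=slice(10, 20)) + 1

# Boilerplate: translate the subset's coordinate labels back into the
# integer offsets that to_zarr(region=...) requires
start = ds.indexes["time"].get_loc(sub.time.values[0])
region = {"time": slice(start, start + sub.sizes["time"])}
sub.to_zarr("store.zarr", region=region)
```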
| 1985969769 | PR_kwDOAMm_X85fDaBX | 8434 | Automatic region detection and transpose for `to_zarr()` | slevang 39069044 | closed | 0 | 15 | 2023-11-09T16:15:08Z | 2023-11-14T18:34:50Z | 2023-11-14T18:34:50Z | CONTRIBUTOR | 0 | pydata/xarray/pulls/8434 |
A quick pass at implementing these two improvements for zarr region writes: automatic region detection from the coordinates of the dataset being written, and automatic transposition of dimensions to match the target store (see the sketch after this row).
|
{
"url": "https://api.github.com/repos/pydata/xarray/issues/8434/reactions",
"total_count": 3,
"+1": 0,
"-1": 0,
"laugh": 0,
"hooray": 0,
"confused": 0,
"heart": 3,
"rocket": 0,
"eyes": 0
} |
xarray 13221727 | pull | |||||
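If the feature works as the title describes, the index bookkeeping from the sketch above collapses to a single call (store path illustrative):

```python
import numpy as np
import pandas as pd
import xarray as xr

times = pd.date_range("2000-01-01", periods=100, freq="D")
ds = xr.Dataset({"x": ("time", np.zeros(100))}, coords={"time": times})
ds.to_zarr("store.zarr", mode="w")

# Region slices are inferred by matching the subset's coordinates
# against the store, and dimensions are transposed to match if needed
sub = ds.isel(time=slice(10, 20)) + 1
sub.to_zarr("store.zarr", region="auto")
```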
| 1060265915 | I_kwDOAMm_X84_Ml-7 | 6013 | Memory leak with `open_zarr` default chunking option | slevang 39069044 | closed | 0 | 3 | 2021-11-22T15:06:33Z | 2023-11-10T03:08:35Z | 2023-11-10T02:32:49Z | CONTRIBUTOR | **What happened:**
I've been using xarray to open zarr datasets within a Flask app, and spent some time debugging a memory leak. What I found is that the default `chunks='auto'` path is responsible. For whatever reason it generates dask items that are not easily cleared from memory within the context of a Flask route, and memory usage continues to grow within my app, at least towards some plateau. This memory growth isn't reproducible outside of a Flask route, so it's a bit of a niche problem. First proposal would be to simply align the default `chunks` argument of `open_zarr` with that of `open_dataset(engine="zarr")`.

**What you expected to happen:** Memory usage should not grow when opening a zarr dataset within a Flask route.

**Minimal Complete Verifiable Example:**

```python
from flask import Flask
import xarray as xr
import gc
import dask.array as da

# save a test dataset to zarr locally
ds_test = xr.Dataset({"foo": (["x", "y", "z"], da.random.random(size=(300, 300, 300)))})
ds_test.to_zarr('test.zarr', mode='w')

app = Flask(__name__)

# ping this route repeatedly to see memory increase
@app.route('/open_zarr')
def open_zarr():
    # with default chunks='auto', memory grows; with chunks=None, memory is ok
    ds = xr.open_zarr('test.zarr', chunks='auto').compute()
    # Try to explicitly clear memory, but this doesn't help
    del ds
    gc.collect()
    return 'check memory'

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8080, debug=True)
```

**Anything else we need to know?:**

**Environment:** Output of <tt>xr.show_versions()</tt>

INSTALLED VERSIONS
------------------
commit: None
python: 3.9.7 | packaged by conda-forge | (default, Sep 29 2021, 19:20:46) [GCC 9.4.0]
python-bits: 64
OS: Linux
OS-release: 5.11.0-40-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1
xarray: 0.20.1
pandas: 1.3.4
numpy: 1.19.5
scipy: 1.7.2
netCDF4: 1.5.8
pydap: None
h5netcdf: 0.11.0
h5py: 3.1.0
Nio: None
zarr: 2.10.1
cftime: 1.5.1.1
nc_time_axis: 1.4.0
PseudoNetCDF: None
rasterio: 1.2.10
cfgrib: 0.9.9.1
iris: None
bottleneck: 1.3.2
dask: 2021.11.1
distributed: 2021.11.1
matplotlib: 3.4.3
cartopy: 0.20.1
seaborn: None
numbagg: None
fsspec: 2021.11.0
cupy: None
pint: 0.18
sparse: None
setuptools: 58.5.3
pip: 21.3.1
conda: None
pytest: None
IPython: 7.29.0
sphinx: None |
{
"url": "https://api.github.com/repos/pydata/xarray/issues/6013/reactions",
"total_count": 0,
"+1": 0,
"-1": 0,
"laugh": 0,
"hooray": 0,
"confused": 0,
"heart": 0,
"rocket": 0,
"eyes": 0
} |
completed | xarray 13221727 | issue | ||||||
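For reference, the two open paths whose defaults differ (my reading of the proposal above, not quoted from the issue):

```python
import xarray as xr

ds_lazy  = xr.open_zarr("test.zarr")                    # default chunks='auto': dask-backed
ds_eager = xr.open_dataset("test.zarr", engine="zarr")  # default chunks=None: numpy-backed
```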
| 1689655334 | I_kwDOAMm_X85kthgm | 7797 | More `groupby` indexing problems | slevang 39069044 | closed | 0 | 1 | 2023-04-29T18:58:11Z | 2023-05-02T14:48:43Z | 2023-05-02T14:48:43Z | CONTRIBUTOR | **What happened?** There is still something wrong with the recent groupby indexing changes:

```python
import numpy as np
import xarray as xr

# monthly timeseries that should return "zero anomalies" everywhere
time = xr.date_range("2023-01-01", "2023-12-31", freq="MS")
data = np.linspace(-1, 1, 12)
x = xr.DataArray(data, coords={"time": time})
clim = xr.DataArray(data, coords={"month": np.arange(1, 13, 1)})

# seems to give the correct result if we use the full x, but not with a slice
x_slice = x.sel(time=["2023-04-01"])

# two typical ways of computing anomalies
anom_gb = x_slice.groupby("time.month") - clim
anom_sel = x_slice - clim.sel(month=x_slice.time.dt.month)

# passes on 2023.3.0, fails on 2023.4.2:
# the groupby version is aligning the indexes wrong, giving us something other than 0
assert anom_sel.equals(anom_gb)
```

Related: #7759 #7766 cc @dcherian

**What did you expect to happen?** No response

**Minimal Complete Verifiable Example** No response

**MVCE confirmation**

**Relevant log output** No response

**Anything else we need to know?** No response

**Environment** |
{
"url": "https://api.github.com/repos/pydata/xarray/issues/7797/reactions",
"total_count": 0,
"+1": 0,
"-1": 0,
"laugh": 0,
"hooray": 0,
"confused": 0,
"heart": 0,
"rocket": 0,
"eyes": 0
} |
completed | xarray 13221727 | issue | ||||||
| 1483235066 | PR_kwDOAMm_X85Eti0b | 7364 | Handle numpy-only attrs in `xr.where` | slevang 39069044 | closed | 0 | 1 | 2022-12-08T00:52:43Z | 2022-12-10T21:52:49Z | 2022-12-10T21:52:37Z | CONTRIBUTOR | 0 | pydata/xarray/pulls/7364 |
|
{
"url": "https://api.github.com/repos/pydata/xarray/issues/7364/reactions",
"total_count": 1,
"+1": 0,
"-1": 0,
"laugh": 0,
"hooray": 1,
"confused": 0,
"heart": 0,
"rocket": 0,
"eyes": 0
} |
xarray 13221727 | pull | |||||
| 1423114234 | I_kwDOAMm_X85U0v_6 | 7220 | `xr.where(..., keep_attrs=True)` overwrites coordinate attributes | slevang 39069044 | closed | 0 | 3 | 2022-10-25T21:17:17Z | 2022-11-30T23:35:30Z | 2022-11-30T23:35:30Z | CONTRIBUTOR | **What happened?** #6461 had some unintended consequences for `xr.where(..., keep_attrs=True)`: coordinate attributes can now be overwritten (see the sketch after this row).
|
{
"url": "https://api.github.com/repos/pydata/xarray/issues/7220/reactions",
"total_count": 0,
"+1": 0,
"-1": 0,
"laugh": 0,
"hooray": 0,
"confused": 0,
"heart": 0,
"rocket": 0,
"eyes": 0
} |
completed | xarray 13221727 | issue | ||||||
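A sketch of the failure mode the title describes (data and attrs are illustrative):

```python
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.arange(3.0),
    dims="x",
    coords={"x": ("x", np.arange(3), {"units": "m"})},
    attrs={"long_name": "data"},
)

out = xr.where(da > 0, da, 0.0, keep_attrs=True)

# Expected: the "x" coordinate keeps {"units": "m"};
# per the report, its attrs were instead overwritten
print(out.x.attrs)
```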
| 1424732975 | PR_kwDOAMm_X85Bnoaj | 7229 | Fix coordinate attr handling in `xr.where(..., keep_attrs=True)` | slevang 39069044 | closed | 0 | 5 | 2022-10-26T21:45:01Z | 2022-11-30T23:35:29Z | 2022-11-30T23:35:29Z | CONTRIBUTOR | 0 | pydata/xarray/pulls/7229 |
Reverts the `keep_attrs` changes from #6461 that caused coordinate attrs to be overwritten (#7220). |
{
"url": "https://api.github.com/repos/pydata/xarray/issues/7229/reactions",
"total_count": 0,
"+1": 0,
"-1": 0,
"laugh": 0,
"hooray": 0,
"confused": 0,
"heart": 0,
"rocket": 0,
"eyes": 0
} |
xarray 13221727 | pull | |||||
| 1198058137 | PR_kwDOAMm_X8416DPB | 6461 | Fix `xr.where(..., keep_attrs=True)` bug | slevang 39069044 | closed | 0 | 4 | 2022-04-09T03:02:40Z | 2022-10-25T22:40:15Z | 2022-04-12T02:12:39Z | CONTRIBUTOR | 0 | pydata/xarray/pulls/6461 |
Fixes a bug introduced by #4687 where passing a non-xarray object to `xr.where(..., keep_attrs=True)` would fail (see the sketch after this row). |
{
"url": "https://api.github.com/repos/pydata/xarray/issues/6461/reactions",
"total_count": 0,
"+1": 0,
"-1": 0,
"laugh": 0,
"hooray": 0,
"confused": 0,
"heart": 0,
"rocket": 0,
"eyes": 0
} |
xarray 13221727 | pull | |||||
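A sketch of the failing case as described (exact error not shown here):

```python
import xarray as xr

cond = xr.DataArray([True, False], dims="x")

# The second and third arguments are plain scalars, i.e. non-xarray objects;
# per the PR description, this combination used to fail with keep_attrs=True
print(xr.where(cond, 1, 0, keep_attrs=True))
```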
| 1381294181 | I_kwDOAMm_X85SVOBl | 7062 | Rolling mean on dask array does not preserve dtype | slevang 39069044 | closed | 0 | 2 | 2022-09-21T17:55:30Z | 2022-09-22T22:06:09Z | 2022-09-22T22:06:09Z | CONTRIBUTOR | **What happened?** Calling a rolling `mean()` on a dask-backed array does not preserve the input dtype.

**What did you expect to happen?** This is a simple enough operation that if you start with a given float dtype, you should get the same dtype back.

**Minimal Complete Verifiable Example**
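A minimal sketch along the lines of the report (the `float32` input dtype is an assumption):

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.ones(10, dtype="float32"), dims="x")

print(da.rolling(x=3).mean().dtype)                  # float32 with numpy data
print(da.chunk({"x": 5}).rolling(x=3).mean().dtype)  # float64 with dask data, per the report
```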
**MVCE confirmation**

**Relevant log output** No response

**Anything else we need to know?** #5877 is somewhat related.

**Environment**
INSTALLED VERSIONS
------------------
commit: e6791852aa7ec0b126048b0986e205e158ab9601
python: 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:10)
[GCC 10.3.0]
python-bits: 64
OS: Linux
OS-release: 5.15.0-46-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1
xarray: 2022.6.1.dev63+ge6791852.d20220921
pandas: 1.4.2
numpy: 1.21.6
scipy: 1.8.1
netCDF4: 1.5.8
pydap: installed
h5netcdf: 1.0.2
h5py: 3.6.0
Nio: None
zarr: 2.12.0
cftime: 1.6.0
nc_time_axis: 1.4.1
PseudoNetCDF: 3.2.2
rasterio: 1.2.10
cfgrib: 0.9.10.1
iris: 3.2.1
bottleneck: 1.3.4
dask: 2022.04.1
distributed: 2022.4.1
matplotlib: 3.5.2
cartopy: 0.20.2
seaborn: 0.11.2
numbagg: 0.2.1
fsspec: 2022.8.2
cupy: None
pint: 0.19.2
sparse: 0.13.0
flox: 0.5.9
numpy_groupies: 0.9.19
setuptools: 62.0.0
pip: 22.2.2
conda: None
pytest: 7.1.3
IPython: None
sphinx: None
|
{
"url": "https://api.github.com/repos/pydata/xarray/issues/7062/reactions",
"total_count": 0,
"+1": 0,
"-1": 0,
"laugh": 0,
"hooray": 0,
"confused": 0,
"heart": 0,
"rocket": 0,
"eyes": 0
} |
completed | xarray 13221727 | issue | ||||||
| 1381297782 | PR_kwDOAMm_X84_XseG | 7063 | Better dtype preservation for rolling mean on dask array | slevang 39069044 | closed | 0 | 1 | 2022-09-21T17:59:07Z | 2022-09-22T22:06:08Z | 2022-09-22T22:06:08Z | CONTRIBUTOR | 0 | pydata/xarray/pulls/7063 |
This just tests to make sure we at least get the same dtype whether we have a numpy or dask array. |
{
"url": "https://api.github.com/repos/pydata/xarray/issues/7063/reactions",
"total_count": 0,
"+1": 0,
"-1": 0,
"laugh": 0,
"hooray": 0,
"confused": 0,
"heart": 0,
"rocket": 0,
"eyes": 0
} |
xarray 13221727 | pull | |||||
| 1380016376 | PR_kwDOAMm_X84_TlHf | 7060 | More informative error for non-existent zarr store | slevang 39069044 | closed | 0 | 2 | 2022-09-20T21:27:35Z | 2022-09-20T22:38:45Z | 2022-09-20T22:38:45Z | CONTRIBUTOR | 0 | pydata/xarray/pulls/7060 |
I've often been tripped up by the stack trace noted in #6484. This PR changes two things:
|
{
"url": "https://api.github.com/repos/pydata/xarray/issues/7060/reactions",
"total_count": 1,
"+1": 0,
"-1": 0,
"laugh": 0,
"hooray": 1,
"confused": 0,
"heart": 0,
"rocket": 0,
"eyes": 0
} |
xarray 13221727 | pull | |||||
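Assuming the change does what the title says, the user-facing difference is roughly this (the exception type is my reading of the fix, not quoted from it):

```python
import xarray as xr

# Before: a missing store surfaced as a cryptic error deep in the backend
# stack; after: the failure names the missing path up front
try:
    xr.open_zarr("does-not-exist.zarr")
except FileNotFoundError as e:
    print(e)
```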
| 859218255 | MDU6SXNzdWU4NTkyMTgyNTU= | 5165 | Poor memory management with dask=2021.4.0 | slevang 39069044 | closed | 0 | 4 | 2021-04-15T20:19:05Z | 2021-04-21T12:16:31Z | 2021-04-21T10:17:40Z | CONTRIBUTOR | **What happened:**
With the latest dask release (2021.4.0), this workflow fills up memory.

**What you expected to happen:**
Dask would intelligently manage chunks and not fill up memory. This works fine in 2021.3.0.

**Minimal Complete Verifiable Example:** Generate a synthetic dataset with time/lat/lon variable and associated climatology stored to disk, then calculate the anomaly:

```python
import xarray as xr
import pandas as pd
import numpy as np
import dask.array as da

dates = pd.date_range('1980-01-01', '2019-12-31', freq='D')

ds = xr.Dataset(
    data_vars={
        'x': (
            ('time', 'lat', 'lon'),
            da.random.random(size=(dates.size, 360, 720), chunks=(1, -1, -1))),
        'clim': (
            ('dayofyear', 'lat', 'lon'),
            da.random.random(size=(366, 360, 720), chunks=(1, -1, -1))),
    },
    coords={
        'time': dates,
        'dayofyear': np.arange(1, 367, 1),
        'lat': np.arange(-90, 90, .5),
        'lon': np.arange(-180, 180, .5),
    }
)

# My original use case was pulling this data from disk, but it doesn't actually seem to matter
ds.to_zarr('test-data', mode='w')
ds = xr.open_zarr('test-data')

ds['anom'] = ds.x.groupby('time.dayofyear') - ds.clim
ds[['anom']].to_zarr('test-anom', mode='w')
```

**Anything else we need to know?:** Distributed vs local scheduler and file backend (e.g. zarr vs netcdf) don't seem to affect this. Dask graphs look the same for both 2021.3.0 and 2021.4.0:

![image](https://user-images.githubusercontent.com/39069044/114934807-fa90d800-9e0b-11eb-9dca-3410fedf3f1c.png)

**Environment:** Output of <tt>xr.show_versions()</tt>

INSTALLED VERSIONS
------------------
commit: None
python: 3.8.6 | packaged by conda-forge | (default, Dec 26 2020, 05:05:16) [GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 5.8.0-48-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4
xarray: 0.17.1.dev52+ge5690588
pandas: 1.2.1
numpy: 1.19.5
scipy: 1.6.0
netCDF4: 1.5.5.1
pydap: None
h5netcdf: 0.8.1
h5py: 2.10.0
Nio: None
zarr: 2.6.1
cftime: 1.3.1
nc_time_axis: 1.2.0
PseudoNetCDF: None
rasterio: 1.1.8
cfgrib: 0.9.8.5
iris: None
bottleneck: 1.3.2
dask: 2021.04.0
distributed: 2021.04.0
matplotlib: 3.3.3
cartopy: 0.18.0
seaborn: None
numbagg: None
pint: 0.16.1
setuptools: 49.6.0.post20210108
pip: 20.3.3
conda: None
pytest: None
IPython: 7.20.0
sphinx: None |
{
"url": "https://api.github.com/repos/pydata/xarray/issues/5165/reactions",
"total_count": 0,
"+1": 0,
"-1": 0,
"laugh": 0,
"hooray": 0,
"confused": 0,
"heart": 0,
"rocket": 0,
"eyes": 0
} |
completed | xarray 13221727 | issue | ||||||
| 797302408 | MDExOlB1bGxSZXF1ZXN0NTY0MzM0ODQ1 | 4849 | Basic curvefit implementation | slevang 39069044 | closed | 0 | 12 | 2021-01-30T01:28:16Z | 2021-03-31T16:55:53Z | 2021-03-31T16:55:53Z | CONTRIBUTOR | 0 | pydata/xarray/pulls/4849 |
This is a simple implementation of a more general curve-fitting API as discussed in #4300, using the existing scipy `curve_fit` function (see the usage sketch below). |
{
"url": "https://api.github.com/repos/pydata/xarray/issues/4849/reactions",
"total_count": 5,
"+1": 4,
"-1": 0,
"laugh": 0,
"hooray": 1,
"confused": 0,
"heart": 0,
"rocket": 0,
"eyes": 0
} |
xarray 13221727 | pull |
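The API this added, as merged, looks roughly like the following (model function and data are illustrative):

```python
import numpy as np
import xarray as xr

def exponential(x, a, b):
    """Model to fit: a * exp(-b * x)."""
    return a * np.exp(-b * x)

x = np.linspace(0, 5, 50)
da = xr.DataArray(3 * np.exp(-0.5 * x), coords={"x": x})

# Fit along "x"; returns a Dataset holding curvefit_coefficients
# (and curvefit_covariance) over a new "param" dimension
fit = da.curvefit(coords="x", func=exponential)
print(fit.curvefit_coefficients)
```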
CREATE TABLE [issues] (
[id] INTEGER PRIMARY KEY,
[node_id] TEXT,
[number] INTEGER,
[title] TEXT,
[user] INTEGER REFERENCES [users]([id]),
[state] TEXT,
[locked] INTEGER,
[assignee] INTEGER REFERENCES [users]([id]),
[milestone] INTEGER REFERENCES [milestones]([id]),
[comments] INTEGER,
[created_at] TEXT,
[updated_at] TEXT,
[closed_at] TEXT,
[author_association] TEXT,
[active_lock_reason] TEXT,
[draft] INTEGER,
[pull_request] TEXT,
[body] TEXT,
[reactions] TEXT,
[performed_via_github_app] TEXT,
[state_reason] TEXT,
[repo] INTEGER REFERENCES [repos]([id]),
[type] TEXT
);
CREATE INDEX [idx_issues_repo]
ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
ON [issues] ([user]);