issues
3 rows where user = 34276374 sorted by updated_at descending
1197117301 | issue 6456 | Writing a dataset to .zarr in a loop makes all the data NaNs
tbloch1 (34276374) | closed (not_planned) | 11 comments | author_association: NONE
created 2022-04-08T10:05:25Z | updated 2023-10-14T20:30:49Z | closed 2023-10-14T20:30:48Z | repo: xarray (13221727) | type: issue

What happened?
I have lots (61) of pickled pandas dataframes that I'm trying to convert from pickle/pandas to zarr/xarray. Since the dataframes are large (10000x2048), I can't load them all into memory. To get around this, I'm looping through the pickle files (MCVE below), reading them into dataframes, constructing DataArrays and then Datasets from the data, concatenating each dataset with the previous one, and updating the dataset variable to point to the new concatenated dataset. Since I didn't want to use up too much memory, I'm also periodically writing the Dataset to .zarr inside the loop and reopening it (hoping to make use of dask storing data on disk?). When I do this, however, the final dataset ends up being all NaNs.

What did you expect to happen?
I expected the final dataset to contain all the concatenated data.

Minimal Complete Verifiable Example
```Python
import glob

import numpy as np
import pandas as pd
import xarray as xr
from tqdm import tqdm

# Creating pkl files
[pd.DataFrame(np.random.randint(0, 10, (1000, 500))).astype(object).to_pickle('df{}.pkl'.format(i))
 for i in range(4)]

fnames = glob.glob('*.pkl')

df = pd.read_pickle(fnames[0])
df.columns = np.arange(0, 500).astype(object)  # the real pkl files contain all objects
df.index = np.arange(0, 1000).astype(object)
df = df.astype(np.float32)

ds = xr.DataArray(df.values, dims=['fname', 'res_dim'],
                  coords={'fname': df.index.values,
                          'res_dim': df.columns.values})
ds = ds.to_dataset(name='low_dim')

for idx, fname in enumerate(tqdm(fnames[1:])):
    df = pd.read_pickle(fname)
    df.columns = np.arange(0, 500).astype(object)
    df.index = np.arange(0, 1000).astype(object)
    df = df.astype(np.float32)

    # (The middle of the loop body is missing from this extract; per the
    # description above, it builds a Dataset from df, concatenates it onto
    # ds, and periodically writes and reopens the store.)
    ds2 = xr.DataArray(df.values, dims=['fname', 'res_dim'],
                       coords={'fname': df.index.values,
                               'res_dim': df.columns.values}).to_dataset(name='low_dim')
    ds = xr.concat([ds, ds2], dim='fname')

    ds.to_zarr('zarr_bug.zarr', mode='w')
    ds = xr.open_zarr('zarr_bug.zarr')

print(ds.low_dim.values)
```

Relevant log output

Anything else we need to know?
If I get rid of the loop saving, everything works normally.

Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.9.11 (main, Mar 28 2022, 10:10:35) [GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 5.11.0-27-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: None
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.0
libnetcdf: 4.7.4
xarray: 2022.3.0
pandas: 1.4.1
numpy: 1.21.0
scipy: 1.8.0
netCDF4: 1.5.8
pydap: installed
h5netcdf: 1.0.0
h5py: 3.6.0
Nio: None
zarr: 2.11.1
cftime: 1.6.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.2.10
cfgrib: 0.9.10.1
iris: None
bottleneck: None
dask: 2022.03.0
distributed: 2022.3.0
matplotlib: 3.5.1
cartopy: None
seaborn: 0.11.2
numbagg: None
fsspec: 2022.02.0
cupy: None
pint: None
sparse: None
setuptools: 58.0.4
pip: 21.2.4
conda: None
pytest: None
IPython: 8.1.1
sphinx: None
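A plausible reading of the failure, hedged since the row itself gives no diagnosis: `xr.open_zarr` returns lazily loaded arrays backed by the store, so on the next pass `to_zarr(..., mode='w')` overwrites the very store the in-memory dataset still reads from. Below is a minimal sketch of a pattern that avoids the read-then-overwrite cycle, assuming the goal is incremental on-disk accumulation; `make_chunk` and the store path are hypothetical, not from the report:

```Python
import numpy as np
import xarray as xr

def make_chunk(i):
    # Hypothetical stand-in for one converted dataframe.
    return xr.DataArray(np.full((1000, 500), float(i), dtype=np.float32),
                        dims=['fname', 'res_dim']).to_dataset(name='low_dim')

store = 'accumulated.zarr'
make_chunk(0).to_zarr(store, mode='w')  # first write creates the store

for i in range(1, 4):
    # Appending extends the on-disk arrays instead of re-reading and then
    # overwriting a store that an open dataset still lazily references.
    make_chunk(i).to_zarr(store, append_dim='fname')

print(xr.open_zarr(store).low_dim.shape)  # (4000, 500)
```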
1674532233 | issue 7767 | Inconsistency between xr.where() and da.where()
tbloch1 (34276374) | closed (completed) | 6 comments | author_association: NONE
created 2023-04-19T09:30:02Z | updated 2023-09-20T19:25:58Z | closed 2023-09-20T19:25:58Z | repo: xarray (13221727) | type: issue

What is your issue?
Example: [not preserved in this export]
It seems like these two methods with the same name should have the same functionality, but they give inverse results.
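Since the reporter's example is missing above, here is a hedged reconstruction of the comparison using standard xarray semantics (not necessarily the reporter's exact code): `xr.where(cond, x, y)` takes `x` where the condition holds, while `da.where(cond, other)` keeps `da` where the condition holds, so the replacement value lands in the opposite slot:

```Python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(4))
cond = da > 1

# Top-level function: second argument where cond is True, third where False.
print(xr.where(cond, -1, da).values)  # [ 0  1 -1 -1]

# Method: keep da where cond is True, fill with `other` where False -- the
# replacement lands where cond is *False*, which can read as the inverse.
print(da.where(cond, -1).values)      # [-1 -1  2  3]
```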
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7767/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
1318369110 | issue 6828 | xarray.DataArray.str.cat() doesn't work on chunked data
tbloch1 (34276374) | open | 3 comments | author_association: NONE
created 2022-07-26T14:58:16Z | updated 2023-01-17T18:36:14Z | repo: xarray (13221727) | type: issue

What happened?
I was trying to concatenate some DataArrays of strings, and it kept just returning the first DataArray without any changes.

What did you expect to happen?
I was expecting it to return the strings concatenated together, with the separator between them.

Minimal Complete Verifiable Example
```Python
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.zeros((2, 2)).astype(str),
    coords={'x': np.arange(2), 'y': np.arange(2)},
    dims=['x', 'y'])
dac = da.chunk()

print((da == dac).values.all())
print((da.str.cat(da, sep='--') == dac.str.cat(dac, sep='--')).values.all())
print((da.str.cat(da, sep='--') == dac.compute().str.cat(dac.compute(), sep='--')).values.all())
```
MVCE confirmation

Relevant log output
No response

Anything else we need to know?
No response

Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.10 (default, Mar 15 2022, 12:22:08)
[GCC 9.4.0]
python-bits: 64
OS: Linux
OS-release: 5.11.0-27-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.9.0
xarray: 2022.3.0
pandas: 1.4.2
numpy: 1.22.4
scipy: 1.8.1
netCDF4: 1.6.0
pydap: None
h5netcdf: 1.0.1
h5py: 3.6.0
Nio: None
zarr: 2.11.3
cftime: 1.6.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.2.10
cfgrib: None
iris: None
bottleneck: None
dask: 2022.7.0
distributed: None
matplotlib: 3.5.2
cartopy: None
seaborn: None
numbagg: None
fsspec: 2022.5.0
cupy: None
pint: None
sparse: None
setuptools: 62.3.2
pip: 22.1.2
conda: None
pytest: None
IPython: 8.3.0
sphinx: None
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6828/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue |
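The MCVE's last comparison already hints at a workaround: computing the chunked arrays first makes str.cat behave like the in-memory case. A minimal sketch, reusing the names from the MCVE above:

```Python
import numpy as np
import xarray as xr

da = xr.DataArray(np.zeros((2, 2)).astype(str),
                  coords={'x': np.arange(2), 'y': np.arange(2)},
                  dims=['x', 'y'])
dac = da.chunk()

# Work around the chunked-data bug by materializing before concatenating.
joined = dac.compute().str.cat(dac.compute(), sep='--')
print(joined.values)  # every element is '0.0--0.0'
```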
```sql
CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo] ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone] ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee] ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user] ON [issues] ([user]);
```
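For reference, the selection at the top of the page ("3 rows where user = 34276374 sorted by updated_at descending") corresponds to a straightforward query against this schema. A sketch using Python's sqlite3, where the database filename is an assumption:

```Python
import sqlite3

# Hypothetical local copy of the database behind this page.
conn = sqlite3.connect('github.db')

# Same selection as the page header: this user's issues, newest update first.
rows = conn.execute(
    "SELECT number, title, state, updated_at "
    "FROM issues WHERE [user] = ? ORDER BY updated_at DESC",
    (34276374,),
).fetchall()

for number, title, state, updated_at in rows:
    print(number, state, updated_at, title)
```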