issues
2 rows where repo = 13221727, state = "open", and user = 40218891, sorted by updated_at descending
id | node_id | number | title | user | state | locked | comments | created_at | updated_at | author_association | repo | type
---|---|---|---|---|---|---|---|---|---|---|---|---
1966264258 | I_kwDOAMm_X851Ms_C | 8385 | The method to_netcdf does not preserve chunks | yt87 (40218891) | open | 0 | 3 | 2023-10-27T22:29:45Z | 2023-10-31T18:51:45Z | NONE | xarray (13221727) | issue
789653499 | MDU6SXNzdWU3ODk2NTM0OTk= | 4830 | GH2550 revisited | yt87 (40218891) | open | 0 | 2 | 2021-01-20T05:40:16Z | 2021-01-25T23:06:01Z | NONE | xarray (13221727) | issue
Issue 8385: The method to_netcdf does not preserve chunks

What happened?

Methods to_netcdf and to_zarr handle dask chunks differently: to_netcdf drops the original chunks unless chunk sizes are passed explicitly through encoding, while to_zarr preserves them.

What did you expect to happen?

I expected the behaviour to be consistent for all output methods.

Minimal Complete Verifiable Example

```python
import xarray as xr
import dask.array as da

rng = da.random.RandomState()
shape = (20, 20)
chunks = [10, 10]
dims = ["x", "y"]
z = rng.standard_normal(shape, chunks=chunks)
ds = xr.DataArray(z, dims=dims, name="z").to_dataset()
ds.chunks
# Frozen({'x': (10, 10), 'y': (10, 10)})

# This one is rechunked
ds.to_netcdf("/tmp/test1.nc", encoding={"z": {"chunksizes": (5, 5)}})

# This one is not rechunked, also original chunks are lost
ds.chunk({"x": 5, "y": 5}).to_netcdf("/tmp/test2.nc")

# This one is rechunked
ds.chunk({"x": 5, "y": 5}).to_zarr("/tmp/test2", mode="w")
# <xarray.backends.zarr.ZarrStore at 0x7f3669f1af80>

xr.open_mfdataset("/tmp/test1.nc").chunks
# Frozen({'x': (5, 5, 5, 5), 'y': (5, 5, 5, 5)})
xr.open_mfdataset("/tmp/test2.nc").chunks
# Frozen({'x': (20,), 'y': (20,)})
xr.open_mfdataset("/tmp/test2", engine="zarr").chunks
# Frozen({'x': (5, 5, 5, 5), 'y': (5, 5, 5, 5)})
```

MVCE confirmation
Relevant log output

No response

Anything else we need to know?

I did get the same results for …

Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.11.6 | packaged by conda-forge | (main, Oct 3 2023, 10:40:35) [GCC 12.3.0]
python-bits: 64
OS: Linux
OS-release: 6.5.5-1-MANJARO
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.2
libnetcdf: 4.9.2
xarray: 2023.10.1
pandas: 2.1.1
numpy: 1.24.4
scipy: 1.11.3
netCDF4: 1.6.4
pydap: None
h5netcdf: 1.2.0
h5py: 3.10.0
Nio: None
zarr: 2.16.1
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: 1.3.7
dask: 2023.10.0
distributed: 2023.10.0
matplotlib: 3.8.0
cartopy: 0.22.0
seaborn: None
numbagg: 0.5.1
fsspec: 2023.10.0
cupy: None
pint: None
sparse: 0.14.0
flox: 0.8.1
numpy_groupies: 0.10.2
setuptools: 68.2.2
pip: 23.3.1
conda: None
pytest: None
mypy: None
IPython: 8.16.1
sphinx: None
Reactions (https://api.github.com/repos/pydata/xarray/issues/8385/reactions): total 3, +1: 3
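Not part of the issue, but a possible interim workaround consistent with the first example in the MVCE: copy the dataset's dask chunk sizes into the netCDF encoding before writing. A minimal sketch (the helper name `to_netcdf_keep_chunks` is made up; it assumes dask-backed variables and the netCDF4 engine, which accepts a `chunksizes` encoding):

```python
import xarray as xr

def to_netcdf_keep_chunks(ds: xr.Dataset, path: str) -> None:
    """Write ds to netCDF, carrying its dask chunk sizes into the encoding."""
    encoding = {}
    for name, var in ds.data_vars.items():
        if var.chunks is not None:
            # DataArray.chunks is a per-dimension tuple of block sizes;
            # use the first block along each dimension as the on-disk
            # chunk shape (assumes uniform chunking).
            encoding[name] = {"chunksizes": tuple(c[0] for c in var.chunks)}
    ds.to_netcdf(path, encoding=encoding)
```

With this helper, `to_netcdf_keep_chunks(ds.chunk({"x": 5, "y": 5}), "/tmp/test2.nc")` should round-trip with (5, 5) chunks, matching the zarr behaviour shown above.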
Issue 4830: GH2550 revisited

Is your feature request related to a problem? Please describe.

I am retrieving files from AWS: https://registry.opendata.aws/wrf-se-alaska-snap/. An example:

```python
import s3fs
import xarray as xr

s3 = s3fs.S3FileSystem(anon=True)
s3path = 's3://wrf-se-ak-ar5/gfdl/hist/daily/1980/WRFDS_1980-01-0[12].nc'
remote_files = s3.glob(s3path)
fileset = [s3.open(file) for file in remote_files]
ds = xr.open_mfdataset(fileset, concat_dim='Time', decode_cf=False)
ds
```

Data files for 1980 are missing the time coordinate, so the above code fails. The time could be obtained by parsing the file name, but in the current implementation the source attribute is available only when the fileset consists of strings or Paths.

Describe the solution you'd like

I would suggest returning to the original suggestion in #2550: pass filename_or_object as an argument to the preprocess function, but with the necessary inspection. Here is my attempt (code in open_mfdataset):

```python
open_kwargs = dict(
    engine=engine, chunks=chunks or {}, lock=lock, autoclose=autoclose, **kwargs
)
```
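The attempt snippet breaks off after `open_kwargs`. As a purely hypothetical sketch of the "necessary inspection" (the helper name `apply_preprocess` and the parameter-counting approach are illustrative, not from the issue), open_mfdataset could dispatch on the preprocess signature like this:

```python
import inspect
from typing import Any, Callable

def apply_preprocess(preprocess: Callable[..., Any], ds: Any, path: Any) -> Any:
    """Call preprocess with (ds,) or (ds, path) depending on its signature."""
    sig = inspect.signature(preprocess)
    # Count required positional parameters. Arguments already bound with
    # functools.partial acquire defaults, so partial(fix3, arg=...) below
    # still counts as a two-argument preprocess.
    required = [
        p
        for p in sig.parameters.values()
        if p.default is inspect.Parameter.empty
        and p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD)
    ]
    if len(required) == 1:
        return preprocess(ds)
    return preprocess(ds, path)
```

Existing one-argument preprocess functions would keep working unchanged, which appears to be the backwards-compatibility concern behind #2550.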
With such a change, the AWS example could then pass a preprocess that also receives the file:

```python
# 'fix' stands for any preprocess function of the two-argument form.
ds = xr.open_mfdataset(fileset, preprocess=fix, concat_dim='Time', decode_cf=False)
```

A quick test with local files, mixing one- and two-argument preprocess functions (extra arguments bound with functools.partial):

```python
from functools import partial
from pathlib import Path

def fix1(ds):
    print('fix1')
    return ds

def fix2(ds, file):
    print('fix2:', file.as_uri())
    return ds

def fix3(ds, file, arg):
    print('fix3:', file.as_uri(), arg)
    return ds

fileset = [Path('/home/george/Downloads/WRFDS_1988-04-23.nc'),
           Path('/home/george/Downloads/WRFDS_1988-04-24.nc')]

ds = xr.open_mfdataset(fileset, preprocess=fix1, concat_dim='Time', parallel=True)
ds = xr.open_mfdataset(fileset, preprocess=fix2, concat_dim='Time')
ds = xr.open_mfdataset(fileset, preprocess=partial(fix3, arg='additional argument'),
                       concat_dim='Time')
```
Reactions (https://api.github.com/repos/pydata/xarray/issues/4830/reactions): none
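As a side note (not part of the issue): with a two-argument preprocess of the fix2 form, the missing 1980 time coordinate could be reconstructed from the file name. A rough sketch, assuming each daily file contributes a single step along the Time dimension and that the open file object exposes its key as `.path` (fsspec file objects, including those from s3fs, do):

```python
import re
from datetime import datetime

import xarray as xr

def add_time_from_name(ds: xr.Dataset, file) -> xr.Dataset:
    # Keys look like .../WRFDS_1980-01-01.nc; recover the date and attach
    # it as the Time coordinate missing from the 1980 files.
    name = str(getattr(file, "path", file))
    m = re.search(r"WRFDS_(\d{4}-\d{2}-\d{2})\.nc$", name)
    if m is None:
        return ds  # leave datasets with unrecognised names untouched
    t = datetime.strptime(m.group(1), "%Y-%m-%d")
    return ds.assign_coords(Time=("Time", [t]))
```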
```sql
CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo] ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone] ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee] ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user] ON [issues] ([user]);
```
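For reference, the two rows on this page correspond to a query like the following against that schema, shown here via Python's built-in sqlite3 module (the database file name is a placeholder):

```python
import sqlite3

# "github.db" is a placeholder for the SQLite file that holds this schema.
con = sqlite3.connect("github.db")
rows = con.execute(
    """
    SELECT [id], [number], [title], [updated_at]
    FROM [issues]
    WHERE [repo] = ? AND [state] = ? AND [user] = ?
    ORDER BY [updated_at] DESC
    """,
    (13221727, "open", 40218891),
).fetchall()
print(rows)
```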