issue_comments
104 rows where user = 90008 sorted by updated_at descending
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
1515539273 | https://github.com/pydata/xarray/issues/7770#issuecomment-1515539273 | https://api.github.com/repos/pydata/xarray/issues/7770 | IC_kwDOAMm_X85aVUtJ | hmaarrfk 90008 | 2023-04-20T00:15:23Z | 2023-04-20T00:15:23Z | CONTRIBUTOR | Understood. Thank you for your prompt replies. I'll read up and ask again if I have any questions. I guess in the past I was trying to accommodate users that were not using our wrappers to |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Provide a public API for adding new backends 1675299031 | |
1484222279 | https://github.com/pydata/xarray/pull/4400#issuecomment-1484222279 | https://api.github.com/repos/pydata/xarray/issues/4400 | IC_kwDOAMm_X85Yd29H | hmaarrfk 90008 | 2023-03-26T20:59:00Z | 2023-03-26T20:59:00Z | CONTRIBUTOR | nice! |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
[WIP] Support nano second time encoding. 690546795 | |
1434780029 | https://github.com/pydata/xarray/issues/4079#issuecomment-1434780029 | https://api.github.com/repos/pydata/xarray/issues/4079 | IC_kwDOAMm_X85VhQF9 | hmaarrfk 90008 | 2023-02-17T15:08:50Z | 2023-02-17T15:08:50Z | CONTRIBUTOR | I know it is "stale", but aligning to these "surprise dimensions" creates "late stage" bugs that are hard to pinpoint. I'm not sure if it is possible to mark these dimensions as "unnamed" and, as such, have them "merged" into new "unnamed" dimensions that the user isn't tracking at this point in time. Our workarounds have included calling these dimensions something related to the DataArray (a sketch of this workaround follows below).

```python
import xarray as xr

d1 = xr.DataArray(data=[1, 2])
assert 'dim_0' in d1.dims

d2 = xr.DataArray(data=[1, 2, 3])
assert 'dim_0' in d2.dims

xr.Dataset({'d1': d1, 'd2': d2})
```

Stack trace:

```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[2], line 7
      4 d2 = xr.DataArray(data=[1, 2, 3])
      5 assert 'dim_0' in d2.dims
----> 7 xr.Dataset({'d1': d1, 'd2': d2})

File ~/mambaforge/envs/dev/lib/python3.9/site-packages/xarray/core/dataset.py:612, in Dataset.__init__(self, data_vars, coords, attrs)
    609 if isinstance(coords, Dataset):
    610     coords = coords.variables
--> 612 variables, coord_names, dims, indexes, _ = merge_data_and_coords(
    613     data_vars, coords, compat="broadcast_equals"
    614 )
    616 self._attrs = dict(attrs) if attrs is not None else None
    617 self._close = None

File ~/mambaforge/envs/dev/lib/python3.9/site-packages/xarray/core/merge.py:564, in merge_data_and_coords(data_vars, coords, compat, join)
    562 objects = [data_vars, coords]
    563 explicit_coords = coords.keys()
--> 564 return merge_core(
    565     objects,
    566     compat,
    567     join,
    568     explicit_coords=explicit_coords,
    569     indexes=Indexes(indexes, coords),
    570 )

File ~/mambaforge/envs/dev/lib/python3.9/site-packages/xarray/core/merge.py:741, in merge_core(objects, compat, join, combine_attrs, priority_arg, explicit_coords, indexes, fill_value)
    738 _assert_compat_valid(compat)
    740 coerced = coerce_pandas_values(objects)
--> 741 aligned = deep_align(
    742     coerced, join=join, copy=False, indexes=indexes, fill_value=fill_value
    743 )
    744 collected = collect_variables_and_indexes(aligned, indexes=indexes)
    745 prioritized = _get_priority_vars_and_indexes(aligned, priority_arg, compat=compat)

File ~/mambaforge/envs/dev/lib/python3.9/site-packages/xarray/core/alignment.py:848, in deep_align(objects, join, copy, indexes, exclude, raise_on_invalid, fill_value)
    845 else:
    846     out.append(variables)
--> 848 aligned = align(
    849     *targets,
    850     join=join,
    851     copy=copy,
    852     indexes=indexes,
    853     exclude=exclude,
    854     fill_value=fill_value,
    855 )
    857 for position, key, aligned_obj in zip(positions, keys, aligned):
    858     if key is no_key:

File ~/mambaforge/envs/dev/lib/python3.9/site-packages/xarray/core/alignment.py:785, in align(join, copy, indexes, exclude, fill_value, *objects)
    589 """
    590 Given any number of Dataset and/or DataArray objects, returns new
    591 objects with aligned indexes and dimension sizes.
    (...)
    775
    776 """
    777 aligner = Aligner(
    778     objects,
    779     join=join,
    (...)
    783     fill_value=fill_value,
    784 )
--> 785 aligner.align()
    786 return aligner.results

File ~/mambaforge/envs/dev/lib/python3.9/site-packages/xarray/core/alignment.py:573, in Aligner.align(self)
    571 self.assert_no_index_conflict()
    572 self.align_indexes()
--> 573 self.assert_unindexed_dim_sizes_equal()
    575 if self.join == "override":
    576     self.override_indexes()

File ~/mambaforge/envs/dev/lib/python3.9/site-packages/xarray/core/alignment.py:472, in Aligner.assert_unindexed_dim_sizes_equal(self)
    470     add_err_msg = ""
    471 if len(sizes) > 1:
--> 472     raise ValueError(
    473         f"cannot reindex or align along dimension {dim!r} "
    474         f"because of conflicting dimension sizes: {sizes!r}" + add_err_msg
    475     )

ValueError: cannot reindex or align along dimension 'dim_0' because of conflicting dimension sizes: {2, 3}
```

cc: @claydugo |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Unnamed dimensions 621078539 | |
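The workaround mentioned above can be sketched as follows. This is a minimal illustration, not xarray's behavior: the d1_dim_0/d2_dim_0 names are made up, and the point is only that per-variable dimension names stop the constructor from aligning unrelated axes.

```python
import xarray as xr

# Give each otherwise-unnamed dimension a name derived from its variable so
# the Dataset constructor no longer tries to align the two against each other.
d1 = xr.DataArray(data=[1, 2]).rename({'dim_0': 'd1_dim_0'})
d2 = xr.DataArray(data=[1, 2, 3]).rename({'dim_0': 'd2_dim_0'})

ds = xr.Dataset({'d1': d1, 'd2': d2})  # no ValueError: the dims no longer conflict
assert ds['d1'].dims == ('d1_dim_0',)
assert ds['d2'].dims == ('d2_dim_0',)
```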
1421384646 | https://github.com/pydata/xarray/issues/7513#issuecomment-1421384646 | https://api.github.com/repos/pydata/xarray/issues/7513 | IC_kwDOAMm_X85UuJvG | hmaarrfk 90008 | 2023-02-07T20:15:42Z | 2023-02-07T20:15:42Z | CONTRIBUTOR | I kinda think this reminds me of |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
intermittent failures with h5netcdf, h5py on macos 1574694462 | |
1412388313 | https://github.com/pydata/xarray/issues/5081#issuecomment-1412388313 | https://api.github.com/repos/pydata/xarray/issues/5081 | IC_kwDOAMm_X85UL1XZ | hmaarrfk 90008 | 2023-02-01T16:54:53Z | 2023-02-01T16:54:53Z | CONTRIBUTOR | As a follow-up question, is LazilyIndexedArray part of the "public API"? That is, when you do decide to refactor, https://docs.xarray.dev/en/stable/generated/xarray.core.indexing.LazilyIndexedArray.html will you try to warn those of us who choose to
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Lazy indexing arrays as a stand-alone package 842436143 | |
1412379773 | https://github.com/pydata/xarray/issues/5081#issuecomment-1412379773 | https://api.github.com/repos/pydata/xarray/issues/5081 | IC_kwDOAMm_X85ULzR9 | hmaarrfk 90008 | 2023-02-01T16:49:15Z | 2023-02-01T16:49:15Z | CONTRIBUTOR | I'm going to say, the |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Lazy indexing arrays as a stand-alone package 842436143 | |
1411104404 | https://github.com/pydata/xarray/pull/4395#issuecomment-1411104404 | https://api.github.com/repos/pydata/xarray/issues/4395 | IC_kwDOAMm_X85UG76U | hmaarrfk 90008 | 2023-01-31T21:39:15Z | 2023-01-31T21:39:15Z | CONTRIBUTOR | Ultimately, I'm not sure how you want to manage resources. This zarr store could be considered a resource and thus may have an owner (a sketch of that option follows below). Or maybe zarr should close itself upon garbage collection. |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
WIP: Ensure that zarr.ZipStores are closed 689502005 | |
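A sketch of the explicit-ownership option discussed above, assuming zarr's ZipStore context-manager support; the file name and contents are arbitrary:

```python
import zarr

# The caller owns the store: the `with` block closes it deterministically
# instead of relying on garbage collection.
with zarr.ZipStore('example.zip', mode='w') as store:
    root = zarr.group(store=store)
    root.create_dataset('x', data=[1, 2, 3])
# the store is closed here, by its owner
```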
1411102778 | https://github.com/pydata/xarray/pull/4395#issuecomment-1411102778 | https://api.github.com/repos/pydata/xarray/issues/4395 | IC_kwDOAMm_X85UG7g6 | hmaarrfk 90008 | 2023-01-31T21:38:23Z | 2023-01-31T21:38:23Z | CONTRIBUTOR | I'm not sure. I decided not to use zarr (not now), so I lost interest, sorry. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
WIP: Ensure that zarr.ZipStores are closed 689502005 | |
1383192335 | https://github.com/pydata/xarray/issues/7245#issuecomment-1383192335 | https://api.github.com/repos/pydata/xarray/issues/7245 | IC_kwDOAMm_X85ScdcP | hmaarrfk 90008 | 2023-01-15T16:23:15Z | 2023-01-15T16:23:15Z | CONTRIBUTOR | Thank you for your explanation. Do you think it is safe to "strip" encoding after "loading" the data? Or is it still used after the initial call to |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
coordinates not removed for variable encoding during reset_coords 1432388736 | |
1369001951 | https://github.com/pydata/xarray/issues/7245#issuecomment-1369001951 | https://api.github.com/repos/pydata/xarray/issues/7245 | IC_kwDOAMm_X85RmU_f | hmaarrfk 90008 | 2023-01-02T14:41:45Z | 2023-01-02T14:41:45Z | CONTRIBUTOR | Kind bump |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
coordinates not removed for variable encoding during reset_coords 1432388736 | |
1362322800 | https://github.com/pydata/xarray/pull/7356#issuecomment-1362322800 | https://api.github.com/repos/pydata/xarray/issues/7356 | IC_kwDOAMm_X85RM2Vw | hmaarrfk 90008 | 2022-12-22T02:40:59Z | 2022-12-22T02:40:59Z | CONTRIBUTOR | Any chance of a release? This is quite breaking for large datasets that can only be handled out of memory. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Avoid loading entire dataset by getting the nbytes in an array 1475567394 | |
1346924547 | https://github.com/pydata/xarray/pull/7356#issuecomment-1346924547 | https://api.github.com/repos/pydata/xarray/issues/7356 | IC_kwDOAMm_X85QSHAD | hmaarrfk 90008 | 2022-12-12T17:27:47Z | 2022-12-12T17:27:47Z | CONTRIBUTOR | 👍🏾 |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Avoid loading entire dataset by getting the nbytes in an array 1475567394 | |
1339624818 | https://github.com/pydata/xarray/pull/7356#issuecomment-1339624818 | https://api.github.com/repos/pydata/xarray/issues/7356 | IC_kwDOAMm_X85P2Q1y | hmaarrfk 90008 | 2022-12-06T16:19:19Z | 2022-12-06T16:19:19Z | CONTRIBUTOR | Yes, without chunks or anything |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Avoid loading entire dataset by getting the nbytes in an array 1475567394 | |
1339624418 | https://github.com/pydata/xarray/pull/7356#issuecomment-1339624418 | https://api.github.com/repos/pydata/xarray/issues/7356 | IC_kwDOAMm_X85P2Qvi | hmaarrfk 90008 | 2022-12-06T16:18:59Z | 2022-12-06T16:18:59Z | CONTRIBUTOR | Very smart test! |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Avoid loading entire dataset by getting the nbytes in an array 1475567394 | |
1339457617 | https://github.com/pydata/xarray/pull/7356#issuecomment-1339457617 | https://api.github.com/repos/pydata/xarray/issues/7356 | IC_kwDOAMm_X85P1oBR | hmaarrfk 90008 | 2022-12-06T14:18:11Z | 2022-12-06T14:18:11Z | CONTRIBUTOR | The data is loaded from a NetCDF store through open_dataset |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Avoid loading entire dataset by getting the nbytes in an array 1475567394 | |
1339452942 | https://github.com/pydata/xarray/pull/7356#issuecomment-1339452942 | https://api.github.com/repos/pydata/xarray/issues/7356 | IC_kwDOAMm_X85P1m4O | hmaarrfk 90008 | 2022-12-06T14:14:57Z | 2022-12-06T14:14:57Z | CONTRIBUTOR | No explicit test was added to ensure that the data wasn't loaded. I just experienced this bug enough (we would accidentally load 100GB files in our code base) that I knew exactly how to fix it. If you want, I can add a test to ensure that future optimizations to nbytes do not trigger a data load (a sketch of such a guard test follows below). I was hoping the one-line fix would be a shoo-in. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Avoid loading entire dataset by getting the nbytes in an array 1475567394 | |
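A guard test of the kind offered above could look roughly like this. It is a hypothetical sketch (LoadRecordingArray is made up, not an xarray fixture); the point is that nbytes is derivable from shape and dtype alone, so computing it should never materialize the values:

```python
import numpy as np

class LoadRecordingArray:
    """Wraps an array and records whether its values were ever materialized."""

    def __init__(self, arr):
        self._arr = arr
        self.loaded = False

    @property
    def shape(self):
        return self._arr.shape  # metadata only, no data access

    @property
    def dtype(self):
        return self._arr.dtype  # metadata only, no data access

    def __array__(self, dtype=None):
        self.loaded = True  # any conversion to a real array trips this flag
        return np.asarray(self._arr, dtype=dtype)

wrapped = LoadRecordingArray(np.zeros((1000, 1000)))

# nbytes from shape and itemsize: the data itself is never read
nbytes = int(np.prod(wrapped.shape)) * wrapped.dtype.itemsize
assert nbytes == 8_000_000
assert not wrapped.loaded
```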
1336731702 | https://github.com/pydata/xarray/pull/7356#issuecomment-1336731702 | https://api.github.com/repos/pydata/xarray/issues/7356 | IC_kwDOAMm_X85PrOg2 | hmaarrfk 90008 | 2022-12-05T04:20:08Z | 2022-12-05T04:20:08Z | CONTRIBUTOR | It seems that checking hasattr on the |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Avoid loading entire dataset by getting the nbytes in an array 1475567394 | |
1336711830 | https://github.com/pydata/xarray/pull/7356#issuecomment-1336711830 | https://api.github.com/repos/pydata/xarray/issues/7356 | IC_kwDOAMm_X85PrJqW | hmaarrfk 90008 | 2022-12-05T03:58:50Z | 2022-12-05T03:58:50Z | CONTRIBUTOR | I think that at the very least, the current implementation works as well as the old one for arrays that are defined by the |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Avoid loading entire dataset by getting the nbytes in an array 1475567394 | |
1336700669 | https://github.com/pydata/xarray/pull/7356#issuecomment-1336700669 | https://api.github.com/repos/pydata/xarray/issues/7356 | IC_kwDOAMm_X85PrG79 | hmaarrfk 90008 | 2022-12-05T03:36:31Z | 2022-12-05T03:36:31Z | CONTRIBUTOR | Looking into the history a little more, I seem to be proposing to revert: https://github.com/pydata/xarray/commit/60f8c3d3488d377b0b21009422c6121e1c8f1f70 I think this is important since many users have arrays that are larger than memory. For me, I found this bug when trying to access the number of bytes in a 16GB dataset that I was trying to load on my wimpy laptop. Not fun to start swapping. I feel like others might be hitting this too. xref: https://github.com/pydata/xarray/pull/6797 https://github.com/pydata/xarray/issues/4842 |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Avoid loading entire dataset by getting the nbytes in an array 1475567394 | |
1336696899 | https://github.com/pydata/xarray/pull/7356#issuecomment-1336696899 | https://api.github.com/repos/pydata/xarray/issues/7356 | IC_kwDOAMm_X85PrGBD | hmaarrfk 90008 | 2022-12-05T03:30:31Z | 2022-12-05T03:30:31Z | CONTRIBUTOR | I personally do not even think the |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Avoid loading entire dataset by getting the nbytes in an array 1475567394 | |
1320956883 | https://github.com/pydata/xarray/issues/7259#issuecomment-1320956883 | https://api.github.com/repos/pydata/xarray/issues/7259 | IC_kwDOAMm_X85OvDPT | hmaarrfk 90008 | 2022-11-19T19:51:27Z | 2022-11-19T19:51:27Z | CONTRIBUTOR | I'm really not sure. It seems to happen with a large swath of versions from my recent search. Also, running from the Python REPL, I don't see the warning, which makes me feel like numpy/cython/netcdf4 are trying to suppress the harmless warning. https://github.com/cython/cython/blob/0.29.x/Cython/Utility/ImportExport.c#L365 |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
🐛 NetCDF4 RuntimeWarning if xarray is imported before netCDF4 1437481995 | |
1320953994 | https://github.com/pydata/xarray/issues/7259#issuecomment-1320953994 | https://api.github.com/repos/pydata/xarray/issues/7259 | IC_kwDOAMm_X85OvCiK | hmaarrfk 90008 | 2022-11-19T19:33:37Z | 2022-11-19T19:33:37Z | CONTRIBUTOR | one or the other. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
🐛 NetCDF4 RuntimeWarning if xarray is imported before netCDF4 1437481995 | |
1320953155 | https://github.com/pydata/xarray/issues/7259#issuecomment-1320953155 | https://api.github.com/repos/pydata/xarray/issues/7259 | IC_kwDOAMm_X85OvCVD | hmaarrfk 90008 | 2022-11-19T19:28:20Z | 2022-11-19T19:28:53Z | CONTRIBUTOR | I think it is a numpy thing
```
$ mamba list
# packages in environment at /home/mark/mambaforge/envs/np:
#
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_gnu conda-forge
bzip2 1.0.8 h7f98852_4 conda-forge
c-ares 1.18.1 h7f98852_0 conda-forge
ca-certificates 2022.9.24 ha878542_0 conda-forge
cftime 1.6.2 py311h4c7f6c3_1 conda-forge
curl 7.86.0 h2283fc2_1 conda-forge
hdf4 4.2.15 h9772cbc_5 conda-forge
hdf5 1.12.2 nompi_h4df4325_100 conda-forge
icu 70.1 h27087fc_0 conda-forge
jpeg 9e h166bdaf_2 conda-forge
keyutils 1.6.1 h166bdaf_0 conda-forge
krb5 1.19.3 h08a2579_0 conda-forge
ld_impl_linux-64 2.39 hc81fddc_0 conda-forge
libblas 3.9.0 16_linux64_openblas conda-forge
libcblas 3.9.0 16_linux64_openblas conda-forge
libcurl 7.86.0 h2283fc2_1 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 h516909a_1 conda-forge
libffi 3.4.2 h7f98852_5 conda-forge
libgcc-ng 12.2.0 h65d4601_19 conda-forge
libgfortran-ng 12.2.0 h69a702a_19 conda-forge
libgfortran5 12.2.0 h337968e_19 conda-forge
libgomp 12.2.0 h65d4601_19 conda-forge
libiconv 1.17 h166bdaf_0 conda-forge
liblapack 3.9.0 16_linux64_openblas conda-forge
libnetcdf 4.8.1 nompi_h261ec11_106 conda-forge
libnghttp2 1.47.0 hff17c54_1 conda-forge
libnsl 2.0.0 h7f98852_0 conda-forge
libopenblas 0.3.21 pthreads_h78a6416_3 conda-forge
libsqlite 3.40.0 h753d276_0 conda-forge
libssh2 1.10.0 hf14f497_3 conda-forge
libstdcxx-ng 12.2.0 h46fd767_19 conda-forge
libuuid 2.32.1 h7f98852_1000 conda-forge
libxml2 2.10.3 h7463322_0 conda-forge
libzip 1.9.2 hc929e4a_1 conda-forge
libzlib 1.2.13 h166bdaf_4 conda-forge
ncurses 6.3 h27087fc_1 conda-forge
netcdf4 1.6.2 nompi_py311hc6fcf29_100 conda-forge
numpy 1.23.4 py311h7d28db0_1 conda-forge
openssl 3.0.7 h166bdaf_0 conda-forge
pip 22.3.1 pyhd8ed1ab_0 conda-forge
python 3.11.0 ha86cf86_0_cpython conda-forge
python_abi 3.11 2_cp311 conda-forge
readline 8.1.2 h0f457ee_0 conda-forge
setuptools 65.5.1 pyhd8ed1ab_0 conda-forge
tk 8.6.12 h27826a3_0 conda-forge
tzdata 2022f h191b570_0 conda-forge
wheel 0.38.4 pyhd8ed1ab_0 conda-forge
xz 5.2.6 h166bdaf_0 conda-forge
zlib 1.2.13 h166bdaf_4 conda-forge
```
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
🐛 NetCDF4 RuntimeWarning if xarray is imported before netCDF4 1437481995 | |
1320950377 | https://github.com/pydata/xarray/issues/7259#issuecomment-1320950377 | https://api.github.com/repos/pydata/xarray/issues/7259 | IC_kwDOAMm_X85OvBpp | hmaarrfk 90008 | 2022-11-19T19:14:04Z | 2022-11-19T19:14:04Z | CONTRIBUTOR |
`mamba list`

```
$ mamba list
# packages in environment at /home/mark/mambaforge/envs/xr:
#
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_gnu conda-forge
bzip2 1.0.8 h7f98852_4 conda-forge
c-ares 1.18.1 h7f98852_0 conda-forge
ca-certificates 2022.9.24 ha878542_0 conda-forge
cftime 1.6.2 py311h4c7f6c3_1 conda-forge
curl 7.86.0 h2283fc2_1 conda-forge
hdf4 4.2.15 h9772cbc_5 conda-forge
hdf5 1.12.2 nompi_h4df4325_100 conda-forge
icu 70.1 h27087fc_0 conda-forge
jpeg 9e h166bdaf_2 conda-forge
keyutils 1.6.1 h166bdaf_0 conda-forge
krb5 1.19.3 h08a2579_0 conda-forge
ld_impl_linux-64 2.39 hc81fddc_0 conda-forge
libblas 3.9.0 16_linux64_openblas conda-forge
libcblas 3.9.0 16_linux64_openblas conda-forge
libcurl 7.86.0 h2283fc2_1 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 h516909a_1 conda-forge
libffi 3.4.2 h7f98852_5 conda-forge
libgcc-ng 12.2.0 h65d4601_19 conda-forge
libgfortran-ng 12.2.0 h69a702a_19 conda-forge
libgfortran5 12.2.0 h337968e_19 conda-forge
libgomp 12.2.0 h65d4601_19 conda-forge
libiconv 1.17 h166bdaf_0 conda-forge
liblapack 3.9.0 16_linux64_openblas conda-forge
libnetcdf 4.8.1 nompi_h261ec11_106 conda-forge
libnghttp2 1.47.0 hff17c54_1 conda-forge
libnsl 2.0.0 h7f98852_0 conda-forge
libopenblas 0.3.21 pthreads_h78a6416_3 conda-forge
libsqlite 3.40.0 h753d276_0 conda-forge
libssh2 1.10.0 hf14f497_3 conda-forge
libstdcxx-ng 12.2.0 h46fd767_19 conda-forge
libuuid 2.32.1 h7f98852_1000 conda-forge
libxml2 2.10.3 h7463322_0 conda-forge
libzip 1.9.2 hc929e4a_1 conda-forge
libzlib 1.2.13 h166bdaf_4 conda-forge
ncurses 6.3 h27087fc_1 conda-forge
netcdf4 1.6.2 nompi_py311hc6fcf29_100 conda-forge
numpy 1.23.4 py311h7d28db0_1 conda-forge
openssl 3.0.7 h166bdaf_0 conda-forge
packaging 21.3 pyhd8ed1ab_0 conda-forge
pandas 1.5.1 py311h8b32b4d_1 conda-forge
pip 22.3.1 pyhd8ed1ab_0 conda-forge
pyparsing 3.0.9 pyhd8ed1ab_0 conda-forge
python 3.11.0 ha86cf86_0_cpython conda-forge
python-dateutil 2.8.2 pyhd8ed1ab_0 conda-forge
python_abi 3.11 2_cp311 conda-forge
pytz 2022.6 pyhd8ed1ab_0 conda-forge
readline 8.1.2 h0f457ee_0 conda-forge
setuptools 65.5.1 pyhd8ed1ab_0 conda-forge
six 1.16.0 pyh6c4a22f_0 conda-forge
tk 8.6.12 h27826a3_0 conda-forge
tzdata 2022f h191b570_0 conda-forge
wheel 0.38.4 pyhd8ed1ab_0 conda-forge
xarray 2022.11.0 pyhd8ed1ab_0 conda-forge
xz 5.2.6 h166bdaf_0 conda-forge
zlib 1.2.13 h166bdaf_4 conda-forge
```
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
🐛 NetCDF4 RuntimeWarning if xarray is imported before netCDF4 1437481995 | |
1320945794 | https://github.com/pydata/xarray/issues/7259#issuecomment-1320945794 | https://api.github.com/repos/pydata/xarray/issues/7259 | IC_kwDOAMm_X85OvAiC | hmaarrfk 90008 | 2022-11-19T18:51:01Z | 2022-11-19T18:51:01Z | CONTRIBUTOR | It is also reproducible on binder:
It seems that the binder uses conda-forge, which is why I'm commenting here. It is really strange in the sense that xarray doesn't compile anything. https://github.com/conda-forge/xarray-feedstock/blob/main/recipe/meta.yaml#L16 So it must be something that gets lazy-loaded that triggers things. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
🐛 NetCDF4 RuntimeWarning if xarray is imported before netCDF4 1437481995 | |
1306327743 | https://github.com/pydata/xarray/issues/2799#issuecomment-1306327743 | https://api.github.com/repos/pydata/xarray/issues/2799 | IC_kwDOAMm_X85N3Pq_ | hmaarrfk 90008 | 2022-11-07T22:45:07Z | 2022-11-07T22:45:07Z | CONTRIBUTOR | As I've been recently going down this performance rabbit hole, I think the discussion around https://github.com/pydata/xarray/issues/7045 is relevant and provides some additional historical context as to "why" this performance penalty might be happening. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Performance: numpy indexes small amounts of data 1000 faster than xarray 416962458 | |
1300527716 | https://github.com/pydata/xarray/issues/7245#issuecomment-1300527716 | https://api.github.com/repos/pydata/xarray/issues/7245 | IC_kwDOAMm_X85NhHpk | hmaarrfk 90008 | 2022-11-02T14:27:04Z | 2022-11-02T14:27:04Z | CONTRIBUTOR | While the above "fix" addresses the issues with renaming coordinates, I think there are plenty of use cases where we would still end up with strange or unexpected results. For example:
We could apply the "fix" to the

I think a more "generic", albeit "breaking", fix would be to remove the " |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
coordinates not removed for variable encoding during reset_coords 1432388736 | |
1299492524 | https://github.com/pydata/xarray/issues/7245#issuecomment-1299492524 | https://api.github.com/repos/pydata/xarray/issues/7245 | IC_kwDOAMm_X85NdK6s | hmaarrfk 90008 | 2022-11-02T02:49:58Z | 2022-11-02T02:57:37Z | CONTRIBUTOR | And if you want to have a clean encoding dictionary, you may want to do the following:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
coordinates not removed for variable encoding during reset_coords 1432388736 | |
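A minimal sketch of one way to get a clean encoding dictionary after loading, in the spirit of the comment above; the helper name strip_encoding is purely illustrative:

```python
import xarray as xr

def strip_encoding(ds: xr.Dataset) -> xr.Dataset:
    """Drop all on-disk encoding hints from a dataset and its variables."""
    ds.encoding = {}
    for var in ds.variables.values():
        var.encoding = {}
    return ds

ds = xr.Dataset({'a': ('x', [1, 2, 3])})
ds['a'].encoding = {'coordinates': 'y'}  # a stale hint of the kind discussed

ds = strip_encoding(ds)
assert ds['a'].encoding == {}
```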
1299369449 | https://github.com/pydata/xarray/issues/7239#issuecomment-1299369449 | https://api.github.com/repos/pydata/xarray/issues/7239 | IC_kwDOAMm_X85Ncs3p | hmaarrfk 90008 | 2022-11-01T23:54:07Z | 2022-11-01T23:54:07Z | CONTRIBUTOR | I think these are good alternatives. From my experiments (and I'm still trying to create a minimal reproducible example that shows the real problem behind the slowdowns), reindexing can be quite expensive. We used to have many coordinates (to ensure that critical metadata stays with data_variables) and those coordinates were causing slowdowns on reindexing operations. Thus the two calls

However, for this particular issue, I think that documenting the strategies proposed in the docstring is good enough. I have a feeling that if one can get to the bottom of 7224, the performance concerns here will be mitigated too. We can leave the performance discussion to: https://github.com/pydata/xarray/issues/7224 |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
include/exclude lists in Dataset.expand_dims 1429172192 | |
1296269381 | https://github.com/pydata/xarray/pull/7238#issuecomment-1296269381 | https://api.github.com/repos/pydata/xarray/issues/7238 | IC_kwDOAMm_X85NQ4BF | hmaarrfk 90008 | 2022-10-30T14:10:23Z | 2022-10-30T14:10:23Z | CONTRIBUTOR |
Right. Thank you for finding that example. I was going to try to construct one. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Improve non-nanosecond warning 1428748922 | |
1296006560 | https://github.com/pydata/xarray/issues/7224#issuecomment-1296006560 | https://api.github.com/repos/pydata/xarray/issues/7224 | IC_kwDOAMm_X85NP32g | hmaarrfk 90008 | 2022-10-29T22:39:39Z | 2022-10-29T22:39:39Z | CONTRIBUTOR | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Insertion speed of new dataset elements 1423948375 | ||
1296006402 | https://github.com/pydata/xarray/issues/7224#issuecomment-1296006402 | https://api.github.com/repos/pydata/xarray/issues/7224 | IC_kwDOAMm_X85NP30C | hmaarrfk 90008 | 2022-10-29T22:39:01Z | 2022-10-29T22:39:01Z | CONTRIBUTOR | Ok, I don't think I have the right tools to really get to the bottom of this. The Spyder profiler just seems to slow down code too much. Any other tools to recommend? (one standard-library option is sketched below) |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Insertion speed of new dataset elements 1423948375 | |
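Not what the thread settled on, but for reference, the standard-library profiler can answer this kind of question; build_dataset and the file name below are made up for illustration:

```python
import cProfile
import pstats

import xarray as xr

def build_dataset(n=100):
    # A stand-in for the insertion pattern being benchmarked.
    ds = xr.Dataset()
    for i in range(n):
        ds[f"var{i}"] = i
    return ds

cProfile.run("build_dataset()", "insert.prof")
stats = pstats.Stats("insert.prof")
stats.sort_stats("cumulative").print_stats(10)  # ten hottest entries
```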
1295999237 | https://github.com/pydata/xarray/pull/7236#issuecomment-1295999237 | https://api.github.com/repos/pydata/xarray/issues/7236 | IC_kwDOAMm_X85NP2EF | hmaarrfk 90008 | 2022-10-29T22:11:33Z | 2022-10-29T22:11:33Z | CONTRIBUTOR | Well now the benchmarks look like they make more sense:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Expand benchmarks for dataset insertion and creation 1428274982 | |
1295937569 | https://github.com/pydata/xarray/pull/7236#issuecomment-1295937569 | https://api.github.com/repos/pydata/xarray/issues/7236 | IC_kwDOAMm_X85NPnAh | hmaarrfk 90008 | 2022-10-29T18:58:35Z | 2022-10-29T18:58:35Z | CONTRIBUTOR |
As you thought, the numbers improve quite a bit. I kinda want to understand why a no-op takes 1 ms! ^_^ |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Expand benchmarks for dataset insertion and creation 1428274982 | |
1295937364 | https://github.com/pydata/xarray/pull/7236#issuecomment-1295937364 | https://api.github.com/repos/pydata/xarray/issues/7236 | IC_kwDOAMm_X85NPm9U | hmaarrfk 90008 | 2022-10-29T18:57:54Z | 2022-10-29T18:57:54Z | CONTRIBUTOR | What about just specifying "dims"? |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Expand benchmarks for dataset insertion and creation 1428274982 | |
1295905591 | https://github.com/pydata/xarray/pull/7236#issuecomment-1295905591 | https://api.github.com/repos/pydata/xarray/issues/7236 | IC_kwDOAMm_X85NPfM3 | hmaarrfk 90008 | 2022-10-29T17:11:30Z | 2022-10-29T17:11:30Z | CONTRIBUTOR | With the right window size it looks like:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Expand benchmarks for dataset insertion and creation 1428274982 | |
1295852860 | https://github.com/pydata/xarray/pull/7236#issuecomment-1295852860 | https://api.github.com/repos/pydata/xarray/issues/7236 | IC_kwDOAMm_X85NPSU8 | hmaarrfk 90008 | 2022-10-29T14:28:25Z | 2022-10-29T14:28:25Z | CONTRIBUTOR | On the CI, it reports similar findings:
```
[ 67.73%] ··· ...dVariable.time_dict_of_dataarrays_to_dataset            ok
[ 67.73%] ··· =================== =============
               existing_elements
[ 67.88%] ··· ...etAddVariable.time_dict_of_tuples_to_dataset            ok
[ 67.88%] ··· =================== ===========
               existing_elements
[ 68.02%] ··· ...ddVariable.time_dict_of_variables_to_dataset            ok
[ 68.02%] ··· =================== =============
               existing_elements
[ 68.17%] ··· ...e.DatasetAddVariable.time_merge_two_datasets            ok
[ 68.17%] ··· =================== =============
               existing_elements
[ 68.31%] ··· ...e.DatasetAddVariable.time_variable_insertion            ok
[ 68.31%] ··· =================== =============
               existing_elements
```
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Expand benchmarks for dataset insertion and creation 1428274982 | |
1295843798 | https://github.com/pydata/xarray/pull/7236#issuecomment-1295843798 | https://api.github.com/repos/pydata/xarray/issues/7236 | IC_kwDOAMm_X85NPQHW | hmaarrfk 90008 | 2022-10-29T13:55:33Z | 2022-10-29T13:55:33Z | CONTRIBUTOR |
```
$ asv run -E existing --quick --bench merge
· Discovering benchmarks
· Running 5 total benchmarks (1 commits * 1 environments * 5 benchmarks)
[  0.00%] ·· Benchmarking existing-py_home_mark_mambaforge_envs_mcam_dev_bin_python
[ 10.00%] ··· merge.DatasetAddVariable.time_dict_of_dataarrays_to_dataset   ok
[ 10.00%] ··· =================== ==========
               existing_elements
              ------------------- ----------
                        0          762±0μs
                       10          7.18±0ms
                       100         12.6±0ms
                       1000        89.1±0ms
              =================== ==========
[ 20.00%] ··· merge.DatasetAddVariable.time_dict_of_tuples_to_dataset       ok
[ 20.00%] ··· =================== ==========
               existing_elements
              ------------------- ----------
                        0          889±0μs
                       10          2.01±0ms
                       100         1.34±0ms
                       1000        605±0μs
              =================== ==========
[ 30.00%] ··· merge.DatasetAddVariable.time_dict_of_variables_to_dataset    ok
[ 30.00%] ··· =================== ==========
               existing_elements
              ------------------- ----------
                        0          2.48±0ms
                       10          2.06±0ms
                       100         2.13±0ms
                       1000        2.38±0ms
              =================== ==========
[ 40.00%] ··· merge.DatasetAddVariable.time_merge_two_datasets              ok
[ 40.00%] ··· =================== ==========
               existing_elements
              ------------------- ----------
                        0          814±0μs
                       10          945±0μs
                       100         2.42±0ms
                       1000        5.23±0ms
              =================== ==========
[ 50.00%] ··· merge.DatasetAddVariable.time_variable_insertion              ok
[ 50.00%] ··· =================== ==========
               existing_elements
              ------------------- ----------
                        0          1.10±0ms
                       10          954±0μs
                       100         1.88±0ms
                       1000        5.29±0ms
              =================== ==========
```
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Expand benchmarks for dataset insertion and creation 1428274982 | |
1295257627 | https://github.com/pydata/xarray/pull/7179#issuecomment-1295257627 | https://api.github.com/repos/pydata/xarray/issues/7179 | IC_kwDOAMm_X85NNBAb | hmaarrfk 90008 | 2022-10-28T17:21:40Z | 2022-10-28T17:21:40Z | CONTRIBUTOR | Exciting improvements on usability for the next version! |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Lazy Imports 1412019155 | |
1293689561 | https://github.com/pydata/xarray/pull/7222#issuecomment-1293689561 | https://api.github.com/repos/pydata/xarray/issues/7222 | IC_kwDOAMm_X85NHCLZ | hmaarrfk 90008 | 2022-10-27T15:15:45Z | 2022-10-27T15:15:45Z | CONTRIBUTOR |
Agreed. I'll take the small wins where I can :D. Great! I think this will be a good addition with: https://github.com/pydata/xarray/pull/7223#discussion_r1007023769 |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Actually make the fast code path return early for Aligner.align 1423321834 | |
1292299499 | https://github.com/pydata/xarray/pull/7223#issuecomment-1292299499 | https://api.github.com/repos/pydata/xarray/issues/7223 | IC_kwDOAMm_X85NBuzr | hmaarrfk 90008 | 2022-10-26T16:24:56Z | 2022-10-26T16:24:56Z | CONTRIBUTOR | Ok, naming is always hard. I tried to pick a good name. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Dataset insertion benchmark 1423916687 | |
1291948502 | https://github.com/pydata/xarray/pull/7221#issuecomment-1291948502 | https://api.github.com/repos/pydata/xarray/issues/7221 | IC_kwDOAMm_X85NAZHW | hmaarrfk 90008 | 2022-10-26T12:19:49Z | 2022-10-26T12:23:46Z | CONTRIBUTOR | I know it is not comparable, but I was really curious what "dictionary insertion" costs, in order to be able to understand if my comparisons were fair:

```python
from time import perf_counter

import numpy as np
import xarray as xr
from tqdm import tqdm

N = 1000

# Everybody is lazy loading now, so let's force modules to get instantiated
dummy_dataset = xr.Dataset()
dummy_dataset['a'] = 1
dummy_dataset['b'] = 1
del dummy_dataset

time_elapsed = np.zeros(N)

# dataset = xr.Dataset()
dataset = {}

for i in tqdm(range(N)):
# for i in range(N):
    time_start = perf_counter()
    dataset[f"var{i}"] = i
    time_end = perf_counter()
    time_elapsed[i] = time_end - time_start

# %%
from matplotlib import pyplot as plt

plt.plot(np.arange(N), time_elapsed * 1E6, label='Time to add one variable')
plt.xlabel("Number of existing variables")
plt.ylabel("Time to add a variable (us)")
plt.ylim([0, 10])
plt.title("Dictionary insertion")
plt.grid(True)
```

I think xarray gives me 3 orders of magnitude of "thinking" benefit, so I'll take it! |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Remove debugging slow assert statement 1423312198 | |
1291894024 | https://github.com/pydata/xarray/pull/7221#issuecomment-1291894024 | https://api.github.com/repos/pydata/xarray/issues/7221 | IC_kwDOAMm_X85NAL0I | hmaarrfk 90008 | 2022-10-26T11:32:32Z | 2022-10-26T11:32:32Z | CONTRIBUTOR | Ok. I'll want to rethink them. I know it looks like quadratic time, but I really would like to test n=1000 and I have an idea. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Remove debugging slow assert statement 1423312198 | |
1291450556 | https://github.com/pydata/xarray/pull/7221#issuecomment-1291450556 | https://api.github.com/repos/pydata/xarray/issues/7221 | IC_kwDOAMm_X85M-fi8 | hmaarrfk 90008 | 2022-10-26T03:32:53Z | 2022-10-26T03:32:53Z | CONTRIBUTOR | I'm somewhat confused; I can run the benchmark locally:
```
[ 1.80%] ··· dataset_creation.Creation.time_dataset_creation    4.37±0s
```
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Remove debugging slow assert statement 1423312198 | |
1291447746 | https://github.com/pydata/xarray/pull/7221#issuecomment-1291447746 | https://api.github.com/repos/pydata/xarray/issues/7221 | IC_kwDOAMm_X85M-e3C | hmaarrfk 90008 | 2022-10-26T03:27:36Z | 2022-10-26T03:27:36Z | CONTRIBUTOR | :/ Not fun, the benchmark is failing. Not sure why. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Remove debugging slow assert statement 1423312198 | |
1291405225 | https://github.com/pydata/xarray/pull/7222#issuecomment-1291405225 | https://api.github.com/repos/pydata/xarray/issues/7222 | IC_kwDOAMm_X85M-Uep | hmaarrfk 90008 | 2022-10-26T02:19:23Z | 2022-10-26T02:19:23Z | CONTRIBUTOR | I think the rapid return helps by about 40%, which is still pretty good. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Actually make the fast code path return early for Aligner.align 1423321834 | |
1291402576 | https://github.com/pydata/xarray/pull/7222#issuecomment-1291402576 | https://api.github.com/repos/pydata/xarray/issues/7222 | IC_kwDOAMm_X85M-T1Q | hmaarrfk 90008 | 2022-10-26T02:17:45Z | 2022-10-26T02:17:45Z | CONTRIBUTOR | Hmm, ok. It seems I can't blatantly avoid the copy like that. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Actually make the fast code path return early for Aligner.align 1423321834 | |
1291399714 | https://github.com/pydata/xarray/pull/7221#issuecomment-1291399714 | https://api.github.com/repos/pydata/xarray/issues/7221 | IC_kwDOAMm_X85M-TIi | hmaarrfk 90008 | 2022-10-26T02:14:40Z | 2022-10-26T02:14:40Z | CONTRIBUTOR |
I wasn't able to find something that really benchmarked "large" datasets.
Added one. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Remove debugging slow assert statement 1423312198 | |
1291391647 | https://github.com/pydata/xarray/pull/7222#issuecomment-1291391647 | https://api.github.com/repos/pydata/xarray/issues/7222 | IC_kwDOAMm_X85M-RKf | hmaarrfk 90008 | 2022-10-26T02:03:41Z | 2022-10-26T02:03:41Z | CONTRIBUTOR | The reason this is a separate merge request is that I agree that this is more contentious as a change. However, I will argue that, using ripgrep, you find that the only instances of Aligner exist internally:

```
xarray/core/alignment.py
107:class Aligner(Generic[DataAlignable]):
114:    aligner = Aligner(objects, *kwargs)   <------- Example
767:    aligner = Aligner(                    <----------- Used and consumed for the method

xarray/core/dataarray.py
1752:    aligner: alignment.Aligner,
1760:    """Callback called from
```
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Actually make the fast code path return early for Aligner.align 1423321834 | |
1291389702 | https://github.com/pydata/xarray/pull/7221#issuecomment-1291389702 | https://api.github.com/repos/pydata/xarray/issues/7221 | IC_kwDOAMm_X85M-QsG | hmaarrfk 90008 | 2022-10-26T01:59:57Z | 2022-10-26T01:59:57Z | CONTRIBUTOR |
Spyder profiler |
{ "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Remove debugging slow assert statement 1423312198 | |
1281117607 | https://github.com/pydata/xarray/pull/7172#issuecomment-1281117607 | https://api.github.com/repos/pydata/xarray/issues/7172 | IC_kwDOAMm_X85MXE2n | hmaarrfk 90008 | 2022-10-17T16:11:37Z | 2022-10-17T16:11:37Z | CONTRIBUTOR | Thank you all for taking the time to study and worry about these improvements. Now I have to figure out how my software went from 2 sec loading time to 12 ;) Totally unrelated to this. But one day I'll have benchmarking in place to monitor it :D. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Lazy import dask.distributed to reduce import time of xarray 1410575877 | |
1280208522 | https://github.com/pydata/xarray/pull/7172#issuecomment-1280208522 | https://api.github.com/repos/pydata/xarray/issues/7172 | IC_kwDOAMm_X85MTm6K | hmaarrfk 90008 | 2022-10-17T02:59:41Z | 2022-10-17T02:59:41Z | CONTRIBUTOR |
At this point, removing testing and tutorial would be strange and would break things. Stefan, in the discussion linked above, speaks about the reasoning behind importing submodules in the top-level namespace. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Lazy import dask.distributed to reduce import time of xarray 1410575877 | |
1280072309 | https://github.com/pydata/xarray/issues/6726#issuecomment-1280072309 | https://api.github.com/repos/pydata/xarray/issues/6726 | IC_kwDOAMm_X85MTFp1 | hmaarrfk 90008 | 2022-10-16T22:33:17Z | 2022-10-16T22:33:17Z | CONTRIBUTOR | In developing https://github.com/pydata/xarray/pull/7172, there are also some places where class types are used to check for features: https://github.com/pydata/xarray/blob/main/xarray/core/pycompat.py#L35 Dask and sparse are big contributors due to their need to resolve the class name in question. Ultimately, I think it is important to maybe constrain the problem. Are we ok with 100 ms over numpy + pandas? 20 ms? On my machines, the 0.5 s that xarray is close to seems long... but every time I look at it, it seems to "just be a python problem" (one way to measure this is sketched below). |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Long import time 1284475176 | |
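One way to constrain the problem is to measure it; CPython's -X importtime flag reports per-module import cost on stderr. A sketch (the parsing is just one way to surface the slowest imports):

```python
import subprocess
import sys

# -X importtime must be passed on the interpreter command line, so run a
# child interpreter and collect its stderr.
result = subprocess.run(
    [sys.executable, "-X", "importtime", "-c", "import xarray"],
    capture_output=True,
    text=True,
)

rows = []
for line in result.stderr.splitlines():
    parts = line.split("|")
    if len(parts) == 3:
        try:
            rows.append((int(parts[1]), parts[2].strip()))
        except ValueError:
            pass  # skip the non-numeric header line

rows.sort(reverse=True)
for cumulative_us, module in rows[:5]:
    print(f"{cumulative_us:>10} us  {module}")
```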
1185946473 | https://github.com/pydata/xarray/issues/6791#issuecomment-1185946473 | https://api.github.com/repos/pydata/xarray/issues/6791 | IC_kwDOAMm_X85GsBtp | hmaarrfk 90008 | 2022-07-15T21:11:19Z | 2022-09-12T22:48:50Z | CONTRIBUTOR | I guess the code:

```python
import numpy as np
import xarray as xr

dataset = xr.Dataset()
my_variable = np.asarray(dataset.get('my_variable', np.asarray(1.0)))
```

coerces things to an array. Talking things out made me find this one. Though it doesn't read very well. Feel free to close. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
get_data or get_varibale method 1306457778 | |
1213089164 | https://github.com/pydata/xarray/pull/6910#issuecomment-1213089164 | https://api.github.com/repos/pydata/xarray/issues/6910 | IC_kwDOAMm_X85ITkWM | hmaarrfk 90008 | 2022-08-12T13:04:19Z | 2022-08-12T13:04:19Z | CONTRIBUTOR | Are the functions you are considering using this on ones that never had keyword arguments before? When I wrote a similar decorator before, I had an explicit list of arguments that were allowed to be converted. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
decorator to deprecate positional arguments 1337166287 | |
1213031961 | https://github.com/pydata/xarray/issues/5531#issuecomment-1213031961 | https://api.github.com/repos/pydata/xarray/issues/5531 | IC_kwDOAMm_X85ITWYZ | hmaarrfk 90008 | 2022-08-12T11:53:07Z | 2022-08-12T11:53:07Z | CONTRIBUTOR | These decorators are kinda fun to write and are quite tailored to a certain release philosophy. It might be warranted to just write your own ;) (a sketch of one follows below) |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Keyword only args for arguments like "drop" 929840699 | |
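A minimal sketch of such a decorator, in the spirit of what the comment describes; deprecate_positional and reset_index are illustrative names, not xarray's actual implementation:

```python
import functools
import warnings

def deprecate_positional(allowed_positional):
    """Warn when more than `allowed_positional` arguments arrive positionally."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if len(args) > allowed_positional:
                warnings.warn(
                    f"passing more than {allowed_positional} positional "
                    f"argument(s) to {func.__name__} is deprecated; "
                    "use keyword arguments instead",
                    FutureWarning,
                    stacklevel=2,
                )
            return func(*args, **kwargs)
        return wrapper
    return decorator

@deprecate_positional(allowed_positional=1)
def reset_index(obj, drop=False):
    return obj, drop

reset_index('data', drop=True)  # fine: `drop` passed by keyword
reset_index('data', True)       # emits a FutureWarning
```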
1186019342 | https://github.com/pydata/xarray/issues/6791#issuecomment-1186019342 | https://api.github.com/repos/pydata/xarray/issues/6791 | IC_kwDOAMm_X85GsTgO | hmaarrfk 90008 | 2022-07-15T23:23:30Z | 2022-07-15T23:23:30Z | CONTRIBUTOR | Interesting. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
get_data or get_varibale method 1306457778 | |
1102962705 | https://github.com/pydata/xarray/issues/5531#issuecomment-1102962705 | https://api.github.com/repos/pydata/xarray/issues/5531 | IC_kwDOAMm_X85BveAR | hmaarrfk 90008 | 2022-04-19T18:34:07Z | 2022-04-19T18:34:07Z | CONTRIBUTOR | I think in my README I suggest vendoring the code. Happy to give you a license for it so you don't need to credit me in addition to your own license. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Keyword only args for arguments like "drop" 929840699 | |
1094086518 | https://github.com/pydata/xarray/issues/6309#issuecomment-1094086518 | https://api.github.com/repos/pydata/xarray/issues/6309 | IC_kwDOAMm_X85BNm92 | hmaarrfk 90008 | 2022-04-09T17:06:13Z | 2022-04-09T17:06:13Z | CONTRIBUTOR | @max-sixty unfortunately, I think the way hdf5 is designed, it doesn't try to be too smart about what would be the best fine-tuning for your particular system. In some ways, this is the correct approach. The current constructor pathway: https://github.com/pydata/xarray/blob/main/xarray/backends/h5netcdf_.py#L164 doesn't provide a user with a catch-all kwargs. I think this would be an acceptable solution (an illustration follows below). I should say that the performance of the direct driver is terrible without aligned data: https://github.com/Unidata/netcdf-c/pull/2206#issuecomment-1054855769 |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Read/Write performance optimizations for netcdf files 1152047670 | |
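For context on the kind of low-level tuning being asked for: h5py itself exposes driver selection directly, and a catch-all kwargs on the backend could forward options like these. Illustrative only; the file name is arbitrary:

```python
import h5py

# Select the in-memory 'core' driver and skip the backing file entirely.
with h5py.File('scratch.h5', 'w', driver='core', backing_store=False) as f:
    f.create_dataset('x', data=list(range(10)))
```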
1052386013 | https://github.com/pydata/xarray/issues/6309#issuecomment-1052386013 | https://api.github.com/repos/pydata/xarray/issues/6309 | IC_kwDOAMm_X84-uiLd | hmaarrfk 90008 | 2022-02-26T17:57:33Z | 2022-02-26T17:57:33Z | CONTRIBUTOR | I have to elaborate that this may be even more important for users that READ the data back a lot. Reading with the standard xarray operands hits other limits, but one limit that it definitely hits is that of the HDF5 driver used. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Read/Write performance optimizations for netcdf files 1152047670 | |
1009823872 | https://github.com/pydata/xarray/pull/6154#issuecomment-1009823872 | https://api.github.com/repos/pydata/xarray/issues/6154 | IC_kwDOAMm_X848MLCA | hmaarrfk 90008 | 2022-01-11T10:28:51Z | 2022-01-11T10:28:51Z | CONTRIBUTOR | Thanks for merging so quickly |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Use base ImportError not MoudleNotFoundError when testing for plugins 1098924491 | |
1009820092 | https://github.com/pydata/xarray/issues/6153#issuecomment-1009820092 | https://api.github.com/repos/pydata/xarray/issues/6153 | IC_kwDOAMm_X848MKG8 | hmaarrfk 90008 | 2022-01-11T10:24:37Z | 2022-01-11T10:24:37Z | CONTRIBUTOR | Thank you @kmuehlbauer for the explicit PR link. I do plan on adding alignment features to h5py then to bring it toward h5netcdf. So I think something like this will be useful in the future. Feature request link: https://github.com/h5py/h5py/issues/2034 |
{ "total_count": 2, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 1 } |
[FEATURE]: to_netcdf and additional keyword arguments 1098915891 | |
1009802137 | https://github.com/pydata/xarray/pull/6154#issuecomment-1009802137 | https://api.github.com/repos/pydata/xarray/issues/6154 | IC_kwDOAMm_X848MFuZ | hmaarrfk 90008 | 2022-01-11T10:14:09Z | 2022-01-11T10:14:09Z | CONTRIBUTOR | ModuleNotFoundError is a subclass of ImportError. https://github.com/python/cpython/blob/f4c03484da59049eb62a9bf7777b963e2267d187/Lib/test/exception_hierarchy.txt#L19 So it depends what question you care about asking:
I think question 2 is friendlier to xarray users. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Use base ImportError not MoudleNotFoundError when testing for plugins 1098924491 | |
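The hierarchy claim is easy to check; some_plugin below is a placeholder module name, not a real dependency:

```python
# ModuleNotFoundError is a subclass of ImportError, so catching ImportError
# also covers plugins that are installed but fail while importing one of
# their own dependencies.
assert issubclass(ModuleNotFoundError, ImportError)

try:
    import some_plugin  # placeholder for any optional backend module
except ImportError:     # catches "not installed" and "broken install" alike
    some_plugin = None
```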
1008227895 | https://github.com/pydata/xarray/issues/2347#issuecomment-1008227895 | https://api.github.com/repos/pydata/xarray/issues/2347 | IC_kwDOAMm_X848GFY3 | hmaarrfk 90008 | 2022-01-09T04:28:49Z | 2022-01-09T04:28:49Z | CONTRIBUTOR | This is likely true. Thanks for looking back into this. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Serialization of just coordinates 347962055 | |
786813358 | https://github.com/pydata/xarray/issues/2799#issuecomment-786813358 | https://api.github.com/repos/pydata/xarray/issues/2799 | MDEyOklzc3VlQ29tbWVudDc4NjgxMzM1OA== | hmaarrfk 90008 | 2021-02-26T18:19:28Z | 2021-02-26T18:19:28Z | CONTRIBUTOR | I hope the following can help users that struggle with the speed of xarray: I've found that when doing numerical computation, I often use xarray to grab all the metadata relevant to my computation: scale, chromaticity, experimental information. Eventually, I create a function that acts as a barrier (a sketch of this pattern follows below):
- Xarray input (high-level experimental data)
- Computation parameters output (low-level, implementation-relevant information)

The low-level implementation can operate on the fast numpy arrays. I've found this to be the struggle with creating high-level APIs that do things like sanitize inputs (xarray routines like

For the example that @nbren12 brought up originally, it might be better to create xarray routines (if they don't exist already) that can create fast iterators for the underlying numpy arrays given a set of dimensions that the user cares about. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Performance: numpy indexes small amounts of data 1000 faster than xarray 416962458 | |
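A minimal sketch of the "barrier" pattern described above; compute_parameters and fast_kernel are illustrative names, not an xarray API:

```python
import numpy as np
import xarray as xr

def compute_parameters(da: xr.DataArray):
    """xarray side: read high-level metadata once, emit low-level inputs."""
    scale = float(da.attrs.get('scale', 1.0))
    return da.values, scale

def fast_kernel(values: np.ndarray, scale: float) -> np.ndarray:
    """numpy side: no xarray objects cross this boundary."""
    return values * scale

da = xr.DataArray(np.arange(5.0), attrs={'scale': 2.0})
result = fast_kernel(*compute_parameters(da))
assert result[-1] == 8.0
```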
735759416 | https://github.com/pydata/xarray/pull/4400#issuecomment-735759416 | https://api.github.com/repos/pydata/xarray/issues/4400 | MDEyOklzc3VlQ29tbWVudDczNTc1OTQxNg== | hmaarrfk 90008 | 2020-11-30T12:33:33Z | 2020-11-30T12:33:33Z | CONTRIBUTOR | I think you should be able to define your own custom encoder if you want it to be a datetime. But inevitably, you will have to define your own save and load functions. Python, by definition of being such a loose language, allows you to do things that the original developers never really imagined. This can sometimes lead to silent corruption, like the one you've experienced. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
[WIP] Support nano second time encoding. 690546795 | |
735428830 | https://github.com/pydata/xarray/issues/1672#issuecomment-735428830 | https://api.github.com/repos/pydata/xarray/issues/1672 | MDEyOklzc3VlQ29tbWVudDczNTQyODgzMA== | hmaarrfk 90008 | 2020-11-29T17:34:44Z | 2020-11-29T17:35:04Z | CONTRIBUTOR | It isn't really part of any library. I don't really have plans of making it into a public library. I think the discussion is really around the xarray API, and what functions to implement at first. Then somebody can take the code and integrate it into the decided upon API. |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Append along an unlimited dimension to an existing netCDF file 269700511 | |
735428578 | https://github.com/pydata/xarray/pull/4400#issuecomment-735428578 | https://api.github.com/repos/pydata/xarray/issues/4400 | MDEyOklzc3VlQ29tbWVudDczNTQyODU3OA== | hmaarrfk 90008 | 2020-11-29T17:32:37Z | 2020-11-29T17:32:37Z | CONTRIBUTOR | Yeah, I'm not too sure. I think the idea is that this breaks compatibility with netcdf times, so the resulting file is thus not standard. For my application, µs timing is enough. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
[WIP] Support nano second time encoding. 690546795 | |
685222909 | https://github.com/pydata/xarray/issues/1672#issuecomment-685222909 | https://api.github.com/repos/pydata/xarray/issues/1672 | MDEyOklzc3VlQ29tbWVudDY4NTIyMjkwOQ== | hmaarrfk 90008 | 2020-09-02T01:17:05Z | 2020-09-02T01:17:05Z | CONTRIBUTOR | Small prototype, but maybe it can help boost the development.
```python
import netCDF4
import xarray as xr


def _expand_variable(nc_variable, data, expanding_dim, nc_shape, added_size):
    # For time deltas, we must ensure that we use the same encoding as
    # what was previously stored.
    # We likely need to do this as well for variables that had custom
    # encodings too
    if hasattr(nc_variable, 'calendar'):
        data.encoding = {
            'units': nc_variable.units,
            'calendar': nc_variable.calendar,
        }
    data_encoded = xr.conventions.encode_cf_variable(data)  # , name=name)
    left_slices = data.dims.index(expanding_dim)
    right_slices = data.ndim - left_slices - 1
    nc_slice = (
        (slice(None),) * left_slices
        + (slice(nc_shape, nc_shape + added_size),)
        + (slice(None),) * right_slices
    )
    nc_variable[nc_slice] = data_encoded.data


def append_to_netcdf(filename, ds_to_append, unlimited_dims):
    if isinstance(unlimited_dims, str):
        unlimited_dims = [unlimited_dims]

    if len(unlimited_dims) != 1:
        # TODO: change this so it can support multiple expanding dims
        raise ValueError(
            "We only support one unlimited dim for now, "
            f"got {len(unlimited_dims)}.")

    unlimited_dims = list(set(unlimited_dims))
    expanding_dim = unlimited_dims[0]

    with netCDF4.Dataset(filename, mode='a') as nc:
        nc_dims = set(nc.dimensions.keys())

        nc_coord = nc[expanding_dim]
        nc_shape = len(nc_coord)
        added_size = len(ds_to_append[expanding_dim])
        variables, attrs = xr.conventions.encode_dataset_coordinates(ds_to_append)

        for name, data in variables.items():
            if expanding_dim not in data.dims:
                # Nothing to do, data assumed to be identical
                continue
            nc_variable = nc[name]
            _expand_variable(nc_variable, data, expanding_dim, nc_shape, added_size)


from xarray.tests.test_dataset import create_append_test_data
from xarray.testing import assert_equal

ds, ds_to_append, ds_with_new_var = create_append_test_data()

filename = 'test_dataset.nc'
ds.to_netcdf(filename, mode='w', unlimited_dims=['time'])
append_to_netcdf('test_dataset.nc', ds_to_append, unlimited_dims='time')

loaded = xr.load_dataset('test_dataset.nc')
assert_equal(xr.concat([ds, ds_to_append], dim="time"), loaded)
```
|
{ "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Append along an unlimited dimension to an existing netCDF file 269700511 | |
685200043 | https://github.com/pydata/xarray/issues/4183#issuecomment-685200043 | https://api.github.com/repos/pydata/xarray/issues/4183 | MDEyOklzc3VlQ29tbWVudDY4NTIwMDA0Mw== | hmaarrfk 90008 | 2020-09-02T00:13:30Z | 2020-09-02T00:13:30Z | CONTRIBUTOR | I ran into this problem trying to round-trip time to the nanosecond (even though I don't need it, sub-microsecond would be nice), but unfortunately you run into the fact that cftime doesn't support nanoseconds: https://github.com/Unidata/cftime/blob/master/cftime/_cftime.pyx Seems like they discussed a nanosecond issue a while back too: https://github.com/Unidata/cftime/issues/77 Their ultimate point was that there was little point in having precision down to the nanosecond given that python datetime objects only have microseconds. I guess they are right. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Unable to decode a date in nanoseconds 646038170 | |
684833575 | https://github.com/pydata/xarray/issues/1672#issuecomment-684833575 | https://api.github.com/repos/pydata/xarray/issues/1672 | MDEyOklzc3VlQ29tbWVudDY4NDgzMzU3NQ== | hmaarrfk 90008 | 2020-09-01T12:58:52Z | 2020-09-01T12:58:52Z | CONTRIBUTOR | I think I got a basic prototype working. That said, I think a real challenge lies in supporting the numerous backends and lazy arrays. For example, I was only able to add data in peculiar fashions using the netCDF4 library, which may trigger complex computations many times. Is this a use case that we must optimize for now? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Append along an unlimited dimension to an existing netCDF file 269700511 | |
684064522 | https://github.com/pydata/xarray/pull/4395#issuecomment-684064522 | https://api.github.com/repos/pydata/xarray/issues/4395 | MDEyOklzc3VlQ29tbWVudDY4NDA2NDUyMg== | hmaarrfk 90008 | 2020-08-31T21:59:28Z | 2020-08-31T21:59:28Z | CONTRIBUTOR | I'm not too sure about this anymore. With the way the test is written now, it is unclear to me if the store should be closed afterward. I'm also unsure of how to deal with the case where the user passed it a ZipStore instead of a string. Will have to keep thinking. |
{ "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
WIP: Ensure that zarr.ZipStores are closed 689502005 | |
680060278 | https://github.com/pydata/xarray/issues/2803#issuecomment-680060278 | https://api.github.com/repos/pydata/xarray/issues/2803 | MDEyOklzc3VlQ29tbWVudDY4MDA2MDI3OA== | hmaarrfk 90008 | 2020-08-25T14:29:18Z | 2020-08-25T14:29:18Z | CONTRIBUTOR | Sorry for the noise. It seems that 1D arrays are still supported. I still had a 2D array lingering in my codebase. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Test failure with TestValidateAttrs.test_validating_attrs 417542619 | |
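The resolution above matches the netCDF data model, where attribute values may be scalars or 1-D vectors but not matrices. A quick sketch of the distinction (my own; the exact exception type depends on the xarray/netCDF4 versions):
```python
import numpy as np
import xarray as xr

ds = xr.Dataset(attrs={'vec': np.arange(3)})       # 1-D attr: allowed
ds.to_netcdf('attrs_1d.nc')                        # round-trips fine

ds2 = xr.Dataset(attrs={'mat': np.zeros((2, 2))})  # 2-D attr: not valid netCDF
try:
    ds2.to_netcdf('attrs_2d.nc')
except Exception as err:
    print(type(err).__name__, err)
```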
679399158 | https://github.com/pydata/xarray/issues/2803#issuecomment-679399158 | https://api.github.com/repos/pydata/xarray/issues/2803 | MDEyOklzc3VlQ29tbWVudDY3OTM5OTE1OA== | hmaarrfk 90008 | 2020-08-24T22:31:09Z | 2020-08-24T22:31:09Z | CONTRIBUTOR | With the netcdf4 backend, I'm not able to save a dataset with a 1D attr. I can save my dataset with the h5netcdf backend. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Test failure with TestValidateAttrs.test_validating_attrs 417542619 | |
679348131 | https://github.com/pydata/xarray/issues/2803#issuecomment-679348131 | https://api.github.com/repos/pydata/xarray/issues/2803 | MDEyOklzc3VlQ29tbWVudDY3OTM0ODEzMQ== | hmaarrfk 90008 | 2020-08-24T20:26:49Z | 2020-08-24T20:26:49Z | CONTRIBUTOR | Sorry for posting on such an old thread. Are |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Test failure with TestValidateAttrs.test_validating_attrs 417542619 | |
604220931 | https://github.com/pydata/xarray/pull/3888#issuecomment-604220931 | https://api.github.com/repos/pydata/xarray/issues/3888 | MDEyOklzc3VlQ29tbWVudDYwNDIyMDkzMQ== | hmaarrfk 90008 | 2020-03-26T04:23:05Z | 2020-03-26T04:23:05Z | CONTRIBUTOR | xfail just gets forgotten, so I'll leave it for now. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
[WIP] [DEMO] Add tests for ZipStore for zarr 587398134 | |
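On the "xfail just gets forgotten" point: pytest's strict mode is the usual mitigation, since an unexpectedly passing test then fails the suite. A generic sketch (my own, not the xarray test):
```python
import pytest

@pytest.mark.xfail(reason="zarr ZipStore bug", strict=True)
def test_zipstore_roundtrip():
    # With strict=True, this test failing is expected, but it passing
    # (e.g. after the upstream fix) breaks the suite and forces cleanup.
    assert False
```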
604181264 | https://github.com/pydata/xarray/issues/3815#issuecomment-604181264 | https://api.github.com/repos/pydata/xarray/issues/3815 | MDEyOklzc3VlQ29tbWVudDYwNDE4MTI2NA== | hmaarrfk 90008 | 2020-03-26T01:49:45Z | 2020-03-26T01:49:45Z | CONTRIBUTOR | And actually, zarr provides a I guess I can open upstream in zarr, but I think for catching the 0-sized array case, it is probably best to use the |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Opening from zarr.ZipStore fails to read (store???) unicode characters 573577844 | |
604180511 | https://github.com/pydata/xarray/issues/3815#issuecomment-604180511 | https://api.github.com/repos/pydata/xarray/issues/3815 | MDEyOklzc3VlQ29tbWVudDYwNDE4MDUxMQ== | hmaarrfk 90008 | 2020-03-26T01:46:52Z | 2020-03-26T01:46:52Z | CONTRIBUTOR | I think the reason is that for zero-sized arrays, you technically aren't allowed to write data to them. This means that when you create the 0-sized array, you can't actually change the value. Here is a reproducer without xarray:
Though the code path follows what xarray does in the backend. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Opening from zarr.ZipStore fails to read (store???) unicode characters 573577844 | |
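The reproducer itself isn't preserved in this export, but the underlying constraint is the ZIP format: archive entries can be appended, never rewritten in place. Python's zipfile makes this visible (a standalone illustration, my own):
```python
import io
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, mode='w') as zf:
    zf.writestr('hello/0', b'first')
    # A second write of the same key cannot replace the first entry;
    # zipfile appends a duplicate and emits "UserWarning: Duplicate name".
    zf.writestr('hello/0', b'second')
```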
604169147 | https://github.com/pydata/xarray/pull/3888#issuecomment-604169147 | https://api.github.com/repos/pydata/xarray/issues/3888 | MDEyOklzc3VlQ29tbWVudDYwNDE2OTE0Nw== | hmaarrfk 90008 | 2020-03-26T01:05:05Z | 2020-03-26T01:05:05Z | CONTRIBUTOR | Alright, it probably makes more sense to reopen this when the issue gets fixed. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
[WIP] [DEMO] Add tests for ZipStore for zarr 587398134 | |
604160143 | https://github.com/pydata/xarray/pull/3888#issuecomment-604160143 | https://api.github.com/repos/pydata/xarray/issues/3888 | MDEyOklzc3VlQ29tbWVudDYwNDE2MDE0Mw== | hmaarrfk 90008 | 2020-03-26T00:31:15Z | 2020-03-26T00:31:15Z | CONTRIBUTOR | Wouldn't this be a useful test to have? I think the ability to save things in a zip store is quite useful. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
[WIP] [DEMO] Add tests for ZipStore for zarr 587398134 | |
604009141 | https://github.com/pydata/xarray/issues/3815#issuecomment-604009141 | https://api.github.com/repos/pydata/xarray/issues/3815 | MDEyOklzc3VlQ29tbWVudDYwNDAwOTE0MQ== | hmaarrfk 90008 | 2020-03-25T18:26:58Z | 2020-03-25T18:26:58Z | CONTRIBUTOR | @jakirkham not sure if you have any thoughts on why the code above is bugging out. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Opening from zarr.ZipStore fails to read (store???) unicode characters 573577844 | |
604008658 | https://github.com/pydata/xarray/issues/3815#issuecomment-604008658 | https://api.github.com/repos/pydata/xarray/issues/3815 | MDEyOklzc3VlQ29tbWVudDYwNDAwODY1OA== | hmaarrfk 90008 | 2020-03-25T18:26:10Z | 2020-03-25T18:26:10Z | CONTRIBUTOR | Honestly, I've found that keeping things like:
* The version of different software that was used.

seems more like an attribute, but really, it is all data, so :/. Regarding the original issue, it seems that you are right in the sense that a 0-dimensional string might be buggy in zarr itself. I guess we (when we have time) will have to dig down to find an MVC example that reproduces the issue without xarray to submit to Zarr. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Opening from zarr.ZipStore fails to read (store???) unicode characters 573577844 | |
603921577 | https://github.com/pydata/xarray/issues/3815#issuecomment-603921577 | https://api.github.com/repos/pydata/xarray/issues/3815 | MDEyOklzc3VlQ29tbWVudDYwMzkyMTU3Nw== | hmaarrfk 90008 | 2020-03-25T15:54:37Z | 2020-03-25T15:54:37Z | CONTRIBUTOR | Hmm, interesting! I've avoided attrs since they often get "lost" in computation and don't get dragged along as rigorously as coordinates. I do have some real coordinates that are stored as strings. Thanks for the quick feedback. Here is the reproducing code without using context managers (which auto-close things, you know):
```python
import xarray as xr
import zarr

x = xr.Dataset()
x['hello'] = 'world'

with zarr.ZipStore('test_store.zip', mode='w') as store:
    x.to_zarr(store)

read_store = zarr.ZipStore('test_store.zip', mode='r')
x_read = xr.open_zarr(read_store).compute()  # read back, as in the traceback elsewhere in this thread
# The error will happen before this line is executed
read_store.close()
```
 |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Opening from zarr.ZipStore fails to read (store???) unicode characters 573577844 | |
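A small illustration of why attrs get "lost" while coordinates survive (my own sketch; default xarray behavior, without `keep_attrs`):
```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(3), dims='x',
                  attrs={'units': 'm'},         # metadata as an attr
                  coords={'version': '1.2.3'})  # metadata as a scalar coord

print((da * 2).attrs)   # {} -- attrs are dropped by default in arithmetic
print((da * 2).coords)  # the scalar 'version' coordinate is carried along
```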
603608958 | https://github.com/pydata/xarray/issues/3815#issuecomment-603608958 | https://api.github.com/repos/pydata/xarray/issues/3815 | MDEyOklzc3VlQ29tbWVudDYwMzYwODk1OA== | hmaarrfk 90008 | 2020-03-25T02:45:12Z | 2020-03-25T02:45:12Z | CONTRIBUTOR | I will have to try the debugging things you mentioned at some later time :/ |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Opening from zarr.ZipStore fails to read (store???) unicode characters 573577844 | |
603608822 | https://github.com/pydata/xarray/issues/3815#issuecomment-603608822 | https://api.github.com/repos/pydata/xarray/issues/3815 | MDEyOklzc3VlQ29tbWVudDYwMzYwODgyMg== | hmaarrfk 90008 | 2020-03-25T02:44:40Z | 2020-03-25T02:44:40Z | CONTRIBUTOR | Not sure if the builds in https://github.com/pydata/xarray/pull/3888 help reproduce things or not? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Opening from zarr.ZipStore fails to read (store???) unicode characters 573577844 | |
603601762 | https://github.com/pydata/xarray/issues/3815#issuecomment-603601762 | https://api.github.com/repos/pydata/xarray/issues/3815 | MDEyOklzc3VlQ29tbWVudDYwMzYwMTc2Mg== | hmaarrfk 90008 | 2020-03-25T02:16:29Z | 2020-03-25T02:16:29Z | CONTRIBUTOR | Hmm, I didn't realize this. I'm running from conda-forge + Linux. Let me try on your CIs. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Opening from zarr.ZipStore fails to read (store???) unicode characters 573577844 | |
603556048 | https://github.com/pydata/xarray/issues/3815#issuecomment-603556048 | https://api.github.com/repos/pydata/xarray/issues/3815 | MDEyOklzc3VlQ29tbWVudDYwMzU1NjA0OA== | hmaarrfk 90008 | 2020-03-24T23:24:53Z | 2020-03-24T23:24:53Z | CONTRIBUTOR | See the ZipStore example in my first comment. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Opening from zarr.ZipStore fails to read (store???) unicode characters 573577844 | |
603555953 | https://github.com/pydata/xarray/issues/3815#issuecomment-603555953 | https://api.github.com/repos/pydata/xarray/issues/3815 | MDEyOklzc3VlQ29tbWVudDYwMzU1NTk1Mw== | hmaarrfk 90008 | 2020-03-24T23:24:34Z | 2020-03-24T23:24:34Z | CONTRIBUTOR | I thought I provided it, but in any case, here is my traceback:
```python
In [3]: import xarray as xr
...: import zarr
...: x = xr.Dataset()
...: x['hello'] = 'world'
...: x
...: with zarr.ZipStore('test_store.zip', mode='w') as store:
...: x.to_zarr(store)
...: with zarr.ZipStore('test_store.zip', mode='r') as store:
...: x_read = xr.open_zarr(store).compute()
...:
---------------------------------------------------------------------------
BadZipFile Traceback (most recent call last)
<ipython-input-3-5ad5a0456766> in <module>
7 x.to_zarr(store)
8 with zarr.ZipStore('test_store.zip', mode='r') as store:
----> 9 x_read = xr.open_zarr(store).compute()
10
~/miniconda3/envs/mcam_dev/lib/python3.7/site-packages/xarray/core/dataset.py in compute(self, **kwargs)
805 """
806 new = self.copy(deep=False)
--> 807 return new.load(**kwargs)
808
809 def _persist_inplace(self, **kwargs) -> "Dataset":
~/miniconda3/envs/mcam_dev/lib/python3.7/site-packages/xarray/core/dataset.py in load(self, **kwargs)
657 for k, v in self.variables.items():
658 if k not in lazy_data:
--> 659 v.load()
660
661 return self
~/miniconda3/envs/mcam_dev/lib/python3.7/site-packages/xarray/core/variable.py in load(self, **kwargs)
373 self._data = as_compatible_data(self._data.compute(**kwargs))
374 elif not hasattr(self._data, "__array_function__"):
--> 375 self._data = np.asarray(self._data)
376 return self
377
~/miniconda3/envs/mcam_dev/lib/python3.7/site-packages/numpy/core/_asarray.py in asarray(a, dtype, order)
83
84 """
---> 85 return array(a, dtype, copy=False, order=order)
86
87
~/miniconda3/envs/mcam_dev/lib/python3.7/site-packages/xarray/core/indexing.py in __array__(self, dtype)
555 def __array__(self, dtype=None):
556 array = as_indexable(self.array)
--> 557 return np.asarray(array[self.key], dtype=None)
558
559 def transpose(self, order):
~/miniconda3/envs/mcam_dev/lib/python3.7/site-packages/xarray/backends/zarr.py in __getitem__(self, key)
47 array = self.get_array()
48 if isinstance(key, indexing.BasicIndexer):
---> 49 return array[key.tuple]
50 elif isinstance(key, indexing.VectorizedIndexer):
51 return array.vindex[
~/miniconda3/envs/mcam_dev/lib/python3.7/site-packages/zarr/core.py in __getitem__(self, selection)
570
571 fields, selection = pop_fields(selection)
--> 572 return self.get_basic_selection(selection, fields=fields)
573
574 def get_basic_selection(self, selection=Ellipsis, out=None, fields=None):
~/miniconda3/envs/mcam_dev/lib/python3.7/site-packages/zarr/core.py in get_basic_selection(self, selection, out, fields)
693 if self._shape == ():
694 return self._get_basic_selection_zd(selection=selection, out=out,
--> 695 fields=fields)
696 else:
697 return self._get_basic_selection_nd(selection=selection, out=out,
~/miniconda3/envs/mcam_dev/lib/python3.7/site-packages/zarr/core.py in _get_basic_selection_zd(self, selection, out, fields)
709 # obtain encoded data for chunk
710 ckey = self._chunk_key((0,))
--> 711 cdata = self.chunk_store[ckey]
712
713 except KeyError:
~/miniconda3/envs/mcam_dev/lib/python3.7/site-packages/zarr/storage.py in __getitem__(self, key)
1249 with self.mutex:
1250 with self.zf.open(key) as f: # will raise KeyError
-> 1251 return f.read()
1252
1253 def __setitem__(self, key, value):
~/miniconda3/envs/mcam_dev/lib/python3.7/zipfile.py in read(self, n)
914 self._offset = 0
915 while not self._eof:
--> 916 buf += self._read1(self.MAX_N)
917 return buf
918
~/miniconda3/envs/mcam_dev/lib/python3.7/zipfile.py in _read1(self, n)
1018 if self._left <= 0:
1019 self._eof = True
-> 1020 self._update_crc(data)
1021 return data
1022
~/miniconda3/envs/mcam_dev/lib/python3.7/zipfile.py in _update_crc(self, newdata)
946 # Check the CRC if we're at the end of the file
947 if self._eof and self._running_crc != self._expected_crc:
--> 948 raise BadZipFile("Bad CRC-32 for file %r" % self.name)
949
950 def read1(self, n):
BadZipFile: Bad CRC-32 for file 'hello/0'
```
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Opening from zarr.ZipStore fails to read (store???) unicode characters 573577844 | |
603190621 | https://github.com/pydata/xarray/issues/3815#issuecomment-603190621 | https://api.github.com/repos/pydata/xarray/issues/3815 | MDEyOklzc3VlQ29tbWVudDYwMzE5MDYyMQ== | hmaarrfk 90008 | 2020-03-24T11:41:37Z | 2020-03-24T11:41:37Z | CONTRIBUTOR | My guess is that xarray might be trying to write to the store character by character??? Otherwise, not too sure. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Opening from zarr.ZipStore fails to read (store???) unicode characters 573577844 | |
552652019 | https://github.com/pydata/xarray/issues/2799#issuecomment-552652019 | https://api.github.com/repos/pydata/xarray/issues/2799 | MDEyOklzc3VlQ29tbWVudDU1MjY1MjAxOQ== | hmaarrfk 90008 | 2019-11-11T22:47:47Z | 2019-11-11T22:47:47Z | CONTRIBUTOR | Sure, I just wanted to note that this operation should be more or less constant-time, as opposed to dependent on the size of the array. Somebody had mentioned it should increase with the size of the array. |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Performance: numpy indexes small amounts of data 1000 faster than xarray 416962458 | |
552619589 | https://github.com/pydata/xarray/issues/2799#issuecomment-552619589 | https://api.github.com/repos/pydata/xarray/issues/2799 | MDEyOklzc3VlQ29tbWVudDU1MjYxOTU4OQ== | hmaarrfk 90008 | 2019-11-11T21:16:36Z | 2019-11-11T21:16:36Z | CONTRIBUTOR | Hmm, slicing should basically be a no-op. The fact that xarray makes it about 100x slower is a real killer. It seems from this conversation that it might be hard to work around:
```python
import xarray as xr
import numpy as np
n = np.zeros(shape=(1024, 1024))
x = xr.DataArray(n, dims=('y', 'x'))
the_slice = np.s_[256:512, 256:512]
%timeit n[the_slice]
%timeit x[the_slice]
186 ns ± 0.778 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
70.3 µs ± 593 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
```
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Performance: numpy indexes small amounts of data 1000 faster than xarray 416962458 | |
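One workaround (my suggestion, not from the thread): the overhead is per xarray call rather than per element, so in a hot loop you can drop to the underlying numpy array first.
```python
import numpy as np
import xarray as xr

n = np.zeros(shape=(1024, 1024))
x = xr.DataArray(n, dims=('y', 'x'))
the_slice = np.s_[256:512, 256:512]

raw = x.data            # the underlying numpy array; no copy is made
%timeit raw[the_slice]  # back to ~numpy speed, labels not tracked
```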
451767431 | https://github.com/pydata/xarray/issues/2347#issuecomment-451767431 | https://api.github.com/repos/pydata/xarray/issues/2347 | MDEyOklzc3VlQ29tbWVudDQ1MTc2NzQzMQ== | hmaarrfk 90008 | 2019-01-06T19:25:53Z | 2019-01-06T19:25:53Z | CONTRIBUTOR | Mind blown!!!! Thanks for that pointer. I haven't touched my serialization code in a while, and I'm kinda scared to go back to it now, but I will keep that library in mind. I saw Zarr a while back; it looks cool. I hope to see it grow. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Serialization of just coordinates 347962055 | |
451765999 | https://github.com/pydata/xarray/issues/2347#issuecomment-451765999 | https://api.github.com/repos/pydata/xarray/issues/2347 | MDEyOklzc3VlQ29tbWVudDQ1MTc2NTk5OQ== | hmaarrfk 90008 | 2019-01-06T19:06:53Z | 2019-01-06T19:06:53Z | CONTRIBUTOR | No need to be sorry. These two functions were easy enough for me to do myself in my own codebase. There are a few issues that I've found doing this, though.
Mainly, I can't find a good way to serialize numpy arrays in a round-trippable fashion.
It is difficult to get back |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Serialization of just coordinates 347962055 | |
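One concrete instance of the round-trip problem (my own sketch): `to_dict`/`from_dict` turn the data into plain Python lists, so the dtype does not survive.
```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(3, dtype='uint8'), dims='x')
rt = xr.DataArray.from_dict(da.to_dict())  # 'data' went through a plain list
print(da.dtype, '->', rt.dtype)            # uint8 -> int64 (platform default): dtype is lost
```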
416994400 | https://github.com/pydata/xarray/issues/2251#issuecomment-416994400 | https://api.github.com/repos/pydata/xarray/issues/2251 | MDEyOklzc3VlQ29tbWVudDQxNjk5NDQwMA== | hmaarrfk 90008 | 2018-08-29T15:24:07Z | 2018-08-29T15:24:07Z | CONTRIBUTOR | @shoyer, @fmaussion thank you for your answers. I'm OK with this issue being closed. I'm no expert on netcdf4, so I don't know if I could express the issue in a concise manner there. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
netcdf roundtrip fails to preserve the shape of numpy arrays in attributes 335608017 | |
410759337 | https://github.com/pydata/xarray/pull/2344#issuecomment-410759337 | https://api.github.com/repos/pydata/xarray/issues/2344 | MDEyOklzc3VlQ29tbWVudDQxMDc1OTMzNw== | hmaarrfk 90008 | 2018-08-06T16:02:09Z | 2018-08-06T16:02:09Z | CONTRIBUTOR | Thanks! |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
FutureWarning: creation of DataArrays w/ coords Dataset 347712372 | |
410575268 | https://github.com/pydata/xarray/pull/2344#issuecomment-410575268 | https://api.github.com/repos/pydata/xarray/issues/2344 | MDEyOklzc3VlQ29tbWVudDQxMDU3NTI2OA== | hmaarrfk 90008 | 2018-08-06T02:55:12Z | 2018-08-06T02:55:12Z | CONTRIBUTOR | Maybe the issue that I am facing is that I want to deal with the storage of my metadata and data separately. I used to have my own library that was replicating much of xarray's functionality, but your code is much nicer than anything I would be able to write in a finite time. :smile: Following the information here: http://xarray.pydata.org/en/stable/data-structures.html#coordinates-methods Currently, my serialization pipeline is:
```python
import xarray as xr
import numpy as np

# Setup an array with coordinates
n = np.zeros(3)
coords = {'x': np.arange(3)}
m = xr.DataArray(n, dims=['x'], coords=coords)

coords_dataset_dict = m.coords.to_dataset().to_dict()
coords_dict = coords_dataset_dict['coords']

# Read/write the dictionary to a JSON file here

# This works, but I'm essentially creating an empty dataset for it
coords_set = xr.Dataset.from_dict(coords_dataset_dict)
coords2 = coords_set.coords  # so many

# I used to just pass the dataset to "coords"
m3 = xr.DataArray(np.zeros(shape=m.shape), dims=m.dims, coords=coords_set)
```
 |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
FutureWarning: creation of DataArrays w/ coords Dataset 347712372 | |
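Assuming the suggestion referenced in this thread was to pass the `.coords` of the round-tripped dataset rather than the `Dataset` itself, the pipeline would end like this (a sketch, my own):
```python
import numpy as np
import xarray as xr

m = xr.DataArray(np.zeros(3), dims=['x'], coords={'x': np.arange(3)})
d = m.coords.to_dataset().to_dict()  # JSON-serializable coordinates
coords_set = xr.Dataset.from_dict(d)

# Passing a Coordinates object instead of a Dataset avoids the
# FutureWarning this PR is about.
m3 = xr.DataArray(np.zeros(m.shape), dims=m.dims, coords=coords_set.coords)
```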
410572206 | https://github.com/pydata/xarray/pull/2344#issuecomment-410572206 | https://api.github.com/repos/pydata/xarray/issues/2344 | MDEyOklzc3VlQ29tbWVudDQxMDU3MjIwNg== | hmaarrfk 90008 | 2018-08-06T02:31:02Z | 2018-08-06T02:31:02Z | CONTRIBUTOR | Is there a better way to serialize coordinates only? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
FutureWarning: creation of DataArrays w/ coords Dataset 347712372 | |
410572013 | https://github.com/pydata/xarray/pull/2344#issuecomment-410572013 | https://api.github.com/repos/pydata/xarray/issues/2344 | MDEyOklzc3VlQ29tbWVudDQxMDU3MjAxMw== | hmaarrfk 90008 | 2018-08-06T02:29:34Z | 2018-08-06T02:29:34Z | CONTRIBUTOR | It seems like this warning isn't benign, though. I will take your suggestion ( I feel like I'm not the only one who probably did this. Should you raise another warning explicitly? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
FutureWarning: creation of DataArrays w/ coords Dataset 347712372 | |
410532428 | https://github.com/pydata/xarray/pull/2344#issuecomment-410532428 | https://api.github.com/repos/pydata/xarray/issues/2344 | MDEyOklzc3VlQ29tbWVudDQxMDUzMjQyOA== | hmaarrfk 90008 | 2018-08-05T16:45:27Z | 2018-08-05T16:45:27Z | CONTRIBUTOR | I came across this when serializing/deserializing my coordinates to a JSON file. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
FutureWarning: creation of DataArrays w/ coords Dataset 347712372 | |
410488222 | https://github.com/pydata/xarray/issues/2340#issuecomment-410488222 | https://api.github.com/repos/pydata/xarray/issues/2340 | MDEyOklzc3VlQ29tbWVudDQxMDQ4ODIyMg== | hmaarrfk 90008 | 2018-08-05T01:15:39Z | 2018-08-05T01:15:49Z | CONTRIBUTOR | Finishing up this line of thought: without the assumption that the relative order of dimensions is maintained across arrays in a set, this feature is impossible to implement as a neat function call. You would have to specify exactly how to expand each of the coordinates, which can get pretty long. I wrote some code that I think should have worked if relative ordering were a valid assumption; here it is for reference: https://github.com/hmaarrfk/xarray/pull/1 To obtain the desired effect, you have to expand the dimensions of the coordinates individually:
```python
import xarray as xr
import numpy as np

# Setup an array with coordinates
n = np.arange(1, 13).reshape(3, 2, 2)
coords = {'y': np.arange(1, 4), 'x': np.arange(1, 3), 'xi': np.arange(2)}

# %%
z = xr.DataArray(n[..., 0]**2, dims=['y', 'x'])
a = xr.DataArray(n, dims=['y', 'x', 'xi'], coords={**coords, 'z': z})
sliced = a[0]
print("The original xarray")
print(a.z)
print("The sliced xarray")
print(sliced.z)

# %%
expanded = sliced.expand_dims('y', 0)
expanded['z'] = expanded.z.expand_dims('y', 0)
print(expanded)
```
 |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
expand_dims erases named dim in the array's coordinates 347558405 |
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);