html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue https://github.com/pydata/xarray/issues/7770#issuecomment-1515539273,https://api.github.com/repos/pydata/xarray/issues/7770,1515539273,IC_kwDOAMm_X85aVUtJ,90008,2023-04-20T00:15:23Z,2023-04-20T00:15:23Z,CONTRIBUTOR,"Understood. Thank you for your prompt replies. I'll read up and ask again if I have any questions. I guess in the past I was trying to accommodate users that were not using our wrappers around `to_netcdf`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1675299031 https://github.com/pydata/xarray/pull/4400#issuecomment-1484222279,https://api.github.com/repos/pydata/xarray/issues/4400,1484222279,IC_kwDOAMm_X85Yd29H,90008,2023-03-26T20:59:00Z,2023-03-26T20:59:00Z,CONTRIBUTOR,nice!,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,690546795 https://github.com/pydata/xarray/issues/4079#issuecomment-1434780029,https://api.github.com/repos/pydata/xarray/issues/4079,1434780029,IC_kwDOAMm_X85VhQF9,90008,2023-02-17T15:08:50Z,2023-02-17T15:08:50Z,CONTRIBUTOR,"I know it is ""stale"", but aligning to these ""surprise dimensions"" creates ""late stage"" bugs that are hard to pinpoint. I'm not sure if it is possible to mark these dimensions as ""unnamed"" so that, as such, they would be ""merged"" into new ""unnamed"" dimensions that the user isn't tracking at this point in time. Our workarounds have included calling these dimensions something related to the DataArray (e.g. `d1_i`), or simply making small ""arrays"" into a countable number of scalar variables (`d1_min`, `d1_max`) instead of a single array containing two values (`d1_limits`).

```python
import xarray as xr

d1 = xr.DataArray(data=[1, 2])
assert 'dim_0' in d1.dims

d2 = xr.DataArray(data=[1, 2, 3])
assert 'dim_0' in d2.dims

xr.Dataset({'d1': d1, 'd2': d2})
```
Stack trace ``` --------------------------------------------------------------------------- ValueError Traceback (most recent call last) Cell In[2], line 7 4 d2 = xr.DataArray(data=[1, 2, 3]) 5 assert 'dim_0' in d2.dims ----> 7 xr.Dataset({'d1': d1, 'd2': d2}) File ~/mambaforge/envs/dev/lib/python3.9/site-packages/xarray/core/dataset.py:612, in Dataset.__init__(self, data_vars, coords, attrs) 609 if isinstance(coords, Dataset): 610 coords = coords.variables --> 612 variables, coord_names, dims, indexes, _ = merge_data_and_coords( 613 data_vars, coords, compat=""broadcast_equals"" 614 ) 616 self._attrs = dict(attrs) if attrs is not None else None 617 self._close = None File ~/mambaforge/envs/dev/lib/python3.9/site-packages/xarray/core/merge.py:564, in merge_data_and_coords(data_vars, coords, compat, join) 562 objects = [data_vars, coords] 563 explicit_coords = coords.keys() --> 564 return merge_core( 565 objects, 566 compat, 567 join, 568 explicit_coords=explicit_coords, 569 indexes=Indexes(indexes, coords), 570 ) File ~/mambaforge/envs/dev/lib/python3.9/site-packages/xarray/core/merge.py:741, in merge_core(objects, compat, join, combine_attrs, priority_arg, explicit_coords, indexes, fill_value) 738 _assert_compat_valid(compat) 740 coerced = coerce_pandas_values(objects) --> 741 aligned = deep_align( 742 coerced, join=join, copy=False, indexes=indexes, fill_value=fill_value 743 ) 744 collected = collect_variables_and_indexes(aligned, indexes=indexes) 745 prioritized = _get_priority_vars_and_indexes(aligned, priority_arg, compat=compat) File ~/mambaforge/envs/dev/lib/python3.9/site-packages/xarray/core/alignment.py:848, in deep_align(objects, join, copy, indexes, exclude, raise_on_invalid, fill_value) 845 else: 846 out.append(variables) --> 848 aligned = align( 849 *targets, 850 join=join, 851 copy=copy, 852 indexes=indexes, 853 exclude=exclude, 854 fill_value=fill_value, 855 ) 857 for position, key, aligned_obj in zip(positions, keys, aligned): 858 if key is no_key: File ~/mambaforge/envs/dev/lib/python3.9/site-packages/xarray/core/alignment.py:785, in align(join, copy, indexes, exclude, fill_value, *objects) 589 """""" 590 Given any number of Dataset and/or DataArray objects, returns new 591 objects with aligned indexes and dimension sizes. (...) 775 776 """""" 777 aligner = Aligner( 778 objects, 779 join=join, (...) 783 fill_value=fill_value, 784 ) --> 785 aligner.align() 786 return aligner.results File ~/mambaforge/envs/dev/lib/python3.9/site-packages/xarray/core/alignment.py:573, in Aligner.align(self) 571 self.assert_no_index_conflict() 572 self.align_indexes() --> 573 self.assert_unindexed_dim_sizes_equal() 575 if self.join == ""override"": 576 self.override_indexes() File ~/mambaforge/envs/dev/lib/python3.9/site-packages/xarray/core/alignment.py:472, in Aligner.assert_unindexed_dim_sizes_equal(self) 470 add_err_msg = """" 471 if len(sizes) > 1: --> 472 raise ValueError( 473 f""cannot reindex or align along dimension {dim!r} "" 474 f""because of conflicting dimension sizes: {sizes!r}"" + add_err_msg 475 ) ValueError: cannot reindex or align along dimension 'dim_0' because of conflicting dimension sizes: {2, 3} ```
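For reference, a minimal sketch of the rename workaround mentioned above (the dimension names `d1_i` and `d2_i` are just illustrative):

```python
import xarray as xr

# give each unnamed dimension a name tied to its variable before merging
d1 = xr.DataArray(data=[1, 2]).rename({'dim_0': 'd1_i'})
d2 = xr.DataArray(data=[1, 2, 3]).rename({'dim_0': 'd2_i'})

# no more conflicting sizes for 'dim_0'
ds = xr.Dataset({'d1': d1, 'd2': d2})
assert ds['d1'].dims == ('d1_i',)
assert ds['d2'].dims == ('d2_i',)
```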
cc: @claydugo","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,621078539 https://github.com/pydata/xarray/issues/7513#issuecomment-1421384646,https://api.github.com/repos/pydata/xarray/issues/7513,1421384646,IC_kwDOAMm_X85UuJvG,90008,2023-02-07T20:15:42Z,2023-02-07T20:15:42Z,CONTRIBUTOR,"I kinda think this reminds me of https://github.com/pydata/xarray/discussions/7359","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1574694462 https://github.com/pydata/xarray/issues/5081#issuecomment-1412388313,https://api.github.com/repos/pydata/xarray/issues/5081,1412388313,IC_kwDOAMm_X85UL1XZ,90008,2023-02-01T16:54:53Z,2023-02-01T16:54:53Z,CONTRIBUTOR,"As a follow-up question, is the `LazilyIndexedArray` part of the 'public API'? That is, when you do decide to refactor https://docs.xarray.dev/en/stable/generated/xarray.core.indexing.LazilyIndexedArray.html will you try to warn those of us who choose to

```
from xarray.core.indexing import LazilyIndexedArray
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,842436143 https://github.com/pydata/xarray/issues/5081#issuecomment-1412379773,https://api.github.com/repos/pydata/xarray/issues/5081,1412379773,IC_kwDOAMm_X85ULzR9,90008,2023-02-01T16:49:15Z,2023-02-01T16:49:15Z,CONTRIBUTOR,"I'm going to say, the `LazilyIndexedArray` is pretty cool. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,842436143 https://github.com/pydata/xarray/pull/4395#issuecomment-1411104404,https://api.github.com/repos/pydata/xarray/issues/4395,1411104404,IC_kwDOAMm_X85UG76U,90008,2023-01-31T21:39:15Z,2023-01-31T21:39:15Z,CONTRIBUTOR,"Ultimately, I'm not sure how you want to manage resources. This zarr store could be considered a resource and thus, may have an owner. Or maybe zarr should close itself upon garbage collection.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,689502005 https://github.com/pydata/xarray/pull/4395#issuecomment-1411102778,https://api.github.com/repos/pydata/xarray/issues/4395,1411102778,IC_kwDOAMm_X85UG7g6,90008,2023-01-31T21:38:23Z,2023-01-31T21:38:23Z,CONTRIBUTOR,I'm not sure. I decided not to use zarr (not now) so I lost interest. Sorry.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,689502005 https://github.com/pydata/xarray/issues/7245#issuecomment-1383192335,https://api.github.com/repos/pydata/xarray/issues/7245,1383192335,IC_kwDOAMm_X85ScdcP,90008,2023-01-15T16:23:15Z,2023-01-15T16:23:15Z,CONTRIBUTOR,"Thank you for your explanation. Do you think it is safe to ""strip"" encoding after ""loading"" the data? 
Or is it still used after the initial call to `open_dataset`?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1432388736 https://github.com/pydata/xarray/issues/7245#issuecomment-1369001951,https://api.github.com/repos/pydata/xarray/issues/7245,1369001951,IC_kwDOAMm_X85RmU_f,90008,2023-01-02T14:41:45Z,2023-01-02T14:41:45Z,CONTRIBUTOR,Kind bump,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1432388736 https://github.com/pydata/xarray/pull/7356#issuecomment-1362322800,https://api.github.com/repos/pydata/xarray/issues/7356,1362322800,IC_kwDOAMm_X85RM2Vw,90008,2022-12-22T02:40:59Z,2022-12-22T02:40:59Z,CONTRIBUTOR,"Any chance of a release? This is quite breaking for large datasets that can only be processed out of memory.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1475567394 https://github.com/pydata/xarray/pull/7356#issuecomment-1346924547,https://api.github.com/repos/pydata/xarray/issues/7356,1346924547,IC_kwDOAMm_X85QSHAD,90008,2022-12-12T17:27:47Z,2022-12-12T17:27:47Z,CONTRIBUTOR,👍🏼 ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1475567394 https://github.com/pydata/xarray/pull/7356#issuecomment-1339624818,https://api.github.com/repos/pydata/xarray/issues/7356,1339624818,IC_kwDOAMm_X85P2Q1y,90008,2022-12-06T16:19:19Z,2022-12-06T16:19:19Z,CONTRIBUTOR,"Yes, without chunks of any kind","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1475567394 https://github.com/pydata/xarray/pull/7356#issuecomment-1339624418,https://api.github.com/repos/pydata/xarray/issues/7356,1339624418,IC_kwDOAMm_X85P2Qvi,90008,2022-12-06T16:18:59Z,2022-12-06T16:18:59Z,CONTRIBUTOR,Very smart test!,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1475567394 https://github.com/pydata/xarray/pull/7356#issuecomment-1339457617,https://api.github.com/repos/pydata/xarray/issues/7356,1339457617,IC_kwDOAMm_X85P1oBR,90008,2022-12-06T14:18:11Z,2022-12-06T14:18:11Z,CONTRIBUTOR,The data is loaded from a NetCDF store through `open_dataset`,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1475567394 https://github.com/pydata/xarray/pull/7356#issuecomment-1339452942,https://api.github.com/repos/pydata/xarray/issues/7356,1339452942,IC_kwDOAMm_X85P1m4O,90008,2022-12-06T14:14:57Z,2022-12-06T14:14:57Z,CONTRIBUTOR,"No explicit test was added to ensure that the data wasn't loaded. I just experienced this bug enough (we would accidentally load 100GB files in our code base) that I knew exactly how to fix it. If you want, I can add a test to ensure that future optimizations to `nbytes` do not trigger a data load. 
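A rough sketch of what such a guard test could look like (the `ExplodingArray` class is hypothetical, and depending on the xarray version the duck-array checks may require more protocol methods):

```python
import numpy as np
import xarray as xr

class ExplodingArray:
    # Looks like an array (shape/dtype) but refuses to materialize.
    shape = (1000,)
    dtype = np.dtype('float64')

    def __array_function__(self, func, types, args, kwargs):
        raise AssertionError('data must not be touched')

    def __array__(self, dtype=None):
        raise AssertionError('nbytes must not load the data!')

var = xr.Variable(('x',), ExplodingArray())
assert var.nbytes == 1000 * 8  # size * itemsize, computed without a load
```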
I was hoping the one-line fix would be a shoo-in.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1475567394 https://github.com/pydata/xarray/pull/7356#issuecomment-1336731702,https://api.github.com/repos/pydata/xarray/issues/7356,1336731702,IC_kwDOAMm_X85PrOg2,90008,2022-12-05T04:20:08Z,2022-12-05T04:20:08Z,CONTRIBUTOR,It seems that checking hasattr on the `_data` variable achieves both purposes.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1475567394 https://github.com/pydata/xarray/pull/7356#issuecomment-1336711830,https://api.github.com/repos/pydata/xarray/issues/7356,1336711830,IC_kwDOAMm_X85PrJqW,90008,2022-12-05T03:58:50Z,2022-12-05T03:58:50Z,CONTRIBUTOR,"I think that at the very least, the current implementation works as well as the old one for arrays that are defined by the `sparse` package.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1475567394 https://github.com/pydata/xarray/pull/7356#issuecomment-1336700669,https://api.github.com/repos/pydata/xarray/issues/7356,1336700669,IC_kwDOAMm_X85PrG79,90008,2022-12-05T03:36:31Z,2022-12-05T03:36:31Z,CONTRIBUTOR,"Looking into the history a little more, I seem to be proposing to revert: https://github.com/pydata/xarray/commit/60f8c3d3488d377b0b21009422c6121e1c8f1f70 I think this is important since many users have arrays that are larger than memory. For me, I found this bug when trying to access the number of bytes in a 16GB dataset that I'm trying to load on my wimpy laptop. Not fun to start swapping. I feel like others might be hitting this too. xref: https://github.com/pydata/xarray/pull/6797 https://github.com/pydata/xarray/issues/4842","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1475567394 https://github.com/pydata/xarray/pull/7356#issuecomment-1336696899,https://api.github.com/repos/pydata/xarray/issues/7356,1336696899,IC_kwDOAMm_X85PrGBD,90008,2022-12-05T03:30:31Z,2022-12-05T03:30:31Z,CONTRIBUTOR,I personally do not even think the `hasattr` is really that useful. You might as well use `size` and `itemsize`,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1475567394 https://github.com/pydata/xarray/issues/7259#issuecomment-1320956883,https://api.github.com/repos/pydata/xarray/issues/7259,1320956883,IC_kwDOAMm_X85OvDPT,90008,2022-11-19T19:51:27Z,2022-11-19T19:51:27Z,CONTRIBUTOR,"I'm really not sure. It seems to happen with a large swath of versions from my recent search. Also, running from the python REPL, I don't see the warning, which makes me feel like numpy/cython/netcdf4 are trying to suppress the harmless warning. 
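In the meantime, anyone who wants to silence it explicitly can do so with just the standard library (a sketch; the message filter matches the warning text shown below):

```python
import warnings

with warnings.catch_warnings():
    warnings.filterwarnings('ignore', message='numpy.ndarray size changed')
    import netCDF4
```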
https://github.com/cython/cython/blob/0.29.x/Cython/Utility/ImportExport.c#L365","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1437481995 https://github.com/pydata/xarray/issues/7259#issuecomment-1320953994,https://api.github.com/repos/pydata/xarray/issues/7259,1320953994,IC_kwDOAMm_X85OvCiK,90008,2022-11-19T19:33:37Z,2022-11-19T19:33:37Z,CONTRIBUTOR,one or the other.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1437481995 https://github.com/pydata/xarray/issues/7259#issuecomment-1320953155,https://api.github.com/repos/pydata/xarray/issues/7259,1320953155,IC_kwDOAMm_X85OvCVD,90008,2022-11-19T19:28:20Z,2022-11-19T19:28:53Z,CONTRIBUTOR,"I think it is a numpy thing ```bash mamba create --name np numpy netcdf4 --channel conda-forge --override-channels conda activate np python -c ""import numpy; import warnings; warnings.filterwarnings('error'); import netCDF4"" ``` ```python Traceback (most recent call last): File """", line 1, in File ""/home/mark/mambaforge/envs/np/lib/python3.11/site-packages/netCDF4/__init__.py"", line 3, in from ._netCDF4 import * File ""src/netCDF4/_netCDF4.pyx"", line 1, in init netCDF4._netCDF4 RuntimeWarning: numpy.ndarray size changed, may indicate binary incompatibility. Expected 16 from C header, got 96 from PyObject ```
``` $ mamba list # packages in environment at /home/mark/mambaforge/envs/np: # # Name Version Build Channel _libgcc_mutex 0.1 conda_forge conda-forge _openmp_mutex 4.5 2_gnu conda-forge bzip2 1.0.8 h7f98852_4 conda-forge c-ares 1.18.1 h7f98852_0 conda-forge ca-certificates 2022.9.24 ha878542_0 conda-forge cftime 1.6.2 py311h4c7f6c3_1 conda-forge curl 7.86.0 h2283fc2_1 conda-forge hdf4 4.2.15 h9772cbc_5 conda-forge hdf5 1.12.2 nompi_h4df4325_100 conda-forge icu 70.1 h27087fc_0 conda-forge jpeg 9e h166bdaf_2 conda-forge keyutils 1.6.1 h166bdaf_0 conda-forge krb5 1.19.3 h08a2579_0 conda-forge ld_impl_linux-64 2.39 hc81fddc_0 conda-forge libblas 3.9.0 16_linux64_openblas conda-forge libcblas 3.9.0 16_linux64_openblas conda-forge libcurl 7.86.0 h2283fc2_1 conda-forge libedit 3.1.20191231 he28a2e2_2 conda-forge libev 4.33 h516909a_1 conda-forge libffi 3.4.2 h7f98852_5 conda-forge libgcc-ng 12.2.0 h65d4601_19 conda-forge libgfortran-ng 12.2.0 h69a702a_19 conda-forge libgfortran5 12.2.0 h337968e_19 conda-forge libgomp 12.2.0 h65d4601_19 conda-forge libiconv 1.17 h166bdaf_0 conda-forge liblapack 3.9.0 16_linux64_openblas conda-forge libnetcdf 4.8.1 nompi_h261ec11_106 conda-forge libnghttp2 1.47.0 hff17c54_1 conda-forge libnsl 2.0.0 h7f98852_0 conda-forge libopenblas 0.3.21 pthreads_h78a6416_3 conda-forge libsqlite 3.40.0 h753d276_0 conda-forge libssh2 1.10.0 hf14f497_3 conda-forge libstdcxx-ng 12.2.0 h46fd767_19 conda-forge libuuid 2.32.1 h7f98852_1000 conda-forge libxml2 2.10.3 h7463322_0 conda-forge libzip 1.9.2 hc929e4a_1 conda-forge libzlib 1.2.13 h166bdaf_4 conda-forge ncurses 6.3 h27087fc_1 conda-forge netcdf4 1.6.2 nompi_py311hc6fcf29_100 conda-forge numpy 1.23.4 py311h7d28db0_1 conda-forge openssl 3.0.7 h166bdaf_0 conda-forge pip 22.3.1 pyhd8ed1ab_0 conda-forge python 3.11.0 ha86cf86_0_cpython conda-forge python_abi 3.11 2_cp311 conda-forge readline 8.1.2 h0f457ee_0 conda-forge setuptools 65.5.1 pyhd8ed1ab_0 conda-forge tk 8.6.12 h27826a3_0 conda-forge tzdata 2022f h191b570_0 conda-forge wheel 0.38.4 pyhd8ed1ab_0 conda-forge xz 5.2.6 h166bdaf_0 conda-forge zlib 1.2.13 h166bdaf_4 conda-forge ```
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1437481995 https://github.com/pydata/xarray/issues/7259#issuecomment-1320950377,https://api.github.com/repos/pydata/xarray/issues/7259,1320950377,IC_kwDOAMm_X85OvBpp,90008,2022-11-19T19:14:04Z,2022-11-19T19:14:04Z,CONTRIBUTOR,"``` mamba create --name xr numpy pandas xarray netcdf4 --channel conda-forge --override-channels conda activate xr python -c ""import xarray; import warnings; warnings.filterwarnings('error'); import netCDF4"" ``` ``` Traceback (most recent call last): File """", line 1, in File ""/home/mark/mambaforge/envs/xr/lib/python3.11/site-packages/netCDF4/__init__.py"", line 3, in from ._netCDF4 import * File ""src/netCDF4/_netCDF4.pyx"", line 1, in init netCDF4._netCDF4 RuntimeWarning: numpy.ndarray size changed, may indicate binary incompatibility. Expected 16 from C header, got 96 from PyObject ```
`mamba list` ``` mamba list # packages in environment at /home/mark/mambaforge/envs/xr: # # Name Version Build Channel _libgcc_mutex 0.1 conda_forge conda-forge _openmp_mutex 4.5 2_gnu conda-forge bzip2 1.0.8 h7f98852_4 conda-forge c-ares 1.18.1 h7f98852_0 conda-forge ca-certificates 2022.9.24 ha878542_0 conda-forge cftime 1.6.2 py311h4c7f6c3_1 conda-forge curl 7.86.0 h2283fc2_1 conda-forge hdf4 4.2.15 h9772cbc_5 conda-forge hdf5 1.12.2 nompi_h4df4325_100 conda-forge icu 70.1 h27087fc_0 conda-forge jpeg 9e h166bdaf_2 conda-forge keyutils 1.6.1 h166bdaf_0 conda-forge krb5 1.19.3 h08a2579_0 conda-forge ld_impl_linux-64 2.39 hc81fddc_0 conda-forge libblas 3.9.0 16_linux64_openblas conda-forge libcblas 3.9.0 16_linux64_openblas conda-forge libcurl 7.86.0 h2283fc2_1 conda-forge libedit 3.1.20191231 he28a2e2_2 conda-forge libev 4.33 h516909a_1 conda-forge libffi 3.4.2 h7f98852_5 conda-forge libgcc-ng 12.2.0 h65d4601_19 conda-forge libgfortran-ng 12.2.0 h69a702a_19 conda-forge libgfortran5 12.2.0 h337968e_19 conda-forge libgomp 12.2.0 h65d4601_19 conda-forge libiconv 1.17 h166bdaf_0 conda-forge liblapack 3.9.0 16_linux64_openblas conda-forge libnetcdf 4.8.1 nompi_h261ec11_106 conda-forge libnghttp2 1.47.0 hff17c54_1 conda-forge libnsl 2.0.0 h7f98852_0 conda-forge libopenblas 0.3.21 pthreads_h78a6416_3 conda-forge libsqlite 3.40.0 h753d276_0 conda-forge libssh2 1.10.0 hf14f497_3 conda-forge libstdcxx-ng 12.2.0 h46fd767_19 conda-forge libuuid 2.32.1 h7f98852_1000 conda-forge libxml2 2.10.3 h7463322_0 conda-forge libzip 1.9.2 hc929e4a_1 conda-forge libzlib 1.2.13 h166bdaf_4 conda-forge ncurses 6.3 h27087fc_1 conda-forge netcdf4 1.6.2 nompi_py311hc6fcf29_100 conda-forge numpy 1.23.4 py311h7d28db0_1 conda-forge openssl 3.0.7 h166bdaf_0 conda-forge packaging 21.3 pyhd8ed1ab_0 conda-forge pandas 1.5.1 py311h8b32b4d_1 conda-forge pip 22.3.1 pyhd8ed1ab_0 conda-forge pyparsing 3.0.9 pyhd8ed1ab_0 conda-forge python 3.11.0 ha86cf86_0_cpython conda-forge python-dateutil 2.8.2 pyhd8ed1ab_0 conda-forge python_abi 3.11 2_cp311 conda-forge pytz 2022.6 pyhd8ed1ab_0 conda-forge readline 8.1.2 h0f457ee_0 conda-forge setuptools 65.5.1 pyhd8ed1ab_0 conda-forge six 1.16.0 pyh6c4a22f_0 conda-forge tk 8.6.12 h27826a3_0 conda-forge tzdata 2022f h191b570_0 conda-forge wheel 0.38.4 pyhd8ed1ab_0 conda-forge xarray 2022.11.0 pyhd8ed1ab_0 conda-forge xz 5.2.6 h166bdaf_0 conda-forge zlib 1.2.13 h166bdaf_4 conda-forge ```
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1437481995 https://github.com/pydata/xarray/issues/7259#issuecomment-1320945794,https://api.github.com/repos/pydata/xarray/issues/7259,1320945794,IC_kwDOAMm_X85OvAiC,90008,2022-11-19T18:51:01Z,2022-11-19T18:51:01Z,CONTRIBUTOR,"It is also reproducible on binder: ![image](https://user-images.githubusercontent.com/90008/202866783-4c23e93c-2813-43bc-8dd3-2e047ac55dcc.png) It seems that the binder uses conda-forge, which is why i'm commenting here. It is really strange in the sense that xarray doesn't compile anything. https://github.com/conda-forge/xarray-feedstock/blob/main/recipe/meta.yaml#L16 So it must be something that gets lazy loaded that triggers things.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1437481995 https://github.com/pydata/xarray/issues/2799#issuecomment-1306327743,https://api.github.com/repos/pydata/xarray/issues/2799,1306327743,IC_kwDOAMm_X85N3Pq_,90008,2022-11-07T22:45:07Z,2022-11-07T22:45:07Z,CONTRIBUTOR,"As I've been recently going down this performance rabbit hole, I think the discussion around https://github.com/pydata/xarray/issues/7045 is relevant and provides some additional historical context as to ""why"" this performance penalty might be happening.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,416962458 https://github.com/pydata/xarray/issues/7245#issuecomment-1300527716,https://api.github.com/repos/pydata/xarray/issues/7245,1300527716,IC_kwDOAMm_X85NhHpk,90008,2022-11-02T14:27:04Z,2022-11-02T14:27:04Z,CONTRIBUTOR,"While the above ""fix"" addresses the issues with renaming coordinates, I think there are plenty of usecases where we would still end up with strange, or unexpected results. For example. 1. Load a dataset with many non-indexing coordinates. 2. Dropping variables (that happen to be coordinates). 3. Then adding back a variable with the same name. 4. Upon save, encoding would dictate that it is a coordinate of a particular variable and will promote it to a coordinate instead of data. We could apply the ""fix"" to the `drop_vars` method as well, but I think it may be hard (though not impossible) to hit all the cases. I think a more ""generic"", albeit breaking"" fix would be to remove the ""`coordinates`"" entirely from encoding after the dataset has been loaded. That said, this only ""works"" if `dataset['variable_name'].encoding['coordinates']` is considered a private variable. 
That is, users are not supposed to be adding to it at will.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1432388736 https://github.com/pydata/xarray/issues/7245#issuecomment-1299492524,https://api.github.com/repos/pydata/xarray/issues/7245,1299492524,IC_kwDOAMm_X85NdK6s,90008,2022-11-02T02:49:58Z,2022-11-02T02:57:37Z,CONTRIBUTOR,"And if you want to have a clean encoding dictionary, you may want to do the following:

```python
names = set(names)
for _, variable in obj._variables.items():
    if 'coordinates' in variable.encoding:
        coords_in_encoding = set(variable.encoding.get('coordinates').split(' '))
        remaining_coords = coords_in_encoding - names
        if len(remaining_coords) == 0:
            del variable.encoding['coordinates']
        else:
            variable.encoding['coordinates'] = ' '.join(remaining_coords)
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1432388736 https://github.com/pydata/xarray/issues/7239#issuecomment-1299369449,https://api.github.com/repos/pydata/xarray/issues/7239,1299369449,IC_kwDOAMm_X85Ncs3p,90008,2022-11-01T23:54:07Z,2022-11-01T23:54:07Z,CONTRIBUTOR,"I think these are good alternatives. From my experiments (and I'm still trying to create a minimal reproduction that shows the real problem behind the slowdowns), reindexing can be quite expensive. We used to have many coordinates (to ensure that critical metadata stays with data_variables) and those coordinates were causing slowdowns on reindexing operations. Thus the two calls `update` and `expand_dims` might cause two reindex merges to occur. However, for this particular issue, I think that documenting the strategies proposed in the docstring is good enough. I have a feeling that if one can get to the bottom of 7224, the performance concerns here will be mitigated too. We can leave the performance discussion to: https://github.com/pydata/xarray/issues/7224","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1429172192 https://github.com/pydata/xarray/pull/7238#issuecomment-1296269381,https://api.github.com/repos/pydata/xarray/issues/7238,1296269381,IC_kwDOAMm_X85NQ4BF,90008,2022-10-30T14:10:23Z,2022-10-30T14:10:23Z,CONTRIBUTOR,"> Hmm...I was kind of hoping we could avoid something like adding a _stacklevel_increment argument.

Right, thank you for finding that example. I was going to try to construct one.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1428748922 https://github.com/pydata/xarray/issues/7224#issuecomment-1296006560,https://api.github.com/repos/pydata/xarray/issues/7224,1296006560,IC_kwDOAMm_X85NP32g,90008,2022-10-29T22:39:39Z,2022-10-29T22:39:39Z,CONTRIBUTOR,xref: https://github.com/pandas-dev/pandas/pull/49393,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1423948375 https://github.com/pydata/xarray/issues/7224#issuecomment-1296006402,https://api.github.com/repos/pydata/xarray/issues/7224,1296006402,IC_kwDOAMm_X85NP30C,90008,2022-10-29T22:39:01Z,2022-10-29T22:39:01Z,CONTRIBUTOR,"Ok, I don't think I have the right tools to really get to the bottom of this. The Spyder profiler just seems to slow down code too much. 
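A sampling profiler might add less overhead; for example py-spy (with `repro.py` standing in for an actual reproduction script):

```bash
py-spy record -o profile.svg -- python repro.py
```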
Any other tools to recommend?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1423948375 https://github.com/pydata/xarray/pull/7236#issuecomment-1295999237,https://api.github.com/repos/pydata/xarray/issues/7236,1295999237,IC_kwDOAMm_X85NP2EF,90008,2022-10-29T22:11:33Z,2022-10-29T22:11:33Z,CONTRIBUTOR,"Well, now the benchmarks look like they make more sense:

```
[ 50.00%] ··· ==================== ========== ========== ========== ========== ==========
              --                                        count
              -------------------- ------------------------------------------------------
                    strategy           0          1          10        100       1000
              ==================== ========== ========== ========== ========== ==========
               dict_of_DataArrays   1.56±0ms   3.60±0ms   5.83±0ms   16.6±0ms   67.3±0ms
               dict_of_Variables    1.65±0ms   3.11±0ms   4.03±0ms   6.22±0ms   18.9±0ms
                dict_of_Tuples      2.42±0ms   3.11±0ms   982±0μs    5.17±0ms   17.2±0ms
              ==================== ========== ========== ========== ========== ==========
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1428274982 https://github.com/pydata/xarray/pull/7236#issuecomment-1295937569,https://api.github.com/repos/pydata/xarray/issues/7236,1295937569,IC_kwDOAMm_X85NPnAh,90008,2022-10-29T18:58:35Z,2022-10-29T18:58:35Z,CONTRIBUTOR,"```
[ 50.00%] ··· ==================== ========== ========== ========== ========== ==========
              --                                        count
              -------------------- ------------------------------------------------------
                    strategy           0          1          10        100       1000
              ==================== ========== ========== ========== ========== ==========
               dict_of_DataArrays   1.65±0ms   3.83±0ms   4.03±0ms   6.14±0ms   16.6±0ms
               dict_of_Variables    3.04±0ms   3.24±0ms   3.38±0ms   4.04±0ms   9.91±0ms
                dict_of_Tuples      2.90±0ms   3.03±0ms   3.32±0ms   3.22±0ms   3.22±0ms
              ==================== ========== ========== ========== ========== ==========
```

As you thought, the numbers improve quite a bit. I kinda want to understand why a no-op takes 1 ms! 
^_^","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1428274982 https://github.com/pydata/xarray/pull/7236#issuecomment-1295937364,https://api.github.com/repos/pydata/xarray/issues/7236,1295937364,IC_kwDOAMm_X85NPm9U,90008,2022-10-29T18:57:54Z,2022-10-29T18:57:54Z,CONTRIBUTOR,"What about just specifying ""dims""?","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1428274982 https://github.com/pydata/xarray/pull/7236#issuecomment-1295905591,https://api.github.com/repos/pydata/xarray/issues/7236,1295905591,IC_kwDOAMm_X85NPfM3,90008,2022-10-29T17:11:30Z,2022-10-29T17:11:30Z,CONTRIBUTOR,"With the right window size it looks like: ``` [ 50.00%] 路路路 ==================== ========== ========== ========== ========== ========== -- count -------------------- ------------------------------------------------------ strategy 0 1 10 100 1000 ==================== ========== ========== ========== ========== ========== dict_of_DataArrays 1.32卤0ms 5.87卤0ms 7.58卤0ms 18.7卤0ms 98.6卤0ms dict_of_Variables 2.70卤0ms 2.91卤0ms 3.01卤0ms 3.91卤0ms 7.04卤0ms dict_of_Tuples 2.84卤0ms 3.02卤0ms 3.22卤0ms 3.42卤0ms 3.02卤0ms ==================== ========== ========== ========== ========== ========== ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1428274982 https://github.com/pydata/xarray/pull/7236#issuecomment-1295852860,https://api.github.com/repos/pydata/xarray/issues/7236,1295852860,IC_kwDOAMm_X85NPSU8,90008,2022-10-29T14:28:25Z,2022-10-29T14:28:25Z,CONTRIBUTOR,"On the CI, it reports similar findings: ``` [ 67.73%] 路路路 ...dVariable.time_dict_of_dataarrays_to_dataset ok [ 67.73%] 路路路 =================== ============= existing_elements ------------------- ------------- 0 269卤0.9渭s 10 2.21卤0.01ms 100 16.5卤0.07ms 1000 153卤0.9ms =================== ============= [ 67.88%] 路路路 ...etAddVariable.time_dict_of_tuples_to_dataset ok [ 67.88%] 路路路 =================== =========== existing_elements ------------------- ----------- 0 269卤0.6渭s 10 289卤0.4渭s 100 293卤1渭s 1000 346卤0.4渭s =================== =========== [ 68.02%] 路路路 ...ddVariable.time_dict_of_variables_to_dataset ok [ 68.02%] 路路路 =================== ============= existing_elements ------------------- ------------- 0 270卤1渭s 10 329卤0.6渭s 100 636卤1渭s 1000 3.70卤0.01ms =================== ============= [ 68.17%] 路路路 ...e.DatasetAddVariable.time_merge_two_datasets ok [ 68.17%] 路路路 =================== ============= existing_elements ------------------- ------------- 0 104卤0.5渭s 10 235卤0.6渭s 100 1.05卤0ms 1000 9.02卤0.02ms =================== ============= [ 68.31%] 路路路 ...e.DatasetAddVariable.time_variable_insertion ok [ 68.31%] 路路路 =================== ============= existing_elements ------------------- ------------- 0 119卤1渭s 10 225卤0.7渭s 100 1.04卤0ms 1000 9.03卤0.03ms =================== ============= ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1428274982 https://github.com/pydata/xarray/pull/7236#issuecomment-1295843798,https://api.github.com/repos/pydata/xarray/issues/7236,1295843798,IC_kwDOAMm_X85NPQHW,90008,2022-10-29T13:55:33Z,2022-10-29T13:55:33Z,CONTRIBUTOR,"``` $ asv run -E existing --quick --bench merge 路 Discovering benchmarks 路 Running 5 total benchmarks (1 commits * 1 environments * 5 benchmarks) [ 0.00%] 路路 Benchmarking 
existing-py_home_mark_mambaforge_envs_mcam_dev_bin_python [ 10.00%] 路路路 merge.DatasetAddVariable.time_dict_of_dataarrays_to_dataset ok [ 10.00%] 路路路 =================== ========== existing_elements ------------------- ---------- 0 762卤0渭s 10 7.18卤0ms 100 12.6卤0ms 1000 89.1卤0ms =================== ========== [ 20.00%] 路路路 merge.DatasetAddVariable.time_dict_of_tuples_to_dataset ok [ 20.00%] 路路路 =================== ========== existing_elements ------------------- ---------- 0 889卤0渭s 10 2.01卤0ms 100 1.34卤0ms 1000 605卤0渭s =================== ========== [ 30.00%] 路路路 merge.DatasetAddVariable.time_dict_of_variables_to_dataset ok [ 30.00%] 路路路 =================== ========== existing_elements ------------------- ---------- 0 2.48卤0ms 10 2.06卤0ms 100 2.13卤0ms 1000 2.38卤0ms =================== ========== [ 40.00%] 路路路 merge.DatasetAddVariable.time_merge_two_datasets ok [ 40.00%] 路路路 =================== ========== existing_elements ------------------- ---------- 0 814卤0渭s 10 945卤0渭s 100 2.42卤0ms 1000 5.23卤0ms =================== ========== [ 50.00%] 路路路 merge.DatasetAddVariable.time_variable_insertion ok [ 50.00%] 路路路 =================== ========== existing_elements ------------------- ---------- 0 1.10卤0ms 10 954卤0渭s 100 1.88卤0ms 1000 5.29卤0ms =================== ========== ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1428274982 https://github.com/pydata/xarray/pull/7179#issuecomment-1295257627,https://api.github.com/repos/pydata/xarray/issues/7179,1295257627,IC_kwDOAMm_X85NNBAb,90008,2022-10-28T17:21:40Z,2022-10-28T17:21:40Z,CONTRIBUTOR,Exciting improvements on usability for the next version!,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1412019155 https://github.com/pydata/xarray/pull/7222#issuecomment-1293689561,https://api.github.com/repos/pydata/xarray/issues/7222,1293689561,IC_kwDOAMm_X85NHCLZ,90008,2022-10-27T15:15:45Z,2022-10-27T15:15:45Z,CONTRIBUTOR,"> but that would be a lot of work especially for such a critical piece of code in Xarray. Agreed. I'll take the small wins where I can :D. Great! I think this will be a good addition with: https://github.com/pydata/xarray/pull/7223#discussion_r1007023769","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1423321834 https://github.com/pydata/xarray/pull/7223#issuecomment-1292299499,https://api.github.com/repos/pydata/xarray/issues/7223,1292299499,IC_kwDOAMm_X85NBuzr,90008,2022-10-26T16:24:56Z,2022-10-26T16:24:56Z,CONTRIBUTOR,ok naming is always hard. I tried to pick a good name.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1423916687 https://github.com/pydata/xarray/pull/7221#issuecomment-1291948502,https://api.github.com/repos/pydata/xarray/issues/7221,1291948502,IC_kwDOAMm_X85NAZHW,90008,2022-10-26T12:19:49Z,2022-10-26T12:23:46Z,CONTRIBUTOR,"I know it is not comparable, but I was really curious what ""dictionary insertion"" costs, in order to be able to understand if my comparisons were fair:
code
```python
from tqdm import tqdm
import xarray as xr
from time import perf_counter
import numpy as np

N = 1000

# Everybody is lazy loading now, so let's force modules to get instantiated
dummy_dataset = xr.Dataset()
dummy_dataset['a'] = 1
dummy_dataset['b'] = 1
del dummy_dataset

time_elapsed = np.zeros(N)

# dataset = xr.Dataset()
dataset = {}
for i in tqdm(range(N)):
# for i in range(N):
    time_start = perf_counter()
    dataset[f""var{i}""] = i
    time_end = perf_counter()
    time_elapsed[i] = time_end - time_start

# %%
from matplotlib import pyplot as plt
plt.plot(np.arange(N), time_elapsed * 1E6, label='Time to add one variable')
plt.xlabel(""Number of existing variables"")
plt.ylabel(""Time to add a variable (us)"")
plt.ylim([0, 10])
plt.title(""Dictionary insertion"")
plt.grid(True)
```
![image](https://user-images.githubusercontent.com/90008/198024147-0965787a-32be-409b-959c-1b87adbc633a.png) I think xarray gives me three orders of magnitude of ""thinking"" benefit, so I'll take it!

```
python --version
Python 3.9.13
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1423312198 https://github.com/pydata/xarray/pull/7221#issuecomment-1291894024,https://api.github.com/repos/pydata/xarray/issues/7221,1291894024,IC_kwDOAMm_X85NAL0I,90008,2022-10-26T11:32:32Z,2022-10-26T11:32:32Z,CONTRIBUTOR,"Ok. I'll want to rethink them. I know it looks like quadratic time, but I really would like to test n=1000, and I have an idea","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1423312198 https://github.com/pydata/xarray/pull/7221#issuecomment-1291450556,https://api.github.com/repos/pydata/xarray/issues/7221,1291450556,IC_kwDOAMm_X85M-fi8,90008,2022-10-26T03:32:53Z,2022-10-26T03:32:53Z,CONTRIBUTOR,"I'm somewhat confused, I can run the benchmark locally:

```
[ 1.80%] ··· dataset_creation.Creation.time_dataset_creation  4.37±0s
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1423312198 https://github.com/pydata/xarray/pull/7221#issuecomment-1291447746,https://api.github.com/repos/pydata/xarray/issues/7221,1291447746,IC_kwDOAMm_X85M-e3C,90008,2022-10-26T03:27:36Z,2022-10-26T03:27:36Z,CONTRIBUTOR,":/ Not fun, the benchmark is failing. Not sure why.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1423312198 https://github.com/pydata/xarray/pull/7222#issuecomment-1291405225,https://api.github.com/repos/pydata/xarray/issues/7222,1291405225,IC_kwDOAMm_X85M-Uep,90008,2022-10-26T02:19:23Z,2022-10-26T02:19:23Z,CONTRIBUTOR,"I think the rapid return, which helps by about 40%, is still pretty good.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1423321834 https://github.com/pydata/xarray/pull/7222#issuecomment-1291402576,https://api.github.com/repos/pydata/xarray/issues/7222,1291402576,IC_kwDOAMm_X85M-T1Q,90008,2022-10-26T02:17:45Z,2022-10-26T02:17:45Z,CONTRIBUTOR,Hmm ok. It seems I can't blatantly avoid the copy like that.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1423321834 https://github.com/pydata/xarray/pull/7221#issuecomment-1291399714,https://api.github.com/repos/pydata/xarray/issues/7221,1291399714,IC_kwDOAMm_X85M-TIi,90008,2022-10-26T02:14:40Z,2022-10-26T02:14:40Z,CONTRIBUTOR,"> Would be interesting to see whether this was covered by our existing asv benchmarks.

I wasn't able to find something that really benchmarked ""large"" datasets.

> Would be a good benchmark to add if we don't have one already.

Added one. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1423312198 https://github.com/pydata/xarray/pull/7222#issuecomment-1291391647,https://api.github.com/repos/pydata/xarray/issues/7222,1291391647,IC_kwDOAMm_X85M-RKf,90008,2022-10-26T02:03:41Z,2022-10-26T02:03:41Z,CONTRIBUTOR,"The reason this is a separate merge request is that I agree that this is more contentious as a change. 
However, I will argue that `Aligner` should really not be a class. Using ripgrep, you find that the only instances of Aligner exist internally:

```
xarray/core/dataset.py
2775:        aligner: alignment.Aligner,
2783:        """"""Callback called from ``Aligner`` to create a new reindexed Dataset.""""""

xarray/core/alignment.py
107:class Aligner(Generic[DataAlignable]):
114:        aligner = Aligner(*objects, **kwargs)  <------- Example
767:    aligner = Aligner(  <----------- Used and consumed for the method `align`
881:    aligner = Aligner(  <----------- Used and consumed for the method `reindex`
909:    # This check is not performed in Aligner.

xarray/core/dataarray.py
1752:        aligner: alignment.Aligner,
1760:        """"""Callback called from ``Aligner`` to create a new reindexed DataArray.""""""
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1423321834 https://github.com/pydata/xarray/pull/7221#issuecomment-1291389702,https://api.github.com/repos/pydata/xarray/issues/7221,1291389702,IC_kwDOAMm_X85M-QsG,90008,2022-10-26T01:59:57Z,2022-10-26T01:59:57Z,CONTRIBUTOR,"> out of interest, how did you find this?

Spyder profiler","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1423312198 https://github.com/pydata/xarray/pull/7172#issuecomment-1281117607,https://api.github.com/repos/pydata/xarray/issues/7172,1281117607,IC_kwDOAMm_X85MXE2n,90008,2022-10-17T16:11:37Z,2022-10-17T16:11:37Z,CONTRIBUTOR,"Thank you all for taking the time to study, and worry about, these improvements. Now I have to figure out how my software went from 2 sec loading time to 12 ;) Totally unrelated to this. But one day I'll have benchmarking in place to monitor it :D.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1410575877 https://github.com/pydata/xarray/pull/7172#issuecomment-1280208522,https://api.github.com/repos/pydata/xarray/issues/7172,1280208522,IC_kwDOAMm_X85MTm6K,90008,2022-10-17T02:59:41Z,2022-10-17T02:59:41Z,CONTRIBUTOR,"> Separate issue, but do these need to be imported into xarray/__init__.py

At this point removing `testing` and `tutorial` would be strange and break things. Stefan, in the discussion linked above, speaks about the reasoning behind importing submodules in the top level namespace.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1410575877 https://github.com/pydata/xarray/issues/6726#issuecomment-1280072309,https://api.github.com/repos/pydata/xarray/issues/6726,1280072309,IC_kwDOAMm_X85MTFp1,90008,2022-10-16T22:33:17Z,2022-10-16T22:33:17Z,CONTRIBUTOR,"In developing https://github.com/pydata/xarray/pull/7172, there are also some places where class types are used to check for features: https://github.com/pydata/xarray/blob/main/xarray/core/pycompat.py#L35 Dask and sparse are big contributors due to their need to resolve the class name in question. Ultimately, I think it is important to maybe constrain the problem. Are we ok with 100 ms over numpy + pandas? 20 ms? On my machines, the 0.5 s that xarray is close to seems long... but every time I look at it, it seems to ""just be a python problem"". 
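For what it's worth, one way to quantify that overhead is CPython's built-in import profiler:

```bash
python -X importtime -c 'import xarray' 2> xarray_imports.log
```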
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1284475176 https://github.com/pydata/xarray/issues/6791#issuecomment-1185946473,https://api.github.com/repos/pydata/xarray/issues/6791,1185946473,IC_kwDOAMm_X85GsBtp,90008,2022-07-15T21:11:19Z,2022-09-12T22:48:50Z,CONTRIBUTOR,"I guess the code: ```python import xarray as xr dataset = xr.Dataset() my_variable = np.asarray(dataset.get('my_variable', np.asarray(1.0))) ``` coerces things as an array. Talking things out made me find this one. Though it doesn't read very well. Feel free to close.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1306457778 https://github.com/pydata/xarray/pull/6910#issuecomment-1213089164,https://api.github.com/repos/pydata/xarray/issues/6910,1213089164,IC_kwDOAMm_X85ITkWM,90008,2022-08-12T13:04:19Z,2022-08-12T13:04:19Z,CONTRIBUTOR,Are the functions you are considering using this functions that never had keyword arguments before? When I wrote a similar decorator before i had an explicit list of arguments that were allowed to be converted.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1337166287 https://github.com/pydata/xarray/issues/5531#issuecomment-1213031961,https://api.github.com/repos/pydata/xarray/issues/5531,1213031961,IC_kwDOAMm_X85ITWYZ,90008,2022-08-12T11:53:07Z,2022-08-12T11:53:07Z,CONTRIBUTOR,"These decorators are kinda fun to write and are quite taylored to a certain release philosophy. It might be warranted to just write your own ;)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,929840699 https://github.com/pydata/xarray/issues/6791#issuecomment-1186019342,https://api.github.com/repos/pydata/xarray/issues/6791,1186019342,IC_kwDOAMm_X85GsTgO,90008,2022-07-15T23:23:30Z,2022-07-15T23:23:30Z,CONTRIBUTOR,Interesting.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1306457778 https://github.com/pydata/xarray/issues/5531#issuecomment-1102962705,https://api.github.com/repos/pydata/xarray/issues/5531,1102962705,IC_kwDOAMm_X85BveAR,90008,2022-04-19T18:34:07Z,2022-04-19T18:34:07Z,CONTRIBUTOR,"I think in my readme i suggest vedoring the code. Happy to give you a license for it so you don't need to credit me in addition to your own license.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,929840699 https://github.com/pydata/xarray/issues/6309#issuecomment-1094086518,https://api.github.com/repos/pydata/xarray/issues/6309,1094086518,IC_kwDOAMm_X85BNm92,90008,2022-04-09T17:06:13Z,2022-04-09T17:06:13Z,CONTRIBUTOR,"@max-sixty unfortunately, I think the way hdf5 is designed, it doesn't try to be too smart about what would be the best fine tuning for your particular system. In some ways, this is the correct approach. The current constructor pathway: https://github.com/pydata/xarray/blob/main/xarray/backends/h5netcdf_.py#L164 Doesn't provide a user with a catchall-kwargs. I think this would be an acceptable solution. 
I should say that the performance of the direct driver is terrible without aligned data: https://github.com/Unidata/netcdf-c/pull/2206#issuecomment-1054855769","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1152047670 https://github.com/pydata/xarray/issues/6309#issuecomment-1052386013,https://api.github.com/repos/pydata/xarray/issues/6309,1052386013,IC_kwDOAMm_X84-uiLd,90008,2022-02-26T17:57:33Z,2022-02-26T17:57:33Z,CONTRIBUTOR,"I have to elaborate that this may be even more important for users that READ the data back a lot. Reading with the standard xarray operations hits other limits, but one limit it definitely hits is that of the HDF5 driver used. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1152047670 https://github.com/pydata/xarray/pull/6154#issuecomment-1009823872,https://api.github.com/repos/pydata/xarray/issues/6154,1009823872,IC_kwDOAMm_X848MLCA,90008,2022-01-11T10:28:51Z,2022-01-11T10:28:51Z,CONTRIBUTOR,Thanks for merging so quickly,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1098924491 https://github.com/pydata/xarray/issues/6153#issuecomment-1009820092,https://api.github.com/repos/pydata/xarray/issues/6153,1009820092,IC_kwDOAMm_X848MKG8,90008,2022-01-11T10:24:37Z,2022-01-11T10:24:37Z,CONTRIBUTOR,"Thank you @kmuehlbauer for the explicit PR link. I do plan on adding alignment features to h5py and then bringing them toward h5netcdf. So I think something like this will be useful in the future. Feature request link: https://github.com/h5py/h5py/issues/2034","{""total_count"": 2, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 1}",,1098915891 https://github.com/pydata/xarray/pull/6154#issuecomment-1009802137,https://api.github.com/repos/pydata/xarray/issues/6154,1009802137,IC_kwDOAMm_X848MFuZ,90008,2022-01-11T10:14:09Z,2022-01-11T10:14:09Z,CONTRIBUTOR,"ImportError is a superset of ModuleNotFoundError. https://github.com/python/cpython/blob/f4c03484da59049eb62a9bf7777b963e2267d187/Lib/test/exception_hierarchy.txt#L19 So it depends on what question you care about asking:
1. Does the python module exist? You should test for `ModuleNotFoundError`.
2. Is the package usable? You should probably test for `ImportError`.

I think question 2 is friendlier to xarray users.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1098924491 https://github.com/pydata/xarray/issues/2347#issuecomment-1008227895,https://api.github.com/repos/pydata/xarray/issues/2347,1008227895,IC_kwDOAMm_X848GFY3,90008,2022-01-09T04:28:49Z,2022-01-09T04:28:49Z,CONTRIBUTOR,This is likely true. Thanks for looking back into this.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,347962055 https://github.com/pydata/xarray/issues/2799#issuecomment-786813358,https://api.github.com/repos/pydata/xarray/issues/2799,786813358,MDEyOklzc3VlQ29tbWVudDc4NjgxMzM1OA==,90008,2021-02-26T18:19:28Z,2021-02-26T18:19:28Z,CONTRIBUTOR,"I hope the following can help users that struggle with the speed of xarray: I've found that when doing numerical computation, I often use xarray to grab all the metadata relevant to my computation. 
Scale, chromaticity, experimental information. Eventually, I create a function that acts as a barrier:
- Xarray input (high-level experimental data)
- Computation parameters output (low-level information relevant to implementation details)

The low-level implementation can operate on the fast numpy arrays. I've found this to be the struggle with creating high-level APIs that do things like sanitize inputs (xarray routines like `_validate_indexers` and `_broadcast_indexes`) and low-level APIs that are simply interested in moving and computing data. For the example that @nbren12 brought up originally, it might be better to create xarray routines (if they don't exist already) that can create fast iterators over the underlying numpy arrays given a set of dimensions that the user cares about.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,416962458 https://github.com/pydata/xarray/pull/4400#issuecomment-735759416,https://api.github.com/repos/pydata/xarray/issues/4400,735759416,MDEyOklzc3VlQ29tbWVudDczNTc1OTQxNg==,90008,2020-11-30T12:33:33Z,2020-11-30T12:33:33Z,CONTRIBUTOR,"I think you should be able to define your own custom encoder if you want it to be a datetime. But inevitably, you will have to define your own save and load functions. Python, by definition of being such a loose language, allows you to do things that the original developers never really imagined. This can sometimes lead to silent corruption, like the one you've experienced.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,690546795 https://github.com/pydata/xarray/issues/1672#issuecomment-735428830,https://api.github.com/repos/pydata/xarray/issues/1672,735428830,MDEyOklzc3VlQ29tbWVudDczNTQyODgzMA==,90008,2020-11-29T17:34:44Z,2020-11-29T17:35:04Z,CONTRIBUTOR,"It isn't really part of any library. I don't really have plans of making it into a public library. I think the discussion is really around the xarray API, and what functions to implement at first. Then somebody can take the code and integrate it into the decided-upon API.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,269700511 https://github.com/pydata/xarray/pull/4400#issuecomment-735428578,https://api.github.com/repos/pydata/xarray/issues/4400,735428578,MDEyOklzc3VlQ29tbWVudDczNTQyODU3OA==,90008,2020-11-29T17:32:37Z,2020-11-29T17:32:37Z,CONTRIBUTOR,"Yeah, I'm not too sure. I think the idea is that this breaks compatibility with netcdf times, so the resulting file is thus not standard. For my application, µs timing is enough.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,690546795 https://github.com/pydata/xarray/issues/1672#issuecomment-685222909,https://api.github.com/repos/pydata/xarray/issues/1672,685222909,MDEyOklzc3VlQ29tbWVudDY4NTIyMjkwOQ==,90008,2020-09-02T01:17:05Z,2020-09-02T01:17:05Z,CONTRIBUTOR,"Small prototype, but maybe it can help boost the development.
```python
import netCDF4
import xarray as xr


def _expand_variable(nc_variable, data, expanding_dim, nc_shape, added_size):
    # For time deltas, we must ensure that we use the same encoding as
    # what was previously stored.
    # We likely need to do this as well for variables that had custom
    # encodings too
    if hasattr(nc_variable, 'calendar'):
        data.encoding = {
            'units': nc_variable.units,
            'calendar': nc_variable.calendar,
        }
    data_encoded = xr.conventions.encode_cf_variable(data)  # , name=name)
    left_slices = data.dims.index(expanding_dim)
    right_slices = data.ndim - left_slices - 1
    nc_slice = (slice(None),) * left_slices + (slice(nc_shape, nc_shape + added_size),) + (slice(None),) * (right_slices)
    nc_variable[nc_slice] = data_encoded.data


def append_to_netcdf(filename, ds_to_append, unlimited_dims):
    if isinstance(unlimited_dims, str):
        unlimited_dims = [unlimited_dims]

    if len(unlimited_dims) != 1:
        # TODO: change this so it can support multiple expanding dims
        raise ValueError(
            ""We only support one unlimited dim for now, ""
            f""got {len(unlimited_dims)}."")

    unlimited_dims = list(set(unlimited_dims))
    expanding_dim = unlimited_dims[0]

    with netCDF4.Dataset(filename, mode='a') as nc:
        nc_dims = set(nc.dimensions.keys())

        nc_coord = nc[expanding_dim]
        nc_shape = len(nc_coord)

        added_size = len(ds_to_append[expanding_dim])
        variables, attrs = xr.conventions.encode_dataset_coordinates(ds_to_append)

        for name, data in variables.items():
            if expanding_dim not in data.dims:
                # Nothing to do, data assumed to be identical
                continue

            nc_variable = nc[name]
            _expand_variable(nc_variable, data, expanding_dim, nc_shape, added_size)


from xarray.tests.test_dataset import create_append_test_data
from xarray.testing import assert_equal

ds, ds_to_append, ds_with_new_var = create_append_test_data()

filename = 'test_dataset.nc'
ds.to_netcdf(filename, mode='w', unlimited_dims=['time'])
append_to_netcdf('test_dataset.nc', ds_to_append, unlimited_dims='time')

loaded = xr.load_dataset('test_dataset.nc')
assert_equal(xr.concat([ds, ds_to_append], dim=""time""), loaded)
```
","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,269700511 https://github.com/pydata/xarray/issues/4183#issuecomment-685200043,https://api.github.com/repos/pydata/xarray/issues/4183,685200043,MDEyOklzc3VlQ29tbWVudDY4NTIwMDA0Mw==,90008,2020-09-02T00:13:30Z,2020-09-02T00:13:30Z,CONTRIBUTOR,"i ran into this problem trying to round trip time to the nanosecond (even though i don't need it, sub micro second would be nice) but unfrotunately, you run into the fact that cftime doesn't support nanoseconds https://github.com/Unidata/cftime/blob/master/cftime/_cftime.pyx Seems like they discussed a nanosecond issue a while back too https://github.com/Unidata/cftime/issues/77 Their ultimate point was that there was little point in having precision down to the nano second given that python datetime objects only have microseconds. I guess they are right.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,646038170 https://github.com/pydata/xarray/issues/1672#issuecomment-684833575,https://api.github.com/repos/pydata/xarray/issues/1672,684833575,MDEyOklzc3VlQ29tbWVudDY4NDgzMzU3NQ==,90008,2020-09-01T12:58:52Z,2020-09-01T12:58:52Z,CONTRIBUTOR,"I think I got a basic prototype working. That said, I think a real challenge lies in supporting the numerous backends and lazy arrays. For example, I was only able to add data in peculiar fashions using the netcdf4 library which may trigger complex computations many times. Is this a use case that we must optimize for now?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,269700511 https://github.com/pydata/xarray/pull/4395#issuecomment-684064522,https://api.github.com/repos/pydata/xarray/issues/4395,684064522,MDEyOklzc3VlQ29tbWVudDY4NDA2NDUyMg==,90008,2020-08-31T21:59:28Z,2020-08-31T21:59:28Z,CONTRIBUTOR,"I'm not too sure about this anymore. with the way the test is written now, it is unclear to me if the store should be closed afterward. I'm also unsure of how to deal with the case where the user passed it a ZipStore instead of a string. Will have to keep thinking.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,689502005 https://github.com/pydata/xarray/issues/2803#issuecomment-680060278,https://api.github.com/repos/pydata/xarray/issues/2803,680060278,MDEyOklzc3VlQ29tbWVudDY4MDA2MDI3OA==,90008,2020-08-25T14:29:18Z,2020-08-25T14:29:18Z,CONTRIBUTOR,"Sorry for noise. It seems that 1D arrays are still supported. I still had a 2D array lingering in my codebase.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,417542619 https://github.com/pydata/xarray/issues/2803#issuecomment-679399158,https://api.github.com/repos/pydata/xarray/issues/2803,679399158,MDEyOklzc3VlQ29tbWVudDY3OTM5OTE1OA==,90008,2020-08-24T22:31:09Z,2020-08-24T22:31:09Z,CONTRIBUTOR,"With the netcdf4 back end, I'm not able to save a 1D attr dataset. 
I can save my dataset with the h5netcdf backend.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,417542619
https://github.com/pydata/xarray/issues/2803#issuecomment-679348131,https://api.github.com/repos/pydata/xarray/issues/2803,679348131,MDEyOklzc3VlQ29tbWVudDY3OTM0ODEzMQ==,90008,2020-08-24T20:26:49Z,2020-08-24T20:26:49Z,CONTRIBUTOR,"Sorry for posting on such an old thread. Are `attrs` supposed to support 1D arrays?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,417542619
https://github.com/pydata/xarray/pull/3888#issuecomment-604220931,https://api.github.com/repos/pydata/xarray/issues/3888,604220931,MDEyOklzc3VlQ29tbWVudDYwNDIyMDkzMQ==,90008,2020-03-26T04:23:05Z,2020-03-26T04:23:05Z,CONTRIBUTOR,"xfail just gets forgotten, so I'll leave it for now.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,587398134
https://github.com/pydata/xarray/issues/3815#issuecomment-604181264,https://api.github.com/repos/pydata/xarray/issues/3815,604181264,MDEyOklzc3VlQ29tbWVudDYwNDE4MTI2NA==,90008,2020-03-26T01:49:45Z,2020-03-26T01:49:45Z,CONTRIBUTOR,"Actually, zarr provides a `data` argument in `create_dataset` that encounters the same bug:

```python
import zarr
import numpy as np

name = 'hello'
data = np.array('world', dtype='<U5')
```

```python
In [3]: import xarray as xr
   ...: import zarr
   ...: x = xr.Dataset()
   ...: x['hello'] = 'world'
   ...: x
   ...: with zarr.ZipStore('test_store.zip', mode='w') as store:
   ...:     x.to_zarr(store)
   ...: with zarr.ZipStore('test_store.zip', mode='r') as store:
   ...:     x_read = xr.open_zarr(store).compute()
   ...:
---------------------------------------------------------------------------
BadZipFile                                Traceback (most recent call last)
<ipython-input-3-...> in <module>
      7     x.to_zarr(store)
      8 with zarr.ZipStore('test_store.zip', mode='r') as store:
----> 9     x_read = xr.open_zarr(store).compute()
     10

~/miniconda3/envs/mcam_dev/lib/python3.7/site-packages/xarray/core/dataset.py in compute(self, **kwargs)
    805         """"""
    806         new = self.copy(deep=False)
--> 807         return new.load(**kwargs)
    808
    809     def _persist_inplace(self, **kwargs) -> ""Dataset"":

~/miniconda3/envs/mcam_dev/lib/python3.7/site-packages/xarray/core/dataset.py in load(self, **kwargs)
    657         for k, v in self.variables.items():
    658             if k not in lazy_data:
--> 659                 v.load()
    660
    661         return self

~/miniconda3/envs/mcam_dev/lib/python3.7/site-packages/xarray/core/variable.py in load(self, **kwargs)
    373             self._data = as_compatible_data(self._data.compute(**kwargs))
    374         elif not hasattr(self._data, ""__array_function__""):
--> 375             self._data = np.asarray(self._data)
    376         return self
    377

~/miniconda3/envs/mcam_dev/lib/python3.7/site-packages/numpy/core/_asarray.py in asarray(a, dtype, order)
     83
     84     """"""
---> 85     return array(a, dtype, copy=False, order=order)
     86
     87

~/miniconda3/envs/mcam_dev/lib/python3.7/site-packages/xarray/core/indexing.py in __array__(self, dtype)
    555     def __array__(self, dtype=None):
    556         array = as_indexable(self.array)
--> 557         return np.asarray(array[self.key], dtype=None)
    558
    559     def transpose(self, order):

~/miniconda3/envs/mcam_dev/lib/python3.7/site-packages/xarray/backends/zarr.py in __getitem__(self, key)
     47         array = self.get_array()
     48         if isinstance(key, indexing.BasicIndexer):
---> 49             return array[key.tuple]
     50         elif isinstance(key, indexing.VectorizedIndexer):
     51             return array.vindex[
~/miniconda3/envs/mcam_dev/lib/python3.7/site-packages/zarr/core.py in __getitem__(self, selection)
    570
    571         fields, selection = pop_fields(selection)
--> 572         return self.get_basic_selection(selection, fields=fields)
    573
    574     def get_basic_selection(self, selection=Ellipsis, out=None, fields=None):

~/miniconda3/envs/mcam_dev/lib/python3.7/site-packages/zarr/core.py in get_basic_selection(self, selection, out, fields)
    693         if self._shape == ():
    694             return self._get_basic_selection_zd(selection=selection, out=out,
--> 695                                                 fields=fields)
    696         else:
    697             return self._get_basic_selection_nd(selection=selection, out=out,

~/miniconda3/envs/mcam_dev/lib/python3.7/site-packages/zarr/core.py in _get_basic_selection_zd(self, selection, out, fields)
    709             # obtain encoded data for chunk
    710             ckey = self._chunk_key((0,))
--> 711             cdata = self.chunk_store[ckey]
    712
    713         except KeyError:

~/miniconda3/envs/mcam_dev/lib/python3.7/site-packages/zarr/storage.py in __getitem__(self, key)
   1249         with self.mutex:
   1250             with self.zf.open(key) as f:  # will raise KeyError
-> 1251                 return f.read()
   1252
   1253     def __setitem__(self, key, value):

~/miniconda3/envs/mcam_dev/lib/python3.7/zipfile.py in read(self, n)
    914             self._offset = 0
    915         while not self._eof:
--> 916             buf += self._read1(self.MAX_N)
    917         return buf
    918

~/miniconda3/envs/mcam_dev/lib/python3.7/zipfile.py in _read1(self, n)
   1018         if self._left <= 0:
   1019             self._eof = True
-> 1020             self._update_crc(data)
   1021         return data
   1022

~/miniconda3/envs/mcam_dev/lib/python3.7/zipfile.py in _update_crc(self, newdata)
    946         # Check the CRC if we're at the end of the file
    947         if self._eof and self._running_crc != self._expected_crc:
--> 948             raise BadZipFile(""Bad CRC-32 for file %r"" % self.name)
    949
    950     def read1(self, n):

BadZipFile: Bad CRC-32 for file 'hello/0'
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,573577844
https://github.com/pydata/xarray/issues/3815#issuecomment-603190621,https://api.github.com/repos/pydata/xarray/issues/3815,603190621,MDEyOklzc3VlQ29tbWVudDYwMzE5MDYyMQ==,90008,2020-03-24T11:41:37Z,2020-03-24T11:41:37Z,CONTRIBUTOR,"My guess is that xarray might be trying to write to the store character by character? Otherwise, I'm not too sure.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,573577844
https://github.com/pydata/xarray/issues/2799#issuecomment-552652019,https://api.github.com/repos/pydata/xarray/issues/2799,552652019,MDEyOklzc3VlQ29tbWVudDU1MjY1MjAxOQ==,90008,2019-11-11T22:47:47Z,2019-11-11T22:47:47Z,CONTRIBUTOR,"Sure, I just wanted to note that this operation **should** be more or less constant time, as opposed to dependent on the size of the array. Somebody had mentioned it should increase with the size of the array.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,416962458
https://github.com/pydata/xarray/issues/2799#issuecomment-552619589,https://api.github.com/repos/pydata/xarray/issues/2799,552619589,MDEyOklzc3VlQ29tbWVudDU1MjYxOTU4OQ==,90008,2019-11-11T21:16:36Z,2019-11-11T21:16:36Z,CONTRIBUTOR,"Hmm, slicing should basically be a no-op. The fact that xarray makes it about 100x slower is a real killer. It seems from this conversation that it might be hard to work around:
```python
import xarray as xr
import numpy as np

n = np.zeros(shape=(1024, 1024))
x = xr.DataArray(n, dims=('y', 'x'))

the_slice = np.s_[256:512, 256:512]
%timeit n[the_slice]
%timeit x[the_slice]

186 ns ± 0.778 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
70.3 µs ± 593 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
```
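For what it's worth, the overhead looks per-call rather than per-element. A rough sketch (my own addition, using the stdlib `timeit` so it runs outside IPython):

```python
import timeit

import numpy as np
import xarray as xr

# Basic slicing returns a view, so the cost should stay roughly flat
# as the array grows; the time is dominated by xarray's indexing
# machinery, not by the number of elements.
for size in (256, 1024, 4096):
    arr = xr.DataArray(np.zeros((size, size)), dims=('y', 'x'))
    t = timeit.timeit(
        lambda: arr[size // 4:size // 2, size // 4:size // 2],
        number=10000,
    )
    print(size, t / 10000)
```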
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,416962458 https://github.com/pydata/xarray/issues/2347#issuecomment-451767431,https://api.github.com/repos/pydata/xarray/issues/2347,451767431,MDEyOklzc3VlQ29tbWVudDQ1MTc2NzQzMQ==,90008,2019-01-06T19:25:53Z,2019-01-06T19:25:53Z,CONTRIBUTOR,"mind blown!!!! thanks for that pointer I haven't touched my serialization code in a while, kinda scared to go back to it now, but I will keep that library in mind. I saw Zarr a while back, looks cool. I hope to see it grow.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,347962055 https://github.com/pydata/xarray/issues/2347#issuecomment-451765999,https://api.github.com/repos/pydata/xarray/issues/2347,451765999,MDEyOklzc3VlQ29tbWVudDQ1MTc2NTk5OQ==,90008,2019-01-06T19:06:53Z,2019-01-06T19:06:53Z,CONTRIBUTOR,"no need to be sorry. These two functions were easy enough for me to do myself in my own codebase. There are few issues that I've found doing this though. Mainly, I can't find a good way to serialize numpy arrays in a round-trippable fashion. It is difficult to get back `lists of arrays`, or `arrays of unit8`. I don't know if you have a good way to solvle this problem. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,347962055 https://github.com/pydata/xarray/issues/2251#issuecomment-416994400,https://api.github.com/repos/pydata/xarray/issues/2251,416994400,MDEyOklzc3VlQ29tbWVudDQxNjk5NDQwMA==,90008,2018-08-29T15:24:07Z,2018-08-29T15:24:07Z,CONTRIBUTOR,"@shoyer, @fmaussion thank you for your answers. I'm OK with this issue being closed. I'm no expert on netcdf4, so I don't know if I could express the issue in a concise manner there. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,335608017 https://github.com/pydata/xarray/pull/2344#issuecomment-410759337,https://api.github.com/repos/pydata/xarray/issues/2344,410759337,MDEyOklzc3VlQ29tbWVudDQxMDc1OTMzNw==,90008,2018-08-06T16:02:09Z,2018-08-06T16:02:09Z,CONTRIBUTOR,Thanks!,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,347712372 https://github.com/pydata/xarray/pull/2344#issuecomment-410575268,https://api.github.com/repos/pydata/xarray/issues/2344,410575268,MDEyOklzc3VlQ29tbWVudDQxMDU3NTI2OA==,90008,2018-08-06T02:55:12Z,2018-08-06T02:55:12Z,CONTRIBUTOR,"Maybe the issue that I am facing is that I want to deal with the storage of my metadata and data seperately. I used to have my own library that was replicating much of xarray's functionality, but your code is much nicer than anything I would be able to write in a finite time. 
:smile:

Following the information here: http://xarray.pydata.org/en/stable/data-structures.html#coordinates-methods

Currently, my serialization pipeline is:

```python
import xarray as xr
import numpy as np

# Set up an array with coordinates
n = np.zeros(3)
coords = {'x': np.arange(3)}
m = xr.DataArray(n, dims=['x'], coords=coords)

coords_dataset_dict = m.coords.to_dataset().to_dict()
coords_dict = coords_dataset_dict['coords']

# Read/write the dictionary to a JSON file here

# This works, but I'm essentially creating an empty dataset for it
coords_set = xr.Dataset.from_dict(coords_dataset_dict)
coords2 = coords_set.coords  # so many `coords` :D
m2 = xr.DataArray(np.zeros(shape=m.shape), dims=m.dims, coords=coords2)

# I used to just pass the dataset to ""coords""
m3 = xr.DataArray(np.zeros(shape=m.shape), dims=m.dims, coords=coords_set)
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,347712372
https://github.com/pydata/xarray/pull/2344#issuecomment-410572206,https://api.github.com/repos/pydata/xarray/issues/2344,410572206,MDEyOklzc3VlQ29tbWVudDQxMDU3MjIwNg==,90008,2018-08-06T02:31:02Z,2018-08-06T02:31:02Z,CONTRIBUTOR,Is there a better way to serialize coordinates only?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,347712372
https://github.com/pydata/xarray/pull/2344#issuecomment-410572013,https://api.github.com/repos/pydata/xarray/issues/2344,410572013,MDEyOklzc3VlQ29tbWVudDQxMDU3MjAxMw==,90008,2018-08-06T02:29:34Z,2018-08-06T02:29:34Z,CONTRIBUTOR,"It seems like this warning isn't benign though. I will take your suggestion (`coords=dataset.coords`). I feel like I'm not the only one who has probably done this. Should you raise another warning explicitly?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,347712372
https://github.com/pydata/xarray/pull/2344#issuecomment-410532428,https://api.github.com/repos/pydata/xarray/issues/2344,410532428,MDEyOklzc3VlQ29tbWVudDQxMDUzMjQyOA==,90008,2018-08-05T16:45:27Z,2018-08-05T16:45:27Z,CONTRIBUTOR,I came across this when serializing/deserializing my coordinates to a JSON file.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,347712372
https://github.com/pydata/xarray/issues/2340#issuecomment-410488222,https://api.github.com/repos/pydata/xarray/issues/2340,410488222,MDEyOklzc3VlQ29tbWVudDQxMDQ4ODIyMg==,90008,2018-08-05T01:15:39Z,2018-08-05T01:15:49Z,CONTRIBUTOR,"Finishing up this line of thought: without the assumption that the relative order of dimensions is maintained across arrays in a set, this feature is impossible to implement as a neat function call. You would have to specify exactly how to expand each of the coordinates, which can get pretty long.
I wrote some code that I think should have worked if relative ordering were a valid assumption; here it is for reference: https://github.com/hmaarrfk/xarray/pull/1

To obtain the desired effect, you have to expand the dimensions of the coordinates individually:

```python
import xarray as xr
import numpy as np

# Set up an array with coordinates
n = np.arange(1, 13).reshape(3, 2, 2)
coords = {'y': np.arange(1, 4), 'x': np.arange(1, 3), 'xi': np.arange(2)}

# %%
z = xr.DataArray(n[..., 0] * 2, dims=['y', 'x'])
a = xr.DataArray(n, dims=['y', 'x', 'xi'], coords={**coords, 'z': z})

sliced = a[0]
print(""The original xarray"")
print(a.z)
print(""The sliced xarray"")
print(sliced.z)

# %%
expanded = sliced.expand_dims('y', 0)
expanded['z'] = expanded.z.expand_dims('y', 0)
print(expanded)
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,347558405