issue_comments
5,143 rows where author_association = "MEMBER" and user = 1217238 sorted by updated_at descending
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
1572412059 | https://github.com/pydata/xarray/pull/7880#issuecomment-1572412059 | https://api.github.com/repos/pydata/xarray/issues/7880 | IC_kwDOAMm_X85duRqb | shoyer 1217238 | 2023-06-01T16:51:07Z | 2023-06-01T17:10:49Z | MEMBER | Given that this error only occurs when Python is shutting down, which is exactly a case in which we do not need to clean up open file objects, maybe we can remove the `__del__` method at interpreter exit. Something like: ```python import atexit @atexit.register def _remove_del_method(): # We don't need to close unclosed files at program exit, # and may not be able to do so, because Python is cleaning up # imports. del CachingFileManager.__del__ ``` (I have not tested this!) |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
don't use `CacheFileManager.__del__` on interpreter shutdown 1730664352 | |
1572350143 | https://github.com/pydata/xarray/pull/7880#issuecomment-1572350143 | https://api.github.com/repos/pydata/xarray/issues/7880 | IC_kwDOAMm_X85duCi_ | shoyer 1217238 | 2023-06-01T16:16:40Z | 2023-06-01T16:16:40Z | MEMBER | I agree that this seems very hard to test! Have you verified that this fixes things at least on your machine? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
don't use `CacheFileManager.__del__` on interpreter shutdown 1730664352 | |
1546951468 | https://github.com/pydata/xarray/issues/5511#issuecomment-1546951468 | https://api.github.com/repos/pydata/xarray/issues/5511 | IC_kwDOAMm_X85cNJss | shoyer 1217238 | 2023-05-14T17:17:56Z | 2023-05-14T17:17:56Z | MEMBER | If we can find cases where we know concurrent writes are unsafe, we can definitely start raising errors. Better to be safe than to suffer from silent data corruption! |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Appending data to a dataset stored in Zarr format produce PermissonError or NaN values in the final result 927617256 | |
1543042186 | https://github.com/pydata/xarray/issues/7325#issuecomment-1543042186 | https://api.github.com/repos/pydata/xarray/issues/7325 | IC_kwDOAMm_X85b-PSK | shoyer 1217238 | 2023-05-11T01:24:27Z | 2023-05-11T01:24:27Z | MEMBER | For anyone following along, I released a small package for reading TensorStore data into Xarray: https://github.com/google/xarray-tensorstore |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Support reading Zarr data via TensorStore 1465287257 | |
1530685353 | https://github.com/pydata/xarray/issues/4001#issuecomment-1530685353 | https://api.github.com/repos/pydata/xarray/issues/4001 | IC_kwDOAMm_X85bPGep | shoyer 1217238 | 2023-05-02T00:35:52Z | 2023-05-02T00:35:52Z | MEMBER | Can we delete the "Flexible indexes" meeting? It doesn't happen anymore. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
[community] Bi-weekly community developers meeting 606530049 | |
1526489103 | https://github.com/pydata/xarray/issues/7764#issuecomment-1526489103 | https://api.github.com/repos/pydata/xarray/issues/7764 | IC_kwDOAMm_X85a_GAP | shoyer 1217238 | 2023-04-27T21:15:23Z | 2023-04-27T21:15:23Z | MEMBER | Allowing for explicitly passing a function matching the The overhead from optimizing contraction paths is probably very small relative to the overhead of Xarray in general, so I would support setting |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Support opt_einsum in xr.dot 1672288892 | |
1496912849 | https://github.com/pydata/xarray/issues/6323#issuecomment-1496912849 | https://api.github.com/repos/pydata/xarray/issues/6323 | IC_kwDOAMm_X85ZORPR | shoyer 1217238 | 2023-04-05T04:49:34Z | 2023-04-05T04:49:34Z | MEMBER |
My expectation was that this would be a separate object, e.g., "disable all encoding propagation by discarding encoding attributes once a Dataset has been modified" would be an intermediate step, on the route to removing (As a side note, I would probably spell this as |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
propagation of `encoding` 1158378382 | |
1464180874 | https://github.com/pydata/xarray/issues/2227#issuecomment-1464180874 | https://api.github.com/repos/pydata/xarray/issues/2227 | IC_kwDOAMm_X85XRaCK | shoyer 1217238 | 2023-03-10T18:04:23Z | 2023-03-10T18:04:23Z | MEMBER | @dschwoerer are you sure that you are actually calculating the same thing in both cases? What exactly do the values of |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 | |
1434932769 | https://github.com/pydata/xarray/issues/4079#issuecomment-1434932769 | https://api.github.com/repos/pydata/xarray/issues/4079 | IC_kwDOAMm_X85Vh1Yh | shoyer 1217238 | 2023-02-17T17:03:52Z | 2023-02-17T17:03:52Z | MEMBER | I agree, automatic dimension names only ever really made sense for interactive use cases, where a user could see and fix the default names. It's a little late to change the default now to raising an error instead, but maybe we could add a warning? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Unnamed dimensions 621078539 | |
1414068565 | https://github.com/pydata/xarray/issues/5081#issuecomment-1414068565 | https://api.github.com/repos/pydata/xarray/issues/5081 | IC_kwDOAMm_X85USPlV | shoyer 1217238 | 2023-02-02T17:00:39Z | 2023-02-02T17:00:39Z | MEMBER | Personally I would not want to guarantee external stability/availability for this API in its current state. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Lazy indexing arrays as a stand-alone package 842436143 | |
1412396693 | https://github.com/pydata/xarray/pull/7496#issuecomment-1412396693 | https://api.github.com/repos/pydata/xarray/issues/7496 | IC_kwDOAMm_X85UL3aV | shoyer 1217238 | 2023-02-01T17:00:21Z | 2023-02-01T17:00:21Z | MEMBER | I like So personally I would rather go the other direction and add The inconsistency in the |
{ "total_count": 5, "+1": 5, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
deprecate open_zarr 1564661430 | |
1378074559 | https://github.com/pydata/xarray/pull/7418#issuecomment-1378074559 | https://api.github.com/repos/pydata/xarray/issues/7418 | IC_kwDOAMm_X85SI7-_ | shoyer 1217238 | 2023-01-11T00:27:47Z | 2023-01-11T00:27:47Z | MEMBER | I agree, datatree is an important data structure for Xarray. My preferred way to do this would be to follow @rabernat's suggestion and fork the code from the existing repo into the Xarray main codebase. My main concern is that we should carefully evaluate the datatree API to make sure we won't want to change it soon. Once we bring it into Xarray, there will be a higher expectation that the interface will remain stable. |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Import datatree in xarray? 1519552711 | |
1366880017 | https://github.com/pydata/xarray/issues/7404#issuecomment-1366880017 | https://api.github.com/repos/pydata/xarray/issues/7404 | IC_kwDOAMm_X85ReO8R | shoyer 1217238 | 2022-12-28T19:46:07Z | 2022-12-28T19:46:07Z | MEMBER | If you care about memory usage, you should explicitly close files after you use them, e.g., by calling `.close()` |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Memory leak - xr.open_dataset() not releasing memory. 1512460818 | |
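The close-after-use advice above can be written two ways; here is a minimal sketch (the temporary file path and variable names are illustrative, and writing netCDF assumes a backend such as netcdf4 or scipy is installed):

```python
import tempfile

import numpy as np
import xarray

# Create a small file to read back (illustrative only).
path = tempfile.mktemp(suffix=".nc")
xarray.Dataset({"x": ("t", np.arange(3.0))}).to_netcdf(path)

# Option 1: load what you need into memory, then close explicitly.
ds = xarray.open_dataset(path)
x = ds["x"].values.copy()
ds.close()

# Option 2: a context manager closes the file automatically.
with xarray.open_dataset(path) as ds:
    x = ds["x"].values.copy()
```

The `.copy()` calls make sure the in-memory data no longer references the file's buffers, so the handle can be released immediately.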
1351908915 | https://github.com/pydata/xarray/issues/7344#issuecomment-1351908915 | https://api.github.com/repos/pydata/xarray/issues/7344 | IC_kwDOAMm_X85QlH4z | shoyer 1217238 | 2022-12-14T18:24:04Z | 2022-12-14T18:24:04Z | MEMBER | I think it's OK to still require bottleneck for
|
{ "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Disable bottleneck by default? 1471685307 | |
1345646743 | https://github.com/pydata/xarray/pull/7368#issuecomment-1345646743 | https://api.github.com/repos/pydata/xarray/issues/7368 | IC_kwDOAMm_X85QNPCX | shoyer 1217238 | 2022-12-11T20:17:15Z | 2022-12-11T20:17:15Z | MEMBER |
In the long term, I think we should refactor For now, it's worth noting that the current |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Expose "Coordinates" as part of Xarray's public API 1485037066 | |
1344968954 | https://github.com/pydata/xarray/pull/7368#issuecomment-1344968954 | https://api.github.com/repos/pydata/xarray/issues/7368 | IC_kwDOAMm_X85QKpj6 | shoyer 1217238 | 2022-12-10T01:37:35Z | 2022-12-10T01:37:35Z | MEMBER | Long term, do you think it would make sense to merge together Indexes, Coordinates and IndexedCoordinates? They are sort of all containers for the same thing. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Expose "Coordinates" as part of Xarray's public API 1485037066 | |
1344944917 | https://github.com/pydata/xarray/pull/7368#issuecomment-1344944917 | https://api.github.com/repos/pydata/xarray/issues/7368 | IC_kwDOAMm_X85QKjsV | shoyer 1217238 | 2022-12-10T00:31:46Z | 2022-12-10T00:31:46Z | MEMBER |
Generally this looks great to me!
My suggestion would be:
Yes, this makes more sense to me!
Yes, I also agree! This makes more sense. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Expose "Coordinates" as part of Xarray's public API 1485037066 | |
1341296800 | https://github.com/pydata/xarray/issues/6610#issuecomment-1341296800 | https://api.github.com/repos/pydata/xarray/issues/6610 | IC_kwDOAMm_X85P8pCg | shoyer 1217238 | 2022-12-07T17:12:05Z | 2022-12-07T17:12:05Z | MEMBER | I also like the idea of creating specific Grouper objects for different types of selection, e.g., |
{ "total_count": 3, "+1": 3, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Update GroupBy constructor for grouping by multiple variables, dask arrays 1236174701 | |
1338121102 | https://github.com/pydata/xarray/issues/7350#issuecomment-1338121102 | https://api.github.com/repos/pydata/xarray/issues/7350 | IC_kwDOAMm_X85PwhuO | shoyer 1217238 | 2022-12-05T20:23:46Z | 2022-12-05T20:23:46Z | MEMBER |
Another way of describing the current behavior would be that xarray keeps around "every coordinate which could possibly still be valid," which is determined based upon dimension names. The main challenge is that "Coordinate variables should not have their coordinates changed" doesn't really make sense in Xarray's data model. Only Let me give an example of why we might want to keep scalar coordinates around. Consider a Dataset where |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Coordinate variable gains coordinate on subset 1473329967 | |
1336304695 | https://github.com/pydata/xarray/issues/7342#issuecomment-1336304695 | https://api.github.com/repos/pydata/xarray/issues/7342 | IC_kwDOAMm_X85PpmQ3 | shoyer 1217238 | 2022-12-04T02:28:45Z | 2022-12-04T02:28:45Z | MEMBER | The "robust" part is really just a modification to how the limits for color scales are chosen, i.e., ignoring the bottom and top 2% of the data from the color scale. So it sounds like what you're hoping for is separate per-column or per-row color scaling? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
`xr.DataArray.plot.pcolormesh(robust="col/row")` 1471561942 | |
1336302962 | https://github.com/pydata/xarray/issues/7350#issuecomment-1336302962 | https://api.github.com/repos/pydata/xarray/issues/7350 | IC_kwDOAMm_X85Ppl1y | shoyer 1217238 | 2022-12-04T02:16:25Z | 2022-12-04T02:16:25Z | MEMBER | This was an intentional design choice, back in the early days of Xarray. The rule Xarray uses for choosing which coordinates to associate with a DataArray created from a Dataset or DataArray is "every coordinate whose dimensions are still present on the new DataArray." This includes scalar coordinates, which are always kept around (because their dimensions are always included). What rule would you suggest instead? I agree that the behavior in this case "feels" wrong, but keep in mind that once |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Coordinate variable gains coordinate on subset 1473329967 | |
1336299057 | https://github.com/pydata/xarray/issues/7344#issuecomment-1336299057 | https://api.github.com/repos/pydata/xarray/issues/7344 | IC_kwDOAMm_X85Ppk4x | shoyer 1217238 | 2022-12-04T01:55:34Z | 2022-12-04T01:55:34Z | MEMBER | The case where Bottleneck really makes a difference is moving window statistics, where it uses a smarter algorithm than our current NumPy implementation, which creates a moving window view. Otherwise, I agree, it probably isn't worth the trouble. That said -- we could also switch to smarter NumPy based algorithms to implement most moving window calculations, e.g., using |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Disable bottleneck by default? 1471685307 | |
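A cumulative-sum formulation is one such smarter NumPy-based algorithm: each window sum becomes a difference of two running sums, so every element is touched a constant number of times instead of once per window. A sketch (a hypothetical helper, not xarray's actual implementation):

```python
import numpy as np

def moving_mean(a, w):
    # The sum of window [i, i + w) is cumsum[i + w] - cumsum[i];
    # prepending 0 makes the indexing uniform.
    c = np.cumsum(np.concatenate(([0.0], np.asarray(a, dtype=float))))
    return (c[w:] - c[:-w]) / w

moving_mean([1.0, 2.0, 3.0, 4.0], 2)  # array([1.5, 2.5, 3.5])
```

Unlike a moving-window-view approach, this never materializes overlapping windows, though very long cumulative sums can accumulate floating-point error.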
1330164500 | https://github.com/pydata/xarray/issues/7299#issuecomment-1330164500 | https://api.github.com/repos/pydata/xarray/issues/7299 | IC_kwDOAMm_X85PSLMU | shoyer 1217238 | 2022-11-29T06:53:48Z | 2022-11-29T06:53:48Z | MEMBER |
Thanks for the excellent report! I agree, this sounds like a good fix to me. I think something like the following would work: Replace the return line of |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
DataArray.reindex of empty array changes dtype from float32 to float64 1455771929 | |
1328156723 | https://github.com/pydata/xarray/pull/7323#issuecomment-1328156723 | https://api.github.com/repos/pydata/xarray/issues/7323 | IC_kwDOAMm_X85PKhAz | shoyer 1217238 | 2022-11-27T02:31:51Z | 2022-11-27T02:31:51Z | MEMBER |
For what it's worth, I think your users will have a poor experience with encoded JSON data for very large arrays. It will be slow to compress and transfer this data. In the long term, you would probably do better to transmit the data in some binary form (e.g., by calling |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
(Issue #7324) added functions that return data values in memory efficient manner 1465047346 | |
1328156304 | https://github.com/pydata/xarray/pull/7323#issuecomment-1328156304 | https://api.github.com/repos/pydata/xarray/issues/7323 | IC_kwDOAMm_X85PKg6Q | shoyer 1217238 | 2022-11-27T02:27:07Z | 2022-11-27T02:27:07Z | MEMBER | Thanks for the report and the PR! This really needs a "minimal complete verifiable" example (e.g., by creating and loading a Zarr array with random data) so others can verify the performance gains you reported: https://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports https://stackoverflow.com/help/minimal-reproducible-example To be honest, this fix looks a little funny to me, because NumPy's own implementation of If you can reproduce the issue only using NumPy, it could also make more sense to file this as an upstream bug report to NumPy. The NumPy maintainers are in a better position to debug tricky memory allocation issues involving NumPy. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
(Issue #7324) added functions that return data values in memory efficient manner 1465047346 | |
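For a NumPy-only reproduction of a suspected extra copy, `np.shares_memory` makes view-versus-copy behaviour directly checkable; for example, `ravel` only copies when the memory layout forces it:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# C-contiguous input: ravel returns a view, no new allocation.
assert np.shares_memory(a, a.ravel())

# Non-contiguous input (a transpose): ravel has to copy.
assert not np.shares_memory(a.T, a.T.ravel())
```

Pairing checks like these with memory profiling narrows a report down to the exact operation that allocates.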
1295283938 | https://github.com/pydata/xarray/pull/7214#issuecomment-1295283938 | https://api.github.com/repos/pydata/xarray/issues/7214 | IC_kwDOAMm_X85NNHbi | shoyer 1217238 | 2022-10-28T17:49:10Z | 2022-10-28T17:49:10Z | MEMBER |
I agree -- we should support this for backwards compatibility (even if we deprecate it).
OK, this totally makes sense. I don't love that it is possible to express invalid states in Xarray's data model. This motivated the creation of I wonder if we should consider the broader refactor of merging the This would have a number of benefits:
|
{ "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Pass indexes directly to the DataArray and Dataset constructors 1422543378 | |
1294262457 | https://github.com/pydata/xarray/pull/7221#issuecomment-1294262457 | https://api.github.com/repos/pydata/xarray/issues/7221 | IC_kwDOAMm_X85NJOC5 | shoyer 1217238 | 2022-10-28T00:27:22Z | 2022-10-28T00:27:22Z | MEMBER | I no longer remember why I added these checks, but I certainly did not expect to see this sort of performance penalty! |
{ "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Remove debugging slow assert statement 1423312198 | |
1293909730 | https://github.com/pydata/xarray/pull/7214#issuecomment-1293909730 | https://api.github.com/repos/pydata/xarray/issues/7214 | IC_kwDOAMm_X85NH37i | shoyer 1217238 | 2022-10-27T18:28:40Z | 2022-10-27T18:28:40Z | MEMBER |
I would lean against this, only because it's easier to explicitly manipulate indexes in the form of a Explicitly providing indexes is an advanced user feature. I think it's OK to require users to do a bit more work in this case and to not necessarily do consistency checks (beyond verifying that the coordinate variables exist). |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Pass indexes directly to the DataArray and Dataset constructors 1422543378 | |
1288188522 | https://github.com/pydata/xarray/issues/7132#issuecomment-1288188522 | https://api.github.com/repos/pydata/xarray/issues/7132 | IC_kwDOAMm_X85MyDJq | shoyer 1217238 | 2022-10-23T19:59:28Z | 2022-10-23T19:59:28Z | MEMBER | This is correct -- We would welcome contributions to fix this. This would entail making the We would also need a fall-back method for determining appropriate time units without looking at the array values. Something like |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Saving a DataArray of datetime objects as zarr is not a lazy operation despite compute=False 1397532790 | |
1286421985 | https://github.com/pydata/xarray/issues/6807#issuecomment-1286421985 | https://api.github.com/repos/pydata/xarray/issues/6807 | IC_kwDOAMm_X85MrT3h | shoyer 1217238 | 2022-10-21T03:49:18Z | 2022-10-21T03:49:18Z | MEMBER | Cubed should define a concatenate function, so that should be OK |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Alternative parallel execution frameworks in xarray 1308715638 | |
1278202565 | https://github.com/pydata/xarray/pull/4879#issuecomment-1278202565 | https://api.github.com/repos/pydata/xarray/issues/4879 | IC_kwDOAMm_X85ML9LF | shoyer 1217238 | 2022-10-13T21:34:05Z | 2022-10-13T21:34:05Z | MEMBER | I think we could fix this by marking CachingFileManager with |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Cache files for different CachingFileManager objects separately 803068773 | |
1269050790 | https://github.com/pydata/xarray/pull/4879#issuecomment-1269050790 | https://api.github.com/repos/pydata/xarray/issues/4879 | IC_kwDOAMm_X85LpC2m | shoyer 1217238 | 2022-10-05T22:27:28Z | 2022-10-05T22:27:28Z | MEMBER | Anyone want to review here? I think this should be ready to go in. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Cache files for different CachingFileManager objects separately 803068773 | |
1268700309 | https://github.com/pydata/xarray/pull/4879#issuecomment-1268700309 | https://api.github.com/repos/pydata/xarray/issues/4879 | IC_kwDOAMm_X85LntSV | shoyer 1217238 | 2022-10-05T17:06:02Z | 2022-10-05T17:57:19Z | MEMBER | ~~Actually maybe we don't need to keep files open after pickling... let me give this one more try.~~ Nevermind, this didn't work -- it still results in failing tests with dask-distributed on Windows. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Cache files for different CachingFileManager objects separately 803068773 | |
1268684962 | https://github.com/pydata/xarray/pull/4879#issuecomment-1268684962 | https://api.github.com/repos/pydata/xarray/issues/4879 | IC_kwDOAMm_X85Lnpii | shoyer 1217238 | 2022-10-05T16:51:14Z | 2022-10-05T16:51:14Z | MEMBER | OK, after a bit more futzing tests are passing and I think this is actually ready to go in. I ended up leaving in the reference counting after all -- I couldn't figure out another way to keep files open after a pickle round-trip. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Cache files for different CachingFileManager objects separately 803068773 | |
1260250383 | https://github.com/pydata/xarray/issues/6293#issuecomment-1260250383 | https://api.github.com/repos/pydata/xarray/issues/6293 | IC_kwDOAMm_X85LHeUP | shoyer 1217238 | 2022-09-28T00:49:26Z | 2022-09-28T00:49:26Z | MEMBER | Yes yes -- the sooner we can get rid of MultiIndex special cases the better! |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Explicit indexes: next steps 1148021907 | |
1259823913 | https://github.com/pydata/xarray/pull/4879#issuecomment-1259823913 | https://api.github.com/repos/pydata/xarray/issues/4879 | IC_kwDOAMm_X85LF2Mp | shoyer 1217238 | 2022-09-27T17:26:06Z | 2022-09-27T17:26:06Z | MEMBER | I added @cjauvin's integration test, and verified that the fix works for the scipy and h5netcdf backends. Unfortunately, it doesn't work yet for the netCDF4 backend. I don't think we can solve this in Xarray without fixes to netCDF4-Python or the netCDF-C library: https://github.com/Unidata/netcdf4-python/issues/1195 |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Cache files for different CachingFileManager objects separately 803068773 | |
1249910951 | https://github.com/pydata/xarray/issues/7045#issuecomment-1249910951 | https://api.github.com/repos/pydata/xarray/issues/7045 | IC_kwDOAMm_X85KgCCn | shoyer 1217238 | 2022-09-16T22:26:36Z | 2022-09-16T22:26:36Z | MEMBER | As a concrete example, suppose we have two datasets: 1. Hourly predictions for 10 days 2. Daily observations for a month. ```python import numpy as np import pandas as pd import xarray predictions = xarray.DataArray( np.random.RandomState(0).randn(24*10), {'time': pd.date_range('2022-01-01', '2022-01-11', freq='1h', closed='left')}, ) observations = xarray.DataArray( np.random.RandomState(1).randn(31), {'time': pd.date_range('2022-01-01', '2022-01-31', freq='24h')}, ) ``` Today, if you compare these datasets, they automatically align:
With this proposed change, you would get an error, e.g., something like:
Instead, you would need to manually align these objects, e.g., with
To (partially) simulate the effect of this change on a codebase today, you could write |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Should Xarray stop doing automatic index-based alignment? 1376109308 | |
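A stricter behaviour along these lines already exists opt-in: `join="exact"` refuses to align differing indexes, and `xarray.set_options(arithmetic_join="exact")` applies the same rule to arithmetic globally. A small self-contained illustration (shorter series than the example above):

```python
import pandas as pd
import xarray

a = xarray.DataArray([1.0, 2.0, 3.0],
                     {"time": pd.date_range("2022-01-01", periods=3)})
b = xarray.DataArray([10.0, 20.0],
                     {"time": pd.date_range("2022-01-01", periods=2)})

# Default: inner-join alignment silently drops the unmatched time.
print((a + b).sizes["time"])  # 2

# join="exact" raises instead of aligning automatically.
try:
    xarray.align(a, b, join="exact")
except ValueError:
    print("indexes are not equal: alignment refused")
```

Opting a codebase into `arithmetic_join="exact"` is one way to discover where it silently relies on automatic alignment today.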
1249601076 | https://github.com/pydata/xarray/issues/7045#issuecomment-1249601076 | https://api.github.com/repos/pydata/xarray/issues/7045 | IC_kwDOAMm_X85Ke2Y0 | shoyer 1217238 | 2022-09-16T17:16:52Z | 2022-09-16T17:18:38Z | MEMBER |
The problem is that user expectations are actually rather different for different options:
This would definitely be a step forward! However, it's a tricky nut to crack. We would both need a heuristic for defining Even then, automatic alignment is often problematic, e.g., imagine cases where a coordinate is defined in separate units. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Should Xarray stop doing automatic index-based alignment? 1376109308 | |
1244918028 | https://github.com/pydata/xarray/issues/7002#issuecomment-1244918028 | https://api.github.com/repos/pydata/xarray/issues/7002 | IC_kwDOAMm_X85KM_EM | shoyer 1217238 | 2022-09-13T05:30:12Z | 2022-09-13T05:30:12Z | MEMBER | I like option (4). If a multi-coordinate index needs to care about order, it can implement that logic itself. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Custom indexes and coordinate (re)ordering 1364388790 | |
1210976795 | https://github.com/pydata/xarray/issues/6904#issuecomment-1210976795 | https://api.github.com/repos/pydata/xarray/issues/6904 | IC_kwDOAMm_X85ILgob | shoyer 1217238 | 2022-08-10T16:43:36Z | 2022-08-10T16:43:36Z | MEMBER | You might look into different multiprocessing modes: https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods It may also be that the NetCDF or HDF5 libraries were simply not written in a way that can support multi-processing. This would not surprise me.
I agree, maybe this isn't worth the trouble. I have not seen it done successfully before. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
`sel` behaving randomly when applying to a dataset with multiprocessing 1333650265 | |
1210255676 | https://github.com/pydata/xarray/issues/6904#issuecomment-1210255676 | https://api.github.com/repos/pydata/xarray/issues/6904 | IC_kwDOAMm_X85IIwk8 | shoyer 1217238 | 2022-08-10T07:10:41Z | 2022-08-10T07:10:41Z | MEMBER |
Yes it should, as long as you're using multi-processing under the covers. If you do multi-threading, then you would want to use |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
`sel` behaving randomly when applying to a dataset with multiprocessing 1333650265 | |
1210233503 | https://github.com/pydata/xarray/issues/6904#issuecomment-1210233503 | https://api.github.com/repos/pydata/xarray/issues/6904 | IC_kwDOAMm_X85IIrKf | shoyer 1217238 | 2022-08-10T06:45:06Z | 2022-08-10T06:45:06Z | MEMBER | Can you try explicitly passing in a multiprocessing lock into the (We automatically select appropriate locks if using Dask, but I'm not sure how we would do that more generally...) |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
`sel` behaving randomly when applying to a dataset with multiprocessing 1333650265 | |
1210190649 | https://github.com/pydata/xarray/issues/4285#issuecomment-1210190649 | https://api.github.com/repos/pydata/xarray/issues/4285 | IC_kwDOAMm_X85IIgs5 | shoyer 1217238 | 2022-08-10T05:48:47Z | 2022-08-10T05:48:47Z | MEMBER | I am tempted to suggest that the right way to handle Awkward array is to treat "var" dimensions similar to NumPy's structured dtypes, with Either way, I would definitely encourage figuring out some actual use-cases before building this out :) |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Awkward array backend? 667864088 | |
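For reference, a NumPy structured dtype bundles named fields into each array element, which is the analogy being drawn for ragged "var" dimensions (a toy example, not a design proposal):

```python
import numpy as np

# Each element carries a float and an int under named fields, so the
# "inner" structure lives in the dtype rather than in a dimension.
dt = np.dtype([("value", np.float64), ("count", np.int32)])
a = np.zeros(3, dtype=dt)
a["count"] = [1, 2, 3]
a["value"] = [0.5, 1.5, 2.5]
print(a["count"].sum())  # 6
```

Treating the variable-length part as dtype-internal would let the outer, regular dimensions keep Xarray's usual semantics.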
1204280307 | https://github.com/pydata/xarray/pull/6874#issuecomment-1204280307 | https://api.github.com/repos/pydata/xarray/issues/6874 | IC_kwDOAMm_X85Hx9vz | shoyer 1217238 | 2022-08-03T17:44:20Z | 2022-08-03T17:44:20Z | MEMBER | As I understand it, the main purpose here is to remove Xarray lazy indexing class. Maybe call this |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Avoid calling np.asarray on lazy indexing classes 1327380960 | |
1200314984 | https://github.com/pydata/xarray/issues/2304#issuecomment-1200314984 | https://api.github.com/repos/pydata/xarray/issues/2304 | IC_kwDOAMm_X85Hi1po | shoyer 1217238 | 2022-07-30T23:55:04Z | 2022-07-30T23:55:04Z | MEMBER |
Yes, I'm pretty sure "float" means single precision (np.float32), given that "double" certainly means double precision (np.float64).
Yes, I believe so.
I think we can treat this as a bug fix and just go forward with it. Yes, some people are going to be surprised, but I don't think it's disruptive enough that we need to go to a major effort to preserve backwards compatibility. It should already be straightforward to work around by setting |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
float32 instead of float64 when decoding int16 with scale_factor netcdf var using xarray 343659822 | |
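The precision question reduces to NumPy's dtype promotion rules: packed int16 data decoded with a float32 `scale_factor` stays single precision, while a float64 `scale_factor` promotes the result. A quick check:

```python
import numpy as np

# dtype-level promotion mirrors what decoding produces.
assert np.promote_types(np.int16, np.float32) == np.float32  # single
assert np.promote_types(np.int16, np.float64) == np.float64  # double

# Decoding sketch: unpacked = raw * scale_factor (+ add_offset).
raw = np.array([100, 200], dtype=np.int16)
scale = np.array([0.01], dtype=np.float32)
unpacked = raw * scale
assert unpacked.dtype == np.float32
```

Using array-with-array multiplication here keeps the result independent of NumPy's scalar promotion behaviour, which changed with NEP 50.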
1199939328 | https://github.com/pydata/xarray/issues/6849#issuecomment-1199939328 | https://api.github.com/repos/pydata/xarray/issues/6849 | IC_kwDOAMm_X85HhZ8A | shoyer 1217238 | 2022-07-29T20:56:05Z | 2022-07-29T20:56:05Z | MEMBER | I agree, I think only setting a few indexes at a time would be normal. If we eventually need convenience methods for setting multiple indexes we can add those later. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Public API for setting new indexes: add a set_xindex method? 1322198907 | |
1199753281 | https://github.com/pydata/xarray/issues/6849#issuecomment-1199753281 | https://api.github.com/repos/pydata/xarray/issues/6849 | IC_kwDOAMm_X85HgshB | shoyer 1217238 | 2022-07-29T17:00:06Z | 2022-07-29T17:00:06Z | MEMBER | This sounds great to me! I don't think we need support for setting multiple indexes at once in a single method call. You can call |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Public API for setting new indexes: add a set_xindex method? 1322198907 | |
1198375377 | https://github.com/pydata/xarray/issues/6833#issuecomment-1198375377 | https://api.github.com/repos/pydata/xarray/issues/6833 | IC_kwDOAMm_X85HbcHR | shoyer 1217238 | 2022-07-28T16:29:30Z | 2022-07-28T16:29:30Z | MEMBER | I just toggled the "Require a pull request before merging" option |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Require a pull request before merging to main 1318800553 | |
1188520871 | https://github.com/pydata/xarray/issues/6807#issuecomment-1188520871 | https://api.github.com/repos/pydata/xarray/issues/6807 | IC_kwDOAMm_X85G12On | shoyer 1217238 | 2022-07-19T02:18:03Z | 2022-07-19T02:18:03Z | MEMBER | Sounds good to me. The challenge will be defining a parallel computing API that works across all these projects, with their slightly different models. |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Alternative parallel execution frameworks in xarray 1308715638 | |
1183458691 | https://github.com/pydata/xarray/issues/6505#issuecomment-1183458691 | https://api.github.com/repos/pydata/xarray/issues/6505 | IC_kwDOAMm_X85GiiWD | shoyer 1217238 | 2022-07-13T16:51:09Z | 2022-07-13T16:51:31Z | MEMBER | Reopening because my second example |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Dropping a MultiIndex variable raises an error after explicit indexes refactor 1210267320 | |
1176808719 | https://github.com/pydata/xarray/issues/2697#issuecomment-1176808719 | https://api.github.com/repos/pydata/xarray/issues/2697 | IC_kwDOAMm_X85GJK0P | shoyer 1217238 | 2022-07-06T22:21:48Z | 2022-07-06T22:21:48Z | MEMBER | Maybe a separate project in xarray-contrib would make sense? I would be reluctant to add this into Xarray proper if we need a new external dependency for reading XML files. On Wed, Jul 6, 2022 at 2:37 PM David Huard @.***> wrote:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
read ncml files to create multifile datasets 401874795 | |
1165848854 | https://github.com/pydata/xarray/pull/6721#issuecomment-1165848854 | https://api.github.com/repos/pydata/xarray/issues/6721 | IC_kwDOAMm_X85FfXEW | shoyer 1217238 | 2022-06-24T18:57:42Z | 2022-06-24T18:57:42Z | MEMBER | The simplest option would probably be a custom Zarr store that raises an error if you try to look at array data. This could be implemented as a subclass of an existing Zarr store (e.g., the in memory store) that raises an error in |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Fix .chunks loading lazy backed array data 1284071791 | |
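The store-subclass idea in the comment above can be sketched with a plain mapping (the class and key names here are hypothetical illustrations, not an existing Zarr class):

```python
class GuardedStore(dict):
    """Hypothetical in-memory Zarr-style store that raises if chunk data
    is read, while still allowing metadata lookups (.zarray/.zattrs/...)."""

    _METADATA_SUFFIXES = (".zarray", ".zattrs", ".zgroup", ".zmetadata")

    def __getitem__(self, key):
        if not key.endswith(self._METADATA_SUFFIXES):
            raise RuntimeError(f"unexpected read of array data: {key!r}")
        return super().__getitem__(key)


store = GuardedStore({"x/.zarray": b"{}", "x/0": b"\x00"})
assert store["x/.zarray"] == b"{}"  # metadata access is fine
try:
    store["x/0"]  # chunk access raises
except RuntimeError:
    pass
```

The same idea works as a subclass of a real Zarr store class, overriding `__getitem__` for chunk keys only.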
1165847538 | https://github.com/pydata/xarray/pull/6721#issuecomment-1165847538 | https://api.github.com/repos/pydata/xarray/issues/6721 | IC_kwDOAMm_X85FfWvy | shoyer 1217238 | 2022-06-24T18:55:51Z | 2022-06-24T18:55:51Z | MEMBER | We have some tests with |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Fix .chunks loading lazy backed array data 1284071791 | |
1163345547 | https://github.com/pydata/xarray/issues/6704#issuecomment-1163345547 | https://api.github.com/repos/pydata/xarray/issues/6704 | IC_kwDOAMm_X85FVz6L | shoyer 1217238 | 2022-06-22T16:31:33Z | 2022-06-22T16:31:33Z | MEMBER |
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Future of `DataArray.rename` 1275752720 | |
1163299397 | https://github.com/pydata/xarray/issues/6646#issuecomment-1163299397 | https://api.github.com/repos/pydata/xarray/issues/6646 | IC_kwDOAMm_X85FVopF | shoyer 1217238 | 2022-06-22T15:57:14Z | 2022-06-22T15:57:14Z | MEMBER | NumPy mostly uses |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
`dim` vs `dims` 1250939008 | |
1163296444 | https://github.com/pydata/xarray/issues/6646#issuecomment-1163296444 | https://api.github.com/repos/pydata/xarray/issues/6646 | IC_kwDOAMm_X85FVn68 | shoyer 1217238 | 2022-06-22T15:55:13Z | 2022-06-22T15:56:35Z | MEMBER | It would be helpful to understand if there are also other uses of |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
`dim` vs `dims` 1250939008 | |
1163292851 | https://github.com/pydata/xarray/issues/6704#issuecomment-1163292851 | https://api.github.com/repos/pydata/xarray/issues/6704 | IC_kwDOAMm_X85FVnCz | shoyer 1217238 | 2022-06-22T15:52:12Z | 2022-06-22T15:52:12Z | MEMBER | Should we call it The latter might make more sense, but then it wouldn't mirror |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Future of `DataArray.rename` 1275752720 | |
1150280375 | https://github.com/pydata/xarray/issues/644#issuecomment-1150280375 | https://api.github.com/repos/pydata/xarray/issues/644 | IC_kwDOAMm_X85Ej-K3 | shoyer 1217238 | 2022-06-08T18:56:17Z | 2022-06-08T18:56:17Z | MEMBER | This might fit more naturally into interp() as a new method like "nearest-valid" rather than in sel(). The difference is that sel() only looks at indexes (and not the data) to select out a single value, whereas interp() can combine adjacent values in arbitrary ways. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature request: only allow nearest-neighbor .sel for valid data (not NaN positions) 114773593 | |
1146873595 | https://github.com/pydata/xarray/issues/6524#issuecomment-1146873595 | https://api.github.com/repos/pydata/xarray/issues/6524 | IC_kwDOAMm_X85EW-b7 | shoyer 1217238 | 2022-06-05T19:54:47Z | 2022-06-05T19:54:47Z | MEMBER |
Yes, it's worth discussing. I don't know if there will be a satisfying resolution, though. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
NumPy `__array_ufunc__` does not work with typing 1217425815 | |
1137839614 | https://github.com/pydata/xarray/issues/6633#issuecomment-1137839614 | https://api.github.com/repos/pydata/xarray/issues/6633 | IC_kwDOAMm_X85D0g3- | shoyer 1217238 | 2022-05-25T20:55:14Z | 2022-05-25T20:55:14Z | MEMBER | Looking at this mur-sst dataset in particular, it stores time in chunks of size 5. That means fetching the 6443 time values requires 1288 separate HTTP requests -- no wonder it's so slow! If the time axis were instead stored in a single chunk of 51 KB, Xarray would only need 3 small HTTP requests to load the lat, lon and time indexes, which would probably complete in a fraction of a second. That said, I agree that this would be nice to have in general. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Opening dataset without loading any indexes? 1247010680 | |
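A back-of-envelope sketch of the request counts described in the comment above (the exact count depends on the chunk layout, so the numbers are approximate):

```python
import math

n_times, chunk_len = 6443, 5

# One HTTP request per stored chunk of the time coordinate.
requests_tiny_chunks = math.ceil(n_times / chunk_len)
requests_one_chunk = math.ceil(n_times / n_times)

assert requests_tiny_chunks > 1000  # ~1,300 round trips for tiny chunks
assert requests_one_chunk == 1      # a single ~51 KB read otherwise
```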
1137754031 | https://github.com/pydata/xarray/issues/6633#issuecomment-1137754031 | https://api.github.com/repos/pydata/xarray/issues/6633 | IC_kwDOAMm_X85D0L-v | shoyer 1217238 | 2022-05-25T19:12:40Z | 2022-05-25T19:12:40Z | MEMBER |
+1 this syntax makes sense to me! |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Opening dataset without loading any indexes? 1247010680 | |
1137661171 | https://github.com/pydata/xarray/pull/6475#issuecomment-1137661171 | https://api.github.com/repos/pydata/xarray/issues/6475 | IC_kwDOAMm_X85Dz1Tz | shoyer 1217238 | 2022-05-25T18:10:21Z | 2022-05-25T18:10:21Z | MEMBER |
I opened up https://github.com/zarr-developers/zarr-python/issues/1039 |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
implement Zarr v3 spec support 1200581329 | |
1137572812 | https://github.com/pydata/xarray/issues/6633#issuecomment-1137572812 | https://api.github.com/repos/pydata/xarray/issues/6633 | IC_kwDOAMm_X85DzfvM | shoyer 1217238 | 2022-05-25T17:10:04Z | 2022-05-25T17:10:04Z | MEMBER | Early versions of Xarray used to have lazy loading of data for indexes, but we removed this for the sake of simplicity. In principle we could restore lazy indexes, but another option (post explicit index refactor) would be to allow opening a dataset without creating indexes for 1D coordinates along dimensions. Another way to solve this sort of challenge might be to load index data in parallel when using Dask. Right now I believe the data corresponding to indexes is always loaded eagerly, without using Dask. All that said -- Do you have a specific example where this has been problematic? In my experience it has been pretty reasonable to use xarray.Dataset objects for schema-like templates, even with index data needing to be loaded eagerly. Possibly another Zarr chunking scheme for your index data could be more efficient? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Opening dataset without loading any indexes? 1247010680 | |
1126587818 | https://github.com/pydata/xarray/issues/6607#issuecomment-1126587818 | https://api.github.com/repos/pydata/xarray/issues/6607 | IC_kwDOAMm_X85DJl2q | shoyer 1217238 | 2022-05-14T00:10:13Z | 2022-05-14T00:10:13Z | MEMBER |
This seems like a good idea. In the long term, we'd like to decouple indexes from coordinates, and make something like the following work:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Coordinate promotion workaround broken 1235725650 | |
1126255398 | https://github.com/pydata/xarray/pull/5734#issuecomment-1126255398 | https://api.github.com/repos/pydata/xarray/issues/5734 | IC_kwDOAMm_X85DIUsm | shoyer 1217238 | 2022-05-13T16:51:24Z | 2022-05-13T16:51:24Z | MEMBER | 👍 this looks great to me! |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Enable `flox` in `GroupBy` and `resample` 978356586 | |
1124302215 | https://github.com/pydata/xarray/pull/6566#issuecomment-1124302215 | https://api.github.com/repos/pydata/xarray/issues/6566 | IC_kwDOAMm_X85DA32H | shoyer 1217238 | 2022-05-11T21:15:36Z | 2022-05-11T21:15:36Z | MEMBER | For whatever reason, Windows seems to be much stricter about requiring file handles to be explicitly closed. So my guess is that this could be solved by using |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
New inline_array kwarg for open_dataset 1223270563 | |
1116397246 | https://github.com/pydata/xarray/issues/6517#issuecomment-1116397246 | https://api.github.com/repos/pydata/xarray/issues/6517 | IC_kwDOAMm_X85Cit6- | shoyer 1217238 | 2022-05-03T18:09:42Z | 2022-05-03T18:09:42Z | MEMBER | I'm a little skeptical that it makes sense to add special case logic into Xarray in an attempt to keep NumPy's "OWNDATA" flag up to date. There are lots of places where we create views of data from existing arrays inside Xarray operations. There are definitely cases where Xarray's internal operations do memory copies followed by views, which would also result in datasets with misleading "OWNDATA" flags if you look only at resulting datasets, e.g.,
Overall, I just don't think this is a reliable way to trace memory allocation with NumPy. Maybe you could do better by also tracing back to source arrays with |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Loading from NetCDF creates unnecessary numpy.ndarray-views that clears the OWNDATA-flag 1216517115 | |
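The point in the comment above, that copies followed by views leave a misleading "OWNDATA" flag, can be seen directly in NumPy:

```python
import numpy as np

a = np.arange(10)
view = a[2:5]                    # a view into `a`
copy_then_view = a.copy()[2:5]   # freshly copied memory, then a view of it

# Both report OWNDATA=False, even though the second effectively owns its
# data (nothing else references the intermediate copy).
assert not view.flags["OWNDATA"]
assert not copy_then_view.flags["OWNDATA"]
assert np.shares_memory(a, view)
assert not np.shares_memory(a, copy_then_view)
```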
1114173984 | https://github.com/pydata/xarray/issues/1621#issuecomment-1114173984 | https://api.github.com/repos/pydata/xarray/issues/1621 | IC_kwDOAMm_X85CaPIg | shoyer 1217238 | 2022-05-01T08:49:40Z | 2022-05-01T08:49:40Z | MEMBER | Still relevant! |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Undesired decoding to timedelta64 (was: units of "seconds" translated to time coordinate) 264321376 | |
1111813044 | https://github.com/pydata/xarray/issues/6524#issuecomment-1111813044 | https://api.github.com/repos/pydata/xarray/issues/6524 | IC_kwDOAMm_X85CROu0 | shoyer 1217238 | 2022-04-28T06:52:04Z | 2022-04-28T06:52:04Z | MEMBER | I think this would need to get updated on the NumPy side. Ideally NumPy ufuncs would be typed to check for class HasArrayUFunc(Protocol): def array_ufunc(ufunc, method, inputs, *kwargs): pass ArrayOrHasArrayUFunc = TypeVar("ArrayOrHasArrayUFunc", ndarray, HasArrayUFunc) def exp(x: ArrayOrHasArrayUFunc) -> ArrayOrHasArrayUFunc: ... ``` |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
NumPy `__array_ufunc__` does not work with typing 1217425815 | |
863427710 | https://github.com/pydata/xarray/issues/2171#issuecomment-863427710 | https://api.github.com/repos/pydata/xarray/issues/2171 | MDEyOklzc3VlQ29tbWVudDg2MzQyNzcxMA== | shoyer 1217238 | 2021-06-17T17:30:17Z | 2022-04-19T03:15:24Z | MEMBER | @gagebeni please open a new discussion for your issue: https://github.com/pydata/xarray/discussions |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Support alignment/broadcasting with unlabeled dimensions of size 1 325439138 | |
1100953736 | https://github.com/pydata/xarray/issues/4267#issuecomment-1100953736 | https://api.github.com/repos/pydata/xarray/issues/4267 | IC_kwDOAMm_X85BnziI | shoyer 1217238 | 2022-04-17T21:42:36Z | 2022-04-17T21:42:36Z | MEMBER | This is still relevant |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
CachingFileManager should not use __del__ 665488672 | |
1099788049 | https://github.com/pydata/xarray/pull/6476#issuecomment-1099788049 | https://api.github.com/repos/pydata/xarray/issues/6476 | IC_kwDOAMm_X85BjW8R | shoyer 1217238 | 2022-04-15T02:14:56Z | 2022-04-15T02:14:56Z | MEMBER | I will take a look soon! On Thu, Apr 14, 2022 at 6:23 PM Maximilian Roos @.***> wrote:
|
{ "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 1, "rocket": 0, "eyes": 0 } |
Fix zarr append dtype checks 1200716594 | |
1099309755 | https://github.com/pydata/xarray/pull/6420#issuecomment-1099309755 | https://api.github.com/repos/pydata/xarray/issues/6420 | IC_kwDOAMm_X85BhiK7 | shoyer 1217238 | 2022-04-14T15:36:14Z | 2022-04-14T15:36:14Z | MEMBER | Thanks @malmans2 ! |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Add support in the "zarr" backend for reading NCZarr data 1183534905 | |
1099307673 | https://github.com/pydata/xarray/pull/6475#issuecomment-1099307673 | https://api.github.com/repos/pydata/xarray/issues/6475 | IC_kwDOAMm_X85BhhqZ | shoyer 1217238 | 2022-04-14T15:33:54Z | 2022-04-14T15:33:54Z | MEMBER |
is there an issue on the Zarr side where this is currently being discussed? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
implement Zarr v3 spec support 1200581329 | |
1098229361 | https://github.com/pydata/xarray/pull/6475#issuecomment-1098229361 | https://api.github.com/repos/pydata/xarray/issues/6475 | IC_kwDOAMm_X85BdaZx | shoyer 1217238 | 2022-04-13T16:04:23Z | 2022-04-13T16:04:23Z | MEMBER |
Does Zarr v3 have a notion of a "root" group? That feels like a more sensible default to me, both for Xarray and Zarr-Python.
This sounds fine for now, but I am concerned that it will slow the adoption of Zarr v3. Eventually, we would presumably want to change the default to version 3, but this is difficult to do if it entirely breaks backwards compatibility. My preference would be for the default behavior to try opening Zarr v2, and fall back to opening in v3 mode, even if this requires attempting to open a file from the store. This is similar to how Xarray handles other Zarr versioning issues (e.g., for consolidated metadata). Perhaps Zarr-Python could raise an informative error that we could catch if the Zarr version is incorrect, or even handle this behavior itself? |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
implement Zarr v3 spec support 1200581329 | |
1094123521 | https://github.com/pydata/xarray/pull/6420#issuecomment-1094123521 | https://api.github.com/repos/pydata/xarray/issues/6420 | IC_kwDOAMm_X85BNwAB | shoyer 1217238 | 2022-04-09T21:00:04Z | 2022-04-09T21:00:04Z | MEMBER | Could you also add brief updates to mention NCZarr support in the docstring for
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Add support in the "zarr" backend for reading NCZarr data 1183534905 | |
1090499559 | https://github.com/pydata/xarray/issues/6374#issuecomment-1090499559 | https://api.github.com/repos/pydata/xarray/issues/6374 | IC_kwDOAMm_X85A_7Pn | shoyer 1217238 | 2022-04-06T17:04:26Z | 2022-04-06T17:04:26Z | MEMBER |
This error message comes from Xarray and can be triggered by calling I don't think netCDF-C needs to be involved at all, which is why I suggested opening a separate issue. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Should the zarr backend support NCZarr conventions? 1172229856 | |
1090464275 | https://github.com/pydata/xarray/issues/6374#issuecomment-1090464275 | https://api.github.com/repos/pydata/xarray/issues/6374 | IC_kwDOAMm_X85A_yoT | shoyer 1217238 | 2022-04-06T16:25:40Z | 2022-04-06T16:25:40Z | MEMBER | @wankoelias could you kindly open a new issue for writing GDAL ZARR? |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Should the zarr backend support NCZarr conventions? 1172229856 | |
1078695337 | https://github.com/pydata/xarray/issues/2233#issuecomment-1078695337 | https://api.github.com/repos/pydata/xarray/issues/2233 | IC_kwDOAMm_X85AS5Wp | shoyer 1217238 | 2022-03-25T06:20:10Z | 2022-03-25T06:20:10Z | MEMBER | This is the second follow-up item in https://github.com/pydata/xarray/issues/6293 I think we could definitely experiment with relaxing this constraint now, although ideally we would continue to check off auditing all of the methods in that long list first. |
{ "total_count": 4, "+1": 4, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Problem opening unstructured grid ocean forecasts with 4D vertical coordinates 332471780 | |
1077253534 | https://github.com/pydata/xarray/issues/6408#issuecomment-1077253534 | https://api.github.com/repos/pydata/xarray/issues/6408 | IC_kwDOAMm_X85ANZWe | shoyer 1217238 | 2022-03-24T05:53:56Z | 2022-03-24T05:53:56Z | MEMBER | I think this is probably fine without a deprecation cycle. This is a very easy fix for users. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
backwards incompatible changes in reductions 1178949620 | |
1076796582 | https://github.com/pydata/xarray/issues/6374#issuecomment-1076796582 | https://api.github.com/repos/pydata/xarray/issues/6374 | IC_kwDOAMm_X85ALpym | shoyer 1217238 | 2022-03-23T20:38:12Z | 2022-03-23T20:38:12Z | MEMBER | @DennisHeimbigner I think it would be great to standardize NCZarr as a super-set of the "Xarray-Zarr" standard! I think Xarray should indeed be able to read such files. If you want to read a sub-group, you can read the sub-group in a separate call to @rabernat I would not be opposed to adding support inside Xarray for reading NCZarr data, specifically to understand NCZarr's encoding of dimension names when using Zarr-Python. This wouldn't give 100% compatibility with NCZarr, but it would be very close (maybe just with incorrect dtypes for attributes) with a minimal amount of work. I don't think it would be a big deal to look for |
{ "total_count": 3, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 2, "rocket": 0, "eyes": 0 } |
Should the zarr backend support NCZarr conventions? 1172229856 | |
1071104882 | https://github.com/pydata/xarray/pull/5692#issuecomment-1071104882 | https://api.github.com/repos/pydata/xarray/issues/5692 | IC_kwDOAMm_X84_18Ny | shoyer 1217238 | 2022-03-17T17:12:07Z | 2022-03-17T17:12:07Z | MEMBER | OK, in it goes! Big thanks to @benbovy for seeing this through :) |
{ "total_count": 24, "+1": 0, "-1": 0, "laugh": 0, "hooray": 13, "confused": 0, "heart": 1, "rocket": 10, "eyes": 0 } |
Explicit indexes 966983801 | |
1069344000 | https://github.com/pydata/xarray/pull/5692#issuecomment-1069344000 | https://api.github.com/repos/pydata/xarray/issues/5692 | IC_kwDOAMm_X84_vOUA | shoyer 1217238 | 2022-03-16T16:47:45Z | 2022-03-16T16:47:45Z | MEMBER | OK, I think we’re good to go here? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Explicit indexes 966983801 | |
1065381346 | https://github.com/pydata/xarray/issues/6345#issuecomment-1065381346 | https://api.github.com/repos/pydata/xarray/issues/6345 | IC_kwDOAMm_X84_gG3i | shoyer 1217238 | 2022-03-11T18:38:42Z | 2022-03-11T18:38:42Z | MEMBER | The data type restriction here seems to date back to the original PR adding support for appending. I turned up this comment that seems to summarize the motivation for this check: https://github.com/pydata/xarray/pull/2706#issuecomment-502481584 I think the original issue was that appending a fixed-width string could be a problem if the fixed-width does not match the width of the existing string dtype stored in Zarr. This obviously doesn't apply in this case, because you are adding an entirely new variable. So I guess the check could be removed in that case. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
`to_zarr` raises `ValueError: Invalid dtype` with `mode='a'` (but not with `mode='w'`) 1164454058 | |
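The fixed-width string hazard mentioned in the comment above is easy to demonstrate with NumPy dtypes alone (a sketch of the failure mode, independent of Zarr itself):

```python
import numpy as np

existing = np.empty(2, dtype="<U3")  # storage declared as 3-character strings
existing[0] = "abc"
existing[1] = "abcdef"               # longer value is silently truncated

assert existing[1] == "abc"
```

Appending wider strings into a Zarr array with a narrower fixed-width dtype risks exactly this kind of silent truncation, which is what the original check guarded against.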
1062211273 | https://github.com/pydata/xarray/issues/1613#issuecomment-1062211273 | https://api.github.com/repos/pydata/xarray/issues/1613 | IC_kwDOAMm_X84_UA7J | shoyer 1217238 | 2022-03-08T21:09:05Z | 2022-03-08T21:09:05Z | MEMBER | Another challenge with changing the meaning of I think the separate new API (e.g., |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Should sel with slice objects care about underlying coordinate order? 263403430 | |
1059662347 | https://github.com/pydata/xarray/pull/5692#issuecomment-1059662347 | https://api.github.com/repos/pydata/xarray/issues/5692 | IC_kwDOAMm_X84_KSoL | shoyer 1217238 | 2022-03-05T03:05:36Z | 2022-03-05T03:05:36Z | MEMBER | I would like to merge this PR very soon so it can get testing before the next release. If anyone has any remaining concerns, please speak up! |
{ "total_count": 5, "+1": 5, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Explicit indexes 966983801 | |
1059546596 | https://github.com/pydata/xarray/issues/1460#issuecomment-1059546596 | https://api.github.com/repos/pydata/xarray/issues/1460 | IC_kwDOAMm_X84_J2Xk | shoyer 1217238 | 2022-03-04T21:31:41Z | 2022-03-04T21:31:41Z | MEMBER | Well, even if we keep |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
groupby should still squeeze for non-monotonic inputs 237008177 | |
1058366320 | https://github.com/pydata/xarray/issues/1613#issuecomment-1058366320 | https://api.github.com/repos/pydata/xarray/issues/1613 | IC_kwDOAMm_X84_FWNw | shoyer 1217238 | 2022-03-03T18:39:59Z | 2022-03-03T18:39:59Z | MEMBER | One complication with using
If we change the semantics of Alternatively, we could either do the dedicated indexing object like |
{ "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 1 } |
Should sel with slice objects care about underlying coordinate order? 263403430 | |
1058293194 | https://github.com/pydata/xarray/issues/1613#issuecomment-1058293194 | https://api.github.com/repos/pydata/xarray/issues/1613 | IC_kwDOAMm_X84_FEXK | shoyer 1217238 | 2022-03-03T17:23:09Z | 2022-03-03T17:23:09Z | MEMBER | This is probably worth fixing if possible in a straightforward way. I don't think anyone is well served by matching the behavior of Python list indexing here -- it's a strange edge case that indexing a list like |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Should sel with slice objects care about underlying coordinate order? 263403430 | |
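For concreteness, the Python list-slicing edge case referenced above:

```python
x = [1, 2, 3]

# Out-of-range *slices* never raise; they just clip to the valid range.
assert x[10:] == []
assert x[1:100] == [2, 3]

# Out-of-range *scalar* indexing does raise.
try:
    x[10]
except IndexError:
    pass
```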
1057657161 | https://github.com/pydata/xarray/issues/6176#issuecomment-1057657161 | https://api.github.com/repos/pydata/xarray/issues/6176 | IC_kwDOAMm_X84_CpFJ | shoyer 1217238 | 2022-03-03T04:32:10Z | 2022-03-03T04:32:10Z | MEMBER | Breaking changes will continue to be very rare, and whenever possible will be preceeded by deprecation or future warnings for multiple months. |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Xarray versioning to switch to CalVer 1108564253 | |
1051297372 | https://github.com/pydata/xarray/issues/6304#issuecomment-1051297372 | https://api.github.com/repos/pydata/xarray/issues/6304 | IC_kwDOAMm_X84-qYZc | shoyer 1217238 | 2022-02-25T21:50:15Z | 2022-02-25T21:50:15Z | MEMBER | Adding a |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
add join argument to xr.broadcast? 1150251120 | |
1042660100 | https://github.com/pydata/xarray/issues/4118#issuecomment-1042660100 | https://api.github.com/repos/pydata/xarray/issues/4118 | IC_kwDOAMm_X84-JbsE | shoyer 1217238 | 2022-02-17T07:45:24Z | 2022-02-17T07:45:24Z | MEMBER | One thing that came up in our discussion about this in the developer meeting today is that we could also pretty easily expose a "low level" API for IO using dictionaries of xarray.Variable objects. This intermediate representation could be useful for cleaning up data into a form suitable for conversion into Dataset objects. On Wed, Feb 16, 2022 at 11:39 PM Alessandro Amici @.***> wrote:
|
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Feature Request: Hierarchical storage and processing in xarray 628719058 | |
1035611864 | https://github.com/pydata/xarray/issues/2186#issuecomment-1035611864 | https://api.github.com/repos/pydata/xarray/issues/2186 | IC_kwDOAMm_X849ui7Y | shoyer 1217238 | 2022-02-10T22:49:40Z | 2022-02-10T22:50:01Z | MEMBER | For what it's worth, the recommended way to do this is to explicitly close the Dataset with Or with a context manager, e.g.,
|
{ "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Memory leak while looping through a Dataset 326533369 | |
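The pattern being recommended above, sketched here with a stand-in class (`xarray.Dataset` supports the same protocol via its `close()` method, so real usage is `with xr.open_dataset(path) as ds: ...`):

```python
from contextlib import closing


class FakeDataset:
    """Stand-in for xarray.Dataset, just to illustrate deterministic cleanup."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


# Explicit close:
ds = FakeDataset()
ds.close()
assert ds.closed

# Or via a context manager, which closes even if an error occurs:
with closing(FakeDataset()) as ds2:
    assert not ds2.closed
assert ds2.closed
```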
1034196986 | https://github.com/pydata/xarray/issues/6069#issuecomment-1034196986 | https://api.github.com/repos/pydata/xarray/issues/6069 | IC_kwDOAMm_X849pJf6 | shoyer 1217238 | 2022-02-09T21:12:31Z | 2022-02-09T21:12:31Z | MEMBER | The reason why this isn't allowed is because it's ambiguous what to do with the other variables that are not restricted to the region (['cell', 'face', 'layer', 'max_cell_node', 'max_face_nodes', 'node', 'siglay'] in this case). I can imagine quite a few different ways this behavior could be implemented:
I believe your proposal here (removing these checks from (4) seems like perhaps the most user-friendly option, but checking existing variables can add significant overhead. When experimenting adding The current solution is not to do any of these, and to force the user to make an explicit choice by dropping new variables, or write them in a separate call to |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
to_zarr: region not recognised as dataset dimensions 1077079208 | |
1032051447 | https://github.com/pydata/xarray/issues/6230#issuecomment-1032051447 | https://api.github.com/repos/pydata/xarray/issues/6230 | IC_kwDOAMm_X849g9r3 | shoyer 1217238 | 2022-02-07T23:40:48Z | 2022-02-07T23:40:48Z | MEMBER | In the long term (cc @benbovy) I think we would ideally split
|
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
[PERFORMANCE]: `isin` on `CFTimeIndex`-backed `Coordinate` slow 1120583442 | |
1031811347 | https://github.com/pydata/xarray/issues/6230#issuecomment-1031811347 | https://api.github.com/repos/pydata/xarray/issues/6230 | IC_kwDOAMm_X849gDET | shoyer 1217238 | 2022-02-07T19:01:54Z | 2022-02-07T19:01:54Z | MEMBER | Oh, I guess the challenge is that |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
[PERFORMANCE]: `isin` on `CFTimeIndex`-backed `Coordinate` slow 1120583442 | |
1031810590 | https://github.com/pydata/xarray/issues/6230#issuecomment-1031810590 | https://api.github.com/repos/pydata/xarray/issues/6230 | IC_kwDOAMm_X849gC4e | shoyer 1217238 | 2022-02-07T19:01:08Z | 2022-02-07T19:01:08Z | MEMBER | Yes, I think replacing this with something like |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
[PERFORMANCE]: `isin` on `CFTimeIndex`-backed `Coordinate` slow 1120583442 | |
1028136906 | https://github.com/pydata/xarray/issues/6174#issuecomment-1028136906 | https://api.github.com/repos/pydata/xarray/issues/6174 | IC_kwDOAMm_X849SB_K | shoyer 1217238 | 2022-02-02T16:46:24Z | 2022-02-02T17:20:50Z | MEMBER | Have you seen In principle, it was designed for exactly this sort of thing. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
[FEATURE]: Read from/write to several NetCDF4 groups with a single file open/close operation 1108138101 | |
1020635094 | https://github.com/pydata/xarray/pull/6187#issuecomment-1020635094 | https://api.github.com/repos/pydata/xarray/issues/6187 | IC_kwDOAMm_X8481afW | shoyer 1217238 | 2022-01-24T23:01:14Z | 2022-01-24T23:01:14Z | MEMBER | Let me ponder the linked issue. This was not an intentional feature for |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
to_netcdf: docstrings for compute parameter 1112365912 | |
1011450955 | https://github.com/pydata/xarray/issues/6084#issuecomment-1011450955 | https://api.github.com/repos/pydata/xarray/issues/6084 | IC_kwDOAMm_X848SYRL | shoyer 1217238 | 2022-01-12T21:05:59Z | 2022-01-12T21:05:59Z | MEMBER |
I don't think that line adds any measurable overhead. It's just telling dask to delay computation of a single function. For sure this would be worth elaborating on in the Xarray docs! I wrote a little bit about this in the docs for Xarray-Beam: see "One recommended pattern" in https://xarray-beam.readthedocs.io/en/latest/read-write.html#writing-data-to-zarr |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Initialise zarr metadata without computing dask graph 1083621690 |
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);