github: issues: 3 rows where repo = 13221727, state = "closed" and user = 488992 sorted by updated_at descending

id	node_id	number	title	user	state	comments	created_at	updated_at ▲	closed_at	author_association	draft	pull_request	body	reactions	state_reason	repo	type
801672790	MDU6SXNzdWU4MDE2NzI3OTA=	4862	Obtaining fresh data from the disk when reopening a NetCDF file a second time	cjauvin 488992	closed	2	2021-02-04T22:09:09Z	2023-03-30T20:01:06Z	2023-03-30T20:01:06Z	CONTRIBUTOR			I have a program where I open a `.nc` file, do something with it, and want to reopen it later, after an external program has modified it. My issue is that the caching mechanism gives me the already opened version of the file, and not the refreshed version on the disk. To demonstrate this behavior, let's say you have two files, `bla.nc` and `bla_mod.nc`, with different content:
```python
import shutil
import xarray as xr

a = xr.open_dataset("bla.nc")

# Simulate external process modifying bla.nc while this script is running
shutil.copy("bla_mod.nc", "bla.nc")

# a.close()  # this is the only thing that WOULD make it work!
b = xr.open_dataset("bla.nc")
# Here I would expect b to be different than a, but it is not
```
I understand that the file SHOULD be `close`d (or that I should use a context manager) in an ideal world, and that it would then work, but let's say it is not (perhaps we forgot, or we're simply being lazy). At first I thought that I could use the `cache` parameter to `open_dataset` for that purpose, but after studying the code, I discovered that it is connected to a different caching mechanism than the one that is at play here.
After some experiments to better understand the code, I came to the conclusion that the only way my particular use case could be supported (that is, without using an explicit `close` or a context manager, which is, in itself, debatable, I admit) is that if the underlying `netCDF4._netCDF4.Dataset` file object is explicitly closed, like it is when flushed out of the cache: https://github.com/pydata/xarray/blob/5735e163bea43ec9bc3c2e640fbf25a1d4a9d0c0/xarray/backends/file_manager.py#L222 Given that I cannot really see how, in the particular case where the user calls `open_dataset` for a second time, she wouldn't want the fresh version on disk, it made me think that a fix for that behavior would be to simply explicitly flush the cache immediately after the `CachingFileManager` for a particular dataset has been created, as I do here: https://github.com/pydata/xarray/compare/master...cjauvin:netcdf-caching-bug Because I admit that this looks weird at first sight (why close an object immediately after having created it?), I imagine that a better option would probably be to add a boolean option to the `CachingFileManager`, in order to make it optional (something like `flush_and_close_file_if_already_present`). I think this subtle change would result in a more coherent experience with the exact use case that I present, but admittedly, I didn't study the overall code deeply enough to be certain that it couldn't result in unwanted side effects for some other backends.	{ "url": "https://api.github.com/repos/pydata/xarray/issues/4862/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1076265104	PR_kwDOAMm_X84vpj53	6059	Weighted quantile	cjauvin 488992	closed	19	2021-12-10T01:11:36Z	2022-03-27T20:36:22Z	2022-03-27T20:36:22Z	CONTRIBUTOR	0	pydata/xarray/pulls/6059	- [x] Tests added
- [x] Passes `pre-commit run --all-files`
- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
- [x] New functions/methods are listed in `api.rst`

This is a follow-up to https://github.com/pydata/xarray/pull/5870/, which adds a weighted `quantile` function. The question of how to precisely define the weighted quantile function is surprisingly complex, and this implementation offers a compromise in terms of simplicity and compatibility:
- The only interpolation method supported is the so-called "Type 7", as explained in https://aakinshin.net/posts/weighted-quantiles/, which proposes an R implementation that I have adapted.
- It turns out that Type 7 is apparently the most "popular" one, at least in the Python world: it corresponds to the default `linear` interpolation option of `numpy.quantile` (https://numpy.org/doc/stable/reference/generated/numpy.quantile.html), which is also the basis of xarray's already existing non-weighted quantile function.
- I have taken care to make sure that the results of this new function, with equal weights, are equivalent to those of the already existing, non-weighted function (when used with its default interpolation option).

The interpolation question is so complex and confusing that entire articles have been written about it, as mentioned in the blog post above, in particular this one, which establishes the "nine types" taxonomy used, implicitly or not, by many software packages: https://doi.org/10.2307/2684934. The situation seems even more complex in the NumPy world, where many discussions and suggestions are aimed toward trying to improve the consistency of the API.
The current non-weighted situation has the 9 options, as well as 4 extra legacy ones: https://github.com/numpy/numpy/blob/376ad691fe4df77e502108d279872f56b30376dc/numpy/lib/function_base.py#L4177-L4203 This PR cuts the Gordian knot by offering only one interpolation option. However, given that its implementation is based on `apply_ufunc` (in a very similar way to xarray's already existing non-weighted `quantile` function, which also uses `apply_ufunc` with `np.quantile`), in the event that `np.quantile` ever gains a `weights` keyword argument, it would be very easy to swap it in. That way, xarray's weighted `quantile` could lose a little bit of code, and gain a plethora of interpolation options.	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6059/reactions", "total_count": 2, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 1, "rocket": 0, "eyes": 1 }		xarray 13221727	pull
1027640127	PR_kwDOAMm_X84tQwrV	5870	Add var and std to weighted computations	cjauvin 488992	closed	8	2021-10-15T17:13:31Z	2022-01-04T21:20:58Z	2021-10-28T11:02:54Z	CONTRIBUTOR	0	pydata/xarray/pulls/5870	[x] Tests added [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` [x] New functions/methods are listed in `api.rst` This follows https://github.com/pydata/xarray/pull/2922 to add `var`, `std` and `sum_of_squares` to `DataArray.weighted` and `Dataset.weighted`. I would also like to add weighted quantile, eventually.	{ "url": "https://api.github.com/repos/pydata/xarray/issues/5870/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	pull
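
The body of pull request 6059 above relies on the "Type 7" interpolation rule, which it identifies with the default `linear` option of `numpy.quantile`. As a rough illustration only, here is a minimal pure-Python sketch of the unweighted Type 7 rule. The function name `quantile_type7` is hypothetical, and this is not the weighted implementation from the PR; it only shows the equal-weights case that the PR says its results agree with.

```python
import math


def quantile_type7(values, q):
    """Unweighted Type 7 quantile (linear interpolation between order statistics).

    Hypothetical helper for illustration; not xarray's or the PR's code.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be in [0, 1]")

    xs = sorted(values)
    # Fractional (0-based) position of the quantile among the sorted values.
    h = (len(xs) - 1) * q
    lo = math.floor(h)
    hi = math.ceil(h)
    # Interpolate linearly between the two neighbouring order statistics.
    return xs[lo] + (h - lo) * (xs[hi] - xs[lo])


median = quantile_type7([1, 2, 3, 4], 0.5)
first_quartile = quantile_type7([1, 2, 3, 4], 0.25)
print(median, first_quartile)
```

With four sorted values, the median falls halfway between the second and third value, and the first quartile falls three quarters of the way from the first to the second.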

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
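
The filter described in the page title (repo = 13221727, state = "closed", user = 488992, sorted by `updated_at` descending) can be reproduced against this schema. The following is a minimal sketch using Python's built-in `sqlite3` module and an in-memory database. It assumes a trimmed copy of the table with only the columns the query touches; the helper name `closed_issues_by_user` and the extra open row are hypothetical and exist only to show the filter doing something.

```python
import sqlite3

# Trimmed copy of the schema above (assumption: only the queried columns are kept).
SCHEMA = """
CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER,
   [state] TEXT,
   [updated_at] TEXT,
   [repo] INTEGER,
   [type] TEXT
);
CREATE INDEX [idx_issues_repo] ON [issues] ([repo]);
CREATE INDEX [idx_issues_user] ON [issues] ([user]);
"""

ROWS = [
    (801672790, 4862,
     "Obtaining fresh data from the disk when reopening a NetCDF file a second time",
     488992, "closed", "2023-03-30T20:01:06Z", 13221727, "issue"),
    (1076265104, 6059, "Weighted quantile",
     488992, "closed", "2022-03-27T20:36:22Z", 13221727, "pull"),
    (1027640127, 5870, "Add var and std to weighted computations",
     488992, "closed", "2022-01-04T21:20:58Z", 13221727, "pull"),
    # Hypothetical row, not in the export: an open issue the filter should skip.
    (1, 1, "Hypothetical open issue",
     488992, "open", "2024-01-01T00:00:00Z", 13221727, "issue"),
]


def closed_issues_by_user(conn, repo_id, user_id):
    """Return (number, title, updated_at) for closed issues, newest update first."""
    query = """
        SELECT [number], [title], [updated_at]
        FROM [issues]
        WHERE [repo] = ? AND [state] = 'closed' AND [user] = ?
        ORDER BY [updated_at] DESC
    """
    return conn.execute(query, (repo_id, user_id)).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany("INSERT INTO [issues] VALUES (?, ?, ?, ?, ?, ?, ?, ?)", ROWS)

result = closed_issues_by_user(conn, 13221727, 488992)
for number, title, updated_at in result:
    print(number, updated_at, title)
```

Because `updated_at` is stored as ISO 8601 text in a single time zone, plain string ordering matches chronological ordering, which is why `ORDER BY [updated_at] DESC` works without date parsing.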
