issues
4 rows where state = "open", type = "issue" and user = 8382834 sorted by updated_at descending
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at ▲ | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1951543761 | I_kwDOAMm_X850UjHR | 8335 | ```DataArray.sel``` can silently pick up the nearest point, even if it is far away and the query is out of bounds | jerabaul29 8382834 | open | 0 | 13 | 2023-10-19T08:02:44Z | 2024-04-29T23:02:31Z | CONTRIBUTOR | What is your issue?
@paulina-t (who found a bug caused by the behavior we report here in a codebase, where it was badly messing things up). See the example notebook at https://github.com/jerabaul29/public_bug_reports/blob/main/xarray/2023_10_18/interp.ipynb .
Problem
It is always a bit risky to interpolate / find the nearest neighbor to a query or similar, as bad things can happen if querying a value for a point that is outside of the area that is represented. Fortunately, xarray returns NaN if performing
```python
import xarray as xr
import numpy as np

xr.__version__
# '2023.9.0'

data = np.array([[1, 2, 3], [4, 5, 6]])
lat = [10, 20]
lon = [120, 130, 140]
data_xr = xr.DataArray(data, coords={'lat': lat, 'lon': lon}, dims=['lat', 'lon'])
data_xr
# <xarray.DataArray (lat: 2, lon: 3)>
# array([[1, 2, 3],
#        [4, 5, 6]])
# Coordinates:
#   * lat      (lat) int64 10 20
#   * lon      (lon) int64 120 130 140

# interp is civilized: rather than wildly extrapolating, it returns NaN
data_xr.interp(lat=15, lon=125)
# <xarray.DataArray ()>
# array(3.)
# Coordinates:
#     lat      int64 15
#     lon      int64 125

data_xr.interp(lat=5, lon=125)
# <xarray.DataArray ()>
# array(nan)
# Coordinates:
#     lat      int64 5
#     lon      int64 125
```
Unfortunately,
```python
# sel is not as civilized: it happily finds the nearest neighbor, even if it is "on the one side" of the example data
data_xr.sel(lat=5, lon=125, method='nearest')
# <xarray.DataArray ()>
# array(2)
# Coordinates:
#     lat      int64 10
#     lon      int64 130
```
This can easily cause tricky bugs.
Discussion
Would it be possible for
when performing a
I understand that finding the nearest neighbor may still be useful / wanted in some cases even when being outside of the bounds of the dataset, but the fact that this happens silently by default has been causing bugs for us. Could either this default behavior be changed, or maybe enabled with a flag ( |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8335/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
2098252325 | I_kwDOAMm_X859EMol | 8653 | xarray v 2023.9.0: ```ValueError: unable to infer dtype on variable 'time'; xarray cannot serialize arbitrary Python objects``` | jerabaul29 8382834 | open | 0 | 1 | 2024-01-24T13:18:55Z | 2024-02-05T12:50:34Z | CONTRIBUTOR | What happened?
I tried to save an xarray dataset with datetimes as data for its time dimension to a nc file with
What did you expect to happen?
I expected xarray to automatically detect these were datetimes, and convert them to whatever format xarray likes to work with internally to dump it into a CF compatible file, following what is described at https://github.com/pydata/xarray/issues/2512 .
Minimal Complete Verifiable Example
```Python
import xarray as xr
import datetime

times = [
    datetime.datetime(2024, 1, 1, 1, 1, 1, tzinfo=datetime.timezone.utc),
    datetime.datetime(2024, 1, 1, 1, 1, 2, tzinfo=datetime.timezone.utc),
]
data = [1, 2]

xr_result = xr.Dataset(
    {
        'time': xr.DataArray(dims=["time"], data=times, attrs={
            "standard_name": "time",
        }),
        #
        'data': xr.DataArray(dims=["time"], data=data, attrs={
            "_FillValue": "NaN",
            "standard_name": "some_data",
        }),
    }
)

xr_result.to_netcdf("test.nc")
```
MVCE confirmation
Relevant log output
No response
Anything else we need to know?
The example is available as a notebook viewable at:
Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.11.5 (main, Sep 11 2023, 13:54:46) [GCC 11.2.0]
python-bits: 64
OS: Linux
OS-release: 6.5.0-14-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1
xarray: 2023.9.0
pandas: 2.0.3
numpy: 1.25.2
scipy: 1.11.3
netCDF4: 1.6.2
pydap: None
h5netcdf: None
h5py: 3.10.0
Nio: None
zarr: None
cftime: 1.6.3
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: 1.3.5
dask: 2023.9.2
distributed: 2023.9.2
matplotlib: 3.7.2
cartopy: 0.21.1
seaborn: 0.13.0
numbagg: None
fsspec: 2023.9.2
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 68.0.0
pip: 23.2.1
conda: None
pytest: None
mypy: None
IPython: 8.15.0
sphinx: None
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8653/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
1853356670 | I_kwDOAMm_X85ud_p- | 8074 | Add an ```only_variables``` or similar option to ```xarray.open_dataset``` and ```xarray.open_mfdataset``` | jerabaul29 8382834 | open | 0 | 7 | 2023-08-16T14:23:43Z | 2023-08-21T06:55:17Z | CONTRIBUTOR | Is your feature request related to a problem?
Sometimes, a variable in a nc file is corrupted or not "xarray friendly" and crashes opening a file (see for example https://github.com/pydata/xarray/issues/8072 ; I solved this on my machine by just
Describe the solution you'd like
We already can exclude variables with the
Describe alternatives you've considered
Additional context
No response |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8074/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
1552701403 | I_kwDOAMm_X85cjFfb | 7468 | Provide default APIs and functions for getting variable at a given location, based on some criteria / extrema conditions on other variables | jerabaul29 8382834 | open | 0 | 0 | 2023-01-23T08:35:43Z | 2023-01-23T08:35:43Z | CONTRIBUTOR | Is your feature request related to a problem?
No, this is related to a need that comes up regularly when working with netCDF files in geosciences.
Describe the solution you'd like
what is needed
There are many cases with netCDF files when one wants to find some location, or get variable(s) at some location, where the location is determined by a condition on some variables. A classical example, around which there are many Stack Overflow questions, online discussions, suggested "hacky" solutions, snippets, etc. available, is something like the following. Given a file that looks like this:
answer a question like:
I do not think there is a recommended, standard, simple one-liner to do this with xarray in general (in particular if the (latval, lonval) falls outside of the discrete set of mesh nodes). This means that there are plenty of ad hoc hacked solutions getting shared around to solve this. Having a default recommended way would likely help users quite a bit and save quite some work.
the existing ways to solve the need
As soon as the TLAT and TLON are not "aligned" with the ni and nj coordinates (if they exactly match a mesh point, then likely some
There are many more examples of questions that revolve around this kind of "query", and the answers are usually ad hoc, though a lot of the logic repeats itself, which makes me believe a general high quality / standard solution would be useful:
Also note that most of these answers use simple / relatively naive / inefficient algorithms, but I wonder if there are some examples of code that could be used to build this in an efficient way, see the discussions in:
It looks like there are some snippets available that can be used to do this more or less exactly, when the netCDF file follows some conventions:
It looks like there is no dedicated / recommended / default xarray solution to do this though. It would be great if xarray could offer a (set of) well tested, well implemented, efficient way(s) to solve this kind of need. I guess this is such a common need that providing a default solution with a default API, even if it is not optimal for all use cases, would be better than providing nothing at all and having users hack their own helper functions.
what xarray could implement
It would be great if xarray could offer built-in support for this. A few thoughts on how this could be done:
I wonder if thinking about a few APIs and agreeing on these would be helpful before implementing anything. Just for the sake of brainstorming, maybe some functions with this kind of "API pseudocode" on datasets could make sense / would be a nice standardization to offer to users? Any thoughts / ideas of better solution?
(note: for this last function, consider also providing a variant that performs interpolation outside of mesh points?) Maybe providing a few specializations for working with finding specific points in space would be useful? Like:
Describe alternatives you've considered
Writing my own small function, or re-using some snippet circulating on internet.
Additional context
No response |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7468/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue |
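The behavior requested in issue 8335 above (refusing a nearest-neighbor match when the query lies outside the coordinate range, unless the caller opts in) can be sketched in plain Python. This is only an illustration of the idea, not xarray's implementation: the helper name `nearest_in_bounds` and its `allow_out_of_bounds` flag are hypothetical and do not exist in xarray.

```python
from typing import Sequence


def nearest_in_bounds(coords: Sequence[float], query: float,
                      allow_out_of_bounds: bool = False) -> float:
    """Return the coordinate value closest to `query`.

    Hypothetical helper (not an xarray API): raises instead of silently
    snapping to the edge when `query` lies outside [min(coords), max(coords)].
    """
    if not coords:
        raise ValueError("coords must not be empty")
    lo, hi = min(coords), max(coords)
    if not allow_out_of_bounds and not (lo <= query <= hi):
        raise KeyError(f"query {query} is outside the coordinate range [{lo}, {hi}]")
    # Ties resolve to the first closest value in `coords`.
    return min(coords, key=lambda c: abs(c - query))


lat = [10, 20]
lon = [120, 130, 140]

print(nearest_in_bounds(lon, 126))  # in bounds: snaps to 130
try:
    nearest_in_bounds(lat, 5)       # out of bounds: refused by default
except KeyError as err:
    print("refused:", err)
print(nearest_in_bounds(lat, 5, allow_out_of_bounds=True))  # explicit opt-in: snaps to 10
```

In xarray itself, `sel` already accepts a `tolerance` argument alongside `method='nearest'`, which limits how far an inexact match may be from the requested label; whether that covers the use case reported in the issue is a judgment call for the reader.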
CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo] ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone] ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee] ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user] ON [issues] ([user]);
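The filter described at the top of this page (state = "open", type = "issue", user = 8382834, sorted by updated_at descending) maps onto a simple query against this schema. The sketch below uses Python's built-in `sqlite3` module with an in-memory database. As an assumption for brevity, the table is reduced to the columns the filter needs, and only the ids, numbers, and timestamps of the four rows shown above are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced version of the [issues] table: only the columns used by the filter.
conn.execute(
    """
    CREATE TABLE [issues] (
        [id] INTEGER PRIMARY KEY,
        [number] INTEGER,
        [user] INTEGER,
        [state] TEXT,
        [updated_at] TEXT,
        [type] TEXT
    )
    """
)
rows = [
    (1552701403, 7468, 8382834, "open", "2023-01-23T08:35:43Z", "issue"),
    (1853356670, 8074, 8382834, "open", "2023-08-21T06:55:17Z", "issue"),
    (1951543761, 8335, 8382834, "open", "2024-04-29T23:02:31Z", "issue"),
    (2098252325, 8653, 8382834, "open", "2024-02-05T12:50:34Z", "issue"),
]
conn.executemany("INSERT INTO [issues] VALUES (?, ?, ?, ?, ?, ?)", rows)

# ISO 8601 timestamps in a uniform format sort correctly as plain text.
query = """
    SELECT [number], [updated_at] FROM [issues]
    WHERE [state] = ? AND [type] = ? AND [user] = ?
    ORDER BY [updated_at] DESC
"""
numbers = [number for number, _ in conn.execute(query, ("open", "issue", 8382834))]
print(numbers)  # [8335, 8653, 8074, 7468]
```

The resulting order matches the row order of the table above, with the most recently updated issue first.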