issue_comments


13 rows where issue = 993563624 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
921988569 https://github.com/pydata/xarray/issues/5786#issuecomment-921988569 https://api.github.com/repos/pydata/xarray/issues/5786 IC_kwDOAMm_X8429G3Z max-sixty 5635139 2021-09-17T18:13:03Z 2021-09-17T18:13:03Z MEMBER

Hi @nickdoty — I'm less experienced in these than others, but the float conversion issue makes sense.

Generally I would use something like a tolerance for floats; relying on exact values is liable to hit these sorts of things at the moment.

I agree the current state isn't ideal. I think there are issues for float conversions on the tracker, and feel free to create a new one if there isn't.
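A minimal sketch of the tolerance idea applied to a slice, under stated assumptions: the `lat` values are a synthetic stand-in built by rounding the 0.01-degree grid to float32 and widening to float64 (as in the `Float64Index` output quoted in this thread), and `half_step` is a hypothetical padding equal to half the grid spacing. Padding the bounds means the selection no longer depends on which side of 10.45 the stored value happens to fall.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the latitudes: float32 roundings widened to float64.
grid = [10.43, 10.44, 10.45, 10.46, 10.47, 10.48, 10.49, 10.50, 10.51, 10.52, 10.53]
lat = np.array(grid, dtype="float32").astype("float64")
sst = xr.DataArray(np.arange(lat.size, dtype="float64"), coords={"lat": lat}, dims="lat")

half_step = 0.005  # assumption: half of the 0.01-degree grid spacing

exact = sst.sel(lat=slice(10.45, 10.52))
padded = sst.sel(lat=slice(10.45 - half_step, 10.52 + half_step))

print(exact.sizes["lat"])   # 6: the stored 10.45 and 10.52 fall just outside the bounds
print(padded.sizes["lat"])  # 8: both end points are now included
```

Note that padding the upper bound also pulls in the 10.52 point; pad only the side where the end point should be included.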

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Unexpected behavior when using slice in a sel() statement 993563624
921062321 https://github.com/pydata/xarray/issues/5786#issuecomment-921062321 https://api.github.com/repos/pydata/xarray/issues/5786 IC_kwDOAMm_X8425kux nickdoty 89942016 2021-09-16T16:43:52Z 2021-09-16T16:43:52Z NONE

@max-sixty after digging through the data sets using ncdump and h5dump, we confirmed that the data does look good. In doing so, I created a small sample data set and program that I believe narrows down where the problem lies. I've attached the file used for testing (it's zipped, since GitHub doesn't support .nc4 files), but here is the code snippet:

```python
import xarray as xr

data = xr.open_dataset('test_slice.nc4')
print(data)
print(data.indexes['lat'])
```

And here is the output:

    <xarray.Dataset>
    Dimensions:       (lat: 11, lon: 1, time: 1)
    Coordinates:
      * time          (time) datetime64[ns] 2018-01-30T09:00:00
      * lat           (lat) float32 10.43 10.44 10.45 10.46 ... 10.51 10.52 10.53
      * lon           (lon) float32 -68.0
    Data variables:
        analysed_sst  (time, lat, lon) float32 ...
    Attributes: (12/47)
        Conventions:       CF-1.5
        title:             Daily MUR SST, Final product
        summary:           A merged, multi-sensor L4 Foundation SST anal...
        references:        http://podaac.jpl.nasa.gov/Multi-scale_Ultra-...
        institution:       Jet Propulsion Laboratory
        history:           created at nominal 4-day latency; replaced nr...
        ...                ...
        project:           NASA Making Earth Science Data Records for Us...
        publisher_name:    GHRSST Project Office
        publisher_url:     http://www.ghrsst.org
        publisher_email:   ghrsst-po@nceo.ac.uk
        processing_level:  L4
        cdm_data_type:     grid
    Float64Index([10.430000305175781,   10.4399995803833, 10.449999809265137,
                  10.460000038146973, 10.470000267028809, 10.479999542236328,
                  10.489999771118164,               10.5, 10.510000228881836,
                  10.520000457763672, 10.529999732971191],
                 dtype='float64', name='lat')

From the data source, the latitudes are of type float32, but when they are accessed via data.indexes['lat'] they are float64. We believe this conversion may be why the values appear to change. If possible, could you point me in the right direction for addressing this? I don't know if there is someone knowledgeable about this, if there's a way I can handle it in my program, or if another issue ticket is the best route.

Thank you!
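The float32-to-float64 widening described above can be checked without xarray. The sketch below uses only the standard library; `as_float32` is a hypothetical helper that rounds a Python float to the nearest float32 and widens it back, which should mirror what happens when a float32 coordinate is exposed as a float64 index. The widening itself is exact; the difference from 10.45 comes from the earlier rounding to float32.

```python
import struct


def as_float32(value: float) -> float:
    """Round a Python float (float64) to the nearest float32, then widen it back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


lower = as_float32(10.45)
upper = as_float32(10.52)

print(lower)          # 10.449999809265137, as in the Float64Index output
print(lower < 10.45)  # True: the stored label sits just below the requested bound
print(upper > 10.52)  # True: the stored label sits just above the requested bound
print(as_float32(10.5) == 10.5)  # True: 10.5 is exactly representable
```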

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Unexpected behavior when using slice in a sel() statement 993563624
920332562 https://github.com/pydata/xarray/issues/5786#issuecomment-920332562 https://api.github.com/repos/pydata/xarray/issues/5786 IC_kwDOAMm_X8422ykS nickdoty 89942016 2021-09-15T19:56:41Z 2021-09-15T19:56:41Z NONE

@max-sixty it does - it doesn't solve my problem, but it appears that the problem isn't with .slice but instead with the latitude values themselves. I'll go ahead and close this because I don't know if the issue is still with xarray or the creation of the netCDF4 file. But it'll be outside of the scope of this issue so I'll close it regardless. Thank you for your help!

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Unexpected behavior when using slice in a sel() statement 993563624
920287535 https://github.com/pydata/xarray/issues/5786#issuecomment-920287535 https://api.github.com/repos/pydata/xarray/issues/5786 IC_kwDOAMm_X8422nkv max-sixty 5635139 2021-09-15T18:45:06Z 2021-09-15T18:45:06Z MEMBER

Thanks for trying @nickdoty

Here's that slightly modified — does this help explain the 44.99 point?

```python

In [3]: import xarray as xr
   ...:
   ...: data = xr.open_dataset('https://podaac-opendap.jpl.nasa.gov/opendap/allData/ghrsst/data/GDS2/L4/GLOB/JPL/MUR/'
   ...:                        'v4.1/2018/030/20180130090000-JPL-L4_GHRSST-SSTfnd-MUR-GLOB-v02.0-fv04.1.nc')
   ...: print(data['analysed_sst'].sel(lat=slice(None, 10.45), lon=-68.00).indexes['lat'])
   ...: print(data['analysed_sst'].sel(lat=slice(10.45, None), lon=-68.00).indexes['lat'])
   ...:
Float64Index([-89.98999786376953,  -89.9800033569336, -89.97000122070312,
              -89.95999908447266, -89.94999694824219, -89.94000244140625,
              -89.93000030517578, -89.91999816894531, -89.91000366210938,
               -89.9000015258789,
              ...
              10.359999656677246, 10.369999885559082, 10.380000114440918,
              10.390000343322754, 10.399999618530273,  10.40999984741211,
              10.420000076293945, 10.430000305175781,   10.4399995803833,
              10.449999809265137],
             dtype='float64', name='lat', length=10045)
Float64Index([10.460000038146973, 10.470000267028809, 10.479999542236328,
              10.489999771118164,               10.5, 10.510000228881836,
              10.520000457763672, 10.529999732971191, 10.539999961853027,
              10.550000190734863,
              ...
                89.9000015258789,  89.91000366210938,  89.91999816894531,
               89.93000030517578,  89.94000244140625,  89.94999694824219,
               89.95999908447266,  89.97000122070312,   89.9800033569336,
               89.98999786376953],
             dtype='float64', name='lat', length=7954)
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Unexpected behavior when using slice in a sel() statement 993563624
920276263 https://github.com/pydata/xarray/issues/5786#issuecomment-920276263 https://api.github.com/repos/pydata/xarray/issues/5786 IC_kwDOAMm_X8422k0n nickdoty 89942016 2021-09-15T18:33:58Z 2021-09-15T18:33:58Z NONE

This is as clear an example as I can make of the problem.

```python
import xarray as xr

data = xr.open_dataset('https://podaac-opendap.jpl.nasa.gov/opendap/allData/ghrsst/data/GDS2/L4/GLOB/JPL/MUR/'
                       'v4.1/2018/030/20180130090000-JPL-L4_GHRSST-SSTfnd-MUR-GLOB-v02.0-fv04.1.nc')
ex = data['analysed_sst'].sel(lat=slice(10.45, 10.52), lon=-68.00).values[0]

print("Items in the array:")
print(ex)
print("Number of items in the array: " + str(len(ex)))
```

Output:

    Items in the array:
    [ nan nan nan nan 299.485 299.46 ]
    Number of items in the array: 6

Where that says `Number of items in the array: 6`, there should be 7 items in the array.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Unexpected behavior when using slice in a sel() statement 993563624
920264934 https://github.com/pydata/xarray/issues/5786#issuecomment-920264934 https://api.github.com/repos/pydata/xarray/issues/5786 IC_kwDOAMm_X8422iDm max-sixty 5635139 2021-09-15T18:16:42Z 2021-09-15T18:16:42Z MEMBER

@nickdoty please show the output where this is happening. The output showing in my response above shows a value (10.449999809265137) less than 10.45.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Unexpected behavior when using slice in a sel() statement 993563624
920206755 https://github.com/pydata/xarray/issues/5786#issuecomment-920206755 https://api.github.com/repos/pydata/xarray/issues/5786 IC_kwDOAMm_X8422T2j nickdoty 89942016 2021-09-15T17:06:51Z 2021-09-15T17:06:51Z NONE

The code in my ex_2 encapsulates the problem - I should be receiving data for the point at (10.45, 68.00) but I am not.

    ex_2 = data['analysed_sst'].sel(lat=slice(10.45, 10.52), lon=longitude).indexes['lat']

I will look into the tolerance keyword - hopefully it will provide a temporary solution.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Unexpected behavior when using slice in a sel() statement 993563624
918589726 https://github.com/pydata/xarray/issues/5786#issuecomment-918589726 https://api.github.com/repos/pydata/xarray/issues/5786 IC_kwDOAMm_X842wJEe max-sixty 5635139 2021-09-13T21:24:40Z 2021-09-13T21:24:40Z MEMBER

Oh and check out the tolerance keyword to sel
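A short sketch of how the `tolerance` keyword could be used, assuming a synthetic stand-in for the latitudes (float32 roundings of the 0.01-degree grid widened to float64, as in the index output in this thread). `tolerance` works together with `method="nearest"` for point lookups; to my understanding it is not supported in combination with slice indexers, so it helps with single labels rather than ranges.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the latitudes: float32 roundings widened to float64.
grid = [10.43, 10.44, 10.45, 10.46, 10.47, 10.48, 10.49, 10.50, 10.51, 10.52, 10.53]
lat = np.array(grid, dtype="float32").astype("float64")
sst = xr.DataArray(np.arange(lat.size, dtype="float64"), coords={"lat": lat}, dims="lat")

# Nearest-label lookup, accepted only if the label is within 0.005 degrees of the request.
point = sst.sel(lat=10.45, method="nearest", tolerance=0.005)
print(float(point["lat"]))  # the stored label closest to 10.45

# A request with no label inside the tolerance raises KeyError.
try:
    sst.sel(lat=10.60, method="nearest", tolerance=0.005)
    found = True
except KeyError:
    found = False
print(found)
```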

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Unexpected behavior when using slice in a sel() statement 993563624
918578097 https://github.com/pydata/xarray/issues/5786#issuecomment-918578097 https://api.github.com/repos/pydata/xarray/issues/5786 IC_kwDOAMm_X842wGOx max-sixty 5635139 2021-09-13T21:08:32Z 2021-09-13T21:08:32Z MEMBER

Thanks for explaining @nickdoty . This does seem to support the idea that this is happening because the value is 10.449999809265137, and not 10.45.

If that's not the case, could you show a working example where selecting with 10.45 fails to select a value that's equal to or above 10.45? Probably the easiest way is to select the values along lat and show the result.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Unexpected behavior when using slice in a sel() statement 993563624
918490945 https://github.com/pydata/xarray/issues/5786#issuecomment-918490945 https://api.github.com/repos/pydata/xarray/issues/5786 IC_kwDOAMm_X842vw9B nickdoty 89942016 2021-09-13T19:05:29Z 2021-09-13T19:05:29Z NONE

No need to apologize. Some understanding of the data set may help. The MUR-SST data set is a large data set that displays the sea surface temperature (in K) at a given longitude and latitude. The precision is 2 decimal places, so there is a value every .01 deg. The values I displayed above that are nan are expected, as they are over land, and thus can't reflect the sea-surface temperature (this was discovered as part of another search. Ideally I would have liked to find an example where it is completely over water, but I was unable to). Given that, my code opens the dataset generated by PO.DAAC and then each example selects specifically the analysed_sst portion, and then selects the data over a slice of the latitude, where the longitude stays consistent.

From my understanding, the slice mechanism is inclusive on the first value, and exclusive on the second value. Given that, I should from ex_1 get values for (where these are lat/long pairs) (10.47, 68.00), (10.48, 68.00), (10.49, 68.00), (10.50, 68.00), (10.51, 68.00) - and I do receive those values - they correspond to `nan`, `nan`, `nan`, `299.485`, `299.46`.

However, in ex_2 - I'd additionally expect values for (10.45, 68.00) and (10.46, 68.00) but I only get the additional value for (10.46, 68.00). Additionally, from my understanding, the slice mechanism takes the given input and finds the next nearest sequential value to start at, if the given value does not exist. So in ex_3, I changed the value from 10.45 to 10.449, and I then get a value for (10.45, 68.00). I think at some point there's some math taking place on the given inputs and it's causing that 10.45 to become something like 10.450000000000213 (I have no idea what the value is - this is just an example) and thus, it finds the next sequential value at 10.46. However, I don't know where this is taking place - and whether this is an xarray issue or potentially somewhere else.

Ultimately, for our less technical users, we'd expect them to do slice(10.45, 10.52) and get a value corresponding to 10.45. But unfortunately they do not.
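On the slice semantics discussed above: to my understanding, label-based slices in xarray follow pandas and include both end points when the labels match exactly, unlike positional Python slices. The sketch below uses made-up latitudes that are exactly representable in binary floating point, so rounding plays no part in the result.

```python
import numpy as np
import xarray as xr

# Made-up labels in steps of 0.25, all exactly representable as binary floats.
lat = np.array([10.0, 10.25, 10.5, 10.75, 11.0])
da = xr.DataArray(np.arange(lat.size, dtype="float64"), coords={"lat": lat}, dims="lat")

selected = da.sel(lat=slice(10.25, 10.75))
print(selected["lat"].values.tolist())  # [10.25, 10.5, 10.75]: both bounds are included
```

If this holds, the missing 10.52 point in the examples would come from the stored label lying slightly above 10.52 rather than from an exclusive upper bound.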

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Unexpected behavior when using slice in a sel() statement 993563624
918453939 https://github.com/pydata/xarray/issues/5786#issuecomment-918453939 https://api.github.com/repos/pydata/xarray/issues/5786 IC_kwDOAMm_X842vn6z max-sixty 5635139 2021-09-13T18:16:43Z 2021-09-13T18:16:43Z MEMBER

Sorry if I'm being slow in understanding. My question was whether this was caused by the 10.449999809265137 value being slightly below 10.45.

If not, could you show in the example both the values being selected and the values that it's selecting? Probably the easiest way is to select the values along lat and show the result.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Unexpected behavior when using slice in a sel() statement 993563624
918249382 https://github.com/pydata/xarray/issues/5786#issuecomment-918249382 https://api.github.com/repos/pydata/xarray/issues/5786 IC_kwDOAMm_X842u1-m nickdoty 89942016 2021-09-13T14:26:57Z 2021-09-13T14:26:57Z NONE

The issue is that the number of analysed_sst values returned from the file is not what we expect. In example 1, we should receive values for 10.47, 10.48, 10.49, 10.50, 10.51 - a total of 5. In example 2, we should receive values for 10.45, 10.46, 10.47, 10.48, 10.49, 10.50, 10.51 - a total of 7. In example 3, we should receive values for 10.45, 10.46, 10.47, 10.48, 10.49, 10.50, 10.51 - a total of 7.

These are the actual results we get:

    [ nan nan nan 299.485 299.46 ]
    [ nan nan nan nan 299.485 299.46 ]
    [ nan nan nan nan nan 299.485 299.46 ]

The number of results from example 2 is only 6 - because there isn't a value for 10.45. It appears, somehow the number 10.45 is being altered/recalculated so it is slightly larger than 10.45, so the slice value starts at the next valid value of 10.46.
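The counts above can be reproduced with a simplified model of a label slice on a sorted index. This is only a sketch using the standard library's `bisect`, not xarray's or pandas' actual implementation; `label_slice` is a hypothetical helper, and the labels are copied from the `Float64Index` output in this thread. In this model the requested 10.45 is not altered; it is the stored label that sits slightly below it.

```python
from bisect import bisect_left, bisect_right

# Latitude labels as printed in the thread's Float64Index output.
lats = [
    10.430000305175781, 10.4399995803833, 10.449999809265137,
    10.460000038146973, 10.470000267028809, 10.479999542236328,
    10.489999771118164, 10.5, 10.510000228881836,
    10.520000457763672, 10.529999732971191,
]


def label_slice(labels, start, stop):
    """Return the labels with start <= label <= stop from a sorted list."""
    lo = bisect_left(labels, start)   # first position with label >= start
    hi = bisect_right(labels, stop)   # one past the last position with label <= stop
    return labels[lo:hi]


print(len(label_slice(lats, 10.47, 10.52)))   # 5
print(len(label_slice(lats, 10.45, 10.52)))   # 6: 10.449999809265137 < 10.45
print(len(label_slice(lats, 10.449, 10.52)))  # 7
```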

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Unexpected behavior when using slice in a sel() statement 993563624
917694806 https://github.com/pydata/xarray/issues/5786#issuecomment-917694806 https://api.github.com/repos/pydata/xarray/issues/5786 IC_kwDOAMm_X842sulW max-sixty 5635139 2021-09-12T19:20:57Z 2021-09-12T19:20:57Z MEMBER

Thanks for the issue @nickdoty . Is it a rounding issue?

```python

In [5]: import xarray as xr
   ...: longitude = -68.0
   ...:
   ...: data = xr.open_dataset('https://podaac-opendap.jpl.nasa.gov/opendap/allData/ghrsst/data/GDS2/L4/GLOB/JPL/MUR/'
   ...:                        'v4.1/2018/030/20180130090000-JPL-L4_GHRSST-SSTfnd-MUR-GLOB-v02.0-fv04.1.nc')
   ...: ex_1 = data['analysed_sst'].sel(lat=slice(10.47, 10.52), lon=longitude).indexes['lat']
   ...: ex_2 = data['analysed_sst'].sel(lat=slice(10.45, 10.52), lon=longitude).indexes['lat']
   ...: ex_3 = data['analysed_sst'].sel(lat=slice(10.449, 10.52), lon=longitude).indexes['lat']
   ...:
   ...: print(ex_1)
   ...: print(ex_2)
   ...: print(ex_3)
Float64Index([10.470000267028809, 10.479999542236328, 10.489999771118164,
              10.5, 10.510000228881836],
             dtype='float64', name='lat')
Float64Index([10.460000038146973, 10.470000267028809, 10.479999542236328,
              10.489999771118164, 10.5, 10.510000228881836],
             dtype='float64', name='lat')
Float64Index([10.449999809265137, 10.460000038146973, 10.470000267028809,
              10.479999542236328, 10.489999771118164, 10.5,
              10.510000228881836],
             dtype='float64', name='lat')
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Unexpected behavior when using slice in a sel() statement 993563624


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 15.292ms · About: xarray-datasette