issue_comments

23 rows where issue = 1385031286 sorted by updated_at descending

user 7

  • kthyng 6
  • ocefpaf 5
  • jhamman 5
  • dcherian 4
  • pnorton-usgs 1
  • keewis 1
  • trexfeathers 1

author_association 3

  • MEMBER 10
  • NONE 8
  • CONTRIBUTOR 5

issue 1

  • open_mfdataset parallel=True failing with netcdf4 >= 1.6.1 · 23
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1494957700 https://github.com/pydata/xarray/issues/7079#issuecomment-1494957700 https://api.github.com/repos/pydata/xarray/issues/7079 IC_kwDOAMm_X85ZGz6E kthyng 3487237 2023-04-03T20:45:42Z 2023-04-03T20:45:42Z NONE

I'm not really sure what to think any more — we have had a real, consistent issue that seemed to fit the description of this issue which went away with one of the fixes above (using single threading), but using local files at the moment seems to remove the error even with the current version of xarray and either parallel option.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset parallel=True failing with netcdf4 >= 1.6.1 1385031286
1494950846 https://github.com/pydata/xarray/issues/7079#issuecomment-1494950846 https://api.github.com/repos/pydata/xarray/issues/7079 IC_kwDOAMm_X85ZGyO- kthyng 3487237 2023-04-03T20:39:02Z 2023-04-03T20:39:02Z NONE

Ok I downloaded the two files and indeed there is no error with parallel=True nor parallel=False.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset parallel=True failing with netcdf4 >= 1.6.1 1385031286
1494560788 https://github.com/pydata/xarray/issues/7079#issuecomment-1494560788 https://api.github.com/repos/pydata/xarray/issues/7079 IC_kwDOAMm_X85ZFTAU ocefpaf 950575 2023-04-03T15:44:18Z 2023-04-03T15:44:18Z CONTRIBUTOR

@kthyng those files are on a remote server and that may not be the segfault from the original issue here. It may be a server that is not happy with parallel access. Can you try that with local files?

PS: you can also try with netcdf4<1.6.1; if that also fails, the server is a more likely cause than the issue here.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset parallel=True failing with netcdf4 >= 1.6.1 1385031286
1494542372 https://github.com/pydata/xarray/issues/7079#issuecomment-1494542372 https://api.github.com/repos/pydata/xarray/issues/7079 IC_kwDOAMm_X85ZFOgk kthyng 3487237 2023-04-03T15:31:54Z 2023-04-03T15:31:54Z NONE

@jhamman Yes, using the PR version of xarray, with parallel=True I met the error but with parallel=False I did not.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset parallel=True failing with netcdf4 >= 1.6.1 1385031286
1492739561 https://github.com/pydata/xarray/issues/7079#issuecomment-1492739561 https://api.github.com/repos/pydata/xarray/issues/7079 IC_kwDOAMm_X85Y-WXp jhamman 2443309 2023-04-01T00:00:24Z 2023-04-01T00:00:24Z MEMBER

@kthyng - any difference when running with parallel=True vs parallel=False?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset parallel=True failing with netcdf4 >= 1.6.1 1385031286
1492670307 https://github.com/pydata/xarray/issues/7079#issuecomment-1492670307 https://api.github.com/repos/pydata/xarray/issues/7079 IC_kwDOAMm_X85Y-Fdj kthyng 3487237 2023-03-31T22:14:06Z 2023-03-31T22:14:06Z NONE

I was able to reproduce the error with the current version of xarray and then have it work with the new version. Here is what I did:

Make new environment:

    conda create -n test_xarray xarray netcdf4 dask

Check version:

```
(test_xarray) kthyng@adams ~ % conda list xarray
# packages in environment at /Users/kthyng/miniconda3/envs/test_xarray:
#
# Name                    Version                   Build  Channel
xarray                    2023.3.0           pyhd8ed1ab_0    conda-forge
```

In Python:

    import xarray as xr
    urls = [
        "https://opendap.co-ops.nos.noaa.gov/thredds/dodsC/NOAA/WCOFS/MODELS/2023/03/31/nos.wcofs.2ds.n001.20230331.t03z.nc",
        "https://opendap.co-ops.nos.noaa.gov/thredds/dodsC/NOAA/WCOFS/MODELS/2023/03/31/nos.wcofs.2ds.n002.20230331.t03z.nc",
    ]
    xr.open_mfdataset(urls)

returns the following the first time `xr.open_mfdataset(urls)` is run, but the second time it runs fine:

    OSError: [Errno -70] NetCDF: DAP server error: 'https://opendap.co-ops.nos.noaa.gov/thredds/dodsC/NOAA/WCOFS/MODELS/2023/03/31/nos.wcofs.2ds.n002.20230331.t03z.nc'

Next I used the PR version of xarray and reran the code above and then it was able to read in ok on the first try.

Note: after a week or so those files won't work and will have to be updated with something more current but the pattern to use is clear from the file names.
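
The pattern described above (an `OSError` on the first call that disappears on the second) could be papered over with a small retry wrapper. The following is only a sketch under stated assumptions: `retry_call` and `flaky_open` are hypothetical names, the fake opener stands in for `xr.open_mfdataset(urls)`, and retrying hides the symptom rather than addressing the cause.

```python
import time


def retry_call(func, attempts=2, delay=0.0, exceptions=(OSError,)):
    """Call func(); on a listed exception, wait and try again.

    Hypothetical helper, not part of xarray or netCDF4.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            time.sleep(delay)


# Stand-in for the remote open: fails once, then succeeds.
calls = {"n": 0}


def flaky_open():
    calls["n"] += 1
    if calls["n"] == 1:
        raise OSError(-70, "NetCDF: DAP server error")
    return "dataset"


result = retry_call(flaky_open)
print(result, calls["n"])
```

With real data the call might look like `retry_call(lambda: xr.open_mfdataset(urls))`, assuming the failure really is transient.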

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset parallel=True failing with netcdf4 >= 1.6.1 1385031286
1490420229 https://github.com/pydata/xarray/issues/7079#issuecomment-1490420229 https://api.github.com/repos/pydata/xarray/issues/7079 IC_kwDOAMm_X85Y1gIF kthyng 3487237 2023-03-30T14:36:39Z 2023-03-30T14:36:39Z NONE

@jhamman Sorry for my delay — I started this the other day and got waylaid. I'll try to get back to it today or tomorrow.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset parallel=True failing with netcdf4 >= 1.6.1 1385031286
1485339487 https://github.com/pydata/xarray/issues/7079#issuecomment-1485339487 https://api.github.com/repos/pydata/xarray/issues/7079 IC_kwDOAMm_X85YiHtf jhamman 2443309 2023-03-27T15:28:39Z 2023-03-27T15:28:39Z MEMBER

@cefect, @pnorton-usgs, @kthyng - Is this still an issue for you? If so, could you try to run the xarray test suite in #7079 and report back? We haven't been able to trigger the error reported here so we could use some help running the test suite in an "offending" environment.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset parallel=True failing with netcdf4 >= 1.6.1 1385031286
1409716721 https://github.com/pydata/xarray/issues/7079#issuecomment-1409716721 https://api.github.com/repos/pydata/xarray/issues/7079 IC_kwDOAMm_X85UBpHx jhamman 2443309 2023-01-31T03:57:43Z 2023-01-31T03:57:43Z MEMBER

Update: I pushed two new tests to #7488. They are not failing in our test env. If someone that has reported this issue could try running the test suite, that would be super helpful in terms of confirming where the problem lies.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset parallel=True failing with netcdf4 >= 1.6.1 1385031286
1409358970 https://github.com/pydata/xarray/issues/7079#issuecomment-1409358970 https://api.github.com/repos/pydata/xarray/issues/7079 IC_kwDOAMm_X85UARx6 jhamman 2443309 2023-01-30T21:22:22Z 2023-01-30T23:33:01Z MEMBER

I've opened #7488 which I think has actually exposed a few other failures. I doubt I'll have much time to put into this issue in the near term so anyone should feel free to jump in here.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset parallel=True failing with netcdf4 >= 1.6.1 1385031286
1408427455 https://github.com/pydata/xarray/issues/7079#issuecomment-1408427455 https://api.github.com/repos/pydata/xarray/issues/7079 IC_kwDOAMm_X85T8uW_ trexfeathers 40734014 2023-01-30T11:09:15Z 2023-01-30T11:09:15Z NONE

> iris has the pin in their package metadata

Note that this will hopefully be removed soon - SciTools/iris#5095 - but the reviewer has been assigned to other urgent work so it's paused right now.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset parallel=True failing with netcdf4 >= 1.6.1 1385031286
1404127967 https://github.com/pydata/xarray/issues/7079#issuecomment-1404127967 https://api.github.com/repos/pydata/xarray/issues/7079 IC_kwDOAMm_X85TsUrf keewis 14808389 2023-01-25T19:32:12Z 2023-01-25T19:37:14Z MEMBER

iris has the pin in their package metadata

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset parallel=True failing with netcdf4 >= 1.6.1 1385031286
1404118074 https://github.com/pydata/xarray/issues/7079#issuecomment-1404118074 https://api.github.com/repos/pydata/xarray/issues/7079 IC_kwDOAMm_X85TsSQ6 dcherian 2448579 2023-01-25T19:22:52Z 2023-01-25T19:22:52Z MEMBER

> so I'm surprised we're not catching this.

Turns out we're running tests on an older working version (logs) even though we don't have a pin:

    netcdf4 1.6.0 nompi_py310h0a86a1f_103 conda-forge

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset parallel=True failing with netcdf4 >= 1.6.1 1385031286
1404113750 https://github.com/pydata/xarray/issues/7079#issuecomment-1404113750 https://api.github.com/repos/pydata/xarray/issues/7079 IC_kwDOAMm_X85TsRNW jhamman 2443309 2023-01-25T19:18:37Z 2023-01-25T19:18:37Z MEMBER

It would be great if someone could put together an MCVE that reproduces the issue here. We have multiple tests in our test suite that use open_mfdataset with parallel=True, including one that runs against a distributed scheduler and one that runs against the threaded scheduler, so I'm surprised we're not catching this. In any event, the next step would be to develop a test that triggers the error so we can sort out a fix.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset parallel=True failing with netcdf4 >= 1.6.1 1385031286
1404041288 https://github.com/pydata/xarray/issues/7079#issuecomment-1404041288 https://api.github.com/repos/pydata/xarray/issues/7079 IC_kwDOAMm_X85Tr_hI dcherian 2448579 2023-01-25T18:21:26Z 2023-01-25T19:03:07Z MEMBER

From https://github.com/conda-forge/netcdf4-feedstock/issues/141:

> It's on users to manage locking for non-threadsafe resources like netCDF.

@pydata/xarray ~Should we be handling this by default in the netCDF4 backend now?~

EDIT: We already have locks: https://github.com/pydata/xarray/blob/6e77f5e8942206b3e0ab08c3621ade1499d8235b/xarray/backends/netCDF4_.py#L363-L383
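
The general idea of guarding a non-thread-safe resource with a lock can be sketched with the standard library alone. This is an illustration only, not xarray's actual locking code: `RESOURCE_LOCK`, `FakeReader`, and `locked_read` are hypothetical names, and the fake reader merely stands in for a library handle that must not be used from two threads at once.

```python
import threading

# One process-wide lock shared by every caller (hypothetical name).
RESOURCE_LOCK = threading.Lock()


class FakeReader:
    """Stand-in for a library handle that is not thread safe."""

    def __init__(self):
        self.reads = 0

    def read(self):
        # Unsynchronised read-modify-write: unsafe without a lock.
        current = self.reads
        self.reads = current + 1
        return current


def locked_read(reader):
    # Only one thread at a time may touch the reader.
    with RESOURCE_LOCK:
        return reader.read()


def run(n_threads=8, reads_per_thread=1000):
    reader = FakeReader()

    def worker():
        for _ in range(reads_per_thread):
            locked_read(reader)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return reader.reads


total = run()
print(total)
```

Because every access goes through the same lock, the counter ends up equal to the number of reads requested; the cost is that the guarded section runs serially.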

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset parallel=True failing with netcdf4 >= 1.6.1 1385031286
1276693638 https://github.com/pydata/xarray/issues/7079#issuecomment-1276693638 https://api.github.com/repos/pydata/xarray/issues/7079 IC_kwDOAMm_X85MGMyG dcherian 2448579 2022-10-12T20:23:11Z 2022-10-12T20:23:11Z MEMBER

> My workflow is my own laptop only

Use LocalCluster! ;)

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 1,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset parallel=True failing with netcdf4 >= 1.6.1 1385031286
1276668410 https://github.com/pydata/xarray/issues/7079#issuecomment-1276668410 https://api.github.com/repos/pydata/xarray/issues/7079 IC_kwDOAMm_X85MGGn6 ocefpaf 950575 2022-10-12T19:57:35Z 2022-10-12T20:17:08Z CONTRIBUTOR

Note that this is not a bug per se: netcdf-c was never thread safe and, when the workarounds were removed in netcdf4-python, this issue surfaced. The right fix is to disable threads, like in my example above, or to wait for a netcdf-c release that is thread safe. I don't think the workaround will be re-added in netcdf4-python.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset parallel=True failing with netcdf4 >= 1.6.1 1385031286
1276685512 https://github.com/pydata/xarray/issues/7079#issuecomment-1276685512 https://api.github.com/repos/pydata/xarray/issues/7079 IC_kwDOAMm_X85MGKzI ocefpaf 950575 2022-10-12T20:16:41Z 2022-10-12T20:16:41Z CONTRIBUTOR

> This fix will restrict you to serial compute.

I was waiting for someone who does stuff on clusters to comment on that. Thanks! (My workflow is my own laptop only, so I'm quite limited on that front :smile:)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset parallel=True failing with netcdf4 >= 1.6.1 1385031286
1276681057 https://github.com/pydata/xarray/issues/7079#issuecomment-1276681057 https://api.github.com/repos/pydata/xarray/issues/7079 IC_kwDOAMm_X85MGJth dcherian 2448579 2022-10-12T20:11:54Z 2022-10-12T20:11:54Z MEMBER

> The right fix is to disable threads, like in my example above

This fix will restrict you to serial compute.

You can also parallelize across processes using something like

    PBSCluster(
        ...,
        cores=1,
        processes=2,
    )

or `LocalCluster(threads_per_worker=1, ...)`
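
A slightly fuller version of the `LocalCluster` suggestion might look like the sketch below. It assumes the optional `distributed` package is installed; `make_process_cluster` is a hypothetical helper name, and the worker count of 2 is an arbitrary choice for illustration.

```python
def make_process_cluster(n_workers=2):
    """Start a local dask cluster with one thread per worker process.

    Hypothetical helper; requires the optional `distributed` package.
    """
    # Imported lazily so this sketch can be loaded without dask installed.
    from dask.distributed import Client, LocalCluster

    cluster = LocalCluster(
        n_workers=n_workers,    # parallelism comes from separate processes
        threads_per_worker=1,   # no two threads share a worker process
        processes=True,
    )
    client = Client(cluster)    # becomes the default scheduler for dask work
    return cluster, client
```

A script would typically call this under an `if __name__ == "__main__":` guard and then run `xr.open_mfdataset(paths, parallel=True)` as usual, so that each file is opened in a worker process that runs a single thread.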

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset parallel=True failing with netcdf4 >= 1.6.1 1385031286
1276656637 https://github.com/pydata/xarray/issues/7079#issuecomment-1276656637 https://api.github.com/repos/pydata/xarray/issues/7079 IC_kwDOAMm_X85MGDv9 kthyng 3487237 2022-10-12T19:45:08Z 2022-10-12T19:45:08Z NONE

@ocefpaf and all: thank you! What a mysterious error this has been. Using the workaround

    import dask
    dask.config.set(scheduler="single-threaded")

did indeed avoid the issue for me.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset parallel=True failing with netcdf4 >= 1.6.1 1385031286
1267477522 https://github.com/pydata/xarray/issues/7079#issuecomment-1267477522 https://api.github.com/repos/pydata/xarray/issues/7079 IC_kwDOAMm_X85LjCwS ocefpaf 950575 2022-10-04T19:24:01Z 2022-10-04T19:34:42Z CONTRIBUTOR

Also, you can try:

    import dask
    dask.config.set(scheduler="single-threaded")

That would ensure you don't use threads when reading with netcdf-c (netcdf4).


Edit: this is not an xarray problem, and I recommend closing this issue and following up with the one already opened upstream.
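
The single-threaded scheduler suggested above can also be scoped to one call instead of being set globally, since `dask.config.set` works as a context manager. A minimal sketch, assuming dask and xarray are installed; `open_without_threads` is a hypothetical helper name.

```python
def open_without_threads(paths, **kwargs):
    """Open several files with dask's single-threaded scheduler.

    Hypothetical helper; kwargs are passed through to xr.open_mfdataset.
    """
    # Imported lazily so this sketch can be loaded without dask/xarray.
    import dask
    import xarray as xr

    # The setting applies only inside the `with` block.
    with dask.config.set(scheduler="single-threaded"):
        return xr.open_mfdataset(paths, **kwargs)
```

Later computations on the lazily loaded dataset happen outside the `with` block and would use dask's default scheduler again, so the global `dask.config.set(...)` call from the comment is the simpler choice when every read should avoid threads.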

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset parallel=True failing with netcdf4 >= 1.6.1 1385031286
1267159210 https://github.com/pydata/xarray/issues/7079#issuecomment-1267159210 https://api.github.com/repos/pydata/xarray/issues/7079 IC_kwDOAMm_X85Lh1Cq ocefpaf 950575 2022-10-04T15:11:17Z 2022-10-04T15:11:17Z CONTRIBUTOR

I believe you are hitting https://github.com/Unidata/netcdf4-python/issues/1192

The verdict is not out on that one yet. Your parallelization may not be thread safe, which would make the 1.6.1 failures expected. For now, if you can, downgrade to 1.6.0 or use an engine that is thread safe. Maybe h5netcdf (not sure!)?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset parallel=True failing with netcdf4 >= 1.6.1 1385031286
1262169014 https://github.com/pydata/xarray/issues/7079#issuecomment-1262169014 https://api.github.com/repos/pydata/xarray/issues/7079 IC_kwDOAMm_X85LOyu2 pnorton-usgs 8998112 2022-09-29T11:52:42Z 2022-09-29T11:52:42Z NONE

I ran into this problem yesterday reading netcdf files on our HPC with a known good script and netcdf files. Unfortunately just trying to open the files again in a try..except block did not work for me. Looking back through my environment update history I found that the netcdf4 library had been updated since I'd last successfully run the script. The current version installed was conda-forge/linux-64::netcdf4-1.6.1-nompi_py39hfaa66c4_100; I rolled it back to conda-forge/linux-64::netcdf4-1.6.0-nompi_py39h6ced12a_102. After the rollback the script started working again without error.
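
Checking whether an environment has picked up a netCDF4 release in the range named by this issue can be done without importing the library. The sketch below uses only the standard library; the 1.6.1 threshold comes from the issue title, the helper names are hypothetical, and being in the reported range does not by itself mean a failure will occur.

```python
import re
from importlib import metadata

# From the issue title: failures were reported with netcdf4 >= 1.6.1.
FIRST_REPORTED = (1, 6, 1)


def parse_version(text):
    """Turn '1.6.1' (or '1.6.1.post1') into a tuple of leading integers."""
    return tuple(int(part) for part in re.findall(r"\d+", text)[:3])


def in_reported_range(version_text):
    """True if the version is at or above the first reported release."""
    return parse_version(version_text) >= FIRST_REPORTED


def installed_netcdf4_version():
    """Return the installed netCDF4 version string, or None if absent."""
    try:
        return metadata.version("netCDF4")
    except metadata.PackageNotFoundError:
        return None


version = installed_netcdf4_version()
if version is None:
    print("netCDF4 is not installed")
else:
    label = "in reported range" if in_reported_range(version) else "older release"
    print(version, label)
```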

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset parallel=True failing with netcdf4 >= 1.6.1 1385031286

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);