issue_comments

14 rows where author_association = "NONE" and issue = 631085856 sorted by updated_at descending

user 4

  • peterdudfield 5
  • zoj613 5
  • rsignell-usgs 3
  • alaws-USGS 1

issue 1

  • Document writing netcdf from xarray directly to S3 · 14

author_association 1

  • NONE · 14
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1453906696 https://github.com/pydata/xarray/issues/4122#issuecomment-1453906696 https://api.github.com/repos/pydata/xarray/issues/4122 IC_kwDOAMm_X85WqNsI zoj613 44142765 2023-03-03T18:08:07Z 2023-03-03T18:08:07Z NONE

Based on the docs:

> The default format is NETCDF4 if you are saving a file to disk and have the netCDF4-python library available. Otherwise, xarray falls back to using scipy to write netCDF files and defaults to the NETCDF3_64BIT format (scipy does not support netCDF4).

It appears the scipy engine is a safe choice if one does not want to bother with specifying engines. By the way, what are the limitations of the netCDF3 standard vs. netCDF4?
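As a rough illustration of the fallback rule quoted above, the sketch below mirrors the documented behaviour for writes to disk: it checks whether the netCDF4 package is importable and reports which engine and format would be expected. The function name `expected_default` is hypothetical and not part of xarray, and the rule may not carry over unchanged to file-like objects such as those returned by fsspec.

```python
import importlib.util


def expected_default(have_netcdf4=None):
    """Return the (engine, format) pair the quoted docs describe for writes to disk.

    This is an illustrative helper, not xarray's own selection logic.
    """
    if have_netcdf4 is None:
        # find_spec returns None when the package cannot be imported
        have_netcdf4 = importlib.util.find_spec("netCDF4") is not None
    if have_netcdf4:
        return ("netcdf4", "NETCDF4")
    # scipy cannot write netCDF4, so the docs describe a netCDF3 fallback
    return ("scipy", "NETCDF3_64BIT")


print(expected_default())
```

Passing `have_netcdf4=False` shows the fallback case without uninstalling anything.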

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Document writing netcdf from xarray directly to S3 631085856
1453901550 https://github.com/pydata/xarray/issues/4122#issuecomment-1453901550 https://api.github.com/repos/pydata/xarray/issues/4122 IC_kwDOAMm_X85WqMbu peterdudfield 34686298 2023-03-03T18:03:49Z 2023-03-03T18:03:49Z NONE

> I never needed to specify an engine when writing, you only need it when reading the file. I use the engine="scipy" one for reading.

Using engine="scipy" worked, thank you.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Document writing netcdf from xarray directly to S3 631085856
1453899322 https://github.com/pydata/xarray/issues/4122#issuecomment-1453899322 https://api.github.com/repos/pydata/xarray/issues/4122 IC_kwDOAMm_X85WqL46 peterdudfield 34686298 2023-03-03T18:02:04Z 2023-03-03T18:02:04Z NONE

> I use the engine="scipy" one for reading.
>
> This is netCDF3, in that case. If that's fine for you, no problem.

What do you mean by "this is netCDF3"?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Document writing netcdf from xarray directly to S3 631085856
1453897364 https://github.com/pydata/xarray/issues/4122#issuecomment-1453897364 https://api.github.com/repos/pydata/xarray/issues/4122 IC_kwDOAMm_X85WqLaU zoj613 44142765 2023-03-03T18:00:33Z 2023-03-03T18:00:33Z NONE

I never needed to specify an engine when writing, you only need it when reading the file. I use the engine="scipy" one for reading.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Document writing netcdf from xarray directly to S3 631085856
1453562756 https://github.com/pydata/xarray/issues/4122#issuecomment-1453562756 https://api.github.com/repos/pydata/xarray/issues/4122 IC_kwDOAMm_X85Wo5uE peterdudfield 34686298 2023-03-03T13:52:08Z 2023-03-03T17:48:15Z NONE

> Maybe it is netCDF3? xarray is supposed to be able to determine the file type
>
>     with fsspec.open("s3://some_bucket/some_remote_destination.nc", mode="rb") as ff:
>         ds = xr.open_dataset(ff)
>
> but maybe play with the engine= argument.

Thanks, I tried to make sure it was engine="h5netcdf" when saving, but I'm not sure this worked.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Document writing netcdf from xarray directly to S3 631085856
1453883439 https://github.com/pydata/xarray/issues/4122#issuecomment-1453883439 https://api.github.com/repos/pydata/xarray/issues/4122 IC_kwDOAMm_X85WqIAv peterdudfield 34686298 2023-03-03T17:47:43Z 2023-03-03T17:47:43Z NONE

It could be. Here are some running notes of mine: https://github.com/openclimatefix/MetOfficeDataHub/issues/65

The method is the same:

    with fsspec.open("simplecache::s3://nowcasting-nwp-development/data/test.netcdf", mode="wb") as f:
        dataset.to_netcdf(f, engine='h5netcdf')

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Document writing netcdf from xarray directly to S3 631085856
1453873364 https://github.com/pydata/xarray/issues/4122#issuecomment-1453873364 https://api.github.com/repos/pydata/xarray/issues/4122 IC_kwDOAMm_X85WqFjU alaws-USGS 108412194 2023-03-03T17:38:11Z 2023-03-03T17:38:33Z NONE

@peterdudfield Could this be an issue with how you wrote out your NetCDF file? When I write to requester pays buckets, my approach using your variables would look like this and incorporates instructions from fsspec for remote write caching:

    url = "simplecache::s3://file_path/to/file.nc"
    with fsspec.open(url, mode="wb", s3={"profile": "default"}) as ff:
        # the important part is using s3={"profile":"your_aws_profile"}
        daymet_sel.to_netcdf(ff)

I haven't had any read issues with files saved this way.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Document writing netcdf from xarray directly to S3 631085856
1453553088 https://github.com/pydata/xarray/issues/4122#issuecomment-1453553088 https://api.github.com/repos/pydata/xarray/issues/4122 IC_kwDOAMm_X85Wo3XA peterdudfield 34686298 2023-03-03T13:43:58Z 2023-03-03T13:45:59Z NONE

> What didn't work:
>
>     f = fsspec.filesystem("s3", anon=False)
>     with f.open("some_bucket/some_remote_destination.nc", mode="wb") as ff:
>         xr.open_dataset("some_local_file.nc").to_netcdf(ff)
>
> this results in an OSError: [Errno 29] Seek only available in read mode exception
>
> Changing the above to
>
>     with fsspec.open("simplecache::s3://some_bucket/some_remote_destination.nc", mode="wb") as ff:
>         xr.open_dataset("some_local_file.nc").to_netcdf(ff)
>
> fixed it.

How would you go about reading this file once it is saved in S3?

I'm currently getting an error: ValueError: b'CDF\x02\x00\x00\x00\x00' is not the signature of a valid netCDF4 file
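The bytes in that error message are the file's leading signature, and they can be checked directly. A minimal sketch, assuming the standard magic numbers (`CDF\x01`, `CDF\x02` and `CDF\x05` for the netCDF3 variants, and the 8-byte HDF5 signature for netCDF4); the helper name `sniff_netcdf_format` is made up for illustration:

```python
def sniff_netcdf_format(header: bytes) -> str:
    """Classify a file by its first bytes (hypothetical helper, not an xarray API)."""
    if header.startswith(b"\x89HDF\r\n\x1a\n"):
        return "netCDF4/HDF5"
    if header.startswith(b"CDF\x01"):
        return "netCDF3 classic"
    if header.startswith(b"CDF\x02"):
        return "netCDF3 64-bit offset"
    if header.startswith(b"CDF\x05"):
        return "netCDF3 64-bit data"
    return "unknown"


# The header reported in the ValueError above
print(sniff_netcdf_format(b"CDF\x02\x00\x00\x00\x00"))  # netCDF3 64-bit offset
```

For the header in the error this reports a netCDF3 file, which is consistent with engine="scipy" being the reader that worked elsewhere in this thread.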

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Document writing netcdf from xarray directly to S3 631085856
1401040677 https://github.com/pydata/xarray/issues/4122#issuecomment-1401040677 https://api.github.com/repos/pydata/xarray/issues/4122 IC_kwDOAMm_X85Tgi8l zoj613 44142765 2023-01-23T21:49:46Z 2023-01-23T21:52:29Z NONE

What didn't work:

    f = fsspec.filesystem("s3", anon=False)
    with f.open("some_bucket/some_remote_destination.nc", mode="wb") as ff:
        xr.open_dataset("some_local_file.nc").to_netcdf(ff)

This results in an OSError: [Errno 29] Seek only available in read mode exception.

Changing the above to

    with fsspec.open("simplecache::s3://some_bucket/some_remote_destination.nc", mode="wb") as ff:
        xr.open_dataset("some_local_file.nc").to_netcdf(ff)

fixed it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Document writing netcdf from xarray directly to S3 631085856
1400564474 https://github.com/pydata/xarray/issues/4122#issuecomment-1400564474 https://api.github.com/repos/pydata/xarray/issues/4122 IC_kwDOAMm_X85Teur6 zoj613 44142765 2023-01-23T15:44:20Z 2023-01-23T15:44:20Z NONE

'/silt/usgs/Projects/stellwagen/CF-1.6/BUZZ_BAY/2651-A.cdf') outfile = fsspec.open('simpl

Thanks, this actually worked for me. It seems as though initializing an s3 store using fs = fsspec.S3FileSystem(...) beforehand and using it as a context manager via with fs.open(...) as out: data.to_netcdf(out) caused the failure.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Document writing netcdf from xarray directly to S3 631085856
1400519887 https://github.com/pydata/xarray/issues/4122#issuecomment-1400519887 https://api.github.com/repos/pydata/xarray/issues/4122 IC_kwDOAMm_X85TejzP zoj613 44142765 2023-01-23T15:16:21Z 2023-01-23T15:16:21Z NONE

Is there any reliable way to write an xr.Dataset object as a netcdf file in 2023? I tried using the above approach with fsspec but I keep getting an OSError: [Errno 29] Seek only available in read mode exception.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Document writing netcdf from xarray directly to S3 631085856
745520766 https://github.com/pydata/xarray/issues/4122#issuecomment-745520766 https://api.github.com/repos/pydata/xarray/issues/4122 MDEyOklzc3VlQ29tbWVudDc0NTUyMDc2Ng== rsignell-usgs 1872600 2020-12-15T19:39:16Z 2020-12-15T19:39:16Z NONE

I'm closing this; the recommended approach for writing NetCDF to object storage is to write locally, then push.
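The write-locally-then-push pattern can be sketched with the standard library alone. This is an illustration under assumptions, not a recommended implementation: `write_then_push` is a hypothetical helper, and the demo substitutes a plain file copy for the upload so that it runs without credentials.

```python
import os
import shutil
import tempfile


def write_then_push(write_local, push, suffix=".nc"):
    """Write to a temporary local file, hand it to `push`, then clean up."""
    fd, tmp_path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)  # the callbacks reopen the path themselves
    try:
        write_local(tmp_path)  # e.g. ds.to_netcdf for an xarray Dataset
        push(tmp_path)         # e.g. lambda p: fs.put(p, "some_bucket/key.nc") with an fsspec filesystem
    finally:
        os.remove(tmp_path)


# Stand-in demo: a local directory plays the role of the bucket
remote_path = os.path.join(tempfile.mkdtemp(), "foo.nc")


def fake_write(path):
    with open(path, "wb") as f:
        f.write(b"CDF\x02")  # placeholder bytes, not a real netCDF file


write_then_push(fake_write, lambda p: shutil.copy(p, remote_path))
```

Because the writer only ever sees a local path, it is free to seek; the upload happens once, after the file is complete.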

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Document writing netcdf from xarray directly to S3 631085856
640548620 https://github.com/pydata/xarray/issues/4122#issuecomment-640548620 https://api.github.com/repos/pydata/xarray/issues/4122 MDEyOklzc3VlQ29tbWVudDY0MDU0ODYyMA== rsignell-usgs 1872600 2020-06-08T11:36:14Z 2020-06-08T11:37:21Z NONE

@martindurant, I asked @ajelenak offline and he reminded me that:

> File metadata are dispersed throughout an HDF5 [and NetCDF4] file in order to support writing and modifying array sizes at any time of execution

Looking forward to simplecache:: for writing in fsspec=0.7.5!
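To make the seek requirement concrete, here is a toy sketch, not the HDF5 or netCDF writer: a function that reserves a header field, writes its payload, and then goes back to fill the field in. It works on a seekable buffer and fails on a write-only stream that cannot seek. The class `AppendOnlyStream` and the function `write_with_header_patch` are invented for this illustration.

```python
import io


class AppendOnlyStream(io.RawIOBase):
    """Write-only stream with no seek support, standing in for a remote upload."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def writable(self):
        return True

    def seekable(self):
        return False

    def write(self, data):
        self.chunks.append(bytes(data))
        return len(data)


def write_with_header_patch(f, payload):
    """Reserve a 4-byte length field, write the payload, then go back and fill it in."""
    f.write(b"\x00" * 4)
    f.write(payload)
    f.seek(0)  # raises on streams that cannot seek
    f.write(len(payload).to_bytes(4, "big"))


local = io.BytesIO()
write_with_header_patch(local, b"abc")  # fine: BytesIO can seek

try:
    write_with_header_patch(AppendOnlyStream(), b"abc")
except OSError as err:  # io.UnsupportedOperation is a subclass of OSError
    print("append-only write failed:", err)
```

A local cache sidesteps the problem because the writer sees a seekable local file, and only the finished bytes are sent to the remote store.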

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Document writing netcdf from xarray directly to S3 631085856
639771646 https://github.com/pydata/xarray/issues/4122#issuecomment-639771646 https://api.github.com/repos/pydata/xarray/issues/4122 MDEyOklzc3VlQ29tbWVudDYzOTc3MTY0Ng== rsignell-usgs 1872600 2020-06-05T20:08:37Z 2020-06-05T20:54:36Z NONE

Okay @scottyhq, I tried setting engine='h5netcdf', but still got:

    OSError: Seek only available in read mode

Thinking about this a little more, it's pretty clear why writing NetCDF to S3 would require seek mode.

I asked @martindurant about supporting seek for writing in fsspec and he said that would be pretty hard. And in fact, the performance probably would be pretty terrible as lots of little writes would be required.

So maybe it's best just to write netcdf files locally and then push them to S3.

And to facilitate that, @martindurant merged a PR yesterday to enable simplecache for writing in fsspec, so after doing `pip install git+https://github.com/intake/filesystem_spec.git` in my environment, this now works:

```python
import xarray as xr
import fsspec

ds = xr.open_dataset('http://geoport.usgs.esipfed.org/thredds/dodsC'
                     '/silt/usgs/Projects/stellwagen/CF-1.6/BUZZ_BAY/2651-A.cdf')

outfile = fsspec.open('simplecache::s3://chs-pangeo-data-bucket/rsignell/foo2.nc',
                      mode='wb', s3=dict(profile='default'))
with outfile as f:
    ds.to_netcdf(f)
```

(Here I'm telling `fsspec` to use the AWS credentials in my "default" profile)

Thanks Martin!!!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Document writing netcdf from xarray directly to S3 631085856

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 646.467ms · About: xarray-datasette