issue_comments
16 rows where issue = 186895655 sorted by updated_at descending


user 7

  • shoyer 5
  • delgadom 4
  • JackKelly 2
  • nickwg03 2
  • rabernat 1
  • jhamman 1
  • niallrobinson 1

author_association 3

  • MEMBER 7
  • NONE 5
  • CONTRIBUTOR 4

issue 1

  • Support creating DataSet from streaming object · 16 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
636641496 https://github.com/pydata/xarray/issues/1075#issuecomment-636641496 https://api.github.com/repos/pydata/xarray/issues/1075 MDEyOklzc3VlQ29tbWVudDYzNjY0MTQ5Ng== JackKelly 460756 2020-06-01T06:37:08Z 2020-06-01T06:37:08Z NONE

FWIW, I've also tested @delgadom's technique using netCDF4, and it works well (and is useful in situations where we don't want to install h5netcdf). Thanks!

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support creating DataSet from streaming object 186895655
635415386 https://github.com/pydata/xarray/issues/1075#issuecomment-635415386 https://api.github.com/repos/pydata/xarray/issues/1075 MDEyOklzc3VlQ29tbWVudDYzNTQxNTM4Ng== JackKelly 460756 2020-05-28T15:18:34Z 2020-05-28T15:19:06Z NONE

Is this now implemented (and hence can this issue be closed)? It appears that this works well:

```python
boto_s3 = boto3.client('s3')
s3_object = boto_s3.get_object(Bucket=bucket, Key=key)
netcdf_bytes = s3_object['Body'].read()
netcdf_bytes_io = io.BytesIO(netcdf_bytes)
ds = xr.open_dataset(netcdf_bytes_io)
```

Is that the right approach to opening a NetCDF file on S3, using the latest xarray code?
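For reference, a self-contained version of the snippet above. This is only a sketch: the bucket and key names are placeholders, and it assumes an xarray build whose installed backend accepts file-like objects.

```python
import io

import boto3
import xarray as xr

# Hypothetical bucket/key; substitute a real NetCDF object on S3.
bucket = "my-bucket"
key = "path/to/file.nc"

boto_s3 = boto3.client("s3")
s3_object = boto_s3.get_object(Bucket=bucket, Key=key)

# Read the whole object into memory and wrap it in a file-like buffer.
netcdf_bytes_io = io.BytesIO(s3_object["Body"].read())

# Works when the installed backend (e.g. h5netcdf, or scipy for netCDF3)
# accepts file-like objects; the netCDF4 backend instead needs a path or an
# in-memory image via netCDF4.Dataset(..., memory=...), as shown elsewhere
# in this thread.
ds = xr.open_dataset(netcdf_bytes_io)
```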

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support creating DataSet from streaming object 186895655
373749850 https://github.com/pydata/xarray/issues/1075#issuecomment-373749850 https://api.github.com/repos/pydata/xarray/issues/1075 MDEyOklzc3VlQ29tbWVudDM3Mzc0OTg1MA== nickwg03 4528512 2018-03-16T15:30:18Z 2018-03-16T15:30:18Z NONE

@delgadom Ah, I see. I needed libnetcdf=4.5.0; I had been using an earlier version. It sounds like prior to 4.5.0 there were still some issues with the name of the file being passed into netCDF4.Dataset, as mentioned here: https://github.com/Unidata/netcdf4-python/issues/295
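If it helps, a quick sketch for checking which versions are actually active in the environment (it prints the netCDF4-python wrapper version and the C libraries it was built against):

```python
import netCDF4

# Version of the netCDF4-python wrapper itself.
print("netCDF4-python:", netCDF4.__version__)

# Versions of the C libraries the wrapper was compiled against; the
# in-memory Dataset(..., memory=...) path needs libnetcdf >= 4.5.0.
print("libnetcdf:", netCDF4.__netcdf4libversion__)
print("libhdf5:", netCDF4.__hdf5libversion__)
```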

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support creating DataSet from streaming object 186895655
373541528 https://github.com/pydata/xarray/issues/1075#issuecomment-373541528 https://api.github.com/repos/pydata/xarray/issues/1075 MDEyOklzc3VlQ29tbWVudDM3MzU0MTUyOA== delgadom 3698640 2018-03-15T22:21:51Z 2018-03-16T01:33:24Z CONTRIBUTOR

xarray==0.10.2 netCDF4==1.3.1

Just tried it again and didn't have any issues:

```python
import io
import os

import netCDF4
import requests
import xarray as xr

patt = (
    'http://nasanex.s3.amazonaws.com/NEX-GDDP/BCSD/{scen}/day/atmos/{var}/'
    + 'r1i1p1/v1.0/{var}_day_BCSD_{scen}_r1i1p1_{model}_{year}.nc')


def open_url_dataset(url):
    # Derive a (dummy) name for the in-memory dataset from the URL.
    fname = os.path.splitext(os.path.basename(url))[0]

    # Download the whole file into memory.
    res = requests.get(url)
    content = io.BytesIO(res.content)  # unused below; netCDF4 takes the raw bytes directly

    # Hand the bytes to netCDF4's in-memory reader and wrap it in an xarray store.
    nc4_ds = netCDF4.Dataset(fname, memory=res.content)

    store = xr.backends.NetCDF4DataStore(nc4_ds)
    ds = xr.open_dataset(store)

    return ds


ds = open_url_dataset(url=patt.format(
    model='GFDL-ESM2G', scen='historical', var='tasmax', year=1988))
ds
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support creating DataSet from streaming object 186895655
373517197 https://github.com/pydata/xarray/issues/1075#issuecomment-373517197 https://api.github.com/repos/pydata/xarray/issues/1075 MDEyOklzc3VlQ29tbWVudDM3MzUxNzE5Nw== nickwg03 4528512 2018-03-15T20:45:50Z 2018-03-15T20:45:50Z NONE

@delgadom which version of netCDF4 are you using? I'm following the same steps but am still receiving `[Errno 2] No such file or directory`.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support creating DataSet from streaming object 186895655
357130204 https://github.com/pydata/xarray/issues/1075#issuecomment-357130204 https://api.github.com/repos/pydata/xarray/issues/1075 MDEyOklzc3VlQ29tbWVudDM1NzEzMDIwNA== shoyer 1217238 2018-01-12T03:03:22Z 2018-01-12T03:03:22Z MEMBER

We could potentially add a `from_memory()` constructor to `NetCDF4DataStore` to simplify this process.

On Thu, Jan 11, 2018 at 6:27 PM, Michael Delgado wrote:

> (quoted email copy of @delgadom's comment of 2018-01-12, reproduced in full below)
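As a rough illustration of the idea (not an actual xarray API; the helper name below is hypothetical), such a `from_memory()`-style constructor could simply wrap the `netCDF4.Dataset(..., memory=...)` pattern used elsewhere in this thread:

```python
import netCDF4
import xarray as xr


def netcdf4_store_from_memory(netcdf_bytes, name='in-memory'):
    """Hypothetical from_memory()-style helper: build a NetCDF4DataStore
    from an in-memory netCDF4/HDF5 file image (requires libnetcdf >= 4.5.0)."""
    nc4_ds = netCDF4.Dataset(name, memory=netcdf_bytes)
    return xr.backends.NetCDF4DataStore(nc4_ds)


# Usage sketch: netcdf_bytes would come from e.g. requests.get(url).content
# ds = xr.open_dataset(netcdf4_store_from_memory(netcdf_bytes))
```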

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support creating DataSet from streaming object 186895655
357125148 https://github.com/pydata/xarray/issues/1075#issuecomment-357125148 https://api.github.com/repos/pydata/xarray/issues/1075 MDEyOklzc3VlQ29tbWVudDM1NzEyNTE0OA== delgadom 3698640 2018-01-12T02:27:27Z 2018-01-12T02:27:27Z CONTRIBUTOR

yes! Thanks @jhamman and @shoyer. I hadn't tried it yet, but just did. worked great!

```python
In [1]: import xarray as xr
   ...: import requests
   ...: import netCDF4
   ...:
   ...: %matplotlib inline

In [2]: res = requests.get(
   ...:     'http://nasanex.s3.amazonaws.com/NEX-GDDP/BCSD/rcp45/day/atmos/tasmin/' +
   ...:     'r1i1p1/v1.0/tasmin_day_BCSD_rcp45_r1i1p1_CESM1-BGC_2073.nc')

In [3]: res.status_code
Out[3]: 200

In [4]: res.headers['content-type']
Out[4]: 'application/x-netcdf'

In [5]: nc4_ds = netCDF4.Dataset('tasmin_day_BCSD_rcp45_r1i1p1_CESM1-BGC_2073', memory=res.content)

In [6]: store = xr.backends.NetCDF4DataStore(nc4_ds)

In [7]: ds = xr.open_dataset(store)

In [8]: ds.tasmin.isel(time=0).plot()
/global/home/users/mdelgado/git/public/xarray/xarray/plot/utils.py:51: FutureWarning: 'pandas.tseries.converter.register' has been moved and renamed to 'pandas.plotting.register_matplotlib_converters'.
  converter.register()
Out[8]: <matplotlib.collections.QuadMesh at 0x2aede3c922b0>
```

![output_7_2](https://user-images.githubusercontent.com/3698640/34856943-f82619f4-f6fc-11e7-831d-f5d4032a338a.png)

```python
In [9]: ds
Out[9]:
<xarray.Dataset>
Dimensions:  (lat: 720, lon: 1440, time: 365)
Coordinates:
  * time     (time) datetime64[ns] 2073-01-01T12:00:00 2073-01-02T12:00:00 ...
  * lat      (lat) float32 -89.875 -89.625 -89.375 -89.125 -88.875 -88.625 ...
  * lon      (lon) float32 0.125 0.375 0.625 0.875 1.125 1.375 1.625 1.875 ...
Data variables:
    tasmin   (time, lat, lon) float64 ...
Attributes:
    parent_experiment: historical
    parent_experiment_id: historical
    parent_experiment_rip: r1i1p1
    Conventions: CF-1.4
    institution: NASA Earth Exchange, NASA Ames Research C...
    institute_id: NASA-Ames
    realm: atmos
    modeling_realm: atmos
    version: 1.0
    downscalingModel: BCSD
    experiment_id: rcp45
    frequency: day
    realization: 1
    initialization_method: 1
    physics_version: 1
    tracking_id: 1865ff49-b20c-4268-852a-a9503efec72c
    driving_data_tracking_ids: N/A
    driving_model_ensemble_member: r1i1p1
    driving_experiment_name: historical
    driving_experiment: historical
    model_id: BCSD
    references: BCSD method: Thrasher et al., 2012, Hydro...
    DOI: http://dx.doi.org/10.7292/W0MW2F2G
    experiment: RCP4.5
    title: CESM1-BGC global downscaled NEX CMIP5 Cli...
    contact: Dr. Rama Nemani: rama.nemani@nasa.gov, Dr...
    disclaimer: This data is considered provisional and s...
    resolution_id: 0.25 degree
    project_id: NEXGDDP
    table_id: Table day (12 November 2010)
    source: BCSD 2014
    creation_date: 2015-01-07T19:18:31Z
    forcing: N/A
    product: output
```

{
    "total_count": 4,
    "+1": 4,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support creating DataSet from streaming object 186895655
357115879 https://github.com/pydata/xarray/issues/1075#issuecomment-357115879 https://api.github.com/repos/pydata/xarray/issues/1075 MDEyOklzc3VlQ29tbWVudDM1NzExNTg3OQ== jhamman 2443309 2018-01-12T01:26:09Z 2018-01-12T01:26:09Z MEMBER

@delgadom - did you find a solution here?

A few more references: we're exploring ways to do this in the Pangeo project using Fuse (https://github.com/pangeo-data/pangeo/issues/52). There is an s3 equivalent of the gcsfs library used in that issue: https://github.com/dask/s3fs
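For anyone who wants to try the s3fs route, a rough sketch (the bucket/key path below is a placeholder, and it assumes an xarray/h5netcdf combination that accepts file-like objects):

```python
import s3fs
import xarray as xr

# Anonymous access works for public buckets; pass credentials for private data.
fs = s3fs.S3FileSystem(anon=True)

# Placeholder path; substitute a real netCDF4/HDF5 object on S3.
with fs.open("my-bucket/path/to/file.nc", "rb") as f:
    # h5netcdf (and scipy, for netCDF3) can read from file-like objects.
    ds = xr.open_dataset(f, engine="h5netcdf").load()
```

Calling `.load()` inside the `with` block pulls the data into memory before the remote file handle is closed.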

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support creating DataSet from streaming object 186895655
347707575 https://github.com/pydata/xarray/issues/1075#issuecomment-347707575 https://api.github.com/repos/pydata/xarray/issues/1075 MDEyOklzc3VlQ29tbWVudDM0NzcwNzU3NQ== shoyer 1217238 2017-11-29T00:09:09Z 2017-11-29T00:09:31Z MEMBER

@delgadom Yes, that should work (I haven't tested it, but yes in principle it should all work now).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support creating DataSet from streaming object 186895655
347705483 https://github.com/pydata/xarray/issues/1075#issuecomment-347705483 https://api.github.com/repos/pydata/xarray/issues/1075 MDEyOklzc3VlQ29tbWVudDM0NzcwNTQ4Mw== delgadom 3698640 2017-11-28T23:58:41Z 2017-11-28T23:58:41Z CONTRIBUTOR

Thanks @shoyer. So you can download the entire object into memory and then create a file image and read that? While not a full fix, it's definitely an improvement over the download-to-disk-then-read workflow!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support creating DataSet from streaming object 186895655
346214120 https://github.com/pydata/xarray/issues/1075#issuecomment-346214120 https://api.github.com/repos/pydata/xarray/issues/1075 MDEyOklzc3VlQ29tbWVudDM0NjIxNDEyMA== shoyer 1217238 2017-11-22T01:23:42Z 2017-11-22T01:23:42Z MEMBER

Just to clarify: I wrote above that we could support initializing a Dataset from a netCDF4 file image. But this wouldn't yet help for streaming access.

Initializing a Dataset from a netCDF4 file image should actually work with the latest versions of xarray and netCDF4-python:

```python
nc4_ds = netCDF4.Dataset('arbitrary-name', memory=netcdf_bytes)
store = xarray.backends.NetCDF4DataStore(nc4_ds)
ds = xarray.open_dataset(store)
```

{
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support creating DataSet from streaming object 186895655
345989495 https://github.com/pydata/xarray/issues/1075#issuecomment-345989495 https://api.github.com/repos/pydata/xarray/issues/1075 MDEyOklzc3VlQ29tbWVudDM0NTk4OTQ5NQ== niallrobinson 2979205 2017-11-21T10:50:40Z 2017-11-21T10:51:43Z NONE

FWIW this would be really useful 👍 from me, specifically for the use case above of reading from s3

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support creating DataSet from streaming object 186895655
307297489 https://github.com/pydata/xarray/issues/1075#issuecomment-307297489 https://api.github.com/repos/pydata/xarray/issues/1075 MDEyOklzc3VlQ29tbWVudDMwNzI5NzQ4OQ== shoyer 1217238 2017-06-09T05:16:40Z 2017-06-09T05:16:40Z MEMBER

Yes, we could support initializing a Dataset from a netCDF4 file image held in a bytes object.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support creating DataSet from streaming object 186895655
307277552 https://github.com/pydata/xarray/issues/1075#issuecomment-307277552 https://api.github.com/repos/pydata/xarray/issues/1075 MDEyOklzc3VlQ29tbWVudDMwNzI3NzU1Mg== rabernat 1197350 2017-06-09T02:21:55Z 2017-06-09T02:21:55Z MEMBER

Is this issue resolvable now that unidata/netcdf4-python#652 has been merged?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support creating DataSet from streaming object 186895655
258025809 https://github.com/pydata/xarray/issues/1075#issuecomment-258025809 https://api.github.com/repos/pydata/xarray/issues/1075 MDEyOklzc3VlQ29tbWVudDI1ODAyNTgwOQ== delgadom 3698640 2016-11-02T23:03:34Z 2016-11-02T23:03:34Z CONTRIBUTOR

Got it. :( Thanks!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support creating DataSet from streaming object 186895655
258018067 https://github.com/pydata/xarray/issues/1075#issuecomment-258018067 https://api.github.com/repos/pydata/xarray/issues/1075 MDEyOklzc3VlQ29tbWVudDI1ODAxODA2Nw== shoyer 1217238 2016-11-02T22:24:43Z 2016-11-02T22:25:08Z MEMBER

This does work for netCDF3 files if you provide a file-like object (e.g., bytes wrapped in `BytesIO`) or set `engine='scipy'`.

Unfortunately, this is a netCDF4/HDF5 file:

```
>>> data.raw.read(8)
'\x89HDF\r\n\x1a\n'
```

And as yet, there is no support for reading from file-like objects in either h5py (https://github.com/h5py/h5py/issues/552) or python-netCDF4 (https://github.com/Unidata/netcdf4-python/issues/295). So we're currently stuck :(.

One possibility is to use the new HDF5 library pyfive with h5netcdf (https://github.com/shoyer/h5netcdf/issues/25). But pyfive doesn't have enough features yet to read netCDF files.
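To make the distinction above concrete, here's a small sketch (an editorial illustration, not code from the thread) that sniffs the leading bytes of a buffer to tell netCDF3 (classic format, readable by scipy from file-like objects) from netCDF4/HDF5 (which, at the time of this thread, needed a path or an in-memory image):

```python
import io

# Magic numbers: netCDF classic files start with b'CDF\x01' or b'CDF\x02';
# netCDF4 files are HDF5 containers and start with b'\x89HDF\r\n\x1a\n'.
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"


def sniff_netcdf_format(buf):
    """Return 'netcdf3', 'netcdf4/hdf5', or 'unknown' for a file-like object."""
    header = buf.read(8)
    buf.seek(0)  # rewind so the caller can still read the data
    if header.startswith(b"CDF\x01") or header.startswith(b"CDF\x02"):
        return "netcdf3"
    if header == HDF5_MAGIC:
        return "netcdf4/hdf5"
    return "unknown"


# Example with the HDF5 signature quoted in the comment above:
print(sniff_netcdf_format(io.BytesIO(HDF5_MAGIC + b"...")))  # -> 'netcdf4/hdf5'
```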

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support creating DataSet from streaming object 186895655


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
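Given the schema above, the page's row selection ("16 rows where issue = 186895655 sorted by updated_at descending") corresponds to a query like the following. This is a sketch that assumes a local copy of the underlying SQLite database; the file name github.db is a placeholder.

```python
import sqlite3

# Placeholder path to a local copy of the database behind this Datasette instance.
conn = sqlite3.connect("github.db")

rows = conn.execute(
    """
    SELECT id, [user], created_at, updated_at, author_association, body
    FROM issue_comments
    WHERE issue = ?
    ORDER BY updated_at DESC
    """,
    (186895655,),
).fetchall()

print(len(rows))  # expected: 16 for this issue
```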
Powered by Datasette · About: xarray-datasette