issue_comments

26 rows where author_association = "MEMBER" and user = 3924836 sorted by updated_at descending

issue 20

  • added some logic to deal with rasterio objects in addition to filepaths 3
  • changed url for rasterio network test 3
  • enable loading remote hdf5 files 2
  • test_rasterio_vrt_network is failing in continuous integration tests 2
  • Chunked processing across multiple raster (geoTIF) files 1
  • enable reading of file-like HDF5 objects 1
  • BUG: Fix #2864 by adding the missing vrt parameters 1
  • Accessing COG overviews with read_rasterio 1
  • xr.DataSet.from_dataframe / xr.DataArray.from_series does not preserve DateTimeIndex with timezone 1
  • environment file for binderized examples 1
  • Document writing netcdf from xarray directly to S3 1
  • Comprehensive benchmarking suite 1
  • OpenDAP Documentation Example failing with RunTimeError 1
  • Example documentation in open_rasterio 1
  • ds = xr.tutorial.load_dataset("air_temperature") with 0.18 needs engine argument 1
  • failing flaky test: rasterio vrt 1
  • Writing GDAL ZARR _CRS attribute not possible 1
  • Avoid loading any data for reprs 1
  • copy of custom index does not align with original 1
  • `to_netCDF` provides a `netcdf` that does not import correctly in QGIS 1

user 1

  • scottyhq · 26 ✖

author_association 1

  • MEMBER · 26 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1483302077 https://github.com/pydata/xarray/issues/7162#issuecomment-1483302077 https://api.github.com/repos/pydata/xarray/issues/7162 IC_kwDOAMm_X85YaWS9 scottyhq 3924836 2023-03-24T19:22:56Z 2023-03-24T19:23:46Z MEMBER

Reviving this exploration with xarray 2023.3.0: I'm no longer seeing the traceback linked above, so I think this can be closed.

Original traceback from xr.align(copy, newds), which now succeeds:

```pytb
File ~/mambaforge/envs/xarray-release/lib/python3.10/site-packages/xarray/core/indexes.py:490, in PandasIndex.equals(self, other)
    488 if not isinstance(other, PandasIndex):
    489     return False
--> 490 return self.index.equals(other.index) and self.dim == other.dim

AttributeError: 'PandasIndex' object has no attribute 'index'
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  copy of custom index does not align with original 1409811164
1289469597 https://github.com/pydata/xarray/issues/7195#issuecomment-1289469597 https://api.github.com/repos/pydata/xarray/issues/7195 IC_kwDOAMm_X85M276d scottyhq 3924836 2022-10-24T19:04:23Z 2022-10-24T19:04:23Z MEMBER

@leomiquelutti I wasn't able to reproduce this (ran your colab notebook and opened the netcdf in qgis 3.26). It's possible for versions to change between your colab executions, so I recommend using rioxarray.show_versions(). Since you are using rioxarray to write the CRS, I'd also recommend opening this issue or discussion over in https://github.com/corteva/rioxarray

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  `to_netCDF` provides a `netcdf` that does not import correctly in QGIS 1419483454
1166154340 https://github.com/pydata/xarray/issues/6722#issuecomment-1166154340 https://api.github.com/repos/pydata/xarray/issues/6722 IC_kwDOAMm_X85Fghpk scottyhq 3924836 2022-06-25T00:37:46Z 2022-06-25T00:37:46Z MEMBER

This would be a pretty small change and only applies for loading data into numpy arrays, for example current repr for a variable followed by modified for the example dataset above (which already happens for large arrays):


Seeing a few values at the edges can be nice, so this makes me realize how useful data summaries in the metadata (Zarr or STAC) are for large datasets on cloud storage.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Avoid loading any data for reprs 1284094480
1098327424 https://github.com/pydata/xarray/issues/6448#issuecomment-1098327424 https://api.github.com/repos/pydata/xarray/issues/6448 IC_kwDOAMm_X85BdyWA scottyhq 3924836 2022-04-13T17:53:21Z 2022-04-13T17:53:21Z MEMBER

One of the main motivations behind the rioxarray extension is GDAL compatibility. It looks like @snowman2 and @TomAugspurger have discussed saving many geotiffs loaded into xarray as GDAL-compatible Zarr, for example https://github.com/corteva/rioxarray/issues/433#issuecomment-967685356.

While it seems that the ultimate solution is agreeing on a format standard, here is another small example using the rioxarray extension where format conversion doesn't currently work as you might expect:

```python
import rioxarray  # registers the 'rasterio' engine and the .rio accessor
import xarray as xr

# https://github.com/pydata/xarray-data
ds = xr.open_dataset('xarray-data/air_temperature.nc', engine='rasterio')

# TooManyDimensions: Only 2D and 3D data arrays supported.
ds.rio.to_raster('test.zarr', driver='ZARR')

# Does not error, but output not equivalent to
# `gdal_translate -of ZARR xarray-data/air_temperature.nc gdal_air_temp.zarr`
# for example, `gdalinfo xarray-tutorial-airtemp.zarr` gives
# Warning 1: Too many samples along the > 2D dimensions of /air.
ds.to_zarr('xarray-tutorial-airtemp.zarr')
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Writing GDAL ZARR _CRS attribute not possible 1194993450
1068741201 https://github.com/pydata/xarray/issues/6363#issuecomment-1068741201 https://api.github.com/repos/pydata/xarray/issues/6363 IC_kwDOAMm_X84_s7JR scottyhq 3924836 2022-03-16T05:14:50Z 2022-03-16T05:20:46Z MEMBER

Well, I changed that URL back in https://github.com/pydata/xarray/pull/3162 . But maybe it's best to just remove that test now that VRT and other GDAL/rasterio functionality is integrated with rioxarray?

Unfortunately the AWS "landsat-pds" (public data set) has been deprecated and is no longer supported, so maybe those images are finally being deleted https://lists.osgeo.org/pipermail/landsat-pds/2021-July/000185.html

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  failing flaky test: rasterio vrt 1170533154
839334583 https://github.com/pydata/xarray/issues/5291#issuecomment-839334583 https://api.github.com/repos/pydata/xarray/issues/5291 MDEyOklzc3VlQ29tbWVudDgzOTMzNDU4Mw== scottyhq 3924836 2021-05-12T00:31:28Z 2021-05-12T00:31:28Z MEMBER

Thanks @keewis I should have been more clear about the environment. I was recently going over the tutorial with someone and started with:

  1. conda create -n xarray-tutorial xarray, then running ds = xr.tutorial.load_dataset("air_temperature") --> ImportError: using the tutorial data requires pooch
  2. we install pooch and then hit: ValueError: cannot guess the engine, try passing one explicitly
  3. after consulting the docstring we then try ds = xr.tutorial.load_dataset("air_temperature", engine="netcdf4") and hit ValueError: unrecognized engine netcdf4 must be one of: ['store']
  4. being familiar with xarray we then installed netcdf4 into our environment and all is well.

I do think these error messages are not obvious to fix for new xarray users trying out the tutorial (especially # 3 above)

> we could definitely improve the error message, though. Something like "unknown engine {engine}, please choose one of the installed engines: {engines}", maybe?

Yes. Perhaps with a link to docs with a list of engines? For the tutorial case specifically, we could also update the ImportError message to read: ImportError: please install 'pooch' and 'netcdf4' to use xarray tutorial data
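
The error-message suggestion above could be sketched roughly as follows. This is a minimal, standalone illustration and not xarray's actual implementation: the helper name `check_engine`, its `installed_engines` argument, and the exact wording of the message are assumptions made for this example.

```python
def check_engine(engine, installed_engines):
    """Raise a descriptive error if `engine` is not among the installed engines.

    Hypothetical helper for illustration only; it is not part of xarray.
    """
    if engine not in installed_engines:
        raise ValueError(
            f"unknown engine {engine!r}, please choose one of the "
            f"installed engines: {sorted(installed_engines)}"
        )
    return engine


# Mimic step 3 above: only the built-in "store" engine is available.
try:
    check_engine("netcdf4", ["store"])
except ValueError as err:
    message = str(err)
    print(message)
```

The point of the sketch is that the message names both the rejected engine and the engines that are actually installed, so a new user can tell that a package is missing rather than that the argument is misspelled.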

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ds = xr.tutorial.load_dataset("air_temperature") with 0.18 needs engine argument 889162918
823833533 https://github.com/pydata/xarray/issues/3291#issuecomment-823833533 https://api.github.com/repos/pydata/xarray/issues/3291 MDEyOklzc3VlQ29tbWVudDgyMzgzMzUzMw== scottyhq 3924836 2021-04-21T07:12:11Z 2021-04-21T07:13:45Z MEMBER

Just wanted to rekindle discussion here and ping @dcherian and @benbovy , the current workaround for pandas DatetimeIndex with timezone info (dtype='datetime64[ns, EST]') is to drop the timezone piece or use to_index() and operate in pandas, then reassign the time coordinate: See https://github.com/pydata/xarray/issues/1036 and https://github.com/pydata/xarray/issues/3163.

If I'm following https://github.com/pydata/xarray/blob/master/design_notes/flexible_indexes_notes.md this is another potential example of improved user-friendliness where we could have timezone-aware indexes and therefore call pandas methods like pandas.core.indexes.datetimes.DatetimeIndex.tz_convert() directly as a DataArray method?

This would definitely be great for remote sensing data that is usually stored with UTC timestamps, but often analysis requires converting to local time.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xr.DataSet.from_dataframe / xr.DataArray.from_series does not preserve DateTimeIndex with timezone 490618213
796044411 https://github.com/pydata/xarray/issues/5005#issuecomment-796044411 https://api.github.com/repos/pydata/xarray/issues/5005 MDEyOklzc3VlQ29tbWVudDc5NjA0NDQxMQ== scottyhq 3924836 2021-03-10T20:25:41Z 2021-03-10T20:28:16Z MEMBER

Thanks @gabriel-abrahao good catch! the docs are definitely outdated here and need to be changed. This has been brought up before here https://github.com/pydata/xarray/issues/3185#issuecomment-574215734

You can either do what you suggested or simply transform = Affine(*da.transform) https://github.com/pydata/xarray/blob/12b4480ff2bde696142ca850275cdcc85ca0fbc9/xarray/tests/test_backends.py#L4468

A PR to update the documentation would be welcome. It'd be just a one-line fix here: https://github.com/pydata/xarray/blob/d2582c2f8811a3bd527d47c945b1cccd4983a1d3/xarray/backends/rasterio_.py#L179

cc @snowman2 the docstring for open_rasterio has the same issue currently in rioxarray https://github.com/corteva/rioxarray/blob/63ce87f92835897d46aaa270f5ec07aca4e72cda/rioxarray/_io.py#L653

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Example documentation in open_rasterio 823418135
782258574 https://github.com/pydata/xarray/issues/4925#issuecomment-782258574 https://api.github.com/repos/pydata/xarray/issues/4925 MDEyOklzc3VlQ29tbWVudDc4MjI1ODU3NA== scottyhq 3924836 2021-02-19T18:29:34Z 2021-02-19T19:10:58Z MEMBER

@Tinkaa I looked into this a bit more and I suspect the way you are installing packages (conda, pip?) is important to bring in a compatible libnetcdf behind the scenes (your xr.show_versions() shows a new libnetcdf: 4.7.4 but an older netCDF4: 1.5.0.1, making me think you installed an older version of netcdf4 into an existing environment without any dependency resolution).

This works for me on macOS: conda create -c conda-forge -n opendap_working python xarray netcdf4==1.5.1 matplotlib. Note that I found the explicit pin important. Perhaps because it brings in libnetcdf=4.6.2 behind the scenes? Here is my full list of working versions:

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.10 | packaged by conda-forge | (default, Feb 19 2021, 15:59:12) [Clang 11.0.1 ]
python-bits: 64
OS: Darwin
OS-release: 20.3.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2

xarray: 0.16.2
pandas: 1.2.2
numpy: 1.20.1
scipy: None
netCDF4: 1.5.0.1
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.4.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.3.4
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 49.6.0.post20210108
pip: 21.0.1
conda: None
pytest: None
IPython: None
sphinx: None
```

Hopefully some folks that know more about opendap could chime in (@dopplershift, @rabernat ?) to help get to the root of this issue. For what it's worth, I noticed this additional log output that wasn't surfaced in the jupyter notebook in addition to the original traceback i posted:

oc_open: server error retrieving url: code=3 message="The identifier `tmax.tmax%5b0%5d%5b0:3:620%5d%5b0:3:1404%5d' is not in the dataset."
oc_open: server error retrieving url: code=3 message="The identifier `tmax.tmax%5b0%5d%5b0:3:620%5d%5b0:3:1404%5d' is not in the dataset."
oc_open: server error retrieving url: code=3 message="The identifier `tmax.tmax%5b0%5d%5b0:3:620%5d%5b0:3:1404%5d' is not in the dataset."
oc_open: server error retrieving url: code=3 message="The identifier `tmax.tmax%5b0%5d%5b0:3:620%5d%5b0:3:1404%5d' is not in the dataset."
oc_open: server error retrieving url: code=3 message="The identifier `tmax.tmax%5b0%5d%5b0:3:620%5d%5b0:3:1404%5d' is not in the dataset."
^CTraceback (most recent call last):
  File "/Users/scott/miniconda3/envs/opendap_test/lib/python3.9/site-packages/xarray/backends/common.py", line 52, in robust_getitem
    return array[key]
  File "src/netCDF4/_netCDF4.pyx", line 4420, in netCDF4._netCDF4.Variable.__getitem__
  File "src/netCDF4/_netCDF4.pyx", line 5363, in netCDF4._netCDF4.Variable._get
  File "src/netCDF4/_netCDF4.pyx", line 1950, in netCDF4._netCDF4._ensure_nc_success
RuntimeError: NetCDF: file not found

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  OpenDAP Documentation Example failing with RunTimeError 811409317
738190759 https://github.com/pydata/xarray/issues/4648#issuecomment-738190759 https://api.github.com/repos/pydata/xarray/issues/4648 MDEyOklzc3VlQ29tbWVudDczODE5MDc1OQ== scottyhq 3924836 2020-12-03T18:17:13Z 2020-12-03T18:17:13Z MEMBER

thanks for the ping @dcherian, i really like the idea! One other thing that often gets neglected in test suites is operating on remote data. I understand the need to avoid long-running tests and tests prone to network failures for PRs, but running these sorts of examples as a cron job could be very helpful for benchmarking and detecting issues.

In intake-xarray we recently added tests against a local HTTP server and "S3" server: https://github.com/intake/intake-xarray/blob/master/intake_xarray/tests/test_remote.py

Also added several simple tests requiring a network connection to public data (no auth required) that we run locally but not in CI currently: https://github.com/intake/intake-xarray/blob/master/intake_xarray/tests/test_network.py
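
One way to keep network-dependent tests out of regular CI runs while still letting a scheduled (cron) job run them is an opt-in switch. The sketch below uses only the standard library. The environment variable name `RUN_NETWORK_TESTS`, the test class, and the `run_suite` helper are assumptions for illustration, not what intake-xarray or xarray actually use.

```python
import os
import unittest

# Hypothetical opt-in switch: a cron job would export RUN_NETWORK_TESTS=1.
RUN_NETWORK = os.environ.get("RUN_NETWORK_TESTS") == "1"


@unittest.skipUnless(RUN_NETWORK, "set RUN_NETWORK_TESTS=1 to run network tests")
class TestRemoteData(unittest.TestCase):
    def test_open_remote_file(self):
        # Placeholder body: a real test would open a public remote dataset here.
        self.assertTrue(RUN_NETWORK)


def run_suite():
    """Run the network tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestRemoteData)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

With the variable unset, the test is reported as skipped rather than failed, so pull-request builds stay green while the scheduled job still exercises the remote paths.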

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Comprehensive benchmarking suite 756425955
714684059 https://github.com/pydata/xarray/issues/3269#issuecomment-714684059 https://api.github.com/repos/pydata/xarray/issues/3269 MDEyOklzc3VlQ29tbWVudDcxNDY4NDA1OQ== scottyhq 3924836 2020-10-22T18:38:15Z 2020-10-22T18:38:15Z MEMBER

@dcherian I was just revisiting a use-case for this and realized xr.open_rasterio() does not take **kwargs. http://xarray.pydata.org/en/stable/generated/xarray.open_rasterio.html#xarray.open_rasterio

I think this would be easy to implement because a while back rasterio implemented a keyword argument so that you can do rasterio.open(path, overview_level=3) https://github.com/mapbox/rasterio/issues/1504

So is this just a matter of accepting kwargs in xr.open_rasterio() and passing them through? https://github.com/pydata/xarray/blob/4aa7622b6ff16647df64fe69f39438b7cbe9576c/xarray/backends/rasterio_.py#L241

As seems to be done here for example: https://github.com/pydata/xarray/blob/cc271e61077c543e0f3b1a06ad5e905ea2c91617/xarray/backends/h5netcdf_.py#L164
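
The pass-through idea can be illustrated without rasterio installed. In this sketch `fake_rasterio_open` stands in for rasterio.open and `open_rasterio_sketch` for xr.open_rasterio(); both names are hypothetical, and the real functions do much more than this.

```python
def fake_rasterio_open(path, **open_kwargs):
    """Stand-in for rasterio.open that simply records what it was called with."""
    return {"path": path, "open_kwargs": open_kwargs}


def open_rasterio_sketch(filename, chunks=None, **kwargs):
    """Hypothetical wrapper: keeps its own arguments and forwards the rest."""
    # Any keyword the wrapper does not consume (e.g. overview_level) is
    # handed to the underlying open call unchanged.
    dataset = fake_rasterio_open(filename, **kwargs)
    return {"dataset": dataset, "chunks": chunks}


result = open_rasterio_sketch("example.tif", chunks={"x": 512}, overview_level=3)
print(result["dataset"]["open_kwargs"])
```

The wrapper does not need to know about `overview_level` at all, which is why accepting and forwarding `**kwargs` would be a small change.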

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Accessing COG overviews with read_rasterio 485988536
639651072 https://github.com/pydata/xarray/issues/4122#issuecomment-639651072 https://api.github.com/repos/pydata/xarray/issues/4122 MDEyOklzc3VlQ29tbWVudDYzOTY1MTA3Mg== scottyhq 3924836 2020-06-05T17:27:58Z 2020-06-05T17:27:58Z MEMBER

Not sure, but I think the h5netcdf engine is the only one that allows for file-like objects (so anything going through fsspec)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Document writing netcdf from xarray directly to S3 631085856
557610023 https://github.com/pydata/xarray/issues/3563#issuecomment-557610023 https://api.github.com/repos/pydata/xarray/issues/3563 MDEyOklzc3VlQ29tbWVudDU1NzYxMDAyMw== scottyhq 3924836 2019-11-22T16:56:42Z 2019-11-22T16:56:42Z MEMBER

Awesome @dcherian ! I like the approach of defining a binder environment then pulling the examples directory into the binder session with nbgitpuller. There are options for where to store the binder environment config: 1) master branch .binder, 2) a new branch called 'binder', or 3) even a separate repo... maybe pydata/xarray-binder-env - see https://github.com/scottyhq/repo2docker-githubci ?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  environment file for binderized examples 527296094
516633175 https://github.com/pydata/xarray/pull/3162#issuecomment-516633175 https://api.github.com/repos/pydata/xarray/issues/3162 MDEyOklzc3VlQ29tbWVudDUxNjYzMzE3NQ== scottyhq 3924836 2019-07-30T23:26:07Z 2019-07-30T23:26:23Z MEMBER

ok. As noted in https://github.com/mapbox/rasterio/pull/1709, aws_unsigned=True should bypass importing boto3 in the upcoming rasterio 1.0.25. But I've gone ahead and added boto3 to the CI requirements.yml files in case people want to add other tests reading from s3, gcs, etc. in the future

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  changed url for rasterio network test 473142248
515588027 https://github.com/pydata/xarray/pull/3162#issuecomment-515588027 https://api.github.com/repos/pydata/xarray/issues/3162 MDEyOklzc3VlQ29tbWVudDUxNTU4ODAyNw== scottyhq 3924836 2019-07-26T20:24:32Z 2019-07-26T20:24:32Z MEMBER

I'm in no rush to merge this (just wanted to help get rid of the red on Azure :). I think it's important to have network tests since it's increasingly common to be streaming data from s3, gcs, etc rather than reading locally. But I'm not sure of best practices for setting such tests up, so any suggestions for modifications are welcome.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  changed url for rasterio network test 473142248
515452425 https://github.com/pydata/xarray/pull/3162#issuecomment-515452425 https://api.github.com/repos/pydata/xarray/issues/3162 MDEyOklzc3VlQ29tbWVudDUxNTQ1MjQyNQ== scottyhq 3924836 2019-07-26T13:30:04Z 2019-07-26T14:00:48Z MEMBER

but seems like the tests are now run with each build on Azure? https://dev.azure.com/xarray/xarray/_build/results?buildId=389&view=ms.vss-test-web.build-test-results-tab

the new test currently still fails - I think due to https://github.com/mapbox/rasterio/pull/1709

cc @shoyer who created the issue linked in the first post.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  changed url for rasterio network test 473142248
509354012 https://github.com/pydata/xarray/issues/3083#issuecomment-509354012 https://api.github.com/repos/pydata/xarray/issues/3083 MDEyOklzc3VlQ29tbWVudDUwOTM1NDAxMg== scottyhq 3924836 2019-07-08T19:10:53Z 2019-07-08T19:10:53Z MEMBER

sounds good. i'll try to get to it this week.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  test_rasterio_vrt_network is failing in continuous integration tests 464793626
509090872 https://github.com/pydata/xarray/issues/3083#issuecomment-509090872 https://api.github.com/repos/pydata/xarray/issues/3083 MDEyOklzc3VlQ29tbWVudDUwOTA5MDg3Mg== scottyhq 3924836 2019-07-08T06:04:32Z 2019-07-08T06:04:32Z MEMBER

Strange, the same test works on my laptop with current packages. Some ideas below:

  • in the rasterio.Env environment we can add CPL_CURL_VERBOSE=True to get more information about the error, but would need to capture output that typically goes to the terminal (I've done this before with https://github.com/minrk/wurlitzer).

  • I'm inclined to remove CPL_VSIL_CURL_USE_HEAD=False which is not documented anywhere obvious (https://lists.osgeo.org/pipermail/gdal-dev/2014-August/039924.html) and could be behind the error message. All rasterio/gdal environment config options are detailed here https://trac.osgeo.org/gdal/wiki/ConfigOptions

  • we might also consider a different URL. maybe add a tif file to https://github.com/pydata/xarray-data? or use one of the same URLs that the rasterio library is using for network tif tests? https://github.com/mapbox/rasterio/blob/c2a495d536db6328e018cb65ce7cc0b8c559a937/tests/test_env.py#L34-L36

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  test_rasterio_vrt_network is failing in continuous integration tests 464793626
479927296 https://github.com/pydata/xarray/pull/2865#issuecomment-479927296 https://api.github.com/repos/pydata/xarray/issues/2865 MDEyOklzc3VlQ29tbWVudDQ3OTkyNzI5Ng== scottyhq 3924836 2019-04-04T14:45:01Z 2019-04-04T14:45:01Z MEMBER

@jmichel-otb completely agree on your comments and think that the remote sensing community gains a lot by making these packages work well together (gdal, rasterio, xarray, dask, etc). As more people start using the rasterio backend with various use cases hopefully there will be more contributions such as this. Thanks!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  BUG: Fix #2864 by adding the missing vrt parameters 428374352
470765319 https://github.com/pydata/xarray/pull/2782#issuecomment-470765319 https://api.github.com/repos/pydata/xarray/issues/2782 MDEyOklzc3VlQ29tbWVudDQ3MDc2NTMxOQ== scottyhq 3924836 2019-03-08T01:13:16Z 2019-03-08T01:13:58Z MEMBER

thanks for the input @shoyer, I attempted to tidy up a bit and in the process re-ordered some things such as adding an 'engine' check at the top of open_dataset(). backend tests are passing locally on my machine. hopefully i didn't add too much here or overstep!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  enable loading remote hdf5 files 412645481
469556377 https://github.com/pydata/xarray/pull/2782#issuecomment-469556377 https://api.github.com/repos/pydata/xarray/issues/2782 MDEyOklzc3VlQ29tbWVudDQ2OTU1NjM3Nw== scottyhq 3924836 2019-03-05T06:27:51Z 2019-03-05T06:27:51Z MEMBER

@shoyer , it would be great to have your feedback on these recent changes now that h5netcdf 0.7 is out. There's a bit more logic required in api.py now that scipy isn't the only backend that is able to read file-like objects (and people may not specify engine= when opening datasets)
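
When no engine= is given and the input is a file-like object, one option is to peek at the first bytes to decide which backend can read it. The sketch below illustrates that idea under stated assumptions and is not the logic in xarray's api.py; `guess_engine_from_filelike` is a hypothetical name. It relies on the file signatures: HDF5 files start with the bytes `\x89HDF\r\n\x1a\n`, and classic netCDF files start with `CDF`. It only checks offset 0, although the HDF5 format also allows the signature at later offsets.

```python
import io

HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # start of an HDF5 (netCDF4) file
NETCDF3_SIGNATURE = b"CDF"             # start of a classic netCDF3 file


def guess_engine_from_filelike(fobj):
    """Return a backend name based on the file signature, restoring the position."""
    position = fobj.tell()
    fobj.seek(0)
    header = fobj.read(8)
    fobj.seek(position)  # leave the stream where we found it
    if header.startswith(HDF5_SIGNATURE):
        return "h5netcdf"
    if header.startswith(NETCDF3_SIGNATURE):
        return "scipy"
    raise ValueError("unrecognized file signature; pass engine= explicitly")


print(guess_engine_from_filelike(io.BytesIO(HDF5_SIGNATURE + b"rest of file")))
print(guess_engine_from_filelike(io.BytesIO(b"CDF\x01" + b"rest of file")))
```

Restoring the stream position matters because the chosen backend will want to read the file from wherever the caller left it.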

test_backends.py passes locally for me except for TestValidateAttrs.test_validating_attrs... not sure why.

Also, per your comment here: https://github.com/shoyer/h5netcdf/pull/51#issuecomment-467591446, I think it would be great to get a few small netcdf4/hdf test files in https://github.com/pydata/xarray-data.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  enable loading remote hdf5 files 412645481
467689325 https://github.com/pydata/xarray/issues/2781#issuecomment-467689325 https://api.github.com/repos/pydata/xarray/issues/2781 MDEyOklzc3VlQ29tbWVudDQ2NzY4OTMyNQ== scottyhq 3924836 2019-02-27T01:44:45Z 2019-02-27T01:44:45Z MEMBER

Just noting here that I've gotten this to work reading a netcdf4/hdf5 file via gcsfs, but not for the same file accessed via s3fs: https://github.com/dask/s3fs/issues/168

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  enable reading of file-like HDF5 objects 412623833
448685064 https://github.com/pydata/xarray/pull/2589#issuecomment-448685064 https://api.github.com/repos/pydata/xarray/issues/2589 MDEyOklzc3VlQ29tbWVudDQ0ODY4NTA2NA== scottyhq 3924836 2018-12-19T17:49:19Z 2018-12-19T17:49:19Z MEMBER

I think the minimum rasterio version should be increased to 1.0. For whatever reason the conda defaults channel hasn't been updated since 0.36 (Jun 14, 2016!). There are many important changes in 1.0 and beyond, and those releases are available via both pip and conda-forge.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  added some logic to deal with rasterio objects in addition to filepaths 387123860
448416895 https://github.com/pydata/xarray/pull/2589#issuecomment-448416895 https://api.github.com/repos/pydata/xarray/issues/2589 MDEyOklzc3VlQ29tbWVudDQ0ODQxNjg5NQ== scottyhq 3924836 2018-12-18T23:57:08Z 2018-12-18T23:57:08Z MEMBER

thanks for the feedback @fmaussion, I think I've addressed your suggestions, let me know if anything else needs adjusting

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  added some logic to deal with rasterio objects in addition to filepaths 387123860
443974465 https://github.com/pydata/xarray/pull/2589#issuecomment-443974465 https://api.github.com/repos/pydata/xarray/issues/2589 MDEyOklzc3VlQ29tbWVudDQ0Mzk3NDQ2NQ== scottyhq 3924836 2018-12-04T05:14:26Z 2018-12-04T05:14:26Z MEMBER

Following up on https://github.com/dask/dask/issues/3255 @mrocklin, @shoyer, @jhamman

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  added some logic to deal with rasterio objects in addition to filepaths 387123860
417412405 https://github.com/pydata/xarray/issues/2314#issuecomment-417412405 https://api.github.com/repos/pydata/xarray/issues/2314 MDEyOklzc3VlQ29tbWVudDQxNzQxMjQwNQ== scottyhq 3924836 2018-08-30T18:01:02Z 2018-08-30T18:01:02Z MEMBER

As @darothen mentioned, first thing is to check that the geotiffs themselves are tiled (otherwise I'm guessing that open_rasterio() will open the entire thing). You can do this with:

```python
import rasterio

# Print the file profile; look for tiling and block size entries.
with rasterio.open('image_001.tif') as src:
    print(src.profile)
```

Here is the mentioned example notebook which works for tiled geotiffs stored on google cloud: https://github.com/scottyhq/pangeo-example-notebooks/tree/binderfy

You can use the 'launch binder' button to run it with a pangeo dask-kubernetes cluster, or just read through the landsat8-cog-ndvi.ipynb notebook.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Chunked processing across multiple raster (geoTIF) files 344621749

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);