issues

16 rows where repo = 13221727, state = "open" and user = 1197350 sorted by updated_at descending

type 2

  • issue 13
  • pull 3

state 1

  • open · 16

repo 1

  • xarray · 16
Columns (in row order below): id, node_id, number, title, user, state, locked, assignee, milestone, comments, created_at, updated_at (sort column), closed_at, author_association, active_lock_reason, draft, pull_request, body, reactions, performed_via_github_app, state_reason, repo, type
421064313 MDExOlB1bGxSZXF1ZXN0MjYxMjAyMDU2 2813 [WIP] added protect_dataset_variables_inplace to open_zarr rabernat 1197350 open 0     3 2019-03-14T14:50:15Z 2024-03-25T14:05:24Z   MEMBER   0 pydata/xarray/pulls/2813

This adds to open_zarr the same call to _protect_dataset_variables_inplace that we find in open_dataset. It wraps the arrays with indexing.MemoryCachedArray.

As far as I can tell, it does not work, in the sense that nothing is cached.

  • [ ] One possible way to close #2812
  • [ ] Tests added
  • [ ] Fully documented, including whats-new.rst for all changes and api.rst for new API
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2813/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
421070999 MDExOlB1bGxSZXF1ZXN0MjYxMjA3MTYz 2814 [WIP] Use zarr internal LRU caching rabernat 1197350 open 0     2 2019-03-14T15:01:06Z 2024-03-25T14:00:50Z   MEMBER   0 pydata/xarray/pulls/2814

Alternative way to close #2812. This uses zarr's own caching.

In contrast to #2813, this does work.

  • [ ] Closes #2812
  • [ ] Tests added
  • [ ] Fully documented, including whats-new.rst for all changes and api.rst for new API
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2814/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
663148659 MDU6SXNzdWU2NjMxNDg2NTk= 4242 Expose xarray's h5py serialization capabilites as public API? rabernat 1197350 open 0     5 2020-07-21T16:27:45Z 2024-03-20T13:33:15Z   MEMBER      

Xarray has a magic ability to serialize h5py datasets. We should expose this somehow and allow it to be used outside of xarray.

Consider the following example:

```python
import s3fs
import h5py
import dask.array as dsa
import xarray as xr
import cloudpickle

url = 'noaa-goes16/ABI-L2-RRQPEF/2020/001/00/OR_ABI-L2-RRQPEF-M6_G16_s20200010000216_e20200010009524_c20200010010034.nc'
fs = s3fs.S3FileSystem(anon=True)
f = fs.open(url)
ds = h5py.File(f, mode='r')
data = dsa.from_array(ds['RRQPE'])
_ = cloudpickle.dumps(data)
```

This raises TypeError: h5py objects cannot be pickled.

However, if I read the file with xarray...

```python
ds = xr.open_dataset(f, chunks={})
data = ds['RRQPE'].data
_ = cloudpickle.dumps(data)
```

It works just fine. This has come up in several places (e.g. https://github.com/dask/s3fs/issues/337, https://github.com/dask/distributed/issues/2787).

It seems like the ability to pickle these arrays is broadly useful, beyond xarray.

  1. How does our magic work? (A rough sketch of the underlying idea follows below.)
  2. What would it look like to break this magic out and expose it as public API (or inside another package)?
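
For context on question 1: the essence of the trick is that xarray never pickles the open h5py objects themselves; it pickles the information needed to reopen the file and defers the reopening until after unpickling. The class below is a minimal, hypothetical sketch of that idea. PicklableH5Array and its opener argument are illustrative names, not xarray's actual internals (which live in its file-manager and lazy-indexing machinery).

```python
import h5py


class PicklableH5Array:
    """Wrap an h5py dataset so that pickling stores how to reopen it."""

    def __init__(self, opener, variable):
        self._opener = opener        # callable returning an open file-like object
        self._variable = variable    # name of the dataset inside the file
        self._file = None

    @property
    def _dataset(self):
        # reopen lazily on first access (including after unpickling)
        if self._file is None:
            self._file = h5py.File(self._opener(), mode='r')
        return self._file[self._variable]

    def __getitem__(self, key):
        return self._dataset[key]

    @property
    def shape(self):
        return self._dataset.shape

    @property
    def dtype(self):
        return self._dataset.dtype

    def __getstate__(self):
        # drop the unpicklable h5py handle; keep only the recipe for reopening
        return {'opener': self._opener, 'variable': self._variable}

    def __setstate__(self, state):
        self.__init__(state['opener'], state['variable'])


# hypothetical usage with the example above:
# data = dsa.from_array(PicklableH5Array(lambda: fs.open(url), 'RRQPE'))
# cloudpickle.dumps(data) now works, because only the reopening recipe is pickled
```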
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4242/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
224553135 MDU6SXNzdWUyMjQ1NTMxMzU= 1385 slow performance with open_mfdataset rabernat 1197350 open 0     52 2017-04-26T18:06:32Z 2024-03-14T01:31:21Z   MEMBER      

We have a dataset stored across multiple netCDF files. We are getting very slow performance with open_mfdataset, and I would like to improve this.

Each individual netCDF file looks like this:

```python
%time ds_single = xr.open_dataset('float_trajectories.0000000000.nc')
ds_single
```
```
CPU times: user 14.9 ms, sys: 48.4 ms, total: 63.4 ms
Wall time: 60.8 ms

<xarray.Dataset>
Dimensions:  (npart: 8192000, time: 1)
Coordinates:
  * time     (time) datetime64[ns] 1993-01-01
  * npart    (npart) int32 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 ...
Data variables:
    z        (time, npart) float32 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 ...
    vort     (time, npart) float32 -9.71733e-10 -9.72858e-10 -9.73001e-10 ...
    u        (time, npart) float32 0.000545563 0.000544884 0.000544204 ...
    v        (time, npart) float32 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
    x        (time, npart) float32 180.016 180.047 180.078 180.109 180.141 ...
    y        (time, npart) float32 -79.9844 -79.9844 -79.9844 -79.9844 ...
```

As shown above, a single data file opens in ~60 ms.

When I call open_mfdataset on 49 files (each with a different time dimension but the same npart), here is what happens:

```python
%time ds = xr.open_mfdataset('*.nc')
ds
```
```
CPU times: user 1min 31s, sys: 25.4 s, total: 1min 57s
Wall time: 2min 4s

<xarray.Dataset>
Dimensions:  (npart: 8192000, time: 49)
Coordinates:
  * npart    (npart) int64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 ...
  * time     (time) datetime64[ns] 1993-01-01 1993-01-02 1993-01-03 ...
Data variables:
    z        (time, npart) float64 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 ...
    vort     (time, npart) float64 -9.717e-10 -9.729e-10 -9.73e-10 -9.73e-10 ...
    u        (time, npart) float64 0.0005456 0.0005449 0.0005442 0.0005437 ...
    v        (time, npart) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
    x        (time, npart) float64 180.0 180.0 180.1 180.1 180.1 180.2 180.2 ...
    y        (time, npart) float64 -79.98 -79.98 -79.98 -79.98 -79.98 -79.98 ...
```

It takes over 2 minutes to open the dataset. Specifying concat_dim='time' does not improve performance.

Here is %prun of the open_mfdataset command.

```
         748994 function calls (724222 primitive calls) in 142.160 seconds

   Ordered by: internal time

   ncalls    tottime  percall  cumtime  percall filename:lineno(function)
       49     62.455    1.275   62.458    1.275 {method 'get_indexer' of 'pandas.index.IndexEngine' objects}
       49     47.207    0.963   47.209    0.963 base.py:1067(is_unique)
      196      7.198    0.037    7.267    0.037 {operator.getitem}
       49      4.632    0.095    4.687    0.096 netCDF4_.py:182(_open_netcdf4_group)
      240      3.189    0.013    3.426    0.014 numeric.py:2476(array_equal)
       98      1.937    0.020    1.937    0.020 {numpy.core.multiarray.arange}
4175/3146      1.867    0.000    9.296    0.003 {numpy.core.multiarray.array}
       49      1.525    0.031  119.144    2.432 alignment.py:251(reindex_variables)
       24      1.065    0.044    1.065    0.044 {method 'cumsum' of 'numpy.ndarray' objects}
       12      1.010    0.084    1.010    0.084 {method 'sort' of 'numpy.ndarray' objects}
5227/4035      0.660    0.000    1.688    0.000 collections.py:50(init)
       12      0.600    0.050    3.238    0.270 core.py:2761(insert)
12691/7497     0.473    0.000    0.875    0.000 indexing.py:363(shape)
   110728      0.425    0.000    0.663    0.000 {isinstance}
       12      0.413    0.034    0.413    0.034 {method 'flatten' of 'numpy.ndarray' objects}
       12      0.341    0.028    0.341    0.028 {numpy.core.multiarray.where}
        2      0.333    0.166    0.333    0.166 {pandas._join.outer_join_indexer_int64}
        1      0.331    0.331  142.164  142.164 <string>:1(<module>)
```

It looks like most of the time is being spent on reindex_variables. I understand why this happens...xarray needs to make sure the dimensions are the same in order to concatenate them together.

Is there any obvious way I could improve the load time? For example, can I give a hint to xarray that this reindex_variables step is not necessary, since I know that all the npart dimensions are the same in each file?

Possibly related to #1301 and #1340.
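
For readers finding this issue later: newer xarray versions grew keywords that let open_mfdataset skip exactly this alignment step by taking indexes and non-concatenated variables from the first file. A sketch of what that looks like; keyword availability depends on the xarray version, and none of these options existed when this issue was opened:

```python
import xarray as xr

ds = xr.open_mfdataset(
    "float_trajectories.*.nc",
    combine="nested",       # files are simply stacked along one dimension
    concat_dim="time",      # the dimension to concatenate along
    data_vars="minimal",    # only concatenate variables that contain `time`
    coords="minimal",       # don't compare/broadcast the other coordinates
    compat="override",      # take non-concatenated variables from the first file
    join="override",        # skip index alignment: reuse the first file's npart index
    parallel=True,          # open files in parallel with dask
)
```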

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1385/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
396806015 MDU6SXNzdWUzOTY4MDYwMTU= 2660 DataArrays to/from Zarr Arrays rabernat 1197350 open 0     7 2019-01-08T08:56:05Z 2023-10-27T14:00:20Z   MEMBER      

Right now, open_zarr and Dataset.to_zarr only work with Zarr groups. Zarr Groups can contain multiple Array objects.

It would be nice if we could open Zarr Arrays directly as xarray DataArrays and write xarray DataArrays directly to Zarr Arrays.

However, this might not make sense, because, unlike xarray DataArrays, zarr Arrays can't hold any coordinates.

Just raising this idea for discussion.
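
As a point of comparison, the current workaround is to round-trip through a one-variable Dataset, since to_zarr/open_zarr only speak in terms of groups. A minimal sketch using existing public API:

```python
import xarray as xr

da = xr.DataArray([[1.0, 2.0], [3.0, 4.0]], dims=('x', 'y'), name='foo')

store = {}                         # any MutableMapping works as a Zarr store
da.to_dataset().to_zarr(store)     # writes a Zarr *group* containing one array plus coords

roundtripped = xr.open_zarr(store)['foo']
```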

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2660/reactions",
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  reopened xarray 13221727 issue
527323165 MDU6SXNzdWU1MjczMjMxNjU= 3564 DOC: from examples to tutorials rabernat 1197350 open 0     14 2019-11-22T17:30:14Z 2023-02-21T20:01:05Z   MEMBER      

It's awesome to see the work we did at Scipy2019 finally hit the live docs! Thanks @keewis and @dcherian for pushing it through.

Now that we have these more detailed, realistic examples, let's think about how we can take our documentation to the next level. I think we need TUTORIALS. The examples are a good start. I think we can build on these to create tutorials which walk through most of xarray's core features with domain-specific datasets. We could have different tutorials for different fields. For example:

  • Xarray tutorial for meteorology / atmospheric science
  • Xarray tutorial for oceanography
  • Xarray tutorial for physics (whatever @fujiisoup and @TomNicholas do! 😉 )
  • Xarray tutorial for finance (whatever @max-sixty and @crusaderky do! :wink:)
  • Xarray tutorial for neuroscience (see nice example from @choldgraf: https://predictablynoisy.com/xarray-explore-ieeg)

Each tutorial would cover the same core elements (loading data, indexing, aligning, grouping, computations, plotting, etc.), but using a familiar, real dataset, rather than the generic, made-up ones in our current docs.

Yes, this would be a lot of work, but I think it would have a huge impact. Just raising here for discussion.

xref #2980 #2378 #3131

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3564/reactions",
    "total_count": 6,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 6,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
750985364 MDU6SXNzdWU3NTA5ODUzNjQ= 4610 Add histogram method rabernat 1197350 open 0     21 2020-11-25T17:05:02Z 2023-02-16T21:17:57Z   MEMBER      

On today's dev call, we discussed the possible role that numpy_groupies could play in xarray (#4540). I noted that many of the use cases for advanced grouping overlap significantly with histogram-type operations. A major need that we have is to take [weighted] histograms over some, but not all, axes of DataArrays. Since groupby doesn't allow this (see #1013), we started the standalone xhistogram package.

Given the broad usefulness of this feature, I suggested that we might want to deprecate xhistogram and move the histogram function to xarray. We may want to also reimplement it using numpy_groupies, which I think is smarter than our implementation in xhistogram.

I've opened this issue to keep track of the idea.
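
To make the target behaviour concrete, here is roughly the kind of operation in question, built from existing primitives rather than the proposed API: a weighted histogram along one dimension of a DataArray while keeping the others. This is only an illustrative sketch; xhistogram does this more efficiently and with dask support.

```python
import numpy as np
import xarray as xr

def weighted_hist(data, weights, bins):
    counts, _ = np.histogram(data, bins=bins, weights=weights)
    return counts

da = xr.DataArray(np.random.rand(10, 500), dims=("y", "time"))
w = xr.ones_like(da)                      # weights (all ones here)
bins = np.linspace(0, 1, 11)

hist = xr.apply_ufunc(
    weighted_hist, da, w,
    kwargs={"bins": bins},
    input_core_dims=[["time"], ["time"]],  # histogram along `time` only
    output_core_dims=[["bin"]],            # a new `bin` dimension replaces it
    vectorize=True,                        # loop over the remaining `y` dimension
)
hist = hist.assign_coords(bin=0.5 * (bins[:-1] + bins[1:]))  # bin centers
```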

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4610/reactions",
    "total_count": 9,
    "+1": 9,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
421029352 MDU6SXNzdWU0MjEwMjkzNTI= 2812 expose zarr caching from xarray rabernat 1197350 open 0     12 2019-03-14T13:50:16Z 2022-09-14T01:33:03Z   MEMBER      

Zarr has its own internal mechanism for caching, described here:

  • https://zarr.readthedocs.io/en/stable/tutorial.html#distributed-cloud-storage
  • https://zarr.readthedocs.io/en/stable/api/storage.html#zarr.storage.LRUStoreCache

However, this capability is currently inaccessible from xarray.

I propose to add a new keyword cache=True/False to open_zarr which wraps the store in an LRUStoreCache.
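
For illustration, this is what the proposed keyword would do under the hood, spelled out manually (a sketch against the zarr v2 API; the example.zarr path is a placeholder, and only the cache= keyword itself is new):

```python
import xarray as xr
import zarr

base_store = zarr.DirectoryStore('example.zarr')               # or any cloud store mapping
cached_store = zarr.LRUStoreCache(base_store, max_size=2**28)  # cache up to ~256 MB of chunks

ds = xr.open_zarr(cached_store)  # repeated reads of the same chunks now hit the cache
```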

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2812/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
710357592 MDU6SXNzdWU3MTAzNTc1OTI= 4470 xarray / vtk integration rabernat 1197350 open 0     21 2020-09-28T15:14:32Z 2022-06-22T18:20:39Z   MEMBER      

I just had a great chat with @aashish24 and @banesullivan of Kitware about how we could improve interoperability between xarray and the VTK stack. They also made me aware of pyvista, which looks very cool. As a user of both tools, I can see it would be great if I could quickly drop into VTK from xarray for advanced 3D visualization.

A low-hanging fruit would be to simply be able to round-trip data between vtk and xarray in memory, much like we do with pandas. This should be doable because vtk already has a netCDF file reader. Rather than reading the data from a file, vtk could initialize its objects from an xarray dataset which, in principle, should contain all the same data / metadata.

Beyond this, there are many possibilities for deeper integration around the treatment of finite-volume cells, structured / unstructured meshes, etc. Possibly related to https://github.com/pydata/xarray/issues/4222.

I just thought I would open this issue to track the general topic of xarray / vtk integration.
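
To illustrate the round-trip idea with pyvista, a rough sketch (the variable names and axis-order handling are illustrative; a real bridge would have to be much more careful about coordinate order, curvilinear grids, units, and time):

```python
import numpy as np
import pyvista as pv
import xarray as xr

# a small synthetic (z, y, x) temperature field with 1D coordinates
da = xr.DataArray(
    np.random.rand(4, 5, 6),
    dims=("z", "y", "x"),
    coords={"z": np.arange(4.0), "y": np.arange(5.0), "x": np.arange(6.0)},
    name="temperature",
)

# build a rectilinear VTK grid from the 1D coordinate vectors
grid = pv.RectilinearGrid(da["x"].values, da["y"].values, da["z"].values)

# VTK expects point data ordered with x varying fastest; a C-order ravel of a
# (z, y, x) array gives exactly that ordering
grid[da.name] = da.transpose("z", "y", "x").values.ravel()

grid.plot(scalars=da.name)  # opens an interactive PyVista window
```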

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4470/reactions",
    "total_count": 23,
    "+1": 13,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 5,
    "rocket": 5,
    "eyes": 0
}
    xarray 13221727 issue
467908830 MDExOlB1bGxSZXF1ZXN0Mjk3NDQ1NDc3 3131 WIP: tutorial on merging datasets rabernat 1197350 open 0 TomNicholas 35968931   10 2019-07-15T01:28:25Z 2022-06-09T14:50:17Z   MEMBER   0 pydata/xarray/pulls/3131
  • [x] Closes #1391
  • [ ] Fully documented, including whats-new.rst for all changes and api.rst for new API

This is a start on a tutorial about merging / combining datasets.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3131/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
208312826 MDU6SXNzdWUyMDgzMTI4MjY= 1273 replace a dim with a coordinate from another dataset rabernat 1197350 open 0     4 2017-02-17T02:15:36Z 2022-04-09T15:26:20Z   MEMBER      

I often want a function that takes a dataarray / dataset and replaces a dimension with a coordinate from a different dataset.

@shoyer proposed the following simple solution.

```python
def replace_dim(da, olddim, newdim):
    renamed = da.rename({olddim: newdim.name})

    # note that alignment along a dimension is skipped when you are overriding
    # the relevant coordinate values
    renamed.coords[newdim.name] = newdim
    return renamed
```
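
A hypothetical usage example of the helper above (the lon coordinate and the second dataset are made up for illustration):

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(3.0), dims="x")
other = xr.Dataset(coords={"lon": ("lon", [10.0, 20.0, 30.0])})

da_lon = replace_dim(da, "x", other["lon"])
# da_lon now has dimension "lon" with coordinate values [10., 20., 30.]
```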

Is this of broad enough interest to add a built-in method for?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1273/reactions",
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
439875798 MDU6SXNzdWU0Mzk4NzU3OTg= 2937 encoding of boolean dtype in zarr rabernat 1197350 open 0     3 2019-05-03T03:53:27Z 2022-04-09T01:22:42Z   MEMBER      

I want to store an array with 1364688000 boolean values in zarr. I will have to read this array many times, so I am trying to do it as efficiently as possible.

I have noticed that, if we try to write boolean data to zarr from xarray, zarr stores it as i8. ~This means we are using 8x more memory than we actually need.~ In researching this, I actually learned that numpy bools use a full byte of memory 😲! However, we could still improve performance (albeit very marginally) by skipping the unnecessary dtype encoding that happens here.

Example:

```python
import xarray as xr
import zarr

for dtype in ['f8', 'i4', 'bool']:
    ds = xr.DataArray([1, 0]).astype(dtype).to_dataset(name='foo')
    store = {}
    ds.to_zarr(store)
    za = zarr.open(store)['foo']
    print(dtype, za.dtype, za.attrs.get('dtype'))
```

gives

```
f8 float64 None
i4 int32 None
bool int8 bool
```

So it seems like, during serialization of bool data, xarray is converting the data to int8 and then adding a {'dtype': 'bool'} to the attributes as encoding. When the data is read back, this gets decoded and the data is coerced back to bool.

Problem description

Since zarr is fully capable of storing bool data directly, we should not need to encode the data as i8.
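
A quick check of that claim (a sketch against the zarr v2 API):

```python
import numpy as np
import zarr

z = zarr.array(np.array([True, False, True]))
print(z.dtype)   # bool -- zarr stores the boolean dtype natively
print(z[:])      # [ True False  True]
```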

I think this happens in encode_cf_variable: https://github.com/pydata/xarray/blob/612d390f925e5490314c363e5e368b2a8bd5daf0/xarray/conventions.py#L236

which calls maybe_encode_bools: https://github.com/pydata/xarray/blob/612d390f925e5490314c363e5e368b2a8bd5daf0/xarray/conventions.py#L105-L112

So maybe we make the boolean encoding optional?

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.7 | packaged by conda-forge | (default, Feb 28 2019, 09:07:38) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-693.17.1.el7.centos.plus.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.8.18
libnetcdf: 4.4.1.1

xarray: 0.12.1
pandas: 0.20.3
numpy: 1.13.3
scipy: 1.1.0
netCDF4: 1.3.0
pydap: None
h5netcdf: 0.5.0
h5py: 2.7.1
Nio: None
zarr: 2.3.1
cftime: None
nc_time_axis: None
PseudonetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 0.19.0+3.g064ebb1
distributed: 1.21.8
matplotlib: 3.0.3
cartopy: 0.16.0
seaborn: 0.8.1
setuptools: 36.6.0
pip: 9.0.1
conda: None
pytest: 3.2.1
IPython: 6.2.1
sphinx: None
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2937/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
517338735 MDU6SXNzdWU1MTczMzg3MzU= 3484 Need documentation on sparse / cupy integration rabernat 1197350 open 0     6 2019-11-04T18:57:05Z 2022-02-24T17:12:21Z   MEMBER      

In https://github.com/pydata/xarray/issues/1375#issuecomment-526432439, @fjanoos asked:

Is there documentation for using sparse arrays ? Could you point me to some example code ?

@dcherian:

there isn't any formal documentation yet but you can look at test_sparse.py for examples. That file will also tell you what works and doesn't work currently.

If we want people to take advantage of this cool new capability, we need to document it! I'm at pydata NYC and want to share something about this, but it's hard to know where to start without docs.

xref https://github.com/pydata/xarray/issues/3245
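
For reference, the kind of minimal example such documentation would start from (a sketch; requires the sparse package):

```python
import numpy as np
import sparse
import xarray as xr

dense = np.zeros((1000, 1000))
dense[0, 0] = 1.0

# wrap a sparse.COO array in a DataArray; xarray treats it as a duck array
da = xr.DataArray(sparse.COO.from_numpy(dense), dims=("x", "y"))

print(da.data.nnz)          # 1 -- the wrapped array stays sparse
print(type((da * 2).data))  # arithmetic keeps the sparse backing
```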

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3484/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1047608434 I_kwDOAMm_X84-cTxy 5954 Writeable backends via entrypoints rabernat 1197350 open 0     7 2021-11-08T15:47:12Z 2021-11-09T16:28:59Z   MEMBER      

The backend refactor has gone a long way towards making it easier to implement custom backend readers via entry points. However, it is still not clear how to implement a writeable backend from a third party package as an entry point. Some of the reasons for this are:

  • While our reading function (open_dataset) has a generic name, our writing functions (Dataset.to_netcdf / Dataset.to_zarr) are still format specific. (Related to https://github.com/pydata/xarray/issues/3638). I propose we introduce a generic Dataset.to method and deprecate the others.
  • The BackendEntrypoint base class does not have a writing method, just open_dataset: https://github.com/pydata/xarray/blob/e0deb9cf0a5cd5c9e3db033fd13f075added9c1e/xarray/backends/common.py#L356-L370 (Related to https://github.com/pydata/xarray/issues/1970)
  • As a result, writing is implemented ad-hoc for each backend.
  • This makes it impossible for a third-party package to implement writing.

We should fix this situation! Here are the steps I would take.

  • [ ] Decide on the desired API for writeable backends (see the sketch after this list for one possible shape).
  • [ ] Formalize this in the BackendEntrypoint base class.
  • [ ] Refactor the existing writeable backends (netcdf4-python, h5netcdf, scipy, Zarr) to use this API
  • [ ] Maybe deprecate to_zarr and to_netcdf (or at least refactor to make a shallow call to a generic method)
  • [ ] Encourage third party implementors to try it (e.g. TileDB)
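
To make the first step concrete, here is a rough, hypothetical sketch of what a writeable backend entrypoint could look like. guess_can_open and open_dataset mirror the existing BackendEntrypoint API; the write_dataset hook and the generic Dataset.to call mentioned in the comments are exactly the parts this issue proposes and do not exist in xarray today.

```python
from xarray.backends import BackendEntrypoint


class MyFormatBackendEntrypoint(BackendEntrypoint):
    description = "Read and (proposed) write my custom format"

    def guess_can_open(self, filename_or_obj):
        return str(filename_or_obj).endswith(".myfmt")

    def open_dataset(self, filename_or_obj, *, drop_variables=None, **kwargs):
        # existing entry point: return an xr.Dataset built from the file
        raise NotImplementedError

    # --- proposed addition: a symmetric write hook ------------------------
    def write_dataset(self, dataset, target, *, mode="w", **kwargs):
        # would be called by a generic Dataset.to(target, engine="myformat")
        raise NotImplementedError
```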
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5954/reactions",
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 1
}
    xarray 13221727 issue
1006588071 I_kwDOAMm_X847_1Cn 5816 Link API docs to user guide and other examples rabernat 1197350 open 0     3 2021-09-24T15:34:31Z 2021-10-10T16:39:18Z   MEMBER      

Noting down a comment by @danjonesocean on Twitter: https://twitter.com/DanJonesOcean/status/1441392596362874882

In general, having more examples on each xarray page (like the one below) would be good. Then they would come up quickly in function searches:

http://xarray.pydata.org/en/stable/generated/xarray.Dataset.merge.html#xarray.Dataset.merge

Our API docs are generated from the function docstrings, and these are usually the first thing users hit when they search for functions. However, these docstrings uniformly lack examples, often leaving users stuck.

I see two ways to mitigate this:

  • Add examples directly to the docstrings (suggested by @jklymak); a sketch of what that could look like follows below
  • Cross reference other examples from the user guide or other tutorials
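
For the first option, the kind of addition meant here would look roughly like this inside a docstring (a sketch; the real Dataset.merge docstring would choose its own wording and data):

```python
def merge(self, other):
    """Merge another Dataset into this one.

    Examples
    --------
    >>> ds = xr.Dataset({"a": ("x", [1, 2])})
    >>> other = xr.Dataset({"b": ("x", [3, 4])})
    >>> ds.merge(other)  # returns a Dataset containing both "a" and "b"
    """
```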

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5816/reactions",
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 1
}
    xarray 13221727 issue
403359614 MDU6SXNzdWU0MDMzNTk2MTQ= 2712 improve docs on zarr + cloud storage rabernat 1197350 open 0     1 2019-01-25T22:35:08Z 2020-12-26T14:34:37Z   MEMBER      

In the Pangeo gitter chat, @birdsarah helped identify some shortcomings in the documentation about zarr cloud storage (https://github.com/pydata/xarray/blob/master/doc/io.rst#cloud-storage-buckets). We don't mention s3fs or how to use authentication. A more detailed set of examples would probably help people struggling to make the pieces fit together.
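
The kind of example the docs are currently missing, for the read side (a sketch; the bucket path is made up, and authentication would be passed to S3FileSystem instead of anon=True):

```python
import s3fs
import xarray as xr

fs = s3fs.S3FileSystem(anon=True)                               # or pass credentials here
store = s3fs.S3Map(root="some-bucket/path/to/data.zarr", s3=fs)
ds = xr.open_zarr(store)
```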

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2712/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
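
For anyone working against the underlying SQLite database directly, the filter shown at the top of this page can be reproduced with a few lines of Python (a sketch; the database filename github.db is an assumption):

```python
import sqlite3

conn = sqlite3.connect("github.db")
rows = conn.execute(
    """
    SELECT [number], [title], [comments], [updated_at], [type]
    FROM [issues]
    WHERE [repo] = 13221727 AND [state] = 'open' AND [user] = 1197350
    ORDER BY [updated_at] DESC
    """
).fetchall()

for number, title, comments, updated_at, kind in rows:
    print(f"#{number} [{kind}] {title} ({comments} comments, updated {updated_at})")
```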