home / github / issue_comments

Menu
  • GraphQL API
  • Search all tables

issue_comments: 1034136761

This data as json

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/pull/6237#issuecomment-1034136761 https://api.github.com/repos/pydata/xarray/issues/6237 1034136761 IC_kwDOAMm_X849o6y5 38358698 2022-02-09T19:53:45Z 2022-02-09T19:53:45Z CONTRIBUTOR

Thanks for your PR - we are definitely interested to make this work on windows.

👍

Instead of adding :okexcept: could we explicitly close the offending files?

It seems that we can, although closing all of the references was non-trivial in "user-guide/io.rst". Please see the diffs below for the two files with the most extensive changes. Within each document, I've tried to delete the files as early as possible to keep that code close to the last use of the file.

"user-guide/dask.rst" ```diff @@ -55,6 +55,8 @@ argument to :py:func:`~xarray.open_dataset` or using the .. ipython:: python :suppress: + import os + import numpy as np import pandas as pd import xarray as xr @@ -129,6 +131,11 @@ will return a ``dask.delayed`` object that can be computed later. with ProgressBar(): results = delayed_obj.compute() +.. ipython:: python + :suppress: + + os.remove("manipulated-example-data.nc") # Was not opened. + .. note:: When using Dask's distributed scheduler to write NETCDF4 files, @@ -147,14 +154,6 @@ A dataset can also be converted to a Dask DataFrame using :py:meth:`~xarray.Data Dask DataFrames do not support multi-indexes so the coordinate variables from the dataset are included as columns in the Dask DataFrame. -.. ipython:: python - :okexcept: - :suppress: - - import os - - os.remove("example-data.nc") - os.remove("manipulated-example-data.nc") Using Dask with xarray ---------------------- @@ -211,7 +210,7 @@ Dask arrays using the :py:meth:`~xarray.Dataset.persist` method: .. ipython:: python - ds = ds.persist() + persisted = ds.persist() :py:meth:`~xarray.Dataset.persist` is particularly useful when using a distributed cluster because the data will be loaded into distributed memory @@ -233,11 +232,6 @@ chunk size depends both on your data and on the operations you want to perform. With xarray, both converting data to a Dask arrays and converting the chunk sizes of Dask arrays is done with the :py:meth:`~xarray.Dataset.chunk` method: -.. ipython:: python - :suppress: - - ds = ds.chunk({"time": 10}) - .. ipython:: python rechunked = ds.chunk({"latitude": 100, "longitude": 100}) @@ -509,6 +503,11 @@ Notice that the 0-shaped sizes were not printed to screen. Since ``template`` ha expected = ds + 10 + 10 mapped.identical(expected) +.. ipython:: python + :suppress: + + ds.close() # Closes "example-data.nc". + os.remove("example-data.nc") .. tip:: ``` Above, I've removed the line `ds = ds.chunk({"time": 10})`, because the call to open the dataset already specified that chunking.
"user-guide/io.rst" ```diff @@ -11,6 +11,8 @@ format (recommended). .. ipython:: python :suppress: + import os + import numpy as np import pandas as pd import xarray as xr @@ -84,6 +86,13 @@ We can load netCDF files to create a new Dataset using ds_disk = xr.open_dataset("saved_on_disk.nc") ds_disk +.. ipython:: python + :suppress: + + # Close "saved_on_disk.nc", but retain the file until after closing or deleting other + # datasets that will refer to it. + ds_disk.close() + Similarly, a DataArray can be saved to disk using the :py:meth:`DataArray.to_netcdf` method, and loaded from disk using the :py:func:`open_dataarray` function. As netCDF files @@ -204,11 +213,6 @@ You can view this encoding information (among others) in the Note that all operations that manipulate variables other than indexing will remove encoding information. -.. ipython:: python - :suppress: - - ds_disk.close() - .. _combining multiple files: @@ -484,14 +488,13 @@ and currently raises a warning unless ``invalid_netcdf=True`` is set: da.to_netcdf("complex.nc", engine="h5netcdf", invalid_netcdf=True) # Reading it back - xr.open_dataarray("complex.nc", engine="h5netcdf") + reopened = xr.open_dataarray("complex.nc", engine="h5netcdf") + reopened .. ipython:: python - :okexcept: :suppress: - import os - + reopened.close() os.remove("complex.nc") .. warning:: @@ -724,17 +727,19 @@ To export just the dataset schema without the data itself, use the ds.to_dict(data=False) -This can be useful for generating indices of dataset contents to expose to -search indices or other automated data discovery tools. - .. ipython:: python - :okexcept: :suppress: - import os - + # We're now done with the dataset named `ds`. Although the `with` statement closed + # the dataset, displaying the unpickled pickle of `ds` re-opened "saved_on_disk.nc". + # However, `ds` (rather than the unpickled dataset) refers to the open file. Delete + # `ds` to close the file. + del ds os.remove("saved_on_disk.nc") +This can be useful for generating indices of dataset contents to expose to +search indices or other automated data discovery tools. + .. _io.rasterio: Rasterio ```

We can also straightforwardly remove the "rasm.zarr" directory:

"internals/zarr-encoding-spec.rst" ```diff @@ -63,3 +63,9 @@ re-open it directly with Zarr: print(os.listdir("rasm.zarr")) print(zgroup.tree()) dict(zgroup["Tair"].attrs) + +.. ipython:: python + :suppress: + + import shutil + shutil.rmtree("rasm.zarr") ```

Is that general approach agreeable? If so, I'll commit the changes for further comment and review.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  1124431593
Powered by Datasette · Queries took 0.461ms · About: xarray-datasette