issue_comments: 1034136761

This data as json

html_url	issue_url	id	node_id	user	created_at	updated_at	author_association	body	reactions	performed_via_github_app	issue
https://github.com/pydata/xarray/pull/6237#issuecomment-1034136761	https://api.github.com/repos/pydata/xarray/issues/6237	1034136761	IC_kwDOAMm_X849o6y5	38358698	2022-02-09T19:53:45Z	2022-02-09T19:53:45Z	CONTRIBUTOR	Thanks for your PR - we are definitely interested to make this work on windows. 👍 Instead of adding `:okexcept:` could we explicitly close the offending files? It seems that we can, although closing all of the references was non-trivial in "user-guide/io.rst". Please see the diffs below for the two files with the most extensive changes. Within each document, I've tried to delete the files as early as possible to keep that code close to the last use of the file. "user-guide/dask.rst" ```diff @@ -55,6 +55,8 @@ argument to :py:func:`~xarray.open_dataset` or using the .. ipython:: python :suppress: + import os + import numpy as np import pandas as pd import xarray as xr @@ -129,6 +131,11 @@ will return a ``dask.delayed`` object that can be computed later. with ProgressBar(): results = delayed_obj.compute() +.. ipython:: python + :suppress: + + os.remove("manipulated-example-data.nc") # Was not opened. + .. note:: When using Dask's distributed scheduler to write NETCDF4 files, @@ -147,14 +154,6 @@ A dataset can also be converted to a Dask DataFrame using :py:meth:`~xarray.Data Dask DataFrames do not support multi-indexes so the coordinate variables from the dataset are included as columns in the Dask DataFrame. -.. ipython:: python - :okexcept: - :suppress: - - import os - - os.remove("example-data.nc") - os.remove("manipulated-example-data.nc") Using Dask with xarray ---------------------- @@ -211,7 +210,7 @@ Dask arrays using the :py:meth:`~xarray.Dataset.persist` method: .. ipython:: python - ds = ds.persist() + persisted = ds.persist() :py:meth:`~xarray.Dataset.persist` is particularly useful when using a distributed cluster because the data will be loaded into distributed memory @@ -233,11 +232,6 @@ chunk size depends both on your data and on the operations you want to perform. With xarray, both converting data to a Dask arrays and converting the chunk sizes of Dask arrays is done with the :py:meth:`~xarray.Dataset.chunk` method: -.. ipython:: python - :suppress: - - ds = ds.chunk({"time": 10}) - .. ipython:: python rechunked = ds.chunk({"latitude": 100, "longitude": 100}) @@ -509,6 +503,11 @@ Notice that the 0-shaped sizes were not printed to screen. Since ``template`` ha expected = ds + 10 + 10 mapped.identical(expected) +.. ipython:: python + :suppress: + + ds.close() # Closes "example-data.nc". + os.remove("example-data.nc") .. tip:: ``` Above, I've removed the line `ds = ds.chunk({"time": 10})`, because the call to open the dataset already specified that chunking. "user-guide/io.rst" ```diff @@ -11,6 +11,8 @@ format (recommended). .. ipython:: python :suppress: + import os + import numpy as np import pandas as pd import xarray as xr @@ -84,6 +86,13 @@ We can load netCDF files to create a new Dataset using ds_disk = xr.open_dataset("saved_on_disk.nc") ds_disk +.. ipython:: python + :suppress: + + # Close "saved_on_disk.nc", but retain the file until after closing or deleting other + # datasets that will refer to it. + ds_disk.close() + Similarly, a DataArray can be saved to disk using the :py:meth:`DataArray.to_netcdf` method, and loaded from disk using the :py:func:`open_dataarray` function. As netCDF files @@ -204,11 +213,6 @@ You can view this encoding information (among others) in the Note that all operations that manipulate variables other than indexing will remove encoding information. -.. ipython:: python - :suppress: - - ds_disk.close() - .. _combining multiple files: @@ -484,14 +488,13 @@ and currently raises a warning unless ``invalid_netcdf=True`` is set: da.to_netcdf("complex.nc", engine="h5netcdf", invalid_netcdf=True) # Reading it back - xr.open_dataarray("complex.nc", engine="h5netcdf") + reopened = xr.open_dataarray("complex.nc", engine="h5netcdf") + reopened .. ipython:: python - :okexcept: :suppress: - import os - + reopened.close() os.remove("complex.nc") .. warning:: @@ -724,17 +727,19 @@ To export just the dataset schema without the data itself, use the ds.to_dict(data=False) -This can be useful for generating indices of dataset contents to expose to -search indices or other automated data discovery tools. - .. ipython:: python - :okexcept: :suppress: - import os - + # We're now done with the dataset named `ds`. Although the `with` statement closed + # the dataset, displaying the unpickled pickle of `ds` re-opened "saved_on_disk.nc". + # However, `ds` (rather than the unpickled dataset) refers to the open file. Delete + # `ds` to close the file. + del ds os.remove("saved_on_disk.nc") +This can be useful for generating indices of dataset contents to expose to +search indices or other automated data discovery tools. + .. _io.rasterio: Rasterio ``` We can also straightforwardly remove the "rasm.zarr" directory: "internals/zarr-encoding-spec.rst" ```diff @@ -63,3 +63,9 @@ re-open it directly with Zarr: print(os.listdir("rasm.zarr")) print(zgroup.tree()) dict(zgroup["Tair"].attrs) + +.. ipython:: python + :suppress: + + import shutil + shutil.rmtree("rasm.zarr") ``` Is that general approach agreeable? If so, I'll commit the changes for further comment and review.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		1124431593