issues: 435535284

id: 435535284
node_id: MDU6SXNzdWU0MzU1MzUyODQ=
number: 2912
title: Writing a netCDF file is unexpectedly slow
user: 2014301
state: closed
locked: 0
assignee: (none)
milestone: (none)
comments: 12
created_at: 2019-04-21T18:31:36Z
updated_at: 2023-09-12T15:58:18Z
closed_at: 2023-09-12T15:58:18Z
author_association: NONE
active_lock_reason: (none)
draft: (none)
pull_request: (none)

```python
import xarray as xr

# nclist: list of input netCDF file paths (defined elsewhere)
ncdat = xr.open_mfdataset(nclist, concat_dim='time')

# lat/lon are stored per time step; keep only the first time slice
ncdat['lat'] = ncdat['lat'].isel(time=0).drop('time')
ncdat['lon'] = ncdat['lon'].isel(time=0).drop('time')
ncdat = ncdat.rename({'north_south': 'lat', 'east_west': 'lon'})

lat_coords = ncdat.lat[:, 0]  # Extract latitudes
lon_coords = ncdat.lon[0, :]  # Extract longitudes

# Replace the 2-D lat/lon variables with 1-D coordinates
ncdat = ncdat.drop(['lat', 'lon'])

reformatted_ncdat = ncdat.assign_coords(lat=lat_coords, lon=lon_coords, time=ncdat.coords['time'])

ncdat = reformatted_ncdat.sortby('time')
ncdat.to_netcdf('testing.nc')
```

Problem description

After some processing, I am left with this xarray dataset ncdat which I want to export to a netCDF file.

    <xarray.Dataset>
    Dimensions:                 (lat: 59, lon: 75, time: 500)
    Coordinates:
      * time                    (time) datetime64[ns] 2007-01-22 ... 2008-06-04
      * lat                     (lat) float32 -4.25 -4.15 ... 1.4500003 1.5500002
      * lon                     (lon) float32 29.049988 29.149994 ... 36.450012
    Data variables:
        Streamflow_tavg         (time, lat, lon) float32 dask.array<shape=(500, 59, 75), chunksize=(1, 59, 75)>
        RiverDepth_tavg         (time, lat, lon) float32 dask.array<shape=(500, 59, 75), chunksize=(1, 59, 75)>
        RiverFlowVelocity_tavg  (time, lat, lon) float32 dask.array<shape=(500, 59, 75), chunksize=(1, 59, 75)>
        FloodedFrac_tavg        (time, lat, lon) float32 dask.array<shape=(500, 59, 75), chunksize=(1, 59, 75)>
        SurfElev_tavg           (time, lat, lon) float32 dask.array<shape=(500, 59, 75), chunksize=(1, 59, 75)>
        SWS_tavg                (time, lat, lon) float32 dask.array<shape=(500, 59, 75), chunksize=(1, 59, 75)>
    Attributes:
        missing_value:           -9999.0
        NUM_SOIL_LAYERS:         1
        SOIL_LAYER_THICKNESSES:  1.0
        title:                   LIS land surface model output
        institution:             NASA GSFC
        source:                  model_not_specified
        history:                 created on date: 2019-04-19T09:11:12.992
        references:              Kumar_etal_EMS_2006, Peters-Lidard_etal_ISSE_2007
        conventions:             CF-1.6
        comment:                 website: http://lis.gsfc.nasa.gov/
        MAP_PROJECTION:          EQUIDISTANT CYLINDRICAL
        SOUTH_WEST_CORNER_LAT:   -4.25
        SOUTH_WEST_CORNER_LON:   29.05
        DX:                      0.1
        DY:                      0.1

The problem is that the export takes an inordinately long time: almost 10 minutes for this particular file, which is only 35 MB.
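For a sense of scale, the sizes implied by the dataset repr above can be worked out directly. This is only back-of-the-envelope arithmetic using the dimensions, dtype, and chunksize shown there; it ignores the small coordinate arrays and any compression, and the variable names are made up for illustration.

```python
# Figures copied from the Dataset repr above.
n_time, n_lat, n_lon = 500, 59, 75
n_vars = 6                  # data variables, all float32
bytes_per_value = 4         # float32
chunk_shape = (1, 59, 75)   # dask chunksize shown in the repr

values_per_var = n_time * n_lat * n_lon
total_bytes = values_per_var * bytes_per_value * n_vars

# One chunk per time step per variable
chunks_per_var = n_time // chunk_shape[0]
total_chunks = chunks_per_var * n_vars

print(f"uncompressed data size: {total_bytes / 1e6:.1f} MB")  # 53.1 MB
print(f"dask chunks to write:   {total_chunks}")              # 3000
```

So the data variables come to roughly 53 MB uncompressed, spread over 3000 single-time-step dask chunks.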

How can I expedite this process? Is there anything wrong with the structure of ncdat?
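One experiment that might help narrow this down is timing the write with and without loading the dask-backed variables into memory first. The sketch below is an assumption about where the time goes, not a confirmed fix: `timed_write` is a hypothetical helper, and it relies only on the `Dataset.load()` and `Dataset.to_netcdf()` methods.

```python
import time


def timed_write(ds, path, load_first=True):
    """Write `ds` to `path` with to_netcdf() and return the elapsed seconds.

    `ds` is assumed to be an xarray.Dataset such as `ncdat`. With
    load_first=True, Dataset.load() reads every dask-backed variable into
    memory before the write, so to_netcdf() receives in-memory arrays
    instead of lazily evaluated chunks.
    """
    start = time.perf_counter()
    if load_first:
        ds = ds.load()  # only sensible when the data fits in RAM
    ds.to_netcdf(path)
    return time.perf_counter() - start


# Hypothetical usage with the dataset from this report:
# elapsed = timed_write(ncdat, 'testing.nc')
# print(f"write took {elapsed:.1f} s")
```

If the loaded variant turns out to be much faster, that would point at the per-chunk lazy evaluation rather than at the structure of `ncdat` itself.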

Expected Output

A netCDF file

Output of xr.show_versions()

    INSTALLED VERSIONS
    ------------------
    commit: None
    python: 3.7.3 | packaged by conda-forge | (default, Mar 27 2019, 23:01:00) [GCC 7.3.0]
    python-bits: 64
    OS: Linux
    OS-release: 3.0.101-0.47.105-default
    machine: x86_64
    processor: x86_64
    byteorder: little
    LC_ALL: None
    LANG: en_US.UTF-8
    LOCALE: en_US.UTF-8
    libhdf5: 1.10.4
    libnetcdf: 4.6.2

    xarray: 0.12.1
    pandas: 0.24.2
    numpy: 1.16.2
    scipy: 1.2.1
    netCDF4: 1.5.0.1
    pydap: None
    h5netcdf: None
    h5py: None
    Nio: None
    zarr: None
    cftime: 1.0.3.4
    nc_time_axis: None
    PseudonetCDF: None
    rasterio: None
    cfgrib: None
    iris: None
    bottleneck: 1.2.1
    dask: 1.2.0
    distributed: 1.27.0
    matplotlib: 3.0.3
    cartopy: 0.17.0
    seaborn: 0.9.0
    setuptools: 41.0.0
    pip: 19.0.3
    conda: None
    pytest: None
    IPython: 7.4.0
    sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2912/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
performed_via_github_app: (none)
state_reason: completed
repo: 13221727
type: issue
