issue_comments

9 rows where issue = 243927150 sorted by updated_at descending

user 5

  • hadfieldnz 4
  • rabernat 2
  • shoyer 1
  • jhamman 1
  • fmaussion 1

author_association 2

  • MEMBER 5
  • NONE 4

issue 1

  • Excessive memory usage when printing multi-file Dataset · 9 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
331539708 https://github.com/pydata/xarray/issues/1481#issuecomment-331539708 https://api.github.com/repos/pydata/xarray/issues/1481 MDEyOklzc3VlQ29tbWVudDMzMTUzOTcwOA== jhamman 2443309 2017-09-22T19:30:00Z 2017-09-22T19:30:00Z MEMBER

@hadfieldnz - I think this was just fixed in #1532. Keep an eye out for the 0.10 release. Feel free to reopen if you feel there's more to do here.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Excessive memory usage when printing multi-file Dataset 243927150
317301035 https://github.com/pydata/xarray/issues/1481#issuecomment-317301035 https://api.github.com/repos/pydata/xarray/issues/1481 MDEyOklzc3VlQ29tbWVudDMxNzMwMTAzNQ== hadfieldnz 29717790 2017-07-24T01:58:32Z 2017-07-24T01:58:32Z NONE

In response to your comment, Stephan:

> More broadly: maybe we should disable automatically printing a preview of the contents of xarray.Dataset objects when they have lazily loaded data in the form of dask arrays. This is convenient for interactive use in many cases (when it can be done cheaply!) but fails in many edge cases.

Speaking rather selfishly--as someone who is quite good at finding bugs in scientific software, but not much use in fixing them--my worry is that the bugs that are no longer uncovered by printing the dataset preview would come back to bite me some other way.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Excessive memory usage when printing multi-file Dataset 243927150
317288301 https://github.com/pydata/xarray/issues/1481#issuecomment-317288301 https://api.github.com/repos/pydata/xarray/issues/1481 MDEyOklzc3VlQ29tbWVudDMxNzI4ODMwMQ== hadfieldnz 29717790 2017-07-23T22:57:06Z 2017-07-23T22:57:06Z NONE

Back at work and able to check things out more thoroughly on a machine with more RAM...

A good number of files to trigger the problem is 10.

As reported before, upgrading dask from 0.14.3 to 0.15.0 did not fix the problem. It seemed to speed up the handling of multi-file datasets generally, so my PC now crashes faster when it crashes.

Ryan, calling open_mfdataset with decode_cf=False does allow me to open and print the 10-file dataset, though this still seems to use an uncomfortably large amount of RAM: about 7 GiB in the Python kernel process, vs only a few hundred for the 25-file dataset.
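For context on what decode_cf=False leaves undone: the large variables in this file are stored as packed 16-bit shorts with scale_factor and add_offset attributes, and CF decoding would normally convert them to physical values. The following is a minimal sketch of that unpacking rule in plain Python, not xarray's own code; the attribute values are copied from the temp variable in the ncdump output in this comment, and the function name is made up for illustration.

```python
# CF packing convention: unpacked = packed * scale_factor + add_offset.
# Attribute values are copied from the `temp` variable in the ncdump header.
TEMP_SCALE_FACTOR = 0.0007629744
TEMP_ADD_OFFSET = 19.99962
VALID_RANGE = (-32766, 32767)  # temp:valid_range, stored as 16-bit shorts


def unpack(packed, scale_factor, add_offset):
    """Convert one packed integer to its physical value (hypothetical helper)."""
    return packed * scale_factor + add_offset


# Unpack both ends of the valid range to see the span the packing can express.
low, high = (unpack(p, TEMP_SCALE_FACTOR, TEMP_ADD_OFFSET) for p in VALID_RANGE)
print(f"temp spans roughly {low:.2f} to {high:.2f} degrees Celsius")
```

With decoding switched off, the arrays keep their 2-byte storage type, which is one plausible reason the undecoded dataset is cheaper to handle.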

Stephan, although I discovered this problem when dealing with a 25-file sequence, I boiled it down to a test case involving one file opened multiple times before reporting it here. There is a copy of the file (2.27 GiB) in a publicly accessible location here:

ftp://ftp.niwa.co.nz/incoming/hadfield/roms_avg_0001.nc

and here is the output of ncdump -h:

netcdf roms_avg_0001 { dimensions: xi_rho = 482 ; xi_u = 481 ; xi_v = 482 ; xi_psi = 481 ; eta_rho = 242 ; eta_u = 242 ; eta_v = 241 ; eta_psi = 241 ; s_rho = 20 ; s_w = 21 ; tracer = 2 ; boundary = 4 ; ocean_time = UNLIMITED ; // (100 currently) variables: int ntimes ; ntimes:long_name = "number of long time-steps" ; int ndtfast ; ndtfast:long_name = "number of short time-steps" ; double dt ; dt:long_name = "size of long time-steps" ; dt:units = "second" ; double dtfast ; dtfast:long_name = "size of short time-steps" ; dtfast:units = "second" ; double dstart ; dstart:long_name = "time stamp assigned to model initilization" ; dstart:units = "days since 1990-01-01 00:00:00" ; int nHIS ; nHIS:long_name = "number of time-steps between history records" ; int ndefHIS ; ndefHIS:long_name = "number of time-steps between the creation of history files" ; int nRST ; nRST:long_name = "number of time-steps between restart records" ; int ntsAVG ; ntsAVG:long_name = "starting time-step for accumulation of time-averaged fields" ; int nAVG ; nAVG:long_name = "number of time-steps between time-averaged records" ; int ndefAVG ; ndefAVG:long_name = "number of time-steps between the creation of average files" ; int nSTA ; nSTA:long_name = "number of time-steps between stations records" ; double Falpha ; Falpha:long_name = "Power-law shape barotropic filter parameter" ; double Fbeta ; Fbeta:long_name = "Power-law shape barotropic filter parameter" ; double Fgamma ; Fgamma:long_name = "Power-law shape barotropic filter parameter" ; double Akt_bak(tracer) ; Akt_bak:long_name = "background vertical mixing coefficient for tracers" ; Akt_bak:units = "meter2 second-1" ; double Akv_bak ; Akv_bak:long_name = "background vertical mixing coefficient for momentum" ; Akv_bak:units = "meter2 second-1" ; double Akk_bak ; Akk_bak:long_name = "background vertical mixing coefficient for turbulent energy" ; Akk_bak:units = "meter2 second-1" ; double Akp_bak ; Akp_bak:long_name = "background vertical 
mixing coefficient for length scale" ; Akp_bak:units = "meter2 second-1" ; double rdrg ; rdrg:long_name = "linear drag coefficient" ; rdrg:units = "meter second-1" ; double rdrg2 ; rdrg2:long_name = "quadratic drag coefficient" ; double Zob ; Zob:long_name = "bottom roughness" ; Zob:units = "meter" ; double Zos ; Zos:long_name = "surface roughness" ; Zos:units = "meter" ; double gls_p ; gls_p:long_name = "stability exponent" ; double gls_m ; gls_m:long_name = "turbulent kinetic energy exponent" ; double gls_n ; gls_n:long_name = "turbulent length scale exponent" ; double gls_cmu0 ; gls_cmu0:long_name = "stability coefficient" ; double gls_c1 ; gls_c1:long_name = "shear production coefficient" ; double gls_c2 ; gls_c2:long_name = "dissipation coefficient" ; double gls_c3m ; gls_c3m:long_name = "buoyancy production coefficient (minus)" ; double gls_c3p ; gls_c3p:long_name = "buoyancy production coefficient (plus)" ; double gls_sigk ; gls_sigk:long_name = "constant Schmidt number for TKE" ; double gls_sigp ; gls_sigp:long_name = "constant Schmidt number for PSI" ; double gls_Kmin ; gls_Kmin:long_name = "minimum value of specific turbulent kinetic energy" ; double gls_Pmin ; gls_Pmin:long_name = "minimum Value of dissipation" ; double Charnok_alpha ; Charnok_alpha:long_name = "Charnok factor for surface roughness" ; double Zos_hsig_alpha ; Zos_hsig_alpha:long_name = "wave amplitude factor for surface roughness" ; double sz_alpha ; sz_alpha:long_name = "surface flux from wave dissipation" ; double CrgBan_cw ; CrgBan_cw:long_name = "surface flux due to Craig and Banner wave breaking" ; double Znudg ; Znudg:long_name = "free-surface nudging/relaxation inverse time scale" ; Znudg:units = "day-1" ; double M2nudg ; M2nudg:long_name = "2D momentum nudging/relaxation inverse time scale" ; M2nudg:units = "day-1" ; double M3nudg ; M3nudg:long_name = "3D momentum nudging/relaxation inverse time scale" ; M3nudg:units = "day-1" ; double Tnudg(tracer) ; Tnudg:long_name = "Tracers 
nudging/relaxation inverse time scale" ; Tnudg:units = "day-1" ; double FSobc_in(boundary) ; FSobc_in:long_name = "free-surface inflow, nudging inverse time scale" ; FSobc_in:units = "second-1" ; double FSobc_out(boundary) ; FSobc_out:long_name = "free-surface outflow, nudging inverse time scale" ; FSobc_out:units = "second-1" ; double M2obc_in(boundary) ; M2obc_in:long_name = "2D momentum inflow, nudging inverse time scale" ; M2obc_in:units = "second-1" ; double M2obc_out(boundary) ; M2obc_out:long_name = "2D momentum outflow, nudging inverse time scale" ; M2obc_out:units = "second-1" ; double Tobc_in(boundary, tracer) ; Tobc_in:long_name = "tracers inflow, nudging inverse time scale" ; Tobc_in:units = "second-1" ; double Tobc_out(boundary, tracer) ; Tobc_out:long_name = "tracers outflow, nudging inverse time scale" ; Tobc_out:units = "second-1" ; double M3obc_in(boundary) ; M3obc_in:long_name = "3D momentum inflow, nudging inverse time scale" ; M3obc_in:units = "second-1" ; double M3obc_out(boundary) ; M3obc_out:long_name = "3D momentum outflow, nudging inverse time scale" ; M3obc_out:units = "second-1" ; double rho0 ; rho0:long_name = "mean density used in Boussinesq approximation" ; rho0:units = "kilogram meter-3" ; double gamma2 ; gamma2:long_name = "slipperiness parameter" ; int LuvSrc ; LuvSrc:long_name = "momentum point sources and sink activation switch" ; LuvSrc:flag_values = 0, 1 ; LuvSrc:flag_meanings = ".FALSE. .TRUE." ; int LwSrc ; LwSrc:long_name = "mass point sources and sink activation switch" ; LwSrc:flag_values = 0, 1 ; LwSrc:flag_meanings = ".FALSE. .TRUE." ; int LtracerSrc(tracer) ; LtracerSrc:long_name = "tracer point sources and sink activation switch" ; LtracerSrc:flag_values = 0, 1 ; LtracerSrc:flag_meanings = ".FALSE. .TRUE." ; int LsshCLM ; LsshCLM:long_name = "sea surface height climatology processing switch" ; LsshCLM:flag_values = 0, 1 ; LsshCLM:flag_meanings = ".FALSE. .TRUE." 
; int Lm2CLM ; Lm2CLM:long_name = "2D momentum climatology processing switch" ; Lm2CLM:flag_values = 0, 1 ; Lm2CLM:flag_meanings = ".FALSE. .TRUE." ; int Lm3CLM ; Lm3CLM:long_name = "3D momentum climatology processing switch" ; Lm3CLM:flag_values = 0, 1 ; Lm3CLM:flag_meanings = ".FALSE. .TRUE." ; int LtracerCLM(tracer) ; LtracerCLM:long_name = "tracer climatology processing switch" ; LtracerCLM:flag_values = 0, 1 ; LtracerCLM:flag_meanings = ".FALSE. .TRUE." ; int LnudgeM2CLM ; LnudgeM2CLM:long_name = "2D momentum climatology nudging activation switch" ; LnudgeM2CLM:flag_values = 0, 1 ; LnudgeM2CLM:flag_meanings = ".FALSE. .TRUE." ; int LnudgeM3CLM ; LnudgeM3CLM:long_name = "3D momentum climatology nudging activation switch" ; LnudgeM3CLM:flag_values = 0, 1 ; LnudgeM3CLM:flag_meanings = ".FALSE. .TRUE." ; int LnudgeTCLM(tracer) ; LnudgeTCLM:long_name = "tracer climatology nudging activation switch" ; LnudgeTCLM:flag_values = 0, 1 ; LnudgeTCLM:flag_meanings = ".FALSE. .TRUE." ; int spherical ; spherical:long_name = "grid type logical switch" ; spherical:flag_values = 0, 1 ; spherical:flag_meanings = "Cartesian spherical" ; double xl ; xl:long_name = "domain length in the XI-direction" ; xl:units = "meter" ; double el ; el:long_name = "domain length in the ETA-direction" ; el:units = "meter" ; int Vtransform ; Vtransform:long_name = "vertical terrain-following transformation equation" ; int Vstretching ; Vstretching:long_name = "vertical terrain-following stretching function" ; double theta_s ; theta_s:long_name = "S-coordinate surface control parameter" ; double theta_b ; theta_b:long_name = "S-coordinate bottom control parameter" ; double Tcline ; Tcline:long_name = "S-coordinate surface/bottom layer width" ; Tcline:units = "meter" ; double hc ; hc:long_name = "S-coordinate parameter, critical depth" ; hc:units = "meter" ; int grid ; grid:cf_role = "grid_topology" ; grid:topology_dimension = 2 ; grid:node_dimensions = "xi_psi eta_psi" ; grid:face_dimensions = 
"xi_rho: xi_psi (padding: both) eta_rho: eta_psi (padding: both)" ; grid:edge1_dimensions = "xi_u: xi_psi eta_u: eta_psi (padding: both)" ; grid:edge2_dimensions = "xi_v: xi_psi (padding: both) eta_v: eta_psi" ; grid:node_coordinates = "lon_psi lat_psi" ; grid:face_coordinates = "lon_rho lat_rho" ; grid:edge1_coordinates = "lon_u lat_u" ; grid:edge2_coordinates = "lon_v lat_v" ; grid:vertical_dimensions = "s_rho: s_w (padding: none)" ; double s_rho(s_rho) ; s_rho:long_name = "S-coordinate at RHO-points" ; s_rho:valid_min = -1. ; s_rho:valid_max = 0. ; s_rho:positive = "up" ; s_rho:standard_name = "ocean_s_coordinate_g2" ; s_rho:formula_terms = "s: s_rho C: Cs_r eta: zeta depth: h depth_c: hc" ; s_rho:field = "s_rho, scalar" ; double s_w(s_w) ; s_w:long_name = "S-coordinate at W-points" ; s_w:valid_min = -1. ; s_w:valid_max = 0. ; s_w:positive = "up" ; s_w:standard_name = "ocean_s_coordinate_g2" ; s_w:formula_terms = "s: s_w C: Cs_w eta: zeta depth: h depth_c: hc" ; s_w:field = "s_w, scalar" ; double Cs_r(s_rho) ; Cs_r:long_name = "S-coordinate stretching curves at RHO-points" ; Cs_r:valid_min = -1. ; Cs_r:valid_max = 0. ; Cs_r:field = "Cs_r, scalar" ; double Cs_w(s_w) ; Cs_w:long_name = "S-coordinate stretching curves at W-points" ; Cs_w:valid_min = -1. ; Cs_w:valid_max = 0. 
; Cs_w:field = "Cs_w, scalar" ; double h(eta_rho, xi_rho) ; h:long_name = "bathymetry at RHO-points" ; h:units = "meter" ; h:grid = "grid" ; h:location = "face" ; h:coordinates = "lon_rho lat_rho" ; h:field = "bath, scalar" ; double f(eta_rho, xi_rho) ; f:long_name = "Coriolis parameter at RHO-points" ; f:units = "second-1" ; f:grid = "grid" ; f:location = "face" ; f:coordinates = "lon_rho lat_rho" ; f:field = "coriolis, scalar" ; double pm(eta_rho, xi_rho) ; pm:long_name = "curvilinear coordinate metric in XI" ; pm:units = "meter-1" ; pm:grid = "grid" ; pm:location = "face" ; pm:coordinates = "lon_rho lat_rho" ; pm:field = "pm, scalar" ; double pn(eta_rho, xi_rho) ; pn:long_name = "curvilinear coordinate metric in ETA" ; pn:units = "meter-1" ; pn:grid = "grid" ; pn:location = "face" ; pn:coordinates = "lon_rho lat_rho" ; pn:field = "pn, scalar" ; double lon_rho(eta_rho, xi_rho) ; lon_rho:long_name = "longitude of RHO-points" ; lon_rho:units = "degree_east" ; lon_rho:standard_name = "longitude" ; lon_rho:field = "lon_rho, scalar" ; double lat_rho(eta_rho, xi_rho) ; lat_rho:long_name = "latitude of RHO-points" ; lat_rho:units = "degree_north" ; lat_rho:standard_name = "latitude" ; lat_rho:field = "lat_rho, scalar" ; double lon_u(eta_u, xi_u) ; lon_u:long_name = "longitude of U-points" ; lon_u:units = "degree_east" ; lon_u:standard_name = "longitude" ; lon_u:field = "lon_u, scalar" ; double lat_u(eta_u, xi_u) ; lat_u:long_name = "latitude of U-points" ; lat_u:units = "degree_north" ; lat_u:standard_name = "latitude" ; lat_u:field = "lat_u, scalar" ; double lon_v(eta_v, xi_v) ; lon_v:long_name = "longitude of V-points" ; lon_v:units = "degree_east" ; lon_v:standard_name = "longitude" ; lon_v:field = "lon_v, scalar" ; double lat_v(eta_v, xi_v) ; lat_v:long_name = "latitude of V-points" ; lat_v:units = "degree_north" ; lat_v:standard_name = "latitude" ; lat_v:field = "lat_v, scalar" ; double lon_psi(eta_psi, xi_psi) ; lon_psi:long_name = "longitude of PSI-points" ; 
lon_psi:units = "degree_east" ; lon_psi:standard_name = "longitude" ; lon_psi:field = "lon_psi, scalar" ; double lat_psi(eta_psi, xi_psi) ; lat_psi:long_name = "latitude of PSI-points" ; lat_psi:units = "degree_north" ; lat_psi:standard_name = "latitude" ; lat_psi:field = "lat_psi, scalar" ; double angle(eta_rho, xi_rho) ; angle:long_name = "angle between XI-axis and EAST" ; angle:units = "radians" ; angle:grid = "grid" ; angle:location = "face" ; angle:coordinates = "lon_rho lat_rho" ; angle:field = "angle, scalar" ; double mask_rho(eta_rho, xi_rho) ; mask_rho:long_name = "mask on RHO-points" ; mask_rho:flag_values = 0., 1. ; mask_rho:flag_meanings = "land water" ; mask_rho:grid = "grid" ; mask_rho:location = "face" ; mask_rho:coordinates = "lon_rho lat_rho" ; double mask_u(eta_u, xi_u) ; mask_u:long_name = "mask on U-points" ; mask_u:flag_values = 0., 1. ; mask_u:flag_meanings = "land water" ; mask_u:grid = "grid" ; mask_u:location = "edge1" ; mask_u:coordinates = "lon_u lat_u" ; double mask_v(eta_v, xi_v) ; mask_v:long_name = "mask on V-points" ; mask_v:flag_values = 0., 1. ; mask_v:flag_meanings = "land water" ; mask_v:grid = "grid" ; mask_v:location = "edge2" ; mask_v:coordinates = "lon_v lat_v" ; double mask_psi(eta_psi, xi_psi) ; mask_psi:long_name = "mask on psi-points" ; mask_psi:flag_values = 0., 1. 
; mask_psi:flag_meanings = "land water" ; mask_psi:grid = "grid" ; mask_psi:location = "node" ; mask_psi:coordinates = "lon_psi lat_psi" ; double ocean_time(ocean_time) ; ocean_time:long_name = "averaged time since initialization" ; ocean_time:units = "seconds since 1990-01-01 00:00:00" ; ocean_time:calendar = "gregorian" ; ocean_time:field = "time, scalar, series" ; short zeta(ocean_time, eta_rho, xi_rho) ; zeta:long_name = "time-averaged free-surface" ; zeta:units = "meter" ; zeta:time = "ocean_time" ; zeta:grid = "grid" ; zeta:location = "face" ; zeta:coordinates = "lon_rho lat_rho ocean_time" ; zeta:field = "free-surface, scalar, series" ; zeta:add_offset = -0.0001525949f ; zeta:scale_factor = 0.0003051898f ; zeta:valid_range = -32766s, 32767s ; short ubar(ocean_time, eta_u, xi_u) ; ubar:long_name = "time-averaged vertically integrated u-momentum component" ; ubar:units = "meter second-1" ; ubar:time = "ocean_time" ; ubar:grid = "grid" ; ubar:location = "edge1" ; ubar:coordinates = "lon_u lat_u ocean_time" ; ubar:field = "ubar-velocity, scalar, series" ; ubar:add_offset = -0.0001525949f ; ubar:scale_factor = 0.0003051898f ; ubar:valid_range = -32766s, 32767s ; short vbar(ocean_time, eta_v, xi_v) ; vbar:long_name = "time-averaged vertically integrated v-momentum component" ; vbar:units = "meter second-1" ; vbar:time = "ocean_time" ; vbar:grid = "grid" ; vbar:location = "edge2" ; vbar:coordinates = "lon_v lat_v ocean_time" ; vbar:field = "vbar-velocity, scalar, series" ; vbar:add_offset = -0.0001525949f ; vbar:scale_factor = 0.0003051898f ; vbar:valid_range = -32766s, 32767s ; short u(ocean_time, s_rho, eta_u, xi_u) ; u:long_name = "time-averaged u-momentum component" ; u:units = "meter second-1" ; u:time = "ocean_time" ; u:grid = "grid" ; u:location = "edge1" ; u:coordinates = "lon_u lat_u s_rho ocean_time" ; u:field = "u-velocity, scalar, series" ; u:add_offset = -0.0001525949f ; u:scale_factor = 0.0003051898f ; u:valid_range = -32766s, 32767s ; short 
v(ocean_time, s_rho, eta_v, xi_v) ; v:long_name = "time-averaged v-momentum component" ; v:units = "meter second-1" ; v:time = "ocean_time" ; v:grid = "grid" ; v:location = "edge2" ; v:coordinates = "lon_v lat_v s_rho ocean_time" ; v:field = "v-velocity, scalar, series" ; v:add_offset = -0.0001525949f ; v:scale_factor = 0.0003051898f ; v:valid_range = -32766s, 32767s ; short w(ocean_time, s_w, eta_rho, xi_rho) ; w:long_name = "time-averaged vertical momentum component" ; w:units = "meter second-1" ; w:time = "ocean_time" ; w:standard_name = "upward_sea_water_velocity" ; w:grid = "grid" ; w:location = "face" ; w:coordinates = "lon_rho lat_rho s_w ocean_time" ; w:field = "w-velocity, scalar, series" ; w:add_offset = -1.525949e-05f ; w:scale_factor = 3.051898e-05f ; w:valid_range = -32766s, 32767s ; short temp(ocean_time, s_rho, eta_rho, xi_rho) ; temp:long_name = "time-averaged potential temperature" ; temp:units = "Celsius" ; temp:time = "ocean_time" ; temp:grid = "grid" ; temp:location = "face" ; temp:coordinates = "lon_rho lat_rho s_rho ocean_time" ; temp:field = "temperature, scalar, series" ; temp:add_offset = 19.99962f ; temp:scale_factor = 0.0007629744f ; temp:valid_range = -32766s, 32767s ; short salt(ocean_time, s_rho, eta_rho, xi_rho) ; salt:long_name = "time-averaged salinity" ; salt:time = "ocean_time" ; salt:grid = "grid" ; salt:location = "face" ; salt:coordinates = "lon_rho lat_rho s_rho ocean_time" ; salt:field = "salinity, scalar, series" ; salt:add_offset = 19.99969f ; salt:scale_factor = 0.0006103795f ; salt:valid_range = -32766s, 32767s ;

// global attributes: :file = "roms_avg_0001.nc" ; :format = "netCDF-3 64bit offset file" ; :Conventions = "CF-1.4, SGRID-0.3" ; :type = "ROMS/TOMS nonlinear model averages file" ; :title = "ROMS - Cook Strait" ; :var_info = "varinfo.dat" ; :rst_file = "roms_rst_0001.nc" ; :avg_file = "roms_avg_0001.nc" ; :sta_file = "roms_sta_0001.nc" ; :grd_file = "../../grd/roms_grd.nc" ; :ini_file = "roms_rst_0000.nc" ; :frc_file_01 = "../../frc/yearly/roms_frc_stress_wrfnz_1.20_2009.nc, ../../frc/yearly/roms_frc_stress_wrfnz_1.20_2010.nc, ../../frc/yearly/roms_frc_stress_wrfnz_1.20_2011.nc, ../../frc/yearly/roms_frc_stress_wrfnz_1.20_2012.nc, ../../frc/yearly/roms_frc_stress_wrfnz_1.20_2013.nc" ; :frc_file_02 = "../../frc/yearly/roms_frc_sst_oisst_2009.nc, ../../frc/yearly/roms_frc_sst_oisst_2010.nc, ../../frc/yearly/roms_frc_sst_oisst_2011.nc, ../../frc/yearly/roms_frc_sst_oisst_2012.nc, ../../frc/yearly/roms_frc_sst_oisst_2013.nc" ; :frc_file_03 = "../../frc/yearly/roms_frc_shflux_ncep_2009.nc, ../../frc/yearly/roms_frc_shflux_ncep_2010.nc, ../../frc/yearly/roms_frc_shflux_ncep_2011.nc, ../../frc/yearly/roms_frc_shflux_ncep_2012.nc, ../../frc/yearly/roms_frc_shflux_ncep_2013.nc" ; :frc_file_04 = "../../frc/yearly/roms_frc_swflux_ncep_2009.nc, ../../frc/yearly/roms_frc_swflux_ncep_2010.nc, ../../frc/yearly/roms_frc_swflux_ncep_2011.nc, ../../frc/yearly/roms_frc_swflux_ncep_2012.nc, ../../frc/yearly/roms_frc_swflux_ncep_2013.nc" ; :frc_file_05 = "../../frc/yearly/roms_frc_swrad_ncep_2009.nc, ../../frc/yearly/roms_frc_swrad_ncep_2010.nc, ../../frc/yearly/roms_frc_swrad_ncep_2011.nc, ../../frc/yearly/roms_frc_swrad_ncep_2012.nc, ../../frc/yearly/roms_frc_swrad_ncep_2013.nc" ; :frc_file_06 = "../../frc/fixed/roms_frc_tide.nc" ; :bry_file = "../../clm/bran-yearly/roms_bry_2009.nc, ../../clm/bran-yearly/roms_bry_2010.nc, ../../clm/bran-yearly/roms_bry_2011.nc, ../../clm/bran-yearly/roms_bry_2012.nc" ; :clm_file = "../../clm/bran-yearly/roms_clm_2009.nc, 
../../clm/bran-yearly/roms_clm_2010.nc, ../../clm/bran-yearly/roms_clm_2011.nc, ../../clm/bran-yearly/roms_clm_2012.nc" ; :nud_file = "../../nud/a/roms_nud.nc" ; :script_file = "" ; :spos_file = "roms_sta.in" ; :NLM_LBC = "\n", "EDGE: WEST SOUTH EAST NORTH \n", "zeta: Che Che Che Che \n", "ubar: Shc Shc Shc Shc \n", "vbar: Shc Shc Shc Shc \n", "u: RadNud RadNud RadNud RadNud \n", "v: RadNud RadNud RadNud RadNud \n", "temp: RadNud RadNud RadNud RadNud \n", "salt: RadNud RadNud RadNud RadNud \n", "tke: Gra Gra Gra Gra" ; :svn_url = "https:://myroms.org/svn/src" ; :svn_rev = "Unversioned directory" ; :code_dir = "/gpfs_hpcf/filesets/hpcf/scratch/hadfield/roms_bld_AIX-00CD7D244C00_b52f5e7e37f4965dbbc3c64d675da121" ; :header_dir = "/gpfs_hpcf/filesets/hpcf/scratch/hadfield/roms_bld_AIX-00CD7D244C00_b52f5e7e37f4965dbbc3c64d675da121/ROMS/Include" ; :header_file = "greater_cook.h" ; :os = "AIX" ; :cpu = "powerpc" ; :compiler_system = "xlf" ; :compiler_command = "/usr/bin/mpxlf95_r" ; :compiler_flags = "-qsuffix=f=f90 -qmaxmem=-1 -qarch=pwr6 -qnoextname -q64 -O3 -qstrict -qfree=f90 -qfree=f90" ; :tiling = "008x008" ; :history = "2017-03-28 02:02:44: packed with rncpack6\n", "ROMS/TOMS, Version 3.7, Monday - March 27, 2017 - 9:18:28 PM" ; :ana_file = "ROMS/Functionals/ana_btflux.h, ROMS/Functionals/ana_srflux.h, ROMS/Functionals/ana_dqdsst.h" ; :CPP_options = "GREATER_COOK, ADD_FSOBC, ADD_M2OBC, ANA_BSFLUX, ANA_BTFLUX, ANA_DQDSST, ASSUMED_SHAPE, AVERAGES, CURVGRID, DIURNAL_SRFLUX, DJ_GRADPS, DOUBLE_PRECISION, GLS_MIXING, KANTHA_CLAYSON, MASKING, MPI, NONLINEAR, NONLIN_EOS, NO_LBC_ATT, N2S2_HORAVG, POWER_LAW, PROFILE, QCORRECTION, K_GSCHEME, RADIATION_2D, RAMP_TIDES, !RST_SINGLE, SALINITY, SOLAR_SOURCE, SOLVE3D, SSH_TIDES, STATIONS, TS_U3HADVECTION, TS_SVADVECTION, UV_ADV, UV_COR, UV_U3HADVECTION, UV_C4VADVECTION, UV_LOGDRAG, UV_TIDES, VAR_RHO_2D" ; }
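To put the header above in perspective, here is a rough size estimate for the eight time-dependent variables, which dominate the file. It is a back-of-the-envelope sketch: the dimension lengths and variable shapes are copied from the ncdump output, while the float32 and float64 rows are assumptions about what a decoded copy might occupy if it were fully loaded. The thread does not say which dtype xarray 0.9.6 actually produced.

```python
from math import prod

# Dimension lengths copied from the ncdump header above.
DIMS = {
    "ocean_time": 100, "s_rho": 20, "s_w": 21,
    "eta_rho": 242, "xi_rho": 482,
    "eta_u": 242, "xi_u": 481,
    "eta_v": 241, "xi_v": 482,
}

# The eight time-dependent variables, all stored as 2-byte shorts.
PACKED_VARS = {
    "zeta": ("ocean_time", "eta_rho", "xi_rho"),
    "ubar": ("ocean_time", "eta_u", "xi_u"),
    "vbar": ("ocean_time", "eta_v", "xi_v"),
    "u": ("ocean_time", "s_rho", "eta_u", "xi_u"),
    "v": ("ocean_time", "s_rho", "eta_v", "xi_v"),
    "w": ("ocean_time", "s_w", "eta_rho", "xi_rho"),
    "temp": ("ocean_time", "s_rho", "eta_rho", "xi_rho"),
    "salt": ("ocean_time", "s_rho", "eta_rho", "xi_rho"),
}

GIB = 2**30


def n_elements(dims):
    """Number of values in a variable with the given dimension names."""
    return prod(DIMS[d] for d in dims)


total_elements = sum(n_elements(dims) for dims in PACKED_VARS.values())

# Compare on-disk packing with two hypothetical in-memory dtypes.
for label, itemsize in [("packed int16", 2), ("float32", 4), ("float64", 8)]:
    print(f"{label:>12}: {total_elements * itemsize / GIB:6.2f} GiB per file")
```

The packed total comes to about 2.26 GiB, consistent with the 2.27 GiB file size quoted above. A fully decoded copy would be two to four times larger per file, so a 10-file dataset would exceed typical desktop RAM if anything forced the data to load in full.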

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Excessive memory usage when printing multi-file Dataset 243927150
317050127 https://github.com/pydata/xarray/issues/1481#issuecomment-317050127 https://api.github.com/repos/pydata/xarray/issues/1481 MDEyOklzc3VlQ29tbWVudDMxNzA1MDEyNw== shoyer 1217238 2017-07-21T16:40:19Z 2017-07-21T16:40:19Z MEMBER

Our formatting logic pulls out the first few values of arrays to print them in the repr. It appears that this is failing spectacularly in this case, though I'm not sure why.

Can you share a quick preview of what a single one of your constituent netCDF files looks like?

More broadly: maybe we should disable automatically printing a preview of the contents of xarray.Dataset objects when they have lazily loaded data in the form of dask arrays. This is convenient for interactive use in many cases (when it can be done cheaply!) but fails in many edge cases.
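A toy illustration of why a "first few values" preview can be cheap or expensive depending on the order of operations. This is not xarray's formatting code, and nothing in this thread establishes that this is the cause of the problem; the class and function names are hypothetical stand-ins for lazily loaded data.

```python
class CountingLazyArray:
    """Hypothetical stand-in for lazily loaded data; counts elements read."""

    def __init__(self, size):
        self.size = size
        self.elements_read = 0

    def load(self, start=0, stop=None):
        """Pretend to read elements [start, stop) from disk."""
        stop = self.size if stop is None else min(stop, self.size)
        self.elements_read += stop - start
        return list(range(start, stop))


def preview_load_then_slice(array, n=3):
    # Expensive: materialises every element, then keeps the first n.
    return array.load()[:n]


def preview_slice_then_load(array, n=3):
    # Cheap: asks the source for the first n elements only.
    return array.load(0, n)


naive = CountingLazyArray(1_000_000)
careful = CountingLazyArray(1_000_000)

# Both previews show the same values; only the amount of data read differs.
assert preview_load_then_slice(naive) == preview_slice_then_load(careful)
print(naive.elements_read, careful.elements_read)
```

Both previews are identical, but the first reads a million elements and the second reads three. A preview is only "done cheaply" when the slice reaches the data source before any load happens.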

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Excessive memory usage when printing multi-file Dataset 243927150
316989732 https://github.com/pydata/xarray/issues/1481#issuecomment-316989732 https://api.github.com/repos/pydata/xarray/issues/1481 MDEyOklzc3VlQ29tbWVudDMxNjk4OTczMg== rabernat 1197350 2017-07-21T12:37:11Z 2017-07-21T12:37:21Z MEMBER

Can you try calling open_mfdataset with the decode_cf=False option?
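A sketch of what that call might look like for the test case described elsewhere in this thread (one file listed several times). The helper names, the local file path, and the choice of ocean_time as the concatenation dimension are assumptions for illustration; adjust them to your own script.

```python
def repeated_paths(path, count):
    """Build the thread's test case: one file listed `count` times."""
    return [path] * count


def open_undecoded(paths, concat_dim="ocean_time"):
    """Open several files as one dataset with CF decoding switched off."""
    import xarray as xr  # imported lazily so the helper above works without it

    # combine="nested" is needed alongside concat_dim in recent xarray
    # releases; the 0.9.x series used in this thread predates that argument,
    # so drop it there.
    return xr.open_mfdataset(
        paths, combine="nested", concat_dim=concat_dim, decode_cf=False
    )


if __name__ == "__main__":
    paths = repeated_paths("roms_avg_0001.nc", 10)  # local copy assumed
    print(len(paths), "paths")
    # ds = open_undecoded(paths)  # uncomment when the file is available
    # print(ds)
```

With decode_cf=False, variables keep their stored dtypes and attributes such as scale_factor and add_offset are left unapplied, which helps isolate whether the decoding step is involved in the memory growth.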

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Excessive memory usage when printing multi-file Dataset 243927150
316949986 https://github.com/pydata/xarray/issues/1481#issuecomment-316949986 https://api.github.com/repos/pydata/xarray/issues/1481 MDEyOklzc3VlQ29tbWVudDMxNjk0OTk4Ng== hadfieldnz 29717790 2017-07-21T09:14:34Z 2017-07-21T09:14:34Z NONE

I ran "conda update dask", which upgraded me from 0.14.3 to 0.15.0.

Short report: No, this has not eliminated the problem.

Long report: Today (Friday) I am on my home machine, which has only 6 GiB RAM. I confirmed earlier today with dask 0.14.3 that I can open and print the dataset with 25 files, whereas with 10 files IPython halts with a memory error reporting that 85% of the memory is being used. After the upgrade to 0.15.0, running the test script with 10 files exhausted all the RAM on my machine and locked it up within a few seconds. I will not be able to investigate this further until I get back on my work machine on Monday.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Excessive memory usage when printing multi-file Dataset 243927150
316937056 https://github.com/pydata/xarray/issues/1481#issuecomment-316937056 https://api.github.com/repos/pydata/xarray/issues/1481 MDEyOklzc3VlQ29tbWVudDMxNjkzNzA1Ng== fmaussion 10050469 2017-07-21T08:18:52Z 2017-07-21T08:18:52Z MEMBER

0.14.3 pre-dates the fix https://github.com/dask/dask/pull/2364 mentioned above: can you try to update dask?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Excessive memory usage when printing multi-file Dataset 243927150
316932109 https://github.com/pydata/xarray/issues/1481#issuecomment-316932109 https://api.github.com/repos/pydata/xarray/issues/1481 MDEyOklzc3VlQ29tbWVudDMxNjkzMjEwOQ== hadfieldnz 29717790 2017-07-21T07:55:50Z 2017-07-21T07:55:50Z NONE

xarray 0.9.6, dask 0.14.3

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Excessive memory usage when printing multi-file Dataset 243927150
316806744 https://github.com/pydata/xarray/issues/1481#issuecomment-316806744 https://api.github.com/repos/pydata/xarray/issues/1481 MDEyOklzc3VlQ29tbWVudDMxNjgwNjc0NA== rabernat 1197350 2017-07-20T19:32:18Z 2017-07-20T19:32:18Z MEMBER

Hi @hadfieldnz -- I believe this issue could be related to #1396, which was fixed in dask/dask#2364.

Could you let us know what versions of xarray and dask you are using?

```python
import xarray
import dask

# Report the installed versions of both libraries
print(xarray.__version__)
print(dask.__version__)
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Excessive memory usage when printing multi-file Dataset 243927150

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);