Comment on pydata/xarray#6733: https://github.com/pydata/xarray/issues/6733#issuecomment-1168483200
Posted 2022-06-28T09:37:59Z by user 9569132 (author association: NONE)

I've also tried pre-converting the float32 array to uint16:

```python
if pack:
    out_data = np.round(base_grid * scale_factor, 0)
    out_data[np.isnan(out_data)] = 65535
    out_data = out_data.astype('uint16')
else:
    out_data = base_grid
```
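As an aside, the `np.round(...)` call above returns a new float32 array of the same size as `base_grid`, so the packing step itself briefly holds an extra full-size temporary. A minimal sketch of the same packing with the rounding done in place via `np.round`'s `out=` argument (using a tiny stand-in array, since the real `base_grid` is ~35 GB; the `scale_factor` value here is an illustrative assumption):

```python
import numpy as np

# Tiny stand-in for base_grid; the real array is ~35 GB of float32.
base_grid = np.array([[1.23, np.nan], [4.56, 7.89]], dtype='float32')
scale_factor = 100.0  # assumed value for illustration

out_data = base_grid * scale_factor   # one float32 temporary
np.round(out_data, 0, out=out_data)   # round in place, no second temporary
out_data[np.isnan(out_data)] = 65535  # sentinel for missing values
out_data = out_data.astype('uint16')  # uint16 copy is 4x smaller
```

This only shaves one intermediate allocation; it does not change the behaviour inside `to_netcdf` itself.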

I expect that to add an extra 17 GB, for a total of around 53 GB, but exporting to netCDF still shows unexpectedly variable peak memory use:

```bash
$ grep peak conversion_*
conversion_10.out: Used : 133 (peak) 0.53 (ave)
conversion_11.out: Used : 117 (peak) 0.73 (ave)
conversion_12.out: Used : 92 (peak) 0.93 (ave)
conversion_13.out: Used : 103 (peak) 0.75 (ave)
conversion_14.out: Used : 79 (peak) 0.64 (ave)
conversion_15.out: Used : 94 (peak) 0.66 (ave)
conversion_16.out: Used : 92 (peak) 0.95 (ave)
conversion_17.out: Used : 129 (peak) 0.66 (ave)
conversion_18.out: Used : 92 (peak) 0.91 (ave)
conversion_19.out: Used : 105 (peak) 0.67 (ave)
conversion_1.out: Used : 77 (peak) 0.94 (ave)
conversion_20.out: Used : 87 (peak) 0.65 (ave)
conversion_21.out: Used : 93 (peak) 0.63 (ave)
conversion_2.out: Used : 92 (peak) 0.95 (ave)
conversion_3.out: Used : 92 (peak) 0.94 (ave)
conversion_4.out: Used : 92 (peak) 0.93 (ave)
conversion_5.out: Used : 121 (peak) 0.47 (ave)
conversion_6.out: Used : 92 (peak) 0.94 (ave)
conversion_7.out: Used : 92 (peak) 0.96 (ave)
conversion_8.out: Used : 92 (peak) 0.93 (ave)
conversion_9.out: Used : 129 (peak) 0.47 (ave)
```

One thing I do see in the script's reporting for some of the failing files is this exception: the to_netcdf process appears to be allocating another ~35 GiB float32 array.

```python
Data loaded; Memory usage: 35.70772171020508
Conversion complete; Memory usage: 53.329856872558594
Array created; Memory usage: 53.329856872558594
Traceback (most recent call last):
  File "/rds/general/project/lemontree/live/source/SNU_Ryu_FPAR_LAI/convert_SNU_Ryu_to_netcdf.py", line 162, in <module>
    xds.to_netcdf(out_file, encoding=encoding)
  File "/rds/general/user/dorme/home/anaconda3/envs/python3.10/lib/python3.10/site-packages/xarray/core/dataarray.py", line 2839, in to_netcdf
    return dataset.to_netcdf(*args, **kwargs)
  File "/rds/general/user/dorme/home/anaconda3/envs/python3.10/lib/python3.10/site-packages/xarray/core/dataset.py", line 1902, in to_netcdf
    return to_netcdf(
  File "/rds/general/user/dorme/home/anaconda3/envs/python3.10/lib/python3.10/site-packages/xarray/backends/api.py", line 1072, in to_netcdf
    dump_to_store(
  File "/rds/general/user/dorme/home/anaconda3/envs/python3.10/lib/python3.10/site-packages/xarray/backends/api.py", line 1119, in dump_to_store
    store.store(variables, attrs, check_encoding, writer, unlimited_dims=unlimited_dims)
  File "/rds/general/user/dorme/home/anaconda3/envs/python3.10/lib/python3.10/site-packages/xarray/backends/common.py", line 261, in store
    variables, attributes = self.encode(variables, attributes)
  File "/rds/general/user/dorme/home/anaconda3/envs/python3.10/lib/python3.10/site-packages/xarray/backends/common.py", line 350, in encode
    variables, attributes = cf_encoder(variables, attributes)
  File "/rds/general/user/dorme/home/anaconda3/envs/python3.10/lib/python3.10/site-packages/xarray/conventions.py", line 855, in cf_encoder
    new_vars = {k: encode_cf_variable(v, name=k) for k, v in variables.items()}
  File "/rds/general/user/dorme/home/anaconda3/envs/python3.10/lib/python3.10/site-packages/xarray/conventions.py", line 855, in <dictcomp>
    new_vars = {k: encode_cf_variable(v, name=k) for k, v in variables.items()}
  File "/rds/general/user/dorme/home/anaconda3/envs/python3.10/lib/python3.10/site-packages/xarray/conventions.py", line 269, in encode_cf_variable
    var = coder.encode(var, name=name)
  File "/rds/general/user/dorme/home/anaconda3/envs/python3.10/lib/python3.10/site-packages/xarray/coding/variables.py", line 168, in encode
    data = duck_array_ops.fillna(data, fill_value)
  File "/rds/general/user/dorme/home/anaconda3/envs/python3.10/lib/python3.10/site-packages/xarray/core/duck_array_ops.py", line 298, in fillna
    return where(notnull(data), data, other)
  File "/rds/general/user/dorme/home/anaconda3/envs/python3.10/lib/python3.10/site-packages/xarray/core/duck_array_ops.py", line 285, in where
    return _where(condition, *as_shared_dtype([x, y]))
  File "<__array_function__ internals>", line 180, in where
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 35.2 GiB for an array with shape (365, 3600, 7200) and data type float32
```
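The last frames of the traceback explain the extra allocation: xarray's `fillna` is implemented as `where(notnull(data), data, other)`, and `np.where` always returns a brand-new array of the broadcast shape rather than filling in place. So writing a ~35 GiB variable whose encoding triggers fill-value handling transiently needs a second ~35 GiB allocation. A small sketch of that behaviour, with the shape scaled down from `(365, 3600, 7200)`:

```python
import numpy as np

# Scaled-down stand-in for the (365, 3600, 7200) float32 variable.
data = np.full((4, 36, 72), np.nan, dtype='float32')
data[0] = 1.0

# Mirrors xarray's fillna path: where(notnull(data), data, fill_value).
# np.where allocates a fresh output array; it never modifies `data` in place.
filled = np.where(~np.isnan(data), data, np.float32(65535.0))

print(np.shares_memory(filled, data))  # independent copy -> peak memory doubles
```

This is only a sketch of why the second allocation appears, not a claim about what the right fix is.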
