issue_comments


6 rows where issue = 1506437087 and user = 720460 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1363988341 https://github.com/pydata/xarray/issues/7397#issuecomment-1363988341 https://api.github.com/repos/pydata/xarray/issues/7397 IC_kwDOAMm_X85RTM91 benoitespinola 720460 2022-12-23T14:15:25Z 2022-12-23T14:15:53Z NONE

Because I want worry-free holidays, I wrote a bit of code that basically creates a new NetCDF file from scratch. I load the data with xarray, convert it to NumPy arrays, and use the netCDF4 library to write the files (this does what I want).

In the process, I also slice the data and drop unwanted variables to keep just the bits I want (unlike my original post).
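
A minimal sketch of that kind of workaround, assuming hypothetical variable names and an arbitrary slice (the original script is not shown in the thread):

```python
import numpy as np
import xarray as xr
from netCDF4 import Dataset

# Open the source files lazily with xarray.
ds = xr.open_mfdataset('./data/data_*.nc')

# Hypothetical subset: keep only the variables and slice of interest.
subset = ds[['toce', 'soce']].isel(deptht=slice(0, 50))

# Write a fresh file with the plain netCDF4 library, one variable at a
# time, so only a single variable is materialized in memory at once.
with Dataset('out.nc', 'w') as out:
    for dim, size in subset.sizes.items():
        out.createDimension(dim, size)
    for name, var in subset.data_vars.items():
        v = out.createVariable(name, var.dtype, var.dims)
        v[:] = np.asarray(var.values)
```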

If I call .load() or .compute() on my xarray variable, memory usage goes through the roof (even though I am dropping unwanted variables, which I would expect to release memory). The same happens with slicing followed by .compute().
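
The pattern being described looks roughly like this sketch (variable names taken from the ncdump header quoted later in the thread; the slice itself is an arbitrary assumption):

```python
import xarray as xr

ds = xr.open_mfdataset('./data/data_*.nc')

# Drop unwanted variables and slice first...
trimmed = ds.drop_vars(['taum', 'wspd']).isel(time_counter=slice(0, 10))

# ...then force evaluation. The report above is that memory still spikes
# here, even though the dropped variables should no longer be needed.
trimmed = trimmed.compute()
```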

Unfortunately, the MCVE will have to wait until I am back from my holidays.

Happy holidays to all!

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 1,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Memory issue merging NetCDF files using xarray.open_mfdataset and to_netcdf 1506437087
1362583979 https://github.com/pydata/xarray/issues/7397#issuecomment-1362583979 https://api.github.com/repos/pydata/xarray/issues/7397 IC_kwDOAMm_X85RN2Gr benoitespinola 720460 2022-12-22T09:04:17Z 2022-12-22T09:04:17Z NONE

By the way, prior to writing this ticket, I also tried the following (which did not help): dropping the variables I do not care about, keeping only the dimensions plus toce and soce. I would expect to need less memory after that.
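
That step presumably looked something like this sketch (the exact call is not shown in the thread; the keep-set is inferred from the description):

```python
import xarray as xr

ds = xr.open_mfdataset('./data/data_*.nc')

# Keep only toce and soce; drop every other data variable.
keep = {'toce', 'soce'}
ds = ds.drop_vars([v for v in ds.data_vars if v not in keep])
```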

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Memory issue merging NetCDF files using xarray.open_mfdataset and to_netcdf 1506437087
1362564754 https://github.com/pydata/xarray/issues/7397#issuecomment-1362564754 https://api.github.com/repos/pydata/xarray/issues/7397 IC_kwDOAMm_X85RNxaS benoitespinola 720460 2022-12-22T08:44:06Z 2022-12-22T08:44:06Z NONE

Answering the question 'Did you do some processing with the data, changing attributes/encoding etc?': No processing. I ask xarray to load the data (I also tried loading + computing) and the final outcome is the same.

I am now trying to put together an MCVE with dummy data.
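
Such an MCVE might look like the following sketch (dimension sizes shrunk from the real files; all names and shapes are assumptions based on the ncdump header quoted below):

```python
import numpy as np
import xarray as xr

# Build five small dummy files shaped like the real data
# (the real files use x=754, y=277, deptht=200, 28 time steps each).
for i in range(5):
    ds = xr.Dataset(
        data_vars={
            'toce': (('time_counter', 'deptht', 'y', 'x'),
                     np.random.rand(28, 20, 27, 75).astype('float32')),
            'soce': (('time_counter', 'deptht', 'y', 'x'),
                     np.random.rand(28, 20, 27, 75).astype('float32')),
        },
        coords={'time_counter': np.arange(i * 28, (i + 1) * 28)},
    )
    ds.to_netcdf(f'./data/data_{i + 1}.nc')

# Reproduce the reported pipeline on the dummy files.
merged = xr.open_mfdataset('./data/data_*.nc')
merged.to_netcdf('./data/merged.nc')
```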

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Memory issue merging NetCDF files using xarray.open_mfdataset and to_netcdf 1506437087
1362562275 https://github.com/pydata/xarray/issues/7397#issuecomment-1362562275 https://api.github.com/repos/pydata/xarray/issues/7397 IC_kwDOAMm_X85RNwzj benoitespinola 720460 2022-12-22T08:41:21Z 2022-12-22T08:41:21Z NONE

Just tested with to_zarr and it goes through:

```
State: COMPLETED (exit code 0)
Nodes: 1
Cores per node: 2
CPU Utilized: 00:07:55
CPU Efficiency: 63.00% of 00:12:34 core-walltime
Job Wall-clock time: 00:06:17
Memory Utilized: 164.89 GB
Memory Efficiency: 44.56% of 370.00 GB
```

I did an extra run with a memory profiler:

```
import xarray as xr
import zarr
from memory_profiler import profile

@profile
def main():
    path = './data/data_*.nc'  # files are: data_1.nc data_2.nc data_3.nc data_4.nc data_5.nc
    data = xr.open_mfdataset(path)

    data = data.load()
    data = data.compute()

    data.to_zarr()

if __name__ == '__main__':
    main()
```

The profiled code also completed successfully:

```
State: COMPLETED (exit code 0)
Nodes: 1
Cores per node: 2
CPU Utilized: 00:07:52
CPU Efficiency: 63.61% of 00:12:22 core-walltime
Job Wall-clock time: 00:06:11
Memory Utilized: 165.53 GB
Memory Efficiency: 44.74% of 370.00 GB
```

Here is the outcome of the memory profiling:

```
Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
     5    156.9 MiB    156.9 MiB           1   @profile
     6                                         def main():
     7    156.9 MiB      0.0 MiB           1       path = './data/data_*.nc'  # files are: data_1.nc data_2.nc data_3.nc data_4.nc data_5.nc
     8    209.3 MiB     52.4 MiB           1       data = xr.open_mfdataset(path)
     9
    10  82150.1 MiB  81940.8 MiB           1       data = data.load()
    11  82101.2 MiB    -49.0 MiB           1       data = data.compute()
    12
    13  90091.2 MiB   7990.0 MiB           1       data.to_zarr()
```

PS: in this test I just realized I loaded 8 files instead of 5.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Memory issue merging NetCDF files using xarray.open_mfdataset and to_netcdf 1506437087
1362544813 https://github.com/pydata/xarray/issues/7397#issuecomment-1362544813 https://api.github.com/repos/pydata/xarray/issues/7397 IC_kwDOAMm_X85RNsit benoitespinola 720460 2022-12-22T08:21:31Z 2022-12-22T08:21:31Z NONE

A single file (from ncdump -h):

```
dimensions:
    axis_nbounds = 2 ;
    x = 754 ;
    y = 277 ;
    deptht = 200 ;
    time_counter = UNLIMITED ; // (28 currently)
variables:
    float nav_lat(y, x) ;
        nav_lat:standard_name = "latitude" ;
        nav_lat:long_name = "Latitude" ;
        nav_lat:units = "degrees_north" ;
    float nav_lon(y, x) ;
        nav_lon:standard_name = "longitude" ;
        nav_lon:long_name = "Longitude" ;
        nav_lon:units = "degrees_east" ;
    float deptht(deptht) ;
        deptht:name = "deptht" ;
        deptht:long_name = "Vertical T levels" ;
        deptht:units = "m" ;
        deptht:positive = "down" ;
        deptht:bounds = "deptht_bounds" ;
    float deptht_bounds(deptht, axis_nbounds) ;
        deptht_bounds:units = "m" ;
    double time_centered(time_counter) ;
        time_centered:standard_name = "time" ;
        time_centered:long_name = "Time axis" ;
        time_centered:calendar = "gregorian" ;
        time_centered:units = "seconds since 1900-01-01 00:00:00" ;
        time_centered:time_origin = "1900-01-01 00:00:00" ;
        time_centered:bounds = "time_centered_bounds" ;
    double time_centered_bounds(time_counter, axis_nbounds) ;
    double time_counter(time_counter) ;
        time_counter:axis = "T" ;
        time_counter:standard_name = "time" ;
        time_counter:long_name = "Time axis" ;
        time_counter:calendar = "gregorian" ;
        time_counter:units = "seconds since 1900-01-01 00:00:00" ;
        time_counter:time_origin = "1900-01-01 00:00:00" ;
        time_counter:bounds = "time_counter_bounds" ;
    double time_counter_bounds(time_counter, axis_nbounds) ;
    float toce(time_counter, deptht, y, x) ;
        toce:standard_name = "sea_water_potential_temperature" ;
        toce:long_name = "temperature" ;
        toce:units = "degC" ;
        toce:online_operation = "average" ;
        toce:interval_operation = "60 s" ;
        toce:interval_write = "6 h" ;
        toce:cell_methods = "time: mean (interval: 60 s)" ;
        toce:_FillValue = 1.e+20f ;
        toce:missing_value = 1.e+20f ;
        toce:coordinates = "time_centered nav_lat nav_lon" ;
    float soce(time_counter, deptht, y, x) ;
        soce:standard_name = "sea_water_practical_salinity" ;
        soce:long_name = "salinity" ;
        soce:units = "1e-3" ;
        soce:online_operation = "average" ;
        soce:interval_operation = "60 s" ;
        soce:interval_write = "6 h" ;
        soce:cell_methods = "time: mean (interval: 60 s)" ;
        soce:_FillValue = 1.e+20f ;
        soce:missing_value = 1.e+20f ;
        soce:coordinates = "time_centered nav_lat nav_lon" ;
    float taum(time_counter, y, x) ;
        taum:standard_name = "magnitude_of_surface_downward_stress" ;
        taum:long_name = "wind stress module" ;
        taum:units = "N/m2" ;
        taum:online_operation = "average" ;
        taum:interval_operation = "120 s" ;
        taum:interval_write = "6 h" ;
        taum:cell_methods = "time: mean (interval: 120 s)" ;
        taum:_FillValue = 1.e+20f ;
        taum:missing_value = 1.e+20f ;
        taum:coordinates = "time_centered nav_lat nav_lon" ;
    float wspd(time_counter, y, x) ;
        wspd:standard_name = "wind_speed" ;
        wspd:long_name = "wind speed module" ;
        wspd:units = "m/s" ;
        wspd:online_operation = "average" ;
        wspd:interval_operation = "120 s" ;
        wspd:interval_write = "6 h" ;
        wspd:cell_methods = "time: mean (interval: 120 s)" ;
        wspd:_FillValue = 1.e+20f ;
        wspd:missing_value = 1.e+20f ;
        wspd:coordinates = "time_centered nav_lat nav_lon" ;
```

And after the merge, the only difference is in the time dimension, which goes from 28 to 280 (or so).
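
For scale, a back-of-the-envelope estimate of the in-memory footprint implied by that header (a sketch; assumes everything stays float32 once loaded and ignores the small coordinate variables):

```python
# Sizes from the ncdump header above; float32 = 4 bytes per value.
x, y, deptht, t = 754, 277, 200, 28      # one file, 28 time steps

field_3d = x * y * deptht * t * 4        # toce or soce in one file
field_2d = x * y * t * 4                 # taum or wspd in one file

per_file = 2 * field_3d + 2 * field_2d
print(per_file / 1e9)                    # ~9.4 GB per file

# Ten merged files (time_counter 28 -> 280), fully loaded:
print(10 * per_file / 1e9)               # ~94 GB
```

That order of magnitude is broadly consistent with the roughly 80 GiB increment the memory profiler reported above for eight files.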

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Memory issue merging NetCDF files using xarray.open_mfdataset and to_netcdf 1506437087
1361621826 https://github.com/pydata/xarray/issues/7397#issuecomment-1361621826 https://api.github.com/repos/pydata/xarray/issues/7397 IC_kwDOAMm_X85RKLNC benoitespinola 720460 2022-12-21T16:28:15Z 2022-12-21T16:28:15Z NONE

By the way,

Inspecting .encoding on my data shows 'complevel': 1.
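
For reference, that inspection is typically something like this sketch (the variable name is assumed from the ncdump header above):

```python
import xarray as xr

ds = xr.open_mfdataset('./data/data_*.nc')

# Per-variable encoding as read from the source files; settings such as
# zlib, complevel, and chunksizes show up here.
print(ds['toce'].encoding)
```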

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Memory issue merging NetCDF files using xarray.open_mfdataset and to_netcdf 1506437087

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
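
The query behind this page can be reproduced against a local copy of the database, e.g. with Python's sqlite3 (the file name github.db is an assumption):

```python
import sqlite3

conn = sqlite3.connect('github.db')

# Same filter and ordering as the page: 6 rows for this issue and user.
rows = conn.execute(
    """
    SELECT id, created_at, updated_at, body
    FROM issue_comments
    WHERE issue = 1506437087 AND [user] = 720460
    ORDER BY updated_at DESC
    """
).fetchall()

for comment_id, created, updated, body in rows:
    print(comment_id, updated)
```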