html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/7772#issuecomment-1518429926,https://api.github.com/repos/pydata/xarray/issues/7772,1518429926,IC_kwDOAMm_X85agWbm,2448579,2023-04-21T23:56:26Z,2023-04-21T23:56:26Z,MEMBER,"I cannot reproduce this on `main`. What version are you running?

```
(xarray-tests) 17:55:11 [cgdm-caguas] {~/python/xarray/devel}
──────> python lazy-nbytes.py
8582842640
Filename: /Users/dcherian/work/python/xarray/devel/lazy-nbytes.py

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
     4    101.5 MiB    101.5 MiB           1   @profile
     5                                         def get_dataset_size() :
     6    175.9 MiB     74.4 MiB           1       dataset = xa.open_dataset(""test_1.nc"")
     7    175.9 MiB      0.0 MiB           1       print(dataset.nbytes)
```

The BackendArray types define `shape` and `dtype` so we can calculate size without loading the data.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1676561243
https://github.com/pydata/xarray/issues/7772#issuecomment-1517659721,https://api.github.com/repos/pydata/xarray/issues/7772,1517659721,IC_kwDOAMm_X85adaZJ,14808389,2023-04-21T11:05:40Z,2023-04-21T11:05:40Z,MEMBER,"That's a numpy array with sparse data. What @TomNicholas was talking about is an array of type `sparse.COO` (from the [sparse](https://github.com/pydata/sparse/) package). And as far as I can tell, our wrapper class (which is the reason why you don't get the memory error on open) does not define `nbytes`, so at the moment there's no way to do that.
You could try using `dask`, though, which does allow working with bigger-than-memory data.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1676561243
https://github.com/pydata/xarray/issues/7772#issuecomment-1516802286,https://api.github.com/repos/pydata/xarray/issues/7772,1516802286,IC_kwDOAMm_X85aaJDu,35968931,2023-04-20T18:58:48Z,2023-04-20T18:58:48Z,MEMBER,"Thanks for raising this @dabhicusp!

> So why have that if block at line 396?

Because xarray can wrap many different types of numpy-like arrays, and for some of those types the `self.size * self.dtype.itemsize` approach may not return the correct size. Think of a sparse matrix, for example: its size in memory is designed to be much smaller than the size of the matrix would suggest. That's why in general we defer to the underlying array itself to tell us its size if it can (i.e. if it has a `.nbytes` attribute).

But you're not using an unusual type of array, you're just opening a netCDF file as a numpy array, in theory lazily. The memory usage you're seeing is not desired, so something weird must be happening in the `.nbytes` call. Going deeper into the stack at that point would be helpful.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1676561243