html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/7772#issuecomment-1519897098,https://api.github.com/repos/pydata/xarray/issues/7772,1519897098,IC_kwDOAMm_X85al8oK,123355381,2023-04-24T10:51:16Z,2023-04-24T10:51:16Z,NONE,Thank you @dcherian. I cannot reproduce this on `main` either.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1676561243
https://github.com/pydata/xarray/issues/7772#issuecomment-1518429926,https://api.github.com/repos/pydata/xarray/issues/7772,1518429926,IC_kwDOAMm_X85agWbm,2448579,2023-04-21T23:56:26Z,2023-04-21T23:56:26Z,MEMBER,"I cannot reproduce this on `main`. What version are you running?
```
(xarray-tests) 17:55:11 [cgdm-caguas] {~/python/xarray/devel}
──────> python lazy-nbytes.py
8582842640
Filename: /Users/dcherian/work/python/xarray/devel/lazy-nbytes.py
Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
     4    101.5 MiB    101.5 MiB           1   @profile
     5                                         def get_dataset_size() :
     6    175.9 MiB     74.4 MiB           1       dataset = xa.open_dataset(""test_1.nc"")
     7    175.9 MiB      0.0 MiB           1       print(dataset.nbytes)
```
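The numbers above come from a purely lazy computation. As an illustrative sketch (the shape and dtype here are hypothetical stand-ins, not read from any file), the size can be derived from metadata alone:

```
import numpy as np

# Illustrative only: a lazily-backed variable needs just its shape and
# dtype to report a size; no values are read from disk.
shape = (5, 7210, 7440)          # hypothetical on-disk variable shape
dtype = np.dtype('float64')

nbytes = int(np.prod(shape)) * dtype.itemsize
print(nbytes)  # 2145696000 bytes, computed from metadata only
```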
The BackendArray types define `shape` and `dtype` so we can calculate size without loading the data.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1676561243
https://github.com/pydata/xarray/issues/7772#issuecomment-1517659721,https://api.github.com/repos/pydata/xarray/issues/7772,1517659721,IC_kwDOAMm_X85adaZJ,14808389,2023-04-21T11:05:40Z,2023-04-21T11:05:40Z,MEMBER,"That's a numpy array with sparse data. What @TomNicholas was talking about is an array of type `sparse.COO` (from the [sparse](https://github.com/pydata/sparse/) package).
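To illustrate the difference, here is a toy COO-style container built with plain numpy (the real implementation is `sparse.COO`; this sketch only mimics the idea that a sparse array's true memory footprint is the stored values plus their coordinates, not `size * itemsize`):

```
import numpy as np

# Toy COO-style container, for illustration only.
class ToyCOO:
    def __init__(self, dense):
        self.shape = dense.shape
        self.dtype = dense.dtype
        idx = np.nonzero(dense)
        self.coords = np.stack(idx)   # indices of the nonzero entries
        self.data = dense[idx]        # the nonzero values only

    @property
    def nbytes(self):
        # memory actually held: stored values plus their coordinates
        return self.data.nbytes + self.coords.nbytes

dense = np.zeros((1000, 1000))
dense[0, :] = 1.0                     # only one row is nonzero
s = ToyCOO(dense)
print(dense.size * dense.dtype.itemsize)  # dense estimate: 8000000
print(s.nbytes)                           # far smaller for sparse data
```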
And as far as I can tell, our wrapper class (which is the reason why you don't get the memory error on open) does not define `nbytes`, so at the moment there's no way to do that. You could try using `dask`, though, which does allow working with bigger-than-memory data.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1676561243
https://github.com/pydata/xarray/issues/7772#issuecomment-1517649648,https://api.github.com/repos/pydata/xarray/issues/7772,1517649648,IC_kwDOAMm_X85adX7w,123355381,2023-04-21T10:57:28Z,2023-04-21T10:57:28Z,NONE,"The first point that you mentioned does not seem to be correct. Please see the code below (we used a sparse matrix) and its output:
```
import xarray as xa
import numpy as np

def get_data():
    lat_dim = 7210
    lon_dim = 7440
    lat = [0] * lat_dim
    lon = [0] * lon_dim
    time = [0] * 5
    nlats = lat_dim; nlons = lon_dim; ntimes = 5
    var_1 = np.empty((ntimes, nlats, nlons))
    var_2 = np.empty((ntimes, nlats, nlons))
    var_3 = np.empty((ntimes, nlats, nlons))
    var_4 = np.empty((ntimes, nlats, nlons))
    data_arr = np.random.uniform(low=0, high=0, size=(ntimes, nlats, nlons))
    data_arr[:, 0, :] = 1
    data_arr[:, :, 1] = 1
    var_1[:, :, :] = data_arr
    var_2[:, :, :] = data_arr
    var_3[:, :, :] = data_arr
    var_4[:, :, :] = data_arr
    dataset = xa.Dataset(
        data_vars={
            'var_1': (('time', 'lat', 'lon'), var_1),
            'var_2': (('time', 'lat', 'lon'), var_2),
            'var_3': (('time', 'lat', 'lon'), var_3),
            'var_4': (('time', 'lat', 'lon'), var_4)},
        coords={
            'lat': lat,
            'lon': lon,
            'time': time})
    print(sum(v.size * v.dtype.itemsize for v in dataset.variables.values()))
    print(dataset.nbytes)

if __name__ == ""__main__"":
    get_data()
```
```
8582901240
8582901240
```
As we can observe, both `nbytes` and `self.size * self.dtype.itemsize` give the same size here.
And for the 2nd point: can you share any solution for `nbytes` on a `netCDF` or `grib` file? It takes too much memory and the process gets killed.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1676561243
https://github.com/pydata/xarray/issues/7772#issuecomment-1516802286,https://api.github.com/repos/pydata/xarray/issues/7772,1516802286,IC_kwDOAMm_X85aaJDu,35968931,2023-04-20T18:58:48Z,2023-04-20T18:58:48Z,MEMBER,"Thanks for raising this @dabhicusp !
> So why have that if block at line 396?
Because xarray can wrap many different types of numpy-like arrays, and for some of those types the `self.size * self.dtype.itemsize` approach may not return the correct size. Take a sparse matrix, for example: its size in memory is designed to be much smaller than the size of the matrix would suggest. That's why in general we defer to the underlying array itself to tell us its size if it can (i.e. if it has a `.nbytes` attribute).
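That deferral can be sketched roughly like this (illustrative only, not xarray's actual code; `variable_nbytes` is a hypothetical helper):

```
import numpy as np

def variable_nbytes(data):
    # Defer to the array's own nbytes when it has one (e.g. a sparse
    # array reports its true memory footprint); otherwise fall back
    # to the dense estimate. Rough sketch, not xarray's actual code.
    if hasattr(data, 'nbytes'):
        return data.nbytes
    return data.size * data.dtype.itemsize

arr = np.ones((10, 10), dtype='float64')
print(variable_nbytes(arr))  # 800
```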
But you're not using an unusual type of array, you're just opening a netCDF file as a numpy array, in theory lazily. The memory usage you're seeing is not desired, so something weird must be happening in the `.nbytes` call. Going deeper into the stack at that point would be helpful.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1676561243
https://github.com/pydata/xarray/issues/7772#issuecomment-1516188394,https://api.github.com/repos/pydata/xarray/issues/7772,1516188394,IC_kwDOAMm_X85aXzLq,30606887,2023-04-20T11:46:04Z,2023-04-20T11:46:04Z,NONE,"Thanks for opening your first issue here at xarray! Be sure to follow the issue template!
If you have an idea for a solution, we would really welcome a Pull Request with proposed changes.
See the [Contributing Guide](https://docs.xarray.dev/en/latest/contributing.html) for more.
It may take us a while to respond here, but we really value your contribution. Contributors like you help make xarray better.
Thank you!
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1676561243