html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue https://github.com/pydata/xarray/issues/1985#issuecomment-373694632,https://api.github.com/repos/pydata/xarray/issues/1985,373694632,MDEyOklzc3VlQ29tbWVudDM3MzY5NDYzMg==,22245117,2018-03-16T12:09:50Z,2018-03-16T12:09:50Z,CONTRIBUTOR,"Alright, I found the problem. I'm loading several variables from different files. All the variables have 1464 snapshots. However, one of the 3D variables has just one snapshot at a different time (I found a bug in my bash script that re-organizes the raw data). When I load my dataset using .open_mfdataset, the time dimension has an extra snapshot (length is 1465). xarray doesn't like it, and when I run functions such as to_netcdf it takes forever (no error). Thanks @fujiisoup for the help!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,304624171 https://github.com/pydata/xarray/issues/1985#issuecomment-372852604,https://api.github.com/repos/pydata/xarray/issues/1985,372852604,MDEyOklzc3VlQ29tbWVudDM3Mjg1MjYwNA==,6815844,2018-03-13T23:24:37Z,2018-03-13T23:24:37Z,MEMBER,"I see no problem with your code... Can you try updating xarray to 0.10.2 (released today)? 
We updated some logic of lazy indexing.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,304624171 https://github.com/pydata/xarray/issues/1985#issuecomment-372570107,https://api.github.com/repos/pydata/xarray/issues/1985,372570107,MDEyOklzc3VlQ29tbWVudDM3MjU3MDEwNw==,22245117,2018-03-13T07:21:10Z,2018-03-13T07:21:10Z,CONTRIBUTOR,"I forgot to mention that I'm getting this warning: /home/idies/anaconda3/lib/python3.5/site-packages/dask/core.py:306: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison elif type_arg is type(key) and arg == key: However, I don't think it is relevant since I get the same warning when I'm able to run .to_netcdf() on the 3D variable.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,304624171 https://github.com/pydata/xarray/issues/1985#issuecomment-372566304,https://api.github.com/repos/pydata/xarray/issues/1985,372566304,MDEyOklzc3VlQ29tbWVudDM3MjU2NjMwNA==,22245117,2018-03-13T07:01:51Z,2018-03-13T07:01:51Z,CONTRIBUTOR,"The problem occurs when I run the very last line, which is to_netcdf(). Right before, the dataset looks like this: ```python Dimensions: (X: 10, Y: 25, Z: 1, time: 2) Coordinates: * time (time) datetime64[ns] 2007-11-15 2007-11-16 * Z (Z) float64 1.0 * X (X) float64 -29.94 -29.89 -29.85 -29.81 -29.76 -29.72 -29.67 ... * Y (Y) float64 65.01 65.03 65.05 65.07 65.09 65.11 65.13 65.15 ... Data variables: drF (time, Z) float64 2.0 2.0 dxF (time, Y, X) float64 2.066e+03 2.066e+03 2.066e+03 2.066e+03 ... dyF (time, Y, X) float64 2.123e+03 2.123e+03 2.123e+03 2.123e+03 ... rA (time, Y, X) float64 4.386e+06 4.386e+06 4.386e+06 4.386e+06 ... fCori (time, Y, X) float64 0.0001322 0.0001322 0.0001322 0.0001322 ... R_low (time, Y, X) float64 -2.001e+03 -1.989e+03 -1.973e+03 ... 
Ro_surf (time, Y, X) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... Depth (time, Y, X) float64 2.001e+03 1.989e+03 1.973e+03 1.963e+03 ... HFacC (time, Z, Y, X) float64 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 ... Temp (time, Z, Y, X) float64 dask.array ``` This is a dask array, right?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,304624171 https://github.com/pydata/xarray/issues/1985#issuecomment-372563938,https://api.github.com/repos/pydata/xarray/issues/1985,372563938,MDEyOklzc3VlQ29tbWVudDM3MjU2MzkzOA==,6815844,2018-03-13T06:48:23Z,2018-03-13T06:48:23Z,MEMBER,"Umm. I could not find what is wrong with your code. Can you find which line loads the data into memory? If your data is still a dask array, it does not print the entries of the array; instead, it shows something like this: ```python dask.array Dimensions without coordinates: x ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,304624171 https://github.com/pydata/xarray/issues/1985#issuecomment-372558850,https://api.github.com/repos/pydata/xarray/issues/1985,372558850,MDEyOklzc3VlQ29tbWVudDM3MjU1ODg1MA==,22245117,2018-03-13T06:19:47Z,2018-03-13T06:23:00Z,CONTRIBUTOR,"I have the same issue if I don't copy the dataset. Here are the coordinates of my dataset: ```python Dimensions: (X: 960, Xp1: 961, Y: 880, Yp1: 881, Z: 216, Zl: 216, Zp1: 217, Zu: 216, time: 1465) Coordinates: * Z (Z) float64 1.0 3.5 7.0 11.5 17.0 23.5 31.0 39.5 49.0 59.5 ... * Zp1 (Zp1) float64 0.0 2.0 5.0 9.0 14.0 20.0 27.0 35.0 44.0 54.0 ... * Zu (Zu) float64 2.0 5.0 9.0 14.0 20.0 27.0 35.0 44.0 54.0 65.0 ... * Zl (Zl) float64 0.0 2.0 5.0 9.0 14.0 20.0 27.0 35.0 44.0 54.0 ... * X (X) float64 -46.92 -46.83 -46.74 -46.65 -46.57 -46.48 -46.4 ... * Y (Y) float64 56.81 56.85 56.89 56.93 56.96 57.0 57.04 57.08 ... 
* Xp1 (Xp1) float64 -46.96 -46.87 -46.78 -46.7 -46.61 -46.53 ... * Yp1 (Yp1) float64 56.79 56.83 56.87 56.91 56.95 56.98 57.02 ... * time (time) datetime64[ns] 2007-09-01 2007-09-01T06:00:00 ... ``` I don't think the horizontal coordinates are the problem because it works fine when I use the same function on 3D variables. I'm also attaching the function that I use to open the dataset, just in case it's helpful: ```python def load_dataset(): """""" Load the whole dataset """""" # Import grid and fields separately, then merge gridpath = '/home/idies/workspace/OceanCirculation/exp_ASR/grid_glued.nc' fldspath = '/home/idies/workspace/OceanCirculation/exp_ASR/result_*/output_glued/*.*_glued.nc' gridset = xr.open_dataset(gridpath, drop_variables = ['XU','YU','XV','YV','RC','RF','RU','RL']) fldsset = xr.open_mfdataset(fldspath, concat_dim = 'T', drop_variables = ['diag_levels','iter']) ds = xr.merge([gridset, fldsset]) # Adjust dimensions creating conflicts ds = ds.rename({'Z': 'Ztmp'}) ds = ds.rename({'T': 'time', 'Ztmp': 'Z', 'Zmd000216': 'Z'}) ds = ds.squeeze('Zd000001') for dim in ['Z','Zp1', 'Zu','Zl']: ds[dim].values = np.fabs(ds[dim].values) ds[dim].attrs.update({'positive': 'down'}) # Create horizontal vectors (remove zeros due to exch2) ds['X'].values = ds.XC.where((ds.XC!=0) & (ds.YC!=0)).mean(dim='Y', skipna=True) ds['Xp1'].values = ds.XG.where((ds.XG!=0) & (ds.YG!=0)).mean(dim='Yp1', skipna=True) ds['Y'].values = ds.YC.where((ds.XC!=0) & (ds.YC!=0)).mean(dim='X', skipna=True) ds['Yp1'].values = ds.YG.where((ds.XG!=0) & (ds.YG!=0)).mean(dim='Xp1', skipna=True) ds = ds.drop(['XC','YC','XG','YG']) # Create xgcm grid ds['Z'].attrs.update({'axis': 'Z'}) ds['X'].attrs.update({'axis': 'X'}) ds['Y'].attrs.update({'axis': 'Y'}) for dim in ['Zp1','Zu','Zl','Xp1','Yp1']: if min(ds[dim].values)<min(ds[dim[0]].values): ds[dim].attrs.update({'axis': dim[0], 'c_grid_axis_shift': -0.5}) elif min(ds[dim].values)>min(ds[dim[0]].values): ds[dim].attrs.update({'axis': dim[0], 'c_grid_axis_shift': +0.5}) grid = xgcm.Grid(ds,periodic=False) return ds, grid ``` I think somewhere I trigger the 
loading of the whole dataset. Otherwise, I don't understand why it works when I open just one month instead of the whole year.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,304624171 https://github.com/pydata/xarray/issues/1985#issuecomment-372545491,https://api.github.com/repos/pydata/xarray/issues/1985,372545491,MDEyOklzc3VlQ29tbWVudDM3MjU0NTQ5MQ==,6815844,2018-03-13T04:44:52Z,2018-03-13T04:48:56Z,MEMBER,"I notice this line ```python # Copy dataset ds = ds2cut.copy(deep=True) ``` loads the data into memory. I think you don't need to copy the dataset here. If you need to copy the data, it is more efficient to make a copy *after* the indexing.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,304624171 https://github.com/pydata/xarray/issues/1985#issuecomment-372544809,https://api.github.com/repos/pydata/xarray/issues/1985,372544809,MDEyOklzc3VlQ29tbWVudDM3MjU0NDgwOQ==,6815844,2018-03-13T04:39:47Z,2018-03-13T04:39:47Z,MEMBER,"> When I load the sub-dataset after using the indexing routines, does xarray need to read the whole original 4D variable? I don't think so. We support lazy indexing for arrays of any dimensionality (but not for coordinate variables). What does your data (especially '4Dvariable.nc') look like? Is `Xp1` a coordinate, or is it sufficiently small? `ds['Xp1'].values` loads `Xp1` into memory.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,304624171
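The root cause reported in the newest comment of this thread (one file contributing a single misaligned snapshot, so the time axis built by open_mfdataset grows from 1464 to 1465 entries) can be sketched without xarray at all. Below is a minimal stdlib-only illustration; the start date and six-hourly spacing are assumptions made for the example, not taken from the actual files:

```python
from datetime import datetime, timedelta

# Illustrative reproduction of the symptom: every file shares the same
# 1464 snapshots, except one file carrying a single snapshot at a
# different time (here 03:00 instead of a six-hourly slot).
t0 = datetime(2007, 9, 1)
common_times = [t0 + timedelta(hours=6 * i) for i in range(1464)]
stray_times = [t0 + timedelta(hours=3)]  # the misaligned snapshot

# open_mfdataset aligns files on the union of their coordinate values,
# so the concatenated time axis silently gains an extra entry.
merged = sorted(set(common_times) | set(stray_times))
print(len(merged))  # 1465, not 1464

# Diagnosing it: entries present in the merged axis but absent from a
# known-good file point at the offending snapshot.
extra = sorted(set(merged) - set(common_times))
print(extra[0])  # 2007-09-01 03:00:00
```

On a real dataset, the same diagnosis amounts to comparing the time coordinate returned by open_mfdataset against the time variable of one trusted file; once the stray snapshot is dropped (or the offending file is fixed), the axis returns to the expected length of 1464.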