issue_comments
8 rows where issue = 304624171 (Load a small subset of data from a big dataset takes forever), sorted by updated_at descending
373694632 | malmans2 22245117 | CONTRIBUTOR | created 2018-03-16T12:09:50Z | updated 2018-03-16T12:09:50Z
https://github.com/pydata/xarray/issues/1985#issuecomment-373694632

Alright, I found the problem. I'm loading several variables from different files. All the variables have 1464 snapshots, but one of the 3D variables has a single snapshot at a different time (I found a bug in the bash script I use to re-organize the raw data). When I load my dataset using .open_mfdataset, the time dimension therefore gets an extra snapshot (length 1465). xarray doesn't like this: functions such as to_netcdf take forever without raising any error. Thanks @fujiisoup for the help!

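Given that diagnosis, a cheap pre-flight check can catch a stray time stamp before open_mfdataset silently unions the time axes. A minimal sketch, assuming one NetCDF file per time chunk under a hypothetical directory layout:

```python
# Hypothetical pre-flight check: confirm every file to be combined shares
# the same time axis, so open_mfdataset cannot silently union them.
import glob
import xarray as xr

files = sorted(glob.glob("raw_output/*.nc"))  # hypothetical layout
reference = None
for path in files:
    with xr.open_dataset(path) as ds:
        times = ds["time"].values
    if reference is None:
        reference = times
    elif len(times) != len(reference) or (times != reference).any():
        print(f"{path}: time axis differs from {files[0]}")
```
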
372852604 | fujiisoup 6815844 | MEMBER | created 2018-03-13T23:24:37Z | updated 2018-03-13T23:24:37Z
https://github.com/pydata/xarray/issues/1985#issuecomment-372852604

I see no problem with your code... Can you try updating xarray to 0.10.2 (released today)? We updated some of the lazy-indexing logic.

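For reference, confirming which version is actually installed is a one-liner:

```python
import xarray as xr

# The lazy-indexing changes mentioned above shipped in 0.10.2;
# anything older predates them.
print(xr.__version__)
```
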
372570107 | malmans2 22245117 | CONTRIBUTOR | created 2018-03-13T07:21:10Z | updated 2018-03-13T07:21:10Z
https://github.com/pydata/xarray/issues/1985#issuecomment-372570107

I forgot to mention that I'm getting this warning:

/home/idies/anaconda3/lib/python3.5/site-packages/dask/core.py:306: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  elif type_arg is type(key) and arg == key:

However, I don't think it is relevant, since I get the same warning when I am able to run .to_netcdf() on the 3D variable.

372566304 | malmans2 22245117 | CONTRIBUTOR | created 2018-03-13T07:01:51Z | updated 2018-03-13T07:01:51Z
https://github.com/pydata/xarray/issues/1985#issuecomment-372566304

The problem occurs when I run the very last line, which is to_netcdf(). Right before that, the dataset looks like this:

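A minimal sketch of the workflow being described (the paths and the subset are hypothetical) shows why the failure can only surface at the last line: everything before it is lazy.

```python
import xarray as xr

ds = xr.open_mfdataset("year_of_output/*.nc")  # lazy: only metadata is read
subset = ds.isel(time=slice(0, 10))            # still lazy: builds a dask graph
subset.to_netcdf("subset.nc")                  # the first real read happens here
```
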
372563938 | fujiisoup 6815844 | MEMBER | created 2018-03-13T06:48:23Z | updated 2018-03-13T06:48:23Z
https://github.com/pydata/xarray/issues/1985#issuecomment-372563938

Hmm. I could not find anything wrong with your code. Can you find which line loads the data into memory? If your data is still a dask array, printing it does not show the entries of the array; instead, it shows something like this:

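One way to answer that question, as a sketch (the path and variable name are hypothetical): a still-lazy variable wraps a dask array, so both its type and its repr give it away.

```python
import dask.array
import xarray as xr

ds = xr.open_mfdataset("year_of_output/*.nc")   # hypothetical path
var = ds["Temp"]                                # hypothetical variable name

# While the variable is still lazy, its .data is a dask array and its
# repr shows dask.array<...> instead of the values.
print(isinstance(var.data, dask.array.Array))   # True while still lazy
print(var)
```
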
372558850 | malmans2 22245117 | CONTRIBUTOR | created 2018-03-13T06:19:47Z | updated 2018-03-13T06:23:00Z
https://github.com/pydata/xarray/issues/1985#issuecomment-372558850

I have the same issue if I don't copy the dataset. Here are the coordinates of my dataset:

I think somewhere I trigger the loading of the whole dataset. Otherwise, I don't understand why it works when I open just one month instead of the whole year.

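Given the eventual diagnosis (one file smuggling in an extra time snapshot), one way to localize a month-works-but-year-fails problem is to open progressively larger subsets of files and watch the time length. A hedged sketch, with a hypothetical file layout:

```python
import glob
import xarray as xr

files = sorted(glob.glob("year_of_output/*.nc"))  # hypothetical layout
for n in range(1, len(files) + 1):
    with xr.open_mfdataset(files[:n]) as ds:
        # An unexpected jump in the time length flags the offending file.
        print(n, files[n - 1], ds.sizes["time"])
```
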
372545491 | fujiisoup 6815844 | MEMBER | created 2018-03-13T04:44:52Z | updated 2018-03-13T04:48:56Z
https://github.com/pydata/xarray/issues/1985#issuecomment-372545491

I notice this line

372544809 | fujiisoup 6815844 | MEMBER | created 2018-03-13T04:39:47Z | updated 2018-03-13T04:39:47Z
https://github.com/pydata/xarray/issues/1985#issuecomment-372544809

I don't think so. We support lazy indexing for arrays of any dimensionality (but not for coordinate variables). What does your data (especially '4Dvariable.nc') look like? Is

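As a sketch of what lazy indexing buys you here ('4Dvariable.nc' comes from the question above; the variable name is invented): indexing a lazily opened variable along any dimension defers the read until the values are actually needed.

```python
import xarray as xr

# '4Dvariable.nc' is the file named above; "var4d" is a hypothetical name.
ds = xr.open_dataset("4Dvariable.nc")
sub = ds["var4d"].isel(time=0)  # lazy indexing: nothing is read yet
print(sub.shape)                # shape is known without touching the data
sub.load()                      # only now is the selected subset read
```
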