issues


8 rows where user = 30219501 sorted by updated_at descending

#2231 · Time bounds returned after an operation with resample-method
opened 2018-06-13T14:22:49Z by rpnaut (30219501) · open · 8 comments · last updated 2022-04-17T23:43:48Z · repo: xarray

Problem description

For data mining with xarray there is a recurring issue with the resample method. If I resample e.g. a time series of hourly values to monthly values, the netCDF standards tell us to put the following information into the result file:

  1. the bounds for each timestep over which the aggregation was taken (for each month the beginning and the end of the month)
  2. the method which was used for aggregation, encoded in the variable attribute 'cell_methods' (e.g. 'time: mean').

The current implementation should be improved, as the following data example shows.

Data example

I have a dataset with hourly values over a period of 5 months:

```python
<xarray.Dataset>
Dimensions:       (bnds: 2, time: 3672)
Coordinates:
    rlon          float32 22.06
    rlat          float32 5.06
  * time          (time) datetime64[ns] 2006-05-01 2006-05-01T01:00:00 ...
Dimensions without coordinates: bnds
Data variables:
    rotated_pole  int32 1
    time_bnds     (time, bnds) float64 1.304e+07 1.305e+07 1.305e+07 ...
    TOT_PREC      (time) float64 nan nan nan nan nan nan nan nan nan nan nan ...
Attributes:
```

Doing a resample process using the mean operator gives:

```python
In [36]: frs
Out[36]:
<xarray.Dataset>
Dimensions:       (bnds: 2, time: 5)
Coordinates:
  * time          (time) datetime64[ns] 2006-05-31 2006-06-30 2006-07-31 ...
Dimensions without coordinates: bnds
Data variables:
    rotated_pole  (time) float64 1.0 1.0 1.0 1.0 1.0
    time_bnds     (time, bnds) float64 1.438e+07 1.438e+07 1.702e+07 ...
    TOT_PREC      (time) float64 12.0 nan nan nan nan
```

Here time_bnds is still in the file, but its content is very strange:

```python
In [37]: frs["time_bnds"]
Out[37]:
<xarray.DataArray 'time_bnds' (time: 5, bnds: 2)>
array([[  1.438020e+07,   1.438380e+07],
       [  1.701540e+07,   1.701900e+07],
       [  1.965060e+07,   1.965420e+07],
       [  2.232900e+07,   2.233260e+07],
       [ -6.330338e+10,  -6.330338e+10]])
Coordinates:
  * time     (time) datetime64[ns] 2006-05-31 2006-06-30 2006-07-31 ...
Dimensions without coordinates: bnds
```

So xarray still knows that time_bnds is related to the coordinate time. However, the values are not correct. The first time_bnds entry should be [2006-05-01 00:00, 2006-05-31 23:00]. That is definitely not the case: the numbers here are still relative to the original file's reference (seconds since 2005-12-01), but they do not match my expectation. 1.438020e+07 corresponds to Tuesday, 16 May 2006, 10:30:00 and 1.438380e+07 to Tuesday, 16 May 2006, 11:30:00. Moreover, xarray does not change the unit of time_bnds to match the unit of the variable 'time' when the data is written to netCDF. Output of the program ncdump reveals that time was changed to "days since" while time_bnds still seems to be coded in "seconds since":

```
ncdump -v time_bnds try.nc
netcdf try {
dimensions:
        time = 5 ;
        bnds = 2 ;
variables:
        double rotated_pole(time) ;
                rotated_pole:_FillValue = NaN ;
        double time_bnds(time, bnds) ;
                time_bnds:_FillValue = NaN ;
        double TOT_PREC(time) ;
                TOT_PREC:_FillValue = NaN ;
        int64 time(time) ;
                time:units = "days since 2006-05-31 00:00:00" ;
                time:calendar = "proleptic_gregorian" ;
data:

 time_bnds =
  14380200, 14383800,
  17015400, 17019000,
  19650600, 19654200,
  22329000, 22332600,
  -63303379200, -63303379200 ;
}
```

Is there a recommendation on what to do?
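Not part of the original report, but one workaround sketch (synthetic data standing in for the real file): since resample() aggregates time_bnds like any other data variable, the bounds can simply be rebuilt from the label index after resampling.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in: 3672 hourly steps starting 2006-05-01 (May..Sep).
time = pd.date_range("2006-05-01", periods=3672, freq="h")
ds = xr.Dataset(
    {"TOT_PREC": ("time", np.random.rand(3672))},
    coords={"time": time},
)

# Monthly means, labelled at month start ("MS").
monthly = ds.resample(time="MS").mean()

# Rebuild CF-style bounds: each pair spans the labelled month from its
# first to its last instant.
periods = monthly["time"].to_index().to_period("M")
bounds = np.stack([periods.start_time, periods.end_time], axis=1)
monthly["time_bnds"] = (("time", "bnds"), bounds)
monthly["time"].attrs["bounds"] = "time_bnds"
```

Newer xarray releases also take care to encode a variable referenced via the 'bounds' attribute with the same units as its parent coordinate when writing netCDF.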

#2861 · WHERE function, problems with memory operations?
opened 2019-04-01T11:09:11Z by rpnaut (30219501) · closed 2022-04-09T15:41:51Z · 8 comments · repo: xarray

I am struggling with the where functionality in xarray. I have two datasets:

```python
ref = array([[14.82, 14.94,   nan, ..., 16.21, 16.24,   nan],
       [14.52, 14.97,   nan, ..., 16.32, 16.34,   nan],
       [15.72, 16.09,   nan, ..., 17.38, 17.44,   nan],
       ...,
       [ 6.55,  6.34,   nan, ...,  6.67,  6.6 ,   nan],
       [ 8.76,  9.12,   nan, ...,  9.07,  9.52,   nan],
       [ 8.15,  8.97,   nan, ...,  9.65,  9.52,   nan]], dtype=float32)
Coordinates:
  * height_WSS  (height_WSS) float32 40.3 50.3 60.3 70.3 80.3 90.3 101.2 105.0
    lat         float32 54.01472
    lon         float32 6.5875
  * time        (time) datetime64[ns] 2006-10-31T00:10:00 ... 2006-11-03T23:10:00
Attributes:
    standard_name:  wind_speed
    long_name:      wind speed
    units:          m s-1
    cell_methods:   time: mean
    comment:        direction of the boom holding the measurement devices: 41...
    sensor:         cup anemometer
    sensor_type:    Vector Instruments Windspeed Ltd. A100LK/PC3/WR
    accuracy:       0.1 m s-1
```

and

```python
proof = <xarray.DataArray 'WSS' (time: 96, height_WSS: 8)>
array([[13.395692, 13.653825, 13.911958, ..., 14.511758, 14.703774, 14.770716],
       [14.740592, 15.010887, 15.281183, ..., 15.866542, 16.045753, 16.10823 ],
       [15.241853, 15.523318, 15.804785, ..., 16.417458, 16.605673, 16.67129 ],
       ...,
       [ 8.254081,  8.309716,  8.365352, ...,  8.46401 ,  8.489728,  8.498694],
       [ 9.83241 ,  9.895019,  9.957627, ..., 10.055538, 10.077768, 10.085519],
       [ 8.772054,  8.849378,  8.926702, ...,  9.065577,  9.102219,  9.114992]], dtype=float32)
Coordinates:
  * time        (time) datetime64[ns] 2006-10-31T00:10:00 ... 2006-11-03T23:10:00
    lon         float32 6.5875
    lat         float32 54.01472
  * height_WSS  (height_WSS) float32 40.3 50.3 60.3 70.3 80.3 90.3 101.2 105.0
Attributes:
    standard_name:  wind_speed
    long_name:      wind speed
    units:          m s-1
```

Applying something like this:

```python
DSproof = proof["WSS"].where(ref["WSS"].notnull()).to_dataset(name="WSS")
```

gives me a data array of time length zero:

```python
<xarray.Dataset>
Dimensions:     (height_WSS: 8, time: 0)
Coordinates:
  * time        (time) datetime64[ns]
    lon         float32 6.5875
    lat         float32 54.01472
  * height_WSS  (height_WSS) float32 40.3 50.3 60.3 70.3 80.3 90.3 101.2 105.0
Data variables:
    WSS         (time, height_WSS) float32
```

Problem description

The problem seems to be that 'ref' and 'proof' are somehow not entirely consistent regarding their coordinates. But if I subtract the coordinates from each other, I do not get a difference. Since I constantly fight to get datasets consistent with each other for mathematical calculations in xarray, I have figured out the following workarounds:

  1. One can drop the coordinates lon and lat from both datasets. Then everything works fine with 'where'.
  2. I am using WHERE in a large script with some operations done before WHERE is called. One operation makes the data types and the coordinate names of 'ref' and 'proof' consistent (that is why the two printouts above look so similar). If I save the files and reload them immediately before applying WHERE, my problem is fixed.
  3. Selecting all height levels explicitly, i.e. proof["WSS"].isel(height=slice(0,9)).where(ref["WSS"].isel(height=slice(0,9)).notnull()).to_dataset(name="WSS"), also fixes my problem.

Maybe I am dealing with a problem of incomplete operations in memory? The printouts of the datasets may look consistent, yet an additional operation seems to be required to make the datasets consistent in memory?
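A minimal reproduction sketch (not from the original report; the one-nanosecond offset is an assumed stand-in for whatever invisible index mismatch is present): where() aligns its arguments on their indexes with an inner join, so index values that differ at all intersect to an empty axis.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Two otherwise identical arrays whose time indexes differ by 1 ns --
# invisible in the printed repr, fatal for inner-join alignment.
t1 = pd.date_range("2006-10-31 00:10", periods=96, freq="h")
t2 = t1 + pd.Timedelta(1, "ns")
proof = xr.DataArray(np.ones(96), dims="time", coords={"time": t1})
ref = xr.DataArray(np.ones(96), dims="time", coords={"time": t2})

# Inner join of disjoint indexes -> zero-length time dimension.
assert proof.where(ref.notnull()).sizes["time"] == 0

# Overriding one index with the other (after checking the order!) fixes it.
ref_fixed = ref.assign_coords(time=proof["time"])
assert proof.where(ref_fixed.notnull()).sizes["time"] == 96
```

This would also explain why a save/reload cycle helps: encoding the time axis to netCDF can round both datasets onto a common resolution.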

Thanks in advance for your help

#2863 · Memory Error for simple operations on NETCDF4 internally zipped files
opened 2019-04-02T11:48:01Z by rpnaut (30219501) · closed 2022-04-09T02:15:45Z · 3 comments · repo: xarray

Assume you want to do simple computations with a data array loaded from an internally compressed NETCDF4 file. First you need to load the dataset:

```python
In [2]: eobs = xarray.open_dataset("eObs_ens_mean_0.1deg_reg_v18.0e.T_2M.1950-2018.nc")
In [3]: eobs
Out[3]:
<xarray.Dataset>
Dimensions:  (lat: 465, lon: 705, time: 25049)
Coordinates:
  * time     (time) datetime64[ns] 1950-01-01 1950-01-02 1950-01-03 ...
  * lon      (lon) float64 -24.95 -24.85 -24.75 -24.65 -24.55 -24.45 -24.35 ...
  * lat      (lat) float64 25.05 25.15 25.25 25.35 25.45 25.55 25.65 25.75 ...
Data variables:
    T_2M     (time, lat, lon) float64 nan nan nan nan nan nan nan nan nan ...
Attributes:
    _NCProperties:  version=1|netcdflibversion=4.4.1|hdf5libversion=1.8.17
    E-OBS_version:  18.0e
    Conventions:    CF-1.4
    References:     http://surfobs.climate.copernicus.eu/dataaccess/access_eo...
```

Afterwards I have tried to do this:

```python
In [4]: datarray = eobs["T_2M"] + 273.15

MemoryError                               Traceback (most recent call last)
<ipython-input-4-eaff3bff5e27> in <module>()
----> 1 datarray=eobs["T_2M"]+273.15

/sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/xarray-0.9.5-py3.5.egg/xarray/core/dataarray.py in func(self, other)
   1539
   1540         variable = (f(self.variable, other_variable)
-> 1541                     if not reflexive
   1542                     else f(other_variable, self.variable))
   1543         coords = self.coords._merge_raw(other_coords)

/sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/xarray-0.9.5-py3.5.egg/xarray/core/variable.py in func(self, other)
   1139         if isinstance(other, (xr.DataArray, xr.Dataset)):
   1140             return NotImplemented
-> 1141         self_data, other_data, dims = _broadcast_compat_data(self, other)
   1142         new_data = (f(self_data, other_data)
   1143                     if not reflexive

/sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/xarray-0.9.5-py3.5.egg/xarray/core/variable.py in _broadcast_compat_data(self, other)
   1379     else:
   1380         # rely on numpy broadcasting rules
-> 1381         self_data = self.data
   1382         other_data = other
   1383         dims = self.dims

/sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/xarray-0.9.5-py3.5.egg/xarray/core/variable.py in data(self)
    265             return self._data
    266         else:
--> 267             return self.values
    268
    269     @data.setter

/sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/xarray-0.9.5-py3.5.egg/xarray/core/variable.py in values(self)
    306     def values(self):
    307         """The variable's data as a numpy.ndarray"""
--> 308         return _as_array_or_item(self._data)
    309
    310     @values.setter

/sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/xarray-0.9.5-py3.5.egg/xarray/core/variable.py in _as_array_or_item(data)
    182     TODO: remove this (replace with np.asarray) once these issues are fixed
    183     """
--> 184     data = np.asarray(data)
    185     if data.ndim == 0:
    186         if data.dtype.kind == 'M':

/sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/numpy-1.11.2-py3.5-linux-x86_64.egg/numpy/core/numeric.py in asarray(a, dtype, order)
    480
    481     """
--> 482     return array(a, dtype, copy=False, order=order)
    483
    484 def asanyarray(a, dtype=None, order=None):

/sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/xarray-0.9.5-py3.5.egg/xarray/core/indexing.py in __array__(self, dtype)
    417
    418     def __array__(self, dtype=None):
--> 419         self._ensure_cached()
    420         return np.asarray(self.array, dtype=dtype)
    421

/sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/xarray-0.9.5-py3.5.egg/xarray/core/indexing.py in _ensure_cached(self)
    414     def _ensure_cached(self):
    415         if not isinstance(self.array, np.ndarray):
--> 416             self.array = np.asarray(self.array)
    417
    418     def __array__(self, dtype=None):

/sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/numpy-1.11.2-py3.5-linux-x86_64.egg/numpy/core/numeric.py in asarray(a, dtype, order)
    480
    481     """
--> 482     return array(a, dtype, copy=False, order=order)
    483
    484 def asanyarray(a, dtype=None, order=None):

/sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/xarray-0.9.5-py3.5.egg/xarray/core/indexing.py in __array__(self, dtype)
    398
    399     def __array__(self, dtype=None):
--> 400         return np.asarray(self.array, dtype=dtype)
    401
    402     def __getitem__(self, key):

/sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/numpy-1.11.2-py3.5-linux-x86_64.egg/numpy/core/numeric.py in asarray(a, dtype, order)
    480
    481     """
--> 482     return array(a, dtype, copy=False, order=order)
    483
    484 def asanyarray(a, dtype=None, order=None):

/sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/xarray-0.9.5-py3.5.egg/xarray/core/indexing.py in __array__(self, dtype)
    373     def __array__(self, dtype=None):
    374         array = orthogonally_indexable(self.array)
--> 375         return np.asarray(array[self.key], dtype=None)
    376
    377     def __getitem__(self, key):

/sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/xarray-0.9.5-py3.5.egg/xarray/conventions.py in __getitem__(self, key)
    361     def __getitem__(self, key):
    362         return mask_and_scale(self.array[key], self.fill_value,
--> 363                               self.scale_factor, self.add_offset, self._dtype)
    364
    365     def __repr__(self):

/sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/xarray-0.9.5-py3.5.egg/xarray/conventions.py in mask_and_scale(array, fill_value, scale_factor, add_offset, dtype)
     57     """
     58     # by default, cast to float to ensure NaN is meaningful
---> 59     values = np.array(array, dtype=dtype, copy=True)
     60     if fill_value is not None and not np.all(pd.isnull(fill_value)):
     61         if getattr(fill_value, 'size', 1) > 1:

MemoryError:
```

I have uploaded the datafile to the following link:

https://swiftbrowser.dkrz.de/public/dkrz_c0725fe8741c474b97f291aac57f268f/GregorMoeller/

Am I using the wrong netCDF engine?
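For scale: 25049 × 465 × 705 float64 values are roughly 65 GB once mask_and_scale casts the data to float64, so the MemoryError is expected on most machines regardless of engine. One way out, sketched below on a small synthetic stand-in for the file, is to process the array in slices so only one block is decoded in memory at a time:

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the e-OBS file (the real array is too large for RAM).
ds = xr.Dataset(
    {"T_2M": (("time", "lat", "lon"),
              np.random.rand(10, 4, 5).astype("float32"))}
)

# Slice-wise arithmetic: each block is decoded, shifted, and released.
pieces = []
for start in range(0, ds.sizes["time"], 4):
    block = ds["T_2M"].isel(time=slice(start, start + 4))
    pieces.append(block + 273.15)
kelvin = xr.concat(pieces, dim="time")
```

With dask installed, `xarray.open_dataset(..., chunks={"time": 1000})` gives the same out-of-core behaviour without the explicit loop.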

#2356 · New Resample-Syntax leading to cancellation of dimensions
opened 2018-08-09T10:56:29Z by rpnaut (30219501) · closed 2019-10-15T15:01:33Z · 8 comments · repo: xarray

Example

Starting with the dataset located here: https://swiftbrowser.dkrz.de/public/dkrz_c0725fe8741c474b97f291aac57f268f/GregorMoeller/, I want to calculate monthly sums of precipitation for each gridpoint in the daily data:

```python
In [39]: data = xarray.open_dataset("eObs_gridded_0.22deg_rot_v14.0.TOT_PREC.1950-2016.nc_CutParamTimeUnitCor_FinalEvalGrid")
In [40]: data
Out[13]:
<xarray.Dataset>
Dimensions:       (rlat: 136, rlon: 144, time: 153)
Coordinates:
  * rlon          (rlon) float32 -22.6 -22.38 -22.16 -21.94 -21.72 -21.5 ...
  * rlat          (rlat) float32 -12.54 -12.32 -12.1 -11.88 -11.66 -11.44 ...
  * time          (time) datetime64[ns] 2006-05-01T12:00:00 ...
Data variables:
    rotated_pole  int32 ...
    TOT_PREC      (time, rlat, rlon) float32 ...
Attributes:
    CDI:                       Climate Data Interface version 1.8.0 (http://m...
    Conventions:               CF-1.6
    history:                   Thu Jun 14 12:34:59 2018: cdo -O -s -P 4 remap...
    CDO:                       Climate Data Operators version 1.8.0 (http://m...
    cdo_openmp_thread_number:  4

In [41]: datamonth = data["TOT_PREC"].resample(time="M").sum()
In [42]: datamonth
Out[42]:
<xarray.DataArray 'TOT_PREC' (time: 5)>
array([ 551833.25   ,  465640.09375,  328445.90625,  836892.1875 ,
        503601.5    ], dtype=float32)
Coordinates:
    time  (time) datetime64[ns] 2006-05-31 2006-06-30 2006-07-31 ...
```

Problem description

The problem is that the dimensions 'rlon' and 'rlat' and the corresponding coordinates have not survived the resample process. Only the time is present in the result.

Expected Output

I expect the spatial dimensions to still be present in the output of monthly sums. The surprise is that this works with the old syntax:

```python
In [41]: datamonth = data["TOT_PREC"].resample(dim="time", freq="M", how="sum")
/usr/bin/ipython3:1: FutureWarning: .resample() has been modified to defer calculations. Instead of passing 'dim' and how="sum", instead consider using .resample(time="M").sum('time')
  #!/usr/bin/env python3

In [42]: datamonth
Out[42]:
<xarray.DataArray 'TOT_PREC' (time: 5, rlat: 136, rlon: 144)>
array([[[  0.      ,   0.      , ...,   0.      ,   0.      ],
        [  0.      ,   0.      , ...,   0.      ,   0.      ],
        ...,
        [  0.      ,   0.      , ...,  44.900028,  41.400024],
        [  0.      ,   0.      , ...,  49.10001 ,  46.5     ]]], dtype=float32)
Coordinates:
  * time    (time) datetime64[ns] 2006-05-31 2006-06-30 2006-07-31 ...
  * rlon    (rlon) float32 -22.6 -22.38 -22.16 -21.94 -21.72 -21.5 -21.28 ...
  * rlat    (rlat) float32 -12.54 -12.32 -12.1 -11.88 -11.66 -11.44 -11.22 ...
```

What is wrong here?

May I also ask why the new syntax does not consider use cases with highly dynamic scripting? I do not like hard-coding a dimension name in my programs, i.e. time=${freq} instead of dim=${dim}; freq=${freq}.
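On the scripting concern: the new syntax does not actually force a hard-coded dimension name, because the keyword argument can be built as a dict and unpacked (a sketch with synthetic data; note that calling .sum(dim) with the dimension named explicitly also preserves the spatial dimensions):

```python
import numpy as np
import pandas as pd
import xarray as xr

# dim and freq held in variables, as in a generic evaluation script.
dim, freq = "time", "MS"

da = xr.DataArray(
    np.random.rand(153, 3, 4),
    dims=("time", "rlat", "rlon"),
    coords={"time": pd.date_range("2006-05-01", periods=153, freq="D")},
)

# **{dim: freq} unpacks to resample(time="MS"); sum(dim) reduces time only,
# so rlat and rlon survive.
monthly = da.resample(**{dim: freq}).sum(dim)
```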

#2230 · Inconsistency between Sum of NA's and Mean of NA's: resampling gives 0 or 'NA'
opened 2018-06-13T12:54:47Z by rpnaut (30219501) · closed 2018-08-16T06:59:33Z · 7 comments · repo: xarray

Problem description

For data mining with xarray there is a recurring issue with the resample method. If I resample e.g. a daily time series over one month and the data are 'NA' on every day, I get zero as a result. That is annoying for a precipitation time series: it makes a real difference whether the monthly precipitation is zero for one month (zero precipitation on each day) or the monthly precipitation was simply not measured due to problems with the device (NA on each day).

Data example

I have a dataset 'fcut' with hourly values for 5 months:

```python
<xarray.Dataset>
Dimensions:       (bnds: 2, time: 3672)
Coordinates:
    rlon          float32 22.06
    rlat          float32 5.06
  * time          (time) datetime64[ns] 2006-05-01 2006-05-01T01:00:00 ...
Dimensions without coordinates: bnds
Data variables:
    rotated_pole  int32 1
    time_bnds     (time, bnds) float64 1.304e+07 1.305e+07 1.305e+07 ...
    TOT_PREC      (time) float64 nan nan nan nan nan nan nan nan nan nan nan ...
Attributes:
```

Doing a resample process gives only zero values for each month:

```python
In [10]: fcut.resample(dim='time', freq='M', how='sum')
Out[10]:
<xarray.Dataset>
Dimensions:       (bnds: 2, time: 5)
Coordinates:
  * time          (time) datetime64[ns] 2006-05-31 2006-06-30 2006-07-31 ...
Dimensions without coordinates: bnds
Data variables:
    rotated_pole  (time) int64 1 1 1 1 1
    time_bnds     (time, bnds) float64 1.07e+10 1.07e+10 1.225e+10 1.225e+10 ...
    TOT_PREC      (time) float64 0.0 0.0 0.0 0.0 0.0
```

But I expect NA for each month, as is the case with the 'mean' operator.

I know that there is an ongoing discussion about that topic (see for example https://github.com/pandas-dev/pandas/issues/9422).

For earth science it would be nice to have an option telling xarray what to do when summing over values that are all NA. Do you see a chance for a quick fix of this issue in the code base?
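For reference, later xarray releases added exactly such an option: a min_count argument to sum(), which yields NaN instead of 0 when fewer than min_count valid values fall in a window (a sketch with synthetic data; "MS" is used here simply as a version-stable month frequency):

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic daily precipitation: May is all-NA (broken device), June has data.
time = pd.date_range("2006-05-01", periods=61, freq="D")
prec = xr.DataArray(np.full(61, np.nan), dims="time", coords={"time": time})
prec[31:] = 1.0  # June: 1 mm per day

# min_count=1 -> a window with no valid value at all returns NaN, not 0.
monthly = prec.resample(time="MS").sum(min_count=1)
```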

#1604 · Where functionality in xarray including else case (dask compability)
opened 2017-10-04T07:51:39Z by rpnaut (30219501) · closed 2017-12-14T17:49:56Z · 8 comments · repo: xarray

I need a lot of flexibility to compute different types of skill scores with xarray. Keeping in mind the attached code - a method for computing a modified mean squared error skill score ("AVSS") - I am fighting with the following problems:

  1. I want to keep the code user-friendly with regard to extending my program to other skill scores. Thus, the middle part of the attached method, the if-then-else statement, shall be factored out.
  2. There are three input datasets in the case of skill scores: self.DSref = observations, self.DSrefmod = reference model, self.DSproof = model to evaluate. I have to combine all three with simple arithmetic (minus), but xarray does not allow simple arithmetic when there are small differences in the coordinates between the three datasets (also when the data type of the coordinates differs between float64 and float). Thus, my horrifying workaround is to loop over all variables I want to evaluate and, for each variable: a) create a new dataset "DSnew" based on the dataset variable "self.DSproof[varnsproof]", b) rename the variable in "DSnew" to the variable name I want for the evaluation result (e.g. bias of temperature or skill score of temperature), c) create some helper variables "DSnew['MSE_p1']" by copying, d) modify the data of the variables to compute those mathematical operations of the related skill score that are invariant to temporal aggregation, e) apply grouping and resampling to compute climate statistics such as monthly means or daily cycles, and f) perform the final mathematical operation of the related skill score, which has to be done after temporal aggregation. Is there a better way to handle these operations, i.e. to avoid the strange process of creating new datasets and copying variables, and to avoid the outer loop over the variables? What would be your short code for my problem?
  3. The where functionality is sometimes needed to compute skill scores. I have used the where function of numpy, but as I read in your xarray documentation, an explicit call of numpy functions is not compatible with dask arrays. Is there an analogue in the xarray package?

```python
def squarefunc(x):
    return xarray.ufuncs.square(x)

def AVSS_def(x):
    AVSS_p1 = x["MSE_p1"] / x["MSE_p2"] * (-1.0) + 1.0
    AVSS_p2 = x["MSE_p2"] / x["MSE_p1"] - 1.0
    x[varnsres].data = np.where((x["MSE_p2"] - x["MSE_p1"]) > 0, AVSS_p1, AVSS_p2)
    return x

endresult = xarray.Dataset()
for varnsrefmod, varnsproof, varnsref, varnsres in zip(self.varns_refmod, self.varns_proof,
                                                       self.varns_ref, varns_result):
    DSnew = xarray.merge([xarray.Dataset(), self.DSproof[varnsproof]])
    DSnew.rename({varnsproof: varnsres}, inplace=True)
    DSnew["MSE_p1"] = DSnew[varnsres].copy()
    DSnew["MSE_p2"] = DSnew[varnsres].copy()
    DSnew["MSE_p1"].data = squarefunc(self.DSproof[varnsproof].data - self.DSref[varnsref].data)
    DSnew["MSE_p2"].data = squarefunc(self.DSrefmod[varnsrefmod].data - self.DSref[varnsref].data)
    coordtime = GeneralUtils.FromDimList2Pyxarray(dim_time[varnsref])
    if aggregtime == 'fullperiod':
        DSnew = DSnew.mean(coordtime)
        self.RepairTime.update({'Needed': False})
    elif aggregtime == '-':
        self.RepairTime.update({'Needed': False})
    elif "overyears" in aggregtime:
        grpby_method = GeneralUtils.ConvertAggregationKey2XRgroupby(aggregtime)
        DSnew = DSnew.groupby(coordtime + '.' + grpby_method).mean(coordtime)
        self.RepairTime.update({'Needed': True})
        self.RepairTime.update({'start': self.DSref[coordtime].data[0]})
        self.RepairTime.update({'end': self.DSref[coordtime].data[-1]})
    else:
        resamplefreq = GeneralUtils.ConvertAggregationKey2Resample(aggregtime)
        DSnew = DSnew.resample(resamplefreq, dim=coordtime, how='mean')
        self.RepairTime.update({'Needed': False})
    AVSS_def(DSnew)
    self.Update_Attributes(Datasetobj=DSnew, variable=varnsres, stdname=varnsres,
                           units=self.DSref[varnsref].attrs['units'],
                           longname="temporal AVSS of " + self.DSref[varnsref].attrs['long_name'])
    endresult = xarray.merge([endresult, DSnew])
```
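On question 3: xarray provides a dask-compatible analogue of numpy.where as the top-level function xarray.where(cond, x, y), which covers the if-then-else branch of the AVSS directly (a sketch with synthetic MSE values, not the original datasets):

```python
import numpy as np
import xarray as xr

# Synthetic squared-error components standing in for MSE_p1 / MSE_p2.
mse_p1 = xr.DataArray(np.array([1.0, 4.0, 2.0]), dims="x")
mse_p2 = xr.DataArray(np.array([2.0, 1.0, 2.0]), dims="x")

# xr.where keeps labels, works lazily on dask arrays, and takes an
# explicit else-branch -- unlike calling np.where on the raw .data.
avss = xr.where(
    (mse_p2 - mse_p1) > 0,
    1.0 - mse_p1 / mse_p2,   # candidate model beats the reference
    mse_p2 / mse_p1 - 1.0,   # reference beats the candidate model
)
```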

#1506 · Support for basic math (multiplication, difference) on two xarray-Datasets
opened 2017-08-09T23:16:09Z by rpnaut (30219501) · closed 2017-08-10T16:14:41Z · 3 comments · repo: xarray

Let's assume one has loaded two datasets, 'datmod' and 'datref', containing daily data over one year. The data look like:

```python
Dimensions:       (bnds: 2, rlat: 228, rlon: 234, time: 365)
Coordinates:
  * rlon          (rlon) float64 -28.24 -28.02 -27.8 -27.58 -27.36 -27.14 ...
  * rlat          (rlat) float64 -23.52 -23.3 -23.08 -22.86 -22.64 -22.42 ...
  * time          (time) datetime64[ns] 2013-01-01T11:30:00 ...
Dimensions without coordinates: bnds
Data variables:
    rotated_pole  |S1 ''
    time_bnds     (time, bnds) float64 1.073e+09 1.073e+09 1.073e+09 ...
    ASWGLOB_S     (time, rlat, rlon) float64 nan nan nan nan nan nan nan nan ...
```

Now I want to compute a more complex metric such as the temporal correlation and combine it with the functionality of groupby or resample, i.e. determine the temporal correlation for each month separately. So, starting with

```python
def anomaly(x):
    return x - x.mean('time')

a = datref.groupby('time.month').apply(anomaly)
b = datmod.groupby('time.month').apply(anomaly)
```

gives me the anomalies for each time step with respect to monthly means. However, for the numerator of the correlation (the denominator is not discussed here) the elementwise multiplication corr = a*b is needed, and later on this product is grouped monthly and averaged over time. The problem is that the product 'a*b' gives a dataset with missing variables:

```python
<xarray.Dataset>
Dimensions:  (rlat: 228, rlon: 234, time: 0)
Coordinates:
  * time     (time) datetime64[ns]
  * rlon     (rlon) float64 -28.24 -28.02 -27.8 -27.58 -27.36 -27.14 -26.92 ...
  * rlat     (rlat) float64 -23.52 -23.3 -23.08 -22.86 -22.64 -22.42 -22.2 ...
Data variables:
    month    (time) int64
```
I can overcome the problem by doing something like corr = a[varname].data - b[varname].data. But then I have a numpy.array, which does not support the groupby and aggregation functionality, i.e. I must clone the dataset 'datmod' and replace all its data with the data of 'corr'. Only then can I use the Dataset aggregation functionality again.

Is there a way to overcome the problem of elementwise multiplication (as well as subtraction) or should such a feature be added in the future?
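A sketch of what is likely going on (synthetic data; the half-hour offset between the two time axes is an assumption): elementwise Dataset arithmetic aligns both operands on their indexes with an inner join, so time stamps that do not match exactly multiply to an empty dataset, while aligning the indexes first preserves it.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Two daily series whose stamps differ by 30 minutes.
t_ref = pd.date_range("2013-01-01 11:30", periods=4, freq="D")
t_mod = pd.date_range("2013-01-01 12:00", periods=4, freq="D")
a = xr.Dataset({"ASWGLOB_S": ("time", np.ones(4))}, coords={"time": t_ref})
b = xr.Dataset({"ASWGLOB_S": ("time", np.ones(4))}, coords={"time": t_mod})

# Inner join of disjoint stamps -> empty product, variables vanish.
assert (a * b).sizes["time"] == 0

# Overriding one index with the other (after checking the order!) fixes it.
b_aligned = b.assign_coords(time=a["time"])
corr_num = a * b_aligned
assert corr_num.sizes["time"] == 4
```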

#1480 · Time Dimension, Big problem with methods 'groupby' and 'to_netcdf'
opened 2017-07-16T22:10:52Z by rpnaut (30219501) · closed 2017-07-17T19:01:01Z · 4 comments · repo: xarray

My problem is that I would like to use the convenient functionality of the xarray library in Python, but I run into problems with the time dimension when aggregating data and when writing netCDF. I am using pandas version 0.17.1 and xarray 0.9.6.

I have opened a dataset, which contains daily data over the year 2013: datset=xr.open_dataset(filein).

The contents of the file are:

```python
<xarray.Dataset>
Dimensions:       (bnds: 2, rlat: 228, rlon: 234, time: 365)
Coordinates:
  * rlon          (rlon) float64 -28.24 -28.02 -27.8 -27.58 -27.36 -27.14 ...
  * rlat          (rlat) float64 -23.52 -23.3 -23.08 -22.86 -22.64 -22.42 ...
  * time          (time) datetime64[ns] 2013-01-01T11:30:00 ...
Dimensions without coordinates: bnds
Data variables:
    rotated_pole  |S1 ''
    time_bnds     (time, bnds) float64 1.073e+09 1.073e+09 1.073e+09 ...
    ASWGLOB_S     (time, rlat, rlon) float64 nan nan nan nan nan nan nan nan ...
Attributes:
    CDI:          Climate Data Interface version 1.7.0 (http://m...
    Conventions:  CF-1.4
```

When I now use the groupby method to compute the monthly means, the time dimension is destroyed:

```python
datset.groupby('time.month').mean('time')
<xarray.Dataset>
Dimensions:    (bnds: 2, month: 12, rlat: 228, rlon: 234)
Coordinates:
  * rlon       (rlon) float64 -28.24 -28.02 -27.8 -27.58 -27.36 -27.14 ...
  * rlat       (rlat) float64 -23.52 -23.3 -23.08 -22.86 -22.64 -22.42 -22.2 ...
  * month      (month) int64 1 2 3 4 5 6 7 8 9 10 11 12
Dimensions without coordinates: bnds
Data variables:
    time_bnds  (month, bnds) float64 1.074e+09 1.074e+09 1.077e+09 1.077e+09 ...
    ASWGLOB_S  (month, rlat, rlon) float64 nan nan nan nan nan nan nan nan ...
```

Now, instead of a time dimension, I have a month dimension with values from 1 to 12. Is this a side effect of the 'mean' function? As long as I do not use the mean function, the time variable is retained.

The examples given in the documentation seem to behave differently: there the timestamps are retained and the first date of each month is used.

It seems to be impossible to reinvent my old time dimension.

  • Method A: I have tried to create my own time variable with endresult.assign_coords(time=pd.date_range(start='2013-01', end='2014-01', freq='M')). That gives me a new coordinate with the correct dates. Afterwards, I have to swap the dimensions from month to time, which was only possible by changing the dimension of the coordinate 'time' to the dimension of the coordinate 'month'. However, the netcdf file then contained wrong dates as output, i.e. values from 1 to 12: the first time step was at 31 January 2013, the next one day later, the next another day later, and so on. If I add the attributes 'calendar' and 'units' to the time coordinate, the output seems to be correct, but the type int64 is not readable by programs like ncview.
  • Method B: Create the time variable using pandas and then convert the datetime64 dates to the usual Python datetime objects. These datetime objects are then converted to numbers with the netCDF4 date2num method. I assign these numbers to the time coordinate and add the encoded attributes for units and calendar. However, the encoded units are not written to the netcdf data, so I have to add them with an external program like ncatted.

How can methods A and B be improved so that my nc-file carries a correct time stamp?
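A sketch of a repaired Method A (synthetic data; the mid-month stamps and the unit string are arbitrary choices of this sketch): assign real timestamps and set the units via the encoding rather than as plain attributes, so to_netcdf() writes a numeric time variable that tools like ncview can read.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic daily series over 2013 standing in for the original file.
time = pd.date_range("2013-01-01 11:30", periods=365, freq="D")
ds = xr.Dataset({"ASWGLOB_S": ("time", np.random.rand(365))},
                coords={"time": time})

# Monthly climatology: the time axis becomes an integer 'month' axis.
monthly = ds.groupby("time.month").mean("time")

# Swap the integer months for real timestamps (mid-month chosen here).
monthly = monthly.rename({"month": "time"})
monthly = monthly.assign_coords(
    time=pd.to_datetime([f"2013-{m:02d}-15" for m in range(1, 13)]))

# Encoding (not attrs) controls how xarray serializes the axis to netCDF.
monthly["time"].encoding.update(
    {"units": "days since 2013-01-01", "dtype": "float64"})
# monthly.to_netcdf("monthly.nc")  # time is then written as float days
```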

