issues
7 rows where state = "closed", type = "issue" and user = 30219501 sorted by updated_at descending
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at ▲ | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
427644858 | MDU6SXNzdWU0Mjc2NDQ4NTg= | 2861 | WHERE function, problems with memory operations? | rpnaut 30219501 | closed | 0 | 8 | 2019-04-01T11:09:11Z | 2022-04-09T15:41:51Z | 2022-04-09T15:41:51Z | NONE | I am facing a problem with the where functionality in xarray. I have two datasets
and
Applying something like this:
gives me a dataarray of time length zero:
Problem description: The problem seems to be that 'ref' and 'proof' are somehow not entirely consistent in their coordinates. But if I subtract the coordinates from each other, I do not get a difference. However, as I always struggle to make datasets consistent with each other for mathematical calculations with xarray, I have figured out the following workarounds:
Maybe I am dealing here with a problem of incomplete operations in memory? The printout of the datasets may look consistent, but is an additional operation on the datasets still required to make them consistent in memory? Thanks in advance for your help. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/2861/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
428180638 | MDU6SXNzdWU0MjgxODA2Mzg= | 2863 | Memory Error for simple operations on NETCDF4 internally zipped files | rpnaut 30219501 | closed | 0 | 3 | 2019-04-02T11:48:01Z | 2022-04-09T02:15:45Z | 2022-04-09T02:15:45Z | NONE | Suppose you want to do simple computations with a data array loaded from internally zipped NetCDF4 files. First you need to load a dataset:
Afterwards I have tried to do this: ``` In [4]: datarray=eobs["T_2M"]+273.15 MemoryError Traceback (most recent call last) <ipython-input-4-eaff3bff5e27> in <module>() ----> 1 datarray=eobs["T_2M"]+273.15 /sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/xarray-0.9.5-py3.5.egg/xarray/core/dataarray.py in func(self, other) 1539 1540 variable = (f(self.variable, other_variable) -> 1541 if not reflexive 1542 else f(other_variable, self.variable)) 1543 coords = self.coords._merge_raw(other_coords) /sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/xarray-0.9.5-py3.5.egg/xarray/core/variable.py in func(self, other) 1139 if isinstance(other, (xr.DataArray, xr.Dataset)): 1140 return NotImplemented -> 1141 self_data, other_data, dims = _broadcast_compat_data(self, other) 1142 new_data = (f(self_data, other_data) 1143 if not reflexive /sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/xarray-0.9.5-py3.5.egg/xarray/core/variable.py in _broadcast_compat_data(self, other) 1379 else: 1380 # rely on numpy broadcasting rules -> 1381 self_data = self.data 1382 other_data = other 1383 dims = self.dims /sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/xarray-0.9.5-py3.5.egg/xarray/core/variable.py in data(self) 265 return self._data 266 else: --> 267 return self.values 268 269 @data.setter /sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/xarray-0.9.5-py3.5.egg/xarray/core/variable.py in values(self) 306 def values(self): 307 """The variable's data as a numpy.ndarray""" --> 308 return _as_array_or_item(self._data) 309 310 @values.setter /sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/xarray-0.9.5-py3.5.egg/xarray/core/variable.py in _as_array_or_item(data) 182 TODO: remove this (replace with np.asarray) once these issues are fixed 183 """ --> 184 data = np.asarray(data) 185 if data.ndim == 0: 186 if data.dtype.kind == 'M': 
/sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/numpy-1.11.2-py3.5-linux-x86_64.egg/numpy/core/numeric.py in asarray(a, dtype, order) 480 481 """ --> 482 return array(a, dtype, copy=False, order=order) 483 484 def asanyarray(a, dtype=None, order=None): /sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/xarray-0.9.5-py3.5.egg/xarray/core/indexing.py in array(self, dtype) 417 418 def array(self, dtype=None): --> 419 self._ensure_cached() 420 return np.asarray(self.array, dtype=dtype) 421 /sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/xarray-0.9.5-py3.5.egg/xarray/core/indexing.py in _ensure_cached(self) 414 def _ensure_cached(self): 415 if not isinstance(self.array, np.ndarray): --> 416 self.array = np.asarray(self.array) 417 418 def array(self, dtype=None): /sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/numpy-1.11.2-py3.5-linux-x86_64.egg/numpy/core/numeric.py in asarray(a, dtype, order) 480 481 """ --> 482 return array(a, dtype, copy=False, order=order) 483 484 def asanyarray(a, dtype=None, order=None): /sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/xarray-0.9.5-py3.5.egg/xarray/core/indexing.py in array(self, dtype) 398 399 def array(self, dtype=None): --> 400 return np.asarray(self.array, dtype=dtype) 401 402 def getitem(self, key): /sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/numpy-1.11.2-py3.5-linux-x86_64.egg/numpy/core/numeric.py in asarray(a, dtype, order) 480 481 """ --> 482 return array(a, dtype, copy=False, order=order) 483 484 def asanyarray(a, dtype=None, order=None): /sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/xarray-0.9.5-py3.5.egg/xarray/core/indexing.py in array(self, dtype) 373 def array(self, dtype=None): 374 array = orthogonally_indexable(self.array) --> 375 return np.asarray(array[self.key], dtype=None) 376 377 def getitem(self, key): 
/sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/xarray-0.9.5-py3.5.egg/xarray/conventions.py in getitem(self, key) 361 def getitem(self, key): 362 return mask_and_scale(self.array[key], self.fill_value, --> 363 self.scale_factor, self.add_offset, self._dtype) 364 365 def repr(self): /sw/rhel6-x64/python/python-3.5.2-gcc49/lib/python3.5/site-packages/xarray-0.9.5-py3.5.egg/xarray/conventions.py in mask_and_scale(array, fill_value, scale_factor, add_offset, dtype) 57 """ 58 # by default, cast to float to ensure NaN is meaningful ---> 59 values = np.array(array, dtype=dtype, copy=True) 60 if fill_value is not None and not np.all(pd.isnull(fill_value)): 61 if getattr(fill_value, 'size', 1) > 1: MemoryError: ``` I have uploaded the datafile to the following link: https://swiftbrowser.dkrz.de/public/dkrz_c0725fe8741c474b97f291aac57f268f/GregorMoeller/ Do I use the wrong netcdf-engine? |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/2863/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
349077990 | MDU6SXNzdWUzNDkwNzc5OTA= | 2356 | New Resample-Syntax leading to cancellation of dimensions | rpnaut 30219501 | closed | 0 | 8 | 2018-08-09T10:56:29Z | 2019-10-15T15:01:33Z | 2019-10-15T15:01:33Z | NONE | Example: Starting with the dataset located here: https://swiftbrowser.dkrz.de/public/dkrz_c0725fe8741c474b97f291aac57f268f/GregorMoeller/, I want to calculate monthly sums of precipitation for each grid point in the daily data:
```
In [39]: data = array.open_dataset("eObs_gridded_0.22deg_rot_v14.0.TOT_PREC.1950-2016.nc_CutParamTimeUnitCor_FinalEvalGrid")
In [40]: data
Out[13]:
<xarray.Dataset>
Dimensions: (rlat: 136, rlon: 144, time: 153)
Coordinates:
  * rlon (rlon) float32 -22.6 -22.38 -22.16 -21.94 -21.72 -21.5 ...
  * rlat (rlat) float32 -12.54 -12.32 -12.1 -11.88 -11.66 -11.44 ...
  * time (time) datetime64[ns] 2006-05-01T12:00:00 ...
Data variables:
    rotated_pole int32 ...
    TOT_PREC (time, rlat, rlon) float32 ...
Attributes:
    CDI: Climate Data Interface version 1.8.0 (http://m...
    Conventions: CF-1.6
    history: Thu Jun 14 12:34:59 2018: cdo -O -s -P 4 remap...
    CDO: Climate Data Operators version 1.8.0 (http://m...
    cdo_openmp_thread_number: 4
In [41]: datamonth = data["TOT_PREC"].resample(time="M").sum()
In [42]: datamonth
Out[42]:
<xarray.DataArray 'TOT_PREC' (time: 5)>
array([ 551833.25 , 465640.09375, 328445.90625, 836892.1875 , 503601.5 ], dtype=float32)
Coordinates:
    time (time) datetime64[ns] 2006-05-31 2006-06-30 2006-07-31 ...
```
Problem description: The problem is that the dimensions 'rlon' and 'rlat' and the corresponding coordinates have not survived the resample process. Only the time is present in the result. Expected output: I expect the spatial dimensions to still be present in the output of monthly sums. Surprisingly, this is the case when using the old syntax:
```
In [41]: datamonth = data["TOT_PREC"].resample(dim="time",freq="M",how="sum")
/usr/bin/ipython3:1: FutureWarning: .resample() has been modified to defer calculations. Instead of passing 'dim' and how="sum", instead consider using .resample(time="M").sum('time')
  #!/usr/bin/env python3
In [42]: datamonth
Out[42]:
<xarray.DataArray 'TOT_PREC' (time: 5, rlat: 136, rlon: 144)>
array([[[ 0. , 0. , ..., 0. , 0. ],
        [ 0. , 0. , ..., 0. , 0. ],
        ...,
        [ 0. , 0. , ..., 44.900028, 41.400024],
        [ 0. , 0. , ..., 49.10001 , 46.5 ]]], dtype=float32)
Coordinates:
  * time (time) datetime64[ns] 2006-05-31 2006-06-30 2006-07-31 ...
  * rlon (rlon) float32 -22.6 -22.38 -22.16 -21.94 -21.72 -21.5 -21.28 ...
  * rlat (rlat) float32 -12.54 -12.32 -12.1 -11.88 -11.66 -11.44 -11.22 ...
```
What is wrong here? May I also ask why the new syntax does not consider use cases with highly complex scripting? I do not like to use a hardcoded dimension name in my programs, i.e. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/2356/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
331981984 | MDU6SXNzdWUzMzE5ODE5ODQ= | 2230 | Inconsistency between Sum of NA's and Mean of NA's: resampling gives 0 or 'NA' | rpnaut 30219501 | closed | 0 | 7 | 2018-06-13T12:54:47Z | 2018-08-16T06:59:33Z | 2018-08-16T06:59:33Z | NONE | Problem description: For data mining with xarray, the following issue always arises with the resampling method. Data example: I have a dataset 'fcut' with hourly values for 5 months.
I know that there is an ongoing discussion about that topic (see for example https://github.com/pandas-dev/pandas/issues/9422). For earth science it would be nice to have an option telling xarray what to do when a sum is taken over values that are all NA. Do you see a chance of a quick fix for that issue in the model code? |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/2230/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
262696381 | MDU6SXNzdWUyNjI2OTYzODE= | 1604 | Where functionality in xarray including else case (dask compability) | rpnaut 30219501 | closed | 0 | 8 | 2017-10-04T07:51:39Z | 2018-06-13T14:57:25Z | 2017-12-14T17:49:56Z | NONE | I am faced with the flexibility needed to compute different types of skill scores using xarray. Thus, keeping in mind the attached code - a method for computing a modified mean squared error skill score ("AVSS") - I am struggling with the following problems: 1. I want to keep the code user-friendly with regard to extending my program to other skill scores. Thus, the middle part of the attached method, which uses the if-then-else statement, should be factored out. 2. There are three input datasets in the case of skill scores: self.DSref = observations, self.DSrefmod = reference model, self.DSproof = model to evaluate. I have to combine all three with simple arithmetic (subtraction), but xarray does not allow simple arithmetic when there are small differences in the coordinates between the three datasets (also when the data type of the coordinates differs, e.g. float64 versus float). Thus, my horrifying workaround is to loop over all variables I want to evaluate and to do the following for each variable: a) create a new dataset "DSnew" based on the dataset variable "self.DSproof[varnsproof]", b) rename the variable in "DSnew" to the variable name I want to have for the evaluation result (e.g. bias of temperature or skill score of temperature), c) create some helper variables such as "DSnew['MSE_p1']" by copying and d) modify the data of these variables to compute those mathematical operations of the related skill score that are invariant to temporal aggregation, e) apply grouping and resampling to compute climate statistics such as monthly means or daily cycles and f) perform the final mathematical operation of the related skill score, which has to be done after temporal aggregation. 
Is there a better way to handle the operations / to prevent the strange process of creating new datasets and copying variables and to prevent the outer loop over the variables? What would be your short code to handle my problem? 3. The where functionality is sometimes needed to compute skill scores. I have used the where function of numpy, but as I read in your xarray-documentation, an explicit call of numpy functions is not compatible with dask-arrays? Is there an analogue in the xarray-package? ``` def squarefunc(x): return xarray.ufuncs.square(x) def AVSS_def(x): AVSS_p1 = x["MSE_p1"]/x["MSE_p2"] * (-1.0) + 1.0 AVSS_p2 = x["MSE_p2"]/x["MSE_p1"] - 1.0 x[varnsres].data = np.where( (x["MSE_p2"] - x["MSE_p1"]) > 0,AVSS_p1,AVSS_p2 ) return x endresult = xarray.Dataset() for varnsrefmod,varnsproof,varnsref,varnsres in zip(self.varns_refmod,self.varns_proof,self.varns_ref,varns_result): DSnew = xarray.merge([xarray.Dataset(),self.DSproof[varnsproof]]) DSnew.rename({varnsproof : varnsres },inplace=True) DSnew["MSE_p1"] = DSnew[varnsres].copy() DSnew["MSE_p2"] = DSnew[varnsres].copy() DSnew["MSE_p1"].data = squarefunc(self.DSproof[varnsproof].data - self.DSref[varnsref].data) DSnew["MSE_p2"].data = squarefunc(self.DSrefmod[varnsrefmod].data - self.DSref[varnsref].data) coordtime = GeneralUtils.FromDimList2Pyxarray(dim_time[varnsref]) if aggregtime == 'fullperiod': DSnew = DSnew.mean(coordtime); self.RepairTime.update({'Needed' : False}); elif aggregtime == '-': DSnew = DSnew; self.RepairTime.update({'Needed' : False}); elif "overyears" in aggregtime: grpby_method=GeneralUtils.ConvertAggregationKey2XRgroupby(aggregtime) DSnew = DSnew.groupby(coordtime+'.'+grpby_method).mean(coordtime); self.RepairTime.update({'Needed' : True}); self.RepairTime.update({'start' : self.DSref[coordtime].data[0] }); self.RepairTime.update({'end' : self.DSref[coordtime].data[-1]}) elif "overyears" not in aggregtime: resamplefreq=GeneralUtils.ConvertAggregationKey2Resample(aggregtime) DSnew = 
DSnew.resample(resamplefreq, dim=coordtime, how='mean'); self.RepairTime.update({'Needed' : False}); AVSS_def(DSnew); self.Update_Attributes(Datasetobj=DSnew,variable=varnsres,stdname=varnsres,units=self.DSref[varnsref].attrs['units'], \ longname="temporal AVSS of "+self.DSref[varnsref].attrs['long_name']) endresult = xarray.merge([endresult,DSnew]) ``` |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1604/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
249188875 | MDU6SXNzdWUyNDkxODg4NzU= | 1506 | Support for basic math (multiplication, difference) on two xarray-Datasets | rpnaut 30219501 | closed | 0 | 3 | 2017-08-09T23:16:09Z | 2017-08-10T16:14:41Z | 2017-08-10T16:14:41Z | NONE | Let's assume one has loaded two datasets 'datmod' and 'datref' containing daily data over one year. The data look like:
Now I want to compute a more complex metric such as the temporal correlation and combine it with the functionality of groupby or resample, i.e. determine the temporal correlation for each month separately. So, starting with
```
def anomaly(x):
    return x - x.mean('time')

a = datref.groupby('time.month').apply(anomaly)
b = datmod.groupby('time.month').apply(anomaly)
```
I can overcome the problem by doing something like
Is there a way to overcome the problem of elementwise multiplication (as well as subtraction), or should such a feature be added in the future? |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1506/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
243270042 | MDU6SXNzdWUyNDMyNzAwNDI= | 1480 | Time Dimension, Big problem with methods 'groupby' and 'to_netcdf' | rpnaut 30219501 | closed | 0 | 4 | 2017-07-16T22:10:52Z | 2017-07-20T19:46:15Z | 2017-07-17T19:01:01Z | NONE | My problem is that I would like to use the easy functionality of the xarray-library in python, but I run into problems with the time dimension in case of aggregating data and in case of writing netcdf. I am using pandas version 0.17.1 and xarray 0.9.6. I have opened a dataset, which contains daily data over the year 2013: The contents of the file are:
```
datset.groupby('time.month').mean('time')
<xarray.Dataset>
Dimensions: (bnds: 2, month: 12, rlat: 228, rlon: 234)
Coordinates:
  * rlon (rlon) float64 -28.24 -28.02 -27.8 -27.58 -27.36 -27.14 ...
  * rlat (rlat) float64 -23.52 -23.3 -23.08 -22.86 -22.64 -22.42 -22.2 ...
  * month (month) int64 1 2 3 4 5 6 7 8 9 10 11 12
Dimensions without coordinates: bnds
Data variables:
    time_bnds (month, bnds) float64 1.074e+09 1.074e+09 1.077e+09 1.077e+09 ...
    ASWGLOB_S (month, rlat, rlon) float64 nan nan nan nan nan nan nan nan ...
```
Now, instead of a time dimension, I have a month dimension with values from 1 to 12. Is this a side effect of the 'mean' function? As long as I do not use this mean function, the time variable is retained. The examples given in the documentation seem to behave differently. That is, the timestamps are retained and the first date of each month is used. It seems to be impossible to restore my old time dimension.
How can I improve methods A and B in order to have a correct time stamp in my nc file? |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1480/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue |
CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo] ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone] ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee] ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user] ON [issues] ([user]);
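The filter stated at the top of this page (closed rows of type "issue" for user 30219501, most recently updated first) can be expressed as a query against this schema. The following is a minimal sketch, not necessarily the SQL that the page itself generates. It assumes a cut-down in-memory table with only a handful of the columns, three of the rows listed above with shortened titles, and one hypothetical open issue (number 9999) that exists only to show the filter excluding it.

```python
import sqlite3

# Assumption: a reduced copy of the [issues] table, in memory, with only
# the columns needed for the filter and the sort.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE [issues] (
        [id] INTEGER PRIMARY KEY,
        [number] INTEGER,
        [title] TEXT,
        [user] INTEGER,
        [state] TEXT,
        [updated_at] TEXT,
        [type] TEXT
    )
    """
)

rows = [
    # Three rows taken from the table above (titles shortened).
    (243270042, 1480, "Time Dimension ...", 30219501, "closed",
     "2017-07-20T19:46:15Z", "issue"),
    (427644858, 2861, "WHERE function ...", 30219501, "closed",
     "2022-04-09T15:41:51Z", "issue"),
    (428180638, 2863, "Memory Error ...", 30219501, "closed",
     "2022-04-09T02:15:45Z", "issue"),
    # Hypothetical row, not in the source data: must be filtered out.
    (1, 9999, "hypothetical open issue", 30219501, "open",
     "2023-01-01T00:00:00Z", "issue"),
]
conn.executemany("INSERT INTO [issues] VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# ISO 8601 timestamps in a single time zone sort correctly as plain text,
# so ORDER BY on the TEXT column gives chronological order.
query = """
    SELECT [number], [updated_at]
    FROM [issues]
    WHERE [state] = :state AND [type] = :type AND [user] = :user
    ORDER BY [updated_at] DESC
"""
params = {"state": "closed", "type": "issue", "user": 30219501}
result = conn.execute(query, params).fetchall()

for number, updated_at in result:
    print(number, updated_at)
```

Named parameters keep the filter values out of the SQL string, and the square-bracket quoting follows the schema's own style for column names such as [user] and [type].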