issues

13 rows where user = 3404817 sorted by updated_at descending

type: issue 8, pull 5
state: closed 13
repo: xarray 13

id node_id number title user state locked assignee milestone comments created_at updated_at closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
90658514 MDU6SXNzdWU5MDY1ODUxNA== 442 NetCDF attributes like `long_name` and `units` lost on `.mean()` j08lue 3404817 closed 0     5 2015-06-24T12:14:30Z 2020-04-05T18:18:31Z 2015-06-26T07:28:54Z CONTRIBUTOR      

When reading in a variable from netCDF, the standard attributes like long_name, standard_name, and units are being propagated, but apparently lost when calling ~~.load()~~ .mean() on the DataArray.

Couldn't these CF-Highly Recommended Variable Attributes be kept during this operation?

(What to do with them afterwards, e.g. upon merge, is a different question, unresolved also in the pandas community.)

EDIT: the problem actually occurs when calling .mean() (not .load(), as originally posted).
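
In current xarray versions the reduction methods accept a keep_attrs flag (and there is a global xr.set_options(keep_attrs=True)), which addresses exactly this; a minimal sketch of the behaviour, using made-up data:

```python
import numpy as np
import xarray as xr

# Made-up stand-in for a variable read from netCDF, with CF attributes attached.
da = xr.DataArray(
    np.random.rand(4, 3),
    dims=("time", "x"),
    attrs={"long_name": "sea water temperature", "units": "degC"},
)

print(da.mean(dim="time").attrs)                   # {} -- attributes are dropped
print(da.mean(dim="time", keep_attrs=True).attrs)  # long_name and units survive
```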

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/442/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
219692578 MDU6SXNzdWUyMTk2OTI1Nzg= 1354 concat automagically outer-joins coordinates j08lue 3404817 closed 0     8 2017-04-05T19:39:07Z 2019-08-07T12:17:07Z 2019-08-07T12:17:07Z CONTRIBUTOR      

I would like to concatenate two netCDF files that have float64 coordinate variables. I thought the coordinate values were the same, but in fact they differ by something in the order of 1e-14.

Using open_mfdataset or concat to merge the datasets, I get a completely different output shape than the two input files have.

This is because concat is somewhere performing an outer join on the coordinates. Now I am wondering where else in my workflows this might happen without my notice...

It would be awesome if there was an option to change this behaviour on concat, open_mfdataset, and auto_combine. I would actually rather make these functions fail if any dimension other than the concatenation dimension differs.

Note: This could also be a special case, because

```python
(ds1.lon == ds2.lon).all()
<xarray.DataArray 'lon' ()>
array(True, dtype=bool)
```

while

```python
(ds1.lon.values == ds2.lon.values).all()
False
```

Maybe an interface change could be considered together with that discussed in #1340?
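
In current xarray versions, concat and open_mfdataset accept a join argument that controls this alignment (it did not exist when this issue was filed); a minimal sketch of the doubling and of join='override', using made-up data:

```python
import numpy as np
import xarray as xr

# Two single-time-step datasets whose 'lon' coordinates differ only by float noise.
lon = np.array([0.0, 0.1, 0.2, 0.3])
ds1 = xr.Dataset({"sst": (("time", "lon"), np.zeros((1, 4)))},
                 coords={"time": [0], "lon": lon})
ds2 = xr.Dataset({"sst": (("time", "lon"), np.ones((1, 4)))},
                 coords={"time": [1], "lon": lon + 1e-14})

# The default outer join treats the two coordinates as distinct values,
# so the lon axis doubles and the result is padded with NaN.
print(xr.concat([ds1, ds2], dim="time").sizes["lon"])                   # 8

# join="override" keeps the coordinates of the first dataset instead.
print(xr.concat([ds1, ds2], dim="time", join="override").sizes["lon"])  # 4
```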

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1354/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
110979851 MDU6SXNzdWUxMTA5Nzk4NTE= 623 Circular longitude axis j08lue 3404817 closed 0     6 2015-10-12T13:56:45Z 2019-06-20T20:09:46Z 2016-12-24T00:11:45Z CONTRIBUTOR      

A common issue with global data or model output is that the zonal grid boundary might cut right through a region of interest. In that case, the field must be re-wrapped/shifted such that a region in the far east of the field is placed to the left of a region in the far west.

A way of achieving this when using xray (or netCDF4), I found, was to use indices for isel that start somewhere in the east and then jump to 0 and continue from there.

But this works only on data that is already loaded into memory (e.g. with .load()), as illustrated in this gist. I assume that this constraint is due to the netCDF backend (in this case netCDF4) not supporting irregular slicing. Once loaded, the operation is performed on NumPy arrays, I guess?

Now the first thing is that it took me quite a while to figure out why this worked in some cases and not in others. Perhaps the IndexError thrown by the backend could be caught to give more hints on this, or a note could be added to the docs?

It would of course be great if xray had a nice high-level function to handle circular axes. Another item for the wish list...

(By the way, in Ferret this is called Modulo Axes.)
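
For reference, one way to re-wrap a global field in current xarray is roll; a rough sketch with made-up data:

```python
import numpy as np
import xarray as xr

# Made-up global field on a 0..359 degree longitude axis.
lon = np.arange(0.0, 360.0)
da = xr.DataArray(
    np.random.rand(180, 360),
    dims=("lat", "lon"),
    coords={"lat": np.arange(-89.5, 90.0), "lon": lon},
)

# Move the seam from 0E to 180E: shift the field half a revolution and
# re-express the rolled longitudes on a -180..180 axis.
rolled = da.roll(lon=180, roll_coords=True)
rolled = rolled.assign_coords(lon=(((rolled.lon + 180) % 360) - 180))
```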

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/623/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
302447879 MDExOlB1bGxSZXF1ZXN0MTcyOTc1OTY4 1965 avoid integer overflow when decoding large time numbers j08lue 3404817 closed 0     6 2018-03-05T20:21:20Z 2018-05-01T12:41:28Z 2018-05-01T12:41:28Z CONTRIBUTOR   0 pydata/xarray/pulls/1965

The issue: int32 time data in units like seconds leads to an integer overflow in time decoding.

This is in a way the flip side of #1859: by ensuring that _NS_PER_TIME_DELTA is an integer, we got rid of round-off errors that were due to casting to float, but now we are getting integer overflow in this line:

https://github.com/pydata/xarray/blob/0e73e240107caee3ffd1a1149f0150c390d43251/xarray/coding/times.py#L169-L170

e.g. '2001-01-01' expressed as seconds since 1970-01-01 means np.array([978307200]) * int(1e9), which gives 288686080 and gets decoded to '1970-01-01T00:00:00.288686080' -- note also the trailing digits. Something is very wrong here.
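
The wrap-around itself is ordinary modular arithmetic once the multiplication happens in a 32-bit integer; a minimal sketch of the numbers involved:

```python
import numpy as np

seconds = 978307200            # '2001-01-01' in seconds since 1970-01-01
nanoseconds = seconds * 10**9  # 978307200000000000, far beyond the int32 range

# When the multiplication is carried out in int32, the result wraps modulo 2**32
# (the wrapped value happens to land in the positive signed range here):
print(nanoseconds % 2**32)     # 288686080, i.e. '1970-01-01T00:00:00.288686080'
print(np.iinfo(np.int32).max)  # 2147483647
```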

  • [x] Tests added (for all bug fixes or enhancements)
  • [x] Tests passed (for all non-documentation changes)
  • [x] Fully documented, including whats-new.rst for all changes
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1965/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
319132629 MDExOlB1bGxSZXF1ZXN0MTg1MTMxMjg0 2096 avoid integer overflow when decoding large time numbers j08lue 3404817 closed 0     3 2018-05-01T07:02:24Z 2018-05-01T12:41:13Z 2018-05-01T12:41:08Z CONTRIBUTOR   0 pydata/xarray/pulls/2096
  • [x] Closes #1965
  • [x] Tests added (for all bug fixes or enhancements)
  • [ ] Tests passed (for all non-documentation changes)
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2096/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
292231408 MDExOlB1bGxSZXF1ZXN0MTY1NTgzMTIw 1863 test decoding num_dates in float types j08lue 3404817 closed 0     4 2018-01-28T19:34:52Z 2018-02-10T12:16:26Z 2018-02-02T02:01:47Z CONTRIBUTOR   0 pydata/xarray/pulls/1863
  • [x] Closes #1859
  • [x] Tests added (for all bug fixes or enhancements)
  • [x] Tests passed (for all non-documentation changes)
  • [x] try to find the origin of the difference in behaviour between v0.10.0 and current base
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1863/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
291565176 MDU6SXNzdWUyOTE1NjUxNzY= 1859 Time decoding has round-off error 0.10.0. Gone now. j08lue 3404817 closed 0     3 2018-01-25T13:12:13Z 2018-02-02T02:01:47Z 2018-02-02T02:01:47Z CONTRIBUTOR      

Note: This problem occurs with version 0.10.0, but is gone when using current master (0.10.0+dev44.g0a0593d).

Here is a complete example: https://gist.github.com/j08lue/34498cf17b176d15933e778278ba2921

Problem description

I have this time variable from a netCDF file:

```
float32 time(time)
    units: days since 1980-1-1 0:0:0
    standard_name: time
    calendar: gregorian
```

When I open the file with xr.open_dataset(..., decode_times=True) the first time stamp becomes 1982-12-31T23:59:59.560122368, while it should be 1983-01-01. For reference, netCDF4.num2date gets it right.
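
A minimal reproduction of the comparison from the gist (the file name example.nc and the variable layout are placeholders):

```python
import netCDF4
import xarray as xr

# With xarray 0.10.0 the first decoded time stamp is off by ~0.44 s.
ds = xr.open_dataset("example.nc", decode_times=True)
print(ds.time.values[0])  # 1982-12-31T23:59:59.560122368

# netCDF4.num2date decodes the same value correctly.
nc = netCDF4.Dataset("example.nc")
t = nc.variables["time"]
print(netCDF4.num2date(t[0], t.units, t.calendar))  # 1983-01-01 00:00:00
```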

I tracked the problem down to xarray.conventions.decode_cf_datetime. But then I also noticed that you made major changes to the decoding in https://github.com/pydata/xarray/commit/50b0a69a7aa0fb7ac3afb28e7bd971cf08055f99.

So I also tested with current master (0.10.0+dev44.g0a0593d) and the problem is gone.

Up to you what you make of this. 😄 Maybe you can just close the issue.

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: ZZ
LOCALE: None.None

xarray: 0.10.0
pandas: 0.21.1
numpy: 1.13.3
scipy: 1.0.0
netCDF4: 1.3.1
h5netcdf: 0.5.0
Nio: None
bottleneck: 1.2.1
cyordereddict: 1.0.0
dask: 0.16.1
matplotlib: 2.1.1
cartopy: None
seaborn: None
setuptools: 38.2.4
pip: 9.0.1
conda: None
pytest: 3.3.1
IPython: 6.2.1
sphinx: None
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1859/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
89268800 MDU6SXNzdWU4OTI2ODgwMA== 438 `xray.open_mfdataset` concatenates also variables without time dimension j08lue 3404817 closed 0   0.5.2 1172685 13 2015-06-18T11:34:53Z 2017-09-19T16:16:58Z 2015-07-15T21:47:11Z CONTRIBUTOR      

When opening a multi-file dataset with xray.open_mfdataset, some variables that do not have a time dimension are concatenated as well.

My netCDF files contain a lot of those "static" variables (e.g. grid spacing etc.). netCDF4.MFDataset used to handle those as expected (i.e. did not concatenate them).

Is the different behaviour of xray.open_mfdataset intentional or due to a bug?

Note: I am using decode_times=False.

Example

```python
with xray.open_dataset(files[0], decode_times=False) as single:
    print single['dz']
```

```
<xray.DataArray 'dz' (z_t: 60)>
array([  1000.        ,   1000.        ,   1000.        ,   1000.        ,
         1000.        ,   1000.        ,   1000.        ,   1000.        ,
         1000.        ,   1000.        ,   1000.        ,   1000.        ,
         1000.        ,   1000.        ,   1000.        ,   1000.        ,
         1019.68078613,   1056.44836426,   1105.99511719,   1167.80700684,
         1242.41333008,   1330.96777344,   1435.14099121,   1557.12585449,
         1699.67956543,   1866.21240234,   2060.90234375,   2288.85205078,
         2556.24707031,   2870.57495117,   3240.8371582 ,   3677.77246094,
         4194.03076172,   4804.22363281,   5524.75439453,   6373.19189453,
         7366.94482422,   8520.89257812,   9843.65820312,  11332.46582031,
        12967.19921875,  14705.34375   ,  16480.70898438,  18209.13476562,
        19802.234375  ,  21185.95703125,  22316.50976562,  23186.49414062,
        23819.44921875,  24257.21679688,  24546.77929688,  24731.01367188,
        24844.328125  ,  24911.97460938,  24951.29101562,  24973.59375   ,
        24985.9609375 ,  24992.67382812,  24996.24414062,  24998.109375  ])
Coordinates:
  * z_t      (z_t) float32 500.0 1500.0 2500.0 3500.0 4500.0 5500.0 6500.0 ...
Attributes:
    long_name: thickness of layer k
    units: centimeters
```

```python
with xray.open_mfdataset(files, decode_times=False) as multiple:
    print multiple['dz']
```

```
<xray.DataArray 'dz' (time: 12, z_t: 60)>
dask.array<concatenate-1156, shape=(12, 60), chunks=((1, 1, 1, ..., 1, 1), (60,)), dtype=float64>
Coordinates:
  * z_t      (z_t) float32 500.0 1500.0 2500.0 3500.0 4500.0 5500.0 6500.0 ...
  * time     (time) float64 3.653e+04 3.656e+04 3.659e+04 3.662e+04 ...
Attributes:
    long_name: thickness of layer k
    units: centimeters
```
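
For readers hitting this today: current xarray exposes data_vars, coords and compat options on open_mfdataset that control which variables get concatenated; a rough sketch, reusing the files list from the example above:

```python
import xarray as xr

# Concatenate only variables that actually have the 'time' dimension; static
# variables such as 'dz' are taken from the first file instead of being stacked.
ds = xr.open_mfdataset(
    files,                 # same list of netCDF paths as in the example above
    combine="nested",
    concat_dim="time",
    data_vars="minimal",
    coords="minimal",
    compat="override",
    decode_times=False,
)
```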

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/438/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
197514417 MDExOlB1bGxSZXF1ZXN0OTkzMjkwNjc= 1184 Add test for issue 1140 j08lue 3404817 closed 0     2 2016-12-25T20:37:16Z 2017-03-30T23:08:40Z 2017-03-30T23:08:40Z CONTRIBUTOR   0 pydata/xarray/pulls/1184

1140

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1184/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
192122307 MDU6SXNzdWUxOTIxMjIzMDc= 1140 sel with method 'nearest' fails with AssertionError j08lue 3404817 closed 0     10 2016-11-28T21:30:43Z 2017-03-30T19:18:34Z 2017-03-30T19:18:34Z CONTRIBUTOR      

The following fails

```python
def test_sel_nearest():
    """Test `sel` with method='nearest'"""
    # create test data
    times = pd.date_range('2000-01-01', periods=4)
    foo = xr.DataArray([1,2,3,4], coords=dict(time=times), dims='time')

    # works
    expected = foo.sel(time='2000-01-01')

    # fails
    assert foo.sel(time='2000-01-01', method='nearest') == expected
```

with an AssertionError in xarray/core/variable.py:

```
C:\Users\uuu\AppData\Local\Continuum\Miniconda2\lib\site-packages\xarray\core\dataarray.pyc in sel(self, method, tolerance, **indexers)
    623             self, indexers, method=method, tolerance=tolerance
    624         )
--> 625         return self.isel(**pos_indexers)._replace_indexes(new_indexes)
    626
    627     def isel_points(self, dim='points', **indexers):

C:\Users\uuu\AppData\Local\Continuum\Miniconda2\lib\site-packages\xarray\core\dataarray.pyc in isel(self, **indexers)
    608         DataArray.sel
    609         """
--> 610         ds = self._to_temp_dataset().isel(**indexers)
    611         return self._from_temp_dataset(ds)
    612

C:\Users\uuu\AppData\Local\Continuum\Miniconda2\lib\site-packages\xarray\core\dataset.pyc in isel(self, **indexers)
    910         for name, var in iteritems(self._variables):
    911             var_indexers = dict((k, v) for k, v in indexers if k in var.dims)
--> 912             variables[name] = var.isel(**var_indexers)
    913         return self._replace_vars_and_dims(variables)
    914

C:\Users\uuu\AppData\Local\Continuum\Miniconda2\lib\site-packages\xarray\core\variable.pyc in isel(self, **indexers)
    539             if dim in indexers:
    540                 key[i] = indexers[dim]
--> 541         return self[tuple(key)]
    542
    543     def _shift_one_dim(self, dim, count):

C:\Users\uuu\AppData\Local\Continuum\Miniconda2\lib\site-packages\xarray\core\variable.pyc in __getitem__(self, key)
    377         # orthogonal indexing should ensure the dimensionality is consistent
    378         if hasattr(values, 'ndim'):
--> 379             assert values.ndim == len(dims), (values.ndim, len(dims))
    380         else:
    381             assert len(dims) == 0, len(dims)

AssertionError: (0, 1)
```

It does not matter which type the indexed dimension has:

```python
def test_sel_nearest_int():
    """Test `sel` with method='nearest'"""
    bar = xr.DataArray([1, 2, 3, 4], coords=dict(dummy=range(4)), dims='dummy')

    # works
    expected = bar.sel(dummy=3)

    # fails
    assert bar.sel(dummy=3, method='nearest') == expected
```

This is on Miniconda for Windows 64 bit with conda-forge and IOOS builds and
  • xarray=0.8.2
  • pandas=0.19.1
  • numpy=1.11.2

Why might this be? Am I doing something wrong?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1140/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
173773358 MDU6SXNzdWUxNzM3NzMzNTg= 992 Creating unlimited dimensions with xarray.Dataset.to_netcdf j08lue 3404817 closed 0     18 2016-08-29T13:23:48Z 2017-01-24T06:38:49Z 2017-01-24T06:38:49Z CONTRIBUTOR      

@shoyer you wrote in a comment on another issue

xray doesn't use or set unlimited dimensions. (It's pretty irrelevant for us, given that NumPy arrays can be stored in either row-major or column-major order.)

I see that xarray does not need UNLIMITED dimensions internally. But I need to create a netCDF file that I subsequently can append to (along the time dimension, in this case). Can this be done?
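
Later xarray versions added an unlimited_dims argument to to_netcdf, which covers the creation side of this (records can then be appended along time with e.g. netCDF4 or NCO); a minimal sketch with made-up data:

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {"temp": (("time", "x"), np.zeros((1, 4)))},
    coords={"time": [np.datetime64("2000-01-01")], "x": np.arange(4)},
)

# Write 'time' as an UNLIMITED dimension so the file can grow along it later;
# the file name here is just a placeholder.
ds.to_netcdf("out.nc", unlimited_dims=["time"])
```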

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/992/reactions",
    "total_count": 5,
    "+1": 5,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
190683531 MDU6SXNzdWUxOTA2ODM1MzE= 1132 groupby with datetime DataArray fails with `AttributeError` j08lue 3404817 closed 0     7 2016-11-21T11:00:57Z 2016-12-19T17:15:17Z 2016-12-19T17:11:57Z CONTRIBUTOR      

I want to group some data by Oct-May season of each year, i.e. [(Oct 2000 - May 2001), (Oct 2001 - May 2002), ...]. I.e. I do not want some DJF-like mean over all the data but one value for each year.

To achieve this, I construct a DataArray that maps the time steps of my dataset (time coordinate) to the start of each of such a season (values) and feed that to groupby.

```python
def _get_oct_may_index(ds):
    dd = ds.time.to_index().to_pydatetime()
    labels = []
    coords = []
    for d in dd:
        if d.month <= 5:
            refyear = d.year - 1
        elif d.month >= 10:
            refyear = d.year
        else:
            continue
        dref = datetime.datetime(refyear, 10, 1)
        labels.append(dref)
        coords.append(d)
    return xr.DataArray(labels, coords=dict(time=coords), dims='time',
                        name='season_start')
```

I give it a custom name='season_start', so I end up in the last else of GroupBy.__init__, where my DataArray named group gets passed on to the function unique_value_groups. However, that function apparently expects a NumPy array, rather than a DataArray.

Please see this ipynb showing the error.

Proposed solution

So it turns out this can easily be fixed by changing line 226 from

```python
unique_values, group_indices = unique_value_groups(group, sort=sort)
```

to

```python
unique_values, group_indices = unique_value_groups(group.values, sort=sort)
```

Please see this other ipynb where the result is as expected.

Now is this a bug or am I abusing the code somehow?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1132/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
190690822 MDExOlB1bGxSZXF1ZXN0OTQ1OTgxMzI= 1133 use safe_cast_to_index to sanitize DataArrays for groupby j08lue 3404817 closed 0     2 2016-11-21T11:33:08Z 2016-12-19T17:12:31Z 2016-12-19T17:11:57Z CONTRIBUTOR   0 pydata/xarray/pulls/1133

Fixes https://github.com/pydata/xarray/issues/1132

Let me know whether this is a valid bug fix or I am misunderstanding something.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1133/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);