issues

10 rows where type = "issue" and user = 291576 sorted by updated_at descending

type 1

  • issue 10

state 1

  • closed 10

repo 1

  • xarray 10
id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
636493109 MDU6SXNzdWU2MzY0OTMxMDk= 4142 Should we make "rasterio" an engine option? WeatherGod 291576 closed 0     6 2020-06-10T19:28:49Z 2021-05-27T16:17:53Z 2021-05-27T16:17:53Z CONTRIBUTOR      

In a similar vein to how #4003 is going for zarr files, I would like to see if a rasterio engine could be created so that GeoTIFF files could be opened through open_mfdataset() and friends. I am willing to put some cycles into putting this together. It has been a long time since I did the initial prototype for the pynio backend back in the xray days. Does anyone see any immediate pitfalls or gotchas for doing this?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4142/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
307318224 MDU6SXNzdWUzMDczMTgyMjQ= 2004 Slicing DataArray can take longer than not slicing WeatherGod 291576 closed 0     14 2018-03-21T16:20:49Z 2020-12-03T18:15:35Z 2020-12-03T18:15:35Z CONTRIBUTOR      

Code Sample, a copy-pastable example if possible

```ipython
In [1]: import xarray as xr

In [2]: radmax_ds = xr.open_dataset('tests/radmax_baseline.nc')

In [3]: radmax_ds
Out[3]:
<xarray.Dataset>
Dimensions:    (latitude: 5650, longitude: 12050, time: 3)
Coordinates:
  * latitude   (latitude) float32 13.505002 13.515002 13.525002 13.535002 ...
  * longitude  (longitude) float32 -170.495 -170.485 -170.475 -170.465 ...
  * time       (time) datetime64[ns] 2017-03-07T01:00:00 2017-03-07T02:00:00 ...
Data variables:
    RadarMax   (time, latitude, longitude) float32 ...
Attributes:
    start_date:   03/07/2017 01:00
    end_date:     03/07/2017 01:55
    elapsed:      60
    data_rights:  Respond (TM) Confidential Data. (c) Insurance Services Offi...

In [4]: %timeit foo = radmax_ds.RadarMax.load()
The slowest run took 35509.20 times longer than the fastest. This could mean that an intermediate result is being cached.
1 loop, best of 3: 216 µs per loop

In [5]: 216 * 35509.2
Out[5]: 7669987.199999999
```

So, without any slicing, it takes approximately 7.5 seconds for me to load this complete file into memory. Now, let's see what happens when I slice the DataArray and load it:

```ipython
In [1]: import xarray as xr

In [2]: radmax_ds = xr.open_dataset('tests/radmax_baseline.nc')

In [3]: %timeit foo = radmax_ds.RadarMax[::1, ::1, ::1].load()
1 loop, best of 3: 7.56 s per loop

In [4]: radmax_ds.close()

In [5]: radmax_ds = xr.open_dataset('tests/radmax_baseline.nc')

In [6]: %timeit foo = radmax_ds.RadarMax[::1, ::10, ::10].load()
```

I killed this session after 17 minutes. `top` did not report any unusual io wait, and memory usage was not out of control. I am using v0.10.2 of xarray. My suspicion is that there is something wrong with the indexing system that is causing xarray to read in the data in a bad order. Notice that if I slice all the data, then the timing works out the same as reading it all in straight-up. Not shown here is a run where if I slice every 100 lats and 100 longitudes, then the timing is shorter again, but not to the same amount of time as reading it all in at once.

Let me know if you want a copy of the file. It is a compressed netcdf4, taking up only 1.7MB.

I wonder if this is related to #1985?
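
As a quick sanity check on the numbers in this report, the sketch below uses only the dataset dimensions and the `%timeit` figures quoted above (the variable and helper names are illustrative, not part of xarray). It converts the slowest full-load run to seconds and counts how many values the `[::1, ::10, ::10]` slice actually requests:

```python
# Dimensions reported for RadarMax: (time, latitude, longitude).
shape = (3, 5650, 12050)

# %timeit reported a best run of 216 µs and a slowest run 35509.20x longer.
slowest_full_load_s = 216 * 35509.2 / 1e6


def strided_len(n, step):
    """Number of indices selected by slice(None, None, step) on an axis of length n."""
    return len(range(0, n, step))


full_count = shape[0] * shape[1] * shape[2]
sliced_count = (
    strided_len(shape[0], 1)
    * strided_len(shape[1], 10)
    * strided_len(shape[2], 10)
)

print(f"slowest full load: {slowest_full_load_s:.2f} s")
print(f"values in the full array: {full_count}")
print(f"values in [::1, ::10, ::10]: {sliced_count}")
print(f"fraction requested: {sliced_count / full_count:.2%}")
```

The strided slice asks for 1% of the values, which is what makes a run of more than 17 minutes so surprising next to a full load of under 8 seconds.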

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2004/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
306067267 MDU6SXNzdWUzMDYwNjcyNjc= 1997 can't do in-place clip() with DataArrays. WeatherGod 291576 closed 0     4 2018-03-16T20:31:03Z 2020-02-19T22:59:08Z 2020-02-19T22:59:08Z CONTRIBUTOR      

Code Sample, a copy-pastable example if possible

Where `foo` is a DataArray, there doesn't seem to be a nice way to use `clip()` in-place.

```python
foo.clip(0, None, out=foo)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    foo.clip(0, None, out=foo)
  File "/rd22/scratch/broot/Programs/xarray/xarray/core/dataarray.py", line 1726, in func
    **kwargs))
  File "/rd22/scratch/broot/Programs/xarray/xarray/core/ops.py", line 205, in func
    return _call_possibly_missing_method(self, name, args, kwargs)
  File "/rd22/scratch/broot/Programs/xarray/xarray/core/ops.py", line 192, in _call_possibly_missing_method
    return method(*args, **kwargs)
TypeError: output must be an array
```

You get a similar exception if you do `np.clip(foo, ..., out=foo)`.

Problem description

Note the docstring for `DataArray.clip()`:

```
Help on method clip in module xarray.core.ops:

clip(self, *args, **kwargs) method of xarray.core.dataarray.DataArray instance
    a.clip(min=None, max=None, out=None)

    Return an array whose values are limited to ``[min, max]``.
    One of max or min must be given.

    Refer to `numpy.clip` for full documentation.

    See Also
    --------
    numpy.clip : equivalent function
```

So, the docstring advertises support for `out`.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1997/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
395994055 MDU6SXNzdWUzOTU5OTQwNTU= 2647 getting a "truth value of an array" error when supplying my own `concat_dim`. WeatherGod 291576 closed 0     7 2019-01-04T16:52:00Z 2019-01-06T14:32:08Z 2019-01-05T06:46:34Z CONTRIBUTOR      

This bug was introduced sometime after v0.11.0 and has turned up in my test suite using v0.11.2. I'll pass a `DataArray()` as my `concat_dim`, and the failure will happen at line 609 in `backends/api.py` because of: `if concat_dim is None or concat_dim == _CONCAT_DIM_DEFAULT:`

I am not sure how this change got through. In #2048, I added a unit test that passes a DataArray as a concat_dim argument.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2647/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
305327479 MDU6SXNzdWUzMDUzMjc0Nzk= 1988 open_mfdataset() on a single file drops the concat_dim WeatherGod 291576 closed 0     6 2018-03-14T21:02:39Z 2018-04-10T20:48:59Z 2018-04-10T20:48:59Z CONTRIBUTOR      

When calling xr.open_mfdataset() on a 1 element list of filenames, the concat dimension is never added.

This isn't an MWE at the moment (I will make one soon enough); I just wanted to get my thoughts down.

```python
from datetime import datetime
import xarray as xr

time_coord = xr.DataArray([datetime.utcnow()], name='time', dims='time')
radmax_ds = xr.open_mfdataset(['foobar.nc'], concat_dim=time_coord)
print(radmax_ds)
<xarray.Dataset>
Dimensions:    (latitude: 5650, longitude: 12050)
Coordinates:
  * latitude   (latitude) float32 13.505002 13.515002 13.525002 13.535002 ...
  * longitude  (longitude) float32 -170.495 -170.485 -170.475 -170.465 ...
Data variables:
    RadarMax   (latitude, longitude) float32 dask.array<shape=(5650, 12050), chunksize=(5650, 12050)>
Attributes:
    start_date:  03/07/2017 01:00
    end_date:    03/07/2017 01:55
    elapsed:     60
```

Problem description

If there are two files, then there is a time coordinate, and the data array becomes 3D.

Output of xr.show_versions()

I am currently on a recent-ish master of xarray.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1988/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
110726841 MDU6SXNzdWUxMTA3MjY4NDE= 615 operations with pd.to_timedelta() now fails WeatherGod 291576 closed 0     3 2015-10-09T20:01:00Z 2015-10-09T21:21:41Z 2015-10-09T20:15:49Z CONTRIBUTOR      

Not exactly sure when this started to fail, but I recently upgraded my pandas install and a script of mine started to fail. The SSCCE:

```
from datetime import datetime, timedelta
import xray
import pandas as pd

a = xray.Dataset({'time': [datetime(2000, 1, 1)]})
a['time'] -= pd.to_timedelta(timedelta(hours=6))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/nas/home/broot/centos6/lib/python2.7/site-packages/xray-0.6.0_154_gf270b9f-py2.7.egg/xray/core/dataarray.py", line 1091, in func
    f(self.variable, other_variable)
  File "/nas/home/broot/centos6/lib/python2.7/site-packages/xray-0.6.0_154_gf270b9f-py2.7.egg/xray/core/variable.py", line 799, in func
    self.values = f(self_data, other_data)
TypeError: ufunc subtract cannot use operands with types dtype('<M8[ns]') and dtype('O')
```

Perhaps it makes sense to create a new xray convenience method like pandas's "to_timedelta()" that returns numpy arrays of timedelta64 instead of pandas's special timedelta objects? Or to somehow cast these objects appropriately on the fly?

My current workaround is the following:

    import numpy as np
    a['time'] -= np.array([6], dtype='timedelta64[h]')

While that particular form isn't terrible, it gets very awkward if my timedelta is a combination of different units like 3 hours and 15 minutes or something odd like that.
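
One way to tame the mixed-unit case, sketched here with the standard library only, is to reduce the `timedelta` to an integer count of a single base unit first. The choice of minutes as the base unit is an assumption for this example.

```python
from datetime import timedelta

delta = timedelta(hours=3, minutes=15)

# Floor division of two timedeltas yields an int: the number of whole minutes.
minutes = delta // timedelta(minutes=1)
print(minutes)

# Assumption: this integer could then be used in the same form as the
# workaround above, e.g. np.array([minutes], dtype='timedelta64[m]').
```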

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/615/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
46768521 MDU6SXNzdWU0Njc2ODUyMQ== 268 groupby reduction sometimes collapses variables into scalars WeatherGod 291576 closed 0     3 2014-10-24T18:25:45Z 2015-04-08T03:44:09Z 2015-04-08T03:44:09Z CONTRIBUTOR      

If groupby is done on a Dataset, and all of the values for a particular variable are identical, then the variable is collapsed into a scalar. The same does not occur if the same thing is done to a DataArray. At first, I thought this was useful, but it caused problems because I could no longer concat() datasets together that sometimes had a scalar variable and sometimes did not.

```
>>> x1 = xray.DataArray(np.arange(0, 10, 0.2), name='x')
>>> a = xray.DataArray(np.zeros(x1.shape), {'dim_0': x1}, name='foo')
>>> a.to_dataset().groupby(np.round(x1)).reduce(np.min)
<xray.Dataset>
Dimensions:  (x: 11)
Coordinates:
  * x        (x) float64 0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0
Variables:
    foo      float64 0.0
>>> a.groupby(np.round(x1)).reduce(np.min)
<xray.DataArray 'foo' (x: 11)>
array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
Coordinates:
  * x        (x) float64 0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/268/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
46745063 MDU6SXNzdWU0Njc0NTA2Mw== 264 align silently upcasts data arrays when NaNs are inserted WeatherGod 291576 closed 0     2 2014-10-24T14:36:20Z 2014-10-28T06:47:38Z 2014-10-28T06:47:38Z CONTRIBUTOR      

NaNs are inserted during the join irrespective of the dtype of the array, and the result is upcast accordingly.

    import numpy as np
    import xray

    x1 = np.arange(30)
    x2 = np.arange(5, 35)
    # 'float32' replaces the original 'f32', which NumPy does not accept as a dtype string
    a = xray.DataArray(np.random.random((30,)).astype('float32'), {'x': x1})
    b = xray.DataArray(np.random.random((30,)).astype('float32'), {'x': x2})
    c, d = xray.align(a, b, join='outer')
    print(c.dtype)

The output is float64.
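
For context, a NaN fill value does not by itself require 64 bits, since IEEE 754 single precision has its own NaN encoding. The sketch below uses only the standard library (not xray or NumPy) to round-trip a NaN through a 4-byte float. It illustrates the storage format only and says nothing about how `align` is implemented.

```python
import math
import struct

# Round-trip a NaN through a 4-byte IEEE 754 single-precision value.
packed = struct.pack("<f", float("nan"))
(restored,) = struct.unpack("<f", packed)

print(len(packed), math.isnan(restored))
```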

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/264/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
46756880 MDU6SXNzdWU0Njc1Njg4MA== 267 can't use datetime or pandas datetime to index time dimension WeatherGod 291576 closed 0     6 2014-10-24T16:28:49Z 2014-10-28T04:15:04Z 2014-10-28T04:15:04Z CONTRIBUTOR      

Consider the following:

```
>>> c
<xray.DataArray 'SPD' (time: 2216)>
array([ 9., 6., 6., ..., 10., 5., 3.], dtype=float32)
Coordinates:
    latitude   float32 64.833
    elevation  float32 137.5
    longitude  float32 -147.6
  * time       (time) datetime64[ns] 2013-01-01T11:15:00 ...
Attributes:
    units: miles per hour
>>> c.sel(time='2013-01-01')
<xray.DataArray 'SPD' (time: 13)>
array([ 9., 6., 6., 1., nan, 1., nan, 2., 1., 1., 1., 1., 2.], dtype=float32)
Coordinates:
    latitude   float32 64.833
    elevation  float32 137.5
    longitude  float32 -147.6
  * time       (time) datetime64[ns] 2013-01-01T11:15:00 ...
Attributes:
    units: miles per hour
>>> c.sel(time=datetime(2013, 1, 1))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/nas/home/broot/centos6/lib/python2.7/site-packages/xray-0.3.1.dev_ad43f0b-py2.7.egg/xray/core/dataarray.py", line 495, in sel
    return self.isel(**indexing.remap_label_indexers(self, indexers))
  File "/nas/home/broot/centos6/lib/python2.7/site-packages/xray-0.3.1.dev_ad43f0b-py2.7.egg/xray/core/indexing.py", line 145, in remap_label_indexers
    for dim, label in iteritems(indexers))
  File "/nas/home/broot/centos6/lib/python2.7/site-packages/xray-0.3.1.dev_ad43f0b-py2.7.egg/xray/core/indexing.py", line 145, in <genexpr>
    for dim, label in iteritems(indexers))
  File "/nas/home/broot/centos6/lib/python2.7/site-packages/xray-0.3.1.dev_ad43f0b-py2.7.egg/xray/core/indexing.py", line 129, in convert_label_indexer
    indexer = index.get_loc(np.asscalar(label))
  File "/nas/home/broot/centos6/lib/python2.7/site-packages/pandas/tseries/index.py", line 1280, in get_loc
    return self._engine.get_loc(stamp)
  File "index.pyx", line 519, in pandas.index.DatetimeEngine.get_loc (pandas/index.c:9229)
  File "index.pyx", line 544, in pandas.index.DatetimeEngine.get_loc (pandas/index.c:9018)
KeyError: Timestamp('2013-01-01 00:00:00+0000', tz='UTC')
```

The same thing happens if I do a pd.to_datetime() first.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/267/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
46022646 MDU6SXNzdWU0NjAyMjY0Ng== 254 order matters when doing comparisons against scalar xray objects WeatherGod 291576 closed 0   0.3.1 799012 2 2014-10-16T19:03:11Z 2014-10-23T06:43:30Z 2014-10-23T06:43:23Z CONTRIBUTOR      

Working on some bounding box extraction code, I computed a bounding box by taking mins and maxes of the coordinates from an xray object resulting in a dictionary of scalar xray objects. When comparing an xray DataArray against this scalar xray object, the order seems to matter. This results in problems down the road that wouldn't happen if I just had a scalar value instead of a scalar xray object.

```
>>> bbox
{'longitude': (<xray.DataArray 'longitude' ()>
array(-102.8782), <xray.DataArray 'longitude' ()>
array(-94.6244))}
>>> a = bbox['longitude'][0] <= mod['longitude']
>>> b = mod['longitude'] <= bbox['longitude'][1]
>>> c = mod['longitude'] >= bbox['longitude'][0]
>>> a
<xray.DataArray 'longitude' (longitude: 7001)>
array([False, False, False, ..., True, True, True], dtype=bool)
Coordinates:
  * longitude  (longitude) bool False False False False False False False False False ...
>>> b
<xray.DataArray (longitude: 7001)>
array([ True, True, True, ..., False, False, False], dtype=bool)
Coordinates:
  * longitude  (longitude) float32 -129.995 -129.985 -129.975 -129.965 -129.955 ...
```

See that the "a" object has a name "longitude" while the "b" object does not. Therefore...

```
>>> a & b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/nas/home/broot/centos6/lib/python2.7/site-packages/xray/core/dataarray.py", line 850, in func
    ds = self.coords.merge(other_coords)
  File "/nas/home/broot/centos6/lib/python2.7/site-packages/xray/core/coordinates.py", line 122, in merge
    conflicts = self._merge_validate(other)
  File "/nas/home/broot/centos6/lib/python2.7/site-packages/xray/core/coordinates.py", line 80, in _merge_validate
    raise ValueError('index %r not aligned' % k)
ValueError: index 'longitude' not aligned
```

But, if I use the "c" object instead which was created flipping the comparison around:

```
>>> c * b
<xray.DataArray (longitude: 7001)>
array([False, False, False, ..., False, False, False], dtype=bool)
Coordinates:
  * longitude  (longitude) float32 -129.995 -129.985 -129.975 -129.965 -129.955 ...
```

everything works as expected. I have a vague idea of why this is happening, but I am not exactly sure how one should go about dealing with this. It is a similar problem elsewhere with subclassed numpy arrays. For now, I am going to have to go with the rule of keeping the xray dataarray object first, but that really isn't going to work in other places where I may not know that I am passing xray objects.
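
A general Python mechanism that may be relevant here: for `x <= y`, Python normally calls `x.__le__(y)` first and only falls back to the reflected `y.__ge__(x)` if that returns `NotImplemented`. The sketch below uses a hypothetical `Named` class (not xray's implementation) to show how two mathematically equivalent comparisons can carry different metadata, depending on which operand's method runs.

```python
class Named:
    """Hypothetical stand-in for a labelled value; not xray's implementation."""

    def __init__(self, value, name=None):
        self.value = value
        self.name = name

    def _compare(self, other, op):
        other_value = other.value if isinstance(other, Named) else other
        # The operand whose method runs decides the name of the result.
        return Named(op(self.value, other_value), name=self.name)

    def __le__(self, other):
        return self._compare(other, lambda x, y: x <= y)

    def __ge__(self, other):
        return self._compare(other, lambda x, y: x >= y)


bound = Named(-102.8782, name="bound")
coord = Named(-100.0, name="coord")

a = bound <= coord  # dispatches to bound.__le__, so the result is named "bound"
c = coord >= bound  # dispatches to coord.__ge__, so the result is named "coord"

print(a.value, a.name)
print(c.value, c.name)
```

Both expressions agree on the truth value, but the result inherits its metadata from whichever operand handled the comparison, which is the kind of asymmetry seen between `a` and `c` in the report.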

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/254/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
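
To illustrate how the filter shown at the top of this page (type = "issue", user = 291576, sorted by updated_at descending) maps onto this schema, here is a minimal sketch using Python's built-in `sqlite3` module. It assumes a reduced set of columns, omits the `REFERENCES` clauses, and loads three rows copied from the listing above. The SQL that Datasette actually generates may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced version of the [issues] table: only the columns the query touches.
conn.execute(
    """
    CREATE TABLE [issues] (
       [id] INTEGER PRIMARY KEY,
       [number] INTEGER,
       [title] TEXT,
       [user] INTEGER,
       [state] TEXT,
       [updated_at] TEXT,
       [type] TEXT
    )
    """
)
conn.execute("CREATE INDEX [idx_issues_user] ON [issues] ([user])")

# Three rows taken from the listing, inserted out of order on purpose.
rows = [
    (46745063, 264, "align silently upcasts data arrays when NaNs are inserted",
     291576, "closed", "2014-10-28T06:47:38Z", "issue"),
    (636493109, 4142, 'Should we make "rasterio" an engine option?',
     291576, "closed", "2021-05-27T16:17:53Z", "issue"),
    (307318224, 2004, "Slicing DataArray can take longer than not slicing",
     291576, "closed", "2020-12-03T18:15:35Z", "issue"),
]
conn.executemany("INSERT INTO issues VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# ISO 8601 timestamps stored as TEXT sort chronologically as plain strings.
query = """
    SELECT number, title, updated_at
    FROM issues
    WHERE [type] = :type AND [user] = :user
    ORDER BY updated_at DESC
"""
result = conn.execute(query, {"type": "issue", "user": 291576}).fetchall()

for number, title, updated_at in result:
    print(number, updated_at, title)
```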