issue_comments


34 rows where user = 3404817 sorted by updated_at descending




issue 15

  • Creating unlimited dimensions with xarray.Dataset.to_netcdf 6
  • avoid integer overflow when decoding large time numbers 4
  • sel with method 'nearest' fails with AssertionError 3
  • test decoding num_dates in float types 3
  • `xray.open_mfdataset` concatenates also variables without time dimension 2
  • NetCDF attributes like `long_name` and `units` lost on `.mean()` 2
  • Circular longitude axis 2
  • Use xarray.open_dataset() for password-protected Opendap files 2
  • groupby with datetime DataArray fails with `AttributeError` 2
  • Time decoding has round-off error 0.10.0. Gone now. 2
  • avoid integer overflow when decoding large time numbers 2
  • Examples combining multiple files 1
  • Aggregating NetCDF files 1
  • concat automagically outer-joins coordinates 1
  • We need a fast path for open_mfdataset 1

user 1

  • j08lue · 34

author_association 1

  • CONTRIBUTOR 34
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
582288134 https://github.com/pydata/xarray/issues/1068#issuecomment-582288134 https://api.github.com/repos/pydata/xarray/issues/1068 MDEyOklzc3VlQ29tbWVudDU4MjI4ODEzNA== j08lue 3404817 2020-02-05T08:06:28Z 2020-02-05T08:06:28Z CONTRIBUTOR

Yes, seems like a redirect issue. The URL is fine.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use xarray.open_dataset() for password-protected Opendap files 186169975
499367730 https://github.com/pydata/xarray/issues/1354#issuecomment-499367730 https://api.github.com/repos/pydata/xarray/issues/1354 MDEyOklzc3VlQ29tbWVudDQ5OTM2NzczMA== j08lue 3404817 2019-06-06T06:33:44Z 2019-06-06T06:33:44Z CONTRIBUTOR

Thanks for reminding me, stale bot, this is still relevant to me.

I think your proposed solution @shoyer would solve this issue perfectly: expose align's join parameter in the concat function - and please also in open_mfdataset. Would you like a PR for this?
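A `join` parameter along these lines did later land in xarray's `concat`; a minimal sketch of the proposed behavior (assuming an xarray version that exposes `join` in `concat`, i.e. ≥ 0.12.3):

```python
import xarray as xr

a = xr.Dataset({"v": ("x", [1, 2])}, coords={"x": [0, 1]})
b = xr.Dataset({"v": ("x", [3, 4])}, coords={"x": [1, 2]})

# `join` controls alignment of the non-concatenated coordinates:
# 'outer' unions them (padding with NaN), 'inner' keeps only shared values.
outer = xr.concat([a, b], dim="t", join="outer")  # x -> [0, 1, 2]
inner = xr.concat([a, b], dim="t", join="inner")  # x -> [1]
```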

  concat automagically outer-joins coordinates 219692578
489064553 https://github.com/pydata/xarray/issues/1823#issuecomment-489064553 https://api.github.com/repos/pydata/xarray/issues/1823 MDEyOklzc3VlQ29tbWVudDQ4OTA2NDU1Mw== j08lue 3404817 2019-05-03T11:26:06Z 2019-05-03T11:36:44Z CONTRIBUTOR

The original issue of this thread is that you sometimes might want to disable alignment checks for coordinates other than the concat_dim and only check for same dimensions and dimension shapes.

When you xr.merge with join='exact', it still checks for alignment (see https://github.com/pydata/xarray/pull/1330#issuecomment-302711852), but does not join the coordinates if they are not aligned. This behavior (not joining) is also included in what @rabernat envisioned here, but his suggestion goes beyond that: you don't even load coordinate values from all but the first dataset and just blindly trust that they are aligned.

So xr.open_mfdataset(join='exact', coords='minimal') does not fix this issue here, I think.

  We need a fast path for open_mfdataset 288184220
385661908 https://github.com/pydata/xarray/pull/2096#issuecomment-385661908 https://api.github.com/repos/pydata/xarray/issues/2096 MDEyOklzc3VlQ29tbWVudDM4NTY2MTkwOA== j08lue 3404817 2018-05-01T12:41:08Z 2018-05-01T12:41:08Z CONTRIBUTOR

> I'm confused. It seems that I did exactly this a while ago

@fmaussion don't be. You are perfectly right, we worked on the same line of code, only that I was about 50 PRs earlier, hence the merge conflict.

I'll close this.

  avoid integer overflow when decoding large time numbers 319132629
385624187 https://github.com/pydata/xarray/pull/2096#issuecomment-385624187 https://api.github.com/repos/pydata/xarray/issues/2096 MDEyOklzc3VlQ29tbWVudDM4NTYyNDE4Nw== j08lue 3404817 2018-05-01T08:26:45Z 2018-05-01T08:26:45Z CONTRIBUTOR

That single AppVeyor CI failure is because the py27 32-bit conda environment cannot be created: `ResolvePackageNotFound: - cftime`

  avoid integer overflow when decoding large time numbers 319132629
385612859 https://github.com/pydata/xarray/pull/1965#issuecomment-385612859 https://api.github.com/repos/pydata/xarray/issues/1965 MDEyOklzc3VlQ29tbWVudDM4NTYxMjg1OQ== j08lue 3404817 2018-05-01T07:03:53Z 2018-05-01T07:03:53Z CONTRIBUTOR

> mind freshening up this PR

I deleted my fork in the meantime. I opened a new PR at #2096 and fixed the merge conflict there.

  avoid integer overflow when decoding large time numbers 302447879
371758069 https://github.com/pydata/xarray/pull/1965#issuecomment-371758069 https://api.github.com/repos/pydata/xarray/issues/1965 MDEyOklzc3VlQ29tbWVudDM3MTc1ODA2OQ== j08lue 3404817 2018-03-09T09:19:38Z 2018-03-09T09:19:38Z CONTRIBUTOR

> Can you add a brief note on the bug fix in whats-new?

Done.

  avoid integer overflow when decoding large time numbers 302447879
370571480 https://github.com/pydata/xarray/pull/1965#issuecomment-370571480 https://api.github.com/repos/pydata/xarray/issues/1965 MDEyOklzc3VlQ29tbWVudDM3MDU3MTQ4MA== j08lue 3404817 2018-03-05T21:25:53Z 2018-03-06T07:18:40Z CONTRIBUTOR

I can see netcdftime is casting the time numbers to float64:

https://github.com/Unidata/netcdftime/blob/e745434547de728d53cde6316cae75b62db102e2/netcdftime/_netcdftime.pyx#L241

like I also suggested for https://github.com/pydata/xarray/pull/1863#issuecomment-361089977.

How about we do that, too?

  avoid integer overflow when decoding large time numbers 302447879
370568574 https://github.com/pydata/xarray/pull/1965#issuecomment-370568574 https://api.github.com/repos/pydata/xarray/issues/1965 MDEyOklzc3VlQ29tbWVudDM3MDU2ODU3NA== j08lue 3404817 2018-03-05T21:16:28Z 2018-03-05T21:29:19Z CONTRIBUTOR

Seems like this issue only exists on Windows. See https://ci.appveyor.com/project/shoyer/xray/build/1.0.3782/job/yjodrbenev1tw5y7#L4686

(The Travis failure is due to a failing build.)

  avoid integer overflow when decoding large time numbers 302447879
370381376 https://github.com/pydata/xarray/issues/597#issuecomment-370381376 https://api.github.com/repos/pydata/xarray/issues/597 MDEyOklzc3VlQ29tbWVudDM3MDM4MTM3Ng== j08lue 3404817 2018-03-05T10:49:19Z 2018-03-05T10:49:19Z CONTRIBUTOR

Can't this be closed?

  Aggregating NetCDF files 109202603
361091210 https://github.com/pydata/xarray/issues/1859#issuecomment-361091210 https://api.github.com/repos/pydata/xarray/issues/1859 MDEyOklzc3VlQ29tbWVudDM2MTA5MTIxMA== j08lue 3404817 2018-01-28T20:01:10Z 2018-01-28T20:01:52Z CONTRIBUTOR

I found out why the issue is absent in the current implementation. Please see https://github.com/pydata/xarray/pull/1863#issuecomment-361091073. That PR adds the test you asked for @shoyer. It passes in the current version and fails in v0.10.0.

  Time decoding has round-off error 0.10.0. Gone now. 291565176
361091073 https://github.com/pydata/xarray/pull/1863#issuecomment-361091073 https://api.github.com/repos/pydata/xarray/issues/1863 MDEyOklzc3VlQ29tbWVudDM2MTA5MTA3Mw== j08lue 3404817 2018-01-28T19:59:23Z 2018-01-28T19:59:23Z CONTRIBUTOR

> Store _NS_PER_TIME_DELTA values as int, then numpy will do the casting.

Haha, OK, this is actually what you implemented in https://github.com/pydata/xarray/commit/50b0a69a7aa0fb7ac3afb28e7bd971cf08055f99:

https://github.com/pydata/xarray/blob/50b0a69a7aa0fb7ac3afb28e7bd971cf08055f99/xarray/coding/times.py#L32-L37

So that is why it works now.

Case closed, I guess.

  test decoding num_dates in float types 292231408
361089977 https://github.com/pydata/xarray/pull/1863#issuecomment-361089977 https://api.github.com/repos/pydata/xarray/issues/1863 MDEyOklzc3VlQ29tbWVudDM2MTA4OTk3Nw== j08lue 3404817 2018-01-28T19:45:49Z 2018-01-28T19:48:09Z CONTRIBUTOR

I believe the issue originates in these lines:

https://github.com/pydata/xarray/blob/ac854f081d4b57d292755d3aff1476f8e2e2da11/xarray/conventions.py#L174-L175

where we multiply the num_dates with some float value and then cast to int64.

If num_dates is float32, numpy keeps float32 when multiplying with e.g. 1e9 and that somehow introduces an error. Here is a stripped version of the above:

```python
flat_num_dates = np.arange(100).astype('float32')
n = 1e9
roundtripped = (flat_num_dates * n).astype(np.int64) / n
assert np.all(flat_num_dates == roundtripped)
```

By the way, the factor has to be large, like 1e9. E.g. 1e6 ('ms since ...') won't give the error.

The weird thing is that the corresponding code in the current master is identical:

https://github.com/pydata/xarray/blob/50b0a69a7aa0fb7ac3afb28e7bd971cf08055f99/xarray/coding/times.py#L151-L152

I will look into why the result is still different from v0.10.0.

Also, if this really is the origin of the error, there are two easy ways to avoid this:

  1. Cast flat_num_dates to float64: (flat_num_dates.astype(np.float64) * n).astype(np.int64)
  2. Store _NS_PER_TIME_DELTA values as int, then numpy will do the casting.
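Fix 1 can be demonstrated with plain NumPy; this sketch reproduces the float32 precision loss and shows that casting to float64 first avoids it:

```python
import numpy as np

flat_num_dates = np.arange(100).astype("float32")
n = 1e9  # e.g. 'seconds since ...' decoded to nanoseconds

# Buggy path: the product is computed in float32, which cannot represent
# k * 1e9 exactly for larger k (the mantissa has only 24 bits)
bad = (flat_num_dates * n).astype(np.int64)

# Fix 1: cast to float64 first, where k * 1e9 is exact for these values
good = (flat_num_dates.astype(np.float64) * n).astype(np.int64)

print((bad != good).sum())  # some entries differ
```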
  test decoding num_dates in float types 292231408
361089805 https://github.com/pydata/xarray/pull/1863#issuecomment-361089805 https://api.github.com/repos/pydata/xarray/issues/1863 MDEyOklzc3VlQ29tbWVudDM2MTA4OTgwNQ== j08lue 3404817 2018-01-28T19:43:21Z 2018-01-28T19:43:21Z CONTRIBUTOR

For comparison, I added the same test to the 0.10.0 version in https://github.com/j08lue/xarray/tree/0p10-f4-time-decode, where it fails:

```
pytest -v xarray\tests\test_conventions.py

...

        self.assertArrayEqual(expected, actual_cmp)

E   AssertionError:
E   Arrays are not equal
E
E   (mismatch 90.0%)
E    x: array([datetime.datetime(2000, 1, 1, 0, 0),
E           datetime.datetime(2000, 1, 2, 0, 0),
E           datetime.datetime(2000, 1, 3, 0, 0),...
E    y: array(['2000-01-01T00:00:00.000000', '2000-01-02T00:00:00.003211',
E           '2000-01-03T00:00:00.006422', '2000-01-04T00:00:00.001245',
E           '2000-01-05T00:00:00.012845', '2000-01-06T00:00:00.024444',...
```

  test decoding num_dates in float types 292231408
360782050 https://github.com/pydata/xarray/issues/1859#issuecomment-360782050 https://api.github.com/repos/pydata/xarray/issues/1859 MDEyOklzc3VlQ29tbWVudDM2MDc4MjA1MA== j08lue 3404817 2018-01-26T13:17:46Z 2018-01-26T13:17:46Z CONTRIBUTOR

> PR with a test

I'll see if I can find the time in the next few days.

  Time decoding has round-off error 0.10.0. Gone now. 291565176
281075869 https://github.com/pydata/xarray/issues/1140#issuecomment-281075869 https://api.github.com/repos/pydata/xarray/issues/1140 MDEyOklzc3VlQ29tbWVudDI4MTA3NTg2OQ== j08lue 3404817 2017-02-20T13:13:07Z 2017-02-20T13:13:07Z CONTRIBUTOR

@JamesSample excellent work confirming the limited scope.

I am quite sure that the issue is with this line: https://github.com/pydata/xarray/blob/v0.9.1/xarray/core/variable.py#L375-L376

```python
dims = tuple(dim for k, dim in zip(key, self.dims)
             if not isinstance(k, (int, np.integer)))
```

When you run the test case I added in #1184, you will see that, inside `__getitem__`, `key` is a tuple with a zero-dimensional array inside (`key = (np.array(0, dtype=np.int64),)`).

With that value, `isinstance(key[0], (int, np.integer))` is False on 64-bit Windows, so `dims = ('time',)`, which has length 1.

But `values = self._indexable_data[key]` gives a zero-dimensional array, such that `values.ndim == 0`...

I am unable to tell which of the two sides of the assertion expression is unexpected...
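The mismatch can be reproduced with plain NumPy, independent of xarray (a sketch of the two sides of the assertion):

```python
import numpy as np

# A zero-dimensional integer array is an ndarray, not an np.integer,
# so the isinstance() check treats it as "keep this dimension":
k = np.array(0, dtype=np.int64)
keep_dim = not isinstance(k, (int, np.integer))

# ...but indexing with it behaves like scalar indexing and
# produces a zero-dimensional result:
values = np.arange(5)[k]
print(keep_dim, values.ndim)  # True 0 -> the two sides disagree
```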

  sel with method 'nearest' fails with AssertionError 192122307
281038974 https://github.com/pydata/xarray/issues/1140#issuecomment-281038974 https://api.github.com/repos/pydata/xarray/issues/1140 MDEyOklzc3VlQ29tbWVudDI4MTAzODk3NA== j08lue 3404817 2017-02-20T10:13:45Z 2017-02-20T10:13:45Z CONTRIBUTOR

@JamesSample Thanks for restoring my credibility a bit here... But, no, I did not figure this out yet.

Appveyor apparently does not have the combination Windows 64 bit + Python 2.7 (https://ci.appveyor.com/project/jhamman/xarray-injyf/build/1.0.659), maybe that is why it does not reproduce the error?

  sel with method 'nearest' fails with AssertionError 192122307
269136711 https://github.com/pydata/xarray/issues/1140#issuecomment-269136711 https://api.github.com/repos/pydata/xarray/issues/1140 MDEyOklzc3VlQ29tbWVudDI2OTEzNjcxMQ== j08lue 3404817 2016-12-25T20:41:41Z 2016-12-25T20:41:41Z CONTRIBUTOR

Yes, this is Python 2. :flushed: Let's see what Travis & Co say.

  sel with method 'nearest' fails with AssertionError 192122307
263007655 https://github.com/pydata/xarray/issues/1068#issuecomment-263007655 https://api.github.com/repos/pydata/xarray/issues/1068 MDEyOklzc3VlQ29tbWVudDI2MzAwNzY1NQ== j08lue 3404817 2016-11-25T18:21:05Z 2016-11-25T18:21:05Z CONTRIBUTOR

@jenfly did you find a solution for making OPeNDAP authentication work with xarray? It might be worthwhile posting it here, even though the issue has to do with the backends.

  Use xarray.open_dataset() for password-protected Opendap files 186169975
262557502 https://github.com/pydata/xarray/issues/1132#issuecomment-262557502 https://api.github.com/repos/pydata/xarray/issues/1132 MDEyOklzc3VlQ29tbWVudDI2MjU1NzUwMg== j08lue 3404817 2016-11-23T16:06:46Z 2016-11-23T16:06:46Z CONTRIBUTOR

Great, safe_cast_to_index works nicely (it passes my test). I added the change to the existing PR.

Do we need to add more groupby tests to make sure the solution is safe for other cases (e.g. other data types)?

  groupby with datetime DataArray fails with `AttributeError` 190683531
262233644 https://github.com/pydata/xarray/issues/1132#issuecomment-262233644 https://api.github.com/repos/pydata/xarray/issues/1132 MDEyOklzc3VlQ29tbWVudDI2MjIzMzY0NA== j08lue 3404817 2016-11-22T12:56:03Z 2016-11-22T13:21:04Z CONTRIBUTOR

OK, here is the minimal example:

```python
import xarray as xr
import pandas as pd


def test_groupby_da_datetime():
    """groupby with a DataArray of dtype datetime"""
    # create test data
    times = pd.date_range('2000-01-01', periods=4)
    foo = xr.DataArray([1, 2, 3, 4], coords=dict(time=times), dims='time')

    # create test index
    dd = times.to_datetime()
    reference_dates = [dd[0], dd[2]]
    labels = reference_dates[0:1] * 2 + reference_dates[1:2] * 2
    ind = xr.DataArray(labels, coords=dict(time=times), dims='time',
                       name='reference_date')

    # group foo by ind
    g = foo.groupby(ind)

    # check result
    actual = g.sum(dim='time')
    expected = xr.DataArray([3, 7], coords=dict(reference_date=reference_dates),
                            dims='reference_date')
    assert actual.to_dataset(name='foo').equals(expected.to_dataset(name='foo'))
```

While writing that, I found out that the problem only occurs when the DataArray used with groupby has dtype=datetime64[ns].

The problem is that we effectively feed the DataArray to pd.factorize, and that goes well for most data types: Pandas checks with the function needs_i8_conversion whether it can factorize the DataArray and decides YES for our datetime64[ns]. But then pd.factorize fails because it tries to access DataArray.view to convert to int64.

So as I see it, there are three possible solutions:

  1. Make Pandas' pd.factorize handle our datetime DataArrays better,
  2. Add an attribute .view to DataArrays, or
  3. Use the solution in the above PR, which means feeding only the NumPy .values to pd.factorize.
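Solution 3 can be illustrated with plain pandas (a sketch: only the underlying `.values` ndarray, not the DataArray, is handed to `pd.factorize`):

```python
import pandas as pd

times = pd.date_range("2000-01-01", periods=4)
# A plain datetime64[ns] ndarray, i.e. what DataArray.values would yield
labels = times.values[[0, 0, 2, 2]]

# pd.factorize can view a datetime64[ns] ndarray as int64, so this works
codes, uniques = pd.factorize(labels)
print(codes)  # [0 0 1 1]
```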

  groupby with datetime DataArray fails with `AttributeError` 190683531
258949295 https://github.com/pydata/xarray/issues/992#issuecomment-258949295 https://api.github.com/repos/pydata/xarray/issues/992 MDEyOklzc3VlQ29tbWVudDI1ODk0OTI5NQ== j08lue 3404817 2016-11-07T20:14:37Z 2016-11-07T20:14:37Z CONTRIBUTOR

Great to see you are pushing forward on this issue @jhamman and @shoyer. I would really have liked to contribute here, but it seems like there are quite some design choices to make, which are better left in your hands.

  Creating unlimited dimensions with xarray.Dataset.to_netcdf 173773358
243367823 https://github.com/pydata/xarray/issues/992#issuecomment-243367823 https://api.github.com/repos/pydata/xarray/issues/992 MDEyOklzc3VlQ29tbWVudDI0MzM2NzgyMw== j08lue 3404817 2016-08-30T08:22:32Z 2016-08-30T08:22:32Z CONTRIBUTOR

@jhamman sorry, I only now saw that you pointed to a previous issue on the same topic (with exactly the same considerations). I did not find that issue when I searched (for "unlimited").

You were against changing to_netcdf then. Are you still?

  Creating unlimited dimensions with xarray.Dataset.to_netcdf 173773358
243365966 https://github.com/pydata/xarray/issues/992#issuecomment-243365966 https://api.github.com/repos/pydata/xarray/issues/992 MDEyOklzc3VlQ29tbWVudDI0MzM2NTk2Ng== j08lue 3404817 2016-08-30T08:14:08Z 2016-08-30T08:14:08Z CONTRIBUTOR

The above solution would not require much more than that, in set_necessary_dimensions,

```python
def set_necessary_dimensions(self, variable):
    for d, l in zip(variable.dims, variable.shape):
        if d not in self.dimensions:
            self.set_dimension(d, l)
```

would become

```python
def set_necessary_dimensions(self, variable):
    for d, l in zip(variable.dims, variable.shape):
        if d in self._unlimited_dimensions:
            l = None
        if d not in self.dimensions:
            self.set_dimension(d, l)
```
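The effect of that change can be sketched with a minimal stand-in for the datastore (the class and `_unlimited_dimensions` attribute here are hypothetical illustrations, not xarray's actual internals):

```python
from collections import namedtuple

Variable = namedtuple("Variable", ["dims", "shape"])


class FakeStore:
    """Minimal stand-in for a netCDF-writing datastore."""

    def __init__(self, unlimited_dimensions=()):
        self._unlimited_dimensions = set(unlimited_dimensions)
        self.dimensions = {}

    def set_dimension(self, name, length):
        # length=None marks the dimension UNLIMITED, as in netCDF4's
        # createDimension(name, None)
        self.dimensions[name] = length

    def set_necessary_dimensions(self, variable):
        for d, l in zip(variable.dims, variable.shape):
            if d in self._unlimited_dimensions:
                l = None
            if d not in self.dimensions:
                self.set_dimension(d, l)


store = FakeStore(unlimited_dimensions=["time"])
store.set_necessary_dimensions(Variable(dims=("time", "lat"), shape=(12, 180)))
print(store.dimensions)  # {'time': None, 'lat': 180}
```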

  Creating unlimited dimensions with xarray.Dataset.to_netcdf 173773358
243364284 https://github.com/pydata/xarray/issues/992#issuecomment-243364284 https://api.github.com/repos/pydata/xarray/issues/992 MDEyOklzc3VlQ29tbWVudDI0MzM2NDI4NA== j08lue 3404817 2016-08-30T08:06:49Z 2016-08-30T08:07:07Z CONTRIBUTOR

But maybe the encoding dict is not the way to go after all, since it contains entries per variable, while it is the dimension that must be unlimited.

Currently the dataset variables can be created in any order and their necessary dimensions created whenever needed (in the set_necessary_dimensions function). I would not like to change that logic (e.g. towards creating all dimensions required by all variables first, before adding the data variables).

So how about a new keyword argument to to_netcdf, like

```python
ds.to_netcdf(unlimited_dimensions=['time'])
```

or

```python
ds.to_netcdf(dimension_unlimited={'time': True})
```

(the second option being better for explicitly setting {'time': False})?

  Creating unlimited dimensions with xarray.Dataset.to_netcdf 173773358
243350429 https://github.com/pydata/xarray/issues/992#issuecomment-243350429 https://api.github.com/repos/pydata/xarray/issues/992 MDEyOklzc3VlQ29tbWVudDI0MzM1MDQyOQ== j08lue 3404817 2016-08-30T06:59:14Z 2016-08-30T06:59:14Z CONTRIBUTOR

I think it makes sense to preserve the UNLIMITED state through read/write. In my case, I subset a netCDF file along lat and lon dimensions, leaving the time dimension untouched and would therefore expect it to pass unchanged through xarray IO (staying UNLIMITED).

However, when the dataset is indexed/subset/resampled along the unlimited dimension, it would make sense that its state is dropped. But that would require a lot of ifs and buts, so I suggest we leave that aside for now.

  Creating unlimited dimensions with xarray.Dataset.to_netcdf 173773358
243253510 https://github.com/pydata/xarray/issues/992#issuecomment-243253510 https://api.github.com/repos/pydata/xarray/issues/992 MDEyOklzc3VlQ29tbWVudDI0MzI1MzUxMA== j08lue 3404817 2016-08-29T20:56:55Z 2016-08-29T20:56:55Z CONTRIBUTOR

OK, I'd be up for taking a shot at it.

Since it is per-variable and specific to netCDF, I guess the perfect place to add this is in the encoding dictionary that you can pass to to_netcdf, right? Maybe as key unlimited? E.g.

```python
ds.to_netcdf(encoding={'time': dict(unlimited=True)})
```

I need to look up whether netCDF allows for defining more than one unlimited dimension; otherwise that must throw an error.

And then it is just about passing None as the length to createDimension, at least in netCDF4 and scipy.io.netcdf. But I did not look into how xarray handles that under the hood.

  Creating unlimited dimensions with xarray.Dataset.to_netcdf 173773358
147962600 https://github.com/pydata/xarray/issues/623#issuecomment-147962600 https://api.github.com/repos/pydata/xarray/issues/623 MDEyOklzc3VlQ29tbWVudDE0Nzk2MjYwMA== j08lue 3404817 2015-10-14T07:33:44Z 2015-10-14T07:33:44Z CONTRIBUTOR

Alright, I see. In that case we should not create an entirely new syntax for this - unless there are more people than just me interested in this feature. I can just use my own little convenience function that generates a series of array indices that can be fed to isel, something like this:

```python
def wrap_iselkw(ds, dim, istart, istop):
    """Return a kw dict for indices from `istart` to `istop` that wrap around dimension `dim`"""
    n = len(ds[dim])
    if istart > istop:
        istart -= n
    return {dim: np.mod(np.arange(istart, istop), n)}
```
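The wrapped-index trick itself is plain NumPy; for example, indices -2..2 on a length-8 longitude axis become:

```python
import numpy as np

n = 8  # length of the circular (longitude) dimension
idx = np.mod(np.arange(-2, 3), n)
print(idx)  # [6 7 0 1 2]
```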

  Circular longitude axis 110979851
147727880 https://github.com/pydata/xarray/issues/623#issuecomment-147727880 https://api.github.com/repos/pydata/xarray/issues/623 MDEyOklzc3VlQ29tbWVudDE0NzcyNzg4MA== j08lue 3404817 2015-10-13T14:17:08Z 2015-10-13T14:17:08Z CONTRIBUTOR

@jhamman You guys seem to be always ahead of me - which is great! (I just did not notice it this time before I raised the issue...)

But it would actually not be that hard to make selecting regions across the boundary of a circular dimension more convenient in xray.

It would of course be great if it worked with real coordinate (longitude) values:

```python
ds.sel(lon=slice(-90, 15))
```

which, however, only works if the axis explicitly covers this range (e.g. has range -180E to 180E). If it ranges, say, from 0E to 360E, an empty DataArray is returned. There is no nice way of resolving this, I guess, because you would need to hard-code 360 or have it as a clumsy, non-CF-standard parameter and transform all values to some common range before trying to find the requested region.

But it should be less problematic with

```python
ds.isel(lon=slice(-100, 200))
```

i.e. isel with negative slice start. As it is now, this also returns an empty DataArray for me (xray 0.6.0), which is in a way consistent with the sel behaviour. But asking for negative indices is actually a quite pythonic way of implying circular. So could this be implemented, now that netCDF4 supports non-contiguous indexing?

  Circular longitude axis 110979851
115561175 https://github.com/pydata/xarray/issues/442#issuecomment-115561175 https://api.github.com/repos/pydata/xarray/issues/442 MDEyOklzc3VlQ29tbWVudDExNTU2MTE3NQ== j08lue 3404817 2015-06-26T07:28:54Z 2015-06-26T07:28:54Z CONTRIBUTOR

That makes sense. Great that there is an option to keep_attrs. Closing this issue.

  NetCDF attributes like `long_name` and `units` lost on `.mean()` 90658514
115177414 https://github.com/pydata/xarray/issues/442#issuecomment-115177414 https://api.github.com/repos/pydata/xarray/issues/442 MDEyOklzc3VlQ29tbWVudDExNTE3NzQxNA== j08lue 3404817 2015-06-25T09:10:26Z 2015-06-25T09:10:38Z CONTRIBUTOR

Sorry for the confusion! The loss of attributes actually occurs when applying .mean() (rather than .load()).

See this notebook (same in nbviewer) for an example with some opendap-hosted data.

  NetCDF attributes like `long_name` and `units` lost on `.mean()` 90658514
113422235 https://github.com/pydata/xarray/issues/438#issuecomment-113422235 https://api.github.com/repos/pydata/xarray/issues/438 MDEyOklzc3VlQ29tbWVudDExMzQyMjIzNQ== j08lue 3404817 2015-06-19T08:03:19Z 2015-06-19T08:03:44Z CONTRIBUTOR

netCDF4-python uses a dimension specified by the user or an unlimited dimension it finds in the dataset. Here is the corresponding code section.

  `xray.open_mfdataset` concatenates also variables without time dimension 89268800
113415070 https://github.com/pydata/xarray/issues/438#issuecomment-113415070 https://api.github.com/repos/pydata/xarray/issues/438 MDEyOklzc3VlQ29tbWVudDExMzQxNTA3MA== j08lue 3404817 2015-06-19T07:44:25Z 2015-06-19T07:45:40Z CONTRIBUTOR

Here is a print-out of the full dataset for POP ocean model output (see that gist in nbviewer).

I can see that the heuristics exclude variables from concatenation that are associated with dimensions of other variables. But why not just exclude all that do not have a time dimension?

  `xray.open_mfdataset` concatenates also variables without time dimension 89268800
113127892 https://github.com/pydata/xarray/issues/436#issuecomment-113127892 https://api.github.com/repos/pydata/xarray/issues/436 MDEyOklzc3VlQ29tbWVudDExMzEyNzg5Mg== j08lue 3404817 2015-06-18T11:47:42Z 2015-06-18T11:47:42Z CONTRIBUTOR

That is actually an excellent demonstration of the power of xray for climate data analysis, @shoyer.

Something like this (including the pandas bridge) should be included in the documentation somewhere, for example under Why xray?.

Just a thought...

  Examples combining multiple files 88897697

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 19.029ms · About: xarray-datasette