issues

3 rows where repo = 13221727 and user = 6404167 sorted by updated_at descending

Facets:
  • type: issue 2, pull 1
  • state: closed 3
  • repo: xarray 3

id: 327613219 · node_id: MDU6SXNzdWUzMjc2MTMyMTk= · number: 2198
title: DataArray.encoding['chunksizes'] not respected in to_netcdf
user: Karel-van-de-Plassche (6404167) · state: closed · locked: 0 · comments: 2
created_at: 2018-05-30T07:50:59Z · updated_at: 2019-06-06T20:35:50Z · closed_at: 2019-06-06T20:35:50Z
author_association: CONTRIBUTOR
body:

This might be just a documentation issue, so sorry if this is not a problem with xarray.

I'm trying to save an intermediate result of a calculation with xarray + dask to disk, but I'd like to preserve the on-disk chunking. Setting the encoding of a Dataset.data_var or DataArray using the encoding attribute seems to work for (at least) some encoding variables, but not for chunksizes. For example:

``` python
import xarray as xr
import dask.array as da
from dask.distributed import Client
from IPython import embed

# First generate a file with random numbers
rng = da.random.RandomState()
shape = (10, 10000)
chunks = [10, 10]
dims = ['x', 'y']
z = rng.standard_normal(shape, chunks=chunks)
da = xr.DataArray(z, dims=dims, name='z')

# Set encoding of the DataArray
da.encoding['chunksizes'] = chunks  # Not conserved
da.encoding['zlib'] = True  # Conserved
ds = da.to_dataset()
print(ds['z'].encoding)
# out: {'chunksizes': [10, 10], 'zlib': True}

# This one is chunked and compressed correctly
ds.to_netcdf('test1.nc', encoding={'z': {'chunksizes': chunks}})

# While this one is only compressed
ds.to_netcdf('test2.nc')
```
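
One way to confirm what actually ends up on disk is to inspect the files' chunking and compression directly with the netCDF4 library. This check is not part of the original report; it assumes test1.nc and test2.nc were written by the snippet above:

``` python
import netCDF4

# Inspect the on-disk chunking and compression of the files written above
# (assumes test1.nc and test2.nc exist in the working directory).
for fname in ['test1.nc', 'test2.nc']:
    with netCDF4.Dataset(fname) as nc:
        var = nc.variables['z']
        # chunking() returns a list of chunk sizes, or 'contiguous' if unchunked;
        # filters() reports compression settings such as zlib.
        print(fname, var.chunking(), var.filters())
```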

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Linux
OS-release: 4.16.5-1-ARCH
machine: x86_64
processor:
byteorder: little
LC_ALL:
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

xarray: 0.10.4
pandas: 0.22.0
numpy: 1.14.3
scipy: 0.19.0
netCDF4: 1.4.0
h5netcdf: 0.5.1
h5py: 2.7.1
Nio: None
zarr: None
bottleneck: None
cyordereddict: None
dask: 0.17.5
distributed: 1.21.8
matplotlib: 2.0.2
cartopy: None
seaborn: 0.7.1
setuptools: 39.1.0
pip: 9.0.1
conda: None
pytest: 3.2.2
IPython: 6.3.1
sphinx: None
reactions:
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2198/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue
id: 328439361 · node_id: MDExOlB1bGxSZXF1ZXN0MTkxOTc1NTkz · number: 2207
title: Fixes #2198: Drop chunksizes only when original_shape is different, not when it isn't found
user: Karel-van-de-Plassche (6404167) · state: closed · locked: 0 · comments: 4
created_at: 2018-06-01T09:08:11Z · updated_at: 2019-06-06T20:35:50Z · closed_at: 2019-06-06T20:35:50Z
author_association: CONTRIBUTOR · draft: 0 · pull_request: pydata/xarray/pulls/2207
body:

Before this fix, chunksizes was dropped even when original_shape was not found in the encoding.
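
The change amounts to guarding the drop: only discard a stored chunksizes encoding when original_shape is present and differs from the variable's current shape. A minimal sketch of that logic (hypothetical helper name, not xarray's actual code):

``` python
def drop_stale_chunksizes(encoding, var_shape):
    # Hypothetical helper illustrating the intended behaviour, not xarray's code.
    original_shape = encoding.get('original_shape')
    if original_shape is not None and tuple(original_shape) != tuple(var_shape):
        # The variable's shape has changed, so the stored chunking no longer applies.
        encoding.pop('chunksizes', None)
    # When original_shape is missing, chunksizes is kept (the behaviour this PR restores).
    return encoding
```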

  • [x] Closes #2198
  • [x] Tests added (for all bug fixes or enhancements)
  • [x] Tests passed (for all non-documentation changes)

Four seemingly unrelated tests failed:

``` python
_____________ TestEncodeCFVariable.test_missing_fillvalue ________________

self = <xarray.tests.test_conventions.TestEncodeCFVariable testMethod=test_missing_fillvalue>

def test_missing_fillvalue(self):
    v = Variable(['x'], np.array([np.nan, 1, 2, 3]))
    v.encoding = {'dtype': 'int16'}
    with pytest.warns(Warning, match='floating point data as an integer'):
      conventions.encode_cf_variable(v)

E Failed: DID NOT WARN. No warnings of type (<class 'Warning'>,) was emitted. The list of emitted warnings is: [SerializationWarning('saving variable None with floating point data as an integer dtype without any _FillValue to use for NaNs',)].

xarray/tests/test_conventions.py:89: Failed
------------------------------ Captured stderr call ------------------------------
/usr/lib/python3.6/site-packages/pytest/vendored_packages/pluggy.py:248: SerializationWarning: saving variable None with floating point data as an integer dtype without any _FillValue to use for NaNs
  call_outcome = _CallOutcome(func)

_____________ TestAccessor.test_register ______________

self = <xarray.tests.test_extensions.TestAccessor testMethod=test_register>

def test_register(self):

    @xr.register_dataset_accessor('demo')
    @xr.register_dataarray_accessor('demo')
    class DemoAccessor(object):
        """Demo accessor."""

        def __init__(self, xarray_obj):
            self._obj = xarray_obj

        @property
        def foo(self):
            return 'bar'

    ds = xr.Dataset()
    assert ds.demo.foo == 'bar'

    da = xr.DataArray(0)
    assert da.demo.foo == 'bar'

    # accessor is cached
    assert ds.demo is ds.demo

    # check descriptor
    assert ds.demo.__doc__ == "Demo accessor."
    assert xr.Dataset.demo.__doc__ == "Demo accessor."
    assert isinstance(ds.demo, DemoAccessor)
    assert xr.Dataset.demo is DemoAccessor

    # ensure we can remove it
    del xr.Dataset.demo
    assert not hasattr(xr.Dataset, 'demo')

    with pytest.warns(Warning, match='overriding a preexisting attribute'):
        @xr.register_dataarray_accessor('demo')
      class Foo(object):

E Failed: DID NOT WARN. No warnings of type (<class 'Warning'>,) was emitted. The list of emitted warnings is: [AccessorRegistrationWarning("registration of accessor <class 'xarray.tests.test_extensions.TestAccessor.test_register.<locals>.Foo'> under name 'demo' for type <class 'xarray.core.dataarray.DataArray'> is overriding a preexisting attribute with the same name.",)].

xarray/tests/test_extensions.py:60: Failed
------------------------------ Captured stderr call ------------------------------
/home/karel/working/xarray/xarray/tests/test_extensions.py:60: AccessorRegistrationWarning: registration of accessor <class 'xarray.tests.test_extensions.TestAccessor.test_register.<locals>.Foo'> under name 'demo' for type <class 'xarray.core.dataarray.DataArray'> is overriding a preexisting attribute with the same name.
  class Foo(object):

______________ TestAlias.test ______________

self = <xarray.tests.test_utils.TestAlias testMethod=test>

def test(self):
    def new_method():
        pass
    old_method = utils.alias(new_method, 'old_method')
    assert 'deprecated' in old_method.__doc__
    with pytest.warns(Warning, match='deprecated'):
      old_method()

E Failed: DID NOT WARN. No warnings of type (<class 'Warning'>,) was emitted. The list of emitted warnings is: [FutureWarning('old_method has been deprecated. Use new_method instead.',)].

xarray/tests/test_utils.py:28: Failed
------------------------------ Captured stderr call ------------------------------
/home/karel/working/xarray/xarray/tests/test_utils.py:28: FutureWarning: old_method has been deprecated. Use new_method instead.
  old_method()

_____________ TestIndexVariable.test_coordinate_alias ______________

self = <xarray.tests.test_variable.TestIndexVariable testMethod=test_coordinate_alias>

def test_coordinate_alias(self):
    with pytest.warns(Warning, match='deprecated'):
      x = Coordinate('x', [1, 2, 3])

E Failed: DID NOT WARN. No warnings of type (<class 'Warning'>,) was emitted. The list of emitted warnings is: [FutureWarning('Coordinate has been deprecated. Use IndexVariable instead.',)].

xarray/tests/test_variable.py:1763: Failed
------------------------------ Captured stderr call ------------------------------
/home/karel/working/xarray/xarray/tests/test_variable.py:1763: FutureWarning: Coordinate has been deprecated. Use IndexVariable instead.
  x = Coordinate('x', [1, 2, 3])
```

reactions:
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2207/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: xarray (13221727) · type: pull
id: 327064908 · node_id: MDU6SXNzdWUzMjcwNjQ5MDg= · number: 2190
title: Parallel non-locked read using dask.Client crashes
user: Karel-van-de-Plassche (6404167) · state: closed · locked: 0 · comments: 5
created_at: 2018-05-28T15:42:40Z · updated_at: 2019-01-14T21:09:04Z · closed_at: 2019-01-14T21:09:03Z
author_association: CONTRIBUTOR
body:

I'm trying to parallelize my code using Dask. Using their distributed.Client() I was able to do computations in parallel. Unfortunately, it seems ~60% of the time is spent in a file lock. As I'm only reading data and doing computations in memory, I should be able to work without a lock, so I tried to pass lock=False to open_dataset. Unfortunately, this crashes my code. A minimal reproducible example can be found below:

``` python
import xarray as xr
import dask.array as da
from dask.distributed import Client
from IPython import embed

# First generate a file with random numbers
rng = da.random.RandomState()
shape = (10, 10000)
chunks = (10, 10)
dims = ['y', 'z']
x = rng.standard_normal(shape, chunks=chunks)
da = xr.DataArray(x, dims=dims, name='x')
da.to_netcdf('test.nc')

# Open file without a lock
client = Client(processes=False)
ds = xr.open_dataset('test.nc', chunks=dict(zip(dims, chunks)), lock=False)

# This will crash!
print((ds['x'] * ds['x']).compute())
```

This crashes with (sometimes):

```
distributed.worker - WARNING - Compute Failed
Function: getter
args: (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x7ffb69033c50>, key=BasicIndexer((slice(None, None, None), slice(None, None, None)))))), (slice(0, 10, None), slice(5710, 5720, None)))
kwargs: {}
Exception: RuntimeError('NetCDF: HDF error',)
```

and usually just with `terminated by signal SIGSEGV (Address boundary error)`.

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Linux
OS-release: 4.16.9-1-ARCH
machine: x86_64
processor:
byteorder: little
LC_ALL:
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

xarray: 0.10.2
pandas: 0.20.3
numpy: 1.14.0
scipy: 0.19.1
netCDF4: 1.4.0
h5netcdf: None
h5py: 2.7.1
Nio: None
zarr: None
bottleneck: None
cyordereddict: None
dask: 0.17.5
distributed: 1.21.8
matplotlib: 2.1.2
cartopy: None
seaborn: 0.8.1
setuptools: 38.5.1
pip: 10.0.1
conda: None
pytest: 3.4.0
IPython: 6.3.1
sphinx: 1.6.4
```

A "Minimal, Complete and Verifiable Example" will make it much easier for maintainers to help you: http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports

```python

Your code here

```

Problem description

[this should explain why the current behavior is a problem and why the expected output is a better solution.]

Expected Output

Output of xr.show_versions()

# Paste the output here xr.show_versions() here
reactions:
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2190/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
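
For reference, the row selection shown at the top of this page ("3 rows where repo = 13221727 and user = 6404167 sorted by updated_at descending") corresponds to a query like the one below. This is a minimal sketch using Python's built-in sqlite3 module; the local database filename github.db is an assumption:

``` python
import sqlite3

# Reproduce this page's filter against a local copy of the database
# (the filename "github.db" is an assumption).
conn = sqlite3.connect('github.db')
rows = conn.execute(
    """
    SELECT number, title, [type], state, updated_at
    FROM issues
    WHERE repo = ? AND [user] = ?
    ORDER BY updated_at DESC
    """,
    (13221727, 6404167),
).fetchall()
for number, title, type_, state, updated_at in rows:
    print(number, title, type_, state, updated_at)
conn.close()
```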