
issue_comments


2 rows where issue = 336273865 and user = 1217238, sorted by updated_at descending


id: 400859677
html_url: https://github.com/pydata/xarray/issues/2254#issuecomment-400859677
issue_url: https://api.github.com/repos/pydata/xarray/issues/2254
node_id: MDEyOklzc3VlQ29tbWVudDQwMDg1OTY3Nw==
user: shoyer (1217238)
created_at: 2018-06-27T23:18:18Z
updated_at: 2018-06-27T23:18:18Z
author_association: MEMBER

Yes, a pull request would be appreciated!

On Wed, Jun 27, 2018 at 1:53 PM Mike Neish <notifications@github.com> wrote:

> > So yes, it looks like we could fix this by checking chunks on each array independently like you suggest. There's no reason why all dask arrays need to have the same chunking for storing with to_netcdf().
>
> I could throw together a pull request if that's all that's involved.
>
> > This is because you need to indicate chunks for variables separately, via encoding: http://xarray.pydata.org/en/stable/io.html#writing-encoded-data
>
> Thanks! I was able to write chunked output to the netCDF file by adding chunksizes to the encoding attribute of the variables. I found I also had to specify original_shape as a workaround for #2198 (https://github.com/pydata/xarray/issues/2198).
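As a rough illustration of the recipe in that quoted reply (hypothetical file name and chunking; assumes dask-backed variables, the netCDF4 engine, and that the #2198 shape check applies to your setup):

```python
import xarray as xr

# Hypothetical input file, opened with dask chunking along "time".
ds = xr.open_dataset("input.nc", chunks={"time": 10})

# Per-variable on-disk chunk sizes (first dask block along each dimension),
# plus the original_shape workaround for #2198 mentioned above.
encoding = {
    name: {
        "chunksizes": tuple(c[0] for c in var.chunks),
        "original_shape": var.shape,
    }
    for name, var in ds.data_vars.items()
    if var.chunks is not None  # only dask-backed variables carry chunks
}

ds.to_netcdf("test.nc", encoding=encoding)
```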

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: Writing Datasets to netCDF4 with "inconsistent" chunks (336273865)
id: 400813372
html_url: https://github.com/pydata/xarray/issues/2254#issuecomment-400813372
issue_url: https://api.github.com/repos/pydata/xarray/issues/2254
node_id: MDEyOklzc3VlQ29tbWVudDQwMDgxMzM3Mg==
user: shoyer (1217238)
created_at: 2018-06-27T20:11:35Z
updated_at: 2018-06-27T20:11:35Z
author_association: MEMBER

For reference, here's the full traceback:

```python
ValueError                                Traceback (most recent call last)
<ipython-input-13-6a835b914234> in <module>()
     12
     13 # Save to a netCDF4 file.
---> 14 dset.to_netcdf("test.nc")

~/dev/xarray/xarray/core/dataset.py in to_netcdf(self, path, mode, format, group, engine, encoding, unlimited_dims, compute)
   1148                          engine=engine, encoding=encoding,
   1149                          unlimited_dims=unlimited_dims,
-> 1150                          compute=compute)
   1151
   1152     def to_zarr(self, store=None, mode='w-', synchronizer=None, group=None,

~/dev/xarray/xarray/backends/api.py in to_netcdf(dataset, path_or_file, mode, format, group, engine, writer, encoding, unlimited_dims, compute)
    701     # handle scheduler specific logic
    702     scheduler = get_scheduler()
--> 703     if (dataset.chunks and scheduler in ['distributed', 'multiprocessing'] and
    704             engine != 'netcdf4'):
    705         raise NotImplementedError("Writing netCDF files with the %s backend "

~/dev/xarray/xarray/core/dataset.py in chunks(self)
   1237             for dim, c in zip(v.dims, v.chunks):
   1238                 if dim in chunks and c != chunks[dim]:
-> 1239                     raise ValueError('inconsistent chunks')
   1240                 chunks[dim] = c
   1241         return Frozen(SortedKeysDict(chunks))

ValueError: inconsistent chunks
```

So yes, it looks like we could fix this by checking chunks on each array independently like you suggest. There's no reason why all dask arrays need to have the same chunking for storing with to_netcdf().
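A minimal sketch of such a per-array check, borrowing the `have_chunks` name from the report quoted just below (its exact placement inside `backends/api.py` is glossed over here):

```python
def have_chunks(dataset):
    """True if any variable in `dataset` is dask-backed.

    Unlike `Dataset.chunks`, this never raises when two variables share a
    dimension but use different chunk sizes.
    """
    return any(v.chunks is not None for v in dataset.variables.values())
```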

> and replace the instances of dataset.chunks with have_chunks, then the netCDF4 file gets written without any problems (although the data seems to be stored contiguously instead of chunked).

This is because you need to indicate chunks for variables separately, via encoding: http://xarray.pydata.org/en/stable/io.html#writing-encoded-data
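For instance (a hypothetical two-variable dataset written with the netCDF4 engine; the chunk sizes are illustrative):

```python
import numpy as np
import xarray as xr

ds = xr.Dataset({
    "a": (("x",), np.arange(100)),
    "b": (("x",), np.arange(100.0)),
})

# Each variable gets its own on-disk chunking via encoding, so no single
# Dataset-wide chunk layout is required.
ds.to_netcdf(
    "test.nc",
    encoding={
        "a": {"chunksizes": (50,)},
        "b": {"chunksizes": (25,)},
    },
)
```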

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: Writing Datasets to netCDF4 with "inconsistent" chunks (336273865)


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
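The rows above correspond to a filter equivalent to the following query (bracket quoting matches the schema above):

```sql
select *
from [issue_comments]
where [issue] = 336273865
  and [user] = 1217238
order by [updated_at] desc;
```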