issues


3 rows where state = "open", type = "issue" and user = 167164 sorted by updated_at descending

Issue 364: pd.Grouper support?
id: 60303760 · node_id: MDU6SXNzdWU2MDMwMzc2MA==
user: naught101 (167164) · state: open · locked: 0 · comments: 24
created_at: 2015-03-09T06:25:14Z · updated_at: 2022-04-09T01:48:48Z
author_association: NONE · repo: xarray (13221727) · type: issue

In pandas, you can pass a pandas.TimeGrouper object to a .groupby() call, which lets you group by month, year, day, or other time periods without manually creating a new index with those values first. It would be great if you could do this with xray, but at the moment I get:

```
/usr/local/lib/python3.4/dist-packages/xray/core/groupby.py in __init__(self, obj, group, squeeze)
     66         if the dimension is squeezed out.
     67         """
---> 68         if group.ndim != 1:
     69             # TODO: remove this limitation?
     70             raise ValueError('`group` must be 1 dimensional')

AttributeError: 'TimeGrouper' object has no attribute 'ndim'
```

I'm not sure how this would work, though: pandas.TimeGrouper doesn't appear to work with multi-index dataframes yet either, so maybe there needs to be a feature request over there too, or maybe it's better to implement something from scratch...
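For context, a minimal sketch of the pandas behaviour being requested, next to the closest things xarray offers today (resample over the time dimension, or groupby on a virtual coordinate); the data here is invented purely for illustration:

```python
import numpy as np
import pandas as pd
import xarray as xr

times = pd.date_range("2000-01-01", periods=365, freq="D")
values = np.random.rand(365)

# pandas: pd.Grouper / TimeGrouper lets groupby() bin by a frequency
# directly, without building a month/year index column first.
series = pd.Series(values, index=times)
monthly_pd = series.groupby(pd.Grouper(freq="M")).mean()

# xarray: the closest current equivalents are resample() on the time
# dimension, or groupby() on a derived coordinate like "time.month".
da = xr.DataArray(values, coords={"time": times}, dims="time")
monthly_xr = da.resample(time="M").mean()
by_month = da.groupby("time.month").mean()
```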

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/364/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Issue 2868: netCDF4: support for structured arrays as attribute values; serialize as "compound types"
id: 429572364 · node_id: MDU6SXNzdWU0Mjk1NzIzNjQ=
user: naught101 (167164) · state: open · locked: 0 · comments: 3
created_at: 2019-04-05T03:54:17Z · updated_at: 2022-04-07T15:23:25Z
author_association: NONE · repo: xarray (13221727) · type: issue

Code Sample, a copy-pastable example if possible

A "Minimal, Complete and Verifiable Example" will make it much easier for maintainers to help you: http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports

```python
ds.attrs = dict(a=dict(b=2))
ds.to_netcdf(outfile)

...

~/miniconda3/envs/ana/lib/python3.6/site-packages/xarray/backends/api.py in check_attr(name, value)
    158                             'a string, an ndarray or a list/tuple of '
    159                             'numbers/strings for serialization to netCDF '
--> 160                             'files'.format(value))
    161
    162     # Check attrs on the dataset itself

TypeError: Invalid value for attr: {'b': 2} must be a number, a string, an ndarray or a list/tuple of numbers/strings for serialization to netCDF files
```

Problem description

I'm not entirely sure whether this should be possible, but this email suggests that it should be: https://www.unidata.ucar.edu/support/help/MailArchives/netcdf/msg10502.html

Nested attributes would be nice as a way to namespace metadata.
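A common workaround in the meantime (a sketch only, built around a hypothetical flatten_attrs helper, not anything in xarray's API) is to flatten the nested dict into prefixed keys, which netCDF can store as ordinary global attributes:

```python
import xarray as xr

def flatten_attrs(attrs, sep="."):
    """Hypothetical helper: flatten nested attribute dicts into
    'outer.inner' keys, which serialize as plain netCDF attributes."""
    flat = {}
    for key, value in attrs.items():
        if isinstance(value, dict):
            for inner_key, inner_value in flatten_attrs(value, sep).items():
                flat[f"{key}{sep}{inner_key}"] = inner_value
        else:
            flat[key] = value
    return flat

ds = xr.Dataset()
ds.attrs = flatten_attrs({"a": {"b": 2}})  # becomes {'a.b': 2}
ds.to_netcdf("out.nc")  # serializes without the TypeError above
```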

Expected Output

Netcdf with nested global attributes.

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 4.18.0-16-lowlatency
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_AU.UTF-8
LOCALE: en_AU.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2
xarray: 0.12.0
pandas: 0.24.2
numpy: 1.16.2
scipy: 1.2.1
netCDF4: 1.4.3.2
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudonetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.0.3
cartopy: 0.17.0
seaborn: None
setuptools: 40.8.0
pip: 19.0.3
conda: None
pytest: 4.3.1
IPython: 7.3.0
sphinx: None
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2868/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Issue 2979: Reading single grid cells from a multi-file netcdf dataset?
id: 446933504 · node_id: MDU6SXNzdWU0NDY5MzM1MDQ=
user: naught101 (167164) · state: open · locked: 0 · comments: 1
created_at: 2019-05-22T05:01:50Z · updated_at: 2019-05-23T16:15:54Z
author_association: NONE · repo: xarray (13221727) · type: issue

I have a multi-file dataset made up of month-long, 8-hourly netCDF files spanning nearly 30 years. The files are available from ftp://ftp.ifremer.fr/ifremer/ww3/HINDCAST/GLOBAL/, and I'm specifically looking at e.g. 1990_CFSR/hs/ww3.199001_hs.nc for each year and month. Each file is about 45 MB, for about 15 GB total.

I want to calculate some lognormal distribution parameters of the Hs variable at each grid point (actually, only a smallish subset of points, using a mask). However, if I load the data with open_mfdataset and try to read a single lat/lon grid cell, my computer tanks and python gets killed due to running out of memory (I have 16 GB, but even if I only try to open one year of data, ~500 MB, python ends up using 27% of my memory).

Is there a way in xarray/dask to force dask to only read single sub-arrays at a time? I have tried using lat/lon chunking, e.g.

```python
mfdata_glob = '/home/nedcr/cr/data/wave/*1990*.nc'
global_ds = xr.open_mfdataset(
    mfdata_glob, chunks={'latitude': 1, 'longitude': 1})
```

but that doesn't seem to improve things.

Is there any way around this problem? I guess I could try using preprocess= to sub-select grid cells, and loop over that, but that seems like it would require opening and reading each file 317*720 times, which sounds like a recipe for a long wait.
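Since the post mentions preprocess=, here is a sketch of that approach under the assumption that all files share the same grid (the cell indices, the 'hs' variable name, and the path are illustrative, taken from the file naming above). Subsetting at open time means only the selected cells are ever read, so each file is opened once rather than once per grid cell:

```python
import xarray as xr

def select_cell(ds):
    # Subset each file to the grid cell(s) of interest before they are
    # concatenated, so the full grid is never materialized in memory.
    return ds.isel(latitude=[100], longitude=[200])  # illustrative indices

global_ds = xr.open_mfdataset(
    '/home/nedcr/cr/data/wave/*1990*.nc',
    preprocess=select_cell,
    combine='by_coords',
)
hs_cell = global_ds['hs'].compute()  # 'hs' per the file naming above
```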

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2979/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}


CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
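
For reference, a sketch of reproducing the filtered view at the top of this page against a local copy of the database, using Python's sqlite3 (the github.db filename is an assumption):

```python
import sqlite3

# Hypothetical local copy of the database backing this page.
conn = sqlite3.connect("github.db")
rows = conn.execute(
    """
    SELECT id, number, title, updated_at
    FROM issues
    WHERE state = 'open' AND type = 'issue' AND [user] = 167164
    ORDER BY updated_at DESC
    """
).fetchall()
for row in rows:
    print(row)
```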