issue_comments

6 rows where author_association = "NONE" and user = 4849151 sorted by updated_at descending

issue 4

  • Using where() in datasets with dataarrays with different dimensions results in huge RAM consumption 2
  • Using `DataArray.where()` with a DataArray as the condition drops the name 2
  • Netcdf char array not being decoded to string in compound dtype 1
  • NetCDF coordinates in parent group is not used when reading sub group 1

user 1

  • jacklovell · 6 ✖

author_association 1

  • NONE · 6 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
613313818 https://github.com/pydata/xarray/issues/2129#issuecomment-613313818 https://api.github.com/repos/pydata/xarray/issues/2129 MDEyOklzc3VlQ29tbWVudDYxMzMxMzgxOA== jacklovell 4849151 2020-04-14T08:54:33Z 2020-04-14T08:54:33Z NONE

This bug appears to have been fixed, as of Xarray 0.15.0.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Using `DataArray.where()` with a DataArray as the condition drops the name 322849322
389079066 https://github.com/pydata/xarray/issues/2129#issuecomment-389079066 https://api.github.com/repos/pydata/xarray/issues/2129 MDEyOklzc3VlQ29tbWVudDM4OTA3OTA2Ng== jacklovell 4849151 2018-05-15T08:00:31Z 2018-05-15T08:00:31Z NONE

Yes, I think it makes sense to couple it with keep_attrs=True. One could question whether the name should be considered separate to the attributes (and therefore have an additional keyword argument for keeping the name), but I think in most cases it makes sense to keep the name if you're already keeping the attributes, and to drop the name if you're dropping the attributes.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Using `DataArray.where()` with a DataArray as the condition drops the name 322849322
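
The coupling proposed in the comment above (keep the name whenever the attributes are kept, drop both otherwise) can be illustrated with a toy model. This is only a sketch of the suggested semantics: `LabeledArray` and this `where` function are hypothetical stand-ins written for illustration, not xarray's `DataArray` or its actual implementation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class LabeledArray:
    """Hypothetical stand-in for a named array with attributes (not xarray's DataArray)."""
    values: List[float]
    name: Optional[str] = None
    attrs: Dict[str, Any] = field(default_factory=dict)


def where(arr: LabeledArray, cond: List[bool], keep_attrs: bool = False) -> LabeledArray:
    """Mask values where cond is False; keep name and attrs together, or drop both."""
    masked = [v if c else float("nan") for v, c in zip(arr.values, cond)]
    if keep_attrs:
        # The name travels with the attributes, as suggested in the comment.
        return LabeledArray(masked, name=arr.name, attrs=dict(arr.attrs))
    return LabeledArray(masked)


temperature = LabeledArray([1.0, 2.0, 3.0], name="temperature", attrs={"units": "K"})
kept = where(temperature, [True, False, True], keep_attrs=True)
dropped = where(temperature, [True, False, True])
```

Here `kept` retains both `name` and `attrs`, while `dropped` has neither, so a single keyword controls both pieces of metadata.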
373653541 https://github.com/pydata/xarray/issues/1977#issuecomment-373653541 https://api.github.com/repos/pydata/xarray/issues/1977 MDEyOklzc3VlQ29tbWVudDM3MzY1MzU0MQ== jacklovell 4849151 2018-03-16T09:27:22Z 2018-03-16T09:27:22Z NONE

Now that https://github.com/Unidata/netcdf4-python/pull/778 has been merged, it should be a bit easier to support this in xarray too. Though as previously mentioned, it will require no longer calling var.set_auto_chartostring(False) for compound types.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Netcdf char array not being decoded to string in compound dtype 303809308
372324699 https://github.com/pydata/xarray/issues/1982#issuecomment-372324699 https://api.github.com/repos/pydata/xarray/issues/1982 MDEyOklzc3VlQ29tbWVudDM3MjMyNDY5OQ== jacklovell 4849151 2018-03-12T14:17:01Z 2018-03-12T14:17:01Z NONE

It looks to me like #1092 is about a Dataset-like object which can contain groups and sub-groups. Here we have a simpler issue: the Dataset can still be a flat object containing a single group, but it should respect the scope of netCDF dimensions. This means that any dimensions which are mentioned but not visible in the group being written should be searched for and copied (linked?) from a parent group, up to and including the root group if the dimensions reside there.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  NetCDF coordinates in parent group is not used when reading sub group 304314787
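
The lookup described in the comment above (search the current group first, then each parent in turn, up to and including the root) can be sketched in a few lines. The `Group` class and `find_dimension` function below are hypothetical and exist only for this illustration; they are not part of xarray or netCDF4-python.

```python
from typing import Dict, Optional


class Group:
    """Hypothetical model of a netCDF group: named dimensions plus a parent link."""

    def __init__(self, name: str, dims: Optional[Dict[str, int]] = None,
                 parent: Optional["Group"] = None) -> None:
        self.name = name
        self.dims = dict(dims or {})
        self.parent = parent


def find_dimension(group: Group, dim_name: str) -> Optional[Group]:
    """Return the nearest group, from `group` up to the root, that defines dim_name."""
    current: Optional[Group] = group
    while current is not None:
        if dim_name in current.dims:
            return current
        current = current.parent
    return None  # not defined anywhere in scope


# Made-up hierarchy: "time" lives in the root, "channel" in an intermediate group.
root = Group("/", dims={"time": 100})
shot = Group("shot1", dims={"channel": 8}, parent=root)
sub = Group("stage2", parent=shot)

owner = find_dimension(sub, "time")
```

With this hierarchy, `find_dimension(sub, "time")` resolves to the root group even though `stage2` itself defines no dimensions, which is the scoping behaviour the comment asks the reader to respect.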
273529203 https://github.com/pydata/xarray/issues/1217#issuecomment-273529203 https://api.github.com/repos/pydata/xarray/issues/1217 MDEyOklzc3VlQ29tbWVudDI3MzUyOTIwMw== jacklovell 4849151 2017-01-18T16:43:03Z 2017-01-19T05:15:52Z NONE

The problem isn't as bad with a smaller example (though the runtime is doubled). I've attached a minimal working example, which seems to suggest that maybe there was a problem with xarray creating a MultiIndex and duplicating all the data? (I've left in input() to allow checking memory usage before the program exits, but there isn't much difference in this example). xrmin.py.txt

Edit by @shoyer: added code from attachment inline:

```python
#!/usr/bin/env python3

import time
import sys
import numpy as np
import xarray as xr

ds = xr.Dataset()
ds['data1'] = xr.DataArray(np.arange(1000), coords={'t1': np.linspace(0, 1, 1000)})
ds['data1b'] = xr.DataArray(np.arange(1000, 2000), coords={'t1': np.linspace(0, 1, 1000)})
ds['data2'] = xr.DataArray(np.arange(2000, 5000), coords={'t2': np.linspace(0, 1, 3000)})
ds['data2b'] = xr.DataArray(np.arange(6000, 9000), coords={'t2': np.linspace(0, 1, 3000)})
if sys.argv[1] == "nodrop":
    now = time.time()
    print(ds.where(ds.data1 < 50, drop=True))
    print("Took {} seconds".format(time.time() - now))
elif sys.argv[1] == "drop":
    ds1 = ds.drop('t2')
    now = time.time()
    print(ds1.where(ds1.data1 < 50, drop=True))
    print("Took {} seconds".format(time.time() - now))
input("Press return to exit")
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Using where() in datasets with dataarrays with different dimensions results in huge RAM consumption 201617371
273523770 https://github.com/pydata/xarray/issues/1217#issuecomment-273523770 https://api.github.com/repos/pydata/xarray/issues/1217 MDEyOklzc3VlQ29tbWVudDI3MzUyMzc3MA== jacklovell 4849151 2017-01-18T16:25:19Z 2017-01-18T16:25:19Z NONE

data1 and data2 represent two stages of data acquisition within one "shot" of our experiment. I'd like to be able to group each shot's data into a single dataset.

I want to extract from the dataset only the values for which my `where()` condition is true, and I'll only be using DataArrays which share the same dimension as the one in the condition. For example, if I do `ds_low = ds.where(ds.data1 < 0.1, drop=True)`, I'll only use stuff in `ds_low` with the same dimension as `ds.data1`. So in my case extracting the data with the shared dimension using `ds.drop(<unused dim>)` is appropriate.

It would be nice to have xarray throw a warning or error to prevent me chomping up all the RAM in my system if I do try to do this sort of thing though. Or it could simply mask off with NaN everything in the DataArrays which have a different dimension.

Give me a second to provide a minimal working example.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Using where() in datasets with dataarrays with different dimensions results in huge RAM consumption 201617371
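
A rough size estimate shows why variables on a different dimension can become expensive. The sketch below assumes the memory growth comes from broadcasting each variable against the dimensions of the condition and from storing the masked result as float64; the comments above do not confirm this, and `broadcast_nbytes` is a hypothetical helper written for illustration. The dimension sizes (1000 points on `t1`, 3000 on `t2`) are taken from the example script posted in this thread.

```python
from math import prod
from typing import Dict

BYTES_PER_FLOAT64 = 8  # assumption: masked results are stored as float64


def broadcast_nbytes(var_dims: Dict[str, int], cond_dims: Dict[str, int]) -> int:
    """Bytes for a float64 variable after broadcasting against the condition's dimensions."""
    merged = {**var_dims, **cond_dims}  # union of dimensions, keyed by name
    return prod(merged.values()) * BYTES_PER_FLOAT64


# Variable shares the condition's dimension: the shape is unchanged.
same_dim = broadcast_nbytes({"t1": 1000}, {"t1": 1000})

# Variable lives on t2 but the condition lives on t1: the result is 1000 x 3000.
other_dim = broadcast_nbytes({"t2": 3000}, {"t1": 1000})
```

Under these assumptions `same_dim` is 8,000 bytes and `other_dim` is 24,000,000 bytes, a factor of 1000 more per variable than the 24,000 bytes the `t2` variable needs on its own as float64. Dropping the unused dimension before calling `where()`, as described above, avoids that product.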

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);