
issue_comments


18 rows where user = 1177508 sorted by updated_at descending


issue 5

  • segmentation fault with `open_mfdataset` 8
  • multiple files - variable X not equal across datasets 4
  • issue with xray.open_mfdataset and binary operations 3
  • Preprocess argument for open_mfdataset and threading lock 2
  • Conventions decoding should properly handle byte type attributes on Python 3 1

user 1

  • razcore-rad 18

author_association 1

  • NONE 18
id html_url issue_url node_id user created_at updated_at author_association body reactions performed_via_github_app issue
137379731 https://github.com/pydata/xarray/issues/542#issuecomment-137379731 https://api.github.com/repos/pydata/xarray/issues/542 MDEyOklzc3VlQ29tbWVudDEzNzM3OTczMQ== razcore-rad 1177508 2015-09-03T08:41:00Z 2015-09-03T08:41:00Z NONE

cool, thanks

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  issue with xray.open_mfdataset and binary operations 102177256
133376299 https://github.com/pydata/xarray/issues/542#issuecomment-133376299 https://api.github.com/repos/pydata/xarray/issues/542 MDEyOklzc3VlQ29tbWVudDEzMzM3NjI5OQ== razcore-rad 1177508 2015-08-21T11:22:06Z 2015-08-21T11:22:06Z NONE

Nope, it gives me the same error. It seems to only work with ds.load().

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  issue with xray.open_mfdataset and binary operations 102177256
133112303 https://github.com/pydata/xarray/issues/542#issuecomment-133112303 https://api.github.com/repos/pydata/xarray/issues/542 MDEyOklzc3VlQ29tbWVudDEzMzExMjMwMw== razcore-rad 1177508 2015-08-20T18:40:24Z 2015-08-20T18:40:24Z NONE

I see, I'll try that, thanks

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  issue with xray.open_mfdataset and binary operations 102177256
119698728 https://github.com/pydata/xarray/issues/444#issuecomment-119698728 https://api.github.com/repos/pydata/xarray/issues/444 MDEyOklzc3VlQ29tbWVudDExOTY5ODcyOA== razcore-rad 1177508 2015-07-08T19:07:41Z 2015-07-08T19:07:41Z NONE

I think this issue can be closed. After some digging and playing with different netcdf4 modules, I'm pretty certain that it was a linkage and compilation issue between the system hdf5 and netcdf libraries. You see, the computer I got this error on is one of those "module load" managed supercomputers... and somewhere along the way things got messed up while compiling the Python modules...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  segmentation fault with `open_mfdataset` 91184107
118373477 https://github.com/pydata/xarray/issues/444#issuecomment-118373477 https://api.github.com/repos/pydata/xarray/issues/444 MDEyOklzc3VlQ29tbWVudDExODM3MzQ3Nw== razcore-rad 1177508 2015-07-03T15:28:16Z 2015-07-03T15:28:16Z NONE

On a per-file basis (open_dataset) there's no problem... but again, if I try the h5netcdf engine, open_mfdataset doesn't throw a segmentation fault, though then I run into the string unicode/ascii problem. So I guess h5netcdf and netcdf4 use the same netcdf/hdf5 libraries, don't they? So if it works for h5netcdf then it should work for netcdf4 as well...
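For reference, a minimal sketch of the two calls being compared (the glob pattern here is made up for illustration):

```
import xray  # the pre-rename xarray package

# Default engine (netcdf4) -- the combination that segfaults here:
ds = xray.open_mfdataset('output_*.nc', concat_dim='time')

# Same call with the h5netcdf engine: no segfault, but attributes
# come back as bytes on Python 3 (the unicode/ascii problem above):
ds = xray.open_mfdataset('output_*.nc', concat_dim='time', engine='h5netcdf')
```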

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  segmentation fault with `open_mfdataset` 91184107
118097399 https://github.com/pydata/xarray/pull/446#issuecomment-118097399 https://api.github.com/repos/pydata/xarray/issues/446 MDEyOklzc3VlQ29tbWVudDExODA5NzM5OQ== razcore-rad 1177508 2015-07-02T17:16:41Z 2015-07-02T17:16:41Z NONE

I need to get my head around this... I know that a list comprehension isn't lazy, so basically it goes through the loop and evaluates each iteration... so I thought that:

```
if preprocess is not None:
    datasets = [preprocess(ds) for ds in datasets]
```

translates to forcing the application of the preprocess function to each dataset, effectively loading it into memory... anyway, this is really cool, I'll definitely try it out :+1:
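A sketch of why the eagerness of the comprehension need not mean loading the data (the fix_attrs helper and file names are made up for illustration):

```
import xray

def fix_attrs(ds):
    # Runs eagerly, once per file, but only touches metadata;
    # the underlying arrays stay lazy.
    ds.attrs['source'] = 'corrected'
    return ds

datasets = [xray.open_dataset(p) for p in ['a.nc', 'b.nc']]
# The comprehension itself is eager, but each preprocess call is cheap
# as long as it doesn't index into or compute the data:
datasets = [fix_attrs(ds) for ds in datasets]
```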

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Preprocess argument for open_mfdataset and threading lock 91547750
118093087 https://github.com/pydata/xarray/pull/446#issuecomment-118093087 https://api.github.com/repos/pydata/xarray/issues/446 MDEyOklzc3VlQ29tbWVudDExODA5MzA4Nw== razcore-rad 1177508 2015-07-02T17:00:47Z 2015-07-02T17:00:47Z NONE

I have a question about this preprocess thing. Would it mean that xray will now basically load all the data into memory because of the preprocessing step? Whereas before... or at least that's what I understood from the documentation... xray would access the data on an as-needed basis.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Preprocess argument for open_mfdataset and threading lock 91547750
118091969 https://github.com/pydata/xarray/issues/444#issuecomment-118091969 https://api.github.com/repos/pydata/xarray/issues/444 MDEyOklzc3VlQ29tbWVudDExODA5MTk2OQ== razcore-rad 1177508 2015-07-02T16:55:02Z 2015-07-02T16:55:02Z NONE

Yes, I'm using the same files that I once uploaded on Dropbox for you to play with for #443. I'm not doing anything special, just passing the glob pattern to open_mfdataset with no engine option (which I guess defaults to netcdf4).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  segmentation fault with `open_mfdataset` 91184107
118090139 https://github.com/pydata/xarray/issues/451#issuecomment-118090139 https://api.github.com/repos/pydata/xarray/issues/451 MDEyOklzc3VlQ29tbWVudDExODA5MDEzOQ== razcore-rad 1177508 2015-07-02T16:46:41Z 2015-07-02T16:46:41Z NONE

Yes, I think that this should be implemented in h5netcdf (possibly h5py? because h5netcdf depends on h5py, I believe).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Conventions decoding should properly handle byte type attributes on Python 3 92682010
117993960 https://github.com/pydata/xarray/issues/444#issuecomment-117993960 https://api.github.com/repos/pydata/xarray/issues/444 MDEyOklzc3VlQ29tbWVudDExNzk5Mzk2MA== razcore-rad 1177508 2015-07-02T10:36:06Z 2015-07-02T12:18:09Z NONE

OK... as a follow-up, I did some tests: with netcdf4 I got this error again, but using open_mfdataset with the latest versions of h5py & h5netcdf I don't. But there are some decodings that aren't happening now... for whatever reason (maybe h5netcdf?). Anyway, my netcdf files store the attributes in 'ascii', that is, bytes in Python, so when trying to check for the time I get:

```
Traceback (most recent call last):
  File "segfault.py", line 62, in <module>
    concat_dim='time', engine='h5netcdf'))
  File "/ichec/home/users/razvan/.local/lib/python3.4/site-packages/xray/backends/api.py", line 202, in open_mfdataset
    datasets = [open_dataset(p, **kwargs) for p in paths]
  File "/ichec/home/users/razvan/.local/lib/python3.4/site-packages/xray/backends/api.py", line 202, in <listcomp>
    datasets = [open_dataset(p, **kwargs) for p in paths]
  File "/ichec/home/users/razvan/.local/lib/python3.4/site-packages/xray/backends/api.py", line 145, in open_dataset
    return maybe_decode_store(store)
  File "/ichec/home/users/razvan/.local/lib/python3.4/site-packages/xray/backends/api.py", line 101, in maybe_decode_store
    concat_characters=concat_characters, decode_coords=decode_coords)
  File "/ichec/home/users/razvan/.local/lib/python3.4/site-packages/xray/conventions.py", line 850, in decode_cf
    decode_coords)
  File "/ichec/home/users/razvan/.local/lib/python3.4/site-packages/xray/conventions.py", line 791, in decode_cf_variables
    decode_times=decode_times)
  File "/ichec/home/users/razvan/.local/lib/python3.4/site-packages/xray/conventions.py", line 735, in decode_cf_variable
    if 'since' in attributes['units']:
TypeError: Type str doesn't support the buffer API
```

This is simple to solve... just have every byte attribute decoded to 'utf8' when first reading in the variables... I'll have some more time to look at this later today.

edit: boy... there are some differences between these packages (netcdf4 & h5netcdf)... so, when trying open_mfdataset with netcdf4 I get the segmentation fault... when I open with h5netcdf I don't, but the attributes are in bytes, so then xray gives some errors when trying to get the date/time... netcdf4 doesn't produce this error, it probably converts the bytes to strings internally... so I went in and tried to patch some .decode('utf8') here and there in xray and it works... when using h5netcdf, but then I get another error from h5netcdf:

File "/ichec/home/users/razvan/.local/lib/python3.4/site-packages/h5py/_hl/attrs.py", line 55, in __getitem__ raise IOError("Empty attributes cannot be read") OSError: Empty attributes cannot be read

I didn't paste the full error because I don't think it's relevant. Anyway, needless to say... netcdf4 doesn't give this error... so these things need to be reconciled somehow :)

edit2: so I was going through the posts here and now I saw you addressed this issue using that lock thing, which is set to True by default in open_dataset, right? Well, I don't know exactly what this thing is supposed to do, but I'm still getting a segmentation fault... as stated before, only when using netcdf4, not h5netcdf, but then I run into that ascii vs utf8 inconsistency if I use h5netcdf... maybe I should open a separate issue about this string problem? I don't know if it's an upstream issue or not. I mean, I guess h5netcdf just decides not to convert the ascii to utf8, whereas netcdf4 goes with the more contemporary approach of returning utf8... or is this handled internally by xray?
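The ".decode('utf8') here and there" workaround could be done as a standalone step roughly like this (a sketch only: decode_byte_attrs and the file name are made up; decode_cf is the function from xray/conventions.py that appears in the traceback above):

```
import xray
from xray.conventions import decode_cf

def decode_byte_attrs(ds):
    # Turn bytes-valued attributes into str so that checks like
    # `'since' in attributes['units']` work on Python 3.
    for var in ds.variables.values():
        var.attrs = {k: v.decode('utf-8') if isinstance(v, bytes) else v
                     for k, v in var.attrs.items()}
    return ds

# Open without CF decoding, fix the attributes, then decode manually:
ds = decode_cf(decode_byte_attrs(
    xray.open_dataset('file.nc', engine='h5netcdf', decode_cf=False)))
```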

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  segmentation fault with `open_mfdataset` 91184107
117217039 https://github.com/pydata/xarray/issues/444#issuecomment-117217039 https://api.github.com/repos/pydata/xarray/issues/444 MDEyOklzc3VlQ29tbWVudDExNzIxNzAzOQ== razcore-rad 1177508 2015-06-30T14:55:58Z 2015-06-30T14:55:58Z NONE

Well... I have a couple of remarks to make. After some more thought, this might have been my fault all along. Let me explain. I have this machine at work where I don't have administrative privileges, so I decided to give linuxbrew a try. Now, there are some system hdf5 libraries (in custom locations) and they have this module command to load different versions of packages and set up the proper environment variables. Before I had this issue, I had xray installed with dask and everything compiled against the system libraries (and I had no problems with it). Then, with linuxbrew, I started getting this weird behavior using the latest version of hdf5 (1.8.14), and then I tried version 1.8.13 and had the same issue. I then read somewhere on the net that because of this mixture of local and system installs with linuxbrew there can be compilation problems, that is, the compiler uses versions of some header files that don't necessarily match the locally installed libraries. I can't confirm this any more though, because I reconfigured everything and removed linuxbrew since it was producing more problems than it solved... but I'll be happy to give the current installation a try and see if I can reproduce the error... can't do more than that though... sorry.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  segmentation fault with `open_mfdataset` 91184107
116146897 https://github.com/pydata/xarray/issues/444#issuecomment-116146897 https://api.github.com/repos/pydata/xarray/issues/444 MDEyOklzc3VlQ29tbWVudDExNjE0Njg5Nw== razcore-rad 1177508 2015-06-27T21:33:30Z 2015-06-27T21:33:30Z NONE

So I just tried @mrocklin's idea of using the single-threaded scheduler. This seems to fix the segmentation fault, but I am very curious as to why there's a problem with working in parallel. I tried two different hdf5 libraries (I think versions 1.8.13 and 1.8.14) but got the same segmentation fault. Anyway, working on a single thread is not a big deal, I'll just do that for the time being... I already tried gdb on python but I'm not experienced enough to make heads or tails of it... I have the gdb backtrace here but I don't know what to do with it...

@shoyer, the files are not the issue here, they're the same ones I provided in #443.

Question: does the hdf5 library need to be built with parallel support (mpi or something) maybe?... thanks guys
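For context, the single-threaded experiment amounts to forcing dask onto its synchronous scheduler. In current dask that looks like the sketch below (the 2015-era releases spelled this differently, via dask.set_options):

```
import dask

# Route all dask computation through the synchronous (single-threaded)
# scheduler, sidestepping thread-safety issues in the hdf5/netcdf4 stack:
dask.config.set(scheduler='synchronous')
```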

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  segmentation fault with `open_mfdataset` 91184107
115906191 https://github.com/pydata/xarray/issues/444#issuecomment-115906191 https://api.github.com/repos/pydata/xarray/issues/444 MDEyOklzc3VlQ29tbWVudDExNTkwNjE5MQ== razcore-rad 1177508 2015-06-26T22:10:46Z 2015-06-26T22:22:11Z NONE

Just tried engine='h5netcdf'. Still get the segfault. It looks to me like something doesn't properly initialize the hdf5 library, and calling that isnull function like this somehow triggers some initialization for both arrays. It might also be the & operator... because if I do isnull(arr1) & isnull(arr2) I still get the segmentation fault. Only when using isnull(arr1 & arr2) does it seem to work... strange things.

edit: I was right... it's actually the & operator; I just need to call arr1 & arr2 before the return statement and I don't get the segmentation fault...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  segmentation fault with `open_mfdataset` 91184107
115906809 https://github.com/pydata/xarray/issues/443#issuecomment-115906809 https://api.github.com/repos/pydata/xarray/issues/443 MDEyOklzc3VlQ29tbWVudDExNTkwNjgwOQ== razcore-rad 1177508 2015-06-26T22:14:09Z 2015-06-26T22:14:09Z NONE

I'm trying to concatenate along an existing axis, the 'time' axis. I uploaded a couple of files here; it's just easier, and you can experiment with them.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  multiple files - variable X not equal across datasets 91109966
115901388 https://github.com/pydata/xarray/issues/443#issuecomment-115901388 https://api.github.com/repos/pydata/xarray/issues/443 MDEyOklzc3VlQ29tbWVudDExNTkwMTM4OA== razcore-rad 1177508 2015-06-26T21:57:43Z 2015-06-26T21:57:43Z NONE

Well, I'm not sure if it's a bug; I would say it's more like a missing feature... in my case, each netCDF file has a different mean_height_agl coordinate, that is, they have the same length (it's 1D) but different values in each file. I can understand why it can't concatenate, but I would argue that a better way to handle this is to create a dummy coordinate (as I did) and replace the troublesome coordinate with it.
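With the preprocess argument later added in #446, that workaround could be written roughly like this (a sketch, assuming a version with Dataset.reset_coords; the helper name and file pattern are made up):

```
import xray

def demote_varying_coord(ds):
    # mean_height_agl differs between files, so demote it from a
    # coordinate to an ordinary data variable before concatenating:
    return ds.reset_coords('mean_height_agl')

ds = xray.open_mfdataset('output_*.nc', concat_dim='time',
                         preprocess=demote_varying_coord)
```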

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  multiple files - variable X not equal across datasets 91109966
115900337 https://github.com/pydata/xarray/issues/444#issuecomment-115900337 https://api.github.com/repos/pydata/xarray/issues/444 MDEyOklzc3VlQ29tbWVudDExNTkwMDMzNw== razcore-rad 1177508 2015-06-26T21:50:01Z 2015-06-26T21:53:50Z NONE

Unfortunately I can't use engine='scipy' because they're not netcdf3 files, so it defaults to 'netcdf4'. On the other hand, here you can find the backtrace from gdb... if that helps in any way...

```
print(arr1.dtype, arr2.dtype)
print((arr1 == arr2))
print((arr1 == arr2) | (isnull(arr1) & isnull(arr2)))
```

gives:

```
float64 float64
dask.array<x_1, shape=(50, 39, 59), chunks=((50,), (39,), (59,)), dtype=bool>
dask.array<x_6, shape=(50, 39, 59), chunks=((50,), (39,), (59,)), dtype=bool>
```

Funny thing is, when I'm adding these print statements and so on I (sometimes) get a traceback from Python. Without them I would only get a segmentation fault with no additional information. For example, just now, after introducing these prints, I got this traceback. This doesn't seem to be an xray bug, I mean it can't be since it's just Python code... but any help is appreciated. Thanks!

edit: oh yeah... this is a funny thing. If I do print(((arr1 == arr2) | (isnull(arr1) & isnull(arr2))).all()), I get dask.array<x_13, shape=(), chunks=(), dtype=bool>, which I guess is a problem... so calling that all method kind of screws things up, or at least calls other stuff that screws it up, but I have no idea why calling isnull(arr1 & arr2) before all this makes it run without a segfault.
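For what it's worth, the zero-dimensional dask.array printed by .all() is expected: dask arrays are lazy, so nothing is evaluated until something forces computation. A sketch with present-day dask (standalone toy arrays, not the ones from this issue):

```
import dask.array as da
import numpy as np

arr1 = da.from_array(np.zeros((50, 39, 59)), chunks=(50, 39, 59))
arr2 = da.from_array(np.zeros((50, 39, 59)), chunks=(50, 39, 59))

result = (arr1 == arr2).all()
print(result)           # a 0-d lazy dask array, not a boolean
print(bool(result))     # forces the computation, like result.compute()
```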

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  segmentation fault with `open_mfdataset` 91184107
115555123 https://github.com/pydata/xarray/issues/443#issuecomment-115555123 https://api.github.com/repos/pydata/xarray/issues/443 MDEyOklzc3VlQ29tbWVudDExNTU1NTEyMw== razcore-rad 1177508 2015-06-26T07:13:00Z 2015-06-26T07:13:00Z NONE

I get this with open_mfdataset:

```
Traceback (most recent call last):
  File "box.py", line 59, in <module>
    concat_dim='time'))
  File "/ichec/home/users/razvan/.local/lib/python3.4/site-packages/xray/backends/api.py", line 205, in open_mfdataset
    combined = auto_combine(datasets, concat_dim=concat_dim)
  File "/ichec/home/users/razvan/.local/lib/python3.4/site-packages/xray/core/alignment.py", line 352, in auto_combine
    concatenated = [_auto_concat(ds, dim=concat_dim) for ds in grouped]
  File "/ichec/home/users/razvan/.local/lib/python3.4/site-packages/xray/core/alignment.py", line 352, in <listcomp>
    concatenated = [_auto_concat(ds, dim=concat_dim) for ds in grouped]
  File "/ichec/home/users/razvan/.local/lib/python3.4/site-packages/xray/core/alignment.py", line 303, in _auto_concat
    return concat(datasets, dim=dim)
  File "/ichec/home/users/razvan/.local/lib/python3.4/site-packages/xray/core/alignment.py", line 278, in concat
    return cls._concat(objs, dim, indexers, mode, concat_over, compat)
  File "/ichec/home/users/razvan/.local/lib/python3.4/site-packages/xray/core/dataset.py", line 1712, in _concat
    'variable %r not %s across datasets' % (k, verb))
ValueError: variable 'mean_height_agl' not equal across datasets
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  multiple files - variable X not equal across datasets 91109966
115485265 https://github.com/pydata/xarray/issues/443#issuecomment-115485265 https://api.github.com/repos/pydata/xarray/issues/443 MDEyOklzc3VlQ29tbWVudDExNTQ4NTI2NQ== razcore-rad 1177508 2015-06-26T03:16:23Z 2015-06-26T03:16:23Z NONE

So I don't know if this is what you're asking for (I only have one dataset with this problem), but here's what it looks like:

```
<xray.Dataset>
Dimensions:            (agl: 50, lat: 39, lon: 59, time: 192)
Coordinates:
  * lon                (lon) float64 -29.0 -28.0 -27.0 -26.0 -25.0 -24.0 ...
  * lat                (lat) float64 32.0 33.0 34.0 35.0 36.0 37.0 38.0 39.0 ...
  * agl                (agl) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 ...
  * time               (time) datetime64[ns] 2011-05-21T13:00:00 ...
    mean_height_agl    (time, agl) float64 28.28 97.21 191.1 310.7 460.9 ...
Data variables:
    so2_concentration  (time, agl, lat, lon) float64 3.199e-13 3.199e-13 ...
    ash_wetdep         (time, lat, lon) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
    ash_concentration  (time, agl, lat, lon) float64 9.583e-16 9.581e-16 ...
    ash_mass_loading   (time, lat, lon) float64 1.091e-11 1.091e-11 ...
    so2_mass_loading   (time, lat, lon) float64 2.602e-09 2.602e-09 ...
    ash_drydep         (time, lat, lon) float64 4.086e-10 4.084e-10 4.08e-10 ...
Attributes:
```

This is read in with get_ds from above. I wouldn't be able to read it normally with xray.open_mfdataset because it would give me 'Variable mean_height_agl not equal across datasets'. But 'mean_height_agl' is indeed a coordinate in each individual file, so I basically had to create the dummy 'agl' coordinate and convert 'mean_height_agl' to a variable. This way I can still treat the data as being part of one file.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  multiple files - variable X not equal across datasets 91109966


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);