
issue_comments


17 rows where author_association = "NONE" and user = 32069530 sorted by updated_at descending

issue 10

  • bug or unclear definition of combine_attrs with xr.merge() 4
  • Concurrent acces with multiple processes using open_mfdataset 2
  • Pad method 2
  • [Feature Request] iteration equivalent numpy's nditer or ndenumerate 2
  • Wrong list of coordinate when a singleton coordinate exists 2
  • groupby very slow compared to pandas 1
  • nonzero method for xr.DataArray 1
  • removing uneccessary dimension 1
  • Releasing memory? 1
  • [FEATURE]: dimension attribute are lost when stacking an xarray 1

user 1

  • lanougue · 17

author_association 1

  • NONE · 17
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1450036767 https://github.com/pydata/xarray/issues/6196#issuecomment-1450036767 https://api.github.com/repos/pydata/xarray/issues/6196 IC_kwDOAMm_X85Wbc4f lanougue 32069530 2023-03-01T12:09:21Z 2023-03-01T12:09:40Z NONE

Hello @TomNicholas,

Reopening this issue one year later! To answer your last question: singleton dimensions do indeed seem to have a unique behavior, since they are systematically reattached to the other coordinates (even when they naturally share no dimension with those coordinates). These singleton dimensions introduce some strange behavior. Here is another example:

```python
a = xr.DataArray(np.random.rand(2, 3, 2), dims=('x', 'y', 'z'),
                 coords={'x': [1, 2], 'y': [3, 4, 5], 'z': ['0', '1']})
b = xr.DataArray(np.random.rand(2, 3, 2), dims=('x', 'y', 'z'),
                 coords={'x': [1, 2], 'y': [3, 4, 5], 'z': ['0', '1']})
res1 = a.sel(z='0') / b
res2 = a.sel(z='0').expand_dims('z') / b
```

`res1` and `res2` do not have the same size on dimension "z". In `res1`, "z" is no longer considered a dimension at all!
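For completeness, a minimal sketch of one way to sidestep the leftover scalar coordinate (an assumption on my side, not discussed above): dropping it at selection time with `sel(..., drop=True)`:

```python
import numpy as np
import xarray as xr

a = xr.DataArray(np.random.rand(2, 3, 2), dims=('x', 'y', 'z'),
                 coords={'x': [1, 2], 'y': [3, 4, 5], 'z': ['0', '1']})
b = a.copy()

# drop=True discards the scalar 'z' coordinate instead of keeping it attached
res = a.sel(z='0', drop=True) / b
print(dict(res.sizes))  # the 'z' dimension here comes only from b
```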

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Wrong list of coordinate when a singleton coordinate exists 1115166039
1255029201 https://github.com/pydata/xarray/issues/2805#issuecomment-1255029201 https://api.github.com/repos/pydata/xarray/issues/2805 IC_kwDOAMm_X85KzjnR lanougue 32069530 2022-09-22T13:30:26Z 2022-09-22T16:12:47Z NONE

Hello guys,

While waiting for an integrated solution, here is a function that should do the job in a safe way. It returns an iterator:

```python
import numpy as np


def xndindex(ds, dims=None):
    if dims is None:
        dims = ds.dims
    elif isinstance(dims, str):
        dims = [dims]

    for d in dims:
        if d not in ds.dims:
            raise ValueError("Invalid dimension '{}'. Available dimensions: {}".format(d, ds.dims))

    # Iterate over the cartesian product of the selected dimension sizes,
    # yielding indexer dicts usable as ds[i] or ds.isel(**i)
    iter_dict = {k: v for k, v in ds.sizes.items() if k in dims}
    for idx in np.ndindex(tuple(iter_dict.values())):
        yield dict(zip(iter_dict.keys(), idx))
```

Example of use:

```python
a = xr.DataArray(np.random.rand(4, 3), dims=['x', 'y'],
                 coords={'x': np.arange(4), 'y': np.arange(3)})
for i in xndindex(a):
    print(a[i])
```

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  [Feature Request] iteration equivalent numpy's nditer or ndenumerate 419543087
1239645797 https://github.com/pydata/xarray/issues/2805#issuecomment-1239645797 https://api.github.com/repos/pydata/xarray/issues/2805 IC_kwDOAMm_X85J435l lanougue 32069530 2022-09-07T16:53:34Z 2022-09-07T17:00:44Z NONE

Hi guys,

For now, when I want to iterate over my whole dataset, I use this simple (but dangerous, I believe) workaround:

```python
for i in np.ndindex(ds.shape):
    res = ds[{d: j for d, j in zip(ds.dims, i)}]
```

but I am not sure that `np.ndindex` iterates in the right order relative to what `ds.dims` returns.

Is there any news on this topic?

Many thanks!
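For what it's worth: on a DataArray, `dims` and `shape` are defined in the same axis order, so the pairing above is consistent; a minimal sketch checking this (a Dataset has no `.shape`, so there the concern would stand):

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(6).reshape(2, 3), dims=('x', 'y'))

# dims and shape share the same axis order on a DataArray
assert da.shape == tuple(da.sizes[d] for d in da.dims)

for i in np.ndindex(da.shape):
    elem = da[{d: j for d, j in zip(da.dims, i)}]  # 0-d element at this index
```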

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  [Feature Request] iteration equivalent numpy's nditer or ndenumerate 419543087
1085742545 https://github.com/pydata/xarray/issues/1772#issuecomment-1085742545 https://api.github.com/repos/pydata/xarray/issues/1772 IC_kwDOAMm_X85Atx3R lanougue 32069530 2022-04-01T10:42:20Z 2022-04-01T10:42:20Z NONE

I'm waking up this issue. Any news?

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  nonzero method for xr.DataArray 280875330
1026774266 https://github.com/pydata/xarray/issues/6196#issuecomment-1026774266 https://api.github.com/repos/pydata/xarray/issues/6196 IC_kwDOAMm_X849M1T6 lanougue 32069530 2022-02-01T12:07:51Z 2022-02-01T12:07:51Z NONE

Thanks for enlightening me.

Actually, this coordinate dependency on singleton dimensions caused me a problem when using the to_netcdf() function. There is no problem when playing with the xr.Dataset itself, but I get an error when trying to write it to disk with to_netcdf(). So far I have not been able to reproduce it in a minimal example, because the error disappears in minimal examples, and I could not find the fundamental difference between the dataset causing the error and the minimal one: printing them gives exactly the same output. I have to do a deeper inspection.

Concerning the philosophy of what a coordinate should be: the "label" idea is understandable to me at the dataset level. A singleton dimension becomes a (shared) "label" for the whole dataset, which is fine. However, I do not understand why it should also become a "label" of the other coordinates of the dataset. A singleton dimension should not be "more important" than the other (non-singleton) dimensions. Why should a singleton dimension become a "label" of another coordinate when the other dimensions do not? That does not seem logical to me.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Wrong list of coordinate when a singleton coordinate exists 1115166039
1022313564 https://github.com/pydata/xarray/issues/6183#issuecomment-1022313564 https://api.github.com/repos/pydata/xarray/issues/6183 IC_kwDOAMm_X84870Rc lanougue 32069530 2022-01-26T15:31:43Z 2022-01-26T15:31:43Z NONE

OK, thanks! I will be patient, then.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  [FEATURE]: dimension attribute are lost when stacking an xarray 1110623911
861792425 https://github.com/pydata/xarray/issues/5436#issuecomment-861792425 https://api.github.com/repos/pydata/xarray/issues/5436 MDEyOklzc3VlQ29tbWVudDg2MTc5MjQyNQ== lanougue 32069530 2021-06-15T20:00:29Z 2021-06-15T20:00:29Z NONE

Wouldn't an additional flag like "keep_attrs" be feasible? It would be a boolean.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  bug or unclear definition of combine_attrs with xr.merge() 911513701
854812439 https://github.com/pydata/xarray/issues/5436#issuecomment-854812439 https://api.github.com/repos/pydata/xarray/issues/5436 MDEyOklzc3VlQ29tbWVudDg1NDgxMjQzOQ== lanougue 32069530 2021-06-04T15:24:25Z 2021-06-04T15:24:25Z NONE

I understand, but I still believe we should be able to control separately the attrs of the final dataset and the attrs of the merged DataArrays inside it (whatever the way they are passed to the merge function).

Thanks for the pint-xarray suggestion! I didn't know about it. I will look into it.
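For reference, a minimal pint-xarray sketch (assuming the pint-xarray accessor package; the dataset here is made up), which keeps units on the data itself rather than in the attrs:

```python
import xarray as xr
import pint_xarray  # noqa: F401 -- registers the .pint accessor

ds = xr.Dataset({'v': ('x', [1.0, 2.0], {'units': 'm'})})

quantified = ds.pint.quantify()    # move units from attrs onto the data as pint quantities
print(quantified['v'].data.units)  # meter

plain = quantified.pint.dequantify()  # back to plain arrays with units attrs
```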

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  bug or unclear definition of combine_attrs with xr.merge() 911513701
854768921 https://github.com/pydata/xarray/issues/5436#issuecomment-854768921 https://api.github.com/repos/pydata/xarray/issues/5436 MDEyOklzc3VlQ29tbWVudDg1NDc2ODkyMQ== lanougue 32069530 2021-06-04T14:27:07Z 2021-06-04T14:27:07Z NONE

OK, I understand your point of view. My question (or what you think could be a bug) thus becomes: why does the "drop" option remove the attrs of the variables in the merged dataset, while "drop_conflicts" and "override" keep them?

There should thus be some way to tell the merge whether to keep the attrs of each variable in the final dataset. (I also do not understand your comment: how would one keep the units on the data instead of in the attributes?)
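A minimal sketch reproducing the comparison (the datasets and attrs are made up; since the per-variable behavior is exactly what this issue questions, the output depends on the xarray version):

```python
import xarray as xr

a = xr.Dataset({'v': ('x', [1, 2], {'units': 'm'})}, attrs={'source': 'a'})
b = xr.Dataset({'w': ('x', [3, 4], {'units': 's'})}, attrs={'source': 'b'})

for combine_attrs in ('drop', 'drop_conflicts', 'override'):
    merged = xr.merge([a, b], combine_attrs=combine_attrs)
    # compare dataset-level attrs with per-variable attrs
    print(combine_attrs, merged.attrs, merged['v'].attrs)
```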

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  bug or unclear definition of combine_attrs with xr.merge() 911513701
854739959 https://github.com/pydata/xarray/issues/5436#issuecomment-854739959 https://api.github.com/repos/pydata/xarray/issues/5436 MDEyOklzc3VlQ29tbWVudDg1NDczOTk1OQ== lanougue 32069530 2021-06-04T13:52:44Z 2021-06-04T13:52:44Z NONE

@keewis, do you think this behaviour is the expected one?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  bug or unclear definition of combine_attrs with xr.merge() 911513701
611208139 https://github.com/pydata/xarray/issues/3946#issuecomment-611208139 https://api.github.com/repos/pydata/xarray/issues/3946 MDEyOklzc3VlQ29tbWVudDYxMTIwODEzOQ== lanougue 32069530 2020-04-08T21:37:45Z 2020-04-08T21:37:45Z NONE

@TomNicholas, thanks for your help. That is exactly what I wanted to do but, as you said, there is probably a more efficient way to do it.

@dcherian I needed this function because I sometimes use the groupby_bins() function followed by a concatenation along a new dimension. This can drastically increase memory usage due to the multiplication of the other variables in a Dataset. Independently of my usage, having a function that removes redundant data seems interesting to me. There are probably other combinations of functions that can accidentally duplicate data.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  removing uneccessary dimension 595813283
610402170 https://github.com/pydata/xarray/issues/3948#issuecomment-610402170 https://api.github.com/repos/pydata/xarray/issues/3948 MDEyOklzc3VlQ29tbWVudDYxMDQwMjE3MA== lanougue 32069530 2020-04-07T13:57:04Z 2020-04-07T13:57:04Z NONE

Hi, if results1 is already evaluated, just replace "da1.release()" with "del da1". Python should automatically release the memory.
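A minimal sketch of that pattern (the names mirror the issue's `da1`/`results1`, with a made-up array):

```python
import gc

import numpy as np
import xarray as xr

da1 = xr.DataArray(np.random.rand(2000, 2000))  # large intermediate array
results1 = (da1 ** 2).sum()                     # fully evaluated result

del da1       # drop the last reference; CPython frees the buffer via reference counting
gc.collect()  # optional: only needed when reference cycles keep objects alive
```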

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Releasing memory? 595882590
562654929 https://github.com/pydata/xarray/issues/2605#issuecomment-562654929 https://api.github.com/repos/pydata/xarray/issues/2605 MDEyOklzc3VlQ29tbWVudDU2MjY1NDkyOQ== lanougue 32069530 2019-12-06T17:02:05Z 2019-12-06T17:02:05Z NONE

Oh, sorry... I just saw the PR...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Pad method 390774883
562652648 https://github.com/pydata/xarray/issues/2605#issuecomment-562652648 https://api.github.com/repos/pydata/xarray/issues/2605 MDEyOklzc3VlQ29tbWVudDU2MjY1MjY0OA== lanougue 32069530 2019-12-06T16:56:20Z 2019-12-06T16:56:20Z NONE

Hi, I was looking for an xarray padding function and found this issue. For the moment I have made a function of my own based on numpy.pad and xr.apply_ufunc. When possible, it also pads the associated coordinates. If it can be of any help here, here it is:

```python
import numpy as np
import xarray as xr


def xpad(ds, dims={}):
    """Padding of an xarray object.

    Coordinates are linearly padded if the original coordinates are evenly
    spaced; otherwise, no new coordinates are assigned to the padded axis.
    Padded dimensions are named with the prefix 'padded_'.

    Args:
        ds (xarray): xarray object
        dims (dict): keys are dimensions along which to pad, values are
            padding tuples (see np.pad), e.g. {'pulse': (10, 0)}

    Returns:
        (xarray): same as input, with padded axes.
    """
    # Pad widths: (0, 0) for untouched dims, then the requested widths.
    # apply_ufunc moves the core dims to the end, matching this ordering.
    mypad = [(0, 0) for n in ds.dims if n not in dims.keys()]
    mypad += list(dims.values())
    padded_ds = xr.apply_ufunc(
        np.pad, ds, mypad,
        input_core_dims=[list(dims.keys()), []],
        output_core_dims=[['padded_' + d for d in dims.keys()]],
        keep_attrs=True,
    )

    for var, ext in dims.items():
        dvar = np.diff(ds[var])
        if np.allclose(dvar, dvar[0]):
            # Evenly spaced coordinate: extend it linearly on both sides
            dvar = dvar[0]
            extended_var = np.append(
                ds[var].data,
                np.max(ds[var]).data + np.arange(1, ext[1] + 1) * dvar)
            extended_var = np.append(
                np.min(ds[var]).data + np.arange(-ext[0], 0) * dvar,
                extended_var)
            padded_ds = padded_ds.assign_coords(**{'padded_' + var: extended_var})
        else:
            print('Coordinates {} are not evenly spaced, padding is impossible'.format(var))
    return padded_ds
```
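A usage sketch under the function's assumptions (evenly spaced coordinate; the array is made up):

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.random.rand(4, 3), dims=('x', 'y'),
                  coords={'x': np.arange(4), 'y': np.arange(3)})

padded = xpad(da, dims={'x': (2, 1)})  # pad 2 cells on the left, 1 on the right of 'x'
print(dict(padded.sizes))              # {'y': 3, 'padded_x': 7}
print(padded['padded_x'].values)       # [-2 -1  0  1  2  3  4]
```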

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Pad method 390774883
522592263 https://github.com/pydata/xarray/issues/659#issuecomment-522592263 https://api.github.com/repos/pydata/xarray/issues/659 MDEyOklzc3VlQ29tbWVudDUyMjU5MjI2Mw== lanougue 32069530 2019-08-19T14:09:36Z 2019-08-19T14:09:36Z NONE

I took a look at functions such as "np.add.at", which can be much faster than a home-made solution. The aggregate function of the "numpy-groupies" package is even faster (25× faster than np.add.at in my case). Maybe xarray's groupby functionality could rely on such an effective package.
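A minimal sketch of that comparison (assuming the numpy-groupies package; array and group sizes are made up):

```python
import numpy as np
import numpy_groupies as npg  # pip install numpy-groupies

values = np.random.rand(1_000_000)
groups = np.random.randint(0, 100, size=values.size)

# np.add.at: correct unbuffered accumulation, but slow for scattered indices
sums_at = np.zeros(100)
np.add.at(sums_at, groups, values)

# numpy-groupies: same result, typically much faster
sums_npg = npg.aggregate(groups, values, func='sum', size=100)

assert np.allclose(sums_at, sums_npg)
```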

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  groupby very slow compared to pandas 117039129
433393215 https://github.com/pydata/xarray/issues/2494#issuecomment-433393215 https://api.github.com/repos/pydata/xarray/issues/2494 MDEyOklzc3VlQ29tbWVudDQzMzM5MzIxNQ== lanougue 32069530 2018-10-26T12:37:30Z 2018-10-26T12:37:30Z NONE

Hi all, I finally figured out my problem. In each independent process, xr.open_mfdataset() seems to naturally attempt some multi-threaded access (even without the parallel option?). Each node of my cluster was configured in such a way that multi-threading was possible (my mistake). Here was my yaml config file used by PBSCluster():

```yaml
jobqueue:
  pbs:
    name: dask-worker

    # Dask worker options
    cores: 56
    processes: 28
```

I thought the parallel=True option enabled parallelized access for my independent processes. It actually enables parallelized access for the possible threads of each process. Now I have removed parallel=True from the xr.open_mfdataset() call and ensured one thread per process by changing my config file:

```yaml
    cores: 28
    processes: 28
```

Thanks again for your help.
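Equivalently, a minimal sketch doing the same from Python (assuming dask-jobqueue; the memory and job numbers are made up):

```python
from dask.distributed import Client
from dask_jobqueue import PBSCluster

# cores == processes gives each worker process a single thread,
# which avoids multi-threaded access to the NetCDF files
cluster = PBSCluster(cores=28, processes=28, memory='120GB')
cluster.scale(jobs=2)  # request two PBS jobs
client = Client(cluster)
```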

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Concurrent acces with multiple processes using open_mfdataset 371906566
431796693 https://github.com/pydata/xarray/issues/2494#issuecomment-431796693 https://api.github.com/repos/pydata/xarray/issues/2494 MDEyOklzc3VlQ29tbWVudDQzMTc5NjY5Mw== lanougue 32069530 2018-10-22T10:27:04Z 2018-10-22T10:27:04Z NONE

@jhamman I was aware of the difference between the two parallel options. I was thus wondering if I could pass a parallel option to the netcdf4 library via the open_mfdataset() call. I tried changing the engine to netcdf4 and adding backend_kwargs={'parallel': True}, but I get the same error. I'll try Stephan's suggestion to see how it behaves and will report back. Thanks.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Concurrent acces with multiple processes using open_mfdataset 371906566

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);