issue_comments


13 rows where user = 11411331 sorted by updated_at descending


issue 8

  • coarsen deletes attrs on original object 4
  • Advice on unit-aware arithmetic 2
  • HTML repr fix for Furo Sphinx theme 2
  • open_mfdataset too many files 1
  • If a NetCDF file is chunked on disk, open it with compatible dask chunks 1
  • Is it possible to disable datetime64 encoding? 1
  • Preserve attrs with coarsen 1
  • Documentation on tuple-type data in DataArrays 1

user 1

  • kmpaul · 13

author_association 1

  • CONTRIBUTOR 13
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1101767340 https://github.com/pydata/xarray/pull/6501#issuecomment-1101767340 https://api.github.com/repos/pydata/xarray/issues/6501 IC_kwDOAMm_X85Bq6Ks kmpaul 11411331 2022-04-18T21:05:43Z 2022-04-18T21:05:43Z CONTRIBUTOR

😄 It's a good community to contribute to.

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
  HTML repr fix for Furo Sphinx theme 1207399616
1101759613 https://github.com/pydata/xarray/pull/6501#issuecomment-1101759613 https://api.github.com/repos/pydata/xarray/issues/6501 IC_kwDOAMm_X85Bq4R9 kmpaul 11411331 2022-04-18T20:54:41Z 2022-04-18T20:54:41Z CONTRIBUTOR

Isn't that sad that it is my first PR? 😄

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  HTML repr fix for Furo Sphinx theme 1207399616
740118863 https://github.com/pydata/xarray/issues/4404#issuecomment-740118863 https://api.github.com/repos/pydata/xarray/issues/4404 MDEyOklzc3VlQ29tbWVudDc0MDExODg2Mw== kmpaul 11411331 2020-12-07T19:07:01Z 2020-12-07T19:07:01Z CONTRIBUTOR

I'd support that.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Documentation on tuple-type data in DataArrays 692106936
689764253 https://github.com/pydata/xarray/pull/4360#issuecomment-689764253 https://api.github.com/repos/pydata/xarray/issues/4360 MDEyOklzc3VlQ29tbWVudDY4OTc2NDI1Mw== kmpaul 11411331 2020-09-09T19:18:42Z 2020-09-09T19:18:42Z CONTRIBUTOR

Nice work, @jukent!

{
    "total_count": 2,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 2,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Preserve attrs with coarsen 683115879
673869137 https://github.com/pydata/xarray/issues/4120#issuecomment-673869137 https://api.github.com/repos/pydata/xarray/issues/4120 MDEyOklzc3VlQ29tbWVudDY3Mzg2OTEzNw== kmpaul 11411331 2020-08-14T03:55:32Z 2020-08-14T03:55:32Z CONTRIBUTOR

Yeah. That's true. I did overlook that. Thanks!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  coarsen deletes attrs on original object 630062936
673748046 https://github.com/pydata/xarray/issues/4120#issuecomment-673748046 https://api.github.com/repos/pydata/xarray/issues/4120 MDEyOklzc3VlQ29tbWVudDY3Mzc0ODA0Ng== kmpaul 11411331 2020-08-13T22:50:48Z 2020-08-13T22:50:48Z CONTRIBUTOR

Also, while I was walking through the logic in this problem, I found that in the _reduce_method functions of the DataArrayCoarsen and DatasetCoarsen classes, the kwargs variable is being shadowed:

https://github.com/pydata/xarray/blob/df7b2eae3a26c1e86bd5f1dd7dab9cc8c4e53914/xarray/core/rolling.py#L692-L701

So, you can see that the skipna option is just being ignored. That's another pretty easy fix, but it will change existing behavior. It might be prudent to look to see if there are any known bug reports referencing the skipna parameter being ignored.
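The shadowing described above can be reproduced in miniature. The sketch below is hypothetical (it is not xarray's actual _reduce_method code), but it shows the same failure mode: the inner function rebinds kwargs, so a caller's skipna=False is silently discarded.

```python
def make_reduce_method(func):
    """Build a reduction method; mirrors the shadowing bug described above."""
    def wrapped(values, **kwargs):
        # BUG: rebinding kwargs here discards the caller's arguments,
        # so e.g. skipna=False passed by the user is silently ignored.
        kwargs = {"skipna": True}
        return func(values, **kwargs)
    return wrapped

def mean(values, skipna=True):
    """Toy reduction: optionally skip missing (None) values."""
    data = [v for v in values if not (skipna and v is None)]
    return sum(data) / len(data)

reduce_mean = make_reduce_method(mean)
# The caller asks for skipna=False, but the shadowed kwargs wins:
reduce_mean([1.0, None, 3.0], skipna=False)  # returns 2.0 instead of failing
```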

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  coarsen deletes attrs on original object 630062936
673746511 https://github.com/pydata/xarray/issues/4120#issuecomment-673746511 https://api.github.com/repos/pydata/xarray/issues/4120 MDEyOklzc3VlQ29tbWVudDY3Mzc0NjUxMQ== kmpaul 11411331 2020-08-13T22:45:40Z 2020-08-13T22:45:40Z CONTRIBUTOR

@dcherian @jukent: After a little walk through the code, I think the problem is in xarray/core/variable.py. If you look at the first part of the coarsen function in this file:

https://github.com/pydata/xarray/blob/df7b2eae3a26c1e86bd5f1dd7dab9cc8c4e53914/xarray/core/variable.py#L1945-L1953

you will see that **kwargs is not being passed into the _coarsen_reshape function:

https://github.com/pydata/xarray/blob/df7b2eae3a26c1e86bd5f1dd7dab9cc8c4e53914/xarray/core/variable.py#L1961

And down at the bottom of that function:

https://github.com/pydata/xarray/blob/df7b2eae3a26c1e86bd5f1dd7dab9cc8c4e53914/xarray/core/variable.py#L2024-L2025

it retrieves the keep_attrs parameter from global options, and doesn't check for a passed-in argument.

@jukent, why don't you draft a PR to fix this problem?
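One way to sketch the suggested fix (coarsen_reshape here is a hypothetical stand-in for the real function, and OPTIONS stands in for xarray's global options): honour an explicitly passed keep_attrs first, and fall back to the global option only when the caller gave none.

```python
OPTIONS = {"keep_attrs": False}  # stand-in for xarray's global options

def coarsen_reshape(data, attrs, keep_attrs=None):
    """Hypothetical sketch of the fix: prefer an explicit keep_attrs
    argument over the global option."""
    if keep_attrs is None:
        # No explicit choice from the caller: use the global default.
        keep_attrs = OPTIONS["keep_attrs"]
    new_attrs = dict(attrs) if keep_attrs else {}
    return data, new_attrs

# An explicit argument wins over the global default:
_, kept = coarsen_reshape([1, 2], {"units": "K"}, keep_attrs=True)
# kept == {"units": "K"} even though OPTIONS["keep_attrs"] is False
_, dropped = coarsen_reshape([1, 2], {"units": "K"})
# dropped == {} because the global default applies
```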

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  coarsen deletes attrs on original object 630062936
673729936 https://github.com/pydata/xarray/issues/4120#issuecomment-673729936 https://api.github.com/repos/pydata/xarray/issues/4120 MDEyOklzc3VlQ29tbWVudDY3MzcyOTkzNg== kmpaul 11411331 2020-08-13T21:56:14Z 2020-08-13T21:56:46Z CONTRIBUTOR

After some testing, I discovered that:

```python
import xarray as xr

xr.set_options(keep_attrs=True)  # NOTE GLOBAL OPTION!!!!

ds = xr.tutorial.load_dataset("air_temperature")
ds2 = xr.tutorial.load_dataset("air_temperature")

xr.testing.assert_identical(ds, ds2)  # passes
ds.coarsen(lat=5).mean()
xr.testing.assert_identical(ds, ds2)  # passes
```

makes your example pass. So, it seems that somewhere along the chain of functions the keep_attrs parameter is being lost or modified.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  coarsen deletes attrs on original object 630062936
632285419 https://github.com/pydata/xarray/issues/1440#issuecomment-632285419 https://api.github.com/repos/pydata/xarray/issues/1440 MDEyOklzc3VlQ29tbWVudDYzMjI4NTQxOQ== kmpaul 11411331 2020-05-21T19:01:36Z 2020-05-21T19:01:36Z CONTRIBUTOR

@rabernat When you say "underlying array store", are you talking about the storage layer? That is, the zarr store or the netcdf file?

It seems to me that there are lots of "layers" of "chunking", especially when you are talking about chunking an entire dataset, which really confuses the whole issue. On an HPC system, there's filesystem blocksize, NetCDF/HDF5 "internal" chunks, chunking by spreading the data over multiple files, and in-memory chunks (i.e., Dask chunks). I'm not an expert on object store, but my understanding is that (if you are storing NetCDF/HDF5 on object store) there is still "internal" NetCDF/HDF5 "chunking", then chunking over objects/files, and then in-memory chunking.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  If a NetCDF file is chunked on disk, open it with compatible dask chunks 233350060
391596424 https://github.com/pydata/xarray/issues/2176#issuecomment-391596424 https://api.github.com/repos/pydata/xarray/issues/2176 MDEyOklzc3VlQ29tbWVudDM5MTU5NjQyNA== kmpaul 11411331 2018-05-24T05:50:26Z 2018-05-24T05:50:26Z CONTRIBUTOR

Well, I'm certainly not trying to argue that the PhysArray implementation is the solution. It's just a solution, and a solution for a much smaller problem.

I understand your concerns about cf_units. I've heard many people make the same argument for pint, and I appreciate it. I went with cf_units because of its dependence on UDUNITS, which is declared as the "standard" in the CF Conventions. I never had the time (nor do I foresee ever having the time) to see if pint was fully compliant with UDUNITS, so I went with something I knew was.

I personally think it's fine to discuss this here, unless other people would like to see this go offline. To address some of your issues:

1) As a container rather than subclass, it does not implement many of the methods of DataArray

Yes. I agree. In fact, my first implementation was a subclass of DataArray, but after reading the Xarray documentation and Issue #706, I decided on composition for this implementation.

2) There are a few design choices I don't understand, like why calendar is always a property of a PhysArray even when it isn't storing a time

I personally didn't see much of a benefit to a check like hasattr(obj, 'calendar') versus obj.calendar is None (other than the fact that these two checks are opposite). So, for my code, I felt that obj.calendar is None was a reasonable check to see if the units were "calendared time" units.

why cftime objects aren't used instead of units to manage time

As I mentioned in my post, this is just my personal preference. I think that calendared time units are...well...just units, and that the same mechanism for dealing with all other units should deal with calendared time units. It's just an aesthetic that I prefer. (That said, I'm extremely happy that someone finally dealt with non-standard calendars with cftime.)

why the positive attribute is important enough for PhysArray to manage (I've never seen it in any data I've used, and it's easy to check if a DataArray is all positive or negative with a function call)

The CF Conventions define the positive attribute to indicate that the field represents the vertical component of a vector, and the value of positive indicates what direction the vertical component points (up or down) if the data is positive. So, if you add field X and field Y, and X.positive == 'up' but Y.positive == 'down', then you need to actually subtract the fields. It's an annoying attribute that I've had to deal with when preparing data for CMIP6.
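As a concrete illustration of that sign convention (combine_vertical is a hypothetical helper, not PyConform's actual API): when the positive attributes of two fields disagree, one field must be negated before the addition.

```python
def combine_vertical(x, x_positive, y, y_positive):
    """Add two vertical-component fields, respecting the CF 'positive'
    attribute (hypothetical helper for illustration)."""
    if x_positive != y_positive:
        # Opposite sign conventions: flip y before adding.
        y = [-v for v in y]
    return [a + b for a, b in zip(x, y)]

# X points up, Y points down: "adding" them must actually subtract.
combine_vertical([10.0, 20.0], "up", [3.0, 5.0], "down")  # [7.0, 15.0]
```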

In the end, I'm happy doing whatever the community wants with this code. I can pull it out into its own repo (along with the unit tests, which are also in the PyConform repo). And then, after that, I have no problem with people taking it in a different direction (e.g., using pint, injecting properties based on the value of the attributes, etc.). I'm also happy with a subclass option, as my C++ experiences in the past have made me very comfortable with inheritance. I take guidance easily. I just don't usually have a lot of time to devote to development any more these days. :-(

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Advice on unit-aware arithmetic 325810810
391470885 https://github.com/pydata/xarray/issues/2176#issuecomment-391470885 https://api.github.com/repos/pydata/xarray/issues/2176 MDEyOklzc3VlQ29tbWVudDM5MTQ3MDg4NQ== kmpaul 11411331 2018-05-23T19:37:26Z 2018-05-23T19:37:26Z CONTRIBUTOR

Thanks, @jhamman, for the reference.

I have, indeed, been thinking about this a lot. I've hit the same problem that you have @mcgibbon and came to the same conclusion that subclassing was the only way to go. I'm not sure I'm happy with my solution, but I wrote a (partial) implementation of a DataArray subclass to deal with units. @jhamman suggested that I pull this out into a separate repo, and I hope to do that one day... However, if you are curious, now, you can view it here:

https://github.com/NCAR/PyConform/tree/rewrite/source/pyconform/physarrays

Note that this implementation deals with time as just another unit, rather than requiring any special dealing with time. This works if your units package can deal with calendared times, too. I am currently using cf_units to deal with the units, not pint, but the result is similar. However, cf_units does not deal with calendar conversion, and it doesn't always deal with non-standard calendars well. In the end, though, if I could voice my own 2 cents, I believe that calendared times should be dealt with using the same mechanics as any other unit. Just my bias.

Note that this implementation also deals with the positive attribute of the data, in addition to the units and calendar attributes. The positive attribute can be tricky, but if the values don't match between two arrays, then you need to convert before doing any math.

Let me know what you think about this. If it's a priority, I can pull this out into its own repository.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Advice on unit-aware arithmetic 325810810
375819524 https://github.com/pydata/xarray/issues/2013#issuecomment-375819524 https://api.github.com/repos/pydata/xarray/issues/2013 MDEyOklzc3VlQ29tbWVudDM3NTgxOTUyNA== kmpaul 11411331 2018-03-23T22:56:17Z 2018-03-23T22:56:17Z CONTRIBUTOR

Oh, good grief! I looked at that documentation just yesterday and I completely missed that!

Sorry. I'm getting old.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Is it possible to disable datetime64 encoding? 308200446
263647433 https://github.com/pydata/xarray/issues/463#issuecomment-263647433 https://api.github.com/repos/pydata/xarray/issues/463 MDEyOklzc3VlQ29tbWVudDI2MzY0NzQzMw== kmpaul 11411331 2016-11-29T17:59:20Z 2016-11-29T17:59:20Z CONTRIBUTOR

Sorry for the delay... I saw the reference and then needed to find some time to read back over the issues to get some context.

You are correct. The PyReshaper was designed to address this type of problem, though not exactly the issue with xarray and dask. It's a pretty common problem, and it's the reason that the CESM developers are moving to long-term archival of time-series files ONLY. (In other words, PyReshaper is being incorporated into the automated CESM run-processes.) ...Of course, one could argue that this step shouldn't be necessary with some clever I/O in the models themselves to write time-series directly.

The PyReshaper opens and closes each time-slice file explicitly before and after each read, respectively. And, if fully scaled (i.e., 1 MPI process per output file), you only ever have 2 files open at a time per process. In this particular operation, the overhead associated with open/close on the input files is negligible compared to the total R/W times.

So, anyway, the PyReshaper (https://github.com/NCAR/PyReshaper) can definitely help...though I consider it a stop-gap for the moment. I'm happy to help people figure out how to get it to work for your problems, if that's a path you want to consider.
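The access pattern described above can be sketched as follows (slices_to_series, open_file, and FakeFile are hypothetical stand-ins, not the PyReshaper API): each time-slice file is opened, its variables copied, and the file closed before the next one is touched, so only one input is open per process at a time.

```python
class FakeFile(dict):
    """Stand-in for a NetCDF file handle: a dict of variable -> value."""
    closed = False
    def close(self):
        self.closed = True

def slices_to_series(paths, open_file):
    """Convert time-slice files into per-variable time series,
    opening and closing each input just-in-time (hypothetical sketch)."""
    series = {}
    for path in paths:
        handle = open_file(path)              # open just-in-time
        try:
            for var, value in handle.items():
                series.setdefault(var, []).append(value)
        finally:
            handle.close()                    # close before the next open
    return series

files = {"t0.nc": FakeFile(T=280.0), "t1.nc": FakeFile(T=281.5)}
out = slices_to_series(files, files.__getitem__)
# out == {"T": [280.0, 281.5]}, and every input handle is closed
```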

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset too many files 94328498

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · About: xarray-datasette