issue_comments
80 rows where author_association = "NONE" and user = 4992424 sorted by updated_at descending
id | html_url | issue_url | node_id | user | created_at | updated_at | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
666422864 | https://github.com/pydata/xarray/issues/2314#issuecomment-666422864 | https://api.github.com/repos/pydata/xarray/issues/2314 | MDEyOklzc3VlQ29tbWVudDY2NjQyMjg2NA== | darothen 4992424 | 2020-07-30T14:52:50Z | 2020-07-30T14:52:50Z | NONE | Hi @shaprann, I haven't re-visited this exact workflow recently, but one really good option (if you can manage the intermediate storage cost) would be to try to use new tools like http://github.com/pangeo-data/rechunker to pre-process and prepare your data archive prior to analysis. |
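(Not part of the original comment: a minimal sketch of the rechunker pattern suggested above, assuming a zarr source; the store paths, chunk sizes, and memory budget are all hypothetical.)

``` python
import zarr
from rechunker import rechunk

# Hypothetical source store written with "timeslice"-style chunking
source = zarr.open('input.zarr')

# Rewrite to long-and-skinny chunks better suited to timeseries analysis
plan = rechunk(source, target_chunks=(8760, 10, 10), max_mem='1GB',
               target_store='rechunked.zarr', temp_store='tmp.zarr')
plan.execute()  # performs the two-pass rechunking (via dask)
```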
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Chunked processing across multiple raster (geoTIF) files 344621749 | |
661953980 | https://github.com/pydata/xarray/issues/1086#issuecomment-661953980 | https://api.github.com/repos/pydata/xarray/issues/1086 | MDEyOklzc3VlQ29tbWVudDY2MTk1Mzk4MA== | darothen 4992424 | 2020-07-21T16:09:25Z | 2020-07-21T16:09:52Z | NONE | Hi @andreall, I'll leave @dcherian or another maintainer to comment on internals of

``` python
import xarray as xr
import pandas as pd
from pathlib import Path
from joblib import delayed, Parallel

dir_input = Path('.')
fns = list(sorted(dir_input.glob('*/' + 'WW3_EUR-11_CCCma-CanESM2_r1i1p1_CLMcom-CCLM4-8-17_v1_6hr_.nc')))

# Helper function to convert NetCDF to CSV with our processing
def _nc_to_csv(fn):
    data_ww3 = xr.open_dataset(fn)
    data_ww3 = data_ww3.isel(latitude=74, longitude=18)
    df_ww3 = data_ww3[['hs', 't02', 't0m1', 't01', 'fp', 'dir', 'spr', 'dp']].to_dataframe()
    # (the original snippet was cut off here; presumably it wrote the
    # dataframe out and returned the new filename, e.g.:)
    out_fn = str(fn).replace('.nc', '.csv')
    df_ww3.to_csv(out_fn)
    return out_fn

# Using joblib.Parallel to distribute my work across whatever resources I have
out_fns = Parallel(n_jobs=-1)(  # Use all cores available here
    delayed(_nc_to_csv)(fn) for fn in fns
)

# Read the CSV files and merge them
dfs = [pd.read_csv(fn) for fn in out_fns]
df_ww3_all = pd.concat(dfs, ignore_index=True)
```

YMMV but this pattern often works for many types of processing applications. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Is there a more efficient way to convert a subset of variables to a dataframe? 187608079 | |
536079602 | https://github.com/pydata/xarray/issues/3349#issuecomment-536079602 | https://api.github.com/repos/pydata/xarray/issues/3349 | MDEyOklzc3VlQ29tbWVudDUzNjA3OTYwMg== | darothen 4992424 | 2019-09-27T20:07:13Z | 2019-09-27T20:07:13Z | NONE | I second @TomNicholas' point... functionality like this would be wonderful to have but where would be the best place for it to live? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Implement polyfit? 499477363 | |
524104485 | https://github.com/pydata/xarray/issues/3213#issuecomment-524104485 | https://api.github.com/repos/pydata/xarray/issues/3213 | MDEyOklzc3VlQ29tbWVudDUyNDEwNDQ4NQ== | darothen 4992424 | 2019-08-22T22:39:21Z | 2019-08-22T22:39:21Z | NONE | Tagging @jeliashi for visibility/collaboration |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
How should xarray use/support sparse arrays? 479942077 | |
485272748 | https://github.com/pydata/xarray/issues/2911#issuecomment-485272748 | https://api.github.com/repos/pydata/xarray/issues/2911 | MDEyOklzc3VlQ29tbWVudDQ4NTI3Mjc0OA== | darothen 4992424 | 2019-04-21T18:32:56Z | 2019-04-21T18:32:56Z | NONE | Hi @tomchor, it's not too difficult to take the readers that you already have and wrap them in such a way that you can interact with them via xarray; you can check out the packages xgcm or xbpch for examples of how this can work in practice. I'm not sure if a more generic reader is within or beyond the scope of the core xarray project, though... although example implementations and writeups would make a great contribution to the community! |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Support from reading unformatted Fortran files 435532136 | |
417175383 | https://github.com/pydata/xarray/issues/2314#issuecomment-417175383 | https://api.github.com/repos/pydata/xarray/issues/2314 | MDEyOklzc3VlQ29tbWVudDQxNzE3NTM4Mw== | darothen 4992424 | 2018-08-30T03:09:41Z | 2018-08-30T03:09:41Z | NONE | Can you provide a |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Chunked processing across multiple raster (geoTIF) files 344621749 | |
372475210 | https://github.com/pydata/xarray/issues/1970#issuecomment-372475210 | https://api.github.com/repos/pydata/xarray/issues/1970 | MDEyOklzc3VlQ29tbWVudDM3MjQ3NTIxMA== | darothen 4992424 | 2018-03-12T21:52:22Z | 2018-03-12T21:52:22Z | NONE | @jhamman What do you think would be involved in fleshing out the integration between xarray and rasterio in order to output cloud-optimized GeoTiffs? I |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
API Design for Xarray Backends 302806158 | |
336634555 | https://github.com/pydata/xarray/issues/1631#issuecomment-336634555 | https://api.github.com/repos/pydata/xarray/issues/1631 | MDEyOklzc3VlQ29tbWVudDMzNjYzNDU1NQ== | darothen 4992424 | 2017-10-14T13:19:58Z | 2017-10-14T13:19:58Z | NONE | Thanks for documenting this @jhamman. I think all the logic is in |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Resample / upsample behavior diverges from pandas 265056503 | |
336001921 | https://github.com/pydata/xarray/issues/1627#issuecomment-336001921 | https://api.github.com/repos/pydata/xarray/issues/1627 | MDEyOklzc3VlQ29tbWVudDMzNjAwMTkyMQ== | darothen 4992424 | 2017-10-12T02:26:05Z | 2017-10-12T02:26:05Z | NONE | Wow, great job @benbovy! With the upcoming move towards Jupyter Lab and a better infrastructure for custom plugins, could this serve as the basis for a "NetCDF Extension" for Jupyter Lab? It would be great if double clicking on a NetCDF file in the JLab file explorer could open up this sort of information, or even a quick and dirty ncview-like plotter. |
{ "total_count": 3, "+1": 3, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
html repr of xarray object (for the notebook) 264747372 | |
334526971 | https://github.com/pydata/xarray/pull/1608#issuecomment-334526971 | https://api.github.com/repos/pydata/xarray/issues/1608 | MDEyOklzc3VlQ29tbWVudDMzNDUyNjk3MQ== | darothen 4992424 | 2017-10-05T16:57:03Z | 2017-10-05T16:57:03Z | NONE | I'm a bit slow on the uptake here, but big 👍 from me. Thanks for catching this bug! |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Fix resample/interpolate for non-upsampling case 262874270 | |
334453965 | https://github.com/pydata/xarray/pull/1608#issuecomment-334453965 | https://api.github.com/repos/pydata/xarray/issues/1608 | MDEyOklzc3VlQ29tbWVudDMzNDQ1Mzk2NQ== | darothen 4992424 | 2017-10-05T12:46:54Z | 2017-10-05T12:46:54Z | NONE | Great catch; do you need any input from me, @jhamman? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Fix resample/interpolate for non-upsampling case 262874270 | |
334224596 | https://github.com/pydata/xarray/issues/1605#issuecomment-334224596 | https://api.github.com/repos/pydata/xarray/issues/1605 | MDEyOklzc3VlQ29tbWVudDMzNDIyNDU5Ng== | darothen 4992424 | 2017-10-04T17:10:02Z | 2017-10-04T17:10:02Z | NONE | (sorry, originally commented from my work account) The tutorial dataset is ~6-hourly, so your operation is a downsampling operation. We don't actually support interpolation on downsampling operations - just aggregations/reductions. Upsampling supports interpolation, since there is no implicit way to estimate data in the gaps at the lower temporal frequency. If you just want to estimate a given field at 15-day intervals, for 00Z on those days, then I think you should use |
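(The method name at the end of the comment was cut off in export; below is a hedged sketch of one way to get instantaneous 15-day values from the ~6-hourly tutorial data without interpolating.)

``` python
import xarray as xr

ds = xr.tutorial.load_dataset('air_temperature')  # ~6-hourly data
# Select every 60th timestep (6 h x 60 = 15 days) rather than resampling;
# downsampling via .resample() would require an aggregation like .mean()
ds_15d = ds.isel(time=slice(0, None, 60))
```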
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Resample interpolate failing on tutorial dataset 262847801 | |
332619692 | https://github.com/pydata/xarray/issues/1596#issuecomment-332619692 | https://api.github.com/repos/pydata/xarray/issues/1596 | MDEyOklzc3VlQ29tbWVudDMzMjYxOTY5Mg== | darothen 4992424 | 2017-09-27T18:49:34Z | 2017-09-27T18:49:34Z | NONE | @willirath Never hurts! |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Equivalent of numpy.insert for DataSet / DataArray? 260912521 | |
332519089 | https://github.com/pydata/xarray/issues/1596#issuecomment-332519089 | https://api.github.com/repos/pydata/xarray/issues/1596 | MDEyOklzc3VlQ29tbWVudDMzMjUxOTA4OQ== | darothen 4992424 | 2017-09-27T13:23:38Z | 2017-09-27T13:23:38Z | NONE | @willirath is your time data equally spaced? If so, you should be able to use the new version of
Should work something like this, assuming each timestep is a daily value on the time axis:

``` python
ds = xr.open_mfdataset("paths/to/my/data.nc")
ds_infilled = ds.resample(time='1D').asfreq()
```

That should get you nans wherever your data is missing. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Equivalent of numpy.insert for DataSet / DataArray? 260912521 | |
331281120 | https://github.com/pydata/xarray/pull/1272#issuecomment-331281120 | https://api.github.com/repos/pydata/xarray/issues/1272 | MDEyOklzc3VlQ29tbWVudDMzMTI4MTEyMA== | darothen 4992424 | 2017-09-21T21:02:39Z | 2017-09-21T21:10:51Z | NONE | @jhamman Ohhh I totally misunderstood the last readout from travis-ci. Dealing with the scipy dependency is easy enough. ~However, another test fails because it uses~
Nevermind, easy solution is just to use other axis-reversal methods :) |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Groupby-like API for resampling 208215185 | |
330910590 | https://github.com/pydata/xarray/pull/1272#issuecomment-330910590 | https://api.github.com/repos/pydata/xarray/issues/1272 | MDEyOklzc3VlQ29tbWVudDMzMDkxMDU5MA== | darothen 4992424 | 2017-09-20T16:41:01Z | 2017-09-20T16:41:01Z | NONE | @jhamman done - caught me right while I was compiling GEOS-Chem, and the merge conflicts were very simple. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Groupby-like API for resampling 208215185 | |
330840457 | https://github.com/pydata/xarray/pull/1272#issuecomment-330840457 | https://api.github.com/repos/pydata/xarray/issues/1272 | MDEyOklzc3VlQ29tbWVudDMzMDg0MDQ1Nw== | darothen 4992424 | 2017-09-20T12:47:08Z | 2017-09-20T12:47:08Z | NONE | @jhamman Think we're good. I deferred 4 small pep8 issues because they're in parts of the codebase which I don't think I ever touched, and I'm worried they're going to screw up the merge. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Groupby-like API for resampling 208215185 | |
330530760 | https://github.com/pydata/xarray/pull/1272#issuecomment-330530760 | https://api.github.com/repos/pydata/xarray/issues/1272 | MDEyOklzc3VlQ29tbWVudDMzMDUzMDc2MA== | darothen 4992424 | 2017-09-19T12:58:34Z | 2017-09-19T12:58:34Z | NONE | @jhamman Gotcha, I'll clean everything up by the end of the week. If that's going to block 0.10.0, let me know and I'll shuffle some things around to prioritize this. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Groupby-like API for resampling 208215185 | |
329227114 | https://github.com/pydata/xarray/pull/1272#issuecomment-329227114 | https://api.github.com/repos/pydata/xarray/issues/1272 | MDEyOklzc3VlQ29tbWVudDMyOTIyNzExNA== | darothen 4992424 | 2017-09-13T16:43:32Z | 2017-09-13T16:43:32Z | NONE | @shoyer fixed. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Groupby-like API for resampling 208215185 | |
329162517 | https://github.com/pydata/xarray/pull/1272#issuecomment-329162517 | https://api.github.com/repos/pydata/xarray/issues/1272 | MDEyOklzc3VlQ29tbWVudDMyOTE2MjUxNw== | darothen 4992424 | 2017-09-13T13:10:04Z | 2017-09-13T13:10:04Z | NONE | Hmmm. Something is really screwy with my feature branch and making the task of cleaning up the merge difficult. I'll work on fixing this. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Groupby-like API for resampling 208215185 | |
329039697 | https://github.com/pydata/xarray/pull/1272#issuecomment-329039697 | https://api.github.com/repos/pydata/xarray/issues/1272 | MDEyOklzc3VlQ29tbWVudDMyOTAzOTY5Nw== | darothen 4992424 | 2017-09-13T02:34:21Z | 2017-09-13T02:34:21Z | NONE | Try refreshing? Latest commit is 7a767d8 and has all these changes plus some more tweaks. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Groupby-like API for resampling 208215185 | |
328724595 | https://github.com/pydata/xarray/issues/1279#issuecomment-328724595 | https://api.github.com/repos/pydata/xarray/issues/1279 | MDEyOklzc3VlQ29tbWVudDMyODcyNDU5NQ== | darothen 4992424 | 2017-09-12T03:29:29Z | 2017-09-12T03:29:29Z | NONE | @shoyer - This output is usually provided as a sequence of daily netCDF files, each on a ~2 degree global grid with 24 timesteps per file (so shape 24 x 96 x 144). For convenience, I usually concatenate these files into yearly datasets, so they'll have a shape (8736 x 96 x 144). I haven't played too much with how to chunk the data, but it's not uncommon for me to load 20-50 of these files simultaneously (each holding a year's worth of data) and treat each year as an "ensemble member" dimension, so my data has shape (50 x 8736 x 96 x 144). Yes, keeping everything in dask array land is preferable, I suppose. @jhamman - Wow, that worked pretty much perfectly! There's a handful of typos (you switch from "a" to "x" halfway through), and there's a lot of room for optimization by chunksize. But it just works, which is absolutely ridiculous. I just pushed a ~200 GB dataset on my cluster with ~50 cores and it screamed through the calculation. Is there any way this could be pushed before 0.10.0? It's a killer enhancement. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Rolling window operation does not work with dask arrays 208903781 | |
328314676 | https://github.com/pydata/xarray/issues/1279#issuecomment-328314676 | https://api.github.com/repos/pydata/xarray/issues/1279 | MDEyOklzc3VlQ29tbWVudDMyODMxNDY3Ng== | darothen 4992424 | 2017-09-10T02:04:33Z | 2017-09-10T02:04:33Z | NONE | In light of #1489 is there a way to move forward here with In soliciting the atmospheric chemistry community for a few illustrative examples for gcpy, it's become apparent that indices computed from re-sampled timeseries would be killer, attention-grabbing functionality. For instance, the EPA air quality standard we use for ozone involves taking hourly data, computing 8-hour rolling means for each day of your dataset, and then picking the maximum of those means for each day ("MDA8 ozone"). Similar metrics exist for other pollutants. With traditional xarray data-structures, it's trivial to compute this quantity (assuming we have hourly data and using the new resample API from #1272):
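(The code block that followed in the original comment was lost in export; below is a hedged reconstruction of an MDA8-style computation using the rolling and resample APIs, with synthetic data standing in for real model output.)

``` python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic hourly "ozone" series standing in for real model output
time = pd.date_range('2017-07-01', periods=24 * 30, freq='H')
o3 = xr.DataArray(np.random.rand(time.size), coords={'time': time}, dims='time')

# 8-hour rolling means, then the maximum of those means for each day
mda8 = o3.rolling(time=8).mean().resample(time='24H').max()
```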
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Rolling window operation does not work with dask arrays 208903781 | |
326689304 | https://github.com/pydata/xarray/pull/1272#issuecomment-326689304 | https://api.github.com/repos/pydata/xarray/issues/1272 | MDEyOklzc3VlQ29tbWVudDMyNjY4OTMwNA== | darothen 4992424 | 2017-09-01T21:38:18Z | 2017-09-01T21:38:18Z | NONE | Resolved to drop auxiliary coordinates which are defined along the dimension to be re-sampled. This makes sense; if someone wants them to be interpolated or manipulated in some way, then they should promote them from coordinates to variables before doing the resampling. In response to #1328,
Final review, @shoyer, before merging in anticipation of 0.10.0? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Groupby-like API for resampling 208215185 | |
325974604 | https://github.com/pydata/xarray/issues/486#issuecomment-325974604 | https://api.github.com/repos/pydata/xarray/issues/486 | MDEyOklzc3VlQ29tbWVudDMyNTk3NDYwNA== | darothen 4992424 | 2017-08-30T12:26:07Z | 2017-08-30T12:26:07Z | NONE | @ocefpaf Awesome, good to know that hurdle has already been leaped :) |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
API for multi-dimensional resampling/regridding 96211612 | |
325969302 | https://github.com/pydata/xarray/issues/486#issuecomment-325969302 | https://api.github.com/repos/pydata/xarray/issues/486 | MDEyOklzc3VlQ29tbWVudDMyNTk2OTMwMg== | darothen 4992424 | 2017-08-30T12:01:29Z | 2017-08-30T12:01:29Z | NONE | If ESMF is the way to go, then some effort needs to be made to build conda recipes and other infrastructure for distributing and building the platform. It's a heavy dependency to haul around. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
API for multi-dimensional resampling/regridding 96211612 | |
325777712 | https://github.com/pydata/xarray/issues/1534#issuecomment-325777712 | https://api.github.com/repos/pydata/xarray/issues/1534 | MDEyOklzc3VlQ29tbWVudDMyNTc3NzcxMg== | darothen 4992424 | 2017-08-29T19:42:24Z | 2017-08-29T19:42:24Z | NONE | @mmartini-usgs, an entire netCDF file (as long as it only has 1 group, which it most likely does if we're talking about standard atmospheric/oceanic data) would be the equivalent of an To start with, you should read in your data using the chunks keyword to
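(The snippet the comment introduces here was truncated in export; a minimal sketch with a hypothetical filename and chunk size follows.)

``` python
import xarray as xr

# Choose chunk sizes to match your file's dimensions
ds = xr.open_dataset('adcp_burst_data.nc', chunks={'time': 100000})
```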
You'd have to choose chunks based on the dimensions of your data. Like @rabernat previously mentioned, it's very likely you can perform your entire workflow within xarray without ever having to drop down to pandas; let us know if you can share more details. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
to_dataframe (pandas) usage question 253407851 | |
325494110 | https://github.com/pydata/xarray/issues/1535#issuecomment-325494110 | https://api.github.com/repos/pydata/xarray/issues/1535 | MDEyOklzc3VlQ29tbWVudDMyNTQ5NDExMA== | darothen 4992424 | 2017-08-28T21:52:54Z | 2017-08-28T21:52:54Z | NONE | Great; there's only a single action item left on #1272, so I'll try to get to that later this week. |
{ "total_count": 1, "+1": 0, "-1": 0, "laugh": 1, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
v0.10 Release 253463226 | |
323539716 | https://github.com/pydata/xarray/pull/1272#issuecomment-323539716 | https://api.github.com/repos/pydata/xarray/issues/1272 | MDEyOklzc3VlQ29tbWVudDMyMzUzOTcxNg== | darothen 4992424 | 2017-08-19T18:24:29Z | 2017-08-19T18:24:29Z | NONE | All set except for my one question to @shoyer above. I've opted not to include a chart outlining the various upsampling options... couldn't really think of a nice and clean way to do so, because adding it to the time series doc page ends up being really ugly and there isn't quite enough substance for its own worked example page. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Groupby-like API for resampling 208215185 | |
320297159 | https://github.com/pydata/xarray/pull/1272#issuecomment-320297159 | https://api.github.com/repos/pydata/xarray/issues/1272 | MDEyOklzc3VlQ29tbWVudDMyMDI5NzE1OQ== | darothen 4992424 | 2017-08-04T16:45:56Z | 2017-08-19T18:23:06Z | NONE | Okay, it was a bit of effort but I implemented upsampling. For the padding methods I just re-index the Dataset or DataArray using the re-sampled time frequencies. I also added interpolation, but that was a bit tricky; we have to sort of break the split-apply-combine idiom to do that, so I created a
The padding methods work 100% with dask arrays - since we're just calling xarray methods which themselves work with dask arrays! There are some eager computations (just the calculation of the up-sampled time frequencies) but I don't think that's a major issue; the actual re-indexing/padding is deferred. Interpolation works with dask arrays too, but eagerly does the computations. Could use a review from @shoyer or @jhamman. New TODO list:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Groupby-like API for resampling 208215185 | |
323105006 | https://github.com/pydata/xarray/issues/1509#issuecomment-323105006 | https://api.github.com/repos/pydata/xarray/issues/1509 | MDEyOklzc3VlQ29tbWVudDMyMzEwNTAwNg== | darothen 4992424 | 2017-08-17T15:20:22Z | 2017-08-17T15:20:22Z | NONE | @betaplane a re-factoring of the |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Unexpected behavior with DataArray.resample(how='sum') in presence of NaNs 250751931 | |
321245721 | https://github.com/pydata/xarray/issues/1505#issuecomment-321245721 | https://api.github.com/repos/pydata/xarray/issues/1505 | MDEyOklzc3VlQ29tbWVudDMyMTI0NTcyMQ== | darothen 4992424 | 2017-08-09T12:50:40Z | 2017-08-09T12:50:40Z | NONE | How exactly is your WRF output split? It's not clear exactly what you want to do... is it split along different tiles such that indices [1, ..., m] are in
I'm not sure that |
{ "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 1, "rocket": 0, "eyes": 0 } |
A problem about xarray.concat 248942085 | |
316404161 | https://github.com/pydata/xarray/pull/1272#issuecomment-316404161 | https://api.github.com/repos/pydata/xarray/issues/1272 | MDEyOklzc3VlQ29tbWVudDMxNjQwNDE2MQ== | darothen 4992424 | 2017-07-19T14:24:38Z | 2017-08-04T16:39:53Z | NONE | TODO
Alright @jhamman, here's the complete list of work left here. I'll tackle some of it during my commutes this week. |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Groupby-like API for resampling 208215185 | |
319988645 | https://github.com/pydata/xarray/pull/1272#issuecomment-319988645 | https://api.github.com/repos/pydata/xarray/issues/1272 | MDEyOklzc3VlQ29tbWVudDMxOTk4ODY0NQ== | darothen 4992424 | 2017-08-03T14:39:04Z | 2017-08-03T14:39:04Z | NONE | Finished off everything except upsampling. In pandas, all upsampling works by constructing a new time index (which we already do) and then filling in the NaNs that result in the dataset with one of a few different rules. Not sure how involved this will be, but I anticipate this can all be implemented in core/resample.py |
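(For reference, a short pandas illustration of that upsampling pattern — construct the new index, then pick a fill rule for the resulting NaNs; the data here are made up.)

``` python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0],
              index=pd.date_range('2017-08-01', periods=3, freq='D'))

s.resample('6H').asfreq()        # new 6-hourly index, gaps left as NaN
s.resample('6H').ffill()         # pad/forward-fill the gaps
s.resample('6H').interpolate()   # linearly interpolate the gaps
```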
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Groupby-like API for resampling 208215185 | |
318079611 | https://github.com/pydata/xarray/issues/1490#issuecomment-318079611 | https://api.github.com/repos/pydata/xarray/issues/1490 | MDEyOklzc3VlQ29tbWVudDMxODA3OTYxMQ== | darothen 4992424 | 2017-07-26T14:57:58Z | 2017-07-26T14:57:58Z | NONE | Did some digging. Note here that the dtypes of
But, if we directly print its values, we get something slightly different:
The difference is that the timezone delta has been automatically added in terms of hours to each value in
Both
But also, the type of
So what happens is that the resulting
One solution would be to catch this potential glitch in either |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Resample not working when time coordinate is timezone aware 245649333 | |
316398830 | https://github.com/pydata/xarray/pull/1272#issuecomment-316398830 | https://api.github.com/repos/pydata/xarray/issues/1272 | MDEyOklzc3VlQ29tbWVudDMxNjM5ODgzMA== | darothen 4992424 | 2017-07-19T14:07:00Z | 2017-07-19T14:07:00Z | NONE | I did my best to re-base everything to master... plan on spending an hour or so figuring out what's broken and at least restoring the status quo. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Groupby-like API for resampling 208215185 | |
316377854 | https://github.com/pydata/xarray/issues/1483#issuecomment-316377854 | https://api.github.com/repos/pydata/xarray/issues/1483 | MDEyOklzc3VlQ29tbWVudDMxNjM3Nzg1NA== | darothen 4992424 | 2017-07-19T12:59:04Z | 2017-07-19T12:59:04Z | NONE | Instead of computing the mean over your non-stacked dimension by
why not just instead call
so that you just collapse the time dimension and preserve the attributes on your data? Then you can |
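(The two method calls in this comment were truncated in export; a hedged sketch of the suggested pattern follows, with made-up variable and dimension names.)

``` python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {'t2m': (('time', 'lat', 'lon'), np.random.rand(4, 3, 2), {'units': 'K'})}
)

# Reduce directly over the dimension instead of stack + groupby + apply;
# keep_attrs=True preserves the metadata on the result
mean = ds['t2m'].mean(dim='time', keep_attrs=True)
print(mean.attrs)  # {'units': 'K'}
```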
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Loss of coordinate information from groupby.apply() on a stacked object 244016361 | |
316376598 | https://github.com/pydata/xarray/issues/1482#issuecomment-316376598 | https://api.github.com/repos/pydata/xarray/issues/1482 | MDEyOklzc3VlQ29tbWVudDMxNjM3NjU5OA== | darothen 4992424 | 2017-07-19T12:54:30Z | 2017-07-19T12:54:30Z | NONE | @mitar it depends on your data/application, right? But that information would also be helpful in figuring out alternative pathways. If you're always going to process the images individually or sequentially, then what advantage is there (aside from convenience) of dumping them in some giant array with forced dimensions/shape per slice? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Support for jagged array 243964948 | |
316371416 | https://github.com/pydata/xarray/issues/1482#issuecomment-316371416 | https://api.github.com/repos/pydata/xarray/issues/1482 | MDEyOklzc3VlQ29tbWVudDMxNjM3MTQxNg== | darothen 4992424 | 2017-07-19T12:34:32Z | 2017-07-19T12:34:32Z | NONE | The problem is that these sorts of arrays break the common data model on top of which xarray (and NetCDF) is built.
Yes, if you can pre-process all the images and align them on some common set of dimensions (maybe just xi and yi, denoting integer index in the x and y directions), and pad unused space for each image with NaNs, then you could concatenate everything into a |
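(A minimal sketch of that pad-and-concatenate idea — the final word of the comment was truncated, presumably a Dataset or DataArray; shapes and names here are illustrative.)

``` python
import numpy as np
import xarray as xr

def as_image(arr):
    ny, nx = arr.shape
    return xr.DataArray(arr, dims=('yi', 'xi'),
                        coords={'yi': np.arange(ny), 'xi': np.arange(nx)})

a = as_image(np.random.rand(3, 4))
b = as_image(np.random.rand(2, 5))

# Default outer-join alignment pads the non-overlapping space with NaN
stacked = xr.concat([a, b], dim='image')
print(stacked.shape)  # (2, 3, 5)
```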
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Support for jagged array 243964948 | |
315355743 | https://github.com/pydata/xarray/pull/1272#issuecomment-315355743 | https://api.github.com/repos/pydata/xarray/issues/1272 | MDEyOklzc3VlQ29tbWVudDMxNTM1NTc0Mw== | darothen 4992424 | 2017-07-14T13:10:22Z | 2017-07-14T13:10:22Z | NONE | I think a pull against the new releases is critical to see what breaks. Beyond that, just code clean up and testing. I can try to bump this higher on my priority list. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Groupby-like API for resampling 208215185 | |
313106392 | https://github.com/pydata/xarray/issues/1354#issuecomment-313106392 | https://api.github.com/repos/pydata/xarray/issues/1354 | MDEyOklzc3VlQ29tbWVudDMxMzEwNjM5Mg== | darothen 4992424 | 2017-07-05T13:41:56Z | 2017-07-05T13:41:56Z | NONE | @wqshen, a workaround until a more complete modification to |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
concat automagically outer-joins coordinates 219692578 | |
308914123 | https://github.com/pydata/xarray/issues/1447#issuecomment-308914123 | https://api.github.com/repos/pydata/xarray/issues/1447 | MDEyOklzc3VlQ29tbWVudDMwODkxNDEyMw== | darothen 4992424 | 2017-06-16T02:14:31Z | 2017-06-16T02:14:31Z | NONE | For xbpch I followed a similar naming convention based on @rabernat's xmitgcm. Brewing on the horizon is an xarray-powered toolkit for GEOS-Chem and while it'll be a stand-alone library, I imagine it'll belong to this confederation of toolkits and provide an accessor or two for computing model grid geometries and related things on-the-fly. I'd also +1 for an |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Package naming "conventions" for xarray extensions 234658224 | |
305178905 | https://github.com/pydata/xarray/issues/1192#issuecomment-305178905 | https://api.github.com/repos/pydata/xarray/issues/1192 | MDEyOklzc3VlQ29tbWVudDMwNTE3ODkwNQ== | darothen 4992424 | 2017-05-31T12:59:52Z | 2017-05-31T12:59:52Z | NONE | Not to hijack the thread, but @PeterDSteinberg - this is the first I've heard of earthio and I think there would be a lot of interest from the broader atmospheric/oceanic sciences community to hear what your plans are. Could your team do a blog post on Continuum sometime outlining the goals of the project? |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Implementing dask.array.coarsen in xarrays 198742089 | |
304107683 | https://github.com/pydata/xarray/issues/470#issuecomment-304107683 | https://api.github.com/repos/pydata/xarray/issues/470 | MDEyOklzc3VlQ29tbWVudDMwNDEwNzY4Mw== | darothen 4992424 | 2017-05-25T19:57:22Z | 2017-05-25T19:57:22Z | NONE | This certainly could be useful, but since this is essentially plotting a vector of data, why not just drop into pandas?

```
df = da.to_dataframe()
# Could reset coordinates if you really wanted
df = df.reset_index()
df.plot.scatter('longitude', 'latitude', c=da.name)
```

Patching in this rough functionality into the plotting module should be really straightforward, maybe @jhamman has some tips? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
add scatter plot method to dataset 94787306 | |
301489242 | https://github.com/pydata/xarray/issues/1279#issuecomment-301489242 | https://api.github.com/repos/pydata/xarray/issues/1279 | MDEyOklzc3VlQ29tbWVudDMwMTQ4OTI0Mg== | darothen 4992424 | 2017-05-15T14:18:55Z | 2017-05-15T14:18:55Z | NONE | Dask dataframes have recently been updated so that rolling operations work (dask/dask#2198). Does this open a pathway to enable rolling on dask arrays within xarray? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Rolling window operation does not work with dask arrays 208903781 | |
300462962 | https://github.com/pydata/xarray/issues/1391#issuecomment-300462962 | https://api.github.com/repos/pydata/xarray/issues/1391 | MDEyOklzc3VlQ29tbWVudDMwMDQ2Mjk2Mg== | darothen 4992424 | 2017-05-10T12:11:56Z | 2017-05-10T12:11:56Z | NONE | @klapo! Great to see you here! Happy to iterate with you on documenting this functionality. For reference, I wrote a package for my dissertation work to help automate the task of constructing multi-dimensional Datasets which include dimensions corresponding to experimental/ensemble factors. One of my on-going projects is to actually fully abstract this (I have a not-uploaded branch of the project which tries to build the notion of an "EnsembleDataset", which has the same relationship to a Dataset that a pandas Panel used to have to a DataFrame). |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Adding Example/Tutorial of importing data to Xarray (Merge/conact/etc) 225536793 | |
299194997 | https://github.com/pydata/xarray/issues/1397#issuecomment-299194997 | https://api.github.com/repos/pydata/xarray/issues/1397 | MDEyOklzc3VlQ29tbWVudDI5OTE5NDk5Nw== | darothen 4992424 | 2017-05-04T14:05:48Z | 2017-05-04T14:05:48Z | NONE | Cool; please keep me in the loop if you don't mind, because I also have an application which I'd really like to just be able use the built-in faceting for rather than building my plot grids manually. A good comparison case is to perform the same plots (with the same set aspect/size/ratio at both the figure and subplot level) but just don't use the Cartopy transformations. In these cases, I have all the control that I would expect. There are also important differences between |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Changing projections under plot() 225846258 | |
299191499 | https://github.com/pydata/xarray/issues/1397#issuecomment-299191499 | https://api.github.com/repos/pydata/xarray/issues/1397 | MDEyOklzc3VlQ29tbWVudDI5OTE5MTQ5OQ== | darothen 4992424 | 2017-05-04T13:53:09Z | 2017-05-04T13:53:09Z | NONE | @fmaussion What happens if you add I'm tempted to have us move this discussion to StackOverflow (for heightened visibility), but I suspect there might actually be a bug somewhere in the finalization of the faceting that undoes the specifications you pass to the initial subplot constructor. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Changing projections under plot() 225846258 | |
299056235 | https://github.com/pydata/xarray/issues/1397#issuecomment-299056235 | https://api.github.com/repos/pydata/xarray/issues/1397 | MDEyOklzc3VlQ29tbWVudDI5OTA1NjIzNQ== | darothen 4992424 | 2017-05-03T22:43:55Z | 2017-05-03T22:43:55Z | NONE |
You just need to pass the "pad" argument to
The trickier problem is that sometimes cartopy can be a bit unpredictable in controlling the size and aspect ratio of axes after you've plotted maps on them. You can force a plot to respect the aspect ratio you use when you construct an axis by using the keyword |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Changing projections under plot() 225846258 | |
294829429 | https://github.com/pydata/xarray/pull/1356#issuecomment-294829429 | https://api.github.com/repos/pydata/xarray/issues/1356 | MDEyOklzc3VlQ29tbWVudDI5NDgyOTQyOQ== | darothen 4992424 | 2017-04-18T12:53:01Z | 2017-04-18T12:53:01Z | NONE | Alrighty, patched and ready for a final look-over! I appreciate the help and patience, @shoyer! |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Add DatetimeAccessor for accessing datetime fields via `.dt` attribute 220011864 | |
294628295 | https://github.com/pydata/xarray/pull/1356#issuecomment-294628295 | https://api.github.com/repos/pydata/xarray/issues/1356 | MDEyOklzc3VlQ29tbWVudDI5NDYyODI5NQ== | darothen 4992424 | 2017-04-17T23:44:08Z | 2017-04-17T23:44:08Z | NONE | Turns out it was easy enough to add an accessor for |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Add DatetimeAccessor for accessing datetime fields via `.dt` attribute 220011864 | |
294520064 | https://github.com/pydata/xarray/pull/1356#issuecomment-294520064 | https://api.github.com/repos/pydata/xarray/issues/1356 | MDEyOklzc3VlQ29tbWVudDI5NDUyMDA2NA== | darothen 4992424 | 2017-04-17T16:21:19Z | 2017-04-17T16:21:19Z | NONE | There's a test-case relating to #367 (test_virtual_variable_same_name) which is causing me a bit of grief as I re-factor the virtual variable logic. Should we really be able to access variables like
Two options for fixing:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Add DatetimeAccessor for accessing datetime fields via `.dt` attribute 220011864 | |
293881177 | https://github.com/pydata/xarray/pull/1356#issuecomment-293881177 | https://api.github.com/repos/pydata/xarray/issues/1356 | MDEyOklzc3VlQ29tbWVudDI5Mzg4MTE3Nw== | darothen 4992424 | 2017-04-13T12:26:24Z | 2017-04-13T12:26:24Z | NONE | Finished clean-up, added some documentation, etc. I mangled resolving a merge conflict with my update to
With regard to the virtual variables, I think some more thinking is necessary so we can come up with a plan of approach. Do we want to deprecate this feature entirely? Do we just want to wrap the datetime component virtual variables to the |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Add DatetimeAccessor for accessing datetime fields via `.dt` attribute 220011864 | |
293280073 | https://github.com/pydata/xarray/pull/1356#issuecomment-293280073 | https://api.github.com/repos/pydata/xarray/issues/1356 | MDEyOklzc3VlQ29tbWVudDI5MzI4MDA3Mw== | darothen 4992424 | 2017-04-11T14:25:27Z | 2017-04-11T14:25:27Z | NONE | Updated with support for multi-dimensional time data stored as dask array. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Add DatetimeAccessor for accessing datetime fields via `.dt` attribute 220011864 | |
292930254 | https://github.com/pydata/xarray/issues/1352#issuecomment-292930254 | https://api.github.com/repos/pydata/xarray/issues/1352 | MDEyOklzc3VlQ29tbWVudDI5MjkzMDI1NA== | darothen 4992424 | 2017-04-10T12:06:52Z | 2017-04-10T12:07:03Z | NONE | Yeah, I tend to agree, there should be some sort of auto-magic happening. But, I can think of at least two options:
I use workflows where I concatenate things like multiple ensemble members into a single file, and I wind up with this pattern all the time. I usually just |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Saving to netCDF with 0D dimension doesn't work 219321876 | |
292926691 | https://github.com/pydata/xarray/issues/1352#issuecomment-292926691 | https://api.github.com/repos/pydata/xarray/issues/1352 | MDEyOklzc3VlQ29tbWVudDI5MjkyNjY5MQ== | darothen 4992424 | 2017-04-10T11:48:37Z | 2017-04-10T11:48:37Z | NONE | @andreas-h you can drop the 0D dimensions:
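(The snippet here was truncated in export; squeeze() is one plausible reading of the suggestion, for length-1 dimensions.)

``` python
import numpy as np
import xarray as xr

ds = xr.Dataset({'x': (('one', 'time'), np.zeros((1, 5)))})
ds = ds.squeeze('one', drop=True)  # drop the length-1 dimension entirely
ds.to_netcdf('out.nc')
```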
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Saving to netCDF with 0D dimension doesn't work 219321876 | |
292569100 | https://github.com/pydata/xarray/pull/1356#issuecomment-292569100 | https://api.github.com/repos/pydata/xarray/issues/1356 | MDEyOklzc3VlQ29tbWVudDI5MjU2OTEwMA== | darothen 4992424 | 2017-04-07T15:30:43Z | 2017-04-07T15:30:43Z | NONE | @shoyer I corrected things based on your comments. The last commit is an attempt to refactor things to match the way that methods like rolling/groupby functions are injected into the class; this might be totally superfluous here, but I thought it was worth trying. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Add DatetimeAccessor for accessing datetime fields via `.dt` attribute 220011864 | |
291568964 | https://github.com/pydata/xarray/issues/358#issuecomment-291568964 | https://api.github.com/repos/pydata/xarray/issues/358 | MDEyOklzc3VlQ29tbWVudDI5MTU2ODk2NA== | darothen 4992424 | 2017-04-04T17:14:18Z | 2017-04-04T17:14:18Z | NONE | Proof of concept, borrowing liberally from pandas. I think this will be pretty straightforward to hook up into xarray. I wonder, is there any way to register such an accessor with |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
add .dt and .str accessors to DataArray (like pandas.Series) 59720901 | |
291228898 | https://github.com/pydata/xarray/issues/358#issuecomment-291228898 | https://api.github.com/repos/pydata/xarray/issues/358 | MDEyOklzc3VlQ29tbWVudDI5MTIyODg5OA== | darothen 4992424 | 2017-04-03T18:20:32Z | 2017-04-03T18:20:32Z | NONE | Working on a project today which would greatly benefit from having the .dt accessors. Given that this issue is nearly two years old, any thoughts on what it would take to resolve in the present codebase? Still as straightforward as wrappers on the pandas time series methods? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
add .dt and .str accessors to DataArray (like pandas.Series) 59720901 | |
290133148 | https://github.com/pydata/xarray/issues/1092#issuecomment-290133148 | https://api.github.com/repos/pydata/xarray/issues/1092 | MDEyOklzc3VlQ29tbWVudDI5MDEzMzE0OA== | darothen 4992424 | 2017-03-29T15:47:57Z | 2017-03-29T15:48:17Z | NONE | Ah, thanks for the heads-up @benbovy! I see the difference now, and I agree both approaches could co-exist. I may play around with building some of your proposed |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Dataset groups 187859705 | |
290106782 | https://github.com/pydata/xarray/issues/1092#issuecomment-290106782 | https://api.github.com/repos/pydata/xarray/issues/1092 | MDEyOklzc3VlQ29tbWVudDI5MDEwNjc4Mg== | darothen 4992424 | 2017-03-29T14:26:15Z | 2017-03-29T14:26:15Z | NONE | Would the domain for this just be to simulate the tree-like structure that NetCDF permits, or could it extend to multiple datasets on disk? One of the ideas that we had during the aospy hackathon involved some sort of idiom based on xarray for packing multiple, similar datasets together. For instance, it's very common in climate science to re-run a model multiple times nearly identically, but changing a parameter or boundary condition. So you end up with large archives of data on disk which are identical in shape and metadata, and you want to be able to quickly analyze across them. As an example, I built a helper tool during my dissertation to automate much of this, allowing you to dump your processed output in some sort of directory structure and consistent naming scheme, and then easily ingest what you need for a given analysis. It's actually working great for a much larger, Monte Carlo set of model simulations right now (3 factor levels with 3-5 values at each level, for a total of 1500 years of simulation). My tool works by concatenating each experimental factor as a new dimension, which lets you use xarray's selection tools to perform analyses across the ensemble. You can pre-process things before concatenating too, if the data ends up being too big to fit in memory (e.g. for every simulation in the experiment, compute time-zonal averages before concatenation). Going back to @shoyer's comment, it still seems as though there is room to build some sort of collection of |
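(A minimal sketch of the concatenate-each-factor-as-a-new-dimension pattern described above; the filenames and factor name are hypothetical.)

``` python
import pandas as pd
import xarray as xr

# Hypothetical archive: one processed file per value of an experimental factor
factor_values = [0.1, 0.3, 1.0]
runs = [xr.open_dataset(f'run_alpha_{v}.nc') for v in factor_values]

# Concatenate along a brand-new 'alpha' dimension so xarray's selection
# tools (.sel, .isel) work across the whole ensemble
ens = xr.concat(runs, dim=pd.Index(factor_values, name='alpha'))
subset = ens.sel(alpha=0.3)
```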
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Dataset groups 187859705 | |
289106737 | https://github.com/pydata/xarray/issues/1327#issuecomment-289106737 | https://api.github.com/repos/pydata/xarray/issues/1327 | MDEyOklzc3VlQ29tbWVudDI4OTEwNjczNw== | darothen 4992424 | 2017-03-24T18:25:40Z | 2017-03-24T18:25:40Z | NONE | I saw your PR #1328 on this, but just a heads up that there is an open issue #1269 and pull-request #1272 to re-factor the resampling API to match the GroupBy-like API used by pandas. I've been extremely busy but can try to carve out some more time in the near future to settle some remaining issues on that PR, which would resolve this issue too. |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Add 'count' as option for how in dataset resample 216833414 | |
283448614 | https://github.com/pydata/xarray/pull/1272#issuecomment-283448614 | https://api.github.com/repos/pydata/xarray/issues/1272 | MDEyOklzc3VlQ29tbWVudDI4MzQ0ODYxNA== | darothen 4992424 | 2017-03-01T19:46:46Z | 2017-03-01T19:46:46Z | NONE | Should
As written, non-aggregation ("transformation"?) doesn't work because the call in |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Groupby-like API for resampling 208215185 | |
281208031 | https://github.com/pydata/xarray/pull/1272#issuecomment-281208031 | https://api.github.com/repos/pydata/xarray/issues/1272 | MDEyOklzc3VlQ29tbWVudDI4MTIwODAzMQ== | darothen 4992424 | 2017-02-20T23:51:01Z | 2017-02-20T23:51:01Z | NONE | Thanks for the feedback, @shoyer! Will circle back around to continue working on this in a few days when I have some free time.
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Groupby-like API for resampling 208215185 | |
281186680 | https://github.com/pydata/xarray/pull/1272#issuecomment-281186680 | https://api.github.com/repos/pydata/xarray/issues/1272 | MDEyOklzc3VlQ29tbWVudDI4MTE4NjY4MA== | darothen 4992424 | 2017-02-20T21:36:06Z | 2017-02-20T21:36:06Z | NONE | Smoothed out most of the problems and missing details from earlier. Still not sure if it's wise to refactor most of the resampling logic into a new resample.py, like what was done with rolling, but it still makes some sense to keep things in groupby.py because we're just subclassing existing machinery from there. The only issue now is the signature for `__init__()` in |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Groupby-like API for resampling 208215185 | |
280663975 | https://github.com/pydata/xarray/issues/1273#issuecomment-280663975 | https://api.github.com/repos/pydata/xarray/issues/1273 | MDEyOklzc3VlQ29tbWVudDI4MDY2Mzk3NQ== | darothen 4992424 | 2017-02-17T14:28:21Z | 2017-02-17T14:28:21Z | NONE | +1 from me; adding this as a method on |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
replace a dim with a coordinate from another dataset 208312826 | |
280104546 | https://github.com/pydata/xarray/issues/1269#issuecomment-280104546 | https://api.github.com/repos/pydata/xarray/issues/1269 | MDEyOklzc3VlQ29tbWVudDI4MDEwNDU0Ng== | darothen 4992424 | 2017-02-15T18:59:17Z | 2017-02-15T18:59:17Z | NONE | @MaximilianR Oh, the interface is easy enough to do, even maintaining backwards-compatibility (already have that working). I was considering going the route done with GroupBy and the classes that compose it, like DatasetGroupBy... basically, we just record the wanted resampling dimension and inject the grouping/resampling operations we want. Also adds the ability to specialize methods like
But.... if there's a simpler way, that might be preferable! |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
GroupBy like API for resample 207587161 | |
279845588 | https://github.com/pydata/xarray/issues/1269#issuecomment-279845588 | https://api.github.com/repos/pydata/xarray/issues/1269 | MDEyOklzc3VlQ29tbWVudDI3OTg0NTU4OA== | darothen 4992424 | 2017-02-14T21:44:11Z | 2017-02-14T21:44:11Z | NONE | Assuming we want to stick with
or else |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
GroupBy like API for resample 207587161 | |
279810604 | https://github.com/pydata/xarray/issues/1269#issuecomment-279810604 | https://api.github.com/repos/pydata/xarray/issues/1269 | MDEyOklzc3VlQ29tbWVudDI3OTgxMDYwNA== | darothen 4992424 | 2017-02-14T19:32:01Z | 2017-02-14T19:32:01Z | NONE | Let me dig into this a bit right now. My analysis project for this afternoon was already going to require digging into pandas' resampling in more depth anyways. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
GroupBy like API for resample 207587161 | |
243124532 | https://github.com/pydata/xarray/issues/988#issuecomment-243124532 | https://api.github.com/repos/pydata/xarray/issues/988 | MDEyOklzc3VlQ29tbWVudDI0MzEyNDUzMg== | darothen 4992424 | 2016-08-29T13:32:11Z | 2016-08-29T13:32:11Z | NONE | I definitely see the logic with regards to encouraging users to use a context manager, and from the perspective of someone building a third-party library on top of xarray it would be fine. However, I think that from the perspective of an end-user (for example, a scientist) crunching numbers and analyzing data with xarray simply as a convenience library, this produces much too obfuscated code - a standard library import (
I think your earlier proposal of an |
{ "total_count": 3, "+1": 3, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Hooks for custom attribute handling in xarray operations 173612265 | |
242912131 | https://github.com/pydata/xarray/issues/987#issuecomment-242912131 | https://api.github.com/repos/pydata/xarray/issues/987 | MDEyOklzc3VlQ29tbWVudDI0MjkxMjEzMQ== | darothen 4992424 | 2016-08-27T11:34:28Z | 2016-08-27T11:34:28Z | NONE | @joonro, I think there's a strong case to be made about returning a
It might be more prudent to add this attribute whenever we apply these operations to a
I can whip up a working example/pull request if people think this is a direction to go. I'd probably build a decorator which handles inspection of the operator name and arguments and uses that to add the cell_methods attribute, that way people can add the same functionality to homegrown methods/operators. |
{ "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Return a scalar instead of DataArray when the return value is a scalar 173494017 | |
224049602 | https://github.com/pydata/xarray/issues/463#issuecomment-224049602 | https://api.github.com/repos/pydata/xarray/issues/463 | MDEyOklzc3VlQ29tbWVudDIyNDA0OTYwMg== | darothen 4992424 | 2016-06-06T18:42:06Z | 2016-06-06T18:42:06Z | NONE | @mangecoeur, although it's not an xarray-based solution, I've found that by far the best solution to this problem is to transform your dataset from the "timeslice" format (which is convenient for models to write out - all the data at a given point in time, often in separate files for each time step) to "timeseries" format - a continuous format, where you have all the data for a single variable in a single (or much smaller collection of) files. NCAR published a great utility for converting batches of NetCDF output from timeslice to timeseries format here; it's significantly faster than any shell-script/CDO/NCO solution I've ever encountered, and it parallelizes extremely easily. Adding a simple post-processing step to convert my simulation output to timeseries format dramatically reduced my overall work time. Before, I had a separate handler which re-implemented open_mfdataset(), performed an intermediate reduction (usually extracting a variable), and then concatenated within xarray. This could get around the open file limit, but it wasn't fast. My pre-processed data is often still big - barely fitting within memory - but it's far easier to handle, and you can throw dask at it no problem to get huge speedups in analysis. |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
open_mfdataset too many files 94328498 | |
220334426 | https://github.com/pydata/xarray/issues/851#issuecomment-220334426 | https://api.github.com/repos/pydata/xarray/issues/851 | MDEyOklzc3VlQ29tbWVudDIyMDMzNDQyNg== | darothen 4992424 | 2016-05-19T14:05:34Z | 2016-05-19T14:05:34Z | NONE | @byersiiasa, what happens if you just concatenate them using the NCO command |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xr.concat and xr.to_netcdf new filesize 155741762 | |
192357422 | https://github.com/pydata/xarray/issues/784#issuecomment-192357422 | https://api.github.com/repos/pydata/xarray/issues/784 | MDEyOklzc3VlQ29tbWVudDE5MjM1NzQyMg== | darothen 4992424 | 2016-03-04T16:58:59Z | 2016-03-04T16:58:59Z | NONE | The |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
almost-equal grids 138443211 | |
192332830 | https://github.com/pydata/xarray/issues/784#issuecomment-192332830 | https://api.github.com/repos/pydata/xarray/issues/784 | MDEyOklzc3VlQ29tbWVudDE5MjMzMjgzMA== | darothen 4992424 | 2016-03-04T15:56:58Z | 2016-03-04T15:56:58Z | NONE | Hi @mathause, I actually just ran into a very similar problem to your second bullet point. I had some limited success by manually re-building the re-gridded dataset onto the CESM coordinate system, swapping out the not-exactly-but-actually-close-enough coordinates for the CESM reference data's coordinates. In my case, I was re-gridding with CDO, but even when I explicitly pulled out the CESM grid definition it wouldn't match precisely. Since there was a lot of boilerplate code to do this in xarray (although I had a lot of success defining a callback to pass in with open_dataset), it was far easier just to use NCO to copy the correct coordinate variables into the re-gridded data. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
almost-equal grids 138443211 | |
187245860 | https://github.com/pydata/xarray/issues/768#issuecomment-187245860 | https://api.github.com/repos/pydata/xarray/issues/768 | MDEyOklzc3VlQ29tbWVudDE4NzI0NTg2MA== | darothen 4992424 | 2016-02-22T16:04:39Z | 2016-02-22T16:04:39Z | NONE | Hi @jonathanstrong, Just thought it would be useful to point out that the organization that maintains NetCDF is Unidata, a branch of the University Corporation for Atmospheric Research. In fact, netCDF-4 is essentially built on top of HDF5 - a much more widely-known file format, with first-class support including an I/O layer in pandas. While it would certainly be great to "sell" netCDF as a format in the documentation, those of us who still have to write netCDF-based I/O modules for our Fortran models might have to throw up a little in our mouths when we do so... |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
save/load DataArray to numpy npz functions 134376872 | |
169057010 | https://github.com/pydata/xarray/issues/704#issuecomment-169057010 | https://api.github.com/repos/pydata/xarray/issues/704 | MDEyOklzc3VlQ29tbWVudDE2OTA1NzAxMA== | darothen 4992424 | 2016-01-05T16:44:41Z | 2016-01-05T16:44:41Z | NONE | I also like |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Complete renaming xray -> xarray 124867009 | |
148376642 | https://github.com/pydata/xarray/issues/624#issuecomment-148376642 | https://api.github.com/repos/pydata/xarray/issues/624 | MDEyOklzc3VlQ29tbWVudDE0ODM3NjY0Mg== | darothen 4992424 | 2015-10-15T12:57:04Z | 2015-10-15T12:57:04Z | NONE | Is there another easy way to add a constant offset to all the values of a dimension (e.g. add, say, 10 meters to every value in the dimension)? I don't typically use operations like that, but I can see where they might be useful. If not, then rolling in integer space is the way to go. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
roll method 111471076 | |
148206569 | https://github.com/pydata/xarray/issues/624#issuecomment-148206569 | https://api.github.com/repos/pydata/xarray/issues/624 | MDEyOklzc3VlQ29tbWVudDE0ODIwNjU2OQ== | darothen 4992424 | 2015-10-14T21:24:35Z | 2015-10-14T21:24:35Z | NONE | Using an API like |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
roll method 111471076 | |
131214583 | https://github.com/pydata/xarray/issues/531#issuecomment-131214583 | https://api.github.com/repos/pydata/xarray/issues/531 | MDEyOklzc3VlQ29tbWVudDEzMTIxNDU4Mw== | darothen 4992424 | 2015-08-14T19:26:18Z | 2015-08-14T19:26:18Z | NONE | Hi @jsbj, The fancy indexing notation you're trying to use only works when xray successfully decodes the time dimension. As discussed in the documentation here, this only works when the year of record falls between 1678 and 2262. Since you have years 2262-2300 in your dataset, this is a feature - xray is failing gracefully. There are a few current open discussions on this behavior, which is an issue higher up the python chain with numpy:
1. time decoding error with "days since"
2. Fix datetime decoding when time units are 'days since 0000-01-01 00:00:00'
3. ocefpaf - Loading non-standard dates with cf_units
4. numpy - Non-standard Calendar Support
For now, a very simple hack would be to re-compute your time units so that they're re-based, say, with units 'days since 1700-01-01 00:00:00'. That way all of them would fit within the permissible range to use the decoding routine built into xray. You could simply pass the decode_cf=False flag when you open the dataset, modify the non-decoded time array and units, then run xray.decode_cf() on the modified dataset. |
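(A hedged sketch of that re-basing hack, with a hypothetical file and offset; xray's API is shown under its modern xarray name, and attributes like 'calendar' may also need to be carried over.)

``` python
import xarray as xr  # "xray" at the time of this comment

ds = xr.open_dataset('cmip5_output.nc', decode_cf=False)

# Shift the raw time values so they count from the new epoch, then update
# the units string to match; the offset here is illustrative only
offset_days = 620650
ds['time'] = ds['time'] - offset_days
ds['time'].attrs['units'] = 'days since 1700-01-01 00:00:00'

ds = xr.decode_cf(ds)
```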
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Having trouble with time dim of CMIP5 dataset 100980878 |
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);