issue_comments
14 rows where user = 1797906 sorted by updated_at descending
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
1240646680 | https://github.com/pydata/xarray/issues/7005#issuecomment-1240646680 | https://api.github.com/repos/pydata/xarray/issues/7005 | IC_kwDOAMm_X85J8sQY | jamesstidard 1797906 | 2022-09-08T12:24:41Z | 2022-09-08T12:24:41Z | NONE | Hi @benbovy,

Thanks for the detailed response. Yeah, that it was only raising for the second multi-index map does seem like a bug in that case; I'll leave the ticket open to track that. I didn't stumble on the

For anyone else who's looking to do the same, or for anyone to tell me that what I'm doing is not safe, or that there's a simpler way, here's the updated function:

```python
import numpy as np
import pandas as pd
import xarray as xr


def map_coords(ds, *, name, mapping):
    """
    Takes an xarray dataset's coordinate values and updates
    them with the provided mapping. In-place.
    """


midx = pd.MultiIndex.from_product([list("abc"), [0, 1]], names=("x_one", "x_two"))
midy = pd.MultiIndex.from_product([list("abc"), [0, 1]], names=("y_one", "y_two"))
mda = xr.DataArray(np.random.rand(6, 6, 3), [("x", midx), ("y", midy), ("z", range(3))])

map_coords(mda, name="z", mapping={0: "zero", 1: "one", 2: "two"})
map_coords(mda, name="x_one", mapping={"a": "aa", "b": "bb", "c": "cc"})
map_coords(mda, name="y_one", mapping={"a": "aa", "b": "bb", "c": "cc"})
print(mda)
``` |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Cannot re-index or align objects with conflicting indexes 1364911775 | |
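The body of `map_coords` was cut off in the export above; only the signature and docstring survive. A minimal sketch of what an in-place implementation consistent with the usage example could look like, rebuilding a whole MultiIndex in one assignment so the replacement avoids the alignment conflict named in the issue title (this body is a reconstruction, not the author's original code):

```python
import pandas as pd


def map_coords(ds, *, name, mapping):
    """Remap coordinate `name` in-place (sketch; not the original body)."""
    for dim, index in ds.indexes.items():
        if isinstance(index, pd.MultiIndex) and name in index.names:
            # `name` is one level of a MultiIndex: remap that level's
            # unique values and assign the rebuilt index in one go.
            pos = index.names.index(name)
            new_levels = [mapping.get(v, v) for v in index.levels[pos]]
            ds[dim] = index.set_levels(new_levels, level=name)
            return
    # Plain single-level coordinate: remap its values directly.
    ds[name] = [mapping.get(v, v) for v in ds[name].values]
```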
365925282 | https://github.com/pydata/xarray/issues/1854#issuecomment-365925282 | https://api.github.com/repos/pydata/xarray/issues/1854 | MDEyOklzc3VlQ29tbWVudDM2NTkyNTI4Mg== | jamesstidard 1797906 | 2018-02-15T13:21:33Z | 2018-02-15T13:24:46Z | NONE | @rabernat Still seem to get a SIGKILL 9 (exit code 137) when trying to run with that pre-processor as well. Maybe my expectations of how it lazy-loads files are too high. The machine I'm running on has 8 GB of RAM and the files in total are just under 1 TB. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Drop coordinates on loading large dataset. 291332965 | |
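For context, the kind of pre-processor discussed in this thread (issue title: "Drop coordinates on loading large dataset") could look roughly like this; the function name, glob pattern, and chunk sizes are illustrative, not from the thread:

```python
import xarray as xr


def drop_unneeded(ds):
    # Drop non-index coordinates from each file before concatenation,
    # keeping the combined dataset (and its task graph) smaller.
    return ds.reset_coords(drop=True)


# preprocess runs on each file as it is opened
ds = xr.open_mfdataset("data/*.nc", preprocess=drop_unneeded, chunks={"time": 124})
```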
365896646 | https://github.com/pydata/xarray/issues/1854#issuecomment-365896646 | https://api.github.com/repos/pydata/xarray/issues/1854 | MDEyOklzc3VlQ29tbWVudDM2NTg5NjY0Ng== | jamesstidard 1797906 | 2018-02-15T11:12:48Z | 2018-02-15T11:12:48Z | NONE | @jhamman Here's the header dump of one of the files:

```bash
netcdf \34.128_1900_01_05_05 {
dimensions:
    longitude = 720 ;
    latitude = 361 ;
    time = UNLIMITED ; // (124 currently)
variables:
    float longitude(longitude) ;
        longitude:units = "degrees_east" ;
        longitude:long_name = "longitude" ;
    float latitude(latitude) ;
        latitude:units = "degrees_north" ;
        latitude:long_name = "latitude" ;
    int time(time) ;
        time:units = "hours since 1900-01-01 00:00:0.0" ;
        time:long_name = "time" ;
        time:calendar = "gregorian" ;
    short sst(time, latitude, longitude) ;
        sst:scale_factor = 0.000552094668668839 ;
        sst:add_offset = 285.983000319853 ;
        sst:_FillValue = -32767s ;
        sst:missing_value = -32767s ;
        sst:units = "K" ;
        sst:long_name = "Sea surface temperature" ;

// global attributes:
        :Conventions = "CF-1.6" ;
        :history = "2017-08-04 06:17:58 GMT by grib_to_netcdf-2.4.0: grib_to_netcdf /data/data05/scratch/_mars-atls09-95e2cf679cd58ee9b4db4dd119a05a8d-gF5gxN.grib -o /data/data04/scratch/_grib2netcdf-atls01-a562cefde8a29a7288fa0b8b7f9413f7-VvH7PP.nc -utime" ;
        :_Format = "64-bit offset" ;
}
```

Unfortunately removing the chunks didn't seem to help. I'm running with the pre-process workaround this morning to see if that completes. Sorry for the late response on this - been pretty busy. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Drop coordinates on loading large dataset. 291332965 | |
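The `short sst` variable above is stored packed: 2-byte integers plus `scale_factor`/`add_offset`, which xarray unpacks to floats on read by default. A quick way to see both views (file name taken from the dump; otherwise a sketch):

```python
import xarray as xr

# mask_and_scale decoding is on by default: packed int16 values are
# unpacked via scale_factor/add_offset, and _FillValue becomes NaN.
ds = xr.open_dataset("34.128_1900_01_05_05.nc")
print(ds["sst"].dtype)     # floating point after decoding
print(ds["sst"].encoding)  # on-disk dtype, scale_factor, add_offset, _FillValue
```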
364492783 | https://github.com/pydata/xarray/issues/1854#issuecomment-364492783 | https://api.github.com/repos/pydata/xarray/issues/1854 | MDEyOklzc3VlQ29tbWVudDM2NDQ5Mjc4Mw== | jamesstidard 1797906 | 2018-02-09T16:58:42Z | 2018-02-09T16:58:42Z | NONE | I'll give both of those a shot. For hosting: the files are currently on a local drive and they sum to about 1 TB. I can probably host a couple of examples, though. Thanks again for the support. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Drop coordinates on loading large dataset. 291332965 | |
364488847 | https://github.com/pydata/xarray/issues/1854#issuecomment-364488847 | https://api.github.com/repos/pydata/xarray/issues/1854 | MDEyOklzc3VlQ29tbWVudDM2NDQ4ODg0Nw== | jamesstidard 1797906 | 2018-02-09T16:45:51Z | 2018-02-09T16:45:51Z | NONE | That run was killed with the output:

```
Process finished with exit code 137 (interrupted by signal 9: SIGKILL)
```

I wasn't watching the machine at the time but I assume that's it falling over due to memory pressure.

Hi @jhamman, I'm using

I'm just using whatever the default scheduler is, as that's pretty much all the code I've got written above. I'm unsure how to do a performance check, as the dataset can't even be fully loaded currently. I've tried different chunk sizes in the past, hoping to stumble on a magic size, but have been unsuccessful with that. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Drop coordinates on loading large dataset. 291332965 | |
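One way to sanity-check size and chunk layout before anything is actually computed, since the lazy open itself is cheap (paths and the variable name are illustrative, not from the thread):

```python
import xarray as xr

ds = xr.open_mfdataset("data/*.nc", chunks={"time": 124})  # still lazy

print(f"{ds.nbytes / 1e9:.1f} GB if fully loaded")  # computed from shapes, no data read
print(ds.chunks)                   # chunk lengths per dimension
print(ds["sst"].data.npartitions)  # dask task count just to read this variable
```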
364463855 | https://github.com/pydata/xarray/issues/1854#issuecomment-364463855 | https://api.github.com/repos/pydata/xarray/issues/1854 | MDEyOklzc3VlQ29tbWVudDM2NDQ2Mzg1NQ== | jamesstidard 1797906 | 2018-02-09T15:22:38Z | 2018-02-09T15:22:38Z | NONE | Sure, I'm running that now. I'll reply once/if it finishes. Though, watching memory usage in my system monitor, it does not appear to be growing. I seem to remember the open function continually allocating itself more RAM until it was killed. I'll take a read through that issue while I wait. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Drop coordinates on loading large dataset. 291332965 | |
364459162 | https://github.com/pydata/xarray/issues/1854#issuecomment-364459162 | https://api.github.com/repos/pydata/xarray/issues/1854 | MDEyOklzc3VlQ29tbWVudDM2NDQ1OTE2Mg== | jamesstidard 1797906 | 2018-02-09T15:06:37Z | 2018-02-09T15:09:02Z | NONE | That's true, maybe I misread last time or it's month-dependent. Hopefully this is what you're after - let me know if not. I used 3
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Drop coordinates on loading large dataset. 291332965 | |
364451782 | https://github.com/pydata/xarray/issues/1854#issuecomment-364451782 | https://api.github.com/repos/pydata/xarray/issues/1854 | MDEyOklzc3VlQ29tbWVudDM2NDQ1MTc4Mg== | jamesstidard 1797906 | 2018-02-09T14:40:20Z | 2018-02-09T14:40:20Z | NONE | Sure, this is the repr of a single file:
Thanks |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Drop coordinates on loading large dataset. 291332965 | |
364399084 | https://github.com/pydata/xarray/issues/1854#issuecomment-364399084 | https://api.github.com/repos/pydata/xarray/issues/1854 | MDEyOklzc3VlQ29tbWVudDM2NDM5OTA4NA== | jamesstidard 1797906 | 2018-02-09T10:41:28Z | 2018-02-09T10:41:28Z | NONE | Sorry to bump this. Still looking for a solution to this problem, if anyone has had a similar experience. Thanks. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Drop coordinates on loading large dataset. 291332965 | |
361576685 | https://github.com/pydata/xarray/issues/1854#issuecomment-361576685 | https://api.github.com/repos/pydata/xarray/issues/1854 | MDEyOklzc3VlQ29tbWVudDM2MTU3NjY4NQ== | jamesstidard 1797906 | 2018-01-30T12:19:12Z | 2018-01-30T12:19:12Z | NONE | Hi @rabernat, thanks for the response. Sorry it's taken me a few days to get back to you. Here's the info dump of one of the files:

```
xarray.Dataset {
dimensions:
    latitude = 361 ;
    longitude = 720 ;
    time = 248 ;

variables:
    float32 longitude(longitude) ;
        longitude:units = degrees_east ;
        longitude:long_name = longitude ;
    float32 latitude(latitude) ;
        latitude:units = degrees_north ;
        latitude:long_name = latitude ;
    datetime64[ns] time(time) ;
        time:long_name = time ;
    float64 mwd(time, latitude, longitude) ;
        mwd:units = Degree true ;
        mwd:long_name = Mean wave direction ;

// global attributes:
    :Conventions = CF-1.6 ;
    :history = 2017-08-09 18:15:34 GMT by grib_to_netcdf-2.4.0: grib_to_netcdf /data/data05/scratch/_mars-atls02-70e05f9f8ba4e9d19932f1c45a7be8d8-Pwy6jZ.grib -o /data/data01/scratch/_grib2netcdf-atls02-95e2cf679cd58ee9b4db4dd119a05a8d-v4TKah.nc -utime ;
}
``` |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Drop coordinates on loading large dataset. 291332965 | |
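A dump in that format is what `Dataset.info()` prints, e.g. (file name hypothetical):

```python
import xarray as xr

ds = xr.open_dataset("mwd_2010_01.nc")  # illustrative path
ds.info()  # prints the ncdump-style summary shown above
```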
330162706 | https://github.com/pydata/xarray/issues/1572#issuecomment-330162706 | https://api.github.com/repos/pydata/xarray/issues/1572 | MDEyOklzc3VlQ29tbWVudDMzMDE2MjcwNg== | jamesstidard 1797906 | 2017-09-18T08:57:39Z | 2017-09-18T08:59:24Z | NONE | @shoyer great, thanks. I added the line below and it has reduced the size of the file down to that of the duplicate. Thanks for pointing me in the right direction. I'm assuming I do not need to fill NaNs with `_FillValue` afterwards (though maybe I might).
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Modifying data set resulting in much larger file size 257400162 | |
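The exact line referred to above didn't survive the export, but restoring the packed encoding before writing would look roughly like this. The numeric values are copied from the `swh` ncdump in the comment below; everything else is a sketch, not the author's exact code:

```python
import xarray as xr

ds = xr.open_dataset("swh_2010_01_05_05.nc")  # file name from the ncdump below

# ... modify the dataset ...

# Re-apply the packed int16 encoding so xarray writes shorts plus
# scale/offset instead of full-width floats, keeping the file small.
ds["swh"].encoding.update({
    "dtype": "int16",
    "scale_factor": 0.000203558072860934,
    "add_offset": 6.70098898894319,
    "_FillValue": -32767,
})
ds.to_netcdf("swh_modified.nc")  # output name illustrative
```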
329233581 | https://github.com/pydata/xarray/issues/1572#issuecomment-329233581 | https://api.github.com/repos/pydata/xarray/issues/1572 | MDEyOklzc3VlQ29tbWVudDMyOTIzMzU4MQ== | jamesstidard 1797906 | 2017-09-13T17:06:12Z | 2017-09-13T17:06:12Z | NONE | @fmaussion @jhamman Ah great - that makes sense. I'll see if I can set them to the original file's short fill representation instead of nan. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Modifying data set resulting in much larger file size 257400162 | |
329230620 | https://github.com/pydata/xarray/issues/1572#issuecomment-329230620 | https://api.github.com/repos/pydata/xarray/issues/1572 | MDEyOklzc3VlQ29tbWVudDMyOTIzMDYyMA== | jamesstidard 1797906 | 2017-09-13T16:55:45Z | 2017-09-13T16:59:57Z | NONE | Sure, here you go:

Original (128.9MB):

```bash
$ ncdump -h -s swh_2010_01_05_05.nc
netcdf swh_2010_01_05_05 {
dimensions:
    longitude = 720 ;
    latitude = 361 ;
    time = UNLIMITED ; // (248 currently)
variables:
    float longitude(longitude) ;
        longitude:units = "degrees_east" ;
        longitude:long_name = "longitude" ;
    float latitude(latitude) ;
        latitude:units = "degrees_north" ;
        latitude:long_name = "latitude" ;
    int time(time) ;
        time:units = "hours since 1900-01-01 00:00:0.0" ;
        time:long_name = "time" ;
        time:calendar = "gregorian" ;
    short swh(time, latitude, longitude) ;
        swh:scale_factor = 0.000203558072860934 ;
        swh:add_offset = 6.70098898894319 ;
        swh:_FillValue = -32767s ;
        swh:missing_value = -32767s ;
        swh:units = "m" ;
        swh:long_name = "Significant height of combined wind waves and swell" ;

// global attributes:
        :Conventions = "CF-1.6" ;
        :history = "2017-08-09 16:41:57 GMT by grib_to_netcdf-2.4.0: grib_to_netcdf /data/data04/scratch/_mars-atls01-a562cefde8a29a7288fa0b8b7f9413f7-5gV0xP.grib -o /data/data05/scratch/_grib2netcdf-atls09-70e05f9f8ba4e9d19932f1c45a7be8d8-jU8lEi.nc -utime" ;
        :_Format = "64-bit offset" ;
}
```

And the global attributes of the modified file:

```bash
// global attributes:
        :_NCProperties = "version=1|netcdflibversion=4.4.1.1|hdf5libversion=1.8.18" ;
        :Conventions = "CF-1.6" ;
        :history = "2017-08-09 16:41:57 GMT by grib_to_netcdf-2.4.0: grib_to_netcdf /data/data04/scratch/_mars-atls01-a562cefde8a29a7288fa0b8b7f9413f7-5gV0xP.grib -o /data/data05/scratch/_grib2netcdf-atls09-70e05f9f8ba4e9d19932f1c45a7be8d8-jU8lEi.nc -utime" ;
        :_Format = "netCDF-4" ;
}
```

I assume it's about that fill/missing value changing? Thanks for the help. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Modifying data set resulting in much larger file size 257400162 | |
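The size difference has a simple back-of-envelope explanation: the original stores `swh` as packed 2-byte `short`s, while writing the modified data as float64 without that encoding takes 8 bytes per value. Using the dimensions from the dump above:

```python
n_values = 248 * 361 * 720                # time * latitude * longitude
print(n_values * 2 / 1e6, "MB int16")     # ~128.9 MB -- matches the original file
print(n_values * 8 / 1e6, "MB float64")   # ~515.7 MB -- roughly 4x larger
```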
329181674 | https://github.com/pydata/xarray/issues/1561#issuecomment-329181674 | https://api.github.com/repos/pydata/xarray/issues/1561 | MDEyOklzc3VlQ29tbWVudDMyOTE4MTY3NA== | jamesstidard 1797906 | 2017-09-13T14:16:06Z | 2017-09-13T14:16:06Z | NONE | Increasing the chunk sizes seemed to resolve this issue. I was loading readings over time across the globe as 360x180xT and was trying to rechunk them to 1x1xT. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
exit code 137 when using xarray.open_mfdataset 255997962 |
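For scale: 1x1xT chunks on a 360x180 grid mean one dask chunk per grid cell, and that many tiny tasks is a classic way to exhaust memory before any real work happens. A quick illustration (array shape assumed; T=1000 is arbitrary):

```python
import dask.array as da

tiny = da.zeros((1000, 180, 360), chunks=(1000, 1, 1))  # 1x1xT chunking
print(tiny.npartitions)  # 64800 chunks for a single variable

sane = da.zeros((1000, 180, 360), chunks=(124, 180, 360))  # chunk along time only
print(sane.npartitions)  # 9 chunks
```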
```sql
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
```
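Given that schema, the page above corresponds to a query along these lines (the database file name is an assumption):

```python
import sqlite3

conn = sqlite3.connect("github.db")  # assumed file name for this database
rows = conn.execute(
    """
    SELECT * FROM issue_comments
    WHERE [user] = 1797906
    ORDER BY updated_at DESC
    """
).fetchall()
print(len(rows))  # 14, per the row count at the top of the page
```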