issues
4 rows where repo = 13221727 and "updated_at" is on date 2020-01-20 sorted by updated_at descending
| id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at ▲ | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 548475127 | MDU6SXNzdWU1NDg0NzUxMjc= | 3686 | Different data values from xarray open_mfdataset when using chunks | abarciauskas-bgse 15016780 | closed | 0 | 7 | 2020-01-11T20:15:12Z | 2020-01-20T20:35:48Z | 2020-01-20T20:35:47Z | NONE | **MCVE Code Sample** You will first need to download the files from PO.DAAC (or mount PO.DAAC's drive), including credentials: ```bash
curl -u USERNAME:PASSWORD https://podaac-tools.jpl.nasa.gov/drive/files/allData/ghrsst/data/GDS2/L4/GLOB/JPL/MUR/v4.1/2002/152/20020601090000-JPL-L4_GHRSST-SSTfnd-MUR-GLOB-v02.0-fv04.1.nc -O data/mursst_netcdf/152/
curl -u USERNAME:PASSWORD https://podaac-tools.jpl.nasa.gov/drive/files/allData/ghrsst/data/GDS2/L4/GLOB/JPL/MUR/v4.1/2002/153/20020602090000-JPL-L4_GHRSST-SSTfnd-MUR-GLOB-v02.0-fv04.1.nc -O data/mursst_netcdf/153/
curl -u USERNAME:PASSWORD https://podaac-tools.jpl.nasa.gov/drive/files/allData/ghrsst/data/GDS2/L4/GLOB/JPL/MUR/v4.1/2002/154/20020603090000-JPL-L4_GHRSST-SSTfnd-MUR-GLOB-v02.0-fv04.1.nc -O data/mursst_netcdf/154/
curl -u USERNAME:PASSWORD https://podaac-tools.jpl.nasa.gov/drive/files/allData/ghrsst/data/GDS2/L4/GLOB/JPL/MUR/v4.1/2002/155/20020604090000-JPL-L4_GHRSST-SSTfnd-MUR-GLOB-v02.0-fv04.1.nc -O data/mursst_netcdf/155/
curl -u USERNAME:PASSWORD https://podaac-tools.jpl.nasa.gov/drive/files/allData/ghrsst/data/GDS2/L4/GLOB/JPL/MUR/v4.1/2002/156/20020605090000-JPL-L4_GHRSST-SSTfnd-MUR-GLOB-v02.0-fv04.1.nc -O data/mursst_netcdf/156/
``` Then run the following code: ```python
from datetime import datetime
import xarray as xr
import glob

def generate_file_list(start_doy, end_doy):
    ...  # function body lost in the export

# Invariants - but could be made configurable
year = 2002
prefix = f"data/mursst_netcdf"
chunks = {'time': 1, 'lat': 1799, 'lon': 3600}

# Create a list of files
start_doy = 152
num_days = 5
end_doy = start_doy + num_days
fileObjs = generate_file_list(start_doy, end_doy)

# will use this time slice in query later on
time_slice = slice(datetime.strptime(f"{year}-06-02", '%Y-%m-%d'),
                   datetime.strptime(f"{year}-06-04", '%Y-%m-%d'))

print("results from unchunked dataset")
ds_unchunked = xr.open_mfdataset(fileObjs, combine='by_coords')
print(ds_unchunked.analysed_sst[1,:,:].sel(lat=slice(20,50),lon=slice(-170,-110)).mean().values)
print(ds_unchunked.analysed_sst.sel(time=time_slice).mean().values)

print(f"results from chunked dataset using {chunks}")
ds_chunked = xr.open_mfdataset(fileObjs, combine='by_coords', chunks=chunks)
print(ds_chunked.analysed_sst[1,:,:].sel(lat=slice(20,50),lon=slice(-170,-110)).mean().values)
print(ds_chunked.analysed_sst.sel(time=time_slice).mean().values)

print("results from chunked dataset using 'auto'")
ds_chunked = xr.open_mfdataset(fileObjs, combine='by_coords', chunks={'time': 'auto', 'lat': 'auto', 'lon': 'auto'})
print(ds_chunked.analysed_sst[1,:,:].sel(lat=slice(20,50),lon=slice(-170,-110)).mean().values)
print(ds_chunked.analysed_sst.sel(time=time_slice).mean().values)
``` Note, these are just a few examples; I tried a variety of other chunk options and got similar discrepancies between the unchunked and chunked datasets. Output:
**Expected Output** Values output from queries of the chunked and unchunked xarray datasets should be equal. **Problem Description** I want to understand how to chunk or query data to verify that data opened using chunks will have the same output as data opened without chunking. I would ultimately like to store the data in Zarr, but verifying data integrity is critical. Output of
|
{
"url": "https://api.github.com/repos/pydata/xarray/issues/3686/reactions",
"total_count": 0,
"+1": 0,
"-1": 0,
"laugh": 0,
"hooray": 0,
"confused": 0,
"heart": 0,
"rocket": 0,
"eyes": 0
} |
completed | xarray 13221727 | issue | ||||||
| 539821504 | MDExOlB1bGxSZXF1ZXN0MzU0NzMwNzI5 | 3642 | Make datetime_to_numeric more robust to overflow errors | huard 81219 | closed | 0 | 1 | 2019-12-18T17:34:41Z | 2020-01-20T19:21:49Z | 2020-01-20T19:21:49Z | CONTRIBUTOR | 0 | pydata/xarray/pulls/3642 |
This is likely only safe with NumPy>=1.17 though. |
{
"url": "https://api.github.com/repos/pydata/xarray/issues/3642/reactions",
"total_count": 0,
"+1": 0,
"-1": 0,
"laugh": 0,
"hooray": 0,
"confused": 0,
"heart": 0,
"rocket": 0,
"eyes": 0
} |
xarray 13221727 | pull | |||||
| 550964139 | MDExOlB1bGxSZXF1ZXN0MzYzNzcyNzE3 | 3699 | Feature/align in dot | mathause 10194086 | closed | 0 | 4 | 2020-01-16T17:55:38Z | 2020-01-20T12:55:51Z | 2020-01-20T12:09:27Z | MEMBER | 0 | pydata/xarray/pulls/3699 |
Happy to get feedback @fujiisoup @shoyer |
{
"url": "https://api.github.com/repos/pydata/xarray/issues/3699/reactions",
"total_count": 0,
"+1": 0,
"-1": 0,
"laugh": 0,
"hooray": 0,
"confused": 0,
"heart": 0,
"rocket": 0,
"eyes": 0
} |
xarray 13221727 | pull | |||||
| 549679475 | MDU6SXNzdWU1NDk2Nzk0NzU= | 3694 | xr.dot requires equal indexes (join="exact") | mathause 10194086 | closed | 0 | 5 | 2020-01-14T16:28:15Z | 2020-01-20T12:09:27Z | 2020-01-20T12:09:27Z | MEMBER | **MCVE Code Sample** ```python
import xarray as xr
import numpy as np

d1 = xr.DataArray(np.arange(4), dims=["a"], coords=dict(a=[0, 1, 2, 3]))
d2 = xr.DataArray(np.arange(4), dims=["a"], coords=dict(a=[0, 1, 2, 3]))
# note: different coords
d3 = xr.DataArray(np.arange(4), dims=["a"], coords=dict(a=[1, 2, 3, 4]))

(d1 * d2).sum()  # -> array(14)
xr.dot(d1, d2)   # -> array(14)
(d2 * d3).sum()  # -> array(8)
xr.dot(d2, d3)   # -> ValueError
``` Expected Output
**Problem Description** The last statement results in a `ValueError`.
This is a problem for #2922 (weighted operations) - I think it is fine for the weights and data to not align. Fixing this may be as easy as specifying @fujiisoup Output of
|
{
"url": "https://api.github.com/repos/pydata/xarray/issues/3694/reactions",
"total_count": 0,
"+1": 0,
"-1": 0,
"laugh": 0,
"hooray": 0,
"confused": 0,
"heart": 0,
"rocket": 0,
"eyes": 0
} |
completed | xarray 13221727 | issue |
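The inner-join alignment discussed in issue 3694 can be sketched without xarray at all. The hypothetical helper `aligned_dot` below reproduces, in plain Python, what aligning two labelled 1-D arrays on the intersection of their coordinates before a dot product would do; the helper name and the coordinates-as-lists representation are illustrative, not xarray's API.

```python
def aligned_dot(coords1, vals1, coords2, vals2):
    # Align on the intersection of coordinates (xarray's join="inner"),
    # then take the dot product of the aligned values.
    common = sorted(set(coords1) & set(coords2))
    pos1 = {c: i for i, c in enumerate(coords1)}
    pos2 = {c: i for i, c in enumerate(coords2)}
    return sum(vals1[pos1[c]] * vals2[pos2[c]] for c in common)

# Same data as the issue: d1/d2 share coords [0, 1, 2, 3], d3 has [1, 2, 3, 4]
print(aligned_dot([0, 1, 2, 3], [0, 1, 2, 3], [0, 1, 2, 3], [0, 1, 2, 3]))  # -> 14
print(aligned_dot([0, 1, 2, 3], [0, 1, 2, 3], [1, 2, 3, 4], [0, 1, 2, 3]))  # -> 8
```

The second call matches `(d2 * d3).sum()` from the issue: only coordinates 1, 2, 3 survive the intersection, giving 1·0 + 2·1 + 3·2 = 8, whereas `xr.dot` with `join="exact"` would raise instead of aligning.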
CREATE TABLE [issues] (
[id] INTEGER PRIMARY KEY,
[node_id] TEXT,
[number] INTEGER,
[title] TEXT,
[user] INTEGER REFERENCES [users]([id]),
[state] TEXT,
[locked] INTEGER,
[assignee] INTEGER REFERENCES [users]([id]),
[milestone] INTEGER REFERENCES [milestones]([id]),
[comments] INTEGER,
[created_at] TEXT,
[updated_at] TEXT,
[closed_at] TEXT,
[author_association] TEXT,
[active_lock_reason] TEXT,
[draft] INTEGER,
[pull_request] TEXT,
[body] TEXT,
[reactions] TEXT,
[performed_via_github_app] TEXT,
[state_reason] TEXT,
[repo] INTEGER REFERENCES [repos]([id]),
[type] TEXT
);
CREATE INDEX [idx_issues_repo]
ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
ON [issues] ([user]);
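The filter summarized at the top of this page ("4 rows where repo = 13221727 and updated_at is on date 2020-01-20, sorted by updated_at descending") corresponds to a query like the one below. This is a sketch, not the exact SQL the site runs: it uses stdlib `sqlite3` against a trimmed-down copy of the `issues` schema, seeded with two of the rows shown above.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Minimal subset of the issues table defined in the schema above
con.execute(
    """CREATE TABLE issues (
           id INTEGER PRIMARY KEY, number INTEGER, title TEXT,
           updated_at TEXT, repo INTEGER, type TEXT)"""
)
con.executemany(
    "INSERT INTO issues VALUES (?, ?, ?, ?, ?, ?)",
    [
        (548475127, 3686,
         "Different data values from xarray open_mfdataset when using chunks",
         "2020-01-20T20:35:48Z", 13221727, "issue"),
        (539821504, 3642,
         "Make datetime_to_numeric more robust to overflow errors",
         "2020-01-20T19:21:49Z", 13221727, "pull"),
    ],
)

# date() strips the time component, matching the "updated_at (date)" facet
rows = con.execute(
    """SELECT number, title FROM issues
       WHERE repo = 13221727 AND date(updated_at) = '2020-01-20'
       ORDER BY updated_at DESC"""
).fetchall()
for number, title in rows:
    print(number, title)
```

SQLite's `date()` accepts the ISO 8601 timestamps with a trailing `Z` that the `updated_at` column stores, so no string slicing is needed for the per-day comparison.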