
issues


3 rows where user = 2014301 sorted by updated_at descending




Facets: type: issue (3 rows) · state: closed (3 rows) · repo: xarray (3 rows)
Columns: id, node_id, number, title, user, state, locked, assignee, milestone, comments, created_at, updated_at, closed_at, author_association, active_lock_reason, draft, pull_request, body, reactions, performed_via_github_app, state_reason, repo, type
id: 435535284 · node_id: MDU6SXNzdWU0MzU1MzUyODQ= · number: 2912 · title: Writing a netCDF file is unexpectedly slow · user: msaharia (2014301) · state: closed · locked: 0 · comments: 12 · created_at: 2019-04-21T18:31:36Z · updated_at: 2023-09-12T15:58:18Z · closed_at: 2023-09-12T15:58:18Z · author_association: NONE

```python
import xarray as xr

ncdat = xr.open_mfdataset(nclist, concat_dim='time')

ncdat['lat'] = ncdat['lat'].isel(time=0).drop('time')
ncdat['lon'] = ncdat['lon'].isel(time=0).drop('time')
ncdat = ncdat.rename({'north_south': 'lat', 'east_west': 'lon'})

lat_coords = ncdat.lat[:, 0]  # Extract latitudes
lon_coords = ncdat.lon[0, :]  # Extract longitudes

ncdat = ncdat.drop(['lat', 'lon'])

reformatted_ncdat = ncdat.assign_coords(lat=lat_coords, lon=lon_coords, time=ncdat.coords['time'])

ncdat = reformatted_ncdat.sortby('time')
ncdat.to_netcdf('testing.nc')
```

Problem description

After some processing, I am left with this xarray dataset, `ncdat`, which I want to export to a netCDF file:

```
<xarray.Dataset>
Dimensions:                 (lat: 59, lon: 75, time: 500)
Coordinates:
  * time                    (time) datetime64[ns] 2007-01-22 ... 2008-06-04
  * lat                     (lat) float32 -4.25 -4.15 ... 1.4500003 1.5500002
  * lon                     (lon) float32 29.049988 29.149994 ... 36.450012
Data variables:
    Streamflow_tavg         (time, lat, lon) float32 dask.array<shape=(500, 59, 75), chunksize=(1, 59, 75)>
    RiverDepth_tavg         (time, lat, lon) float32 dask.array<shape=(500, 59, 75), chunksize=(1, 59, 75)>
    RiverFlowVelocity_tavg  (time, lat, lon) float32 dask.array<shape=(500, 59, 75), chunksize=(1, 59, 75)>
    FloodedFrac_tavg        (time, lat, lon) float32 dask.array<shape=(500, 59, 75), chunksize=(1, 59, 75)>
    SurfElev_tavg           (time, lat, lon) float32 dask.array<shape=(500, 59, 75), chunksize=(1, 59, 75)>
    SWS_tavg                (time, lat, lon) float32 dask.array<shape=(500, 59, 75), chunksize=(1, 59, 75)>
Attributes:
    missing_value:           -9999.0
    NUM_SOIL_LAYERS:         1
    SOIL_LAYER_THICKNESSES:  1.0
    title:                   LIS land surface model output
    institution:             NASA GSFC
    source:                  model_not_specified
    history:                 created on date: 2019-04-19T09:11:12.992
    references:              Kumar_etal_EMS_2006, Peters-Lidard_etal_ISSE_2007
    conventions:             CF-1.6
    comment:                 website: http://lis.gsfc.nasa.gov/
    MAP_PROJECTION:          EQUIDISTANT CYLINDRICAL
    SOUTH_WEST_CORNER_LAT:   -4.25
    SOUTH_WEST_CORNER_LON:   29.05
    DX:                      0.1
    DY:                      0.1
```

But the problem is that it takes an inordinately long time to export: almost 10 minutes for this particular file, which is only 35 MB.

How can I expedite this process? Is there anything wrong with the structure of ncdat?
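For context, the dask chunk size shown above, chunksize=(1, 59, 75), means each variable is split into 500 one-timestep chunks, which is a common reason `to_netcdf` feels slow on a small file. A minimal sketch of two typical workarounds, not taken from this issue's thread (it reuses `ncdat` and the file name from the code above; the chunk size of 100 is an arbitrary choice):

```python
# Option 1: the dataset is only ~35 MB, so load it into memory first and
# write it in one go instead of streaming 500 tiny dask chunks.
ncdat.load().to_netcdf('testing.nc')

# Option 2: keep it lazy but rechunk along time so each write handles a
# larger block of data.
ncdat.chunk({'time': 100}).to_netcdf('testing.nc')
```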

Expected Output

A netCDF file

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.3 | packaged by conda-forge | (default, Mar 27 2019, 23:01:00) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.0.101-0.47.105-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2

xarray: 0.12.1
pandas: 0.24.2
numpy: 1.16.2
scipy: 1.2.1
netCDF4: 1.5.0.1
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudonetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 1.2.0
distributed: 1.27.0
matplotlib: 3.0.3
cartopy: 0.17.0
seaborn: 0.9.0
setuptools: 41.0.0
pip: 19.0.3
conda: None
pytest: None
IPython: 7.4.0
sphinx: None
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2912/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue
id: 455200681 · node_id: MDU6SXNzdWU0NTUyMDA2ODE= · number: 3017 · title: Why am I running into a IndexVariable error message while assigning dims? · user: msaharia (2014301) · state: closed · locked: 0 · comments: 1 · created_at: 2019-06-12T13:01:44Z · updated_at: 2020-04-06T22:43:53Z · closed_at: 2020-04-06T22:43:53Z · author_association: NONE

Code Sample, a copy-pastable example if possible

I am trying to remove the (north_south, east_west) dimensions from lat and lon and make them coordinate dimensions. The following piece of code was working fine earlier, but now I am running into an error message; I think I updated xarray. I would like to know what I am doing wrong, or whether there is another way to do this.

```python
import xarray as xr

ncdat = xr.open_mfdataset(files_lis)
ncdat['lat'] = ncdat['lat'].isel(time=0).drop('time')
ncdat['lon'] = ncdat['lon'].isel(time=0).drop('time')
ncdat = ncdat.rename({'north_south': 'lat', 'east_west': 'lon'})

lat_coords = ncdat['lat'].values[:, -1]  # Extract latitudes
lon_coords = ncdat['lon'].values[-1, :]  # Extract longitudes

reformatted_ncdat = ncdat.assign_coords(lat=lat_coords, lon=lon_coords, time=ncdat.coords['time'])
```

Problem description

This is `ncdat` after `xr.open_mfdataset(files_lis)`:

```
<xarray.Dataset>
Dimensions:           (SoilMoist_profiles: 3, east_west: 172, north_south: 92, time: 10)
Coordinates:
  * time              (time) datetime64[ns] 2016-08-01 2016-08-02 ... 2016-08-10
Dimensions without coordinates: SoilMoist_profiles, east_west, north_south
Data variables:
    lat               (time, north_south, east_west) float32 dask.array<shape=(10, 92, 172), chunksize=(1, 92, 172)>
    lon               (time, north_south, east_west) float32 dask.array<shape=(10, 92, 172), chunksize=(1, 92, 172)>
    Snowf_tavg        (time, north_south, east_west) float32 dask.array<shape=(10, 92, 172), chunksize=(1, 92, 172)>
    Rainf_tavg        (time, north_south, east_west) float32 dask.array<shape=(10, 92, 172), chunksize=(1, 92, 172)>
    Evap_tavg         (time, north_south, east_west) float32 dask.array<shape=(10, 92, 172), chunksize=(1, 92, 172)>
    Qs_tavg           (time, north_south, east_west) float32 dask.array<shape=(10, 92, 172), chunksize=(1, 92, 172)>
    Qsb_tavg          (time, north_south, east_west) float32 dask.array<shape=(10, 92, 172), chunksize=(1, 92, 172)>
    Qsm_tavg          (time, north_south, east_west) float32 dask.array<shape=(10, 92, 172), chunksize=(1, 92, 172)>
    SWE_tavg          (time, north_south, east_west) float32 dask.array<shape=(10, 92, 172), chunksize=(1, 92, 172)>
    SnowDepth_tavg    (time, north_south, east_west) float32 dask.array<shape=(10, 92, 172), chunksize=(1, 92, 172)>
    SoilMoist_tavg    (time, SoilMoist_profiles, north_south, east_west) float32 dask.array<shape=(10, 3, 92, 172), chunksize=(1, 3, 92, 172)>
    CanopInt_tavg     (time, north_south, east_west) float32 dask.array<shape=(10, 92, 172), chunksize=(1, 92, 172)>
    WaterTableD_tavg  (time, north_south, east_west) float32 dask.array<shape=(10, 92, 172), chunksize=(1, 92, 172)>
    TWS_tavg          (time, north_south, east_west) float32 dask.array<shape=(10, 92, 172), chunksize=(1, 92, 172)>
    GWS_tavg          (time, north_south, east_west) float32 dask.array<shape=(10, 92, 172), chunksize=(1, 92, 172)>
    SnowCover_tavg    (time, north_south, east_west) float32 dask.array<shape=(10, 92, 172), chunksize=(1, 92, 172)>
    TotalPrecip_tavg  (time, north_south, east_west) float32 dask.array<shape=(10, 92, 172), chunksize=(1, 92, 172)>
```

Error

ValueError: IndexVariable objects must be 1-dimensional

Expected Output

Please ignore values. Just an illustration.

```
<xarray.Dataset>
Dimensions:           (SoilMoist_profiles: 3, lat: 92, lon: 172, time: 10)
Coordinates:
  * lat               (lat) float32 4.1249995 4.375 4.625 ... 26.624996 26.875
  * lon               (lon) float32 nan nan nan ... 24.375 24.625004 24.874996
  * time              (time) datetime64[ns] 2016-08-01 2016-08-02 ... 2017-12-31
Dimensions without coordinates: SoilMoist_profiles
Data variables:
    Snowf_tavg        (time, lat, lon) float32 ...
    Rainf_tavg        (time, lat, lon) float32 ...
    Evap_tavg         (time, lat, lon) float32 ...
    Qs_tavg           (time, lat, lon) float32 ...
    Qsb_tavg          (time, lat, lon) float32 ...
    Qsm_tavg          (time, lat, lon) float32 ...
    SWE_tavg          (time, lat, lon) float32 ...
    SnowDepth_tavg    (time, lat, lon) float32 ...
    SoilMoist_tavg    (time, SoilMoist_profiles, lat, lon) float32 ...
    CanopInt_tavg     (time, lat, lon) float32 ...
    WaterTableD_tavg  (time, lat, lon) float32 ...
    TWS_tavg          (time, lat, lon) float32 ...
    GWS_tavg          (time, lat, lon) float32 ...
    SnowCover_tavg    (time, lat, lon) float32 ...
    TotalPrecip_tavg  (time, lat, lon) float32 ...
```
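This error usually means that a variable sharing its name with a dimension is not one-dimensional; after the rename, the multi-dimensional `lat` and `lon` data variables collide with the new `lat`/`lon` dimension names. A minimal sketch of one way around that, stated as an assumption about the cause rather than the fix recorded in this issue (`files_lis` is the file list from the code above; which row/column to slice for the 1-D vectors is also an assumption):

```python
import xarray as xr

ncdat = xr.open_mfdataset(files_lis)

# Extract 1-D latitude/longitude vectors while lat/lon still live on the
# original (time, north_south, east_west) dims.
lat_coords = ncdat['lat'].isel(time=0, east_west=0).values
lon_coords = ncdat['lon'].isel(time=0, north_south=0).values

# Drop the multi-dimensional lat/lon variables *before* renaming the dims,
# so nothing named 'lat' or 'lon' conflicts with the new dimension names.
ncdat = ncdat.drop(['lat', 'lon']).rename({'north_south': 'lat', 'east_west': 'lon'})

# Attach the 1-D vectors as proper dimension coordinates.
ncdat = ncdat.assign_coords(lat=lat_coords, lon=lon_coords)
```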

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.3 (default, Mar 27 2019, 16:54:48) [Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 18.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2

xarray: 0.12.1
pandas: 0.24.2
numpy: 1.16.4
scipy: 1.2.1
netCDF4: 1.4.2
pydap: None
h5netcdf: None
h5py: 2.9.0
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudonetCDF: None
rasterio: 1.0.21
cfgrib: None
iris: None
bottleneck: None
dask: 1.2.2
distributed: 1.28.1
matplotlib: 3.1.0
cartopy: None
seaborn: 0.9.0
setuptools: 41.0.1
pip: 19.1.1
conda: None
pytest: None
IPython: 7.5.0
sphinx: None
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3017/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue
id: 392816513 · node_id: MDU6SXNzdWUzOTI4MTY1MTM= · number: 2623 · title: Why is my export to netcdf command leading to a __truediv__ error? · user: msaharia (2014301) · state: closed · locked: 0 · comments: 7 · created_at: 2018-12-19T23:16:14Z · updated_at: 2018-12-24T15:58:03Z · closed_at: 2018-12-24T15:58:03Z · author_association: NONE

Code Sample, a copy-pastable example if possible

```python
import numpy as np
import xarray as xr
import glob, os

NCDIR = './output/out/'
finalfile = 'summaout.nc'

outfilelist = glob.glob((NCDIR + '/{}.nc').format('basin_*timestep'))

ds = xr.open_mfdataset(outfilelist, concat_dim='hru')

replace = ds['pptrate']
runoff = ds['averageInstantRunoff'].values
runoff = np.squeeze(runoff, axis=2)  # original read np.squeeze(runoffdata, axis=2); 'runoffdata' is undefined, 'runoff' appears to be intended
runoff = runoff.transpose()
replace.values = runoff

ncconvert = ds.drop('averageInstantRunoff')
runoffarray = xr.DataArray(runoff, dims=['time', 'hru'])
ds['averageInstantRunoff'] = runoffarray
ds.to_netcdf('test.nc')
```

Problem description

This is `ds` just before export:

```
<xarray.Dataset>
Dimensions:               (hru: 17, time: 233)
Coordinates:
  * hru                   (hru) int64 9 17 11 8 3 2 6 4 7 12 1 13 10 16 15 5 14
  * time                  (time) datetime64[ns] 2010-01-01 ... 2010-01-30
Data variables:
    pptrate               (time, hru) float64 9.241e-05 9.241e-05 ... 2.717e-09
    hruId                 (hru) int64 dask.array<shape=(17,), chunksize=(1,)>
    averageInstantRunoff  (time, hru) float64 9.241e-05 9.241e-05 ... 2.717e-09
    nSnow                 (time, hru) int32 dask.array<shape=(233, 17), chunksize=(233, 1)>
    nSoil                 (time, hru) int32 dask.array<shape=(233, 17), chunksize=(233, 1)>
    nLayers               (time, hru) int32 dask.array<shape=(233, 17), chunksize=(233, 1)>
```

I get this error message:

TypeError: cannot perform __truediv__ with this index type: <class 'pandas.core.indexes.datetimes.DatetimeIndex'>
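For reference, a more direct way to attach the reworked array so that it carries the dataset's existing `time` and `hru` coordinates is shown below. This is a minimal sketch assuming `runoff` already has shape (time, hru) as in the code above, not a fix confirmed in this thread:

```python
# Assign the numpy array with an explicit dims tuple; xarray aligns it with
# the existing 'time' and 'hru' coordinates of ds before writing.
ds['averageInstantRunoff'] = (('time', 'hru'), runoff)
ds.to_netcdf('test.nc')
```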

Expected Output

netCDF

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Linux
OS-release: 3.12.62-60.64.8-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

xarray: 0.11.0
pandas: 0.21.0
numpy: 1.13.3
scipy: 1.0.0
netCDF4: 1.4.0
h5netcdf: 0.5.0
h5py: 2.7.1
Nio: None
zarr: None
cftime: 1.0.0
PseudonetCDF: None
rasterio: None
iris: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.19.3
distributed: 1.23.3
matplotlib: 2.1.2
cartopy: 0.15.1
seaborn: 0.9.0
setuptools: 36.7.2
pip: 18.1
conda: 4.5.11
pytest: 3.2.5
IPython: 6.2.1
sphinx: 1.6.5
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2623/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue


CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);