
issues: 1636435706


id node_id number title user state locked assignee milestone comments created_at updated_at closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1636435706 I_kwDOAMm_X85higb6 7662 Describe output time index after resampling in docs / docstring 103456955 open 0     3 2023-03-22T20:14:40Z 2023-03-23T19:27:31Z   NONE      

What is your issue?

I have monthly files of hourly TCWV for the years 2000-2021. I am loading in all the files as an `xarray.Dataset` as follows:

```
path = '/ocean/projects/atm200007p/sferrett/data/raw/'
files = np.sort(glob.glob(path+'tcwv.nc'))
tcwv = xr.open_mfdataset(files,chunks={'time':24,'latitude':121,'longitude':161})
print(tcwv)
```

```
<xarray.Dataset>
Dimensions:    (time: 192864, latitude: 121, longitude: 161)
Coordinates:
  * latitude   (latitude) float64 30.0 29.75 29.5 29.25 ... 0.75 0.5 0.25 0.0
  * longitude  (longitude) float64 45.0 45.25 45.5 45.75 ... 84.5 84.75 85.0
  * time       (time) datetime64[ns] 2000-01-01 ... 2021-12-31T23:00:00
Data variables:
    TCWV       (time, latitude, longitude) float32 dask.array<chunksize=(24, 121, 161), meta=np.ndarray>
Attributes:
    DATA_SOURCE:          ECMWF: https://cds.climate.copernicus.eu, Copernicu...
    NETCDF_CONVERSION:    CISL RDA: Conversion from ECMWF GRIB1 data to netCDF4.
    NETCDF_VERSION:       4.6.1
    CONVERSION_PLATFORM:  Linux casper05 3.10.0-693.21.1.el7.x86_64 #1 SMP We...
    CONVERSION_DATE:      Fri Jul 26 12:11:15 MDT 2019
    Conventions:          CF-1.6
    NETCDF_COMPRESSION:   NCO: Precision-preserving compression to netCDF4/HD...
    history:              Fri Mar 17 08:19:49 2023: ncks -d latitude,0.0,30.0...
    NCO:                  netCDF Operators version 5.0.3 (Homepage = http://n.../
```

I want to calculate daily means for June - August, so first, I call `.sel()`:

```
jja = tcwv.sel(time=tcwv.time.dt.month.isin([6,7,8]))
print(jja.groupby('time.month'))
print(jja)
```

```
DatasetGroupBy, grouped over 'month'
3 groups with labels 6, 7, 8.
<xarray.Dataset>
Dimensions:    (time: 48576, latitude: 121, longitude: 161)
Coordinates:
  * latitude   (latitude) float64 30.0 29.75 29.5 29.25 ... 0.75 0.5 0.25 0.0
  * longitude  (longitude) float64 45.0 45.25 45.5 45.75 ... 84.5 84.75 85.0
  * time       (time) datetime64[ns] 2000-06-01 ... 2021-08-31T23:00:00
Data variables:
    TCWV       (time, latitude, longitude) float32 dask.array<chunksize=(24, 121, 161), meta=np.ndarray>
```

When I call `.resample()` to calculate the daily means, I expect the time dimension to be of length 2024 (92 days in JJA multiplied by 22 years total). However, calling `.groupby('time.month')` shows that time steps are being added somehow:

```
dailymeans_jja_tcwv = jja.resample(time='D').mean()
print(dailymeans_jja_tcwv.groupby('time.month'))
print(dailymeans_jja_tcwv)
```

```
DatasetGroupBy, grouped over 'month'
12 groups with labels 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12.
<xarray.Dataset>
Dimensions:    (latitude: 121, longitude: 161, time: 7762)
Coordinates:
  * latitude   (latitude) float64 30.0 29.75 29.5 29.25 ... 0.75 0.5 0.25 0.0
  * longitude  (longitude) float64 45.0 45.25 45.5 45.75 ... 84.5 84.75 85.0
  * time       (time) datetime64[ns] 2000-06-01 2000-06-02 ... 2021-08-31
Data variables:
    TCWV       (time, latitude, longitude) float32 dask.array<chunksize=(1, 121, 161), meta=np.ndarray>
```

Is there a reason that this is happening, or a way to work around it? It seems too bulky to call resample and then subset the time dimension, especially if this operation needs to be repeated for large amounts of data.
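
For reference, the two time lengths in this report can be checked with a short calendar count. This is only a sketch of the arithmetic, using the standard library rather than xarray. It assumes that `.resample(time='D')` produces one bin per calendar day between the first and last timestamp, including days with no data. That would be consistent with the 12 month groups in the output, but it is an assumption here and not something the issue confirms. The variable names are illustrative.

```python
from datetime import date, timedelta

# Dates of the first and last timestamps in the JJA subset shown above.
first_day = date(2000, 6, 1)
last_day = date(2021, 8, 31)

# Every calendar day in the continuous span, inclusive of both ends.
span_days = (last_day - first_day).days + 1

# Only the days in that span that fall in June, July, or August.
all_days = (first_day + timedelta(days=i) for i in range(span_days))
jja_days = sum(1 for d in all_days if d.month in (6, 7, 8))

print(span_days)  # 7762, the length of the resampled time dimension
print(jja_days)   # 2024, i.e. 92 days * 22 years
```

If that assumption holds, the 5738 extra entries (7762 - 2024) would be the September to May days between consecutive summers, for which the subset has no hourly values.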

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7662/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    13221727 issue
