issue_comments


5 rows where issue = 258500654 and user = 206773 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
330464740 https://github.com/pydata/xarray/issues/1576#issuecomment-330464740 https://api.github.com/repos/pydata/xarray/issues/1576 MDEyOklzc3VlQ29tbWVudDMzMDQ2NDc0MA== forman 206773 2017-09-19T08:16:43Z 2017-09-19T08:16:43Z NONE

@shoyer

> We currently decode anything with a _FillValue attribute to float, ...

I believe this behavior is surprising for any user of integer/index/enum/classification datasets. Since its justification seems to be an implementation detail that comes at the cost of increased memory and CPU consumption, I suggest documenting it in the open_dataset() and decode_cf() functions.

Here is how we overcome this issue by deleting the _FillValue attribute of integer variables if their scale_factor and add_offset attributes are not provided:

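import xarray as xr
# Open without CF decoding so that variables keep their on-disk dtype.
ds = xr.open_dataset(path, decode_cf=False)
# Temporarily remove _FillValue from unscaled integer variables.
old_fill_values = unset_fill_value_for_int_vars(ds)
ds = xr.decode_cf(ds)
# Put the saved fill values back after decoding.
reset_fill_value_for_int_vars(ds, old_fill_values)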

where old_fill_values is a mapping of variable names to fill values.
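
The two helpers are not shown in the comment. A minimal sketch of what they might look like follows, assuming that fill values live in each variable's `attrs` (as they do when a dataset is opened with `decode_cf=False`) and that restoring them into `attrs` after decoding is acceptable. The function bodies are hypothetical and have not been checked against a specific xarray version; they only rely on `ds.variables`, `var.dtype.kind` and `var.attrs`.

```python
def unset_fill_value_for_int_vars(ds):
    """Remove _FillValue from unscaled integer variables.

    Returns a mapping of variable names to the removed fill values.
    Hypothetical helper: only the name comes from the comment above.
    """
    old_fill_values = {}
    for name, var in ds.variables.items():
        attrs = var.attrs
        is_integer = var.dtype.kind in ("i", "u")
        is_scaled = "scale_factor" in attrs or "add_offset" in attrs
        if is_integer and not is_scaled and "_FillValue" in attrs:
            # pop() both removes the attribute and hands back its value.
            old_fill_values[name] = attrs.pop("_FillValue")
    return old_fill_values


def reset_fill_value_for_int_vars(ds, old_fill_values):
    """Restore the fill values saved by unset_fill_value_for_int_vars()."""
    for name, fill_value in old_fill_values.items():
        # Assumption: the value goes back into attrs, not into encoding.
        ds.variables[name].attrs["_FillValue"] = fill_value
```

Scaled variables are left alone on purpose: applying scale_factor or add_offset produces floats anyway, so nothing is gained by hiding their fill value.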

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Variable of dtype int8 casted to float64 258500654
330275698 https://github.com/pydata/xarray/issues/1576#issuecomment-330275698 https://api.github.com/repos/pydata/xarray/issues/1576 MDEyOklzc3VlQ29tbWVudDMzMDI3NTY5OA== forman 206773 2017-09-18T16:20:33Z 2017-09-18T16:20:33Z NONE

@jhamman _NoFill is about optimizing writes, see nc_set_fill

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Variable of dtype int8 casted to float64 258500654
330273842 https://github.com/pydata/xarray/issues/1576#issuecomment-330273842 https://api.github.com/repos/pydata/xarray/issues/1576 MDEyOklzc3VlQ29tbWVudDMzMDI3Mzg0Mg== forman 206773 2017-09-18T16:13:45Z 2017-09-18T16:13:45Z NONE

I see, that is what is done in mask_and_scale(). Why can't xarray use masked arrays, which would retain the original dtype? (Dask, I guess?) Expanding integers to 8-byte floats costs not only memory but also CPU, and it results in an inaccurate in-memory integer representation.
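
To make the trade-off concrete, here is a small NumPy-only sketch. It does not reproduce what mask_and_scale() does internally; it just contrasts a masked array, which keeps int8, with NaN-based masking, which requires a float dtype. The sample values are made up for illustration.

```python
import numpy as np

# Made-up int8 samples in which 0 plays the role of _FillValue.
raw = np.array([0, 10, 20, 0, 30], dtype=np.int8)
fill_value = np.int8(0)

# Option 1: a masked array keeps the original dtype and adds a boolean mask.
masked = np.ma.masked_equal(raw, fill_value)
masked_bytes = masked.data.nbytes + np.ma.getmaskarray(masked).nbytes

# Option 2: NaN marks missing values, which forces a floating-point dtype.
as_float = raw.astype(np.float64)
as_float[raw == fill_value] = np.nan

print(masked.dtype, masked_bytes)        # dtype and size of data plus mask
print(as_float.dtype, as_float.nbytes)   # dtype and size of the float copy
```

For this toy array, the masked version needs one byte of data plus one byte of mask per element, whereas the float64 version needs eight bytes per element.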

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Variable of dtype int8 casted to float64 258500654
330267397 https://github.com/pydata/xarray/issues/1576#issuecomment-330267397 https://api.github.com/repos/pydata/xarray/issues/1576 MDEyOklzc3VlQ29tbWVudDMzMDI2NzM5Nw== forman 206773 2017-09-18T15:52:55Z 2017-09-18T16:00:01Z NONE

I guess the problem originates in xarray/conventions.py.

Note that, when debugging into it, fill_value == np.array([0], dtype=np.int8) and fill_value.dtype.kind == 'i', and the latter kind is not dealt with. Therefore int8 is turned into float64.
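
For readers unfamiliar with `dtype.kind`, the following sketch shows the one-character kind codes NumPy reports for a few dtypes. It only illustrates the values seen in the debugger and is not taken from xarray/conventions.py.

```python
import numpy as np

# The fill value as observed in the debugger.
fill_value = np.array([0], dtype=np.int8)
print(fill_value.dtype.kind)

# Kind codes for a few common dtypes: 'i' signed int, 'u' unsigned int, 'f' float.
kinds = {
    name: np.dtype(name).kind
    for name in ("int8", "uint8", "int16", "float32", "float64")
}
print(kinds)
```

Only the floating-point kinds can represent NaN, which is why NaN-based masking of an 'i' or 'u' variable implies a cast to a float dtype.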

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Variable of dtype int8 casted to float64 258500654
330261323 https://github.com/pydata/xarray/issues/1576#issuecomment-330261323 https://api.github.com/repos/pydata/xarray/issues/1576 MDEyOklzc3VlQ29tbWVudDMzMDI2MTMyMw== forman 206773 2017-09-18T15:32:27Z 2017-09-18T15:32:27Z NONE

Here you are

$ ncdump -h -s
netcdf ESACCI-LC-L4-LCCS-Map-300m-P5Y-2005-v1.6.1 {
dimensions:
        lat = 64800 ;
        lon = 129600 ;
variables:
        byte lccs_class(lat, lon) ;
                lccs_class:long_name = "Land cover class defined in LCCS" ;
                lccs_class:standard_name = "land_cover_lccs" ;
                lccs_class:flag_values = 0b, 10b, 11b, 12b, 20b, 30b, 40b, 50b, 60b, 61b, 62b, 70b, 71b, 72b, 80b, 81b, 82b, 90b, 100b, 110b, 120b, 121b, 122b, -126b, -116b, -106b, -104b, -103b, -96b, -86b, -76b, -66b, -56b, -55b, -54b, -46b, -36b ;
                lccs_class:flag_meanings = "no_data cropland_rainfed cropland_rainfed_herbaceous_cover cropland_rainfed_tree_or_shrub_cover cropland_irrigated mosaic_cropland mosaic_natural_vegetation tree_broadleaved_evergreen_closed_to_open tree_broadleaved_deciduous_closed_to_open tree_broadleaved_deciduous_closed tree_broadleaved_deciduous_open tree_needleleaved_evergreen_closed_to_open tree_needleleaved_evergreen_closed tree_needleleaved_evergreen_open tree_needleleaved_deciduous_closed_to_open tree_needleleaved_deciduous_closed tree_needleleaved_deciduous_open tree_mixed mosaic_tree_and_shrub mosaic_herbaceous shrubland shrubland_evergreen shrubland_deciduous grassland lichens_and_mosses sparse_vegetation sparse_shrub sparse_herbaceous tree_cover_flooded_fresh_or_brakish_water tree_cover_flooded_saline_water shrub_or_herbaceous_cover_flooded urban bare_areas bare_areas_consolidated bare_areas_unconsolidated water snow_and_ice" ;
                lccs_class:valid_min = 1 ;
                lccs_class:valid_max = 220 ;
                lccs_class:_Unsigned = "true" ;
                lccs_class:_FillValue = 0b ;
                lccs_class:ancillary_variables = "processed_flag current_pixel_state observation_count algorithmic_confidence_level" ;
                lccs_class:_Storage = "chunked" ;
                lccs_class:_ChunkSizes = 2048, 2048 ;
                lccs_class:_DeflateLevel = 6 ;
                lccs_class:_NoFill = "true" ;
        byte processed_flag(lat, lon) ;
                processed_flag:standard_name = "land_cover_lccs status_flag" ;
                processed_flag:flag_values = 0b, 1b ;
                processed_flag:flag_meanings = "not_processed processed" ;
                processed_flag:valid_min = 0 ;
                processed_flag:valid_max = 1 ;
                processed_flag:_FillValue = -1b ;
                processed_flag:long_name = "LC map processed area flag" ;
                processed_flag:_Storage = "chunked" ;
                processed_flag:_ChunkSizes = 2048, 2048 ;
                processed_flag:_DeflateLevel = 6 ;
                processed_flag:_NoFill = "true" ;
        byte current_pixel_state(lat, lon) ;
                current_pixel_state:standard_name = "land_cover_lccs status_flag" ;
                current_pixel_state:flag_values = 0b, 1b, 2b, 3b, 4b, 5b ;
                current_pixel_state:flag_meanings = "invalid clear_land clear_water clear_snow_ice cloud cloud_shadow" ;
                current_pixel_state:valid_min = 0 ;
                current_pixel_state:valid_max = 5 ;
                current_pixel_state:_FillValue = -1b ;
                current_pixel_state:long_name = "LC pixel type mask" ;
                current_pixel_state:_Storage = "chunked" ;
                current_pixel_state:_ChunkSizes = 2048, 2048 ;
                current_pixel_state:_DeflateLevel = 6 ;
                current_pixel_state:_NoFill = "true" ;
        short observation_count(lat, lon) ;
                observation_count:standard_name = "land_cover_lccs number_of_observations" ;
                observation_count:valid_min = 0 ;
                observation_count:valid_max = 32767 ;
                observation_count:_FillValue = -1s ;
                observation_count:long_name = "number of valid observations" ;
                observation_count:_Storage = "chunked" ;
                observation_count:_ChunkSizes = 2048, 2048 ;
                observation_count:_DeflateLevel = 6 ;
                observation_count:_Endianness = "little" ;
                observation_count:_NoFill = "true" ;
        byte algorithmic_confidence_level(lat, lon) ;
                algorithmic_confidence_level:standard_name = "land_cover_lccs algorithmic_confidence" ;
                algorithmic_confidence_level:valid_min = 0 ;
                algorithmic_confidence_level:valid_max = 100 ;
                algorithmic_confidence_level:scale_factor = 0.01f ;
                algorithmic_confidence_level:_FillValue = -1b ;
                algorithmic_confidence_level:long_name = "LC map confidence level based on algorithm performance" ;
                algorithmic_confidence_level:_Storage = "chunked" ;
                algorithmic_confidence_level:_ChunkSizes = 2048, 2048 ;
                algorithmic_confidence_level:_DeflateLevel = 6 ;
                algorithmic_confidence_level:_NoFill = "true" ;
        float lat(lat) ;
                lat:long_name = "latitude" ;
                lat:standard_name = "latitude" ;
                lat:valid_min = -89.9986f ;
                lat:valid_max = 89.99861f ;
                lat:units = "degrees_north" ;
                lat:_Storage = "chunked" ;
                lat:_ChunkSizes = 64800 ;
                lat:_DeflateLevel = 6 ;
                lat:_Endianness = "little" ;
                lat:_NoFill = "true" ;
        float lon(lon) ;
                lon:long_name = "longitude" ;
                lon:standard_name = "longitude" ;
                lon:valid_min = -179.9986f ;
                lon:valid_max = 179.9986f ;
                lon:units = "degrees_east" ;
                lon:_Storage = "chunked" ;
                lon:_ChunkSizes = 129600 ;
                lon:_DeflateLevel = 6 ;
                lon:_Endianness = "little" ;
                lon:_NoFill = "true" ;
        int crs ;
                crs:i2m = "0.002777777701187,0.0,0.0,-0.002777777701187,-180.00000033927267,90.0" ;
                crs:wkt = "GEOGCS[\"WGS 84\", \r\n  DATUM[\"World Geodetic System 1984\", \r\n    SPHEROID[\"WGS 84\", 6378137.0, 298.257223563, AUTHORITY[\"EPSG\",\"7030\"]], \r\n    AUTHORITY[\"EPSG\",\"6326\"]], \r\n  PRIMEM[\"Greenwich\", 0.0, AUTHORITY[\"EPSG\",\"8901\"]], \r\n  UNIT[\"degree\", 0.017453292519943295], \r\n  AXIS[\"Geodetic longitude\", EAST], \r\n  AXIS[\"Geodetic latitude\", NORTH], \r\n  AUTHORITY[\"EPSG\",\"4326\"]]" ;
                crs:_Endianness = "little" ;
                crs:_NoFill = "true" ;

// global attributes:
                :title = "ESA CCI Land Cover Map" ;
                :summary = "This dataset contains the global ESA CCI land cover classification map derived from satellite data of one epoch." ;
                :type = "ESACCI-LC-L4-LCCS-Map-300m-P5Y" ;
                :id = "ESACCI-LC-L4-LCCS-Map-300m-P5Y-2005-v1.6.1" ;
                :project = "Climate Change Initiative - European Space Agency" ;
                :references = "http://www.esa-landcover-cci.org/" ;
                :institution = "Universite catholique de Louvain" ;
                :contact = "landcover-cci@uclouvain.be" ;
                :comment = "" ;
                :Conventions = "CF-1.6" ;
                :standard_name_vocabulary = "NetCDF Climate and Forecast (CF) Standard Names version 21" ;
                :keywords = "land cover classification,satellite,observation" ;
                :keywords_vocabulary = "NASA Global Change Master Directory (GCMD) Science Keywords" ;
                :license = "ESA CCI Data Policy: free and open access" ;
                :naming_authority = "org.esa-cci" ;
                :cdm_data_type = "grid" ;
                :TileSize = "2048:2048" ;
                :tracking_id = "00f7e0ee-3b0e-4ea3-9b9f-186e02fb4439" ;
                :product_version = "1.6.1" ;
                :date_created = "20151217T094622Z" ;
                :creator_name = "University catholique de Louvain" ;
                :creator_url = "http://www.uclouvain.be/" ;
                :creator_email = "landcover-cci@uclouvain.be" ;
                :source = "MERIS FR L1B version 5.05, MERIS RR L1B version 8.0, SPOT VGT P" ;
                :history = "amorgos-4,0, lc-sdr-1.0, lc-sr-1.0, lc-classification-1.0,lc-user-tools-3.10" ;
                :time_coverage_start = "20030101" ;
                :time_coverage_end = "20071231" ;
                :time_coverage_duration = "P5Y" ;
                :time_coverage_resolution = "P5Y" ;
                :geospatial_lat_min = "-89.99999" ;
                :geospatial_lat_max = "90.0" ;
                :geospatial_lon_min = "-180.0" ;
                :geospatial_lon_max = "179.99998" ;
                :spatial_resolution = "300m" ;
                :geospatial_lat_units = "degrees_north" ;
                :geospatial_lat_resolution = "0.002778" ;
                :geospatial_lon_units = "degrees_east" ;
                :geospatial_lon_resolution = "0.002778" ;
                :_SuperblockVersion = 2 ;
                :_IsNetcdf4 = 1 ;
                :_Format = "netCDF-4" ;
}
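
A note on reading the header above: lccs_class is stored as signed bytes, yet valid_max is 220 and several flag_values are negative. Assuming `_Unsigned = "true"` is meant in the usual way (stored bytes are reinterpreted as unsigned 8-bit integers), the negative flag values map to class codes as in this sketch, which uses plain Python arithmetic and no netCDF library:

```python
# Negative flag_values of lccs_class, copied from the ncdump header above.
signed_flags = [-126, -116, -106, -104, -103, -96, -86,
                -76, -66, -56, -55, -54, -46, -36]

# Reinterpreting a signed byte as unsigned is the same as taking it modulo 256.
unsigned_flags = [value % 256 for value in signed_flags]

for signed, unsigned in zip(signed_flags, unsigned_flags):
    print(f"{signed:5d}b -> {unsigned}")
```

Under that assumption the largest class code equals valid_max (220), and all of these codes still fit in a single byte per pixel.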
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Variable of dtype int8 casted to float64 258500654


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);