issue_comments: 265174968


html_url	issue_url	id	node_id	user	created_at	updated_at	author_association	body	reactions	performed_via_github_app	issue
https://github.com/pydata/xarray/issues/1084#issuecomment-265174968	https://api.github.com/repos/pydata/xarray/issues/1084	265174968	MDEyOklzc3VlQ29tbWVudDI2NTE3NDk2OA==	6628425	2016-12-06T15:13:41Z	2016-12-06T15:13:41Z	MEMBER	@shoyer brings up a good point regarding partial datetime string indexing. For instance in my basic example, indexing with truncated string dates of the form `'2000-01-01'` (versus the full specification, `'2000-01-01 00:00:00'`) works because `netcdftime._parse_date` simply assumes that you meant `'2000-01-01 00:00:00'` when you wrote `'2000-01-01'`. This would mean that the same string specification could have different behavior for `DatetimeIndex` objects versus `NetCDFTimeIndex` objects, which is probably not desirable.

For instance, using the current setup in my basic example with sub-daily resolution data, selecting a time using `'2000-01-01'` would give you just the value associated with `'2000-01-01 00:00:00'`:

```
In [20] dates = [netcdftime.DatetimeAllLeap(2000, 1, 1, 0), netcdftime.DatetimeAllLeap(2000, 1, 1, 3)]
In [21] da = xr.DataArray(np.arange(2), coords=[NetCDFTimeIndex(dates)], dims=['time'])
In [22] da.sel(time='2000-01-01')
Out [22]
<xarray.DataArray ()>
array(0)
Coordinates:
    time     object 2000-01-01 00:00:00
```

but using a `DatetimeIndex` this would give you both values (because of partial datetime string selection):

```
In [23] from datetime import datetime
In [24] dates = [datetime(2000, 1, 1, 0), datetime(2000, 1, 1, 3)]
In [25] da = xr.DataArray(np.arange(2), coords=[dates], dims=['time'])
In [26] da.sel(time='2000-01-01')
Out [26]
<xarray.DataArray (time: 2)>
array([0, 1])
Coordinates:
  * time     (time) datetime64[ns] 2000-01-01 2000-01-01T03:00:00
```

I think if we were to include string-based indexing, it would be best if it were completely consistent with the `DatetimeIndex` version. I would love to be wrong, but I don't see a clean way of directly using existing code from pandas to enable this.
At least in my (possibly naive) reading of the internals of `DatetimeIndex`, the functions associated with partial datetime string selection are somewhat tied to using datetimes with standard calendars (somewhat in the weeds, but more specifically I'm looking at `pandas.tslib.parse_datetime_string_with_reso` and `pandas.tseries.index.DatetimeIndex._parsed_string_to_bounds`), and it could take a fair bit of adapting that code for our purposes to unhitch that dependence. Is that a fair assessment? So ultimately this raises the question: would we want to add just the field accessors to enable group-by operations for now and add string-based selection (and other features like `resample`) later, or should we put our heads down and work out a solution for partial datetime string based selection using netcdftime datetime objects?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		187591179
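
The two steps the comment points to in pandas (parse a string together with its resolution, then turn it into a pair of bounds) can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses plain `datetime.datetime` objects as a stand-in, the names `parse_with_resolution`, `string_to_bounds` and `select` are hypothetical and are not pandas, xarray or netcdftime API, and it uses half-open intervals as a simplification. A version for netcdftime objects would presumably need the end-bound arithmetic to follow the non-standard calendar, which this sketch does not attempt.

```python
from datetime import datetime, timedelta

# Formats are tried from most to least specific (all names here are hypothetical).
FORMATS = [
    ("%Y-%m-%d %H:%M:%S", "second"),
    ("%Y-%m-%d", "day"),
    ("%Y-%m", "month"),
    ("%Y", "year"),
]


def parse_with_resolution(text):
    """Return (datetime, resolution) for a possibly truncated date string."""
    for fmt, reso in FORMATS:
        try:
            return datetime.strptime(text, fmt), reso
        except ValueError:
            continue
    raise ValueError("unrecognised date string: %r" % text)


def string_to_bounds(text):
    """Return the half-open interval [start, end) implied by the string."""
    start, reso = parse_with_resolution(text)
    if reso == "year":
        end = start.replace(year=start.year + 1)
    elif reso == "month":
        # Roll over to January of the next year after December.
        end = start.replace(year=start.year + start.month // 12,
                            month=start.month % 12 + 1)
    elif reso == "day":
        end = start + timedelta(days=1)
    else:
        end = start + timedelta(seconds=1)
    return start, end


def select(times, values, text):
    """Keep every value whose time falls inside the string's bounds."""
    start, end = string_to_bounds(text)
    return [v for t, v in zip(times, values) if start <= t < end]


times = [datetime(2000, 1, 1, 0), datetime(2000, 1, 1, 3)]
values = [0, 1]
day_selection = select(times, values, "2000-01-01")             # both values
exact_selection = select(times, values, "2000-01-01 00:00:00")  # first value only
```

With this approach a day-resolution string selects everything within that day, which matches the `DatetimeIndex` result shown in the comment, while a fully specified string selects a single value.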