
issue_comments


17 rows where user = 6130352, sorted by updated_at descending


issue 9

  • Extending Xarray for domain-specific toolkits 5
  • open_zarr: concat_characters has no effect when dtype=U1 4
  • Self joins with non-unique indexes 2
  • Use masked arrays while preserving int 1
  • Dataset.encode_cf function 1
  • Fancy indexing a Dataset with dask DataArray triggers multiple computes 1
  • Index level naming bug with `concat` 1
  • Export ufuncs from DataArray API 1
  • Zarr chunks would overlap multiple dask chunks error 1

user 1

  • eric-czech · 17

author_association 1

  • NONE 17
id html_url issue_url node_id user created_at updated_at author_association body reactions performed_via_github_app issue
839846969 https://github.com/pydata/xarray/issues/5286#issuecomment-839846969 https://api.github.com/repos/pydata/xarray/issues/5286 MDEyOklzc3VlQ29tbWVudDgzOTg0Njk2OQ== eric-czech 6130352 2021-05-12T15:04:26Z 2021-05-12T15:04:26Z NONE

Thanks @shoyer, good to know!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Zarr chunks would overlap multiple dask chunks error 884209406
832659229 https://github.com/pydata/xarray/issues/5261#issuecomment-832659229 https://api.github.com/repos/pydata/xarray/issues/5261 MDEyOklzc3VlQ29tbWVudDgzMjY1OTIyOQ== eric-czech 6130352 2021-05-05T12:47:43Z 2021-05-05T12:47:43Z NONE

> In general, we are deprecating xr.ufuncs in favor of np.ufunc(DataArray)

Makes sense. It is a somewhat awkward distinction to teach, though, for users who wouldn't appreciate `__array_ufunc__` protocol compliance, especially since most of the other functionality we rely on, e.g. reductions (max, min, sum), concat, merge, indexing, and filtering, comes through the Xarray APIs alone.
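
For reference, a minimal sketch of the two spellings (assuming a version of xarray where the deprecated xarray.ufuncs module is still importable):

```python
import numpy as np
import xarray as xr
import xarray.ufuncs as xu  # deprecated module

da = xr.DataArray([1.0, 4.0, 9.0], dims="x")

# Deprecated spelling: xarray's own ufunc wrappers
old = xu.sqrt(da)

# Preferred spelling: the numpy ufunc dispatches to DataArray via the
# __array_ufunc__ protocol and returns a DataArray
new = np.sqrt(da)

assert old.identical(new)
```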

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Export ufuncs from DataArray API 876394165
828726383 https://github.com/pydata/xarray/issues/5229#issuecomment-828726383 https://api.github.com/repos/pydata/xarray/issues/5229 MDEyOklzc3VlQ29tbWVudDgyODcyNjM4Mw== eric-czech 6130352 2021-04-28T19:38:26Z 2021-04-28T19:38:26Z NONE

Yep that fixed it after upgrading to pandas 1.2.4. Thanks!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Index level naming bug with `concat` 869792877
740993933 https://github.com/pydata/xarray/issues/4663#issuecomment-740993933 https://api.github.com/repos/pydata/xarray/issues/4663 MDEyOklzc3VlQ29tbWVudDc0MDk5MzkzMw== eric-czech 6130352 2020-12-08T20:38:44Z 2020-12-08T20:39:23Z NONE

> I like using our raise_if_dask_computes context since it points out where the compute is happening

Oo nice, great to know about that.

> This looks like a duplicate of #2801. If you agree, can we move the conversation there?

Defining a general strategy for handling unknown chunk sizes seems like a good umbrella for it. I would certainly mention the multiple executions, though; that seems somewhat orthogonal.

Have there been prior discussions about the fact that dask doesn't support consecutive slicing operations well (i.e. applying filters one after the other)? I am wondering how far off that is in dask vs. simply trying to support the current behavior well; maybe forcing evaluation of indexer arrays is the practical solution for the foreseeable future, provided xarray only did so once.
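
For anyone following along, a minimal sketch of how that helper is used; it lives in xarray's test suite rather than the public API, so treat this as illustrative only:

```python
import dask.array as da
import xarray as xr
from xarray.tests import raise_if_dask_computes  # internal test helper

ds = xr.Dataset({"x": ("dim0", da.zeros(10, chunks=5))})

# Raises (with a traceback pointing at the offending call) if anything
# inside the block triggers more than max_computes dask computes
with raise_if_dask_computes(max_computes=0):
    subset = ds.isel(dim0=slice(5))  # lazy slicing, should not compute
```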

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Fancy indexing a Dataset with dask DataArray triggers multiple computes 759709924
689046158 https://github.com/pydata/xarray/issues/4412#issuecomment-689046158 https://api.github.com/repos/pydata/xarray/issues/4412 MDEyOklzc3VlQ29tbWVudDY4OTA0NjE1OA== eric-czech 6130352 2020-09-08T18:06:23Z 2020-09-08T18:06:23Z NONE

Ok thanks @dcherian! I'll try that (feel free to close this).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Dataset.encode_cf function 696047530
686752919 https://github.com/pydata/xarray/issues/4405#issuecomment-686752919 https://api.github.com/repos/pydata/xarray/issues/4405 MDEyOklzc3VlQ29tbWVudDY4Njc1MjkxOQ== eric-czech 6130352 2020-09-03T20:38:47Z 2020-09-03T20:38:47Z NONE

Np! Sounds good.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_zarr: concat_characters has no effect when dtype=U1 692238160
686745468 https://github.com/pydata/xarray/issues/4405#issuecomment-686745468 https://api.github.com/repos/pydata/xarray/issues/4405 MDEyOklzc3VlQ29tbWVudDY4Njc0NTQ2OA== eric-czech 6130352 2020-09-03T20:29:21Z 2020-09-03T20:29:21Z NONE

Hm, got it. Should I close this out then, or might there be something awry, given that concatenation doesn't work with U1 types?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_zarr: concat_characters has no effect when dtype=U1 692238160
686716048 https://github.com/pydata/xarray/issues/4405#issuecomment-686716048 https://api.github.com/repos/pydata/xarray/issues/4405 MDEyOklzc3VlQ29tbWVudDY4NjcxNjA0OA== eric-czech 6130352 2020-09-03T19:40:53Z 2020-09-03T19:40:53Z NONE

Also out of curiosity, do you know why that's True by default?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_zarr: concat_characters has no effect when dtype=U1 692238160
686715024 https://github.com/pydata/xarray/issues/4405#issuecomment-686715024 https://api.github.com/repos/pydata/xarray/issues/4405 MDEyOklzc3VlQ29tbWVudDY4NjcxNTAyNA== eric-czech 6130352 2020-09-03T19:38:36Z 2020-09-03T19:38:36Z NONE

🤦 lol yes that works. Should U1 characters not also be concatenated when that's True? I.e. is this expected:

```python
chrs = np.array([
    ['A', 'B'],
    ['C', 'D'],
    ['E', 'F'],
], dtype='U1')
ds = xr.Dataset(dict(x=(('dim0', 'dim1'), chrs)))
ds.to_zarr('/tmp/test.zarr', mode='w')
xr.open_zarr('/tmp/test.zarr', concat_characters=True).x.compute()

# No concatenation occurs:
# <xarray.DataArray 'x' (dim0: 3, dim1: 2)>
# array([['A', 'B'],
#        ['C', 'D'],
#        ['E', 'F']], dtype='<U1')
# Dimensions without coordinates: dim0, dim1
```

Basically, what does open_zarr consider to be a character?
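
My working assumption (not confirmed here) is that xarray's CF decoding only treats length-1 byte strings ('S1') as characters, so 'U1' data passes through untouched. A minimal sketch of that behavior via decode_cf:

```python
import numpy as np
import xarray as xr

# Byte strings of dtype 'S1' are what the CF decoder treats as characters
chrs = np.array([[b'A', b'B'], [b'C', b'D'], [b'E', b'F']], dtype='S1')
ds = xr.Dataset({'x': (('dim0', 'dim1'), chrs)})

# The trailing character dimension is concatenated away for 'S1' data
decoded = xr.decode_cf(ds, concat_characters=True)
print(decoded.x.dtype)  # expected: |S2, with 'dim1' consumed
```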

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_zarr: concat_characters has no effect when dtype=U1 692238160
612978261 https://github.com/pydata/xarray/issues/3959#issuecomment-612978261 https://api.github.com/repos/pydata/xarray/issues/3959 MDEyOklzc3VlQ29tbWVudDYxMjk3ODI2MQ== eric-czech 6130352 2020-04-13T16:36:32Z 2020-04-13T16:36:32Z NONE

Thanks again @keewis! I moved the static typing discussion to https://github.com/pydata/xarray/issues/3967.

This is closed out now as far as I'm concerned.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Extending Xarray for domain-specific toolkits 597475005
612513722 https://github.com/pydata/xarray/issues/3959#issuecomment-612513722 https://api.github.com/repos/pydata/xarray/issues/3959 MDEyOklzc3VlQ29tbWVudDYxMjUxMzcyMg== eric-czech 6130352 2020-04-11T21:07:07Z 2020-04-11T21:39:42Z NONE

Thanks @keewis! I like those ideas so I experimented a bit and found a few things.

> you could emulate the availability of the accessors by checking your variables in the constructor of the accessor using ... now that we have a "type" registry, we could also have one accessor, and pass a kind parameter to your analyze function:

Is there any reason not to put the name of the type into attrs and just switch on that, rather than on the keys in data_vars? Forcing unique data_vars keys across the different dataset types isn't a big deal, but I thought a single type name or something similar in attrs would be simpler.
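
For concreteness, a rough sketch of what I mean (all names hypothetical):

```python
import xarray as xr

@xr.register_dataset_accessor("genetics")
class GeneticsAccessor:
    def __init__(self, ds):
        self.ds = ds

    def analyze(self):
        # Switch on a single type name stored in attrs rather than
        # inferring the dataset type from data_vars keys
        kind = self.ds.attrs.get("type")
        if kind == "genotype_calls":
            return self._analyze_calls()
        if kind == "haplotype_calls":
            return self._analyze_haplotypes()
        raise ValueError(f"unknown dataset type: {kind!r}")

    def _analyze_calls(self): ...
    def _analyze_haplotypes(self): ...
```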

> If someone actually gets this to work, we might be able to provide a xarray.typing module to allow something like (but depending on the amount of code needed, this could also fit in the Cookbook docs section)

I would love to try to use something like that. I couldn't get it to work either when trying to have a TypedDict that represents entire datasets, so I tried creating them for data_vars and coords separately. I think https://github.com/python/mypy/issues/4976 is particularly problematic in either case though. The gist of that issue is that covariance for TypedDict types doesn't really exist (i.e. TypedDict -> Mapping is ok, but not TypedDict -> Mapping[Hashable, Any]) and contravariance definitely isn't supported (at least not with Dict or Mapping). Some examples I played around with:

```python
MyDict = TypedDict('MyDict', {'x': str})
v1: MyDict = MyDict(x='x')

# This is fine:
v2: Mapping = v1

# But this doesn't work:
v2: Mapping[Hashable, Any] = v1  # A notable example since it's used in xr.Dataset
# error: Incompatible types in assignment (expression has type "MyDict",
#        variable has type "Mapping[Hashable, Any]")

# And neither do any of these:
v2: dict = v1
# error: Incompatible types in assignment (expression has type "MyDict",
#        variable has type "Dict[Any, Any]")

v2: Mapping[str, str] = v1
# error: Incompatible types in assignment (expression has type "MyDict",
#        variable has type "Mapping[str, str]")
```

Going the other direction isn't possible at all (i.e. from Mapping -> TypedDict) since TypedDict acts like a subtype of Mapping. I think that's a big issue downstream if xr.Dataset requires Mapping types for data_vars and coords, since you could never do something like this:

```python
ds = xr.Dataset(data_vars=MyTypedDict(data=...))

# Now assume a user wants to use data_vars/coords with type safety:
data_vars: MyTypedDict = ds.data_vars  # This doesn't work
```

Generics seem like a decent solution to all these problems, but it would obviously involve a lot of type annotation changes:

```python
# Ideally, xarray.typing would help specify more specific constraints,
# but this works with what exists today:
GenotypeDataVars = TypedDict('GenotypeDataVars', {'data': DataArray, 'mask': DataArray})
GenotypeCoords = TypedDict('GenotypeCoords', {'variant': DataArray, 'sample': DataArray})

D = TypeVar('D', bound=Mapping)
C = TypeVar('C', bound=Mapping)

# Assume xr.Dataset was written something like this instead:
class Dataset(Generic[D, C]):

    def __init__(self, data_vars: D, coords: C):
        self.data_vars = data_vars
        self.coords = coords

ds1: Dataset[GenotypeDataVars, GenotypeCoords] = Dataset(
    GenotypeDataVars(data=xr.DataArray(), mask=xr.DataArray()),
    GenotypeCoords(variant=xr.DataArray(), sample=xr.DataArray())
)

# Types should then be preserved even if xarray is constantly redefining
# new instances in internal functions:
ds2: Dataset[GenotypeDataVars, GenotypeCoords] = type(ds1)(ds1.data_vars, ds1.coords)  # This is OK
```

Anyways, my takeaways from everything on the thread so far are:

  • Using accessors and some kind of runtime type analysis will cover what was in the scope of my original post, but it will prohibit any kind of static type safety.
  • I imagine supporting type safety on the structure of arrays, coordinates, and attributes in Xarray would be far easier than supporting polymorphism for Dataset/DataArray, so I'm curious if that has been discussed much. Do you think it's worth opening a separate issue to continue a conversation that builds on your idea, @keewis?
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Extending Xarray for domain-specific toolkits 597475005
612050871 https://github.com/pydata/xarray/issues/3959#issuecomment-612050871 https://api.github.com/repos/pydata/xarray/issues/3959 MDEyOklzc3VlQ29tbWVudDYxMjA1MDg3MQ== eric-czech 6130352 2020-04-10T14:23:20Z 2020-04-10T14:24:35Z NONE

Thanks @keewis, that would work, though I think it leads to an awkward result if I'm understanding correctly. Here's what I'm imagining:

```python
from genetics import api

# These are different types of data structures I originally
# wanted to model as classes
ds1 = api.create_genotype_call_dataset(...)
ds2 = api.create_genotype_probability_dataset(...)
ds3 = api.create_haplotype_call_dataset(...)
# ds1, ds2, and ds3 are now just xr.Dataset instances

# For each of these different types of datasets I have separate accessors
# that expose dataset-type-specific analytical methods:
@xr.register_dataset_accessor("genotype_calls")
class GenotypeCallAccessor:
    def __init__(self, ds):
        self.ds = ds

    @property
    def analyze(self):
        # Do something you can only do with genotype call data, not probabilities or
        # haplotype data or CNV data or any other domain-specific kind of info
        pass

@xr.register_dataset_accessor("genotype_probabilities")
class GenotypeProbabilityAccessor:
    ???  # This also has some "analyze" method

@xr.register_dataset_accessor("haplotype_calls")
class HaplotypeCallAccessor:
    ???  # This also has some "analyze" method

# *** Now, how do I prevent this? ***
ds1.haplotype_calls.analyze()
# ds1 is really genotype call data, so it shouldn't be
# possible to do a haplotype analysis on it
```

Is there a way to make accessors available on an xr.Dataset based on some conditions about the dataset itself? That still seems like a bad solution, but I think it would help me here.

I was trying to think of some way to use static structural subtyping but I don't see how that could ever work with accessors given that 1) they're attached at runtime and 2) all accessors are available on ALL Dataset instances, regardless of whether or not I know only certain things should be possible based on their content.

If accessors are the only way Xarray plans to facilitate extension, has anyone managed to enable static type analysis on their extensions? In my case, I'd be happy to have any kind of safety, whether it's static or monkey-patched in at runtime, but I'm curious if making static analysis impossible was part of the discussion in deciding on accessors.
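
The closest runtime approximation I can think of, sketched under the assumption that a type marker lives in attrs: validate inside the accessor's __init__ (which xarray constructs once per Dataset instance and caches), so accessing the wrong accessor fails immediately, even though static analysis still can't see it:

```python
import xarray as xr

@xr.register_dataset_accessor("haplotype_calls")
class HaplotypeCallAccessor:
    def __init__(self, ds):
        # Hypothetical marker: fail fast so ds.haplotype_calls errors
        # immediately on anything that isn't haplotype call data
        if ds.attrs.get("type") != "haplotype_calls":
            raise ValueError("not a haplotype call dataset")
        self.ds = ds

    def analyze(self):
        ...  # haplotype-specific analysis
```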

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Extending Xarray for domain-specific toolkits 597475005
611973587 https://github.com/pydata/xarray/issues/3959#issuecomment-611973587 https://api.github.com/repos/pydata/xarray/issues/3959 MDEyOklzc3VlQ29tbWVudDYxMTk3MzU4Nw== eric-czech 6130352 2020-04-10T10:20:37Z 2020-04-10T10:22:16Z NONE

> would it be too unclear for them to hang off each HaploAccessor.specific_method()?

That works for documenting the methods, but I'm more concerned with documenting how to build the Dataset in the first place. Specifically, this would mean describing how to construct several arrays relating to genotype calls, phasing information, variant call quality scores, individual pedigree info, etc., and all these domain-specific things can have some pretty nuanced relationships, so I think describing how to create a sensible Dataset with them will be a big part of the learning curve for users. I want to essentially override the constructor docs for Dataset and make them more specific to our use cases. I can't see a good way to do that with accessors, since the dataset would already need to have been created.
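
In other words, something like this hypothetical factory, where the docstring carries the construction contract that Dataset's own constructor can't:

```python
import xarray as xr

def create_genotype_call_dataset(calls, phasing, quality) -> xr.Dataset:
    """Build a genotype call dataset.

    Parameters
    ----------
    calls : int array, shape (variants, samples)
        Genotype calls.
    phasing : bool array, shape (variants, samples)
        Whether each call is phased.
    quality : float array, shape (variants,)
        Variant call quality scores.
    """
    return xr.Dataset({
        "calls": (("variants", "samples"), calls),
        "phasing": (("variants", "samples"), phasing),
        "quality": ("variants", quality),
    })
```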

> Checking dtype and dimensions shouldn't be expensive though, or is it more than that?

It is, or at least I'd like not to preclude the checks from doing things like validating min/max values and asserting conditions along axes (e.g. that values sum to 1).

> If you have other questions about dtypes in xarray then please feel free to raise another issue about that.

Will do.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Extending Xarray for domain-specific toolkits 597475005
611950517 https://github.com/pydata/xarray/issues/3959#issuecomment-611950517 https://api.github.com/repos/pydata/xarray/issues/3959 MDEyOklzc3VlQ29tbWVudDYxMTk1MDUxNw== eric-czech 6130352 2020-04-10T09:08:06Z 2020-04-10T09:08:06Z NONE

Thanks @TomNicholas, some thoughts on your points:

> xarray internally uses methods like self._construct_dataarray(dims, values, coords, attrs) to construct return values

I'm ok with the subtype being lost after running some methods. I saw that, so I'm assuming all functions that do anything with the data structures take and return only Xarray objects.

> You could make custom accessors which perform checks on the input arrays when they get used?

Accessors could work but the issues I see with them are:

  1. What's a natural way for a user to access documentation on how to build a compliant dataset? A docstring on a constructor is great -- is there a way to do something like that with accessors? The docs could hang off of the check_* type methods, but that seems very awkward.
  2. Is there a way to avoid running check_* methods multiple times? I think those checks could be expensive, and if Xarray often builds new instances as return values, this will be an issue (a possible caching workaround is sketched after the example below):

```python
ds.haplo.do_custom_analysis_1()

# Do something with coords/indexes that causes a new Dataset to be created,
# e.g. ds.reset_index ultimately hits
# https://github.com/pydata/xarray/blob/1eedc5c146d9e6ebd46ab2cc8b271e51b3a25959/xarray/core/dataset.py#L882
# which creates a new Dataset
ds = ds.reset_index()

# The check_conforms_to_haplo_requirements function will run again even though
# I would know it's not necessary at this point:
ds.haplo.do_custom_analysis_2()
```
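
The only workaround I can picture is caching the validation result, e.g. via a hypothetical flag in attrs (this assumes attrs survive the operations in question, which isn't guaranteed for every method):

```python
import xarray as xr

@xr.register_dataset_accessor("haplo")
class HaploAccessor:
    _FLAG = "_haplo_validated"  # hypothetical marker

    def __init__(self, ds):
        if not ds.attrs.get(self._FLAG):
            check_conforms_to_haplo_requirements(ds)  # expensive checks
            ds.attrs[self._FLAG] = True
        self.ds = ds

    def do_custom_analysis_1(self):
        ...
```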

> would something akin to pandas' ExtensionDtype solve your problem?

Ah, I can see how the title on the issue is misleading, but I don't actually have a need for dtypes beyond what's already available. Well, we do actually have that problem in trying to find some way to represent 2-bit integers with sub-byte data types, but I wasn't trying to get into that on this thread. I'll make the title better.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Extending Xarray for domain-specific toolkits 597475005
605697466 https://github.com/pydata/xarray/issues/1194#issuecomment-605697466 https://api.github.com/repos/pydata/xarray/issues/1194 MDEyOklzc3VlQ29tbWVudDYwNTY5NzQ2Ng== eric-czech 6130352 2020-03-29T20:37:29Z 2020-03-29T20:37:29Z NONE

I agree; I have this same issue with large genotyping data arrays, which often contain tiny integers and some degree of missingness in nearly 100% of raw datasets. Are there recommended workarounds now? I am thinking of always using Datasets instead of DataArrays, with a mask array accompanying every data array, but I'm not sure if that's the best interim solution.
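
To make that concrete, a sketch of the workaround I have in mind (variable names hypothetical): keep the small integer dtype and carry a boolean mask alongside it, casting to float (and NaN) only where a computation requires it:

```python
import numpy as np
import xarray as xr

calls = np.array([[0, 1], [2, 0]], dtype="int8")     # tiny ints preserved
missing = np.array([[False, True], [False, False]])  # missingness mask

ds = xr.Dataset({
    "calls": (("variants", "samples"), calls),
    "missing": (("variants", "samples"), missing),
})

# .where introduces NaN (and a float cast) only at computation time
mean_call = ds.calls.where(~ds.missing).mean(dim="samples")
```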

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use masked arrays while preserving int 199188476
604580582 https://github.com/pydata/xarray/issues/3791#issuecomment-604580582 https://api.github.com/repos/pydata/xarray/issues/3791 MDEyOklzc3VlQ29tbWVudDYwNDU4MDU4Mg== eric-czech 6130352 2020-03-26T17:51:34Z 2020-03-26T17:51:34Z NONE

That'll work, thanks @keewis!

FWIW, the number of use cases I've found matching my initial question, where there are repeated index values on both sides of the join, is much lower.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Self joins with non-unique indexes 569176457
604464873 https://github.com/pydata/xarray/issues/3791#issuecomment-604464873 https://api.github.com/repos/pydata/xarray/issues/3791 MDEyOklzc3VlQ29tbWVudDYwNDQ2NDg3Mw== eric-czech 6130352 2020-03-26T14:32:40Z 2020-03-26T14:34:34Z NONE

Hey @mrocklin (cc @max-sixty), sure thing.

My original question was about how to implement a join in a typical relational algebra sense, where rows with identical values in the join clause are repeated, but I think I have an even simpler problem that is much more common in our workflows (and touches on how duplicated index values are supported).

For example, I'd like to do something like this:

```python
import xarray as xr
import numpy as np
import pandas as pd

# Assume we have a dataset of 3 individuals, one of African
# ancestry and two of European ancestry
a = pd.DataFrame({'pop_name': ['AFR', 'EUR', 'EUR'], 'sample_id': [1, 2, 3]})

# Join on ancestry to get population size
b = pd.DataFrame({'pop_name': ['AFR', 'EUR'], 'pop_size': [10, 100]})
pd.merge(a, b, on='pop_name')
```

|    | pop_name | sample_id | pop_size |
|----|----------|-----------|----------|
| 0  | AFR      | 1         | 10       |
| 1  | EUR      | 2         | 100      |
| 2  | EUR      | 3         | 100      |

With xarray, the closest equivalent to this I can find is:

```python
a = xr.DataArray(
    data=[1, 2, 3],
    dims='x',
    coords=dict(pop_name=('x', ['AFR', 'EUR', 'EUR'])),
    name='sample_id'
).set_index(dict(x='pop_name'))
# <xarray.DataArray 'sample_id' (x: 3)>
# array([1, 2, 3])
# Coordinates:
#   * x        (x) object 'AFR' 'EUR' 'EUR'

b = xr.DataArray(
    data=[10, 100],
    dims='x',
    coords=dict(pop_name=('x', ['AFR', 'EUR'])),
    name='pop_size'
).set_index(dict(x='pop_name'))
# <xarray.DataArray 'pop_size' (x: 2)>
# array([100, 10])
# Coordinates:
#   * x        (x) object 'EUR' 'AFR'

xr.merge([a, b])
# InvalidIndexError: Reindexing only valid with uniquely valued Index objects
```

The above does exactly what I want as long as the population names being used as the coordinate to merge on are unique, but that obviously doesn't make sense if those names correspond to a bunch of individuals in one of a small number of populations.

The larger context for this is that genetic data itself is typically some 2+ dimensional array with the first two dimensions corresponding to genomic sites and people. Xarray is perfect for carrying around the extra information relating to those dimensions as coordinates, but being able to attach new coordinate values by joins to external tables is important.

Am I missing something obvious in the API that will do this? Or am I likely better off converting DataArrays to DataFrames, doing my operations with the DataFrame API, and then converting back?
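
The fallback I have in mind looks roughly like this (a sketch continuing the a/b example above, assuming the round trip through pandas is acceptable):

```python
import xarray as xr

# xarray -> pandas: do the relational join where duplicate keys are allowed
df_a = a.to_dataframe().reset_index()  # columns: x, sample_id
df_b = b.to_dataframe().reset_index()  # columns: x, pop_size
merged = df_a.merge(df_b, on="x")      # rows repeat for duplicate keys

# pandas -> xarray: re-index on a unique key before converting back
ds = xr.Dataset.from_dataframe(merged.set_index("sample_id"))
```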

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Self joins with non-unique indexes 569176457


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
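
For reference, a minimal sketch of reproducing the filter behind this page against the underlying SQLite database (the database filename is an assumption):

```python
import sqlite3

con = sqlite3.connect("github.db")  # hypothetical path to the Datasette DB

# Mirrors this page: comments by user 6130352, newest first (17 rows)
rows = con.execute(
    "SELECT id, issue, created_at FROM issue_comments "
    "WHERE user = ? ORDER BY updated_at DESC",
    (6130352,),
).fetchall()
```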