issue_comments


23 rows where user = 601025 sorted by updated_at descending


issue 5

  • Anyone working on a to_tiff? Alternatively, how do you write an xarray to a geotiff? 13
  • Default chunking in GeoTIFF images 4
  • Image related methods 3
  • how do you flatten an xarray? 2
  • API Design for Xarray Backends 1

user 1

  • ebo · 23 ✖

author_association 1

  • NONE 23
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1093630278 https://github.com/pydata/xarray/issues/2042#issuecomment-1093630278 https://api.github.com/repos/pydata/xarray/issues/2042 IC_kwDOAMm_X85BL3lG ebo 601025 2022-04-09T03:14:41Z 2022-04-09T03:14:41Z NONE

Thanks for closing this dcherian, I had completely forgotten about it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Anyone working on a to_tiff? Alternatively, how do you write an xarray to a geotiff?  312203596
457584997 https://github.com/pydata/xarray/issues/2042#issuecomment-457584997 https://api.github.com/repos/pydata/xarray/issues/2042 MDEyOklzc3VlQ29tbWVudDQ1NzU4NDk5Nw== ebo 601025 2019-01-25T14:13:35Z 2019-01-25T14:13:35Z NONE

Here is an old chunk of code I wrote a while back to do this. Please note three things. There is the metadata attached to the file (I think it was through "tags"), metadata attached to the metadata "meta" variable, and some metadata that is attached on a per-band basis. It can be problematic when you assume that the info is global to the image and is embedded somehow (it took me weeks to figure some of this out).
Also note that I do per-band and image statistics... Also, I did not keep good enough notes and cannot remember where I got some of the hints; they are just as likely to have come from published examples that were hacked until they marginally worked. Also, the .xml weirdness has to do in part with historic artifacts of our particular dataset, which is over 3.5 petabytes and cannot easily be updated, so it is easier to hack around in the code.

Hope this helps:

=====================

import os

import numpy as np
import rasterio


def to_tiff(data, fname, template=None, **kwargs):
    # check and promote the number of dimensions for consistency
    ndim = data.ndim
    if ndim == 2:
        # expand the array so that it is at least 3D (i.e. stacks of surfaces)
        data = np.expand_dims(data, axis=0)
    elif ndim != 3:
        # nothing to do if it is already 3D
        print("Error: to_tiff can only currently deal with 2D and 3D data")
        return

    profile = {}
    tags = {}
    if template:
        # read the profile and tags, then close the template again
        with rasterio.open(template, 'r') as tmpl:
            profile = tmpl.profile.copy()
            tags = tmpl.tags()

    # the metadata should be appended.  Cache here to
    # simplify variable replacement below.
    meta = {}
    if 'meta' in profile:
        meta.update(profile['meta'])
    if 'meta' in kwargs:
        meta.update(kwargs['meta'])

    # overwrite anything inherited from the template with
    # user supplied args
    profile.update(kwargs)

    # overwrite bits that write the array as geotiff and
    # save the cached metadata
    profile['driver'] = 'GTiff'
    profile['count'] = data.shape[0]
    profile['width'] = data.shape[2]
    profile['height'] = data.shape[1]
    profile['meta'] = meta

    if 'dtype' not in profile:
        # use the array's own dtype rather than the type of one element
        profile['dtype'] = data.dtype.name

    # if you do not remove the previously associated .xml file,
    # then the tags and metadata can get corrupted.
    # (each file is removed separately so that a missing .tif
    # does not leave a stale .xml behind)
    for stale in (fname, fname + ".xml"):
        try:
            os.remove(stale)
        except OSError:
            pass

    # now create and save the array to a file
    with rasterio.open(fname, 'w', **profile) as out:
        for b in range(data.shape[0]):
            #print("\nprocessing band %d"%(b+1))
            out.write(data[b].astype(profile['dtype']), b+1)

            # calculate the stats for each band
            # not sure what the proper name for per band stats is in QGIS
            stats = {
                'STATISTICS_MINIMUM': np.nanmin(data[b]),
                'STATISTICS_MAXIMUM': np.nanmax(data[b]),
                'STATISTICS_MEAN': np.nanmean(data[b]),
                'STATISTICS_STDDEV': np.nanstd(data[b])}
            out.update_tags(b+1, **stats)
            #print("    stats= %s"%str(stats))

        # now calculate the stats across all the bands
        stats = {
            'STATISTICS_MINIMUM': np.nanmin(data),
            'STATISTICS_MAXIMUM': np.nanmax(data),
            'STATISTICS_MEAN': np.nanmean(data),
            'STATISTICS_STDDEV': np.nanstd(data)}

        out.update_tags(**tags)
        if 'tags' in kwargs:
            out.update_tags(**kwargs['tags'])

        out.update_tags(**stats)
        #print("\n  overall stats= %s\n"%str(stats))

On Jan 23 2019 1:29 PM, Guillaume Eynard-Bontemps wrote:

Thanks @djhoese @ebo.

@ebo if you have some examples, that would be really cool!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Anyone working on a to_tiff? Alternatively, how do you write an xarray to a geotiff?  312203596
456499728 https://github.com/pydata/xarray/issues/2042#issuecomment-456499728 https://api.github.com/repos/pydata/xarray/issues/2042 MDEyOklzc3VlQ29tbWVudDQ1NjQ5OTcyOA== ebo 601025 2019-01-22T18:01:36Z 2019-01-22T18:01:36Z NONE

I work with geotiff all the time. A separate to_tiff is not needed.
The trick is that there are two separate sections/areas where the metadata is stored. You will need to know where/how to store that information.
I do not have access to any of that code at the moment. If you cannot find the examples I will try to hack an example or three once I get back to work.

On Jan 22 2019 7:05 AM, David Hoese wrote:

@guillaumeeb Not that I know of but I'm not completely in the loop with xarray. There is the geoxarray project that I started (https://github.com/geoxarray/geoxarray) but really haven't had any time to work on it. Otherwise you could look at the satpy library or its dependency library trollimage which uses rasterio but it assumes some things about how data is structured including an 'area' in .attrs from pyresample. Sorry I don't have a better idea.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Anyone working on a to_tiff? Alternatively, how do you write an xarray to a geotiff?  312203596
406698807 https://github.com/pydata/xarray/issues/2042#issuecomment-406698807 https://api.github.com/repos/pydata/xarray/issues/2042 MDEyOklzc3VlQ29tbWVudDQwNjY5ODgwNw== ebo 601025 2018-07-20T19:05:12Z 2018-07-20T19:05:12Z NONE

On Jul 20 2018 12:57 PM, David Hoese wrote:

I'd like to add to this discussion the issue I brought up here #2288. It is something that could/should probably result in a new xarray add-on package for doing these types of operations. For example, I work on the pyresample and satpy projects. Pyresample uses its own "AreaDefinition" objects to define the geolocation/projection information. SatPy uses these AreaDefinitions by setting DataArray.attrs['area'] and using them when necessary. This includes the ability to write geotiffs using rasterio and a custom array-like class for writing dask chunks to the geotiff between separate threads (does not work multiprocess, yet).

I would love to see these additions (or some recipes on how to do it as xarray stands). As a note, I figured out a rather simple way using with rasterio.open(...,'w',**profile) to effect the write. That might help in the short to medium term.

I am also interested in looking at your Pyresample as well as something similar to the morphological operators (in this context specifically measure).

Best of success!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Anyone working on a to_tiff? Alternatively, how do you write an xarray to a geotiff?  312203596
399514847 https://github.com/pydata/xarray/issues/1323#issuecomment-399514847 https://api.github.com/repos/pydata/xarray/issues/1323 MDEyOklzc3VlQ29tbWVudDM5OTUxNDg0Nw== ebo 601025 2018-06-22T17:12:59Z 2018-06-22T17:12:59Z NONE

On Jun 22 2018 10:10 AM, Scott wrote:

Hey Devs and Users... Just about to embark on a project where I want to populate an X-Array data set with a set of images.. Did this progress? Just trying to save an hour building my own hook to skimage

I have not built hooks into skimage. If you get that to work I would appreciate it if you could share it with me 8-) I'm not fully sure of how to make that work in production, but I would be glad to help test.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Image related methods 216621142
398214607 https://github.com/pydata/xarray/issues/2093#issuecomment-398214607 https://api.github.com/repos/pydata/xarray/issues/2093 MDEyOklzc3VlQ29tbWVudDM5ODIxNDYwNw== ebo 601025 2018-06-18T22:24:18Z 2018-06-18T22:24:18Z NONE

On Jun 18 2018 4:03 PM, Fabien Maussion wrote:

Has a default GeoTIFF chunk been implemented?

No, unfortunately.

ok. Maybe the overall chunking issue has been sorted. I will try to look into this and see what is working now related to this issue.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Default chunking in GeoTIFF images 318950038
398183774 https://github.com/pydata/xarray/issues/2093#issuecomment-398183774 https://api.github.com/repos/pydata/xarray/issues/2093 MDEyOklzc3VlQ29tbWVudDM5ODE4Mzc3NA== ebo 601025 2018-06-18T20:24:53Z 2018-06-18T20:24:53Z NONE

One of the issues related to this has been closed. Has a default GeoTIFF chunk been implemented?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Default chunking in GeoTIFF images 318950038
387058496 https://github.com/pydata/xarray/issues/2093#issuecomment-387058496 https://api.github.com/repos/pydata/xarray/issues/2093 MDEyOklzc3VlQ29tbWVudDM4NzA1ODQ5Ng== ebo 601025 2018-05-07T13:07:16Z 2018-05-07T13:07:16Z NONE

that would definitely work for me.

On May 7 2018 6:43 AM, Zac Hatfield-Dodds wrote:

With the benefit of almost a year's worth of procrastination, I think the best approach is to take the heuristics from #1440, but only support chunks=True - if a decent default heuristic isn't good enough, the user can specify exact chunks.

The underlying logic for this issue would be identical to that of #1440, so supporting both is "just" a matter of plumbing it in correctly.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Default chunking in GeoTIFF images 318950038
385532574 https://github.com/pydata/xarray/issues/2042#issuecomment-385532574 https://api.github.com/repos/pydata/xarray/issues/2042 MDEyOklzc3VlQ29tbWVudDM4NTUzMjU3NA== ebo 601025 2018-04-30T21:20:49Z 2018-04-30T21:20:49Z NONE

When I poked at this I could not figure out how to keep the internal cached states separate. That may have been because the processing loop was opening many different images, and not just one. I'm glad you found a way.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Anyone working on a to_tiff? Alternatively, how do you write an xarray to a geotiff?  312203596
385498531 https://github.com/pydata/xarray/issues/2042#issuecomment-385498531 https://api.github.com/repos/pydata/xarray/issues/2042 MDEyOklzc3VlQ29tbWVudDM4NTQ5ODUzMQ== ebo 601025 2018-04-30T19:09:41Z 2018-04-30T19:09:41Z NONE

@mrocklin gdal can read/write windows:

```
# Read raster as arrays
banddataraster = raster.GetRasterBand(1)
# numpy.float was removed in NumPy 1.24; the builtin float is equivalent
dataraster = banddataraster.ReadAsArray(xoff, yoff, xcount, ycount).astype(float)
```
from: https://pcjericks.github.io/py-gdalogr-cookbook/raster_layers.html

Also see BandReadAsArray and BandWriteAsArray in http://gdal.org/python/osgeo.gdal_array-module.html (which appear to be a read/write gdal.Band.ReadAsArray method and gdal.Band.WriteArray method respectively).

But there are some gotchas there in that GDAL, as far as I recall, is not thread safe. I wonder how you got that to work other than setting up a slave read process that handles all reads.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Anyone working on a to_tiff? Alternatively, how do you write an xarray to a geotiff?  312203596
385496306 https://github.com/pydata/xarray/issues/2042#issuecomment-385496306 https://api.github.com/repos/pydata/xarray/issues/2042 MDEyOklzc3VlQ29tbWVudDM4NTQ5NjMwNg== ebo 601025 2018-04-30T19:01:29Z 2018-04-30T19:01:29Z NONE

@mrocklin it was the windowed-rw example that prompted a number of my early questions about dask.array and xarray equivalents. Maybe something along the lines of the following would also be helpful:

https://gis.stackexchange.com/questions/158527/is-it-possible-to-read-raster-files-by-block-with-rasterio/158528#158528

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Anyone working on a to_tiff? Alternatively, how do you write an xarray to a geotiff?  312203596
385492564 https://github.com/pydata/xarray/issues/2042#issuecomment-385492564 https://api.github.com/repos/pydata/xarray/issues/2042 MDEyOklzc3VlQ29tbWVudDM4NTQ5MjU2NA== ebo 601025 2018-04-30T18:48:16Z 2018-04-30T18:48:16Z NONE

As far as I have seen, open_rasterio takes care of most things out of the box. Besides how to deal with chunks, there is also the question of how to deal with several types of metadata:

  • the regular metadata, which rasterio exposes through either the meta or profile variables.

  • a user-defined metadata dictionary, which rasterio exposes through 'tags()'

  • a per-band metadata dictionary, which rasterio exposes through 'tags(band)'

Whether xarray/open_rasterio uses the same interface or not, there will be a need to deal with file metadata and per-band metadata.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Anyone working on a to_tiff? Alternatively, how do you write an xarray to a geotiff?  312203596
385463527 https://github.com/pydata/xarray/issues/2093#issuecomment-385463527 https://api.github.com/repos/pydata/xarray/issues/2093 MDEyOklzc3VlQ29tbWVudDM4NTQ2MzUyNw== ebo 601025 2018-04-30T17:07:43Z 2018-04-30T17:07:43Z NONE

Most standard internal chunking (what I believe the GIS community calls 'tiling') uses 256x256 blocks (see: http://www.gdal.org/frmt_gtiff.html TILED=YES BLOCKXSIZE=n and BLOCKYSIZE=n). This is used when viewing images within a given region of interest or window. You can really tell the difference in speed between tiled and stripped images (the latter have a block size of 1x<width>).

@mrocklin, I agree that we might want to aggregate some number of them, but we would need to get some automation up front and sort out how we want to determine the expansion. Adding to the #1440 discussion mentioned, there will likely be an advantage in increasing the block sizes in given directions.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Default chunking in GeoTIFF images 318950038
383359925 https://github.com/pydata/xarray/issues/2065#issuecomment-383359925 https://api.github.com/repos/pydata/xarray/issues/2065 MDEyOklzc3VlQ29tbWVudDM4MzM1OTkyNQ== ebo 601025 2018-04-22T06:54:10Z 2018-04-22T06:54:10Z NONE

On Apr 21 2018 10:17 PM, Keisuke Fujii wrote:

How about reset_index?

array.stack(z=('x', 'y')).reset_index('z')

Before I left work for the weekend I had tried array.stack(z=('x', 'y')), but I had not come across reset_index yet. I will give that a try ASAP.

EBo --
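
For anyone following along, here is a small self-contained sketch of the stack plus reset_index suggestion, assuming xarray and numpy are installed. The array contents and coordinate values are invented for the example.

```python
import numpy as np
import xarray as xr

# A made-up 2-band, 3x4 image with x/y coordinates.
array = xr.DataArray(
    np.arange(24.0).reshape(2, 3, 4),
    dims=("band", "y", "x"),
    coords={
        "band": [1, 2],
        "y": [30.0, 20.0, 10.0],
        "x": [0.0, 1.0, 2.0, 3.0],
    },
)

# Collapse x and y into a single dimension z (a MultiIndex of x/y pairs).
stacked = array.stack(z=("x", "y"))

# Turn the MultiIndex levels back into plain coordinates along z.
flat = stacked.reset_index("z")
```

After reset_index, x and y remain available as ordinary coordinates along z, so each flattened pixel can still be traced back to its location.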

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  how do you flatten an xarray? 315149637
382082786 https://github.com/pydata/xarray/issues/2065#issuecomment-382082786 https://api.github.com/repos/pydata/xarray/issues/2065 MDEyOklzc3VlQ29tbWVudDM4MjA4Mjc4Ng== ebo 601025 2018-04-17T17:49:54Z 2018-04-17T17:49:54Z NONE

Thank you rabernat . I just tried:

```
array.stack(z=('x', 'y'))
X

<xarray.DataArray (band: 3, z: 1647861)>
dask.array<shape=(3, 1647861), dtype=float64, chunksize=(1, 1647861)>
Coordinates:
  * band     (band) int64 1 2 3
  * z        (z) MultiIndex
  - x        (z) float64 1.939e+05 1.939e+05 1.939e+05 1.939e+05 1.939e+05 ...
  - y        (z) float64 4.986e+06 4.985e+06 4.985e+06 4.984e+06 4.984e+06 ...

nX = client.compute(X)
```
and got the following error:

/home/jldavid3/anaconda3/envs/pangeo/lib/python3.6/site-packages/distributed/worker.py:741: UserWarning: Large object of size 6.61 MB detected in task graph:
  ([[["('reshape-33c73e5277bff381fea27bc752d60c16', ... e, None), None)
Consider scattering large objects ahead of time
with client.scatter to reduce scheduler burden and
keep data on workers

    future = client.submit(func, big_data)    # bad

    big_future = client.scatter(big_data)     # good
    future = client.submit(func, big_future)  # good
  % (format_bytes(len(b)), s))
distributed.worker - WARNING - Compute Failed
Function:  _dask_finalize
args:      ([[[array([[ 1.11333953, 0.15302669, 2.30724196, ..., -0.49583333, -0.31415252, 0.17898109]])], [array([[ 0.2049355 , 1.32097473, -1.11873895, ..., -0.10651731, 0.69806911, 1.34692913]])], [array([[ 0.59425151, -0.52178773, 0.80188672, ..., -0.83324054, -0.54774213, -0.15842612]])]]], <function Dataset._dask_postcompute at 0x7fbaded21158>, ([(False, 'band', <xarray.IndexVariable 'band' (band: 3)> array([1, 2, 3])), (False, 'z', <xarray.IndexVariable 'z' (z: 1647861)> array([(193899.75, 4985847.0), (193899.75, 4985391.0), (193899.75, 4984935.0), ..., (805851.75, 4427703.0), (805851.75, 4427247.0), (805851.75, 4426791.0)], dtype=object)), (True, <this-array>, (<function Variable._dask_finalize at 0x7fbadee159d8>, (<function finalize at 0x7fbaf986e8c8>, (), ('band', 'z'), OrderedDict(), None)))], {'z', 'band'}, {'band': 3, 'z': 1647861}, None, None, None), None)
kwargs:    {}
Exception: KeyError(<this-array>,)

distributed.scheduler - ERROR - error from worker inproc://169.154.136.32/2193/2: <this-array>

I will read up more on stack later to see what else I can learn, but do you have any suggestions? I figure I am probably missing an argument or got something out of order.

Thanks again.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  how do you flatten an xarray? 315149637
379850842 https://github.com/pydata/xarray/issues/2042#issuecomment-379850842 https://api.github.com/repos/pydata/xarray/issues/2042 MDEyOklzc3VlQ29tbWVudDM3OTg1MDg0Mg== ebo 601025 2018-04-09T18:35:19Z 2018-04-09T18:35:19Z NONE

On Apr 9 2018 11:43 AM, Ryan Abernathey wrote:

I really need to know what xarray can and is planning to do with tiff's so that I can not only use them but also document stuff for a dozen or more of my coworkers

@ebo, we are very glad to hear your input about how you might use xarray together with geotiff data. The majority of xarray developers are coming from a netCDF background, so this is somewhat new territory for us. It sounds like you have a real need for the computational tools that xarray provides. Engaging the geotiff community could potentially be very advantageous for xarray, since it could bring lots of new users. On the other hand, there are already lots of powerful tools in the geotiff space, and we have limited resources (i.e. time), so we need to be a bit conservative.

It would probably be useful to clarify how the decision making process works on things like this for open source projects. There is no xarray master plan that can provide a simple answer to your question of "what xarray is planning to do with tiffs". The main questions that have to be answered when deciding whether to add a big new feature are

  • Does this feature make sense within the "scope" of the project? (Can be difficult to answer--much discussion is usually required.)

  • Do the xarray developers have the time and expertise to implement and support such a feature?

The first item, regarding "scope," is being addressed now via this discussion. What are the pros and cons of attempting to add the new feature? Different people will have different opinions. Let's hear them out. A key question, as identified by @fmaussion, is whether the geotiff data model is compatible enough with the xarray data model to provide a full-featured writeable backend. In other words, can I write any arbitrary xarray dataset to geotiff and then read it back, with no loss of information? If the answer is "no," then it will be hard to convince the xarray community that geotiff is a suitable candidate for a backend.

If you feel strongly that we need the ability to not only directly read (as we can already with open_rasterio) but also directly write geotiff, you should lay out your arguments persuasively, taking into account not only the immediate impacts on your personal project but the impact on xarray as a whole. There may be good ways to achieve what you want without making any changes to xarray, i.e. by creating a small standalone package to transform geotiff to / from xarray (as in @Schlump's example); that option needs to be considered seriously.

The second item (time) is a rather strong constraint: xarray is a volunteer effort. There are currently 369 open issues in xarray. Which ones should be the top priority? Will attempting to add a new feature lead to much more work down the line, in the form of unforeseen bugs?

Ultimately, what happens in xarray is determined by the needs of the xarray developers themselves, who use xarray heavily in their daily science work. This may sound exclusive, but it is the opposite, because anyone can become an xarray developer. The reason we can read geotiffs today is because, one year ago, @fmaussion rolled up his sleeves and wrote the rasterio backend (#1260).

That little number 1260 is a link to a merged pull request (aka "PR"). A PR is much more powerful than a feature request; it is an actual implementation of the feature someone wishes to see in xarray. Anyone is free to make a PR to xarray, although before doing so, it is good to discuss the possible new feature via the issue tracker, as described in the xarray contributing guide. As a full time programmer in a lab dealing with geospatial data, you yourself are already a prime candidate to implement your desired feature! 😉

As an example of how a new backend was incorporated into xarray, you can refer to #1905, in which @barronh implemented a backend for "pseudo-netCDF" a file format used by his research group. Skimming through that discussion will give you a good idea of some of the questions that arise in implementing new backend functionality.

Apologies for the long digression into open-source politics. I thought it would be useful to clarify these things.

No need to apologize about a long digression into open-source politics. I fully understand, and I am smack in the middle of that with at least 4 different projects. I also know about issue/commit numbers on github/bitbucket/redmine/etc. NASA has formal rules about what can be released and when. My last open-source project took 9 months to get the software release authorized, but that was for an entire project of new code. For basic image I/O support I would not expect any problems, but I have to get permission before releasing anything beyond snippets and examples that do not include primary workflows. I will release as much as I can back into the public domain, but this starts to get complicated as the scope grows.

I do not remember seeing anyone use the acronym PR for "pull request" before, so sorry for that confusion. I just could not guess it in the context.

The argument for providing basic functionality for GeoTIFFs is that it is a common format used alongside NetCDF and HDF. I can, if you need me to, try to track down a stack of sites which provide images as GeoTIFFs, such as NASA's Giovanni, Digital Globe, and Planet Labs, just to name a few off the top of my head. How many folks here work in and around GIS folks?

I will have to post back later (probably to several separate issues) to address several of the pointers raised above, but I have to get on to fleshing some of this out.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Anyone working on a to_tiff? Alternatively, how do you write an xarray to a geotiff?  312203596
379823407 https://github.com/pydata/xarray/issues/2042#issuecomment-379823407 https://api.github.com/repos/pydata/xarray/issues/2042 MDEyOklzc3VlQ29tbWVudDM3OTgyMzQwNw== ebo 601025 2018-04-09T17:04:40Z 2018-04-09T17:04:40Z NONE

On Apr 9 2018 12:49 AM, Fabien Maussion wrote:

I do not care about the to_rasterio but I do care about a ''to_tiff''

Yes sorry, I meant to_tiff

ah... got it.

If xarray has no way to output tiffs then I cannot use xarray.

I'm not saying it shouldn't exist, I'm just asking whether it should be in the xarray codebase or elsewhere.

fair enough. I just need them to play well enough together that I can read, process, and write a chunk/window at a time (whether that is with a simple xr.compute() or something else).

If you'd like to parse new attributes when opening the geotiff file this could be added easily. PRs are welcome!

what is a PR? Did you mean functionality request?

I'm still not clear where dask.array, xarray, rasterio, and pangeo begin and end. I think I have posted an issue about extending the metadata/tags some place, but I am sure it is not as clear as it should be, and for the life of me I am not sure where I posted that.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Anyone working on a to_tiff? Alternatively, how do you write an xarray to a geotiff?  312203596
379815753 https://github.com/pydata/xarray/issues/2042#issuecomment-379815753 https://api.github.com/repos/pydata/xarray/issues/2042 MDEyOklzc3VlQ29tbWVudDM3OTgxNTc1Mw== ebo 601025 2018-04-09T16:39:28Z 2018-04-09T16:39:28Z NONE

On Apr 9 2018 9:22 AM, Ryan Abernathey wrote:

I'm already starting to think that these kind of domain specific tools should exist in dedicated projects, not in the main xarray codebase.

👍

I am perfectly fine with that stance, but I also think it is reasonable to ask/expect that if you provide a reader for some format that you also provide writers for it -- or at least document that you will not and why. Almost all of my current work is in geotiff format.
I have no choice, and many other people working in the geospatial domain will be hamstrung without it. Sitting down the pipeline from me are 1/2 million archived images (there are actually closer to 2 million images, but only 1/2 to 1 million are associated with our current projects, and they comprise several petabytes of data).

I really need to know what xarray can and is planning to do with tiff's so that I can not only use them but also document stuff for a dozen or more of my coworkers (heck the next time we run the Python Bootcamp I would probably offer to teach this). If you plan not to support it then fine. I will not spend any more time with xarrays and focus on dask.arrays or anything else that will work.

My question to you now is if supporting basic tiff I/O is in scope. If so I can deal with all the rest of the rasterio/geospatial stuff outside of xarray.

I will start fleshing out the stuff that Matthew Rocklin and Schlump have provided.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Anyone working on a to_tiff? Alternatively, how do you write an xarray to a geotiff?  312203596
379593639 https://github.com/pydata/xarray/issues/2042#issuecomment-379593639 https://api.github.com/repos/pydata/xarray/issues/2042 MDEyOklzc3VlQ29tbWVudDM3OTU5MzYzOQ== ebo 601025 2018-04-09T00:03:49Z 2018-04-09T00:03:49Z NONE

On Apr 8 2018 11:54 AM, Schlump wrote:

https://github.com/robintw/XArrayAndRasterio/blob/master/rasterio_to_xarray.py

Ahhh... Now I understand Fabien Maussion's comment about to_rasterio.
I read the posts out of order. I will see if this does the job for me (I will likely have to extend it a little, but I think this is a great start).

EBo --

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Anyone working on a to_tiff? Alternatively, how do you write an xarray to a geotiff?  312203596
379593118 https://github.com/pydata/xarray/issues/2042#issuecomment-379593118 https://api.github.com/repos/pydata/xarray/issues/2042 MDEyOklzc3VlQ29tbWVudDM3OTU5MzExOA== ebo 601025 2018-04-08T23:57:20Z 2018-04-08T23:57:20Z NONE

On Apr 8 2018 12:45 PM, Fabien Maussion wrote:

if the profile and tags were propagated through open_rasterio, then the second open would not be necessary and would be generally useful.

We have been adding new attributes like this recently (https://github.com/pydata/xarray/pull/1583 and https://github.com/pydata/xarray/pull/1740), so I don't see much trouble in adding a few more. Note that the rasterio object is available via the (undocumented) _file_obj attribute. So a quick workaround for you in the mean time would be to access the info you need directly via this object.

As for the to_rasterio method, I'm currently against it. I'm already starting to think that these kind of domain specific tools should exist in dedicated projects, not in the main xarray codebase. For rasterio in particular, it turns out that the geotiff/GDAL data model is fairly different from the xarray/NetCDF model. The rasterio folks have also shown only limited interest in our endeavor (https://github.com/mapbox/rasterio/issues/920), which is understandable. I don't have a strong opinion though, and I am curious if the @pydata/xarray crew sees it differently.

I do not care about the to_rasterio but I do care about a ''to_tiff'' (even if I have to do all the geospatial stuff outside of xarray as long as I can output the image data portion of the tiff via xarray). I also do not overly care if the xarray interface is significantly different from the rasterio/GDAL API (however someone will have to document the differences so that it does not continually trip people like me up -- I should be able to help a little with this once it gets working). Basically however it is handled I have to be able to read a GeoTIFF, process, and write back out to a GeoTIFF. If xarray has no way to output tiffs then I cannot use xarray.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Anyone working on a to_tiff? Alternatively, how do you write an xarray to a geotiff?  312203596
373821913 https://github.com/pydata/xarray/issues/1323#issuecomment-373821913 https://api.github.com/repos/pydata/xarray/issues/1323 MDEyOklzc3VlQ29tbWVudDM3MzgyMTkxMw== ebo 601025 2018-03-16T19:33:01Z 2018-03-16T19:33:01Z NONE

thank you. I had stumbled onto the "tags" yesterday and had not had time to post back here that I had found it, and sorted through all the tags(), tags(1), and tags(ns="something"). The thing that was quite confusing was you access them in GDAL via GetMetadata, and not GetTags. So, I think I am sorted now. I would agree that there needs to be more info there, but it is possible that it is already available, but not in a single place.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Image related methods 216621142
372980129 https://github.com/pydata/xarray/issues/1970#issuecomment-372980129 https://api.github.com/repos/pydata/xarray/issues/1970 MDEyOklzc3VlQ29tbWVudDM3Mjk4MDEyOQ== ebo 601025 2018-03-14T10:51:36Z 2018-03-14T10:51:36Z NONE

Not sure what would be involved, but I am consistently having to roll my own (typically with GDAL post processing) to save to GeoTIFFs. One thing missing from the reads so far is that the attributes only include the standard metadata and not the user-defined metadata. In particular, if I use DG WV02 imagery, I have not yet figured out how to access the sun-satellite geometry. Even having a first pass for xarray.to_geotiff would be helpful.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  API Design for Xarray Backends 302806158
372482940 https://github.com/pydata/xarray/issues/1323#issuecomment-372482940 https://api.github.com/repos/pydata/xarray/issues/1323 MDEyOklzc3VlQ29tbWVudDM3MjQ4Mjk0MA== ebo 601025 2018-03-12T22:24:27Z 2018-03-12T22:24:27Z NONE

When I open up a tiff file, it only shows a few attributes. I have some images which have extensive provenance metadata. How do you access them?

eg: NITF_CSEXRA_SENSOR=PAN NITF_PIAIMC_SENSNAME=QB02 ...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Image related methods 216621142


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);