issues
18 rows where type = "issue" and user = 306380 sorted by updated_at descending
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2045596856 | I_kwDOAMm_X8557VS4 | 8555 | Docs look odd in dark mode | mrocklin 306380 | open | 0 | 1 | 2023-12-18T02:31:26Z | 2023-12-19T15:32:11Z | MEMBER | What happened?
What did you expect to happen? No response
Minimal Complete Verifiable Example: No response
MVCE confirmation
Relevant log output: No response
Anything else we need to know? No response
Environment |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8555/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
1821467933 | I_kwDOAMm_X85skWUd | 8021 | Specify chunks in bytes | mrocklin 306380 | open | 0 | 4 | 2023-07-26T02:29:43Z | 2023-10-06T10:09:33Z | MEMBER | Is your feature request related to a problem?
I'm playing around with xarray performance and would like a way to easily tweak chunk sizes. I'm able to do this by backing out what xarray chooses in an
Dask array does this in two ways. We can provide a value in chunks like the following:
We also refer to a value in Dask config:
```python
In [1]: import dask

In [2]: dask.config.get("array.chunk-size")
Out[2]: '128MiB'
```
This is not very important (I'm unblocked), but I thought I'd mention it in case someone is looking for some fun work 🙂
Describe the solution you'd like: No response
Describe alternatives you've considered: No response
Additional context: No response |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8021/reactions", "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
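A minimal sketch of the two dask-side mechanisms referenced in the issue above: byte-size strings passed directly to chunks, and the array.chunk-size config consulted by chunks="auto". This is plain dask.array, not the requested xarray feature:
```python
# Plain dask.array behaviour (not the xarray feature being requested).
import dask
import dask.array as da

# 1. a byte-size string accepted directly by `chunks`
x = da.ones((20_000, 20_000), chunks="128MiB")
print(x.chunksize)

# 2. the global config value consulted when chunks="auto"
with dask.config.set({"array.chunk-size": "64MiB"}):
    y = da.ones((20_000, 20_000), chunks="auto")
    print(y.chunksize)
```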
295270362 | MDU6SXNzdWUyOTUyNzAzNjI= | 1895 | Avoid Adapters in task graphs? | mrocklin 306380 | closed | 0 | 13 | 2018-02-07T19:52:02Z | 2022-05-11T20:26:42Z | 2022-05-11T20:26:42Z | MEMBER | Looking at an
This object has many dependents, and so will presumably have to float around the network to all of the workers
In principle this is fine, especially if this object is cheap to serialize, move, and deserialize. It does introduce a bit of friction, though. I'm curious how hard it would be to build task graphs that generated these objects on the fly, or else removed them altogether. It is slightly more convenient from a task-scheduling perspective for data-access tasks to not have any dependencies. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1895/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
908971901 | MDU6SXNzdWU5MDg5NzE5MDE= | 5426 | Implement dask.sizeof for xarray.core.indexing.ImplicitToExplicitIndexingAdapter | mrocklin 306380 | open | 0 | 17 | 2021-06-02T01:55:23Z | 2021-11-16T15:08:03Z | MEMBER | I'm looking at a pangeo gallery workflow that suffers from poor load balancing because objects of type `ImplicitToExplicitIndexingAdapter`
I'm seeing number-of-processing-tasks charts that look like the following, which is a common sign of the load balancer not making good decisions, most commonly caused by poor data size measurements. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5426/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
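dask.sizeof is a single-dispatch registry, so the fix the title asks for looks roughly like the sketch below; the 1 KiB estimate is a placeholder, not what xarray ultimately shipped:
```python
# Sketch of registering a custom dask.sizeof handler; the byte estimate
# is a placeholder heuristic, not xarray's eventual implementation.
from dask.sizeof import sizeof
from xarray.core.indexing import ImplicitToExplicitIndexingAdapter

@sizeof.register(ImplicitToExplicitIndexingAdapter)
def sizeof_adapter(obj):
    # report the thin wrapper as small so the load balancer doesn't
    # mistake it for bulky payload data
    return 1024
```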
207021356 | MDU6SXNzdWUyMDcwMjEzNTY= | 1262 | Logical DTypes | mrocklin 306380 | open | 0 | 11 | 2017-02-12T01:26:23Z | 2020-12-26T14:26:00Z | MEMBER | tl;dr: Can XArray enable user-defined logical dtypes on top of physical NumPy arrays?
The Need for New Datatypes
NumPy's dtypes (int, float, etc.) are appropriate for many, but not all, cases. There are a variety of situations where we want numpy-like array semantics (broadcasting, memory layout) but with different element properties. Use cases include the following:
Currently dtypes need to be added directly to the NumPy source code. This is a high barrier for many community members, requires general approval (there can be only one datetime implementation, which is both good and bad), and limits experimentation. There is value to supporting user-definable datatypes.
This is hard to do in NumPy
Ideally we would implement extensible user-defined dtypes within NumPy (and there may be long-standing plans to do just this). However, changing NumPy today is hard, both because it's hard to find developers who are comfortable operating at that level and because the backwards-compatibility pressure on NumPy is large. So as an alternative, we might consider lightly wrapping NumPy arrays in a new object that also includes extra dtype information. For example, we might wrap an int64 numpy array with some datetime/timezone metadata to achieve a logical datetime array using a physical int64 array. We continue using NumPy as is but use this higher layer when necessary for more complex dtypes. However, "lightly wrapping" NumPy arrays is hard to do while still maintaining a closed system where all operations remain consistent (raw NumPy arrays inevitably leak through). Additionally, asking communities to switch to new libraries is socially quite challenging.
XArray is well placed
Fortunately, XArray appears to have already solved some of these technical and social challenges. XArray lightly wraps NumPy arrays in a consistent manner. NumPy-like operations on XArrays remain XArrays. Interactions with other NumPy arrays are well defined. XArray has also attracted an active user/developer community and has attained general respect from the broader ecosystem. XArray seems to be hackable, benefits from a decently active community, and is not yet under as much backwards-compatibility pressure. So, question: Is it sensible to add logical dtype information to XArray? Can this be done with only moderate effort and maintenance cost to the XArray project? If the answer is "yes, probably", then what is the right way to go about this? |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1262/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
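To make the "lightly wrapping" idea concrete, a toy sketch of a logical datetime array backed by a physical int64 array; every name here is hypothetical, and this is not a proposed xarray API:
```python
# Toy "logical dtype" wrapper: all names are hypothetical.
import numpy as np

class LogicalDatetimeArray:
    """A physical int64 array interpreted as timezone-aware datetimes."""

    def __init__(self, data: np.ndarray, tz: str):
        assert data.dtype == np.int64
        self.data = data  # physical storage: nanoseconds since epoch
        self.tz = tz      # logical metadata carried alongside

    @property
    def shape(self):
        return self.data.shape

    def __getitem__(self, key):
        # indexing preserves the logical dtype information
        return LogicalDatetimeArray(self.data[key], self.tz)

arr = LogicalDatetimeArray(np.arange(10, dtype=np.int64), tz="UTC")
print(arr[2:5].tz, arr[2:5].shape)
```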
318950038 | MDU6SXNzdWUzMTg5NTAwMzg= | 2093 | Default chunking in GeoTIFF images | mrocklin 306380 | closed | 0 | 10 | 2018-04-30T16:21:30Z | 2020-06-18T06:27:07Z | 2020-06-18T06:27:07Z | MEMBER | Given a tiled GeoTIFF image, I'm looking for the best practice in reading it as a chunked dataset. I did this in this notebook by first opening the file with rasterio, looking at the block sizes, and then using those to inform the chunks argument. In dask.array, every time this has come up we've always shot it down: automatic chunking is error-prone and hard to do well. However, in these cases the object we're being given usually also conveys its chunking in a way that matches how dask.array thinks about it, so the extra cognitive load on the user has been somewhat low. Rasterio's model and API feel much more foreign to me, though, than a project like NetCDF or H5Py. I find myself wanting a
Thoughts on this? Is this in scope? If so, then what is the right API, and what is the right policy for how to make xarray/dask.array chunks larger than GeoTIFF chunks? |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/2093/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
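The manual workflow described above looks roughly like the following sketch, assuming a hypothetical example.tif; xarray's open_rasterio was the period-appropriate reader (it has since been deprecated in favor of rioxarray):
```python
# Read the GeoTIFF's internal tiling with rasterio, then choose dask
# chunks as a multiple of it. "example.tif" is a hypothetical file.
import rasterio
import xarray as xr

with rasterio.open("example.tif") as src:
    block_h, block_w = src.block_shapes[0]  # tile shape of the first band

# make each dask chunk cover a 4x4 grid of GeoTIFF tiles
da = xr.open_rasterio("example.tif",
                      chunks={"y": block_h * 4, "x": block_w * 4})
```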
400948664 | MDU6SXNzdWU0MDA5NDg2NjQ= | 2692 | Xarray tutorial at SciPy 2019? | mrocklin 306380 | closed | 0 | 10 | 2019-01-19T01:56:38Z | 2020-03-25T04:34:27Z | 2019-02-17T05:07:45Z | MEMBER | Is anyone interested in submitting a tutorial to SciPy 2019? I think that it would be useful to have an official Xarray tutorial out there somewhere on the internet. This could be good motivation to create one. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/2692/reactions", "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
287969295 | MDU6SXNzdWUyODc5NjkyOTU= | 1822 | Use apply_ufunc in xESMF regridding package | mrocklin 306380 | closed | 0 | 4 | 2018-01-12T00:17:04Z | 2020-01-15T00:01:49Z | 2020-01-15T00:01:49Z | MEMBER | I would like to call attention to https://github.com/JiaweiZhuang/xESMF/issues/3#issuecomment-354668897 . It seems like the xESMF package does regridding in a way that at least some XArray users find sensible. It should probably make use of apply_ufunc, but it does not currently, and it is not particularly parallelizable (or at least that is my understanding). It could be that some modest development by someone more familiar with XArray could have a large impact by properly using apply_ufunc within that codebase. I apologize for posting an issue about another package in this issue tracker. Feel free to close. cc @JiaweiZhuang |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1822/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
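For readers unfamiliar with the function being suggested, a minimal self-contained apply_ufunc sketch; the identity kernel fake_regrid stands in for an actual ESMF regridder, and none of this is xESMF code:
```python
# Minimal apply_ufunc sketch; `fake_regrid` is a stand-in kernel.
import numpy as np
import xarray as xr

def fake_regrid(block):
    # a real regridder would remap the lat/lon grid here
    return block * 1.0

da = xr.DataArray(np.random.rand(8, 10, 12),
                  dims=("time", "lat", "lon")).chunk({"time": 2})

out = xr.apply_ufunc(
    fake_regrid, da,
    input_core_dims=[["lat", "lon"]],   # dims the kernel consumes
    output_core_dims=[["lat", "lon"]],  # dims the kernel produces
    dask="parallelized",                # map over dask chunks
    output_dtypes=[da.dtype],
)
print(out.shape)
```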
221858543 | MDU6SXNzdWUyMjE4NTg1NDM= | 1375 | Sparse arrays | mrocklin 306380 | closed | 0 | 25 | 2017-04-14T18:00:14Z | 2019-08-30T02:36:12Z | 2019-08-13T03:31:14Z | MEMBER | I would like to have an XArray that has scipy.sparse arrays rather than numpy arrays. Is this in scope? What would need to happen within XArray to support this? |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1375/reactions", "total_count": 8, "+1": 8, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
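This is roughly what eventually landed, via pydata/sparse rather than scipy.sparse (around xarray v0.13, if memory serves):
```python
# Wrapping a pydata/sparse array in a DataArray instead of dense numpy.
import numpy as np
import sparse
import xarray as xr

coo = sparse.COO.from_numpy(np.eye(100))  # mostly-zero matrix
da = xr.DataArray(coo, dims=("x", "y"))
print(type(da.data), da.data.nnz)         # sparse COO, 100 stored values
```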
323785231 | MDU6SXNzdWUzMjM3ODUyMzE= | 2143 | Upstream changes in Dask | mrocklin 306380 | closed | 0 | 1 | 2018-05-16T21:01:21Z | 2019-08-15T15:16:54Z | 2019-08-15T15:16:54Z | MEMBER | Hi All, There are a couple changes coming in dask that might affect XArray code:
Both of the old systems will still work, at least for a version or two, but we plan to remove them in the future. I thought I'd bring these changes up here so that we can plan a clean deprecation within XArray. Neither change has been released yet, so both features are still up for discussion if this community has additional constraints. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/2143/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
355308699 | MDU6SXNzdWUzNTUzMDg2OTk= | 2390 | Why are there two compute calls for plot? | mrocklin 306380 | closed | 0 | 3 | 2018-08-29T19:53:45Z | 2019-08-04T23:00:59Z | 2019-08-04T23:00:59Z | MEMBER | Anecdotally I find that when I call |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/2390/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
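A common workaround for the double computation the title describes, sketched here rather than taken from the issue body, is to materialize the dask-backed array once before plotting:
```python
# Compute the dask-backed DataArray once up front so that plotting
# only touches in-memory numpy data.
import numpy as np
import xarray as xr

da = xr.DataArray(np.random.rand(200, 200), dims=("x", "y")).chunk(100)
computed = da.compute()  # single explicit compute
computed.plot()          # no further dask work triggered here
```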
456239422 | MDU6SXNzdWU0NTYyMzk0MjI= | 3022 | LazilyOuterIndexedArray doesn't support slicing with slice objects | mrocklin 306380 | open | 0 | 2 | 2019-06-14T13:05:56Z | 2019-06-14T21:48:12Z | MEMBER | Code Sample, a copy-pastable example if possible
```python-traceback
AttributeError                            Traceback (most recent call last)
<ipython-input-4-42bee9beb30a> in <module>
----> 1 x[:3]

~/workspace/xarray/xarray/core/indexing.py in __getitem__(self, indexer)
    518             array = LazilyVectorizedIndexedArray(self.array, self.key)
    519             return array[indexer]
--> 520         return type(self)(self.array, self._updated_key(indexer))
    521
    522     def __setitem__(self, key, value):

~/workspace/xarray/xarray/core/indexing.py in _updated_key(self, new_key)
    483
    484     def _updated_key(self, new_key):
--> 485         iter_new_key = iter(expanded_indexer(new_key.tuple, self.ndim))
    486         full_key = []
    487         for size, k in zip(self.array.shape, self.key.tuple):

AttributeError: 'slice' object has no attribute 'tuple'
```
Problem description
Dask array meta computations like to run
This is on master |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/3022/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray 13221727 | issue | ||||||||
282178751 | MDU6SXNzdWUyODIxNzg3NTE= | 1784 | Add compute=False keywords to `to_foo` functions | mrocklin 306380 | closed | 0 | 9 | 2017-12-14T17:25:19Z | 2018-05-16T15:05:03Z | 2018-05-16T15:05:03Z | MEMBER | When working with @jhamman profiling the
cc @jhamman @rabernat @jakirkham (who has been looking at similar questions within |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1784/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
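This keyword did land: to_netcdf (and later to_zarr) accepts compute=False and returns a dask delayed object that performs the write when computed. A minimal sketch:
```python
# Build the write graph now, execute it later.
import numpy as np
import xarray as xr

ds = xr.Dataset({"z": (("x", "y"), np.random.rand(10, 10))}).chunk(5)
delayed = ds.to_netcdf("out.nc", compute=False)  # nothing written yet
delayed.compute()                                # now perform the write
```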
295146502 | MDU6SXNzdWUyOTUxNDY1MDI= | 1894 | Zarr keys include variable name | mrocklin 306380 | closed | 0 | 1 | 2018-02-07T13:56:32Z | 2018-02-17T04:40:15Z | 2018-02-17T04:40:15Z | MEMBER | When using open_zarr on a dataset with many variables, the key names include the variable name, like
In the distributed scheduler these key names get shortened to prefixes like
We may want to avoid including the variable name in the key name here, in order to avoid breaking these out into several groups. Instead, you might consider putting the variable name within the key as another member of the tuple, like the following:
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1894/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
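Purely illustrative key tuples for the two layouts being contrasted; the names and hashes below are invented:
```python
# The scheduler groups tasks by key prefix, so embedding the variable
# name in the prefix yields one task group per variable:
key_per_variable = ("open_zarr-temperature-3fa8c1b2", 0, 0)

# the suggestion: keep one shared prefix and carry the variable name
# as a separate tuple member instead
key_shared_prefix = ("open_zarr-3fa8c1b2", "temperature", 0, 0)
```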
286448591 | MDU6SXNzdWUyODY0NDg1OTE= | 1810 | data_array.<tab> reads data | mrocklin 306380 | closed | 0 | 4 | 2018-01-06T01:34:55Z | 2018-01-06T14:26:36Z | 2018-01-06T14:26:36Z | MEMBER | Code Sample, a copy-pastable example if possible
Problem description
This starts reading data. I don't know why. I'm using XArray against a FUSE system that is both expensive (it's targeting Google Cloud Storage) and has logging. I can see that auto-completion immediately starts a lot of file reading on the file system. Output of
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1810/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
218314868 | MDU6SXNzdWUyMTgzMTQ4Njg= | 1343 | Some XArray key names don't group nicely | mrocklin 306380 | closed | 0 | 2 | 2017-03-30T20:15:44Z | 2017-05-22T20:38:56Z | 2017-05-22T20:38:56Z | MEMBER | Some XArray loading functions provide keys that don't adhere to dask conventions used for naming. We can solve this in XArray by using names like |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1343/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
218315793 | MDU6SXNzdWUyMTgzMTU3OTM= | 1344 | Dask Persist | mrocklin 306380 | closed | 0 | 5 | 2017-03-30T20:19:17Z | 2017-04-04T16:14:17Z | 2017-04-04T16:14:17Z | MEMBER | It would be convenient to load constituent dask.arrays into memory as dask.arrays rather than as numpy arrays. This would help with distributed computations where we want to load a large amount of data into distributed memory once and then iterate on the full xarray dataset repeatedly without reloading from disk every time. We can probably solve this from either side:
```python
import dask

dset.x, dset.y, dset.z = dask.persist(dset.x, dset.y, dset.z)
```
cc @shoyer @jcrist @rabernat @pwolfram |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1344/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue | ||||||
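xarray did grow this capability as Dataset.persist() and DataArray.persist(); a minimal sketch:
```python
# persist keeps the dataset dask-backed while triggering computation
# into (distributed) memory.
import numpy as np
import xarray as xr

ds = xr.Dataset({"z": (("x", "y"), np.random.rand(100, 100))}).chunk(50)
ds = ds.persist()        # chunks now live in memory
print(type(ds.z.data))   # still a dask array
```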
187594293 | MDU6SXNzdWUxODc1OTQyOTM= | 1085 | Always use absolute paths | mrocklin 306380 | closed | 0 | 3 | 2016-11-06T22:25:08Z | 2016-12-01T16:47:40Z | 2016-12-01T16:47:40Z | MEMBER | This would avoid a mismatch between clients and workers when using dask.distributed
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1085/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue |
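The normalization the title suggests amounts to something like the following; normalize_path is a hypothetical helper, not xarray's actual code:
```python
# Resolve user-supplied paths before they get embedded in a task graph
# that is shared between client and workers. Hypothetical helper.
import os

def normalize_path(path: str) -> str:
    return os.path.abspath(os.path.expanduser(path))

print(normalize_path("~/data/temperature.nc"))
```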
CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo] ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone] ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee] ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user] ON [issues] ([user]);