pull_requests
30 rows where user = 1197350 (rabernat)
id ▼ | node_id | number | state | locked | title | user | body | created_at | updated_at | closed_at | merged_at | merge_commit_sha | assignee | milestone | draft | head | base | author_association | auto_merge | repo | url | merged_by |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
39752514 | MDExOlB1bGxSZXF1ZXN0Mzk3NTI1MTQ= | 468 | closed | 0 | Option for closing files with scipy backend | rabernat 1197350 | This addresses issue #463, in which open_mfdataset failed when trying to open a list of files longer than my system's ulimit. I tried to find a solution in which the underlying netcdf file objects are kept closed by default and only reopened "when needed". I ended up subclassing scipy.io.netcdf_file and overwriting the variable attribute with a property which first checks whether the file is open or closed and opens it if needed. That was the easy part. The hard part was figuring out when to close them. The problem is that a couple of different parts of the code (e.g. each individual variable and also the datastore object itself) keep references to the netcdf_file object. In the end I used the debugger to find out when during initialization the variables were actually being read and added some calls to close() in various different places. It is relatively easy to close the files up at the end of the initialization, but it was much harder to make sure that the whole array of files is never open at the same time. I also had to disable mmap when this option is active. This solution is messy and, moreover, extremely slow. There is a factor of ~100 performance penalty during initialization for reopening and closing the files all the time (but only a factor of 10 for the actual calculation). I am sure this could be reduced if someone who understands the code better found some judicious points at which to call close() on the netcdf_file. The loss of mmap also sucks. This option can be accessed with the close_files key word, which I added to api. Timing for loading and doing a calculation with close_files=True: ``` python count_open_files() %time mfds = xray.open_mfdataset(ddir + '/dt_global_allsat_msla_uv_2014101*.nc', engine='scipy', close_files=True) count_open_files() %time print float(mfds.variables['u'].mean()) count_open_files() ``` output: ``` 3 open files CPU times: user 11.1 s, sys: 17.5 s, total: 28.5 s Wall time: 27.7 s 2 open files 0.0055650632367 CPU times: user 649 ms, sys: 974 ms, total: 1.62 s Wa… | 2015-07-11T21:24:24Z | 2015-08-10T12:50:45Z | 2015-08-09T00:04:12Z | fe363c15d6c4f23d664d8729a54c9c2ce5a4e918 | 0 | 200aeb006781528cf6d4ca2f118d7f9257bd191b | 200aeb006781528cf6d4ca2f118d7f9257bd191b | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/468 | |||||
41962443 | MDExOlB1bGxSZXF1ZXN0NDE5NjI0NDM= | 522 | closed | 0 | Fix datetime decoding when time units are 'days since 0000-01-01 00:00:00' | rabernat 1197350 | This fixes #521 using the workaround described in Unidata/netcdf4-python#442. | 2015-08-08T23:26:07Z | 2015-08-09T00:10:18Z | 2015-08-09T00:06:49Z | cd4a3c221516dbafbff9ccf7586913a2e1aeaefd | 0 | 54f63df0b25a2f0df10885e390d7b8f05320f33e | 200aeb006781528cf6d4ca2f118d7f9257bd191b | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/522 | |||||
41962908 | MDExOlB1bGxSZXF1ZXN0NDE5NjI5MDg= | 523 | closed | 0 | Fix datetime decoding when time units are 'days since 0000-01-01 00:00:00' | rabernat 1197350 | This fixes #521 using the workaround described in Unidata/netcdf4-python#442. | 2015-08-09T00:12:00Z | 2015-08-14T17:22:02Z | 2015-08-14T17:22:02Z | 368c623812a58d1e09ec121d531e3e076391fcbd | 0 | 653b8641787aa008c5c901f2967392b7894b207d | 200aeb006781528cf6d4ca2f118d7f9257bd191b | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/523 | |||||
42016208 | MDExOlB1bGxSZXF1ZXN0NDIwMTYyMDg= | 524 | closed | 0 | Option for closing files with scipy backend | rabernat 1197350 | This is the same as #468, which was accidentally closed. I just copied and pasted my comment below This addresses issue #463, in which open_mfdataset failed when trying to open a list of files longer than my system's ulimit. I tried to find a solution in which the underlying netcdf file objects are kept closed by default and only reopened "when needed". I ended up subclassing scipy.io.netcdf_file and overwriting the variable attribute with a property which first checks whether the file is open or closed and opens it if needed. That was the easy part. The hard part was figuring out when to close them. The problem is that a couple of different parts of the code (e.g. each individual variable and also the datastore object itself) keep references to the netcdf_file object. In the end I used the debugger to find out when during initialization the variables were actually being read and added some calls to close() in various different places. It is relatively easy to close the files up at the end of the initialization, but it was much harder to make sure that the whole array of files is never open at the same time. I also had to disable mmap when this option is active. This solution is messy and, moreover, extremely slow. There is a factor of ~100 performance penalty during initialization for reopening and closing the files all the time (but only a factor of 10 for the actual calculation). I am sure this could be reduced if someone who understands the code better found some judicious points at which to call close() on the netcdf_file. The loss of mmap also sucks. This option can be accessed with the close_files key word, which I added to api. Timing for loading and doing a calculation with close_files=True: ``` python count_open_files() %time mfds = xray.open_mfdataset(ddir + '/dt_global_allsat_msla_uv_2014101*.nc', engine='scipy', close_files=True) count_open_files() %time print float(mfds.variables['u'].mean()) count_open_files() ``` output: ``` 3 open files CPU times: user 11.1 s, sys: 17.5 s, total: 28.5 s … | 2015-08-10T12:49:23Z | 2016-06-24T17:45:07Z | 2016-06-24T17:45:07Z | 0 | 0145d62c12808c636d1a578fc82b9098f8b78d29 | 200aeb006781528cf6d4ca2f118d7f9257bd191b | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/524 | ||||||
42731548 | MDExOlB1bGxSZXF1ZXN0NDI3MzE1NDg= | 538 | closed | 0 | Fix contour color | rabernat 1197350 | This fixes #537 by adding a check for the presence of the colors kwarg. | 2015-08-18T18:24:36Z | 2015-09-01T17:48:12Z | 2015-09-01T17:20:56Z | 2015-09-01T17:20:55Z | ddf9177fad8cf94bc2dfe908892fa59589479056 | 0 | ee4fab78e2e54042ea1adb1c7bc7c849f9ec0c76 | d2e9bccf65241e0d962902605e91db96bc2c768d | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/538 | ||||
65407870 | MDExOlB1bGxSZXF1ZXN0NjU0MDc4NzA= | 818 | closed | 0 | Multidimensional groupby | rabernat 1197350 | Many datasets have a two dimensional coordinate variable (e.g. longitude) which is different from the logical grid coordinates (e.g. nx, ny). (See #605.) For plotting purposes, this is solved by #608. However, we still might want to split / apply / combine over such coordinates. That has not been possible, because groupby only supports creating groups on one-dimensional arrays. This PR overcomes that issue by using `stack` to collapse multiple dimensions in the group variable. A minimal example of the new functionality is ``` python >>> da = xr.DataArray([[0,1],[2,3]], coords={'lon': (['ny','nx'], [[30,40],[40,50]] ), 'lat': (['ny','nx'], [[10,10],[20,20]] )}, dims=['ny','nx']) >>> da.groupby('lon').sum() <xarray.DataArray (lon: 3)> array([0, 3, 3]) Coordinates: * lon (lon) int64 30 40 50 ``` This feature could have broad applicability for many realistic datasets (particularly model output on irregular grids): for example, averaging non-rectangular grids zonally (i.e. in latitude), binning in temperature, etc. If you think this is worth pursuing, I would love some feedback. The PR is not complete. Some items to address are - [x] Create a specialized grouper to allow coarser bins. By default, if no `grouper` is specified, the `GroupBy` object uses all unique values to define the groups. With a high resolution dataset, this could balloon to a huge number of groups. With the latitude example, we would like to be able to specify e.g. 1-degree bins. Usage would be `da.groupby('lon', bins=range(-90,90))`. - [ ] Allow specification of which dims to stack. For example, stack in space but keep time dimension intact. (Currently it just stacks all the dimensions of the group variable.) - [x] A nice example for the docs. | 2016-04-06T04:14:37Z | 2016-07-31T23:02:59Z | 2016-07-08T01:50:38Z | 2016-07-08T01:50:38Z | a0a3860a87815f1f580aa56b972c7e8d9359b6ce | 0 | dc50064728cceade436c65b958f1b06a60e2eec7 | 0d0ae9d3e766c3af9dd98383ab3b33dfea9494dc | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/818 | ||||
75682773 | MDExOlB1bGxSZXF1ZXN0NzU2ODI3NzM= | 892 | closed | 0 | fix printing of unicode attributes | rabernat 1197350 | fixes #834 I would welcome a suggestion of how to test this in a way that works with both python 2 and 3. This is somewhat outside my expertise. | 2016-06-29T16:47:27Z | 2016-07-24T02:57:13Z | 2016-07-24T02:57:13Z | 9ea1fbacfa935f884598775e40c2287e00c92ef7 | 0 | be0acd453f6d9b8082e7b2ebe0a957ac7dbc5e3e | 0d0ae9d3e766c3af9dd98383ab3b33dfea9494dc | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/892 | |||||
87647409 | MDExOlB1bGxSZXF1ZXN0ODc2NDc0MDk= | 1027 | closed | 0 | Groupby bins empty groups | rabernat 1197350 | This PR fixes a bug in `groupby_bins` in which empty bins were dropped from the grouped results. Now `groupby_bins` restores any empty bins automatically. To recover the old behavior, one could apply `dropna` after a groupby operation. Fixes #1019 | 2016-10-02T21:31:32Z | 2016-10-03T15:28:01Z | 2016-10-03T15:22:15Z | 2016-10-03T15:22:15Z | 0e044ce807fa0ee15703c8b4088bf41ae8e99116 | 0 | 06517c3eb6cf4c4967c05e009803ad63a7103392 | 525e086097171aa8a904d54ea2ee8dc76f2a69ef | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/1027 | ||||
93171281 | MDExOlB1bGxSZXF1ZXN0OTMxNzEyODE= | 1104 | closed | 0 | add optimization tips | rabernat 1197350 | This adds some dask optimization tips from the mailing list (closes #1103). | 2016-11-10T15:26:25Z | 2016-11-10T16:49:13Z | 2016-11-10T16:49:06Z | 2016-11-10T16:49:06Z | 2cfd3882374831e5edfe5d040f7775c5bb5ecc7a | 0 | e489be806cfb7a8d2fc03493c410a1db2b80fe24 | 92095f759a4b61691bf494d46d8d3008d812c6f8 | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/1104 | ||||
113554698 | MDExOlB1bGxSZXF1ZXN0MTEzNTU0Njk4 | 1345 | closed | 0 | new dask prefix | rabernat 1197350 | - [x] closes #1343 - [ ] tests added / passed - [ ] passes ``git diff upstream/master | flake8 --diff`` - [ ] whatsnew entry | 2017-03-31T00:56:24Z | 2017-05-21T09:45:39Z | 2017-05-16T19:11:13Z | f16e61b31e8d30048d220c8247dc188c079080ce | 0 | 9529f54a01f4c06e642c54cb2088bcc9867ffe38 | f2a50158beba112e521548bea63fdf758b327235 | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/1345 | |||||
118408475 | MDExOlB1bGxSZXF1ZXN0MTE4NDA4NDc1 | 1390 | closed | 0 | Fix groupby bins tests | rabernat 1197350 | - [x] closes #1386 - [x] tests added / passed - [x] passes ``git diff upstream/master | flake8 --diff`` - [x] whatsnew entry | 2017-05-01T17:46:41Z | 2017-05-01T21:52:14Z | 2017-05-01T21:52:14Z | 2017-05-01T21:52:14Z | a9a12b0aca862d5ab19180594f616b8efab13308 | 0 | a91626819ea0cf333b5aad768863a487fd3a3de7 | 8f6a68e3f821689203bce2bce52b412e9fe70b5c | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/1390 | ||||
120903269 | MDExOlB1bGxSZXF1ZXN0MTIwOTAzMjY5 | 1411 | closed | 0 | fixed dask prefix naming | rabernat 1197350 | - [x] Closes #1343 - [x] Tests added / passed - [x] Passes ``git diff upstream/master | flake8 --diff`` - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API I am starting a new PR for this since the original one (#1345) was not branched of my own fork. As the discussion there stood, @shoyer suggested that `dataset.chunk` should also be updated to match the latest conventions in dask naming. The relevant code is here ```python def maybe_chunk(name, var, chunks): chunks = selkeys(chunks, var.dims) if not chunks: chunks = None if var.ndim > 0: token2 = tokenize(name, token if token else var._data) name2 = '%s%s-%s' % (name_prefix, name, token2) return var.chunk(chunks, name=name2, lock=lock) else: return var variables = OrderedDict([(k, maybe_chunk(k, v, chunks)) for k, v in self.variables.items()]) ``` Currently, `chunk` has an optional keyword argument `name_prefix='xarray-'`. Do we want to keep this optional? IMO, the current naming logic in `chunk` is not a problem for dask and will not cause problems for the distributed bokeh dashboard (as `open_dataset` did). | 2017-05-16T19:10:30Z | 2017-05-22T20:39:01Z | 2017-05-22T20:38:56Z | 2017-05-22T20:38:56Z | d80248476ebe4a3845211c5d58e0af1effc73ea3 | 0 | 630726ad12b9e83094ddd14bd02e6d4d2a18d706 | 028454d9d8c6d7d2f8afd7d0133941f961dbe231 | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/1411 | ||||
121142890 | MDExOlB1bGxSZXF1ZXN0MTIxMTQyODkw | 1413 | closed | 0 | concat prealigned objects | rabernat 1197350 | - [x] Closes #1385 - [ ] Tests added / passed - [ ] Passes ``git diff upstream/master | flake8 --diff`` - [ ] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API This is an initial PR to bypass index alignment and coordinate checking when concatenating datasets. | 2017-05-17T20:16:00Z | 2017-07-17T21:53:53Z | 2017-07-17T21:53:40Z | cfaf0dd40692dcfddf9acb1dc9af1a292f965ece | 0 | a0314bfac8d3308c3c14c674386b9da4cb7b3c8d | d5c7e0612e8243c0a716460da0b74315f719f2df | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/1413 | |||||
137819104 | MDExOlB1bGxSZXF1ZXN0MTM3ODE5MTA0 | 1528 | closed | 0 | WIP: Zarr backend | rabernat 1197350 | - [x] Closes #1223 - [x] Tests added / passed - [x] Passes ``git diff upstream/master | flake8 --diff`` - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API I think that a zarr backend could be the ideal storage format for xarray datasets, overcoming many of the frustrations associated with netcdf and enabling optimal performance on cloud platforms. This is a very basic start to implementing a zarr backend (as proposed in #1223); however, I am taking a somewhat different approach. I store the whole dataset in a single zarr group. I encode the extra metadata needed by xarray (so far just dimension information) as attributes within the zarr group and child arrays. I hide these special attributes from the user by wrapping the attribute dictionaries in a "`HiddenKeyDict`", so that they can't be viewed or modified. I have no tests yet (:flushed:), but the following code works. ```python from xarray.backends.zarr import ZarrStore import xarray as xr import numpy as np ds = xr.Dataset( {'foo': (('y', 'x'), np.ones((100, 200)), {'myattr1': 1, 'myattr2': 2}), 'bar': (('x',), np.zeros(200))}, {'y': (('y',), np.arange(100)), 'x': (('x',), np.arange(200))}, {'some_attr': 'copana'} ).chunk({'y': 50, 'x': 40}) zs = ZarrStore(store='zarr_test') ds.dump_to_store(zs) ds2 = xr.Dataset.load_store(zs) assert ds2.equals(ds) ``` There is a very long way to go here, but I thought I would just get a PR started. Some questions that would help me move forward. 1. What is "encoding" at the variable level? (I have never understood this part of xarray.) How should encoding be handled with zarr? 1. Should we encode / decode CF for zarr stores? 1. Do we want to always automatically align dask chunks with the underlying zarr chunks? 1. What sort of public API should the zarr backend have? Should you be able to load zarr stores via `open_dataset`? Or do we need a new method? I think `.to_zarr()` would be quite useful. 1. zarr arrays are … | 2017-08-27T02:38:01Z | 2018-02-13T21:35:03Z | 2017-12-14T02:11:36Z | 2017-12-14T02:11:36Z | 8fe7eb0fbcb7aaa90d894bcf32dc1408735e5d9d | 0 | f5633cabd19189675b607379badc2c19b86c0b8e | 89a1a9883c0c8409dad8dbcccf1ab73a3ea2cafc | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/1528 | ||||
162224186 | MDExOlB1bGxSZXF1ZXN0MTYyMjI0MTg2 | 1817 | closed | 0 | fix rasterio chunking with s3 datasets | rabernat 1197350 | - [x] Closes #1816 (remove if there is no corresponding issue, which should only be the case for minor changes) - [x] Tests added (for all bug fixes or enhancements) - [x] Tests passed (for all non-documentation changes) - [x] Passes ``git diff upstream/master **/*py | flake8 --diff`` (remove if you did not edit any Python files) - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API (remove if this change should not be visible to users, e.g., if it is an internal clean-up, or if this is part of a larger project that will be documented later) This is a simple fix for token generation of non-filename targets for rasterio. The problem is that I have no idea how to test it without actually hitting s3 (which requires boto and aws credentials). | 2018-01-10T20:37:45Z | 2018-01-24T09:33:07Z | 2018-01-23T16:33:28Z | 2018-01-23T16:33:28Z | 3cd2337d8035a324cb38d6793eaf33818066f25c | 0 | 350b929dcb4e87dc365b14f215925b944e91922a | e31cf43e8d183c63474b2898a0776fda72abc82c | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/1817 | ||||
180650279 | MDExOlB1bGxSZXF1ZXN0MTgwNjUwMjc5 | 2047 | closed | 0 | Fix decode cf with dask | rabernat 1197350 | - [x] Closes #1372 - [x] Tests added - [x] Tests passed - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API This was a very simple fix for an issue that has vexed me for quite a while. Am I missing something obvious here? | 2018-04-10T15:56:20Z | 2018-04-12T23:38:02Z | 2018-04-12T23:38:02Z | 2018-04-12T23:38:02Z | a9d1f3a36229636f0d519eb36a8d4a7c91f6e1cd | 0 | c8843003b98a3a26636ec9f88393590c633eb382 | 6402391cf206fd04c12d44773fecd9b42ea0c246 | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/2047 | ||||
213736501 | MDExOlB1bGxSZXF1ZXN0MjEzNzM2NTAx | 2405 | closed | 0 | WIP: don't create indexes on multidimensional dimensions | rabernat 1197350 | - [x] Closes #2368, Closes #2233 - [ ] Tests added (for all bug fixes or enhancements) - [ ] Tests passed (for all non-documentation changes) - [ ] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API (remove if this change should not be visible to users, e.g., if it is an internal clean-up, or if this is part of a larger project that will be documented later) This is just a start to the solution proposed in #2368. A surprisingly small number of tests broke in my local environment. | 2018-09-06T20:13:11Z | 2023-07-19T18:33:17Z | 2023-07-19T18:33:17Z | 6129432d60690690916cbac40a9e91099ad1f114 | 0 | 40c8d36844fdee6f8c06ec5babfacb25f177e954 | d1e4164f3961d7bbb3eb79037e96cae14f7182f8 | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/2405 | |||||
217463158 | MDExOlB1bGxSZXF1ZXN0MjE3NDYzMTU4 | 2430 | closed | 0 | WIP: revise top-level package description | rabernat 1197350 | I have often complained that xarray's top-level package description assumes that the user knows all about pandas. I think this alienates many new users. This is a first draft at revising that top-level description. Feedback from the community very needed here. | 2018-09-22T15:35:47Z | 2019-01-07T01:04:19Z | 2019-01-06T00:31:57Z | 2019-01-06T00:31:57Z | a0bbea89d5ce1399a24ca6c27b446283588ca2b4 | 0 | 085a5ddce0022639db9a2d23b5a486bb2cff38b3 | bb87a9441d22b390e069d0fde58f297a054fd98a | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/2430 | ||||
232190289 | MDExOlB1bGxSZXF1ZXN0MjMyMTkwMjg5 | 2559 | closed | 0 | Zarr consolidated | rabernat 1197350 | This PR adds support for reading and writing of [consolidated metadata](https://zarr.readthedocs.io/en/latest/tutorial.html#consolidating-metadata) in zarr stores. - [x] Closes #2558 (remove if there is no corresponding issue, which should only be the case for minor changes) - [x] Tests added (for all bug fixes or enhancements) - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API (remove if this change should not be visible to users, e.g., if it is an internal clean-up, or if this is part of a larger project that will be documented later) | 2018-11-20T04:39:41Z | 2018-12-05T14:58:58Z | 2018-12-04T23:51:00Z | 2018-12-04T23:51:00Z | 3ae93ac31ce122fc10b089f3b92b8c20e8b218c9 | 0 | fe4af34732f104f1e5f2b18e25dec1c3b92d6809 | 483b8a0a89ea4be862488e51af8a1b3bc7f40356 | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/2559 | ||||
242668810 | MDExOlB1bGxSZXF1ZXN0MjQyNjY4ODEw | 2659 | closed | 0 | to_dict without data | rabernat 1197350 | This PR provides the ability to export Datasets and DataArrays to dictionary _without_ the actual data. This could be useful for generating indices of dataset contents to expose to search indices or other automated data discovery tools In the process of doing this, I refactored the core dictionary export function to live in the Variable class, since the same code was duplicated in several places. - [x] Closes #2656 - [x] Tests added - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API | 2019-01-07T14:09:25Z | 2019-02-12T21:21:13Z | 2019-01-21T23:25:56Z | 2019-01-21T23:25:56Z | a7d55b9bcd0cc19330b5784842d51af5309d07ee | 0 | 4cf7bc8efe9fe6aae4c2487685c883b70aefa9dd | ede3e0101bae2f45c3f4634a1e1ecb8e2ccd0258 | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/2659 | ||||
261202056 | MDExOlB1bGxSZXF1ZXN0MjYxMjAyMDU2 | 2813 | open | 0 | [WIP] added protect_dataset_variables_inplace to open_zarr | rabernat 1197350 | This adds the same call to `_protect_dataset_variables_inplace` to `open_zarr` which we find in `open_dataset`. It wraps the arrays with `indexing.MemoryCachedArray`. As far as I can tell, it *does not work*, in the sense that nothing is cached. - [ ] One possible way to close #2812 - [ ] Tests added - [ ] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API | 2019-03-14T14:50:15Z | 2024-03-25T14:05:24Z | a8195022a5cc6c4c573bffa7df1588e6aa0a12b2 | 0 | 5ab07f8fa8f2a1e656b276e64f698f91aa07330d | d1e4164f3961d7bbb3eb79037e96cae14f7182f8 | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/2813 | ||||||
261207163 | MDExOlB1bGxSZXF1ZXN0MjYxMjA3MTYz | 2814 | open | 0 | [WIP] Use zarr internal LRU caching | rabernat 1197350 | Alternative way to close #2812. This uses zarr's own caching. In contrast to #2813, this *does work*. - [ ] Closes #2812 - [ ] Tests added - [ ] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API | 2019-03-14T15:01:06Z | 2024-03-25T14:00:50Z | 5ac4430d03ce962eabf140771da75c0667e24e65 | 0 | e92f9c1b55fb685c2fc80f13dd16de852b0550b6 | d1e4164f3961d7bbb3eb79037e96cae14f7182f8 | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/2814 | ||||||
268927254 | MDExOlB1bGxSZXF1ZXN0MjY4OTI3MjU0 | 2881 | closed | 0 | decreased pytest verbosity | rabernat 1197350 | This removes the `--verbose` flag from py.test in .travis.yml. - [x] Closes #2880 | 2019-04-09T21:12:50Z | 2019-04-09T23:36:01Z | 2019-04-09T23:34:22Z | 2019-04-09T23:34:22Z | 2c10d1443bea09e5ef53e5a7e35195a195e193a7 | 0 | b085e311a822e5e77a9a2b6f3e132281bbd285ea | 3435b03de218f54a55eb72dff597bb47b0f407cb | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/2881 | ||||
297270660 | MDExOlB1bGxSZXF1ZXN0Mjk3MjcwNjYw | 3105 | closed | 0 | Switch doc examples to use nbsphinx | rabernat 1197350 | This is the beginning of the docs refactor we have in mind for the sprint tomorrow. We will merge things first to the scipy19-docs branch so we can make sure things build on RTD. http://xarray.pydata.org/en/scipy19-docs | 2019-07-13T02:28:34Z | 2019-07-13T04:53:09Z | 2019-07-13T04:52:52Z | 2019-07-13T04:52:52Z | 903495e5f3b7439e9ba9d63178129f43cab3082a | 0 | 163a0a694e187ec3b66c757572e4fc50be7aa8e3 | 6586c26af6e55279efe646188b39ee1caf86db23 | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/3105 | ||||
297282705 | MDExOlB1bGxSZXF1ZXN0Mjk3MjgyNzA1 | 3106 | closed | 0 | Replace sphinx_gallery with notebook | rabernat 1197350 | Today @jhamman and I discussed how to refactor our somewhat fragmented "examples". We decided to basically copy the approach of the [dask-examples](https://github.com/dask/dask-examples) repo, but have it live here in the main xarray repo. Basically this approach is: - all examples are notebooks - examples are rendered during doc build by nbsphinx - we will eventually have a binder that works with all of the same examples This PR removes the dependency on sphinx_gallery and replaces the existing gallery with a standalone notebook called `visualization_gallery.ipynb`. However, not all of the links that worked in the gallery work here, since we are now using nbsphinx to render the notebooks (see https://github.com/spatialaudio/nbsphinx/issues/308). Really important to get @dcherian's feedback on this, as he was the one who originally introduced the gallery. My view is that having everything as notebooks makes examples easier to maintain. But I'm curious to hear other views. | 2019-07-13T05:35:34Z | 2019-07-13T14:03:20Z | 2019-07-13T14:03:19Z | 2019-07-13T14:03:19Z | 92cd3c4ef9fc61c62fe8c8d257dd223c73a80ac2 | 0 | cd769c44954e2d03ea1aae607e56ad7e142542ea | 903495e5f3b7439e9ba9d63178129f43cab3082a | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/3106 | ||||
297354511 | MDExOlB1bGxSZXF1ZXN0Mjk3MzU0NTEx | 3121 | closed | 0 | Allow other tutorial filename extensions | rabernat 1197350 | <!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #3118 - [ ] Tests added - [ ] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API Together with https://github.com/pydata/xarray-data/pull/15, this allows us to generalize out tutorial datasets to non netCDF files. But it is backwards compatible--if there is no file suffix, it will append `.nc`. | 2019-07-13T23:27:44Z | 2019-07-14T01:07:55Z | 2019-07-14T01:07:51Z | 2019-07-14T01:07:51Z | 5df8a428d36a3909395777bc9bc36e2d56b7422c | 0 | 49c11ade9d36e9c6295993e52deb660d01cfb846 | 92cd3c4ef9fc61c62fe8c8d257dd223c73a80ac2 | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/3121 | ||||
297445477 | MDExOlB1bGxSZXF1ZXN0Mjk3NDQ1NDc3 | 3131 | open | 0 | WIP: tutorial on merging datasets | rabernat 1197350 | <!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #1391 - [ ] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API This is a start on a tutorial about merging / combining datasets. | 2019-07-15T01:28:25Z | 2022-06-09T14:50:17Z | 0d29796ce96aaa6e8f2fe7b81548b8fbe659c666 | TomNicholas 35968931 | 0 | 211a2b30ba063dcb3cf5eb0965376ac5a348e7ac | d1e4164f3961d7bbb3eb79037e96cae14f7182f8 | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/3131 | |||||
415292337 | MDExOlB1bGxSZXF1ZXN0NDE1MjkyMzM3 | 4047 | closed | 0 | Document Xarray zarr encoding conventions | rabernat 1197350 | When we implemented the Zarr backend, we made some _ad hoc_ choices about how to encode NetCDF data in Zarr. At this stage, it would be useful to explicitly document this encoding. I decided to put it on the "Xarray Internals" page, but I'm open to moving if folks feel it fits better elsewhere. cc @jeffdlb, @WardF, @DennisHeimbigner | 2020-05-08T15:29:14Z | 2020-05-22T21:59:09Z | 2020-05-20T17:04:02Z | 2020-05-20T17:04:02Z | 261df2e56b2d554927887b8943f84514fc60369b | 0 | eb700172e72c2177feb8c837c24bc62110451227 | 69548df9826cde9df6cbdae9c033c9fb1e62d493 | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/4047 | ||||
597608584 | MDExOlB1bGxSZXF1ZXN0NTk3NjA4NTg0 | 5065 | closed | 0 | Zarr chunking fixes | rabernat 1197350 | <!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #2300, closes #5056 - [x] Tests added - [x] Passes `pre-commit run --all-files` - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` This PR contains two small, related updates to how Zarr chunks are handled. 1. We now delete the `encoding` attribute at the Variable level whenever `chunk` is called. The persistence of `chunk` encoding has been the source of lots of confusion (see #2300, #4046, #4380, https://github.com/dcs4cop/xcube/issues/347) 2. Added a new option called `safe_chunks` in `to_zarr` which allows for bypassing the requirement of the many-to-one relationship between Zarr chunks and Dask chunks (see #5056). Both these touch the internal logic for how chunks are handled, so I thought it was easiest to tackle them with a single PR. | 2021-03-22T01:35:22Z | 2021-04-26T16:37:43Z | 2021-04-26T16:37:43Z | 2021-04-26T16:37:42Z | dd7f742fd79126d8665740c5a461c265fdfdc0da | 0 | 023920b8d11d1a77bedfa880fcaf68f95167729c | 69950a46f9402a7c5ae6d86d766d7933738dc62b | MEMBER | xarray 13221727 | https://github.com/pydata/xarray/pull/5065 | ||||
1592876533 | PR_kwDOAMm_X85e8V31 | 8428 | closed | 0 | Add mode='a-': Do not overwrite coordinates when appending to Zarr with `append_dim` | rabernat 1197350 | This implements the 1b option described in #8427. - [x] Closes #8427 - [x] Tests added - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` | 2023-11-08T15:41:58Z | 2023-12-01T04:21:57Z | 2023-12-01T03:58:54Z | 2023-12-01T03:58:54Z | c93b31a9175eed9e506eb1950bf843d7de715bb9 | 0 | 9dbb50acec94790dd24c1d84a5bb5ad4325b5ac9 | 1715ed3422c04853fda1827de7e3580c07de85cf | MEMBER | { "enabled_by": { "login": "andersy005", "id": 13301940, "node_id": "MDQ6VXNlcjEzMzAxOTQw", "avatar_url": "https://avatars.githubusercontent.com/u/13301940?v=4", "gravatar_id": "", "url": "https://api.github.com/users/andersy005", "html_url": "https://github.com/andersy005", "followers_url": "https://api.github.com/users/andersy005/followers", "following_url": "https://api.github.com/users/andersy005/following{/other_user}", "gists_url": "https://api.github.com/users/andersy005/gists{/gist_id}", "starred_url": "https://api.github.com/users/andersy005/starred{/owner}{/repo}", "subscriptions_url": "https://api.github.com/users/andersy005/subscriptions", "organizations_url": "https://api.github.com/users/andersy005/orgs", "repos_url": "https://api.github.com/users/andersy005/repos", "events_url": "https://api.github.com/users/andersy005/events{/privacy}", "received_events_url": "https://api.github.com/users/andersy005/received_events", "type": "User", "site_admin": false }, "merge_method": "squash", "commit_title": "Add mode='a-': Do not overwrite coordinates when appending to Zarr with `append_dim` (#8428)", "commit_message": "Co-authored-by: Deepak Cherian <deepak@cherian.net>\r\nCo-authored-by: Anderson Banihirwe <13301940+andersy005@users.noreply.github.com>\r\n" } | xarray 13221727 | https://github.com/pydata/xarray/pull/8428 |
CREATE TABLE [pull_requests] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [state] TEXT,
   [locked] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [body] TEXT,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [merged_at] TEXT,
   [merge_commit_sha] TEXT,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [draft] INTEGER,
   [head] TEXT,
   [base] TEXT,
   [author_association] TEXT,
   [auto_merge] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [url] TEXT,
   [merged_by] INTEGER REFERENCES [users]([id])
);
CREATE INDEX [idx_pull_requests_merged_by] ON [pull_requests] ([merged_by]);
CREATE INDEX [idx_pull_requests_repo] ON [pull_requests] ([repo]);
CREATE INDEX [idx_pull_requests_milestone] ON [pull_requests] ([milestone]);
CREATE INDEX [idx_pull_requests_assignee] ON [pull_requests] ([assignee]);
CREATE INDEX [idx_pull_requests_user] ON [pull_requests] ([user]);
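With the schema above, a query along the following lines reproduces the filtered view on this page. This is a minimal sketch, assuming the table lives in a SQLite database; the column subset selected here is illustrative.

```sql
-- Sketch: list this user's pull requests, matching the
-- "30 rows where user = 1197350" filter shown above.
SELECT [id], [number], [state], [title], [created_at], [merged_at], [url]
FROM [pull_requests]
WHERE [user] = 1197350
ORDER BY [id];
```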