
issues


5 rows where repo = 13221727, state = "closed" and user = 1328158 sorted by updated_at descending




951644054 · MDU6SXNzdWU5NTE2NDQwNTQ= · issue #5631: NameError: name '_DType_co' is not defined · opened by monocongo (1328158) · state: closed (completed) · 7 comments · created 2021-07-23T14:44:19Z · updated 2021-07-23T21:39:54Z · closed 2021-07-23T18:50:39Z · author association: NONE · repo: xarray (13221727)

What happened: I installed a package that has xarray as a dependency. I then ran the package's console script, which resulted in the NameError shown below.

What you expected to happen: Successful import of the xarray package.

Minimal Complete Verifiable Example:

```
$ conda create -n tstenv python=3.7
$ conda activate tstenv
(tstenv) $ pip install climate-indices
(tstenv) $ python
>>> import xarray
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/james/miniconda3/envs/tstenv/lib/python3.7/site-packages/xarray/__init__.py", line 3, in <module>
    from . import testing, tutorial, ufuncs
  File "/home/james/miniconda3/envs/tstenv/lib/python3.7/site-packages/xarray/testing.py", line 8, in <module>
    from xarray.core import duck_array_ops, formatting, utils
  File "/home/james/miniconda3/envs/tstenv/lib/python3.7/site-packages/xarray/core/duck_array_ops.py", line 16, in <module>
    from . import dask_array_compat, dask_array_ops, dtypes, npcompat, nputils
  File "/home/james/miniconda3/envs/tstenv/lib/python3.7/site-packages/xarray/core/npcompat.py", line 81, in <module>
    from numpy.typing import ArrayLike, DTypeLike
  File "/home/james/miniconda3/envs/tstenv/lib/python3.7/site-packages/numpy/typing/__init__.py", line 316, in <module>
    from ._dtype_like import (
  File "/home/james/miniconda3/envs/tstenv/lib/python3.7/site-packages/numpy/typing/_dtype_like.py", line 95, in <module>
    class _SupportsDType(Generic[_DType_co]):
NameError: name '_DType_co' is not defined
```

Anything else we need to know?: This does not happen if I just install xarray, so there seems to be a conflict with another dependency package at play here. Is there a way to find conflicts like this without resorting to manually trying all the various combinations? It seems that I've included a version of another package in my requirements that conflicts with xarray -- how can I work out which one it is? Or maybe a simpler solution is to not pin specific versions in the package's requirements.txt and instead let conda work out the correct/latest versions?
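[Editor's note, not from the original report: running `pip check` in the environment gives a quick report of broken requirements; a complementary approach, sketched below with illustrative code, is to ask each installed distribution which packages it pins on numpy or xarray.]

```python
# Illustrative sketch: find which installed packages declare numpy/xarray
# requirements, to narrow down a version conflict. Uses the importlib-metadata
# backport since this environment is Python 3.7 (on Python 3.8+ the stdlib
# importlib.metadata works the same way).
from importlib_metadata import distributions

for dist in distributions():
    pins = [req for req in (dist.requires or [])
            if req.startswith(("numpy", "xarray"))]
    if pins:
        print(dist.metadata["Name"], pins)
```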

Thanks in advance for any ideas on where to look to resolve this issue (and in general for all the work that goes into xarray's care and feeding).

Environment: Linux (Ubuntu 20.04), Anaconda, Python 3.7

```
$ conda list xarray
# packages in environment at /home/james/miniconda3/envs/tstenv:
#
# Name                    Version                   Build  Channel
xarray                    0.18.2                   pypi_0    pypi
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5631/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
530777706 · MDU6SXNzdWU1MzA3Nzc3MDY= · issue #3586: Update documentation to reflect removal of inplace option · opened by monocongo (1328158) · state: closed (completed) · 1 comment · created 2019-12-01T18:58:53Z · updated 2020-12-12T23:10:11Z · closed 2020-12-12T23:10:11Z · author association: NONE · repo: xarray (13221727)

MCVE Code Sample

```python
ds_gamma.reset_coords('month', drop=True, inplace=True)
```

```
~/miniconda3/envs/spi_multi/lib/python3.8/site-packages/xarray/core/utils.py in _check_inplace(inplace)
     38 def _check_inplace(inplace: Optional[bool]) -> None:
     39     if inplace is not None:
---> 40         raise TypeError(
     41             "The inplace argument has been removed from xarray. "
     42             "You can achieve an identical effect with python's standard assignment."

TypeError: The inplace argument has been removed from xarray. You can achieve an identical effect with python's standard assignment.
```
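[Editor's note: for reference, the replacement the error message itself suggests is plain reassignment; a minimal sketch reusing the dataset name from the snippet above:]

```python
# the inplace argument is gone; assign the returned dataset back instead
ds_gamma = ds_gamma.reset_coords('month', drop=True)
```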

Expected Output

Coordinates reset in-place, or documentation that does not list inplace as a valid argument.

Problem Description

The documentation for Dataset.reset_coords and DataArray.reset_coords lists an inplace argument that is no longer supported. My assumption is that this should be removed from the documentation, but I'm not sure how the documentation is generated -- it may come from the docstring and/or the function signature, and since the argument is still present in the function signature it shows up in the documentation as a result.

This appears to have been addressed before in issue #858 but maybe the inplace argument slipped back into the documentation somehow? I see that there is a function to check for an inplace argument and raise an error if present -- why is this being used rather than removing the inplace argument altogether? Is this mechanism in place to facilitate backward compatibility, etc.?

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.0 | packaged by conda-forge | (default, Nov 22 2019, 19:11:38) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 4.15.0-70-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.7.1
xarray: 0.14.1
pandas: 0.25.3
numpy: 1.17.3
scipy: 1.3.2
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.4.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.8.1
distributed: 2.8.1
matplotlib: 3.1.2
cartopy: None
seaborn: None
numbagg: None
setuptools: 42.0.1.post20191125
pip: 19.3.1
conda: None
pytest: None
IPython: 7.10.0
sphinx: None
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3586/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
372244156 · MDU6SXNzdWUzNzIyNDQxNTY= · issue #2499: Tremendous slowdown when using dask integration · opened by monocongo (1328158) · state: closed (completed) · 5 comments · created 2018-10-20T19:19:08Z · updated 2019-01-13T01:53:09Z · closed 2019-01-13T01:53:09Z · author association: NONE · repo: xarray (13221727)

Code Sample, a copy-pastable example if possible

```python
def spi_gamma(data_array,
              scale,
              start_year,
              calibration_year_initial,
              calibration_year_final,
              periodicity):

    original_shape = data_array.shape
    spi = indices.spi(data_array.values.squeeze(),
                      scale,
                      indices.Distribution.gamma,
                      start_year,
                      calibration_year_initial,
                      calibration_year_final,
                      periodicity)
    data_array.values = np.reshape(spi, newshape=original_shape)

    return data_array

# open the precipitation NetCDF as an xarray Dataset object
dataset = xr.open_dataset(netcdf_precip, chunks={'lat': 1})

# trim out all data variables from the dataset except the precipitation
for var in dataset.data_vars:
    if var not in arguments.var_name_precip:
        dataset = dataset.drop(var)

# get the precipitation variable as an xarray DataArray object
da_precip = dataset[var_name_precip]

# get the initial year of the data
data_start_year = int(str(da_precip['time'].values[0])[0:4])

# stack the lat and lon dimensions into a new dimension named point, so at each
# lat/lon we'll have a time series for the geospatial point
da_precip = da_precip.stack(point=('lat', 'lon'))

timestep_scale = 3

# group the data by lat/lon point and apply the SPI/Gamma function to each time series
da_spi = da_precip.groupby('point').apply(spi_gamma,
                                          scale=timestep_scale,
                                          start_year=data_start_year,
                                          calibration_year_initial=1951,
                                          calibration_year_final=2010,
                                          periodicity=compute.Periodicity.monthly)

# unstack the array back into original dimensions
da_spi = da_spi.unstack('point')

# copy the original dataset since we'll be able to
# reuse most of the coordinates, attributes, etc.
index_dataset = dataset.copy()

# remove all data variables from copied dataset
for var_name in index_dataset.data_vars:
    index_dataset = index_dataset.drop(var_name)

# TODO set global attributes accordingly for this new dataset

# create a new variable to contain the SPI for the scale, assign into the dataset
long_name = "Standardized Precipitation Index (Gamma distribution), " \
            "{scale}-{increment}".format(scale=timestep_scale, increment=scale_increment)
spi_var = xr.Variable(dims=da_spi.dims,
                      data=da_spi,
                      attrs={'long_name': long_name,
                             'valid_min': -3.09,
                             'valid_max': 3.09})
var_name = "spi_gamma_" + str(timestep_scale).zfill(2)
index_dataset[var_name] = spi_var

# write the dataset as NetCDF
index_dataset.to_netcdf(output_file_base + var_name + ".nc")
```

Problem description

When I use GroupBy for split-apply-combine it works well if I don't specify a chunks argument, i.e. without dask parallelization. However, when I do use a chunks argument it runs very slowly. I assume this is because I don't yet understand how to optimally set the chunking parameters, rather than something under the covers mangling the processing with dask arrays (i.e. I doubt this is a bug in xarray/dask). I have tried replacing all numpy arrays in my code with dask arrays, but this has been problematic since some of the numpy functions used have no dask equivalents. Before I go much further down that path I wanted to post here to see if there is something else I'm overlooking that would make that effort unnecessary. My apologies if this is better suited to StackOverflow rather than an issue; if so I can post there instead.
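[Editor's note, not from the original report: chunks={'lat': 1} in the snippet above creates one dask task per latitude row, and with a groupby over thousands of stacked points the scheduling overhead of all those tiny chunks can dwarf the computation itself. A more typical starting point is coarser spatial chunks that leave the time dimension unchunked, so each point's full time series sits in a single chunk; the sizes below are illustrative:]

```python
import xarray as xr

netcdf_precip = "precip.nc"  # placeholder path; the original uses a CLI argument

# coarse in space, unchunked in time, so each grid point's
# full time series stays within one chunk
dataset = xr.open_dataset(netcdf_precip, chunks={'lat': 30, 'lon': 30})
```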

My attempt at making this work is in a feature branch of my project's Git repository. I mention this because the above code is not a minimal working example, but it is included nevertheless to give a summary of what's happening at the top layer where I'm using xarray explicitly. If more code is required after a cursory look at this then I will provide it, but hopefully I'm making a rookie mistake that, once rectified, will fix this.

In case it matters, I have been launching my code from within PyCharm (both run and debug, with the same results), but my assumption has been that this is irrelevant and it should work the same at the command line.

Thanks in advance for any suggestions or insight.

Expected Output

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 58 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
xarray: 0.10.9
pandas: 0.23.4
numpy: 1.15.2
scipy: 1.1.0
netCDF4: 1.4.1
h5netcdf: None
h5py: 2.8.0
Nio: None
zarr: None
cftime: 1.0.0b1
PseudonetCDF: None
rasterio: None
iris: None
bottleneck: None
cyordereddict: None
dask: 0.19.3
distributed: 1.23.3
matplotlib: None
cartopy: None
seaborn: None
setuptools: 39.2.0
pip: 10.0.1
conda: 4.5.11
pytest: None
IPython: 7.0.1
sphinx: None
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2499/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
373646673 · MDU6SXNzdWUzNzM2NDY2NzM= · issue #2507: Error when applying a function with apply_ufunc() when using a function that returns multiple arrays · opened by monocongo (1328158) · state: closed (completed) · 5 comments · created 2018-10-24T19:42:49Z · updated 2018-10-29T05:07:06Z · closed 2018-10-29T05:07:06Z · author association: NONE · repo: xarray (13221727)

Code Sample, a copy-pastable example if possible

I think you can reproduce this error by using the code found here: https://github.com/monocongo/climate_indices/tree/issue_191_groupby

To exercise the code where the error occurs run the following command:

```
$ python scripts/process_grid_ufunc.py --index palmers --periodicity monthly \
    --netcdf_precip example_inputs/nclimgrid_lowres_prcp.nc --var_name_precip prcp \
    --netcdf_pet example_inputs/nclimgrid_lowres_pet.nc --var_name_pet pet \
    --netcdf_awc example_inputs/nclimgrid_lowres_soil.nc --var_name_awc awc \
    --output_file_base /data/test/nclimgrid_lowres_ufunc \
    --calibration_start_year 1951 --calibration_end_year 2010
```

Within scripts/process_grid_ufunc.py, in the compute_write_palmers() function:

```python
# stack the lat and lon dimensions into a new dimension named point, so at each lat/lon
# we'll have a time series for the geospatial point, and group by these points
da_precip_groupby = da_precip.stack(point=('lat', 'lon')).groupby('point')
da_pet_groupby = da_pet.stack(point=('lat', 'lon')).groupby('point')
da_awc_groupby = da_awc.stack(point=('lat', 'lon')).groupby('point')

# keyword arguments used for the function we'll apply to the data array
args_dict = {'data_start_year': data_start_year,
             'calibration_start_year': kwrgs['calibration_start_year'],
             'calibration_end_year': kwrgs['calibration_end_year']}

# apply the self-calibrated Palmers function to the data arrays
da_scpdsi, da_pdsi, da_phdi, da_pmdi, da_zindex = xr.apply_ufunc(indices.scpdsi,
                                                                 da_precip_groupby,
                                                                 da_pet_groupby,
                                                                 da_awc_groupby,
                                                                 output_core_dims=[[], [], [], [], []],
                                                                 kwargs=args_dict)

```

Problem description

I have a function that I apply to xarray.DataArray objects using xarray.apply_ufunc().

The function I'm using returns five arrays. The signature is this:

```python
def scpdsi(precip_inches: np.ndarray,
           pet_inches: np.ndarray,
           awc: np.ndarray,
           data_start_year,
           calibration_start_year,
           calibration_end_year):
    ...
    return scpdsi, pdsi, phdi, pmdi, zindex
```

I'm applying the function over three xarray.DataArray objects like so:

```python
# keyword arguments used for the function we'll apply to the data arrays
args_dict = {'data_start_year': data_start_year,
             'calibration_start_year': kwrgs['calibration_start_year'],
             'calibration_end_year': kwrgs['calibration_end_year']}

# apply the self-calibrated Palmers function to the data arrays
da_scpdsi, da_pdsi, da_phdi, da_pmdi, da_zindex = xr.apply_ufunc(scpdsi,
                                                                 da_precip_groupby,
                                                                 da_pet_groupby,
                                                                 da_awc_groupby,
                                                                 output_core_dims=[[], [], [], [], []],
                                                                 kwargs=args_dict)
```

When I run the code I get the following error:

```
ValueError: applied function does not have the number of outputs specified in the ufunc signature. Result is not a tuple of 5 elements: [array([nan, nan, nan, ..., nan, nan, nan], dtype=float32), array([nan, nan, nan, ..., nan, nan, nan], dtype=float32), array([nan, nan, nan, ..., nan, nan, nan], dtype=float32), array([nan, nan, nan, ..., nan, nan, nan], dtype=float32), array([nan, nan, nan, ..., nan, nan, nan], dtype=float32)]
```

When I step through xarray's computation.py code I see that the result data is returned as a list of arrays rather than as a tuple, and this is what raises the error (line 565 in computation.py). I've tried modifying the function to return the arrays as a tuple, and marking the function's return type as a tuple in the signature, but none of this has helped: the result data always comes through to xarray as a list of arrays rather than as a tuple.

I am using xarray version 0.10.9 with Python 3.6.
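[Editor's note: for contrast, here is a minimal multi-output apply_ufunc call that does hand xarray the expected tuple. All names are made up, and it operates on a plain DataArray rather than a groupby object, so it sidesteps rather than reproduces the behavior above.]

```python
import numpy as np
import xarray as xr

def min_max(values):
    # toy stand-in for scpdsi: consumes the core dim, returns a tuple of arrays
    return values.min(axis=-1), values.max(axis=-1)

da = xr.DataArray(np.random.rand(4, 3), dims=('point', 'time'))

# two entries in output_core_dims tell apply_ufunc to expect a 2-tuple back
da_min, da_max = xr.apply_ufunc(min_max, da,
                                input_core_dims=[['time']],
                                output_core_dims=[[], []])
```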

Expected Output

I was expecting to get the five arrays returned after applying the function. I may be doing something else wrong, as this is my first time using apply_ufunc() in this way; if so, please advise.

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 58 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
xarray: 0.10.9
pandas: 0.23.4
numpy: 1.15.2
scipy: 1.1.0
netCDF4: 1.4.1
h5netcdf: None
h5py: 2.8.0
Nio: None
zarr: None
cftime: 1.0.0b1
PseudonetCDF: None
rasterio: None
iris: None
bottleneck: None
cyordereddict: None
dask: 0.19.4
distributed: 1.23.3
matplotlib: None
cartopy: None
seaborn: None
setuptools: 39.2.0
pip: 10.0.1
conda: 4.5.11
pytest: None
IPython: 7.0.1
sphinx: None
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2507/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
158958801 · MDU6SXNzdWUxNTg5NTg4MDE= · issue #873: Broadcast error when dataset is recombined after a stack/groupby/apply/unstack sequence · opened by monocongo (1328158) · state: closed (completed) · 11 comments · created 2016-06-07T15:50:43Z · updated 2016-09-20T19:55:55Z · closed 2016-09-20T19:55:55Z · author association: NONE · repo: xarray (13221727)

I have code which performs the split-apply-combine pattern on a dataset, and it appears to work as expected until it reaches a point where the dataset is being recombined. At this point it seems that there's a dimensional mismatch between arrays which is causing numpy to raise a broadcasting error (below).

The code which can cause this error is in a gist here
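[Editor's note: in condensed form, the sequence looks roughly like the sketch below, reconstructed from the traceback that follows; double_data and the grid_cells name come from it, while the toy dataset is illustrative.]

```python
import numpy as np
import xarray as xr

# toy (time, lon, lat) dataset shaped like the NetCDF example further below
dataset = xr.Dataset(
    {'prcp': (('time', 'lon', 'lat'), np.arange(12).reshape(3, 2, 2))},
    coords={'lon': [0.0, 1.0], 'lat': [0.0, 1.0]},
)

def double_data(cell):
    # stand-in per-group function; the real one operates on each point's series
    return cell * 2

dataset = dataset.stack(grid_cells=('lon', 'lat'))
dataset = dataset.groupby('grid_cells').apply(double_data)  # raised the error below
dataset = dataset.unstack('grid_cells')
```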

When I run the code I see the following errors/traceback:

```
Traceback (most recent call last):
  File "H:\git\climate_indices\src\scripts\xarray_groupby_example.py", line 34, in <module>
    dataset = dataset.groupby('grid_cells').apply(double_data)
  File "C:\Anaconda\lib\site-packages\xarray\core\groupby.py", line 469, in apply
    combined = self._concat(applied)
  File "C:\Anaconda\lib\site-packages\xarray\core\groupby.py", line 476, in _concat
    combined = concat(applied, concat_dim, positions=positions)
  File "C:\Anaconda\lib\site-packages\xarray\core\combine.py", line 114, in concat
    return f(objs, dim, data_vars, coords, compat, positions)
  File "C:\Anaconda\lib\site-packages\xarray\core\combine.py", line 268, in _dataset_concat
    combined = Variable.concat(vars, dim, positions)
  File "C:\Anaconda\lib\site-packages\xarray\core\variable.py", line 919, in concat
    variables = list(variables)
  File "C:\Anaconda\lib\site-packages\xarray\core\combine.py", line 262, in ensure_common_dims
    var = var.expand_dims(common_dims, common_shape)
  File "C:\Anaconda\lib\site-packages\xarray\core\variable.py", line 717, in expand_dims
    expanded_data = ops.broadcast_to(self.data, tmp_shape)
  File "C:\Anaconda\lib\site-packages\xarray\core\ops.py", line 67, in f
    return getattr(module, name)(*args, **kwargs)
  File "C:\Anaconda\lib\site-packages\numpy\lib\stride_tricks.py", line 115, in broadcast_to
    return _broadcast_to(array, shape, subok=subok, readonly=True)
  File "C:\Anaconda\lib\site-packages\numpy\lib\stride_tricks.py", line 70, in _broadcast_to
    op_flags=[op_flag], itershape=shape, order='C').itviews[0]
ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (2,) and requested shape (1,)
```

I get the above error when I use NetCDF input files which contain three dimensions (time, lon, lat), a simple example of which is described below:

```
Dataset type: Hierarchical Data Format, version 5

netcdf file:/C:/home/tmp/toy.nc {
  dimensions:
    lat = 2;
    lon = 2;
    time = 3;
  variables:
    int prcp(time=3, lon=2, lat=2);
    double lat(lat=2);
    double lon(lon=2);
    long time(time=3);
      :calendar = "proleptic_gregorian";
      :units = "days since 2014-06-09 00:00:00";
}
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/873/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}


CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
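[Editor's note: for reference, the filter described at the top of this page corresponds to a query like the sketch below, written against the schema above; the local database filename is hypothetical.]

```python
import sqlite3

# hypothetical local copy of the database backing this page
conn = sqlite3.connect("github.db")
rows = conn.execute(
    """
    select id, number, title, state, comments, updated_at
    from issues
    where repo = ? and state = ? and user = ?
    order by updated_at desc
    """,
    (13221727, "closed", 1328158),
).fetchall()
for row in rows:
    print(row)
```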