issue_comments
11 rows where user = 127195910 sorted by updated_at descending
This data as json, CSV (advanced)
Suggested facets: issue_url, created_at (date), updated_at (date)
user 1
- Karimat22 · 11 ✖
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
1460873349 | https://github.com/pydata/xarray/issues/7456#issuecomment-1460873349 | https://api.github.com/repos/pydata/xarray/issues/7456 | IC_kwDOAMm_X85XEyiF | Karimat22 127195910 | 2023-03-08T21:04:05Z | 2023-06-01T15:42:44Z | NONE | The xr.Dataset.expand_dims() method can be used to add new dimensions to a dataset. The axis parameter is used to specify where to insert the new dimension in the dataset. However, it's worth noting that the axis parameter only works when expanding along a 1D coordinate, not when expanding along a multi-dimensional array. Here's an example to illustrate how to use the axis parameter to expand a dataset along a 1D coordinate: import xarray as xr create a sample datasetdata = xr.DataArray([[1, 2], [3, 4]], dims=('x', 'y')) ds = xr.Dataset({'foo': data}) add a new dimension along the 'x' coordinate using the 'axis' parameterds_expanded = ds.expand_dims({'z': [1]}, axis='x') In this example, we create a 2D array with dimensions x and y, and then add a new dimension along the x coordinate using the axis='x' parameter. However, if you try to use the axis parameter to expand a dataset along a multi-dimensional array, you may encounter an error. This is because expanding along a multi-dimensional array would result in a dataset with non-unique dimension names, which is not allowed in xarray. Here's an example to illustrate this issue: import xarray as xr create a sample dataset with a 2D arraydata = xr.DataArray([[1, 2], [3, 4]], dims=('x', 'y')) ds = xr.Dataset({'foo': data}) add a new dimension along the 'x' and 'y' coordinates using the 'axis' parameterds_expanded = ds.expand_dims({'z': [1]}, axis=('x', 'y')) In this example, we try to use the axis=('x', 'y') parameter to add a new dimension along both the x and y coordinates. However, this results in a ValueError because the resulting dataset would have non-unique dimension names. To add a new dimension along a multi-dimensional array, you can instead use the xr.concat() function to concatenate the dataset with a new data array along the desired dimension: import xarray as xr create a sample dataset with a 2D arraydata = xr.DataArray([[1, 2], [3, 4]], dims=('x', 'y')) ds = xr.Dataset({'foo': data}) add a new dimension along the 'x' and 'y' coordinates using xr.concatds_expanded = xr.concat([ds, xr.DataArray([1], dims=('z'))], dim='z') In this example, we use the xr.concat() function to concatenate the original dataset with a new data array that has a single value along the new dimension z. The dim='z' parameter is used to specify that the new dimension should be named z. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xr.DataSet.expand_dims axis option doesn't work 1548355645 | |
1460894580 | https://github.com/pydata/xarray/issues/7593#issuecomment-1460894580 | https://api.github.com/repos/pydata/xarray/issues/7593 | IC_kwDOAMm_X85XE3t0 | Karimat22 127195910 | 2023-03-08T21:23:08Z | 2023-05-06T03:24:36Z | NONE | If you are encountering an error message that says "Plotting with time-zone-aware pd.Timestamp axis not possible", it means that you are trying to plot a Pandas DataFrame or Series that has a time-zone-aware pd.Timestamp axis using a plotting library that does not support time zones. To fix this error, you can convert the time-zone-aware pd.Timestamp axis to a time-zone-naive datetime object. This can be done using the tz_localize() method to set the time zone, followed by the tz_convert() method to convert to a new time zone or remove the time zone information altogether. Here is an example: import pandas as pd import matplotlib.pyplot as plt Create a time-series DataFrame with a time-zone-aware pd.Timestamp axisdata = pd.DataFrame({'value': [1, 2, 3, 4]}, index=pd.date_range('2022-03-01 00:00:00', periods=4, freq='H', tz='US/Eastern')) Convert the time-zone-aware pd.Timestamp axis to a time-zone-naive datetime objectdata.index = data.index.tz_localize(None) Plot the DataFrame using Matplotlibdata.plot() plt.show() In this example, we create a time-series DataFrame with a time-zone-aware pd.Timestamp axis using the pd.date_range() function with the tz parameter set to 'US/Eastern'. We then use the tz_localize() method to set the time zone to None to convert the axis to a time-zone-naive datetime object. Finally, we plot the DataFrame using Matplotlib and the plot() method. Note that converting the time-zone-aware pd.Timestamp axis to a time-zone-naive datetime object means that the time zone information is lost, so make sure that this is acceptable for your use case before making this conversion. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Plotting with time-zone-aware pd.Timestamp axis not possible 1613054013 | |
1460859657 | https://github.com/pydata/xarray/issues/7584#issuecomment-1460859657 | https://api.github.com/repos/pydata/xarray/issues/7584 | IC_kwDOAMm_X85XEvMJ | Karimat22 127195910 | 2023-03-08T20:51:15Z | 2023-04-29T03:41:57Z | NONE | When using NumPy arrays, the np.multiply() function and the * operator behave the same way and perform element-wise multiplication on the arrays. Similarly, the np.add() function and the + operator perform element-wise addition. However, when using Dask arrays, there is a difference between using the * and + operators and using the dask.array.multiply() and dask.array.add() functions. This is because Dask arrays are lazy and do not compute the result of an operation until it is explicitly requested. When you use the * or + operators, Dask constructs a task graph that describes the computation, but does not actually execute it until you explicitly call a computation method like dask.compute() or dask.persist(). On the other hand, when you use the dask.array.multiply() or dask.array.add() functions, Dask immediately constructs a task graph and adds it to the computation graph, triggering the computation to begin. Here's an example to illustrate the difference: import dask.array as da x = da.ones((1000, 1000), chunks=(100, 100)) y = da.ones((1000, 1000), chunks=(100, 100)) using the * operatorz = x * y no computation is triggered yetusing dask.array.multiply()z = da.multiply(x, y) computation is immediately triggeredIn this example, the * operator creates a task graph for the multiplication but does not execute it, whereas the dask.array.multiply() function immediately adds the task graph to the computation graph and triggers the computation to begin. It's worth noting that using the * and + operators can be more convenient and can lead to cleaner code, especially for simple operations. However, if you need more control over when computations are executed or want to avoid unnecessary computations, you should use the dask.array.multiply() and dask.array.add() functions. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
`np.multiply` and `dask.array.multiply` trigger graph computation vs using `+` and `*` operators. 1609090149 | |
1460907454 | https://github.com/pydata/xarray/issues/7574#issuecomment-1460907454 | https://api.github.com/repos/pydata/xarray/issues/7574 | IC_kwDOAMm_X85XE62- | Karimat22 127195910 | 2023-03-08T21:34:49Z | 2023-03-15T16:54:13Z | NONE | @jonas-constellr It's possible that the failure you're experiencing is due to an issue with how the h5netcdf library is interacting with Dask. One potential solution to this issue is to try using the netCDF4 library instead of h5netcdf. netCDF4 is another popular library for reading and writing netCDF files, and it has built-in support for parallel I/O through Dask. To use netCDF4 with xarray, you can simply pass the 'netcdf4' engine to the xr.open_mfdataset function: python import xarray as xr Open multiple netCDF files with netCDF4 engine and parallel I/Ods = xr.open_mfdataset('path/to/files/*.nc', engine='netcdf4', parallel=True) If you need to use h5netcdf for some reason, another potential solution is to use the dask.array.from_delayed function to manually create a Dask array from the h5netcdf data. This can be done by first reading in the data using h5netcdf, and then using dask.delayed to parallelize the data loading across multiple chunks. Here's an example: python import h5netcdf import dask.array as da from dask import delayed Define function to read in a single chunk of data from the netCDF file@delayed def read_chunk(filename, varname, start, count): with h5netcdf.File(filename, 'r') as f: var = f[varname][start[0]:start[0]+count[0], start[1]:start[1]+count[1]] return var Define function to read in the entire dataset using dask.array.from_delayeddef read_data(files, varname): chunks = (1000, 1000) # Define chunk size data = [read_chunk(f, varname, start, chunks) for f in files] data = [da.from_delayed(d, shape=chunks, dtype='float64') for d in data] data = da.concatenate(data, axis=0) return data Open multiple netCDF files with h5netcdf engine and parallel I/Ofiles = ['path/to/files/file1.nc', 'path/to/files/file2.nc', ...] varname = 'my_variable' data = read_data(files, varname) This code reads in the data from each file in chunks, and returns a Dask array that is a concatenation of all the chunks. The read_chunk function uses h5netcdf.File to read in a single chunk of data from a file, and returns a delayed object that represents the loading of that chunk. The read_data function uses dask.delayed to parallelize the loading of the chunks across all the files, and then uses dask.array.from_delayed to create a Dask array from the delayed objects. Finally, the function returns the concatenated Dask array. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xr.open_mfdataset doesn't work with fsspec and dask 1605108888 | |
1460903268 | https://github.com/pydata/xarray/issues/7574#issuecomment-1460903268 | https://api.github.com/repos/pydata/xarray/issues/7574 | IC_kwDOAMm_X85XE51k | Karimat22 127195910 | 2023-03-08T21:30:53Z | 2023-03-15T16:53:58Z | NONE |
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xr.open_mfdataset doesn't work with fsspec and dask 1605108888 | |
1460890139 | https://github.com/pydata/xarray/issues/7596#issuecomment-1460890139 | https://api.github.com/repos/pydata/xarray/issues/7596 | IC_kwDOAMm_X85XE2ob | Karimat22 127195910 | 2023-03-08T21:18:48Z | 2023-03-15T04:59:54Z | NONE | Time offset arithmetic involves adding or subtracting a duration from a specific time to obtain a new time. This is commonly used when dealing with time zones or calculating time differences between two events. For example, suppose the current time is 3:30 PM in New York City, which is in the Eastern Time Zone (ET). We want to calculate what time it is in Los Angeles, which is in the Pacific Time Zone (PT), considering the 3-hour time difference between the two zones. To do this, we can use time offset arithmetic by subtracting 3 hours from the current time in ET: 3:30 PM ET - 3 hours = 12:30 PM PT Therefore, the current time in Los Angeles is 12:30 PM. Another example of time offset arithmetic is when calculating the duration between two events. Suppose an event starts at 9:00 AM and ends at 10:30 AM. We can calculate the duration of the event by subtracting the start time from the end time: 10:30 AM - 9:00 AM = 1 hour 30 minutes Therefore, the event lasted for 1 hour and 30 minutes. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
illustrate time offset arithmetic 1615596004 | |
1460850301 | https://github.com/pydata/xarray/issues/7588#issuecomment-1460850301 | https://api.github.com/repos/pydata/xarray/issues/7588 | IC_kwDOAMm_X85XEs59 | Karimat22 127195910 | 2023-03-08T20:42:10Z | 2023-03-14T19:43:51Z | NONE | When using xr.merge with compat='minimal', the resulting Dataset may have unexpected behavior, including causing len to return wrong and possibly negative values. This is because compat='minimal' mode allows for the merging of datasets with non-matching dimensions and coordinates. When this happens, the merged dataset may contain "empty" dimensions or coordinates that have lost their original values and attributes. To avoid this issue, you can either use compat='override' mode, which will overwrite conflicting dimensions and coordinates, or manually align and concatenate the datasets before merging them. Here's an example of how to manually align and concatenate two datasets before merging them: import xarray as xr create two sample datasets with different dimensionsds1 = xr.Dataset({'foo': (['x', 'y'], [[1, 2], [3, 4]])}, coords={'x': [0, 1], 'y': [0, 1]}) ds2 = xr.Dataset({'bar': (['x', 'z'], [[5, 6], [7, 8]])}, coords={'x': [0, 1], 'z': [0, 1]}) align and concatenate the datasets along the 'x' dimensionds1_aligned = ds1.sel(x=slice(None, 1)) ds2_aligned = ds2.sel(x=slice(1, None)) merged = xr.merge([ds1_aligned, ds2_aligned]) In this example, we first slice each dataset to remove any non-matching dimensions and coordinates, and then concatenate them along the 'x' dimension. Finally, we merge the aligned datasets to create a single merged dataset without any corrupted dimensions or coordinates. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xr.merge with compat="minimal" returns corrupted Dataset and causes __len__ to return wrong and possibly negative values. 1611701140 | |
1460844042 | https://github.com/pydata/xarray/issues/7597#issuecomment-1460844042 | https://api.github.com/repos/pydata/xarray/issues/7597 | IC_kwDOAMm_X85XErYK | Karimat22 127195910 | 2023-03-08T20:36:16Z | 2023-03-14T19:41:29Z | NONE | The interpolate_na function is typically used to fill missing values (NAs) in a data frame or array by interpolating between existing values. It has an optional argument called max_gap which specifies the maximum number of consecutive NAs that can be filled in a single interpolation step. However, the max_gap argument may not work as expected at the boundaries of an array, as there may not be enough data points available to fill the maximum gap. For example, if the max_gap is set to 3 and there are only two consecutive NAs at the boundary of an array, the function will not be able to fill those NAs. One way to handle this issue is to reduce the max_gap value near the boundaries of the array. For example, you could set the max_gap to 1 for the first and last few rows or columns of the array, depending on the structure of your data. Alternatively, you could use a different interpolation method (e.g., linear interpolation) that does not require a fixed max_gap value. It's also worth noting that the interpolate_na function may not always be the best approach for filling missing values, as it assumes that the data has a smooth, continuous structure. If your data has a more complex structure (e.g., sharp discontinuities), other methods such as regression or machine learning models may be more appropriate. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Interpolate_na: max_map argument not working at array boundaries 1615599224 | |
1460877702 | https://github.com/pydata/xarray/issues/7597#issuecomment-1460877702 | https://api.github.com/repos/pydata/xarray/issues/7597 | IC_kwDOAMm_X85XEzmG | Karimat22 127195910 | 2023-03-08T21:08:17Z | 2023-03-14T19:40:34Z | NONE | The interpolate_na method in xarray can be used to interpolate missing values in a dataset or data array. The max_gap argument is used to specify the maximum number of consecutive NaN values that can be interpolated. The max_map argument is used to specify the maximum number of interpolated values that can be used for each NaN value. It's worth noting that the max_map argument only limits the number of interpolated values that can be used for each NaN value, but it does not limit the total number of interpolated values that can be used in the dataset. This means that if there are multiple consecutive NaN values, the max_map argument may not work as expected at the boundaries of the array. Here's an example to illustrate this issue: import xarray as xr import numpy as np create a sample data array with a missing value at the beginning and enddata = np.array([np.nan, 1, 2, 3, 4, np.nan]) create a dataset with the sample data arrayds = xr.Dataset({'foo': (['x'], data)}, coords={'x': np.arange(6)}) interpolate missing values with a max_map of 2ds_interp = ds.interpolate_na(max_gap=1, max_map=2) In this example, we have a data array with missing values at the beginning and end, and we interpolate the missing values using a max_map of 2. However, the resulting dataset still has 4 interpolated values, which is more than the max_map of 2. This is because the max_map argument is only limiting the number of interpolated values that can be used for each NaN value, but it is not limiting the total number of interpolated values that can be used in the dataset. To limit the total number of interpolated values in the dataset, you can use the limit argument, which specifies the maximum number of interpolated values that can be used in the entire dataset. Here's an example: interpolate missing values with a max_map of 2 and a limit of 2ds_interp = ds.interpolate_na(max_gap=1, max_map=2, limit=2) In this example, we add a limit argument of 2, which limits the total number of interpolated values in the dataset to 2. This results in only 2 interpolated values in the resulting dataset, which is consistent with the limit argument. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Interpolate_na: max_map argument not working at array boundaries 1615599224 | |
1461632125 | https://github.com/pydata/xarray/issues/7597#issuecomment-1461632125 | https://api.github.com/repos/pydata/xarray/issues/7597 | IC_kwDOAMm_X85XHrx9 | Karimat22 127195910 | 2023-03-09T09:18:11Z | 2023-03-09T09:18:11Z | NONE | @Ockenfuss i said you should try this three point I listed below and see if that could resolve the problem you raised.
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Interpolate_na: max_map argument not working at array boundaries 1615599224 | |
1460461197 | https://github.com/pydata/xarray/issues/7593#issuecomment-1460461197 | https://api.github.com/repos/pydata/xarray/issues/7593 | IC_kwDOAMm_X85XDN6N | Karimat22 127195910 | 2023-03-08T16:30:06Z | 2023-03-08T21:24:09Z | NONE | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Plotting with time-zone-aware pd.Timestamp axis not possible 1613054013 |
Advanced export
JSON shape: default, array, newline-delimited, object
CREATE TABLE [issue_comments] ( [html_url] TEXT, [issue_url] TEXT, [id] INTEGER PRIMARY KEY, [node_id] TEXT, [user] INTEGER REFERENCES [users]([id]), [created_at] TEXT, [updated_at] TEXT, [author_association] TEXT, [body] TEXT, [reactions] TEXT, [performed_via_github_app] TEXT, [issue] INTEGER REFERENCES [issues]([id]) ); CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]); CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
issue 7