html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/5878#issuecomment-968176008,https://api.github.com/repos/pydata/xarray/issues/5878,968176008,IC_kwDOAMm_X845tTGI,7237617,2021-11-13T23:43:17Z,2021-11-13T23:44:27Z,NONE,"Update: my local notebook accessing the public bucket **does** see the appended zarr store exactly as expected, while the 2i2c-hosted notebook still does not (it has been well over 3600s).
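For reference, this is roughly how I am re-reading the store in both environments (a minimal sketch; the store path is hypothetical, and `cache_timeout=0` means gcsfs keeps no listings cache of its own):
```
import xarray as xr
import gcsfs

# fresh filesystem with no listings caching (cache_timeout <= 0 disables it)
fs = gcsfs.GCSFileSystem(project='ldeo-glaciology', cache_timeout=0)
store = fs.get_mapper('gs://ldeo-glaciology/append_test/test.zarr')  # hypothetical path
ds = xr.open_zarr(store)
print(ds.dims)  # expect the appended length here, not the pre-append shape
```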
Also, following @jkingslake above, I set `cache_timeout=0`. The [GCSFS docs](https://gcsfs.readthedocs.io/en/latest/api.html#gcsfs.core.GCSFileSystem) say `Set cache_timeout <= 0 for no caching`, which seems like exactly the functionality we want, yet I still only see the un-appended zarr.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1030811490
https://github.com/pydata/xarray/issues/5878#issuecomment-967408017,https://api.github.com/repos/pydata/xarray/issues/5878,967408017,IC_kwDOAMm_X845qXmR,7237617,2021-11-12T19:40:46Z,2021-11-13T23:25:53Z,NONE,"> Right now, it shows the shape is `[6]`, as expected after the appending. However, if you read the file immediately after appending (within the 3600s `max-age`), you will get the cached copy. The cached copy will still be of shape `[3]`--it won't know about the append.
Ignorant question: is this cache on the client (Jupyter) side or the server (GCS) side? It has been well over 3600s and I am still not seeing the _appended_ zarr when reading it with Xarray.
> To test this hypothesis, you would need to [disable caching](https://cloud.google.com/storage/docs/metadata) on the bucket. Do you have privileges to do that?
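For reference, a rough sketch of what disabling the caching might look like (the bucket name, object prefix, and `no-store` value are assumptions, and it requires write permission on the objects), using the google-cloud-storage client:
```
from google.cloud import storage

client = storage.Client(project='ldeo-glaciology')
# set Cache-Control on every object under the store so GCS stops serving stale copies
for blob in client.list_blobs('ldeo-glaciology', prefix='append_test/test.zarr'):
    blob.cache_control = 'no-store'
    blob.patch()  # push the metadata change to GCS
```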
I tried to do this last night but did not have permission myself. Perhaps @jkingslake does?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1030811490
https://github.com/pydata/xarray/issues/5878#issuecomment-967340995,https://api.github.com/repos/pydata/xarray/issues/5878,967340995,IC_kwDOAMm_X845qHPD,7237617,2021-11-12T18:52:01Z,2021-11-12T18:58:52Z,NONE,"Thanks for pointing out this cache feature @rabernat. I had no idea - it makes sense in general, but it slows down testing if not known about! Anyway, in my case, when appending the second Zarr store to the first, the store's size (measured with `gsutil du`) does indeed double. I'm new to cloud storage, but my hunch is that this suggests the append worked?
> Can you post the full stack trace of the error you get when you try to append?
In my instance there is no error; only this is returned: ``
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1030811490
https://github.com/pydata/xarray/issues/5023#issuecomment-828683287,https://api.github.com/repos/pydata/xarray/issues/5023,828683287,MDEyOklzc3VlQ29tbWVudDgyODY4MzI4Nw==,7237617,2021-04-28T18:30:46Z,2021-04-28T18:30:46Z,NONE,"Thanks @dcherian
```
>> ds = xr.open_mfdataset(NCs_urls, engine='netcdf4',
parallel=True,
concat_dim='XTIME',
)
ValueError: Could not find any dimension coordinates to use to order the datasets for concatenation
```
So it doesn't work, but perhaps that's not surprising given that 'XTIME' is a coordinate while 'Time' is the dimension (one of WRF's quirks related to staggered grids and moving nests).
```
>> print(ds.coords)
Coordinates:
XLAT (Time, south_north, west_east) float32 dask.array
XLONG (Time, south_north, west_east) float32 dask.array
XTIME (Time) datetime64[ns] dask.array
XLAT_U (Time, south_north, west_east_stag) float32 dask.array
XLONG_U (Time, south_north, west_east_stag) float32 dask.array
XLAT_V (Time, south_north_stag, west_east) float32 dask.array
XLONG_V (Time, south_north_stag, west_east) float32 dask.array
```
As such, I'm following the documentation to add a preprocessor `ds.swap_dims({'Time':'XTIME'})`, which works as expected.
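Roughly, the preprocessing step looks like this (a minimal sketch; `NCs_urls` is the same file list as in my earlier snippet):
```
# swap_dims promotes XTIME to a proper dimension coordinate, so open_mfdataset
# can order and concatenate the files by time
def swap_time_dim(ds):
    return ds.swap_dims({'Time': 'XTIME'})

ds = xr.open_mfdataset(NCs_urls, engine='netcdf4', parallel=True,
                       preprocess=swap_time_dim)
```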
Thanks for everyone's help! Shall I close this (as it was never actually an _issue_)?
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,829426650
https://github.com/pydata/xarray/issues/5023#issuecomment-812278389,https://api.github.com/repos/pydata/xarray/issues/5023,812278389,MDEyOklzc3VlQ29tbWVudDgxMjI3ODM4OQ==,7237617,2021-04-02T02:14:19Z,2021-04-02T02:14:19Z,NONE,"Thanks for the great suggestion @shoyer - looping through the netCDF files is working well with Dask, using the following code:
```
import xarray as xr
import gcsfs
from tqdm.autonotebook import tqdm

xr.set_options(display_style=""html"")

fs = gcsfs.GCSFileSystem(project='ldeo-glaciology', mode='r', cache_timeout=0)
NCs = fs.glob('gs://ldeo-glaciology/AMPS/WRF_24/domain_02/*.nc')

# open the first file to start the dataset
url = 'gs://' + NCs[0]
openfile = fs.open(url, mode='rb')
ds = xr.open_dataset(openfile, engine='h5netcdf', chunks={'Time': -1})

# open each remaining file and append it along the Time dimension
for i in tqdm(range(1, 8)):
    url = 'gs://' + NCs[i]
    openfile = fs.open(url, mode='rb')
    temp = xr.open_dataset(openfile, engine='h5netcdf', chunks={'Time': -1})
    ds = xr.concat([ds, temp], 'Time')
```
However, I am still confused about why `open_mfdataset` was not parsing the `Time` dimension - the Dataset concatenated with the looping method above appears to have a time coordinate (`XTIME`) with datetime64[ns] values:
```
>> ds.coords['XTIME'].compute()
<xarray.DataArray 'XTIME' (Time: 8)>
array(['2019-01-01T03:00:00.000000000', '2019-01-01T06:00:00.000000000',
'2019-01-01T09:00:00.000000000', '2019-01-01T12:00:00.000000000',
'2019-01-01T15:00:00.000000000', '2019-01-01T18:00:00.000000000',
'2019-01-01T21:00:00.000000000', '2019-01-02T00:00:00.000000000'],
dtype='datetime64[ns]')
```
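(A quick diagnostic sketch of why `open_mfdataset` ignores `XTIME` here, using the `ds` built above: only dimension coordinates get an index and are used to order the files, and `XTIME` is a coordinate on the `Time` dimension rather than a dimension coordinate itself.)
```
print(list(ds.indexes))         # only dimension coordinates appear here
print(ds.coords['XTIME'].dims)  # ('Time',) -- XTIME is not its own dimension
```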
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,829426650