SILO data in R

Accessing SILO climatic data for Australia in R

This short post will describe how to access SILO climatic data for Australia. The data is available as both csv and json, but here we work with the two-dimensional csv format to make use of R’s powerful data table functionality.

At the time of writing there were 18937 stations. We can access the metadata on each station (including location and years of available data), which will later become useful for selecting an appropriate weather station.

library(tidyverse)
library(sf)
library(tmap)
res <- httr::GET("https://siloapi.longpaddock.qld.gov.au/stations?format=csv")
recs <- read_csv(httr::content(res, as="text"))
stationsmeta = recs %>% 
  st_as_sf(coords = c('longitude', 'latitude'), crs = 4326)

We can plot the location of weather stations using the tmap package.

tm_shape(spData::world %>% filter(name_long == "Australia")) +
  tm_borders() +
  tm_shape(stationsmeta) +
  tm_dots(col = NA, alpha = 0.3) +
  tm_style('gray', title = "SILO weather \nstations", 
            outer.bg.color = rgb(.2,.21,.27))

Most stations have between 5 and 10 years of data.

ggplot(stationsmeta) + 
  geom_histogram(aes(x = ifelse(is.na(end_year), 2018, end_year) - start_year), 
           breaks = seq(0, 200, by = 5)) +
  xlab('years of data') +
  mydarktheme

Useful metadata can be accessed, such as the names of climatic variables and their units.

# variable metadata
silometa = read_csv('https://siloapi.longpaddock.qld.gov.au/variables?format=csv', col_types = 'ccc')
knitr::kable(silometa)
name code units
Daily rainfall daily_rain mm
Monthly rainfall monthly_rain mm
Maximum temperature max_temp Celsius
Minimum temperature min_temp Celsius
Vapour pressure vp hPa
Vapour pressure deficit vp_deficit hPa
Evaporation - Class A pan evap_pan mm
Evaporation - synthetic estimate evap_syn mm
Evaporation - combination (synthetic estimate pre-1970, class A pan 1970 onwards) evap_comb mm
Evaporation - Morton’s shallow lake evaporation evap_morton_lake mm
Solar radiation - total incoming downward shortwave radiation on a horizontal surface radiation MJm-2
Relative humidity at the time of maximum temperature rh_tmax %
Relative humidity at the time of minimum temperature rh_tmin %
Evapotranspiration - FAO56 short crop et_short_crop mm
Evapotranspiration - ASCE tall crop et_tall_crop mm
Evapotranspiration - Morton’s areal actual evapotranspiration et_morton_actual mm
Evapotranspiration - Morton’s potential evapotranspiration et_morton_potential mm
Evapotranspiration - Morton’s wet-environment areal evapotranspiration over land et_morton_wet mm
Mean sea level pressure mslp hPa

Next we specify some options to help us select an appropriate weather station. The string loc is used to extract a coordinate using a geocoder. Alternatively you can manually input a longitude and latitude later. The dates startdate and enddate hold the dates across which we require climatic data. The string apikey holds the API key which is required to query the database and freely available upon registration at the SILO landing page.

loc = 'Melbourne, Australia'
startdate = as.Date('2018-01-01')
enddate = as.Date('2018-12-30')
apikey = 'PASTE YOUR API KEY HERE'

Filter the station metadata by station with data for our dates. As an aside, if the station is still in operation the end_year will be NA.

stations = stationsmeta %>% 
  filter(start_year <= format(startdate, '%Y')) %>%
  filter(is.na(end_year) | (end_year >= format(enddate, '%Y')))

Convert the location string to a coordinate and find the closest weather station to the location.

xy = tmaptools::geocode_OSM(loc,as.sf = TRUE)
near_stat = stations[which.min(st_distance(xy, stations)), ]
knitr::kable(near_stat)
name number start_year end_year supplier elevation geometry
MELBOURNE (OLYMPIC PARK) 86338 2013 NA Bureau of Meteorology 7.5 c(144.9816, -37.8255)

We can also explicitly find the distance to the nearest weather station.

st_distance(xy, near_stat)
## Units: [m]
##          [,1]
## [1,] 2050.359

Now we have the weather station id and a date range for which we require data we can query the SILO database.

params = list(
  apikey = apikey,
  format = 'csv',
  station = near_stat$number,
  start = format(startdate, '%Y%m%d'),
  finish = format(enddate, '%Y%m%d'),
  variables = 'max_temp,min_temp,radiation,daily_rain'
)
res <- httr::GET("https://siloapi.longpaddock.qld.gov.au/pointdata", query=params)
silodata <- read_csv(httr::content(res, as="text")) 
head(silodata)
## # A tibble: 6 x 10
##   station date       daily_rain daily_rain_sour~ max_temp max_temp_source
##     <int> <date>          <dbl>            <int>    <dbl>           <int>
## 1   86338 2018-01-01        0                  0     25.3               0
## 2   86338 2018-01-02        0                  0     21.8               0
## 3   86338 2018-01-03        0                  0     21                 0
## 4   86338 2018-01-04        0.2                0     23.6               0
## 5   86338 2018-01-05        0                  0     30.7               0
## 6   86338 2018-01-06        0                  0     41.7               0
## # ... with 4 more variables: min_temp <dbl>, min_temp_source <int>,
## #   radiation <dbl>, radiation_source <int>

And finally, plot the data.

silodata %>% 
  select(date, `Rain, mm` = daily_rain, `Min temp, C` =min_temp, `Max temp, C` = max_temp) %>%
  gather(variable, value, -date) %>%
  ggplot() + 
    geom_line(aes(date, value, colour = variable)) + 
    xlab('') + ylab('') +
    ggtitle(loc) + 
    mydarktheme

Related

comments powered by Disqus