Where do we grow things? This is an important question that can be answered in a few ways, but the most direct way is simply to ask growers. That is what was done at a national scale in the 2017 United States Census of Agriculture, and here I am going to use R to load it, filter it, and plot the result.
To start, we need to download the data and the shapefile of the USA counties (or states).
I never liked public speaking. When I was in primary school, I remember getting my younger sister to buy things for me from shops so I could avoid talking to adults. Although it is less severe in my adult life, I still can’t say I particularly enjoy giving presentations for their own sake. But what I have realised is that communication is a hugely important skill in just about any professional role, which motivated me to push past the discomfort.
I don’t know why, but it took me a little while to properly make sense of these diagnostics, so I wanted to develop a very simple illustration of the logic behind these concepts. ROC stands for Receiver Operating Characteristic, while AUC is the area under the ROC curve, which is used as a metric for model performance in a classification problem. Performance is measured as the ability to maximise true positives while minimising false positives.
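To make the logic concrete, here is a minimal base R sketch on simulated data. The variable names and the simulated scores are my own assumptions for illustration, not the code from the full post. It sweeps a classification threshold, records the true and false positive rates at each step, and computes the AUC two ways.

```r
# Minimal ROC/AUC sketch in base R (simulated data; all names are illustrative).
set.seed(1)

# Simulate a binary outcome and a score that tends to be higher for positives.
truth <- rbinom(200, size = 1, prob = 0.5)
score <- rnorm(200, mean = ifelse(truth == 1, 1, 0), sd = 1)

# Sweep the classification threshold from high to low.
thresholds <- c(Inf, sort(unique(score), decreasing = TRUE))

# True positive rate and false positive rate at each threshold.
tpr <- sapply(thresholds, function(th) mean(score[truth == 1] >= th))
fpr <- sapply(thresholds, function(th) mean(score[truth == 0] >= th))

# Area under the ROC curve by the trapezoidal rule.
auc <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)

# Cross-check: AUC is the probability that a random positive
# outscores a random negative (ties counted as one half).
pos <- score[truth == 1]
neg <- score[truth == 0]
auc_rank <- mean(outer(pos, neg, ">") + 0.5 * outer(pos, neg, "=="))

plot(fpr, tpr, type = "s",
     xlab = "False positive rate", ylab = "True positive rate")
abline(0, 1, lty = 2)  # a classifier with no skill follows the diagonal
```

The two AUC values should agree, which is a handy way to remember what the area means: an AUC of 0.5 is a coin flip, and 1 is perfect separation of the two classes.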
Soil structure and health are critical to water availability, nutrient cycling, and plant productivity. By extension, soils will have strong associations with invertebrate community assemblage, diversity, and abundance, and are thus important to invertebrate science.
Modern spatial datasets on soil are available to help us integrate variation in soil characteristics into the prediction of invertebrate processes. In Australia, one of the newer soil datasets is the Soil and Landscape Grid of Australia.
Proportions are a funny thing in statistics. Some people just seem to love percentages. But there is a dark side to modelling a response variable as a percentage.
For example, I might be tempted to fit a linear model to mortality data on some insects exposed to heat stress for some time. To prove the point I will simulate some data.
library(tidyverse)
time = rep(0:9, 10)
n = 100
a = -1
b = 0.
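As a rough sketch of where this goes, the base R code below simulates mortality from a logistic relationship and compares a linear model on the proportion with a binomial GLM. The slope `b_assumed` is a hypothetical value I picked for illustration (the value in the snippet above is cut off), and the data frame and model names are likewise assumptions rather than the code from the full post.

```r
# Sketch only: the slope below is an assumed value, not taken from the post.
set.seed(42)

time <- rep(0:9, 10)   # exposure times, 10 replicates each (as in the post)
n <- 100               # insects per replicate (as in the post)
a <- -1                # intercept on the logit scale (as in the post)
b_assumed <- 0.5       # hypothetical slope, chosen for illustration

# Simulate deaths from a binomial with a logistic response to exposure time.
p_true <- plogis(a + b_assumed * time)
dead <- rbinom(length(time), size = n, prob = p_true)
d <- data.frame(time = time, dead = dead, alive = n - dead, prop = dead / n)

# A linear model on the proportion is free to predict outside [0, 1].
fit_lm <- lm(prop ~ time, data = d)

# A binomial GLM keeps predictions inside (0, 1).
fit_glm <- glm(cbind(dead, alive) ~ time, family = binomial, data = d)

# Compare predictions, including times beyond the observed range.
new_times <- data.frame(time = c(-5, 0, 5, 10, 15))
pred_lm <- predict(fit_lm, newdata = new_times)
pred_glm <- predict(fit_glm, newdata = new_times, type = "response")
print(cbind(new_times, lm = round(pred_lm, 3), glm = round(pred_glm, 3)))
```

Under these assumptions the linear model predicts more than 100% mortality at long exposure times, while the GLM predictions stay between 0 and 1.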
As pest scientists, it is important to understand how crop seasonality overlaps with pest seasonality.
But vegetable and fruit seasonality in Australia is important for a few other reasons. Because it is more available locally, in-season produce is cheaper and fresher, and has a lower carbon footprint, than imported produce. In addition, different areas in Australia have different fruit production outputs (e.g. high melon production in New South Wales but not in Victoria), so it is important to know what is grown near you.
Aggregation measures for pest abundance have been widely used both as summary statistics for aggregation levels and in designing surveillance protocols. Taylor’s power law and Iwao’s patchiness regression are the two most commonly used methods. To be frank, I find the measures a little strange, particularly when they appear in papers as “cookbook statistics” (sometimes incorrectly presented) with little reference to any underpinning theory. But I managed to find some useful sources which helped to clarify things for me.
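For orientation, here is a minimal base R sketch of how the two measures are usually computed, using simulated counts. The negative binomial dispersion `k`, the field means, and all object names are assumptions for illustration only. Taylor’s power law relates the sample variance to the sample mean as s² = a·mᵇ, and Iwao’s regression relates Lloyd’s mean crowding, m* = m + s²/m − 1, to the mean as m* = α + βm.

```r
# Sketch with simulated, aggregated counts (all values are illustrative).
set.seed(7)
k <- 2                                    # assumed negative binomial dispersion
true_means <- c(0.5, 1, 2, 4, 8, 16, 32)  # hypothetical mean densities per field
n_samples <- 500                          # sampling units per field

# One row of summary statistics (mean and variance) per field.
stats <- t(sapply(true_means, function(mu) {
  counts <- rnbinom(n_samples, size = k, mu = mu)
  c(m = mean(counts), s2 = var(counts))
}))
stats <- as.data.frame(stats)

# Taylor's power law: s2 = a * m^b, fitted on the log-log scale.
taylor <- lm(log(s2) ~ log(m), data = stats)
taylor_b <- coef(taylor)[["log(m)"]]

# Lloyd's mean crowding, then Iwao's regression: m_star = alpha + beta * m.
stats$m_star <- stats$m + stats$s2 / stats$m - 1
iwao <- lm(m_star ~ m, data = stats)
iwao_beta <- coef(iwao)[["m"]]

print(c(taylor_b = taylor_b, iwao_beta = iwao_beta))
```

In both regressions a slope near 1 is what a random (Poisson) pattern would give, and a slope above 1 is read as aggregation. For negative binomial counts with a common `k`, the Iwao slope should sit near 1 + 1/k.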
A practical question in species surveillance is “How much search effort is required for detection?” This can be quantified under controlled conditions, where the number and location of target species are known and participants are recruited to see how success rate varies with effort.
Let’s use the example of an Easter egg hunt, where the adult (the researcher) wants to quantify how much effort it takes a child (the participant) to find an Easter egg.
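One simple way to picture this is the base R sketch below. It assumes, purely for illustration, that a child has a constant chance `p_find` of finding the egg in any given minute of searching. That assumption, the parameter values, and the object names are all mine and may differ from the approach taken in the full post.

```r
# Sketch of an Easter egg hunt with a constant per-minute detection chance.
set.seed(3)
p_find <- 0.15        # assumed chance a child finds the egg in any one minute
n_children <- 500     # hypothetical number of participants

# Minutes searched until the egg is found (rgeom counts failures, so add 1).
minutes_to_find <- rgeom(n_children, prob = p_find) + 1

# Empirical detection curve: share of children who found the egg within t minutes.
effort <- 1:30
detected <- sapply(effort, function(t) mean(minutes_to_find <= t))

# Theoretical curve under the same assumption: 1 - (1 - p)^t.
expected <- 1 - (1 - p_find)^effort

# Smallest whole number of minutes giving at least 95% detection under this model.
effort_95 <- ceiling(log(1 - 0.95) / log(1 - p_find))

print(round(cbind(effort, detected, expected)[c(1, 5, 10, 20, 30), ], 3))
```

With real participants the per-minute chance is unlikely to be constant, so the empirical curve is the quantity of interest and the theoretical curve is only a reference point.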
This short post will describe how to access SILO climatic data for Australia.
Since I first wrote this post, there has been a significant overhaul of the API, so I have written a simplified (but functioning!) version of the previous post here.
library(tidyverse)
library(sf)
library(tmap)
res <- httr::GET("https://www.longpaddock.qld.gov.au/cgi-bin/silo/PatchedPointDataset.php?format=name&nameFrag=_")
recs <- read_delim(httr::content(res, as="text"), delim = "|", col_names = c('number','name','latitude','longitude','state', 'elevation', 'extra')) %>%
  mutate(latitude = as.
How many probability distributions can we generate by imagining simple natural processes? In this post I use a simple binomial random number generator to produce different random variables with a variety of distributions. Using the built-in probability density functions in R, I show how the simulated data (plot bars) approach the exact probability density (plot lines) and provide an intuitive interpretation of the model parameters of commonly encountered distributions.
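As a taste of the idea, here is a minimal base R sketch of two such constructions. The parameter values and object names are assumptions of mine for illustration, not the code from the full post. The first builds Poisson-like counts from many binomial trials with a small success chance, and the second builds a geometric variable by counting failures before the first success. Each is compared with R's exact probability function.

```r
# Sketch: building other distributions from a binomial random number generator.
set.seed(11)
n_sims <- 10000

# Poisson as a limit of the binomial: many trials, each with a small success chance.
lambda <- 3
trials <- 1000
counts <- rbinom(n_sims, size = trials, prob = lambda / trials)
sim_pois <- as.numeric(table(factor(counts, levels = 0:10))) / n_sims
exact_pois <- dpois(0:10, lambda)

# Geometric from repeated single binomial (Bernoulli) trials:
# count the failures before the first success.
p <- 0.3
failures_before_success <- function() {
  failures <- 0
  while (rbinom(1, size = 1, prob = p) == 0) failures <- failures + 1
  failures
}
waits <- replicate(n_sims, failures_before_success())
sim_geom <- as.numeric(table(factor(waits, levels = 0:10))) / n_sims
exact_geom <- dgeom(0:10, p)

# Simulated frequencies (the "bars") next to exact probabilities (the "lines").
print(round(rbind(sim_pois, exact_pois, sim_geom, exact_geom), 3))
```

Seen this way, `lambda` is just the expected number of successes across many rare trials, and the geometric parameter `p` is the per-trial chance that the waiting ends.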
A biological example
“Nothing in Biology Makes Sense Except in the Light of Evolution” - Theodosius Dobzhansky, 1973