This function robustly computes a statistic over a vector of values. The user can specify what to return if the vector is empty, which values to include or exclude, and how to treat missing values. It is useful in combination with functions such as aggregate or dplyr's summarise.

statistic(
  x,
  f = length,
  include = NULL,
  exclude = NULL,
  na.rm = TRUE,
  default = NA,
  ...
)

Arguments

x

A vector.

f

A function that takes the vector x as its first argument.

include

A logical vector of the same length as x; only the elements of x where it is TRUE are included.

exclude

A vector of unique values; elements of x that match any of them are excluded.

na.rm

Logical; if TRUE, NA values are removed from x.

default

The default value to return if x is empty.

...

Additional arguments to pass to the function f.

Value

The computed statistic, i.e. the output of the user-supplied function f, or the value of default if x is empty.
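
To make the interaction of the arguments concrete, here is a minimal sketch of how they might combine. The name statistic_sketch is hypothetical and this is not the packaged implementation: the order of the filtering steps, the treatment of NA entries in include, and returning default when filtering leaves nothing are all assumptions.

```r
# Hypothetical re-implementation for illustration only; the packaged
# statistic() may differ in details such as the order of the filtering steps.
statistic_sketch <- function(x, f = length, include = NULL, exclude = NULL,
                             na.rm = TRUE, default = NA, ...) {
  # Start by keeping every element of x
  keep <- rep(TRUE, length(x))
  # Keep only elements flagged TRUE in 'include' (assumption: NA counts as FALSE)
  if (!is.null(include)) keep <- keep & (include %in% TRUE)
  # Drop elements whose value matches any entry of 'exclude'
  if (!is.null(exclude)) keep <- keep & !(x %in% exclude)
  # Drop missing values when requested
  if (na.rm) keep <- keep & !is.na(x)
  x <- x[keep]
  # Assumption: 'default' is also returned when filtering leaves nothing
  if (length(x) == 0) return(default)
  # Apply the user-supplied function, forwarding any extra arguments
  f(x, ...)
}

statistic_sketch(iris$Sepal.Length, f = mean, include = iris$Species == "setosa")
#> [1] 5.006
statistic_sketch(c(1, NA, 3), f = sum)
#> [1] 4
statistic_sketch(NULL, default = 0)
#> [1] 0
```

Building a single logical mask before subsetting means the result does not depend on the order in which include, exclude, and na.rm are applied.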

Examples

# Examples using the 'iris' data set
data("iris")

# Default
statistic(iris$Sepal.Length)
#> [1] 150
# User-specified statistic
statistic(iris$Sepal.Length, f = mean)
#> [1] 5.843333
# Include subset of cases
statistic(iris$Sepal.Length,
  f = mean,
  include = iris$Species == "setosa"
)
#> [1] 5.006
# Custom function
statistic(iris$Species, f = function(x) mean(x %in% "setosa"))
#> [1] 0.3333333
# Exclude unique cases
statistic(iris$Species,
  f = function(x) mean(x %in% "setosa"),
  exclude = "virginica"
)
#> [1] 0.5
# If an empty vector is supplied, the user-specified default value is returned
statistic(iris$wrong_name, default = 0)
#> [1] 0
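
The description mentions use with aggregate. The sketch below shows one way that might look. So that it runs on its own, it defines statistic_standin, a hypothetical and simplified stand-in that is not the packaged function (it omits include); the wrapper passed to FUN could call the packaged statistic() instead.

```r
# Hypothetical, simplified stand-in so the example runs on its own;
# swap in the packaged statistic() when it is available.
statistic_standin <- function(x, f = length, exclude = NULL,
                              na.rm = TRUE, default = NA, ...) {
  # Drop values listed in 'exclude', then missing values if requested
  if (!is.null(exclude)) x <- x[!(x %in% exclude)]
  if (na.rm) x <- x[!is.na(x)]
  # Return 'default' when nothing is left, otherwise apply f
  if (length(x) == 0) default else f(x, ...)
}

# Mean sepal length per species via a wrapper passed to aggregate()
res <- aggregate(
  Sepal.Length ~ Species,
  data = iris,
  FUN = function(v) statistic_standin(v, f = mean)
)
res
#>      Species Sepal.Length
#> 1     setosa        5.006
#> 2 versicolor        5.936
#> 3  virginica        6.588
```

Wrapping the call in an anonymous function keeps the arguments intended for the statistic (such as f) from being matched against aggregate's own parameters.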