Solves the empirical Bayes normal means (EBNM) problem for observations belonging to distinct groups.
ebnm_group(
  x,
  s = 1,
  group,
  prior_family = "point_normal",
  mode = 0,
  scale = "estimate",
  g_init = NULL,
  fix_g = FALSE,
  output = ebnm_output_default(),
  ...
)
x: A vector of observations. Missing observations (NAs) are
not allowed.
s: A vector of standard errors (or a scalar if all are equal).
Standard errors may not be exactly zero, and
missing standard errors are not allowed. Two prior families have
additional restrictions: when horseshoe priors are used, errors
must be homoskedastic; and since function
deconv in package deconvolveR takes
\(z\)-scores, the "deconvolver" family requires that all standard errors
be equal to 1.
group: A vector of character strings that gives the group to which each
observation belongs. It must have the same length as argument x. For
an example of usage, see Examples below.
prior_family: A named vector that specifies the prior family \(G\) for each group. If the same prior family is to be used for all groups, then a character string may be used instead.
mode: A named list that specifies, for each group, the mode of the
respective prior \(g\), or "estimate" if the mode is to be
estimated from the data. If the mode is the same across groups, then a
scalar may be used instead. If all modes are to be estimated, then
mode = "estimate" may be used.
scale: A named list that specifies, for each group, the scale
parameter(s) of the respective prior, or "estimate" if the scale
parameters are to be estimated from the data. If the scale parameter is the
same across groups, then a scalar may be used instead. If all scales are to
be estimated, then scale = "estimate" may be used.
g_init: The prior distributions \(g\). Usually this is left
unspecified (NULL) and estimated from the data. However, it can be
used in conjunction with fix_g = TRUE to fix the prior (useful, for
example, to do computations with the "true" \(g\) in simulations). If
g_init is specified but fix_g = FALSE, g_init
specifies the initial value of \(g\) used during optimization. If
g_init is supplied, it should be a named list that specifies, for
each group, a prior of the appropriate class: class normalmix for
normal, point-normal, scale mixture of normals, and deconvolveR prior
families, as well as for the NPMLE; class laplacemix for
point-Laplace families; class gammamix for point-exponential
families; class horseshoe for horseshoe families; and class
unimix for unimodal_ families.
fix_g: If TRUE, fix the prior \(g\) at g_init instead
of estimating it.
output: A character vector indicating which values are to be returned.
Function ebnm_output_default() provides the default return values, while
ebnm_output_all() lists all possible return values. See Value
below.
...: Additional parameters. When a unimodal_ prior family is used,
these parameters are passed to function ash in package
ashr. Although it
does not call into ashr, the scale mixture of normals family accepts
parameter gridmult for purposes of comparison. When gridmult
is set, an ashr-style grid will be used instead of the default
ebnm grid. When the "deconvolver" family is used, additional
parameters are passed to function deconv in
package deconvolveR. Families of generalized binary priors take several
additional parameters; see ebnm_generalized_binary. In all
other cases, additional parameters are ignored.
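As a minimal sketch of how g_init and fix_g combine, the following fixes a known normal prior for each group, as one might do in a simulation. It assumes that the ebnm and ashr packages are installed. The seed, the simulated data, and the choice of the "normal" prior family are illustrative assumptions rather than part of the documentation above. The priors are built with the normalmix() constructor from ashr, since normalmix is the class named above for normal families.

```r
library(ebnm)

set.seed(1)  # arbitrary seed, for reproducibility only
group <- c(rep("small_sd", 100), rep("large_sd", 100))
theta <- c(rnorm(100, sd = 1), rnorm(100, sd = 10))
s <- 1
x <- theta + rnorm(200, 0, s)

# One single-component normalmix prior per group, named by group.
# These match the distributions used to simulate theta above.
true_g <- list(
  small_sd = ashr::normalmix(pi = 1, mean = 0, sd = 1),
  large_sd = ashr::normalmix(pi = 1, mean = 0, sd = 10)
)

# With fix_g = TRUE the priors are held at true_g rather than estimated.
fixed_fit <- ebnm_group(
  x, s, group,
  prior_family = "normal",
  g_init = true_g,
  fix_g = TRUE
)
fixed_fit$log_likelihood
```

Setting fix_g = FALSE with the same g_init would instead use true_g only as a starting point for the optimization.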
An ebnm object. Depending on the argument to output, the
object is a list containing elements:
data: A data frame containing the observations x
and standard errors s.
posterior: A data frame of summary results (posterior means, standard deviations, second moments, and local false sign rates).
fitted_g: The fitted prior \(\hat{g}\) (an object of
class normalmix, laplacemix,
gammamix, unimix,
tnormalmix, or horseshoe).
log_likelihood: The optimal log likelihood attained, \(L(\hat{g})\).
posterior_sampler: A function that can be used to
produce samples from the posterior. For all prior families other
than the horseshoe, the sampler takes a single parameter
nsamp, the number of posterior samples to return per
observation. Since ebnm_horseshoe returns an MCMC sampler,
it additionally takes parameter burn, the number of burn-in
samples to discard.
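The sketch below requests every return value and then draws from the posterior sampler. It assumes the ebnm package is installed and that the sampler returned for a grouped fit follows the nsamp interface described above. The seed, the simulated data, and the object names are illustrative assumptions.

```r
library(ebnm)

set.seed(1)  # arbitrary seed, for reproducibility only
group <- c(rep("small_sd", 100), rep("large_sd", 100))
theta <- c(rnorm(100, sd = 1), rnorm(100, sd = 10))
s <- 1
x <- theta + rnorm(200, 0, s)

# Ask for all available return values, including the posterior sampler.
full_fit <- ebnm_group(x, s, group, output = ebnm_output_all())
names(full_fit)

# Draw 50 posterior samples per observation (default point-normal family,
# so nsamp is the only sampler parameter).
post_samp <- full_fit$posterior_sampler(nsamp = 50)
dim(post_samp)
```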
S3 methods coef, confint, fitted, logLik,
nobs, plot, predict, print, quantile,
residuals, simulate, summary, and vcov
have been implemented for ebnm objects. For details, see the
respective help pages, linked below under See Also.
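A few of these methods can be tried on a grouped fit as follows. This is a minimal sketch that assumes the ebnm package is installed and that the methods apply to objects returned by ebnm_group in the same way as to other ebnm objects. The seed and simulated data are illustrative assumptions.

```r
library(ebnm)

set.seed(1)  # arbitrary seed, for reproducibility only
group <- c(rep("small_sd", 100), rep("large_sd", 100))
theta <- c(rnorm(100, sd = 1), rnorm(100, sd = 10))
s <- 1
x <- theta + rnorm(200, 0, s)

fit <- ebnm_group(x, s, group)

logLik(fit)      # log likelihood of the grouped fit
head(coef(fit))  # first few coefficients (one per observation)
nobs(fit)        # number of observations used in the fit
```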
The EBNM model for grouped data, with observations \(x_j\) belonging to groups \(k = 1, \ldots, K\), is
$$x_j \mid \theta_j, s_j \sim N(\theta_j, s_j^2)$$
$$\theta_j \sim g_{k(j)} \in G_{k(j)},$$
where \(k(j)\) denotes the group to which observation \(j\) belongs.
Solving the EBNM problem for grouped data is equivalent to solving a separate EBNM problem for each group \(k = 1, ..., K\), with the optimal log likelihood equal to the sum of the optimal log likelihoods for each separate problem.
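The equivalence stated above can be checked numerically. The sketch below fits each group separately with ebnm(), using the same prior family, mode, and scale that ebnm_group uses by default, and compares the summed log likelihoods with that of the grouped fit. It assumes the ebnm package is installed. The seed, the simulated data, and the object names are illustrative assumptions.

```r
library(ebnm)

set.seed(1)  # arbitrary seed, for reproducibility only
group <- c(rep("small_sd", 100), rep("large_sd", 100))
theta <- c(rnorm(100, sd = 1), rnorm(100, sd = 10))
s <- 1
x <- theta + rnorm(200, 0, s)

# Grouped fit with the default settings.
grouped_fit <- ebnm_group(x, s, group)

# Separate fits, one per group, with matching settings.
separate_fits <- lapply(split(x, group), function(x_k) {
  ebnm(x_k, s, prior_family = "point_normal", mode = 0, scale = "estimate")
})

# Sum the per-group optimal log likelihoods.
separate_loglik <- sum(sapply(
  separate_fits,
  function(fit_k) as.numeric(fit_k$log_likelihood)
))

# The two quantities should agree up to numerical tolerance.
all.equal(as.numeric(grouped_fit$log_likelihood), separate_loglik)
```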
group <- c(rep("small_sd", 100), rep("large_sd", 100))
theta <- c(rnorm(100, sd = 1), rnorm(100, sd = 10))
s <- 1
x <- theta + rnorm(200, 0, s)
ebnm.group.res <- ebnm_group(x, s, group)
# Use different prior families for each group:
ebnm.group.res <- ebnm_group(
  x, s, group,
  prior_family = list(small_sd = "normal", large_sd = "normal_scale_mixture")
)
# Different modes and scales can be set similarly:
ebnm.group.res <- ebnm_group(
  x, s, group,
  mode = list(small_sd = 0, large_sd = "estimate"),
  scale = list(small_sd = 1, large_sd = "estimate")
)