botorch.acquisition
Acquisition Function APIs
Abstract Acquisition Function APIs
Abstract base module for all botorch acquisition functions.
- class botorch.acquisition.acquisition.AcquisitionFunction(model)[source]
Bases: Module, ABC
Abstract base class for acquisition functions.
Please note that if your acquisition requires a backwards call, you will need to wrap the backwards call inside of an enable_grad context to be able to optimize the acquisition. See #1164.
Constructor for the AcquisitionFunction base class.
- Parameters:
model (Model) – A fitted model.
- class botorch.acquisition.acquisition.OneShotAcquisitionFunction(model)[source]
Bases: AcquisitionFunction, ABC
Abstract base class for acquisition functions using one-shot optimization.
Constructor for the AcquisitionFunction base class.
- Parameters:
model (Model) – A fitted model.
- abstractmethod get_augmented_q_batch_size(q)[source]
Get augmented q batch size for one-shot optimization.
- Parameters:
q (int) – The number of candidates to consider jointly.
- Returns:
The augmented size for one-shot optimization (including variables parameterizing the fantasy solutions).
- Return type:
int
- abstractmethod extract_candidates(X_full)[source]
Extract the candidates from a full “one-shot” parameterization.
- Parameters:
X_full (Tensor) – A b x q_aug x d-dim Tensor with b t-batches of q_aug design points each.
- Returns:
A b x q x d-dim Tensor with b t-batches of q design points each.
- Return type:
Tensor
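The shape contract between get_augmented_q_batch_size and extract_candidates can be sketched without tensors. The layout below is an assumption made for illustration only: the class name OneShotLayout is hypothetical, and placing the q real candidates first, followed by num_fantasies extra rows, is one possible parameterization rather than something this page specifies.

```python
class OneShotLayout:
    """Hypothetical stand-in for a one-shot acquisition function's shape logic."""

    def __init__(self, num_fantasies: int) -> None:
        # Number of extra rows that parameterize the fantasy solutions (assumed).
        self.num_fantasies = num_fantasies

    def get_augmented_q_batch_size(self, q: int) -> int:
        # q real candidates plus one row per fantasy solution.
        return q + self.num_fantasies

    def extract_candidates(self, X_full):
        # X_full is a nested list of shape b x q_aug x d; keep the first q rows.
        return [batch[: len(batch) - self.num_fantasies] for batch in X_full]


layout = OneShotLayout(num_fantasies=4)
b, q, d = 2, 3, 2
q_aug = layout.get_augmented_q_batch_size(q)

# Build a b x q_aug x d nested list with recognizable entries.
X_full = [
    [[float(100 * i + 10 * j + k) for k in range(d)] for j in range(q_aug)]
    for i in range(b)
]
X = layout.extract_candidates(X_full)  # shape b x q x d
```

An optimizer would work on the full b x q_aug x d input and call extract_candidates only once at the end to recover the points to evaluate.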
- class botorch.acquisition.acquisition.MCSamplerMixin(sampler=None)[source]
Bases: ABC
A mix-in for adding sampler functionality into an acquisition function class.
- _default_sample_shape
The sample_shape for the default sampler.
Register the sampler on the acquisition function.
- Parameters:
sampler (MCSampler | None) – The sampler used to draw base samples for MC-based acquisition functions. If None, a sampler is generated on the fly within the get_posterior_samples method using get_sampler.
- get_posterior_samples(posterior)[source]
Sample from the posterior using the sampler.
- Parameters:
posterior (Posterior) – The posterior to sample from.
- Return type:
Tensor
- property sample_shape: Size
- class botorch.acquisition.acquisition.MultiModelAcquisitionFunction(model_dict)[source]
Bases: AcquisitionFunction, ABC
Abstract base class for acquisition functions that require multiple types of models.
The intended use case for these acquisition functions is one where we have multiple models, each serving a distinct purpose. As an example, we can have a “regression” model that predicts one or more outcomes, and a “classification” model that predicts the probability that a given parameterization is feasible. The multi-model acquisition function can then weight the acquisition value computed with the “regression” model by the feasibility value predicted by the “classification” model to produce the composite acquisition value.
This is currently only a placeholder to help with some development in Ax. We plan to add some acquisition functions utilizing multiple models in the future.
Constructor for the MultiModelAcquisitionFunction base class.
- Parameters:
model_dict (ModelDict) – A ModelDict mapping labels to models.
Analytic Acquisition Function API
- class botorch.acquisition.analytic.AnalyticAcquisitionFunction(model, posterior_transform=None, allow_multi_output=False)[source]
Bases: AcquisitionFunction, ABC
Base class for analytic acquisition functions.
Base constructor for analytic acquisition functions.
- Parameters:
model (Model) – A fitted single-outcome model.
posterior_transform (PosteriorTransform | None) – A PosteriorTransform. If using a multi-output model, a PosteriorTransform that transforms the multi-output posterior into a single-output posterior is required.
allow_multi_output (bool) – If False, requires a posterior_transform if a multi-output model is passed.
Cached Cholesky Acquisition Function API
Abstract class for acquisition functions leveraging a cached Cholesky decomposition of the posterior covariance over f(X_baseline).
- botorch.acquisition.cached_cholesky.supports_cache_root(model)[source]
Checks if a model supports the cache_root functionality. The two criteria are that the model is not multi-task and the model produces a GPyTorchPosterior.
- Parameters:
model (Model)
- Return type:
bool
- class botorch.acquisition.cached_cholesky.CachedCholeskyMCSamplerMixin(model, cache_root=None, sampler=None)[source]
Bases: MCSamplerMixin
Abstract Mixin class for acquisition functions using a cached Cholesky.
Specifically, this is for acquisition functions that require sampling from the posterior P(f(X_baseline, X) | D). The Cholesky of the posterior covariance over f(X_baseline) is cached.
Set class attributes and perform compatibility checks.
Decoupled Acquisition Function API
Abstract base module for decoupled acquisition functions.
- class botorch.acquisition.decoupled.DecoupledAcquisitionFunction(model, X_evaluation_mask=None, **kwargs)[source]
Bases: AcquisitionFunction, ABC
Abstract base class for decoupled acquisition functions. A decoupled acquisition function is one where one may intend to evaluate a design on only a subset of the outcomes. Typically this would be handled by fantasizing, where one would fantasize as to what the partial observation would be if one were to evaluate a design on the subset of outcomes (e.g. you only fantasize at those outcomes). The X_evaluation_mask specifies which outcomes should be evaluated for each design. X_evaluation_mask is q x m, where there are q design points in the batch and m outcomes. In the asynchronous case, where there are n' pending points, we need to track which outcomes each pending point should be evaluated on. In this case, we concatenate X_pending_evaluation_mask with X_evaluation_mask to obtain the full evaluation_mask.
This abstract class handles generating and updating an evaluation mask, which is a boolean tensor indicating which outcomes a given design is being evaluated on. The evaluation mask has shape (n' + q) x m, where n' is the number of pending points and q is the number of new candidates to be generated.
If X(_pending)_evaluation_mask is None, it is assumed that X(_pending) will be evaluated on all outcomes.
Initialize.
- Parameters:
model (ModelList) – A model
X_evaluation_mask (Tensor | None) – A q x m-dim boolean tensor indicating which outcomes the decoupled acquisition function should generate new candidates for.
- property X_evaluation_mask: Tensor | None
Get the evaluation indices for the new candidate.
- set_X_pending(X_pending=None, X_pending_evaluation_mask=None)[source]
Informs the AF about pending design points for different outcomes.
- Parameters:
X_pending (Tensor | None) – A n' x d Tensor with n' d-dim design points that have been submitted for evaluation but have not yet been evaluated.
X_pending_evaluation_mask (Tensor | None) – A n' x m-dim tensor of booleans indicating which outputs each pending point is being evaluated on. If X_pending_evaluation_mask is None, it is assumed that X_pending will be evaluated on all outcomes.
- Return type:
None
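The mask bookkeeping described for DecoupledAcquisitionFunction can be sketched with nested lists of booleans. The helper name full_evaluation_mask is hypothetical, and the row order (pending rows first, then new candidates) is an assumption taken from the (n' + q) x m notation, not from the library source.

```python
def full_evaluation_mask(X_evaluation_mask, X_pending_evaluation_mask, q, n_pending, m):
    """Return an (n_pending + q) x m boolean mask as a nested list.

    A mask of None means "evaluate every outcome", so it expands to all True.
    """
    if X_evaluation_mask is None:
        X_evaluation_mask = [[True] * m for _ in range(q)]
    if X_pending_evaluation_mask is None:
        X_pending_evaluation_mask = [[True] * m for _ in range(n_pending)]
    # Assumed ordering: pending points first, then the q new candidates.
    return X_pending_evaluation_mask + X_evaluation_mask


# One new candidate evaluated on outcome 0 only; two pending points with no
# mask given, so they are treated as evaluated on both outcomes.
mask = full_evaluation_mask(
    X_evaluation_mask=[[True, False]],
    X_pending_evaluation_mask=None,
    q=1,
    n_pending=2,
    m=2,
)
```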
Monte-Carlo Acquisition Function API
- class botorch.acquisition.monte_carlo.MCAcquisitionFunction(model, sampler=None, objective=None, posterior_transform=None, X_pending=None)[source]
Bases: AcquisitionFunction, MCSamplerMixin, ABC
Abstract base class for Monte-Carlo based batch acquisition functions.
- Parameters:
model (Model) – A fitted model.
sampler (MCSampler | None) – The sampler used to draw base samples. If not given, a sampler is generated on the fly within the get_posterior_samples method using botorch.sampling.get_sampler. NOTE: For posteriors that do not support base samples, a sampler compatible with the intended use case must be provided. See ForkedRNGSampler and StochasticSampler as examples.
objective (MCAcquisitionObjective | None) – The MCAcquisitionObjective under which the samples are evaluated. Defaults to IdentityMCObjective().
posterior_transform (PosteriorTransform | None) – A PosteriorTransform (optional).
X_pending (Tensor | None) – A batch_shape x m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated.
- abstractmethod forward(X)[source]
Takes in a batch_shape x q x d X Tensor of t-batches with q d-dim design points each, and returns a Tensor with shape batch_shape', where batch_shape' is the broadcasted batch shape of model and input X. Should utilize the result of set_X_pending as needed to account for pending function evaluations.
- Parameters:
X (Tensor)
- Return type:
Tensor
Multi-Output Acquisition Function API
Abstract base module for multi-output acquisition functions.
- class botorch.acquisition.multioutput_acquisition.MultiOutputAcquisitionFunction(model)[source]
Bases: AcquisitionFunction, ABC
Abstract base class for multi-output acquisition functions.
These are intended to be optimized with a multi-objective optimizer (e.g. NSGA-II).
Constructor for the AcquisitionFunction base class.
- Parameters:
model (Model) – A fitted model.
- abstractmethod forward(X)[source]
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Tensor) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.
- Returns:
A (b) x m-dim Tensor of acquisition function values at the given design points X.
- Return type:
Tensor
- class botorch.acquisition.multioutput_acquisition.MultiOutputPosteriorMean(model, weights=None)[source]
Bases: MultiOutputAcquisitionFunction
Constructor for the MultiOutputPosteriorMean.
Maximization of all outputs is assumed by default. Minimizing outputs can be achieved by setting the corresponding weights to negative.
- Parameters:
model (Model) – A fitted model.
weights (Tensor | None) – A one-dimensional tensor with m elements representing the weights on the outputs.
- forward(X, *args, **kwargs)
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Any) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.
acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A (b) x m-dim Tensor of acquisition function values at the given design points X.
- Return type:
Any
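The role of the weights argument can be illustrated on plain numbers. This sketch assumes the weights are applied by elementwise multiplication with the m posterior means; that reading, and the function name weighted_posterior_mean, are assumptions for illustration rather than the library's code.

```python
def weighted_posterior_mean(means, weights=None):
    """Scale each of the m posterior means by its weight (default weight 1.0)."""
    if weights is None:
        weights = [1.0] * len(means)
    if len(weights) != len(means):
        raise ValueError("weights must have one entry per output")
    return [w * mu for w, mu in zip(weights, means)]


# Output 0 is maximized; output 1 is minimized by giving it a negative weight,
# so a smaller mean on output 1 yields a larger acquisition value.
values = weighted_posterior_mean([0.5, 2.0], weights=[1.0, -1.0])
```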
- class botorch.acquisition.multioutput_acquisition.MultiOutputAcquisitionFunctionWrapper(acqfs)[source]
Bases: MultiOutputAcquisitionFunction
Multi-output wrapper around single-output acquisition functions.
Constructor for the MultiOutputAcquisitionFunctionWrapper.
- Parameters:
acqfs (list[AcquisitionFunction]) – A list of m acquisition functions.
Base Classes for Multi-Objective Acquisition Function API
Base classes for multi-objective acquisition functions.
- class botorch.acquisition.multi_objective.base.MultiObjectiveAnalyticAcquisitionFunction(model, posterior_transform=None)[source]
Bases: AcquisitionFunction
Abstract base class for Multi-Objective batch acquisition functions.
Constructor for the MultiObjectiveAnalyticAcquisitionFunction base class.
- Parameters:
model (Model) – A fitted model.
posterior_transform (PosteriorTransform | None) – A PosteriorTransform (optional).
- class botorch.acquisition.multi_objective.base.MultiObjectiveMCAcquisitionFunction(model, sampler=None, objective=None, constraints=None, eta=0.001, X_pending=None)[source]
Bases: AcquisitionFunction, MCSamplerMixin, ABC
Abstract base class for Multi-Objective batch acquisition functions.
NOTE: This does not inherit from MCAcquisitionFunction to avoid circular imports.
- Parameters:
_default_sample_shape – The sample_shape for the default sampler.
model (Model)
sampler (MCSampler | None)
objective (MCMultiOutputObjective | None)
constraints (list[Callable[[Tensor], Tensor]] | None)
eta (Tensor | float)
X_pending (Tensor | None)
Constructor for the MultiObjectiveMCAcquisitionFunction base class.
- Parameters:
model (Model) – A fitted model.
sampler (MCSampler | None) – The sampler used to draw base samples. If not given, a sampler is generated using get_sampler. NOTE: For posteriors that do not support base samples, a sampler compatible with the intended use case must be provided. See ForkedRNGSampler and StochasticSampler as examples.
objective (MCMultiOutputObjective | None) – The MCMultiOutputObjective under which the samples are evaluated. Defaults to IdentityMCMultiOutputObjective().
constraints (list[Callable[[Tensor], Tensor]] | None) – A list of callables, each mapping a Tensor of dimension sample_shape x batch-shape x q x m to a Tensor of dimension sample_shape x batch-shape x q, where negative values imply feasibility.
eta (Tensor | float) – The temperature parameter for the sigmoid function used for the differentiable approximation of the constraints. In case of a float, the same eta is used for every constraint in constraints. In case of a tensor, the length of the tensor must match the number of provided constraints. The i-th constraint is then estimated with the i-th eta value.
X_pending (Tensor | None) – A m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated.
- abstractmethod forward(X)[source]
Takes in a batch_shape x q x d X Tensor of t-batches with q d-dim design points each, and returns a Tensor with shape batch_shape', where batch_shape' is the broadcasted batch shape of model and input X. Should utilize the result of set_X_pending as needed to account for pending function evaluations.
- Parameters:
X (Tensor)
- Return type:
Tensor
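The constraints and eta parameters describe a sigmoid-based smooth feasibility indicator. A minimal scalar sketch follows, assuming the indicator for a constraint value c is sigmoid(-c / eta) and that several constraints combine by multiplication. Both are assumptions for illustration, and soft_feasibility is a hypothetical helper name.

```python
import math


def sigmoid(t: float) -> float:
    # Numerically stable logistic function (avoids overflow in exp).
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def soft_feasibility(constraint_values, eta=1e-3):
    """Smooth feasibility weight in [0, 1]; negative constraint values are feasible.

    eta may be a single number (shared by all constraints) or a sequence with
    one temperature per constraint.
    """
    if isinstance(eta, (int, float)):
        etas = [float(eta)] * len(constraint_values)
    else:
        etas = list(eta)
    if len(etas) != len(constraint_values):
        raise ValueError("need one eta per constraint")
    weight = 1.0
    for c, e in zip(constraint_values, etas):
        weight *= sigmoid(-c / e)
    return weight


feasible = soft_feasibility([-1.0, -1.0])   # close to 1
boundary = soft_feasibility([0.0])          # exactly on the boundary
infeasible = soft_feasibility([1.0])        # close to 0
```

A smaller eta makes the transition at the constraint boundary sharper, at the cost of steeper gradients near it.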
Acquisition Functions
Analytic Acquisition Functions
Analytic Acquisition Functions that evaluate the posterior without performing Monte-Carlo sampling.
- class botorch.acquisition.analytic.LogProbabilityOfImprovement(model, best_f, posterior_transform=None, maximize=True)[source]
Bases: AnalyticAcquisitionFunction
Single-outcome Log Probability of Improvement.
Logarithm of the probability of improvement over the current best observed value, computed using the analytic formula under a Normal posterior distribution. Only supports the case of q=1. Requires the posterior to be Gaussian. The model must be single-outcome.
The logarithm of the probability of improvement is numerically better behaved than the original function, which can lead to significantly improved optimization of the acquisition function. This is analogous to the common practice of optimizing the log likelihood of a probabilistic model - rather than the likelihood - for the sake of maximum likelihood estimation.
logPI(x) = log(P(y >= best_f)), y ~ f(x)
Example
>>> model = SingleTaskGP(train_X, train_Y)
>>> LogPI = LogProbabilityOfImprovement(model, best_f=0.2)
>>> log_pi = LogPI(test_X)
Single-outcome Log Probability of Improvement.
- Parameters:
model (Model) – A fitted single-outcome model.
best_f (float | Tensor) – Either a scalar or a b-dim Tensor (batch mode) representing the best function value observed so far (assumed noiseless).
posterior_transform (PosteriorTransform | None) – A PosteriorTransform. If using a multi-output model, a PosteriorTransform that transforms the multi-output posterior into a single-output posterior is required.
maximize (bool) – If True, consider the problem a maximization problem.
- forward(X, *args, **kwargs)
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Any) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.
acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A (b)-dim Tensor of acquisition function values at the given design points X.
- Return type:
Any
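For a Gaussian posterior with mean mu and standard deviation sigma at x, logPI reduces to log Phi((mu - best_f) / sigma) when maximizing, where Phi is the standard normal CDF. The standard-library sketch below illustrates why working in log space helps. The switch point at z = -10 and the asymptotic tail expansion are choices made for this example and are not taken from the BoTorch implementation.

```python
import math


def log_probability_of_improvement(mu, sigma, best_f, maximize=True):
    """Scalar log PI under a Normal posterior N(mu, sigma^2)."""
    z = (mu - best_f) / sigma
    if not maximize:
        z = -z
    if z > -10.0:
        # Phi(z) via the complementary error function, accurate in the lower tail.
        return math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))
    # Deep in the tail Phi(z) eventually underflows to 0.0, so use the asymptotic
    # expansion Phi(z) ~ phi(z) / (-z) * (1 - 1/z^2 + 3/z^4) in log space instead.
    return (
        -0.5 * z * z
        - math.log(-z)
        - 0.5 * math.log(2.0 * math.pi)
        + math.log1p(-1.0 / z**2 + 3.0 / z**4)
    )


at_best = log_probability_of_improvement(mu=0.2, sigma=1.0, best_f=0.2)
far_below = log_probability_of_improvement(mu=-50.0, sigma=1.0, best_f=0.0)
plain_probability = 0.5 * math.erfc(50.0 / math.sqrt(2.0))  # underflows
```

Here plain_probability is zero in double precision, so a gradient-based optimizer would see a flat surface, while far_below remains a finite, informative value.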
- class botorch.acquisition.analytic.ProbabilityOfImprovement(model, best_f, posterior_transform=None, maximize=True)[source]
Bases: AnalyticAcquisitionFunction
Single-outcome Probability of Improvement.
Probability of improvement over the current best observed value, computed using the analytic formula under a Normal posterior distribution. Only supports the case of q=1. Requires the posterior to be Gaussian. The model must be single-outcome.
PI(x) = P(y >= best_f), y ~ f(x)
Example
>>> model = SingleTaskGP(train_X, train_Y)
>>> PI = ProbabilityOfImprovement(model, best_f=0.2)
>>> pi = PI(test_X)
Single-outcome Probability of Improvement.
- Parameters:
model (Model) – A fitted single-outcome model.
best_f (float | Tensor) – Either a scalar or a b-dim Tensor (batch mode) representing the best function value observed so far (assumed noiseless).
posterior_transform (PosteriorTransform | None) – A PosteriorTransform. If using a multi-output model, a PosteriorTransform that transforms the multi-output posterior into a single-output posterior is required.
maximize (bool) – If True, consider the problem a maximization problem.
- forward(X, *args, **kwargs)
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Any) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.
acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A (b)-dim Tensor of acquisition function values at the given design points X.
- Return type:
Any
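Under a Normal posterior N(mu, sigma^2) at x, the PI formula has the closed form Phi((mu - best_f) / sigma) for maximization. A scalar sketch using only the standard library follows. The function name is illustrative, and handling maximize=False by flipping the sign of the standardized improvement is an assumption of this example.

```python
import math


def probability_of_improvement(mu, sigma, best_f, maximize=True):
    """Scalar PI(x) = P(y >= best_f) for y ~ N(mu, sigma^2)."""
    z = (mu - best_f) / sigma
    if not maximize:
        z = -z  # improvement now means falling below best_f
    # Standard normal CDF written with erfc for accuracy in the lower tail.
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


pi_at_best = probability_of_improvement(mu=0.2, sigma=1.0, best_f=0.2)
pi_one_sigma_above = probability_of_improvement(mu=1.2, sigma=1.0, best_f=0.2)
pi_minimize = probability_of_improvement(mu=1.2, sigma=1.0, best_f=0.2, maximize=False)
```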
- class botorch.acquisition.analytic.qAnalyticProbabilityOfImprovement(model, best_f, posterior_transform=None, maximize=True)[source]
Bases: AnalyticAcquisitionFunction
Approximate, single-outcome batch Probability of Improvement using MVNXPB.
This implementation uses MVNXPB, a bivariate conditioning algorithm for approximating P(a <= Y <= b) for multivariate normal Y. See [Trinh2015bivariate]. This (analytic) approximate q-PI is given by
approx-qPI(X) = P(max Y >= best_f) = 1 - P(Y < best_f), Y ~ f(X), X = (x_1,...,x_q),
where P(Y < best_f) is estimated using MVNXPB.
qPI using an analytic approximation.
- Parameters:
model (Model) – A fitted single-outcome model.
best_f (float | Tensor) – Either a scalar or a b-dim Tensor (batch mode) representing the best function value observed so far (assumed noiseless).
posterior_transform (PosteriorTransform | None) – A PosteriorTransform. If using a multi-output model, a PosteriorTransform that transforms the multi-output posterior into a single-output posterior is required.
maximize (bool) – If True, consider the problem a maximization problem.
- forward(X, *args, **kwargs)
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Any) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.
acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A (b)-dim Tensor of acquisition function values at the given design points X.
- Return type:
Any
- class botorch.acquisition.analytic.ExpectedImprovement(model, best_f, posterior_transform=None, maximize=True)[source]
Bases: AnalyticAcquisitionFunction
Single-outcome Expected Improvement (analytic).
Computes classic Expected Improvement over the current best observed value, using the analytic formula for a Normal posterior distribution. Unlike the MC-based acquisition functions, this relies on the posterior at a single test point being Gaussian (and requires the posterior to implement mean and variance properties). Only supports the case of q=1. The model must be single-outcome.
EI(x) = E(max(f(x) - best_f, 0)),
where the expectation is taken over the value of stochastic function f at x.
Example
>>> model = SingleTaskGP(train_X, train_Y)
>>> EI = ExpectedImprovement(model, best_f=0.2)
>>> ei = EI(test_X)
NOTE: It is strongly recommended to use LogExpectedImprovement instead of regular EI, as it can lead to substantially improved BO performance through improved numerics. See https://arxiv.org/abs/2310.20708 for details.
Single-outcome Expected Improvement (analytic).
- Parameters:
model (Model) – A fitted single-outcome model.
best_f (float | Tensor) – Either a scalar or a b-dim Tensor (batch mode) representing the best function value observed so far (assumed noiseless).
posterior_transform (PosteriorTransform | None) – A PosteriorTransform. If using a multi-output model, a PosteriorTransform that transforms the multi-output posterior into a single-output posterior is required.
maximize (bool) – If True, consider the problem a maximization problem.
- forward(X, *args, **kwargs)
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Any) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.
acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A (b)-dim Tensor of acquisition function values at the given design points X.
- Return type:
Any
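For y ~ N(mu, sigma^2), the expectation in the EI formula has the well-known closed form sigma * (phi(z) + z * Phi(z)) with z = (mu - best_f) / sigma, where phi and Phi are the standard normal density and CDF. The scalar sketch below uses only the standard library. The function name and the sign flip for maximize=False are assumptions of this example.

```python
import math


def expected_improvement(mu, sigma, best_f, maximize=True):
    """Scalar EI(x) = E[max(y - best_f, 0)] for y ~ N(mu, sigma^2)."""
    z = (mu - best_f) / sigma
    if not maximize:
        z = -z
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # phi(z)
    cdf = 0.5 * math.erfc(-z / math.sqrt(2.0))               # Phi(z)
    return sigma * (pdf + z * cdf)


ei_at_best = expected_improvement(mu=0.2, sigma=1.0, best_f=0.2)
ei_more_uncertain = expected_improvement(mu=0.2, sigma=2.0, best_f=0.2)
ei_better_mean = expected_improvement(mu=1.2, sigma=1.0, best_f=0.2)
```

With the mean equal to best_f, EI is proportional to sigma, which is the exploration effect: among points with equal predicted value, the more uncertain one scores higher.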
- class botorch.acquisition.analytic.LogExpectedImprovement(model, best_f, posterior_transform=None, maximize=True)[source]
Bases: AnalyticAcquisitionFunction
Single-outcome Log Expected Improvement (analytic).
Computes the logarithm of the classic Expected Improvement acquisition function, in a numerically robust manner. In particular, the implementation takes special care to avoid numerical issues in the computation of the acquisition value and its gradient in regions where improvement is predicted to be virtually impossible.
See [Ament2023logei] for details. Formally,
LogEI(x) = log(E(max(f(x) - best_f, 0))),
where the expectation is taken over the value of stochastic function f at x.
Example
>>> model = SingleTaskGP(train_X, train_Y)
>>> LogEI = LogExpectedImprovement(model, best_f=0.2)
>>> ei = LogEI(test_X)
Logarithm of single-outcome Expected Improvement (analytic).
- Parameters:
model (Model) – A fitted single-outcome model.
best_f (float | Tensor) – Either a scalar or a b-dim Tensor (batch mode) representing the best function value observed so far (assumed noiseless).
posterior_transform (PosteriorTransform | None) – A PosteriorTransform. If using a multi-output model, a PosteriorTransform that transforms the multi-output posterior into a single-output posterior is required.
maximize (bool) – If True, consider the problem a maximization problem.
- forward(X, *args, **kwargs)
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Any) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.
acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A (b)-dim Tensor of acquisition function values at the given design points X.
- Return type:
Any
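The numerical issue that LogEI addresses can be reproduced with scalars. Far below best_f, the closed-form EI underflows to exactly zero in double precision, so its logarithm is undefined. The sketch below shows that failure, then a leading-order tail approximation, log EI ≈ log(sigma) - z^2/2 - log(2*pi)/2 - 2*log(-z) for z far below zero. The approximation is a textbook asymptotic chosen for illustration. It is not the algorithm of [Ament2023logei] or of BoTorch, and both function names are hypothetical.

```python
import math


def naive_expected_improvement(mu, sigma, best_f):
    """Closed-form EI for y ~ N(mu, sigma^2), maximization."""
    z = (mu - best_f) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * math.erfc(-z / math.sqrt(2.0))
    return sigma * (pdf + z * cdf)


def tail_log_expected_improvement(mu, sigma, best_f):
    """Leading-order log EI, valid only when z = (mu - best_f) / sigma is very negative."""
    z = (mu - best_f) / sigma
    return (
        math.log(sigma)
        - 0.5 * z * z
        - 0.5 * math.log(2.0 * math.pi)
        - 2.0 * math.log(-z)
    )


far_ei = naive_expected_improvement(mu=-50.0, sigma=1.0, best_f=0.0)
try:
    naive_log_ei = math.log(far_ei)
except ValueError:
    naive_log_ei = None  # log of an underflowed 0.0 is undefined

tail_log_ei = tail_log_expected_improvement(mu=-50.0, sigma=1.0, best_f=0.0)
```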
- class botorch.acquisition.analytic.ConstrainedAnalyticAcquisitionFunctionMixin(constraints)[source]
Bases: ABC
Base class for constrained analytic acquisition functions.
Constrained analytic acquisition function mixin.
- Parameters:
constraints (dict[int, tuple[float | None, float | None]]) – A dictionary of the form {i: [lower, upper]}, where i is the output index, and lower and upper are lower and upper bounds on that output (resp. interpreted as -Inf / Inf if None).
- abstractmethod register_buffer(name, tensor, persistent=True)[source]
Register a buffer on the module.
This is an abstract placeholder whose signature matches torch.nn.Module.register_buffer. It exists because this mixin calls self.register_buffer in _preprocess_constraint_bounds but does not itself inherit from nn.Module. All concrete subclasses obtain the real implementation from nn.Module via AnalyticAcquisitionFunction; this stub simply makes the interface dependency explicit and keeps Pyre happy.
- Parameters:
name (str)
tensor (Tensor | None)
persistent (bool)
- Return type:
None
- class botorch.acquisition.analytic.LogConstrainedExpectedImprovement(model, best_f, objective_index, constraints, maximize=True)[source]
Bases: AnalyticAcquisitionFunction, ConstrainedAnalyticAcquisitionFunctionMixin
Log Constrained Expected Improvement (feasibility-weighted).
Computes the logarithm of the analytic expected improvement for a Normal posterior distribution weighted by a probability of feasibility. The objective and constraints are assumed to be independent and have Gaussian posterior distributions. Only supports non-batch mode (i.e. q=1). The model should be multi-outcome, with the index of the objective and constraints passed to the constructor.
See [Ament2023logei] for details. Formally,
LogConstrainedEI(x) = log(EI(x)) + Sum_i log(P(y_i \in [lower_i, upper_i])),
where y_i ~ constraint_i(x) and lower_i, upper_i are the lower and upper bounds for the i-th constraint, respectively.
Example
# example where the 0th output has a non-negativity constraint and
# the 1st output is the objective
>>> model = SingleTaskGP(train_X, train_Y)
>>> constraints = {0: (0.0, None)}
>>> LogCEI = LogConstrainedExpectedImprovement(model, 0.2, 1, constraints)
>>> cei = LogCEI(test_X)
Analytic Log Constrained Expected Improvement.
- Parameters:
model (Model) – A fitted single- or multi-output model.
best_f (float | Tensor) – Either a scalar or a b-dim Tensor (batch mode) representing the best feasible function value observed so far (assumed noiseless).
objective_index (int) – The index of the objective.
constraints (dict[int, tuple[float | None, float | None]]) – A dictionary of the form {i: [lower, upper]}, where i is the output index, and lower and upper are lower and upper bounds on that output (resp. interpreted as -Inf / Inf if None).
maximize (bool) – If True, consider the problem a maximization problem.
- forward(X, *args, **kwargs)
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Any) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.
acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A (b)-dim Tensor of acquisition function values at the given design points X.
- Return type:
Any
- class botorch.acquisition.analytic.LogProbabilityOfFeasibility(model, constraints)[source]
Bases: AnalyticAcquisitionFunction, ConstrainedAnalyticAcquisitionFunctionMixin
Log Probability of Feasibility.
Computes the logarithm of the analytic probability of feasibility for a Normal posterior distribution. The constraints are assumed to be independent and have Gaussian posterior distributions. Only supports non-batch mode (i.e. q=1). The model should be multi-outcome, with the index of the constraints passed to the constructor.
See [Ament2023logei] for details. Formally,
LogPF(x) = Sum_i log(P(y_i \in [lower_i, upper_i])),
where y_i ~ constraint_i(x) and lower_i, upper_i are the lower and upper bounds for the i-th constraint, respectively.
Example
# example where the 0th output has a non-negativity constraint
>>> model = SingleTaskGP(train_X, train_Y)
>>> constraints = {0: (0.0, None)}
>>> LogPOF = LogProbabilityOfFeasibility(model, constraints)
>>> log_pof = LogPOF(test_X)
Analytic Log Probability of Feasibility.
- Parameters:
model (Model) – A fitted single- or multi-output model.
constraints (dict[int, tuple[float | None, float | None]]) – A dictionary of the form {i: [lower, upper]}, where i is the output index, and lower and upper are lower and upper bounds on that output (resp. interpreted as -Inf / Inf if None).
- forward(X, *args, **kwargs)
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Any) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.
acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A (b)-dim Tensor of acquisition function values at the given design points X.
- Return type:
Any
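With independent Gaussian posteriors, each term of LogPF is the log of a difference of two normal CDF values. The scalar sketch below takes the per-output posterior means and standard deviations at one point x as plain lists. The function names and that calling convention are assumptions for illustration, and the plain log used here does not include the numerical safeguards a production implementation would need.

```python
import math


def normal_cdf(z: float) -> float:
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def log_probability_of_feasibility(means, sigmas, constraints):
    """Sum over constraints of log P(lower_i <= y_i <= upper_i), y_i ~ N(mean_i, sigma_i^2).

    constraints maps an output index to (lower, upper); None means unbounded.
    """
    total = 0.0
    for i, (lower, upper) in constraints.items():
        upper_cdf = 1.0 if upper is None else normal_cdf((upper - means[i]) / sigmas[i])
        lower_cdf = 0.0 if lower is None else normal_cdf((lower - means[i]) / sigmas[i])
        total += math.log(upper_cdf - lower_cdf)
    return total


means, sigmas = [0.0, 5.0], [1.0, 1.0]
# Output 0 must be non-negative; its posterior is centered on the bound.
one_constraint = log_probability_of_feasibility(means, sigmas, {0: (0.0, None)})
# Additionally require output 1 to stay below 5.0, also centered on its bound.
two_constraints = log_probability_of_feasibility(
    means, sigmas, {0: (0.0, None), 1: (None, 5.0)}
)
```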
- class botorch.acquisition.analytic.ConstrainedExpectedImprovement(model, best_f, objective_index, constraints, maximize=True)[source]
Bases: AnalyticAcquisitionFunction, ConstrainedAnalyticAcquisitionFunctionMixin
Constrained Expected Improvement (feasibility-weighted).
Computes the analytic expected improvement for a Normal posterior distribution, weighted by a probability of feasibility. The objective and constraints are assumed to be independent and have Gaussian posterior distributions. Only supports non-batch mode (i.e. q=1). The model should be multi-outcome, with the index of the objective and constraints passed to the constructor.
Constrained_EI(x) = EI(x) * Product_i P(y_i \in [lower_i, upper_i]),
where y_i ~ constraint_i(x) and lower_i, upper_i are the lower and upper bounds for the i-th constraint, respectively.
Example
# example where the 0th output has a non-negativity constraint and
# 1st output is the objective
>>> model = SingleTaskGP(train_X, train_Y)
>>> constraints = {0: (0.0, None)}
>>> cEI = ConstrainedExpectedImprovement(model, 0.2, 1, constraints)
>>> cei = cEI(test_X)
NOTE: It is strongly recommended to use LogConstrainedExpectedImprovement instead of regular CEI, as it can lead to substantially improved BO performance through improved numerics. See https://arxiv.org/abs/2310.20708 for details.
Analytic Constrained Expected Improvement.
- Parameters:
model (Model) – A fitted single- or multi-output model.
best_f (float | Tensor) – Either a scalar or a b-dim Tensor (batch mode) representing the best feasible function value observed so far (assumed noiseless).
objective_index (int) – The index of the objective.
constraints (dict[int, tuple[float | None, float | None]]) – A dictionary of the form {i: [lower, upper]}, where i is the output index, and lower and upper are lower and upper bounds on that output (resp. interpreted as -Inf / Inf if None).
maximize (bool) – If True, consider the problem a maximization problem.
- forward(X, *args, **kwargs)
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Any) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.
acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A (b)-dim Tensor of acquisition function values at the given design points X.
- Return type:
Any
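The feasibility weighting in Constrained_EI can be written out for a single point with scalar posterior summaries. The sketch below mirrors the docstring example (output 0 constrained to be non-negative, output 1 as the objective). The function names and the list-based inputs are assumptions for illustration, not the library's interface.

```python
import math


def normal_cdf(z: float) -> float:
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def expected_improvement(mu, sigma, best_f):
    z = (mu - best_f) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return sigma * (pdf + z * normal_cdf(z))


def constrained_expected_improvement(means, sigmas, best_f, objective_index, constraints):
    """EI of the objective output times the product of per-constraint feasibility probabilities."""
    value = expected_improvement(means[objective_index], sigmas[objective_index], best_f)
    for i, (lower, upper) in constraints.items():
        upper_cdf = 1.0 if upper is None else normal_cdf((upper - means[i]) / sigmas[i])
        lower_cdf = 0.0 if lower is None else normal_cdf((lower - means[i]) / sigmas[i])
        value *= upper_cdf - lower_cdf
    return value


means, sigmas = [0.0, 0.2], [1.0, 1.0]
unconstrained = expected_improvement(means[1], sigmas[1], best_f=0.2)
cei = constrained_expected_improvement(
    means, sigmas, best_f=0.2, objective_index=1, constraints={0: (0.0, None)}
)
```

Because the constraint output is centered on its bound, the point is feasible with probability one half, and the constrained value is half the unconstrained EI.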
- class botorch.acquisition.analytic.LogNoisyExpectedImprovement(model, X_observed, num_fantasies=20, maximize=True, posterior_transform=None)[source]
Bases: AnalyticAcquisitionFunction
Single-outcome Log Noisy Expected Improvement (via fantasies).
This computes Log Noisy Expected Improvement by averaging over the Expected Improvement values of a number of fantasy models. Only supports the case q=1. Assumes that the posterior distribution of the model is Gaussian. The model must be single-outcome.
See [Ament2023logei] for details. Formally,
LogNEI(x) = log(E(max(y - max Y_base, 0))), (y, Y_base) ~ f((x, X_base)),
where X_base are previously observed points.
Note: This acquisition function currently relies on using a SingleTaskGP with known observation noise. In other words, train_Yvar must be passed to the model (required for noiseless fantasies).
Example
>>> model = SingleTaskGP(train_X, train_Y, train_Yvar=train_Yvar)
>>> LogNEI = LogNoisyExpectedImprovement(model, train_X)
>>> nei = LogNEI(test_X)
Single-outcome Noisy Log Expected Improvement (via fantasies).
- Parameters:
model (GPyTorchModel) – A fitted single-outcome model. Only SingleTaskGP models with known observation noise are currently supported.
X_observed (Tensor) – A n x d Tensor of observed points that are likely to be the best observed points so far.
num_fantasies (int) – The number of fantasies to generate. The higher this number the more accurate the model (at the expense of model complexity and performance).
maximize (bool) – If True, consider the problem a maximization problem.
posterior_transform (PosteriorTransform | None)
- forward(X, *args, **kwargs)
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Any) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.
acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A (b)-dim Tensor of acquisition function values at the given design points X.
- Return type:
Any
- class botorch.acquisition.analytic.NoisyExpectedImprovement(model, X_observed, num_fantasies=20, maximize=True)[source]
Bases: ExpectedImprovement
Single-outcome Noisy Expected Improvement (via fantasies).
This computes Noisy Expected Improvement by averaging over the Expected Improvement values of a number of fantasy models. Only supports the case q=1. Assumes that the posterior distribution of the model is Gaussian. The model must be single-outcome.
NEI(x) = E(max(y - max Y_baseline, 0)), (y, Y_baseline) ~ f((x, X_baseline)),
where X_baseline are previously observed points.
Note: This acquisition function currently relies on using a SingleTaskGP with known observation noise. In other words, train_Yvar must be passed to the model (required for noiseless fantasies).
Example
>>> model = SingleTaskGP(train_X, train_Y, train_Yvar=train_Yvar)
>>> NEI = NoisyExpectedImprovement(model, train_X)
>>> nei = NEI(test_X)
NOTE: It is strongly recommended to use LogNoisyExpectedImprovement instead of regular NEI, as it can lead to substantially improved BO performance through improved numerics. See https://arxiv.org/abs/2310.20708 for details.
Single-outcome Noisy Expected Improvement (via fantasies).
- Parameters:
model (GPyTorchModel) – A fitted single-outcome model. Only SingleTaskGP models with known observation noise are currently supported.
X_observed (Tensor) – An n x d Tensor of observed points that are likely to be the best observed points so far.
num_fantasies (int) – The number of fantasies to generate. The higher this number, the more accurate the model (at the expense of model complexity and performance).
maximize (bool) – If True, consider the problem a maximization problem.
- forward(X, *args, **kwargs)
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Any) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.
acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A (b)-dim Tensor of acquisition function values at the given design points X.
- Return type:
Any
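The averaging over fantasy models described for NoisyExpectedImprovement can be sketched in plain Python, independent of BoTorch. The helper names (normal_pdf, normal_cdf, expected_improvement, noisy_ei) and the per-fantasy (mu, sigma, best_f) tuples are hypothetical; the closed-form EI for a Gaussian posterior is the standard textbook expression, not code taken from the library.

```python
import math


def normal_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu: float, sigma: float, best_f: float) -> float:
    """Closed-form EI for a Gaussian posterior N(mu, sigma^2) (maximization)."""
    z = (mu - best_f) / sigma
    return sigma * (z * normal_cdf(z) + normal_pdf(z))


def noisy_ei(fantasies):
    """Average EI over fantasy models, each given as (mu, sigma, best_f)."""
    values = [expected_improvement(mu, s, b) for mu, s, b in fantasies]
    return sum(values) / len(values)


# Three hypothetical fantasy models that disagree about the best baseline value.
fantasies = [(0.5, 1.0, 0.5), (0.5, 1.0, 0.7), (0.5, 1.0, 0.3)]
nei_value = noisy_ei(fantasies)
```

Each fantasy contributes its own incumbent value, which is how uncertainty about the noisy baseline enters the average.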
- class botorch.acquisition.analytic.UpperConfidenceBound(model, beta, posterior_transform=None, maximize=True)[source]
Bases: AnalyticAcquisitionFunction
Single-outcome Upper Confidence Bound (UCB).
Analytic upper confidence bound that comprises the posterior mean plus an additional term: the posterior standard deviation weighted by a trade-off parameter, beta. Only supports the case of q=1 (i.e. greedy, non-batch selection of design points). The model must be single-outcome.
UCB(x) = mu(x) + sqrt(beta) * sigma(x), where mu and sigma are the posterior mean and standard deviation, respectively.
Example
>>> model = SingleTaskGP(train_X, train_Y)
>>> UCB = UpperConfidenceBound(model, beta=0.2)
>>> ucb = UCB(test_X)
Single-outcome Upper Confidence Bound.
- Parameters:
model (Model) – A fitted single-outcome GP model (must be in batch mode if candidate sets X will be)
beta (float | Tensor) – Either a scalar or a one-dim tensor with b elements (batch mode) representing the trade-off parameter between mean and covariance.
posterior_transform (PosteriorTransform | None) – A PosteriorTransform. If using a multi-output model, a PosteriorTransform that transforms the multi-output posterior into a single-output posterior is required.
maximize (bool) – If True, consider the problem a maximization problem.
- forward(X, *args, **kwargs)
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Any) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.
acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A (b)-dim Tensor of acquisition function values at the given design points X.
- Return type:
Any
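As a rough illustration of how beta shifts the trade-off in UCB(x) = mu(x) + sqrt(beta) * sigma(x), the following plain-Python sketch scores two hypothetical candidates. The function name and the numbers are made up for the example; it does not call BoTorch.

```python
import math


def upper_confidence_bound(mu: float, sigma: float, beta: float) -> float:
    """UCB for a single design point (q=1): mean plus sqrt(beta) times std."""
    return mu + math.sqrt(beta) * sigma


# Candidate A: high mean, low uncertainty. Candidate B: lower mean, high uncertainty.
a_mu, a_sigma = 1.0, 0.1
b_mu, b_sigma = 0.8, 0.5

# Small beta favors exploitation (A wins); large beta favors exploration (B wins).
small_beta = (upper_confidence_bound(a_mu, a_sigma, 0.04),
              upper_confidence_bound(b_mu, b_sigma, 0.04))
large_beta = (upper_confidence_bound(a_mu, a_sigma, 4.0),
              upper_confidence_bound(b_mu, b_sigma, 4.0))
```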
- class botorch.acquisition.analytic.PosteriorMean(model, posterior_transform=None, maximize=True)[source]
Bases: AnalyticAcquisitionFunction
Single-outcome Posterior Mean.
Only supports the case of q=1. Requires the model’s posterior to have a mean property. The model must be single-outcome.
Example
>>> model = SingleTaskGP(train_X, train_Y)
>>> PM = PosteriorMean(model)
>>> pm = PM(test_X)
Single-outcome Posterior Mean.
- Parameters:
model (Model) – A fitted single-outcome GP model (must be in batch mode if candidate sets X will be)
posterior_transform (PosteriorTransform | None) – A PosteriorTransform. If using a multi-output model, a PosteriorTransform that transforms the multi-output posterior into a single-output posterior is required.
maximize (bool) – If True, consider the problem a maximization problem. Note that if maximize=False, the posterior mean is negated. As a consequence, optimize_acqf(PosteriorMean(gp, maximize=False)) actually returns -1 * minimum of the posterior mean.
- forward(X, *args, **kwargs)
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Any) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.
acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A (b)-dim Tensor of acquisition function values at the given design points X.
- Return type:
Any
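The sign convention for maximize=False can be illustrated with a small plain-Python sketch. The function posterior_mean_acq and the dictionary of posterior means are hypothetical stand-ins for the model and for optimize_acqf; only the negation behavior described above is being shown.

```python
def posterior_mean_acq(mu: float, maximize: bool = True) -> float:
    """Acquisition value: the posterior mean, negated when minimizing."""
    return mu if maximize else -mu


# Hypothetical posterior means at three candidate points.
means = {"x1": 0.7, "x2": -0.4, "x3": 0.1}

# Maximizing the negated mean selects the point with the smallest mean ...
best_x = max(means, key=lambda x: posterior_mean_acq(means[x], maximize=False))
# ... but the reported acquisition value is -1 * (minimum posterior mean).
best_value = posterior_mean_acq(means[best_x], maximize=False)
```

To recover the minimum of the posterior mean itself, negate the returned acquisition value.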
- class botorch.acquisition.analytic.ScalarizedPosteriorMean(model, weights, posterior_transform=None)[source]
Bases: AnalyticAcquisitionFunction
Scalarized Posterior Mean.
This acquisition function returns a scalarized (across the q-batch) posterior mean given a vector of weights.
Scalarized Posterior Mean.
- Parameters:
model (Model) – A fitted single-outcome model.
weights (Tensor) – A tensor of shape q for scalarization. In order to minimize the scalarized posterior mean, pass -weights.
posterior_transform (PosteriorTransform | None) – A PosteriorTransform. If using a multi-output model, a PosteriorTransform that transforms the multi-output posterior into a single-output posterior is required.
- forward(X, *args, **kwargs)
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Any) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.
acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A (b)-dim Tensor of acquisition function values at the given design points X.
- Return type:
Any
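A minimal sketch of scalarizing across the q-batch, assuming the scalarization is a weighted sum of the q posterior means. The function name and inputs are hypothetical; this is plain Python, not the BoTorch implementation.

```python
def scalarized_posterior_mean(means, weights):
    """Weighted sum of the posterior means at the q points of one batch."""
    if len(means) != len(weights):
        raise ValueError("weights must have one entry per q-batch point")
    return sum(w * m for w, m in zip(weights, means))


means = [1.0, 2.0, 3.0]        # posterior means at q = 3 points
weights = [0.5, 0.25, 0.25]    # one weight per point

value = scalarized_posterior_mean(means, weights)
# Passing -weights turns maximization into minimization of the scalarized mean.
negated = scalarized_posterior_mean(means, [-w for w in weights])
```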
- class botorch.acquisition.analytic.PosteriorStandardDeviation(model, posterior_transform=None, maximize=True)[source]
Bases: AnalyticAcquisitionFunction
Single-outcome Posterior Standard Deviation.
An acquisition function for pure exploration. Only supports the case of q=1. Requires the model’s posterior to have mean and variance properties. The model must be either single-outcome or combined with a posterior_transform to produce a single-output posterior.
Example
>>> import torch
>>> from botorch.acquisition.analytic import PosteriorStandardDeviation
>>> from botorch.models.gp_regression import SingleTaskGP
>>> from botorch.models.transforms.input import Normalize
>>> from botorch.models.transforms.outcome import Standardize
>>>
>>> # Set up a model
>>> train_X = torch.rand(20, 2, dtype=torch.float64)
>>> train_Y = torch.sin(train_X).sum(dim=1, keepdim=True)
>>> model = SingleTaskGP(
...     train_X, train_Y, outcome_transform=Standardize(m=1),
...     input_transform=Normalize(d=2),
... )
>>> # Now set up the acquisition function
>>> PSTD = PosteriorStandardDeviation(model)
>>> test_X = torch.zeros((1, 2), dtype=torch.float64)
>>> std = PSTD(test_X)
>>> std.item()
0.16341639895667773
Single-outcome Posterior Standard Deviation.
- Parameters:
model (Model) – A fitted single-outcome GP model (must be in batch mode if candidate sets X will be)
posterior_transform (PosteriorTransform | None) – A PosteriorTransform. If using a multi-output model, a PosteriorTransform that transforms the multi-output posterior into a single-output posterior is required.
maximize (bool) – If True, consider the problem a maximization problem. Note that if maximize=False, the posterior standard deviation is negated. As a consequence, optimize_acqf(PosteriorStandardDeviation(gp, maximize=False)) actually returns -1 * minimum of the posterior standard deviation.
- forward(X, *args, **kwargs)
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Any) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.
acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A (b)-dim Tensor of acquisition function values at the given design points X.
- Return type:
Any
Monte-Carlo Acquisition Functions
Batch acquisition functions using the reparameterization trick in combination with (quasi) Monte-Carlo sampling. See [Rezende2014reparam], [Wilson2017reparam] and [Balandat2020botorch].
References
[Rezende2014reparam] D. J. Rezende, S. Mohamed, and D. Wierstra. Stochastic backpropagation and approximate inference in deep generative models. ICML 2014.
[Wilson2017reparam] J. T. Wilson, R. Moriconi, F. Hutter, and M. P. Deisenroth. The reparameterization trick for acquisition functions. ArXiv 2017.
- class botorch.acquisition.monte_carlo.SampleReductionProtocol(*args, **kwargs)[source]
Bases: Protocol
For static type check of SampleReducingMCAcquisitionFunction’s mc_reduction.
- class botorch.acquisition.monte_carlo.SampleReducingMCAcquisitionFunction(model, sampler=None, objective=None, posterior_transform=None, X_pending=None, sample_reduction=<built-in method mean of type object>, q_reduction=<built-in method amax of type object>, constraints=None, eta=0.001, fat=False)[source]
Bases:
MCAcquisitionFunctionMC-based batch acquisition function that reduces across samples and implements a general treatment of outcome constraints.
This class’s
forwardcomputes the - possibly constrained - acquisition value by (1) computing the unconstrained utility for each MC sample using_sample_forward, (2) weighing the utility values by the constraint indicator per MC sample, and (3) reducing (e.g. averaging) the weighted utility values over the MC dimension.NOTE: Do NOT override the
forwardmethod, unless you have thought about it well.forwardis implemented generically to incorporate constraints in a principled way, and takes care of reducing over the Monte Carlo and batch dimensions via thesample_reductionandq_reductionarguments, which default totorch.meanandtorch.max, respectively.In order to implement a custom SampleReducingMCAcquisitionFunction, we only need to implement the
_sample_forward(obj: Tensor) -> Tensormethod, which maps objective samples to acquisition utility values without reducing the Monte Carlo and batch (i.e. q) dimensions (see details in the docstring of_sample_forward).A note on design choices:
The primary purpose of
SampleReducingMCAcquisitionFunction``is to support outcome constraints. On the surface, designing a wrapper ``ConstrainedMCAcquisitionFunctioncould be an elegant solution to this end, but it would still require the acquisition functions to implement a_sample_forwardmethod to weigh acquisition utilities at the sample level. Further,qNoisyExpectedImprovementis a special case that is hard to encompass in this pattern, since it requires the computation of the best feasible objective, which requires access to the constraint functions. However, if the constraints are stored in a wrapper class, they will be inaccessible to the forward pass. These problems are circumvented by the design of this class.Constructor of SampleReducingMCAcquisitionFunction.
- Parameters:
model (Model) – A fitted model.
sampler (MCSampler | None) – The sampler used to draw base samples. If not given, a sampler is generated on the fly within the get_posterior_samples method using botorch.sampling.get_sampler. NOTE: For posteriors that do not support base samples, a sampler compatible with the intended use case must be provided. See ForkedRNGSampler and StochasticSampler as examples.
objective (MCAcquisitionObjective | None) – The MCAcquisitionObjective under which the samples are evaluated. Defaults to IdentityMCObjective(). NOTE: ConstrainedMCObjective for outcome constraints is deprecated in favor of passing the constraints directly to this constructor.
posterior_transform (PosteriorTransform | None) – A PosteriorTransform (optional).
X_pending (Tensor | None) – A batch_shape x m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated.
sample_reduction (SampleReductionProtocol) – A callable that takes in a sample_shape x batch_shape Tensor of acquisition utility values, a keyword-argument dim that specifies the sample dimensions to reduce over, and returns a batch_shape-dim Tensor of acquisition values.
q_reduction (SampleReductionProtocol) – A callable that takes in a sample_shape x batch_shape x q Tensor of acquisition utility values, a keyword-argument dim that specifies the q dimension to reduce over (i.e. -1), and returns a sample_shape x batch_shape-dim Tensor of acquisition values.
constraints (list[Callable[[Tensor], Tensor]] | None) – A list of constraint callables which map a Tensor of posterior samples of dimension sample_shape x batch-shape x q x m-dim to a sample_shape x batch-shape x q-dim Tensor. The associated constraints are considered satisfied if the output is less than zero. NOTE: Constraint-weighting is only compatible with non-negative acquisition utilities, e.g. all improvement-based acquisition functions.
eta (Tensor | float) – Temperature parameter(s) governing the smoothness of the sigmoid approximation to the constraint indicators. For more details on this parameter, see the docs of compute_smoothed_feasibility_indicator.
fat (list[bool | None] | bool) – Whether to apply a fat-tailed smooth approximation to the feasibility indicator or the canonical sigmoid approximation. For more details on this parameter, see the docs of compute_smoothed_feasibility_indicator.
- forward(X)[source]
Computes the acquisition value associated with the input X. Weighs the acquisition utility values by smoothed constraint indicators if constraints was passed to the constructor of the class. Applies self.sample_reduction and self.q_reduction to reduce over the Monte Carlo and batch (q) dimensions.
NOTE: Do NOT override the forward method for a custom acquisition function. Instead, implement the _sample_forward method. See the docstring of this class for details.
- Parameters:
X (Tensor) – A batch_shape x q x d Tensor of t-batches with q d-dim design points each.
- Returns:
A Tensor with shape batch_shape', where batch_shape' is the broadcasted batch shape of model and input X.
- Return type:
Tensor
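The three steps performed by forward (per-sample utility, constraint weighting, reduction) can be sketched for a single t-batch in plain Python. This is an illustration under stated assumptions, not the library code: the function names are hypothetical, the smoothed feasibility indicator is assumed to take the form sigmoid(-c / eta), the fat-tailed variant is not shown, and nested lists stand in for tensors.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def constrained_mc_value(utilities, constraint_values, eta=1e-3):
    """Constraint-weighted MC acquisition value for one t-batch.

    utilities[s][j]: non-negative utility of MC sample s at q-point j.
    constraint_values[s][j]: constraint output for that entry (feasible if < 0).
    """
    per_sample = []
    for u_row, c_row in zip(utilities, constraint_values):
        # Weigh each utility by a smoothed feasibility indicator (assumed sigmoid form).
        weighted = [u * sigmoid(-c / eta) for u, c in zip(u_row, c_row)]
        # Reduce over the q dimension (analogue of q_reduction=torch.amax).
        per_sample.append(max(weighted))
    # Reduce over the MC dimension (analogue of sample_reduction=torch.mean).
    return sum(per_sample) / len(per_sample)


utilities = [[1.0, 2.0], [0.5, 3.0]]
constraint_values = [[-1.0, 1.0], [-1.0, -1.0]]  # second point infeasible in sample 0
value = constrained_mc_value(utilities, constraint_values)
```

In sample 0 the larger utility is infeasible and is weighted to roughly zero, so the feasible utility 1.0 is what survives the reduction over q.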
- class botorch.acquisition.monte_carlo.qExpectedImprovement(model, best_f, sampler=None, objective=None, posterior_transform=None, X_pending=None, constraints=None, eta=0.001)[source]
Bases: SampleReducingMCAcquisitionFunction
MC-based batch Expected Improvement.
This computes qEI by (1) sampling the joint posterior over q points, (2) evaluating the improvement over the current best for each sample, (3) maximizing over q, and (4) averaging over the samples.
qEI(X) = E(max(max Y - best_f, 0)), Y ~ f(X), where X = (x_1,...,x_q)
Example
>>> model = SingleTaskGP(train_X, train_Y)
>>> best_f = train_Y.max()
>>> sampler = SobolQMCNormalSampler(1024)
>>> qEI = qExpectedImprovement(model, best_f, sampler)
>>> qei = qEI(test_X)
NOTE: It is strongly recommended to use qLogExpectedImprovement instead of regular qEI, as it can lead to substantially improved BO performance through improved numerics. See https://arxiv.org/abs/2310.20708 for details.
q-Expected Improvement.
- Parameters:
model (Model) – A fitted model.
best_f (float | Tensor) – The best objective value observed so far (assumed noiseless). Can be a scalar, or a batch_shape-dim tensor. In case of a batched model, the tensor can specify different values for each element of the batch.
sampler (MCSampler | None) – The sampler used to draw base samples. See MCAcquisitionFunction for more details.
objective (MCAcquisitionObjective | None) – The MCAcquisitionObjective under which the samples are evaluated. Defaults to IdentityMCObjective(). NOTE: ConstrainedMCObjective for outcome constraints is deprecated in favor of passing the constraints directly to this constructor.
posterior_transform (PosteriorTransform | None) – A PosteriorTransform (optional).
X_pending (Tensor | None) – An m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call. Copied and set to have no gradient.
constraints (list[Callable[[Tensor], Tensor]] | None) – A list of constraint callables which map a Tensor of posterior samples of dimension sample_shape x batch-shape x q x m-dim to a sample_shape x batch-shape x q-dim Tensor. The associated constraints are considered satisfied if the output is less than zero.
eta (Tensor | float) – Temperature parameter(s) governing the smoothness of the sigmoid approximation to the constraint indicators. For more details on this parameter, see the docs of compute_smoothed_feasibility_indicator.
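The four-step qEI estimate can be written out for a handful of posterior samples in plain Python. The function name and the sample values are hypothetical, and nested lists stand in for a sample_shape x q tensor of objective samples; this is a sketch of the estimator, not the library code.

```python
def q_expected_improvement(samples, best_f):
    """MC estimate of qEI(X) = E(max(max Y - best_f, 0)).

    samples[s][j]: joint posterior sample s evaluated at q-point j.
    """
    # Improvement of the best point in each joint sample, clamped at zero.
    improvements = [max(max(row) - best_f, 0.0) for row in samples]
    # Average over the MC samples.
    return sum(improvements) / len(improvements)


samples = [[1.0, 2.0], [0.0, 0.5], [3.0, 1.0]]  # 3 joint samples, q = 2
qei_estimate = q_expected_improvement(samples, best_f=1.5)
```

Only the first and third samples improve on best_f, contributing 0.5 and 1.5 respectively.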
- class botorch.acquisition.monte_carlo.qNoisyExpectedImprovement(model, X_baseline, sampler=None, objective=None, posterior_transform=None, X_pending=None, prune_baseline=True, cache_root=None, constraints=None, eta=0.001, marginalize_dim=None)[source]
Bases: SampleReducingMCAcquisitionFunction, CachedCholeskyMCSamplerMixin
MC-based batch Noisy Expected Improvement.
This function does not assume a best_f is known (which would require noiseless observations). Instead, it uses samples from the joint posterior over the q test points and previously observed points. The improvement over previously observed points is computed for each sample and averaged.
qNEI(X) = E(max(max Y - max Y_baseline, 0)), where (Y, Y_baseline) ~ f((X, X_baseline)), X = (x_1,...,x_q)
Example
>>> model = SingleTaskGP(train_X, train_Y)
>>> sampler = SobolQMCNormalSampler(1024)
>>> qNEI = qNoisyExpectedImprovement(model, train_X, sampler)
>>> qnei = qNEI(test_X)
NOTE: It is strongly recommended to use qLogNoisyExpectedImprovement instead of regular qNEI, as it can lead to substantially improved BO performance through improved numerics. See https://arxiv.org/abs/2310.20708 for details.
q-Noisy Expected Improvement.
- Parameters:
model (Model) – A fitted model.
X_baseline (Tensor) – A batch_shape x r x d-dim Tensor of r design points that have already been observed. These points are considered as the potential best design point.
sampler (MCSampler | None) – The sampler used to draw base samples. See MCAcquisitionFunction for more details.
objective (MCAcquisitionObjective | None) – The MCAcquisitionObjective under which the samples are evaluated. Defaults to IdentityMCObjective(). NOTE: ConstrainedMCObjective for outcome constraints is deprecated in favor of passing the constraints directly to this constructor.
posterior_transform (PosteriorTransform | None) – A PosteriorTransform (optional).
X_pending (Tensor | None) – A batch_shape x m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call. Copied and set to have no gradient.
prune_baseline (bool) – If True, remove points in X_baseline that are highly unlikely to be the best point. This can significantly improve performance and is generally recommended. In order to customize pruning parameters, instead manually call botorch.acquisition.utils.prune_inferior_points on X_baseline before instantiating the acquisition function.
cache_root (bool | None) – A boolean indicating whether to cache the root decomposition over X_baseline and use low-rank updates.
constraints (list[Callable[[Tensor], Tensor]] | None) – A list of constraint callables which map a Tensor of posterior samples of dimension sample_shape x batch-shape x q x m-dim to a sample_shape x batch-shape x q-dim Tensor. The associated constraints are considered satisfied if the output is less than zero.
eta (Tensor | float) – Temperature parameter(s) governing the smoothness of the sigmoid approximation to the constraint indicators. For more details on this parameter, see the docs of compute_smoothed_feasibility_indicator.
marginalize_dim (int | None) – The dimension to marginalize over.
TODO: similar to qNEHVI, when we are using sequential greedy candidate selection, we could incorporate pending points X_baseline and compute the incremental qNEI from the new point. This would greatly increase efficiency for large batches.
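The difference from qEI is that the incumbent is itself sampled: each joint draw covers both the q candidates and the baseline points. A plain-Python sketch, assuming each joint sample lists the q candidate values first and the baseline values after them (that ordering and the function name are assumptions made for this example):

```python
def q_noisy_expected_improvement(joint_samples, q):
    """MC estimate of qNEI(X) = E(max(max Y - max Y_baseline, 0)).

    joint_samples[s]: one joint posterior draw over (X, X_baseline);
    the first q entries belong to X, the remaining ones to X_baseline.
    """
    total = 0.0
    for row in joint_samples:
        new_best = max(row[:q])        # best candidate value in this draw
        baseline_best = max(row[q:])   # sampled incumbent in the same draw
        total += max(new_best - baseline_best, 0.0)
    return total / len(joint_samples)


joint_samples = [
    [1.0, 0.2, 0.5, 0.8],  # candidates: 1.0, 0.2; baseline: 0.5, 0.8
    [0.1, 0.3, 0.9, 0.4],  # candidates: 0.1, 0.3; baseline: 0.9, 0.4
]
qnei_estimate = q_noisy_expected_improvement(joint_samples, q=2)
```

Because the incumbent varies from draw to draw, no noiseless best_f is needed.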
- class botorch.acquisition.monte_carlo.qProbabilityOfImprovement(model, best_f, sampler=None, objective=None, posterior_transform=None, X_pending=None, tau=0.001, constraints=None, eta=0.001)[source]
Bases: SampleReducingMCAcquisitionFunction
MC-based batch Probability of Improvement.
Estimates the probability of improvement over the current best observed value by sampling from the joint posterior distribution of the q-batch. An MC-based estimate of a probability involves taking the expectation of an indicator function; to support auto-differentiation, the indicator is replaced with a sigmoid function with temperature parameter tau.
qPI(X) = P(max Y >= best_f), Y ~ f(X), X = (x_1,...,x_q)
Example
>>> model = SingleTaskGP(train_X, train_Y)
>>> best_f = train_Y.max()
>>> sampler = SobolQMCNormalSampler(1024)
>>> qPI = qProbabilityOfImprovement(model, best_f, sampler)
>>> qpi = qPI(test_X)
q-Probability of Improvement.
- Parameters:
model (Model) – A fitted model.
best_f (float | Tensor) – The best objective value observed so far (assumed noiseless). Can be a batch_shape-shaped tensor, which in case of a batched model specifies potentially different values for each element of the batch.
sampler (MCSampler | None) – The sampler used to draw base samples. See MCAcquisitionFunction for more details.
objective (MCAcquisitionObjective | None) – The MCAcquisitionObjective under which the samples are evaluated. Defaults to IdentityMCObjective(). NOTE: ConstrainedMCObjective for outcome constraints is deprecated in favor of passing the constraints directly to this constructor.
posterior_transform (PosteriorTransform | None) – A PosteriorTransform (optional).
X_pending (Tensor | None) – An m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call. Copied and set to have no gradient.
tau (float) – The temperature parameter used in the sigmoid approximation of the step function. Smaller values yield more accurate approximations of the function, but result in gradient estimates with higher variance.
constraints (list[Callable[[Tensor], Tensor]] | None) – A list of constraint callables which map posterior samples to a scalar. The associated constraint is considered satisfied if this scalar is less than zero.
eta (Tensor | float) – Temperature parameter(s) governing the smoothness of the sigmoid approximation to the constraint indicators. For more details on this parameter, see the docs of compute_smoothed_feasibility_indicator.
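The effect of tau can be seen in a small plain-Python sketch that replaces the indicator 1(max Y >= best_f) with sigmoid((max Y - best_f) / tau). The function names and sample values are hypothetical; this illustrates the approximation rather than reproducing the library code.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def q_probability_of_improvement(samples, best_f, tau=1e-3):
    """MC estimate of qPI with a sigmoid in place of the hard indicator.

    samples[s][j]: joint posterior sample s evaluated at q-point j.
    """
    soft = [sigmoid((max(row) - best_f) / tau) for row in samples]
    return sum(soft) / len(soft)


samples = [[1.0, 2.0], [0.0, 0.5], [3.0, 1.0], [1.0, 1.2]]
sharp = q_probability_of_improvement(samples, best_f=1.5, tau=1e-3)
smooth = q_probability_of_improvement(samples, best_f=1.5, tau=1.0)
```

With a small tau the estimate is close to the fraction of samples that improve on best_f (here 2 of 4); a larger tau smooths the step, trading accuracy for better-behaved gradients.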
- class botorch.acquisition.monte_carlo.qSimpleRegret(model, sampler=None, objective=None, posterior_transform=None, X_pending=None)[source]
Bases: SampleReducingMCAcquisitionFunction
MC-based batch Simple Regret.
Samples from the joint posterior over the q-batch and computes the simple regret.
qSR(X) = E(max Y), Y ~ f(X), X = (x_1,...,x_q)
Constraints should be provided as a ConstrainedMCObjective. Passing constraints as an argument is not supported. This is because SampleReducingMCAcquisitionFunction computes the acquisition values on the sample level and then weights the sample-level acquisition values by a soft feasibility indicator. Hence, it expects non-log acquisition function values to be non-negative. qSimpleRegret acquisition values can be negative, so we instead use a ConstrainedMCObjective which applies constraints to the objectives (i.e. before computing the acquisition function) and shifts negative objective values using an infeasible cost to ensure non-negativity (before applying constraints and shifting them back).
Example
>>> model = SingleTaskGP(train_X, train_Y)
>>> sampler = SobolQMCNormalSampler(1024)
>>> qSR = qSimpleRegret(model, sampler)
>>> qsr = qSR(test_X)
q-Simple Regret.
- Parameters:
model (Model) – A fitted model.
sampler (MCSampler | None) – The sampler used to draw base samples. See MCAcquisitionFunction for more details.
objective (MCAcquisitionObjective | None) – The MCAcquisitionObjective under which the samples are evaluated. Defaults to IdentityMCObjective().
posterior_transform (PosteriorTransform | None) – A PosteriorTransform (optional).
X_pending (Tensor | None) – An m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call. Copied and set to have no gradient.
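A short plain-Python sketch of the qSR estimate, followed by a numeric illustration of why multiplicative constraint weighting assumes non-negative utilities. The function name and values are hypothetical and the feasibility weights are idealized as exactly 1 and 0.

```python
def q_simple_regret(samples):
    """MC estimate of qSR(X) = E(max Y); samples[s][j] is sample s at q-point j."""
    return sum(max(row) for row in samples) / len(samples)


samples = [[-1.0, -3.0], [-2.0, -0.5]]
qsr_estimate = q_simple_regret(samples)  # can be negative

# Weighting a negative utility by a feasibility indicator in [0, 1] moves it
# toward zero, so an infeasible sample would score higher than a feasible one.
utility = -2.0
weighted_feasible = utility * 1.0
weighted_infeasible = utility * 0.0
```

This inversion is the reason constraints for qSimpleRegret are handled at the objective level rather than through the constraints argument.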
- class botorch.acquisition.monte_carlo.qUpperConfidenceBound(model, beta, sampler=None, objective=None, posterior_transform=None, X_pending=None)[source]
Bases: SampleReducingMCAcquisitionFunction
MC-based batch Upper Confidence Bound.
Uses a reparameterization to extend UCB to qUCB for q > 1 (see Appendix A of [Wilson2017reparam]).
qUCB = E(max(mu + |Y_tilde - mu|)), where Y_tilde ~ N(mu, beta pi/2 Sigma) and f(X) has distribution N(mu, Sigma).
Constraints should be provided as a ConstrainedMCObjective. Passing constraints as an argument is not supported. This is because SampleReducingMCAcquisitionFunction computes the acquisition values on the sample level and then weights the sample-level acquisition values by a soft feasibility indicator. Hence, it expects non-log acquisition function values to be non-negative. qUpperConfidenceBound acquisition values can be negative, so we instead use a ConstrainedMCObjective which applies constraints to the objectives (i.e. before computing the acquisition function) and shifts negative objective values using an infeasible cost to ensure non-negativity (before applying constraints and shifting them back).
Example
>>> model = SingleTaskGP(train_X, train_Y)
>>> sampler = SobolQMCNormalSampler(1024)
>>> qUCB = qUpperConfidenceBound(model, 0.1, sampler)
>>> qucb = qUCB(test_X)
q-Upper Confidence Bound.
- Parameters:
model (Model) – A fitted model.
beta (float) – Controls tradeoff between mean and standard deviation in UCB.
sampler (MCSampler | None) – The sampler used to draw base samples. See MCAcquisitionFunction for more details.
objective (MCAcquisitionObjective | None) – The MCAcquisitionObjective under which the samples are evaluated. Defaults to IdentityMCObjective().
posterior_transform (PosteriorTransform | None) – A PosteriorTransform (optional).
X_pending (Tensor | None) – A batch_shape x m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call. Copied and set to have no gradient.
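The reparameterized qUCB estimate can be checked numerically against the analytic UCB for q = 1, where E|Y - mu| = sigma * sqrt(2 / pi) and hence the estimate should approach mu + sqrt(beta) * sigma. The sketch below is plain Python with hypothetical names; it estimates mu from the samples and scales deviations by sqrt(beta * pi / 2), following the formula above rather than reproducing library code.

```python
import math
import random


def q_upper_confidence_bound(samples, beta):
    """MC estimate of qUCB = E(max(mu + sqrt(beta * pi / 2) * |Y - mu|)).

    samples[s][j]: joint posterior sample s evaluated at q-point j.
    """
    n, q = len(samples), len(samples[0])
    # Per-point posterior mean, estimated from the samples.
    mu = [sum(row[j] for row in samples) / n for j in range(q)]
    scale = math.sqrt(beta * math.pi / 2.0)
    per_sample = [
        max(mu[j] + scale * abs(row[j] - mu[j]) for j in range(q))
        for row in samples
    ]
    return sum(per_sample) / n


# Sanity check for q = 1 with Y ~ N(1.0, 0.5^2) and beta = 4.
rng = random.Random(0)
samples = [[rng.gauss(1.0, 0.5)] for _ in range(100_000)]
qucb_estimate = q_upper_confidence_bound(samples, beta=4.0)
analytic_ucb = 1.0 + math.sqrt(4.0) * 0.5
```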
- class botorch.acquisition.monte_carlo.qLowerConfidenceBound(model, beta, sampler=None, objective=None, posterior_transform=None, X_pending=None)[source]
Bases: qUpperConfidenceBound
MC-based batched lower confidence bound.
This acquisition function is useful for confident/risk-averse decision making. This acquisition function is intended to be maximized as with qUpperConfidenceBound, but the qLowerConfidenceBound will be pessimistic in the face of uncertainty and lead to conservative candidates.
q-Lower Confidence Bound.
- Parameters:
model (Model) – A fitted model.
beta (float) – Controls tradeoff between mean and standard deviation in UCB.
sampler (MCSampler | None) – The sampler used to draw base samples. See MCAcquisitionFunction for more details.
objective (MCAcquisitionObjective | None) – The MCAcquisitionObjective under which the samples are evaluated. Defaults to IdentityMCObjective().
posterior_transform (PosteriorTransform | None) – A PosteriorTransform (optional).
X_pending (Tensor | None) – A batch_shape x m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call. Copied and set to have no gradient.
- class botorch.acquisition.monte_carlo.qPosteriorStandardDeviation(model, sampler=None, objective=None, posterior_transform=None, X_pending=None, constraints=None, eta=0.001)[source]
Bases: SampleReducingMCAcquisitionFunction
MC-based batch Posterior Standard Deviation.
An acquisition function for pure exploration.
Example
>>> model = SingleTaskGP(train_X, train_Y)
>>> sampler = SobolQMCNormalSampler(1024)
>>> qPSTD = qPosteriorStandardDeviation(model, sampler)
>>> std = qPSTD(test_X)
q-Posterior Standard Deviation.
- Parameters:
model (Model) – A fitted model.
sampler (MCSampler | None) – The sampler used to draw base samples. See MCAcquisitionFunction for more details.
objective (MCAcquisitionObjective | None) – The MCAcquisitionObjective under which the samples are evaluated. Defaults to IdentityMCObjective().
posterior_transform (PosteriorTransform | None) – A PosteriorTransform (optional).
X_pending (Tensor | None) – A batch_shape x m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call. Copied and set to have no gradient.
constraints (list[Callable[[Tensor], Tensor]] | None) – A list of constraint callables which map a Tensor of posterior samples of dimension sample_shape x batch-shape x q x m-dim to a sample_shape x batch-shape x q-dim Tensor. The associated constraints are considered satisfied if the output is less than zero.
eta (Tensor | float) – Temperature parameter(s) governing the smoothness of the sigmoid approximation to the constraint indicators. For more details on this parameter, see the docs of compute_smoothed_feasibility_indicator.
Monte-Carlo variants of the LogEI family of improvement-based acquisition functions; see [Ament2023logei] for details.
References
- class botorch.acquisition.logei.LogImprovementMCAcquisitionFunction(model, sampler=None, objective=None, posterior_transform=None, X_pending=None, constraints=None, eta=0.001, fat=True, tau_max=0.01)[source]
Bases: SampleReducingMCAcquisitionFunction
Abstract base class for Monte-Carlo-based batch LogEI acquisition functions.
Constructor of the base class for LogEI acquisition functions.
- Parameters:
model (Model) – A fitted model.
sampler (MCSampler | None) – The sampler used to draw base samples. If not given, a sampler is generated using get_sampler. NOTE: For posteriors that do not support base samples, a sampler compatible with the intended use case must be provided. See ForkedRNGSampler and StochasticSampler as examples.
objective (MCAcquisitionObjective | None) – The MCAcquisitionObjective under which the samples are evaluated. Defaults to IdentityMCObjective().
posterior_transform (PosteriorTransform | None) – A PosteriorTransform (optional).
X_pending (Tensor | None) – A batch_shape x m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated.
constraints (list[Callable[[Tensor], Tensor]] | None) – A list of constraint callables which map a Tensor of posterior samples of dimension sample_shape x batch-shape x q x m-dim to a sample_shape x batch-shape x q-dim Tensor. The associated constraints are satisfied if constraint(samples) < 0.
eta (Tensor | float) – Temperature parameter(s) governing the smoothness of the sigmoid approximation to the constraint indicators. See the docs of compute_(log_)constraint_indicator for more details on this parameter.
fat (bool) – Toggles the logarithmic / linear asymptotic behavior of the smooth approximation to the ReLU.
tau_max (float) – Temperature parameter controlling the sharpness of the approximation to the max operator over the q candidate points.
- class botorch.acquisition.logei.qLogExpectedImprovement(model, best_f, sampler=None, objective=None, posterior_transform=None, X_pending=None, constraints=None, eta=0.001, fat=True, tau_max=0.01, tau_relu=1e-06)[source]
Bases: LogImprovementMCAcquisitionFunction
MC-based batch Log Expected Improvement.
This computes qLogEI by (1) sampling the joint posterior over q points, (2) evaluating the smoothed log improvement over the current best for each sample, (3) smoothly maximizing over q, and (4) averaging over the samples in log space.
See [Ament2023logei] for details. Formally,
qLogEI(X) ~ log(qEI(X)) = log(E(max(max Y - best_f, 0))),
where Y ~ f(X), and X = (x_1,...,x_q).
Example
>>> model = SingleTaskGP(train_X, train_Y)
>>> best_f = train_Y.max()
>>> sampler = SobolQMCNormalSampler(1024)
>>> qLogEI = qLogExpectedImprovement(model, best_f, sampler)
>>> qei = qLogEI(test_X)
q-Log Expected Improvement.
- Parameters:
model (Model) – A fitted model.
best_f (float | Tensor) – The best objective value observed so far (assumed noiseless). Can be a scalar, or a batch_shape-dim tensor. In case of a batched model, the tensor can specify different values for each element of the batch.
sampler (MCSampler | None) – The sampler used to draw base samples. See MCAcquisitionFunction for more details.
objective (MCAcquisitionObjective | None) – The MCAcquisitionObjective under which the samples are evaluated. Defaults to IdentityMCObjective().
posterior_transform (PosteriorTransform | None) – A PosteriorTransform (optional).
X_pending (Tensor | None) – An m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call. Copied and set to have no gradient.
constraints (list[Callable[[Tensor], Tensor]] | None) – A list of constraint callables which map a Tensor of posterior samples of dimension sample_shape x batch-shape x q x m-dim to a sample_shape x batch-shape x q-dim Tensor. The associated constraints are satisfied if constraint(samples) < 0.
eta (Tensor | float) – Temperature parameter(s) governing the smoothness of the sigmoid approximation to the constraint indicators. See the docs of compute_(log_)smoothed_constraint_indicator for details.
fat (bool) – Toggles the logarithmic / linear asymptotic behavior of the smooth approximation to the ReLU.
tau_max (float) – Temperature parameter controlling the sharpness of the smooth approximations to max.
tau_relu (float) – Temperature parameter controlling the sharpness of the smooth approximations to ReLU.
- class botorch.acquisition.logei.qLogNoisyExpectedImprovement(model, X_baseline, sampler=None, objective=None, posterior_transform=None, X_pending=None, constraints=None, eta=0.001, fat=True, prune_baseline=True, cache_root=None, tau_max=0.01, tau_relu=1e-06, marginalize_dim=None, incremental=True)[source]
Bases: LogImprovementMCAcquisitionFunction, CachedCholeskyMCSamplerMixin
MC-based batch Log Noisy Expected Improvement.
This function does not assume a best_f is known (which would require noiseless observations). Instead, it uses samples from the joint posterior over the q test points and previously observed points. A smooth approximation to the canonical improvement over previously observed points is computed for each sample and the logarithm of the average is returned.
See [Ament2023logei] for details. Formally,
qLogNEI(X) ~ log(qNEI(X)) = log E(max(max Y - max Y_baseline, 0)),
where (Y, Y_baseline) ~ f((X, X_baseline)), X = (x_1,...,x_q).
For optimizing a batch of q > 1 points using sequential greedy optimization, the incremental improvement from the latest point is computed and returned by default. I.e., the pending points are treated as X_baseline. Often, the incremental EI is easier to optimize.
Example
>>> model = SingleTaskGP(train_X, train_Y)
>>> sampler = SobolQMCNormalSampler(1024)
>>> qLogNEI = qLogNoisyExpectedImprovement(model, train_X, sampler)
>>> acqval = qLogNEI(test_X)
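To make the role of the baseline concrete, here is a minimal sketch of the un-smoothed quantity log E(max(max Y - max Y_baseline, 0)), estimated from joint posterior draws. The function name and the input layout are assumptions made for illustration; BoTorch replaces the hard max and ReLU with smooth approximations, so this is the target being approximated rather than the library's computation.

```python
import math


def log_nei_estimate(joint_samples):
    """Plain Monte Carlo estimate of log E[max(max Y - max Y_baseline, 0)].

    joint_samples: list of (y_new, y_baseline) pairs, one pair per joint
    posterior draw over the q candidates and the baseline points.
    """
    improvements = [
        # The incumbent is re-evaluated per draw, since it is itself uncertain.
        max(max(y_new) - max(y_baseline), 0.0)
        for y_new, y_baseline in joint_samples
    ]
    mean_improvement = sum(improvements) / len(improvements)
    return math.log(mean_improvement) if mean_improvement > 0.0 else -math.inf
```

Unlike the noiseless case, the incumbent value differs from draw to draw, which is why no fixed `best_f` is needed.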
q-Noisy Expected Improvement.
- Parameters:
model (Model) – A fitted model.
X_baseline (Tensor) – A batch_shape x r x d-dim Tensor of r design points that have already been observed. These points are considered as the potential best design point.
sampler (MCSampler | None) – The sampler used to draw base samples. See MCAcquisitionFunction for more details.
objective (MCAcquisitionObjective | None) – The MCAcquisitionObjective under which the samples are evaluated. Defaults to IdentityMCObjective().
posterior_transform (PosteriorTransform | None) – A PosteriorTransform (optional).
X_pending (Tensor | None) – A batch_shape x m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call. Copied and set to have no gradient.
constraints (list[Callable[[Tensor], Tensor]] | None) – A list of constraint callables which map a Tensor of posterior samples of dimension sample_shape x batch-shape x q x m-dim to a sample_shape x batch-shape x q-dim Tensor. The associated constraints are satisfied if constraint(samples) < 0.
eta (Tensor | float) – Temperature parameter(s) governing the smoothness of the sigmoid approximation to the constraint indicators. See the docs of compute_(log_)smoothed_constraint_indicator for details.
fat (bool) – Toggles the logarithmic / linear asymptotic behavior of the smooth approximation to the ReLU.
prune_baseline (bool) – If True, remove points in X_baseline that are highly unlikely to be the best point. This can significantly improve performance and is generally recommended. In order to customize pruning parameters, instead manually call botorch.acquisition.utils.prune_inferior_points on X_baseline before instantiating the acquisition function.
cache_root (bool | None) – A boolean indicating whether to cache the root decomposition over X_baseline and use low-rank updates.
tau_max (float) – Temperature parameter controlling the sharpness of the smooth approximations to max.
tau_relu (float) – Temperature parameter controlling the sharpness of the smooth approximations to ReLU.
marginalize_dim (int | None) – The dimension to marginalize over.
incremental (bool) – Whether to compute incremental EI over the pending points or compute EI of the joint batch improvement (including pending points).
TODO: similar to qNEHVI, when we are using sequential greedy candidate selection, we could incorporate pending points X_baseline and compute the incremental q(Log)NEI from the new point. This would greatly increase efficiency for large batches.
- property X_baseline: Tensor
Returns the set of points that should be considered as the incumbent.
For incremental EI, this contains the previously evaluated points (X_baseline) and pending points (X_pending). For non-incremental EI, this contains the previously evaluated points (X_baseline).
- set_X_pending(X_pending=None)[source]
Informs the acquisition function about pending design points.
Here pending points are concatenated with X_baseline and incremental NEI is computed.
- Parameters:
X_pending (Tensor | None) – n x d Tensor with n d-dim design points that have been submitted for evaluation but have not yet been evaluated.
- Return type:
None
- class botorch.acquisition.logei.qLogProbabilityOfFeasibility(model, constraints, sampler=None, objective=None, posterior_transform=None, X_pending=None, eta=0.001, fat=True, tau_max=0.01)[source]
Bases: LogImprovementMCAcquisitionFunction
MC-based batch LogProbabilityOfFeasibility.
This computes the log probability of feasibility by (1) sampling the joint posterior over q points, (2) evaluating the feasibility of each sample, and (3) averaging over the sample and batch dimensions.
log_prob_feas(X) = log(P(f(X) <= 0)), where f(X) ~ GP.
Constructor of the batch log probability of feasibility acquisition function.
- Parameters:
model (Model) – A fitted model.
constraints (list[Callable[[Tensor], Tensor]]) – A list of constraint callables which map a Tensor of posterior samples of dimension sample_shape x batch-shape x q x m-dim to a sample_shape x batch-shape x q-dim Tensor. The associated constraints are satisfied if constraint(samples) < 0.
sampler (MCSampler | None) – The sampler used to draw base samples. If not given, a sampler is generated using get_sampler. NOTE: For posteriors that do not support base samples, a sampler compatible with the intended use case must be provided. See ForkedRNGSampler and StochasticSampler as examples.
objective (MCAcquisitionObjective | None) – Not used, kept for compatibility with interface.
posterior_transform (PosteriorTransform | None) – A PosteriorTransform (optional).
X_pending (Tensor | None) – A batch_shape x m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated.
eta (Tensor | float) – Temperature parameter(s) governing the smoothness of the sigmoid approximation to the constraint indicators. See the docs of compute_(log_)constraint_indicator for more details on this parameter.
fat (bool) – Toggles the logarithmic / linear asymptotic behavior of the smooth approximation to the ReLU.
tau_max (float) – Temperature parameter controlling the sharpness of the approximation to the max operator over the q candidate points.
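The role of the temperature eta can be sketched as follows, assuming a plain sigmoid approximation sigmoid(-c / eta) to the indicator of c < 0 and a single candidate point (q = 1) per sample. The function names are hypothetical, and the fat-tailed variant selected by `fat=True` and the smooth maximum over q are not reproduced here.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(1 / (1 + exp(-x)))."""
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


def log_feasibility_indicator(constraint_values, eta=1e-3):
    """Sum over constraints of log sigmoid(-c / eta) for one posterior sample."""
    return sum(log_sigmoid(-c / eta) for c in constraint_values)


def log_prob_feasible(samples, eta=1e-3):
    """Log of the sample average of the smoothed feasibility indicators.

    samples: one list of constraint values per posterior draw (assumed given).
    """
    logs = [log_feasibility_indicator(c, eta) for c in samples]
    m = max(logs)
    # Log-mean-exp keeps the average stable when indicators are tiny.
    return m + math.log(sum(math.exp(v - m) for v in logs)) - math.log(len(logs))
```

A smaller eta makes the sigmoid closer to a hard indicator, while a larger eta gives smoother values at the price of a less accurate indicator near the constraint boundary.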
Multi-Objective Analytic Acquisition Functions
Analytic Acquisition Functions for Multi-objective Bayesian optimization.
References
- class botorch.acquisition.multi_objective.analytic.ExpectedHypervolumeImprovement(model, ref_point, partitioning, posterior_transform=None)[source]
Bases: MultiObjectiveAnalyticAcquisitionFunction
Expected Hypervolume Improvement supporting m>=2 outcomes.
This computes EHVI using the algorithm from [Yang2019], but additionally computes gradients via auto-differentiation as proposed by [Daulton2020qehvi].
Note: this is currently inefficient in two ways due to the binary partitioning algorithm that we use for the box decomposition:
- We have more boxes in our decomposition.
- If we used a box decomposition that used inf as the upper bound for the last dimension in all hypercells, then we could reduce the number of terms we need to compute from 2^m to 2^(m-1). [Yang2019] do this by using DKLV17 and LKF17 for the box decomposition.
TODO: Use DKLV17 and LKF17 for the box decomposition as in [Yang2019] for greater efficiency.
TODO: Add support for outcome constraints.
Example
>>> model = SingleTaskGP(train_X, train_Y)
>>> ref_point = [0.0, 0.0]
>>> EHVI = ExpectedHypervolumeImprovement(model, ref_point, partitioning)
>>> ehvi = EHVI(test_X)
- Parameters:
model (Model) – A fitted model.
ref_point (list[float]) – A list with m elements representing the reference point (in the outcome space) with respect to which the hypervolume is computed. This is a reference point for the outcome values (i.e., after applying posterior_transform if provided).
partitioning (NondominatedPartitioning) – A NondominatedPartitioning module that provides the non-dominated front and a partitioning of the non-dominated space in hyper-rectangles.
posterior_transform (PosteriorTransform | None) – A PosteriorTransform.
- psi(lower, upper, mu, sigma)[source]
Compute Psi function.
For each cell i and outcome k:
Psi(lower_{i,k}, upper_{i,k}, mu_k, sigma_k) = ( sigma_k * PDF((upper_{i,k} - mu_k) / sigma_k) + ( mu_k - lower_{i,k} ) * (1 - CDF((upper_{i,k} - mu_k) / sigma_k)) )
See Equation 19 in [Yang2019] for more details.
- Parameters:
lower (Tensor) – A num_cells x m-dim tensor of lower cell bounds
upper (Tensor) – A num_cells x m-dim tensor of upper cell bounds
mu (Tensor) – A batch_shape x 1 x m-dim tensor of means
sigma (Tensor) – A batch_shape x 1 x m-dim tensor of standard deviations (clamped).
- Returns:
A batch_shape x num_cells x m-dim tensor of values.
- Return type:
Tensor
- nu(lower, upper, mu, sigma)[source]
Compute Nu function.
For each cell i and outcome k:
nu(lower_{i,k}, upper_{i,k}, mu_k, sigma_k) = ( upper_{i,k} - lower_{i,k} ) * (1 - CDF((upper_{i,k} - mu_k) / sigma_k))
See Equation 25 in [Yang2019] for more details.
- Parameters:
lower (Tensor) – A num_cells x m-dim tensor of lower cell bounds
upper (Tensor) – A num_cells x m-dim tensor of upper cell bounds
mu (Tensor) – A batch_shape x 1 x m-dim tensor of means
sigma (Tensor) – A batch_shape x 1 x m-dim tensor of standard deviations (clamped).
- Returns:
A batch_shape x num_cells x m-dim tensor of values.
- Return type:
Tensor
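For a single cell and a single outcome, the Psi and Nu functions can be written out as scalar formulas. The sketch below uses only the standard library and reads the CDF argument in Psi as (upper - mu) / sigma, the same standardized value that appears in Nu. The scalar signatures are an illustration, not the tensor-valued methods of the class.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def psi(lower, upper, mu, sigma):
    """sigma * PDF(z) + (mu - lower) * (1 - CDF(z)) with z = (upper - mu) / sigma."""
    z = (upper - mu) / sigma
    return sigma * normal_pdf(z) + (mu - lower) * (1.0 - normal_cdf(z))


def nu(lower, upper, mu, sigma):
    """(upper - lower) * (1 - CDF(z)) with z = (upper - mu) / sigma."""
    z = (upper - mu) / sigma
    return (upper - lower) * (1.0 - normal_cdf(z))
```

With mu equal to the upper bound, the standardized value is zero, so half of the probability mass lies above the cell in that outcome and `nu` reduces to half the cell width.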
- forward(X, *args, **kwargs)
Takes in a batch_shape x 1 x d X Tensor of t-batches with 1 d-dim design point each, and returns a Tensor with shape batch_shape', where batch_shape' is the broadcasted batch shape of model and input X.
- Parameters:
acqf (AcquisitionFunction)
X (Any)
args (Any)
kwargs (Any)
- Return type:
Any
Multi-Objective Hypervolume Knowledge Gradient Acquisition Functions
The hypervolume knowledge gradient acquisition function (HVKG).
References:
- class botorch.acquisition.multi_objective.hypervolume_knowledge_gradient.qHypervolumeKnowledgeGradient(model, ref_point, num_fantasies=8, num_pareto=10, sampler=None, objective=None, inner_sampler=None, X_evaluation_mask=None, X_pending=None, X_pending_evaluation_mask=None, current_value=None, use_posterior_mean=True, cost_aware_utility=None, log=False)[source]
Bases: DecoupledAcquisitionFunction, MultiObjectiveMCAcquisitionFunction, OneShotAcquisitionFunction
Batch Hypervolume Knowledge Gradient using one-shot optimization.
The hypervolume knowledge gradient seeks to maximize the difference in hypervolume of the hypervolume-maximizing set of a fixed size after conditioning on the unknown observation(s) that would be received if X were evaluated. See [Daulton2023hvkg] for details.
This computes the batch Hypervolume Knowledge Gradient using fantasies for the outer expectation and MC-sampling for the inner expectation.
In addition to the design variables, the input X also includes variables for the optimal designs for each of the fantasy models (note that this is N x N_pareto optimal designs). For a fixed number of fantasies, all points in X can be optimized in a “one-shot” fashion.
q-Hypervolume Knowledge Gradient.
- Parameters:
model (Model) – A fitted model. Must support fantasizing.
ref_point (Tensor) – An m-dim tensor containing the reference point.
num_fantasies (int) – The number of fantasy points to use. More fantasy points result in a better approximation, at the expense of memory and wall time. Unused if sampler is specified.
num_pareto (int) – The number of pareto optimal designs to consider.
sampler (ListSampler | None) – The sampler used to sample fantasy observations. Optional if num_fantasies is specified. The optimization performance does not seem particularly sensitive to the number of fantasies. As the number of fantasies increases, the estimation of the expectation over fantasies becomes more accurate, but the one-shot optimization problem gets harder as there are more “fantasy” designs that need to be optimized.
objective (MCMultiOutputObjective | None) – The objective under which the samples are evaluated. If None, then the analytic posterior mean is used. Otherwise, the objective is MC-evaluated (using inner_sampler).
inner_sampler (MCSampler | None) – The sampler used for inner sampling. Ignored if the objective is None.
X_evaluation_mask (list[Tensor] | None) – A q x m-dim tensor of booleans indicating which objective(s) each of the q points should be evaluated on.
X_pending (Tensor | None) – An n' x d-dim Tensor of n' design points that have been submitted for function evaluation but have not yet been evaluated.
X_pending_evaluation_mask (Tensor | None) – An n' x m-dim tensor of booleans indicating which objective(s) each of the n' pending points are being evaluated on.
current_value (Tensor | None) – The current value, i.e. the expected best objective given the observed points D. If omitted, forward will not return the actual KG value, but the expected best objective given the data set D u X. If pending points are used, this should be the current value under the fantasy model conditioned on the pending points so that the incremental KG value from the new candidates (not pending points) is used.
use_posterior_mean (bool) – If true, optimize the hypervolume of the posterior mean, otherwise optimize the expected hypervolume. See [Daulton2023hvkg] for details.
cost_aware_utility (CostAwareUtility | None) – A CostAwareUtility specifying the cost function for evaluating the X on the objectives indicated by evaluation_mask.
log (bool) – If True, then returns the log of the HVKG value. If True, then it expects current_value to be in log-space and cost_aware_utility to output log utilities.
- property cost_sampler
- forward(X, *args, **kwargs)
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Any) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.
acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A (b)-dim Tensor of acquisition function values at the given design points X.
- Return type:
Any
- get_augmented_q_batch_size(q)[source]
Get augmented q batch size for one-shot optimization.
- Parameters:
q (int) – The number of candidates to consider jointly.
- Returns:
The augmented size for one-shot optimization (including variables parameterizing the fantasy solutions).
- Return type:
int
- extract_candidates(X_full)[source]
We only return X as the set of candidates post-optimization.
- Parameters:
X_full (Tensor) – A b x (q + num_fantasies) x d-dim Tensor with b t-batches of q + num_fantasies design points each.
- Returns:
A b x q x d-dim Tensor with b t-batches of q design points each.
- Return type:
Tensor
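The one-shot bookkeeping can be sketched with nested lists standing in for tensors. Two assumptions are made and should be checked against the library source: that the fantasy designs number num_fantasies x num_pareto (as the class description's "N x N_pareto" suggests, whereas the shape quoted for X_full mentions only q + num_fantasies), and that the q actual candidates occupy the leading rows of each t-batch. Both helper functions are hypothetical stand-ins for the methods above.

```python
def augmented_q_batch_size(q, num_fantasies, num_pareto):
    """Candidates plus the assumed num_fantasies * num_pareto fantasy designs."""
    return q + num_fantasies * num_pareto


def extract_candidates(X_full, q):
    """Keep the first q points of every t-batch.

    X_full: list of b t-batches, each a list of q_aug points (lists of floats).
    """
    return [batch[:q] for batch in X_full]
```

With the default `num_fantasies=8` and `num_pareto=10`, optimizing q = 2 candidates would, under these assumptions, mean optimizing over 82 points per t-batch.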
- class botorch.acquisition.multi_objective.hypervolume_knowledge_gradient.qMultiFidelityHypervolumeKnowledgeGradient(model, ref_point, target_fidelities, num_fantasies=8, num_pareto=10, sampler=None, objective=None, inner_sampler=None, X_pending=None, X_evaluation_mask=None, X_pending_evaluation_mask=None, current_value=None, cost_aware_utility=None, project=<function qMultiFidelityHypervolumeKnowledgeGradient.<lambda>>, valfunc_cls=None, valfunc_argfac=None, use_posterior_mean=True, log=False, **kwargs)[source]
Bases: qHypervolumeKnowledgeGradient
Batch Hypervolume Knowledge Gradient for multi-fidelity optimization.
See [Daulton2023hvkg] for details.
A version of qHypervolumeKnowledgeGradient that supports multi-fidelity optimization via a CostAwareUtility and the project and expand operators. If none of these are set, this acquisition function reduces to qHypervolumeKnowledgeGradient. Through valfunc_cls and valfunc_argfac, this can be changed into a custom multi-fidelity acquisition function.
Multi-Fidelity q-Knowledge Gradient (one-shot optimization).
- Parameters:
model (Model) – A fitted model. Must support fantasizing.
ref_point (Tensor) – An m-dim tensor containing the reference point.
num_fantasies (int) – The number of fantasy points to use. More fantasy points result in a better approximation, at the expense of memory and wall time. Unused if sampler is specified.
num_pareto (int) – The number of pareto optimal designs to consider.
sampler (MCSampler | None) – The sampler used to sample fantasy observations. Optional if num_fantasies is specified.
objective (MCMultiOutputObjective | None) – The objective under which the samples are evaluated. If None, then the analytic posterior mean is used. Otherwise, the objective is MC-evaluated (using inner_sampler).
inner_sampler (MCSampler | None) – The sampler used for inner sampling. Ignored if the objective is None.
X_evaluation_mask (Tensor | None) – A q x m-dim tensor of booleans indicating which objective(s) each of the q points should be evaluated on.
X_pending (Tensor | None) – An n' x d-dim Tensor of n' design points that have been submitted for function evaluation but have not yet been evaluated.
X_pending_evaluation_mask (Tensor | None) – An n' x m-dim tensor of booleans indicating which objective(s) each of the n' pending points are being evaluated on.
current_value (Tensor | None) – The current value, i.e. the expected best objective given the observed points D. If omitted, forward will not return the actual KG value, but the expected best objective given the data set D u X. If pending points are used, this should be the current value under the fantasy model conditioned on the pending points so that the incremental KG value from the new candidates (not pending points) is used.
use_posterior_mean (bool) – A boolean indicating whether to optimize the hypervolume of the posterior mean or the expected hypervolume. See [Daulton2023hvkg] for details.
cost_aware_utility (CostAwareUtility | None) – A CostAwareUtility specifying the cost function for evaluating the X on the objectives indicated by evaluation_mask.
project (Callable[[Tensor], Tensor]) – A callable mapping a batch_shape x q x d tensor of design points to a tensor with shape batch_shape x q_term x d projected to the desired target set (e.g. the target fidelities in case of multi-fidelity optimization). For the basic case, q_term = q.
valfunc_cls (type[AcquisitionFunction] | None) – An acquisition function class to be used as the terminal value function.
valfunc_argfac (Callable[[Model], dict[str, Any]] | None) – An argument factory, i.e. callable that maps a Model to a dictionary of kwargs for the terminal value function (e.g. best_f for ExpectedImprovement).
log (bool) – If True, then returns the log of the HVKG value. If True, then it expects current_value to be in log-space and cost_aware_utility to output log utilities.
target_fidelities (dict[int, float])
kwargs (Any)
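As an illustration of what a `project` callable might do, the sketch below overwrites the fidelity columns of each design point with their target values. It assumes that `target_fidelities` maps a column index of X to the target value for that fidelity parameter, which the parameter list above does not spell out. The function is a hypothetical list-based stand-in for the tensor-valued callable, not the class's default.

```python
def project_to_target_fidelities(X, target_fidelities):
    """Return a copy of X with fidelity columns set to their target values.

    X: list of design points (lists of floats).
    target_fidelities: dict mapping column index -> target value (assumed layout).
    """
    projected = []
    for point in X:
        new_point = list(point)  # copy so the input is left untouched
        for column, value in target_fidelities.items():
            new_point[column] = value
        projected.append(new_point)
    return projected
```

For example, `project_to_target_fidelities([[0.1, 0.2, 0.5]], {2: 1.0})` keeps the first two design variables and moves the fidelity in column 2 to its target of 1.0, so here q_term = q.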
- forward(X, *args, **kwargs)
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Any) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.
acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A (b)-dim Tensor of acquisition function values at the given design points X.
- Return type:
Any
Multi-Objective Joint Entropy Search Acquisition Functions
Acquisition functions for joint entropy search for Bayesian optimization (JES).
References:
- class botorch.acquisition.multi_objective.joint_entropy_search.LowerBoundMultiObjectiveEntropySearch(model, pareto_sets, pareto_fronts, hypercell_bounds, X_pending=None, estimation_type='LB', num_samples=64)[source]
Bases: AcquisitionFunction, MCSamplerMixin
Abstract base class for the lower bound multi-objective entropy search acquisition functions.
Lower bound multi-objective entropy search acquisition function.
- Parameters:
model (Model) – A fitted batch model with ‘M’ number of outputs.
pareto_sets (Tensor) – A num_pareto_samples x num_pareto_points x d-dim Tensor containing the sampled Pareto optimal sets of inputs.
pareto_fronts (Tensor) – A num_pareto_samples x num_pareto_points x M-dim Tensor containing the sampled Pareto optimal sets of outputs.
hypercell_bounds (Tensor) – A num_pareto_samples x 2 x J x M-dim Tensor containing the hyper-rectangle bounds for integration, where J is the number of hyper-rectangles. In the unconstrained case, this gives the partition of the dominated space. In the constrained case, this gives the partition of the feasible dominated space union the infeasible space.
X_pending (Tensor | None) – An m x d-dim Tensor of m design points that have been submitted for function evaluation, but have not yet been evaluated.
estimation_type (str) – A string to determine which entropy estimate is computed: “0”, “LB”, “LB2”, or “MC”.
num_samples (int) – The number of Monte Carlo samples for the Monte Carlo estimate.
- abstractmethod forward(X)[source]
Compute lower bound multi-objective entropy search at the design points X.
- Parameters:
X (Tensor) – A batch_shape x q x d-dim Tensor of batch_shape t-batches with q d-dim design points each.
- Returns:
A batch_shape-dim Tensor of acquisition values at the given design points X.
- Return type:
Tensor
- class botorch.acquisition.multi_objective.joint_entropy_search.qLowerBoundMultiObjectiveJointEntropySearch(model, pareto_sets, pareto_fronts, hypercell_bounds, X_pending=None, estimation_type='LB', num_samples=64)[source]
Bases: LowerBoundMultiObjectiveEntropySearch
The acquisition function for the multi-objective joint entropy search, where the batches q > 1 are supported through the lower bound formulation.
This acquisition function computes the mutual information between the observation at a candidate point X and the Pareto optimal input-output pairs.
See [Tu2022] for a discussion on the estimation procedure.
NOTES: (i) The estimated acquisition value could be negative.
(ii) The lower bound batch acquisition function might not be monotone in the sense that adding more elements to the batch does not necessarily increase the acquisition value. Specifically, the acquisition value can become smaller when more inputs are added.
Lower bound multi-objective joint entropy search acquisition function.
- Parameters:
model (Model) – A fitted batch model with ‘M’ number of outputs.
pareto_sets (Tensor) – A num_pareto_samples x num_pareto_points x d-dim Tensor containing the sampled Pareto optimal sets of inputs.
pareto_fronts (Tensor) – A num_pareto_samples x num_pareto_points x M-dim Tensor containing the sampled Pareto optimal sets of outputs.
hypercell_bounds (Tensor) – A num_pareto_samples x 2 x J x M-dim Tensor containing the hyper-rectangle bounds for integration. In the unconstrained case, this gives the partition of the dominated space. In the constrained case, this gives the partition of the feasible dominated space union the infeasible space.
X_pending (Tensor | None) – An m x d-dim Tensor of m design points that have been submitted for function evaluation, but have not yet been evaluated.
estimation_type (str) – A string to determine which entropy estimate is computed: “0”, “LB”, “LB2”, or “MC”.
num_samples (int) – The number of Monte Carlo samples used for the Monte Carlo estimate.
- forward(X, *args, **kwargs)
Compute lower bound multi-objective entropy search at the design points X.
- Parameters:
X (Any) – A batch_shape x q x d-dim Tensor of batch_shape t-batches with q d-dim design points each.
acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A batch_shape-dim Tensor of acquisition values at the given design points X.
- Return type:
Any
Multi-Objective Max-value Entropy Search Acquisition Functions
Acquisition functions for max-value entropy search for multi-objective Bayesian optimization (MESMO).
- class botorch.acquisition.multi_objective.max_value_entropy_search.qLowerBoundMultiObjectiveMaxValueEntropySearch(model, hypercell_bounds, X_pending=None, estimation_type='LB', num_samples=64)[source]
Bases: LowerBoundMultiObjectiveEntropySearch
The acquisition function for the multi-objective Max-value Entropy Search, where the batches q > 1 are supported through the lower bound formulation.
This acquisition function computes the mutual information between the observation at a candidate point X and the Pareto optimal outputs.
See [Tu2022] for a discussion on the estimation procedure.
NOTES: (i) The estimated acquisition value could be negative.
(ii) The lower bound batch acquisition function might not be monotone in the sense that adding more elements to the batch does not necessarily increase the acquisition value. Specifically, the acquisition value can become smaller when more inputs are added.
Lower bound multi-objective max-value entropy search acquisition function.
- Parameters:
model (Model) – A fitted batch model with ‘M’ number of outputs.
hypercell_bounds (Tensor) – A num_pareto_samples x 2 x J x M-dim Tensor containing the hyper-rectangle bounds for integration, where J is the number of hyper-rectangles. In the unconstrained case, this gives the partition of the dominated space. In the constrained case, this gives the partition of the feasible dominated space union the infeasible space.
X_pending (Tensor | None) – An m x d-dim Tensor of m design points that have been submitted for function evaluation, but have not yet been evaluated.
estimation_type (str) – A string to determine which entropy estimate is computed: “0”, “LB”, “LB2”, or “MC”.
num_samples (int) – The number of Monte Carlo samples for the Monte Carlo estimate.
- forward(X, *args, **kwargs)
Compute lower bound multi-objective entropy search at the design points X.
- Parameters:
X (Any) – A batch_shape x q x d-dim Tensor of batch_shape t-batches with q d-dim design points each.
acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A batch_shape-dim Tensor of acquisition values at the given design points X.
- Return type:
Any
Multi-Objective Monte-Carlo Acquisition Functions
Monte-Carlo Acquisition Functions for Multi-objective Bayesian optimization. In particular, this module contains implementations of 1) qEHVI [Daulton2020qehvi], and 2) qNEHVI [Daulton2021nehvi].
References
- class botorch.acquisition.multi_objective.monte_carlo.qExpectedHypervolumeImprovement(model, ref_point, partitioning, sampler=None, objective=None, constraints=None, X_pending=None, eta=0.001, fat=False)[source]
Bases: MultiObjectiveMCAcquisitionFunction, SubsetIndexCachingMixin
q-Expected Hypervolume Improvement supporting m>=2 outcomes.
See [Daulton2020qehvi] for details.
Example
>>> model = SingleTaskGP(train_X, train_Y)
>>> ref_point = [0.0, 0.0]
>>> qEHVI = qExpectedHypervolumeImprovement(model, ref_point, partitioning)
>>> qehvi = qEHVI(test_X)
- Parameters:
model (Model) – A fitted model.
ref_point (list[float] | Tensor) – A list or tensor with m elements representing the reference point (in the outcome space) with respect to which the hypervolume is computed. This is a reference point for the objective values (i.e. after applying objective to the samples).
partitioning (NondominatedPartitioning) – A NondominatedPartitioning module that provides the non-dominated front and a partitioning of the non-dominated space in hyper-rectangles. If constraints are present, this partitioning must only include feasible points.
sampler (MCSampler | None) – The sampler used to draw base samples. If not given, a sampler is generated using get_sampler.
objective (MCMultiOutputObjective | None) – The MCMultiOutputObjective under which the samples are evaluated. Defaults to IdentityMCMultiOutputObjective().
constraints (list[Callable[[Tensor], Tensor]] | None) – A list of callables, each mapping a Tensor of dimension sample_shape x batch-shape x q x m to a Tensor of dimension sample_shape x batch-shape x q, where negative values imply feasibility. The acquisition function will compute expected feasible hypervolume.
X_pending (Tensor | None) – A batch_shape x m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call. Copied and set to have no gradient.
eta (Tensor | float) – The temperature parameter for the sigmoid function used for the differentiable approximation of the constraints. In case of a float the same eta is used for every constraint in constraints. In case of a tensor the length of the tensor must match the number of provided constraints. The i-th constraint is then estimated with the i-th eta value.
fat (bool) – A Boolean flag indicating whether to use the heavy-tailed approximation of the constraint indicator.
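The quantity being estimated can be illustrated for two objectives, maximization, and a single candidate (q = 1): the hypervolume improvement of each sampled outcome over the current Pareto front is computed and then averaged. This is a plain-Python sketch with hypothetical helper names; it does not use the box decomposition supplied by `partitioning` and ignores constraints.

```python
def hypervolume_2d(points, ref_point):
    """Area dominated by `points` (maximization) and bounded below by `ref_point`."""
    r1, r2 = ref_point
    # Only points that strictly dominate the reference point contribute.
    candidates = sorted(
        (p for p in points if p[0] > r1 and p[1] > r2), key=lambda p: -p[0]
    )
    volume, best_f2 = 0.0, r2
    for f1, f2 in candidates:
        if f2 > best_f2:  # dominated points add no area and are skipped
            volume += (f1 - r1) * (f2 - best_f2)
            best_f2 = f2
    return volume


def mc_hypervolume_improvement(pareto_front, ref_point, outcome_samples):
    """Average hypervolume gain over sampled outcomes of one candidate."""
    base = hypervolume_2d(pareto_front, ref_point)
    gains = [
        hypervolume_2d(list(pareto_front) + [y], ref_point) - base
        for y in outcome_samples
    ]
    return sum(gains) / len(gains)
```

Recomputing the hypervolume for every sample is the simplest formulation; it is shown here only to make the role of `ref_point` and the Pareto front explicit.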
- forward(X, *args, **kwargs)
Takes in a batch_shape x q x d X Tensor of t-batches with q d-dim design points each, and returns a Tensor with shape batch_shape', where batch_shape' is the broadcasted batch shape of model and input X. Should utilize the result of set_X_pending as needed to account for pending function evaluations.
- Parameters:
acqf (AcquisitionFunction)
X (Any)
args (Any)
kwargs (Any)
- Return type:
Any
- class botorch.acquisition.multi_objective.monte_carlo.qNoisyExpectedHypervolumeImprovement(model, ref_point, X_baseline, sampler=None, objective=None, constraints=None, X_pending=None, eta=0.001, fat=False, prune_baseline=False, alpha=0.0, cache_pending=True, max_iep=0, incremental_nehvi=True, cache_root=None, marginalize_dim=None)[source]
Bases: NoisyExpectedHypervolumeMixin, qExpectedHypervolumeImprovement
q-Noisy Expected Hypervolume Improvement supporting m>=2 outcomes.
See [Daulton2021nehvi] for details.
Example
>>> model = SingleTaskGP(train_X, train_Y)
>>> ref_point = [0.0, 0.0]
>>> qNEHVI = qNoisyExpectedHypervolumeImprovement(model, ref_point, train_X)
>>> qnehvi = qNEHVI(test_X)
- Parameters:
model (Model) – A fitted model.
ref_point (list[float] | Tensor) – A list or tensor with
melements representing the reference point (in the outcome space) w.r.t. to which compute the hypervolume. This is a reference point for the objective values (i.e. after applyingobjectiveto the samples).X_baseline (Tensor) – A
r x d-dim Tensor ofrdesign points that have already been observed. These points are considered as potential approximate pareto-optimal design points.sampler (MCSampler | None) – The sampler used to draw base samples. If not given, a sampler is generated using
get_sampler. Note: a pareto front is created for each mc sample, which can be computationally intensive form> 2.objective (MCMultiOutputObjective | None) – The MCMultiOutputObjective under which the samples are evaluated. Defaults to
IdentityMCMultiOutputObjective().constraints (list[Callable[[Tensor], Tensor]] | None) – A list of callables, each mapping a Tensor of dimension
sample_shape x batch-shape x q x mto a Tensor of dimensionsample_shape x batch-shape x q, where negative values imply feasibility. The acquisition function will compute expected feasible hypervolume.X_pending (Tensor | None) – A
batch_shape x m x d-dim Tensor ofmdesign points that have points that have been submitted for function evaluation, but have not yet been evaluated.eta (Tensor | float) – The temperature parameter for the sigmoid function used for the differentiable approximation of the constraints. In case of a float the same
etais used for every constraint in constraints. In case of a tensor the length of the tensor must match the number of provided constraints. The i-th constraint is then estimated with the i-thetavalue. For more details, on this parameter, see the docs ofcompute_smoothed_feasibility_indicator.fat (bool) – A Boolean flag indicating whether to use the heavy-tailed approximation of the constraint indicator.
prune_baseline (bool) – If True, remove points in
X_baseline that are highly unlikely to be Pareto optimal and better than the reference point. This can significantly improve computation time and is generally recommended. In order to customize pruning parameters, instead manually call prune_inferior_points_multi_objective on X_baseline before instantiating the acquisition function.
alpha (float) – The hyperparameter controlling the approximate non-dominated partitioning. The default value of 0.0 means an exact partitioning is used. As the number of objectives
mincreases, consider increasing this parameter in order to limit computational complexity.cache_pending (bool) – A boolean indicating whether to use cached box decompositions (CBD) for handling pending points. This is generally recommended.
max_iep (int) – The maximum number of pending points before the box decompositions will be recomputed.
incremental_nehvi (bool) – A boolean indicating whether to compute the incremental NEHVI from the
i-th point, where i = 1, ..., q, under sequential greedy optimization, or the full qNEHVI over q points.
cache_root (bool | None) – A boolean indicating whether to cache the root decomposition over
X_baselineand use low-rank updates.marginalize_dim (int | None) – A batch dimension that should be marginalized. For example, this is useful when using a batched fully Bayesian model.
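The role of eta in the constraints parameter above can be illustrated with a small scalar sketch. It assumes the smoothed indicator is a product of sigmoid(-c / eta) terms, one per constraint value c (negative meaning feasible). The helper names sigmoid and smoothed_feasibility are hypothetical and are not BoTorch functions; the library operates on tensors of posterior samples.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def smoothed_feasibility(constraint_values, eta):
    """Product of sigmoid(-c / eta) over constraints (c < 0 means feasible).

    `eta` may be a single float (shared by all constraints) or a sequence
    with one temperature per constraint. Hypothetical helper for illustration.
    """
    if isinstance(eta, (int, float)):
        eta = [float(eta)] * len(constraint_values)
    if len(eta) != len(constraint_values):
        raise ValueError("need one eta per constraint")
    weight = 1.0
    for c, e in zip(constraint_values, eta):
        weight *= sigmoid(-c / e)
    return weight


# A small eta gives a sharp, nearly binary indicator.
print(smoothed_feasibility([-1.0], 1e-3))   # close to 1 (feasible)
print(smoothed_feasibility([1.0], 1e-3))    # close to 0 (infeasible)
# A large eta blurs the boundary between feasible and infeasible.
print(smoothed_feasibility([-1.0], 10.0))
```

Smaller values of eta track the true indicator more closely but make the gradient vanish away from the constraint boundary, which is the trade-off the temperature controls.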
- forward(X, *args, **kwargs)
Takes in a
batch_shape x q x dX Tensor of t-batches withqd-dim design points each, and returns a Tensor with shapebatch_shape', wherebatch_shape'is the broadcasted batch shape of model and inputX. Should utilize the result ofset_X_pendingas needed to account for pending function evaluations.- Parameters:
acqf (AcquisitionFunction)
X (Any)
args (Any)
kwargs (Any)
- Return type:
Any
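To make the role of ref_point concrete, the following is a minimal sketch of hypervolume and hypervolume improvement for two maximized objectives. It is an illustration only: the function names are hypothetical, the sweep works for m = 2 only, and BoTorch itself relies on box decompositions over tensors rather than this loop.

```python
def hypervolume_2d(points, ref_point):
    """Area dominated by `points` and bounded below by `ref_point` (maximization)."""
    r1, r2 = ref_point
    # Only points strictly better than the reference point contribute.
    better = [(y1, y2) for y1, y2 in points if y1 > r1 and y2 > r2]
    # Sweep from the largest first objective; ties put the larger second objective first.
    better.sort(key=lambda p: (-p[0], -p[1]))
    volume, best_y2 = 0.0, r2
    for y1, y2 in better:
        if y2 > best_y2:  # otherwise the point is dominated by an earlier one
            volume += (y1 - r1) * (y2 - best_y2)
            best_y2 = y2
    return volume


def hypervolume_improvement(new_points, baseline, ref_point):
    """Increase in dominated area when `new_points` are added to `baseline`."""
    combined = list(baseline) + list(new_points)
    return hypervolume_2d(combined, ref_point) - hypervolume_2d(baseline, ref_point)


baseline = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
print(hypervolume_2d(baseline, (0.0, 0.0)))
print(hypervolume_improvement([(2.5, 2.5)], baseline, (0.0, 0.0)))
```

Points that do not exceed the reference point in every objective contribute nothing, which is why the choice of ref_point affects which candidates the acquisition function rewards.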
Multi-objective variants of the LogEI family of acquisition functions, see [Ament2023logei] for details.
A fused C++ kernel is available that accelerates the inner loop of
_compute_log_qehvi by ~1.5-3x on CPU. It is loaded lazily on first
acquisition function construction — either from a pre-compiled extension
(Buck builds) or via JIT compilation (pip installs). If loading fails the
pure-Python implementation is used transparently.
For pip installs, JIT compilation requires a C++ compiler (gcc or
clang) to be available on the system. The ninja build system is
included as a dependency. The kernel is compiled once on first use (~7 s)
and cached by PyTorch for subsequent imports.
- class botorch.acquisition.multi_objective.logei.qLogExpectedHypervolumeImprovement(model, ref_point, partitioning, sampler=None, objective=None, constraints=None, X_pending=None, eta=0.01, fat=True, tau_relu=1e-06, tau_max=0.01)[source]
Bases:
MultiObjectiveMCAcquisitionFunction,SubsetIndexCachingMixinParallel Log Expected Hypervolume Improvement supporting m>=2 outcomes.
See [Ament2023logei] for details and the methodology behind the LogEI family of acquisition functions. Line-by-line differences to the original differentiable expected hypervolume formulation of [Daulton2020qehvi] are described via inline comments in
forward.
Example
>>> model = SingleTaskGP(train_X, train_Y)
>>> ref_point = [0.0, 0.0]
>>> acq = qLogExpectedHypervolumeImprovement(model, ref_point, partitioning)
>>> value = acq(test_X)
- Parameters:
model (Model) – A fitted model.
ref_point (list[float] | Tensor) – A list or tensor with
m elements representing the reference point (in the outcome space) with respect to which the hypervolume is computed. This is a reference point for the objective values (i.e. after applying objective to the samples).
partitioning (NondominatedPartitioning) – A
NondominatedPartitioningmodule that provides the non- dominated front and a partitioning of the non-dominated space in hyper- rectangles. If constraints are present, this partitioning must only include feasible points.sampler (MCSampler | None) – The sampler used to draw base samples. If not given, a sampler is generated using
get_sampler.objective (MCMultiOutputObjective | None) – The MCMultiOutputObjective under which the samples are evaluated. Defaults to
IdentityMultiOutputObjective().constraints (list[Callable[[Tensor], Tensor]] | None) – A list of callables, each mapping a Tensor of dimension
sample_shape x batch-shape x q x mto a Tensor of dimensionsample_shape x batch-shape x q, where negative values imply feasibility. The acquisition function will compute expected feasible hypervolume.X_pending (Tensor | None) – A
batch_shape x m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call. Copied and set to have no gradient.
eta (Tensor | float) – The temperature parameter for the sigmoid function used for the differentiable approximation of the constraints. In case of a float the same eta is used for every constraint in constraints. In case of a tensor the length of the tensor must match the number of provided constraints. The i-th constraint is then estimated with the i-th eta value.
fat (bool) – Toggles the logarithmic / linear asymptotic behavior of the smooth approximation to the ReLU and the maximum.
tau_relu (float) – Temperature parameter controlling the sharpness of the approximation to the ReLU over the
qcandidate points. For further details, see the comments above the definition ofTAU_RELU.tau_max (float) – Temperature parameter controlling the sharpness of the approximation to the
maxoperator over theqcandidate points. For further details, see the comments above the definition ofTAU_MAX.
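The effect of the tau_relu and tau_max temperatures can be seen in a generic scalar sketch. It uses a temperature-scaled softplus as the smooth ReLU and a temperature-scaled log-sum-exp as the smooth maximum. These are standard smooth approximations chosen for illustration; the function names are hypothetical, and the log-space and fat-tailed variants used inside the LogEI family differ in detail.

```python
import math


def smooth_relu(x: float, tau: float) -> float:
    """Softplus with temperature: tau * log(1 + exp(x / tau)), computed stably."""
    return max(x, 0.0) + tau * math.log1p(math.exp(-abs(x) / tau))


def smooth_max(values, tau: float) -> float:
    """Temperature-scaled log-sum-exp, a smooth upper bound on max(values)."""
    m = max(values)
    return m + tau * math.log(sum(math.exp((v - m) / tau) for v in values))


# As tau shrinks, both approximations approach their non-smooth targets.
for tau in (1.0, 0.1, 1e-3):
    print(tau, smooth_relu(-0.5, tau), smooth_max([1.0, 2.0, 3.0], tau))
```

Smaller temperatures give sharper approximations, while larger ones keep gradients informative for inputs far from the kink.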
- forward(X, *args, **kwargs)
Takes in a
batch_shape x q x dX Tensor of t-batches withqd-dim design points each, and returns a Tensor with shapebatch_shape', wherebatch_shape'is the broadcasted batch shape of model and inputX. Should utilize the result ofset_X_pendingas needed to account for pending function evaluations.- Parameters:
acqf (AcquisitionFunction)
X (Any)
args (Any)
kwargs (Any)
- Return type:
Any
- class botorch.acquisition.multi_objective.logei.qLogNoisyExpectedHypervolumeImprovement(model, ref_point, X_baseline, sampler=None, objective=None, constraints=None, X_pending=None, eta=0.001, prune_baseline=False, alpha=0.0, cache_pending=True, max_iep=0, incremental_nehvi=True, cache_root=None, tau_relu=1e-06, tau_max=0.001, fat=True, marginalize_dim=None)[source]
Bases:
NoisyExpectedHypervolumeMixin,qLogExpectedHypervolumeImprovementq-Log Noisy Expected Hypervolume Improvement supporting m>=2 outcomes.
Based on the differentiable hypervolume formulation of [Daulton2021nehvi].
Example
>>> model = SingleTaskGP(train_X, train_Y)
>>> ref_point = [0.0, 0.0]
>>> qNEHVI = qLogNoisyExpectedHypervolumeImprovement(
...     model, ref_point, train_X
... )
>>> qnehvi = qNEHVI(test_X)
- Parameters:
model (Model) – A fitted model.
ref_point (list[float] | Tensor) – A list or tensor with
m elements representing the reference point (in the outcome space) with respect to which the hypervolume is computed. This is a reference point for the objective values (i.e. after applying objective to the samples).
X_baseline (Tensor) – A
r x d-dim Tensor ofrdesign points that have already been observed. These points are considered as potential approximate pareto-optimal design points.sampler (MCSampler | None) – The sampler used to draw base samples. If not given, a sampler is generated using
get_sampler. Note: a pareto front is created for each mc sample, which can be computationally intensive form> 2.objective (MCMultiOutputObjective | None) – The MCMultiOutputObjective under which the samples are evaluated. Defaults to
IdentityMultiOutputObjective().constraints (list[Callable[[Tensor], Tensor]] | None) – A list of callables, each mapping a Tensor of dimension
sample_shape x batch-shape x q x mto a Tensor of dimensionsample_shape x batch-shape x q, where negative values imply feasibility. The acquisition function will compute expected feasible hypervolume.X_pending (Tensor | None) – A
batch_shape x m x d-dim Tensor of m design points that have been submitted for function evaluation, but have not yet been evaluated.
eta (Tensor | float) – The temperature parameter for the sigmoid function used for the differentiable approximation of the constraints. In case of a float the same
etais used for every constraint in constraints. In case of a tensor the length of the tensor must match the number of provided constraints. The i-th constraint is then estimated with the i-thetavalue.prune_baseline (bool) – If True, remove points in
X_baseline that are highly unlikely to be Pareto optimal and better than the reference point. This can significantly improve computation time and is generally recommended. In order to customize pruning parameters, instead manually call prune_inferior_points_multi_objective on X_baseline before instantiating the acquisition function.
alpha (float) – The hyperparameter controlling the approximate non-dominated partitioning. The default value of 0.0 means an exact partitioning is used. As the number of objectives
mincreases, consider increasing this parameter in order to limit computational complexity.cache_pending (bool) – A boolean indicating whether to use cached box decompositions (CBD) for handling pending points. This is generally recommended.
max_iep (int) – The maximum number of pending points before the box decompositions will be recomputed.
incremental_nehvi (bool) – A boolean indicating whether to compute the incremental NEHVI from the
i-th point, where i = 1, ..., q, under sequential greedy optimization, or the full qNEHVI over q points.
cache_root (bool | None) – A boolean indicating whether to cache the root decomposition over
X_baselineand use low-rank updates.marginalize_dim (int | None) – A batch dimension that should be marginalized.
tau_relu (float)
tau_max (float)
fat (bool)
- forward(X, *args, **kwargs)
Takes in a
batch_shape x q x dX Tensor of t-batches withqd-dim design points each, and returns a Tensor with shapebatch_shape', wherebatch_shape'is the broadcasted batch shape of model and inputX. Should utilize the result ofset_X_pendingas needed to account for pending function evaluations.- Parameters:
acqf (AcquisitionFunction)
X (Any)
args (Any)
kwargs (Any)
- Return type:
Any
Multi-Objective Multi-Fidelity Acquisition Functions
Multi-Fidelity Acquisition Functions for Multi-objective Bayesian optimization.
References
F. Irshad, S. Karsch, and A. Döpp. Expected hypervolume improvement for simultaneous multi-objective and multi-fidelity optimization. arXiv preprint arXiv:2112.13901, 2021.
- class botorch.acquisition.multi_objective.multi_fidelity.MOMF(model, ref_point, partitioning, sampler=None, objective=None, constraints=None, eta=0.001, X_pending=None, cost_call=None)[source]
Bases:
qExpectedHypervolumeImprovement
MOMF acquisition function supporting m>=2 outcomes. The model's training outcomes (train_obj) must have a fidelity objective appended as the last outcome. The following example considers a 2-D output space, but ref_point is 3-D because of the appended fidelity objective.
See [Irshad2021MOMF] for details.
Example
>>> model = SingleTaskGP(train_X, train_Y)
>>> ref_point = [0.0, 0.0, 0.0]
>>> cost_func = lambda X: 5 + X[..., -1]
>>> momf = MOMF(model, ref_point, partitioning, cost_call=cost_func)
>>> momf_val = momf(test_X)
- Parameters:
model (Model) – A fitted model. There are two default assumptions in the training data.
train_X should have the fidelity parameter s as the last dimension of the input, and train_Y should contain a trust objective as its last dimension.
ref_point (list[float] | Tensor) – A list or tensor with m+1 elements representing the reference point (in the outcome space) with respect to which the hypervolume is computed. The ‘+1’ accounts for the trust objective appended to train_Y. This is a reference point for the objective values (i.e. after applying objective to the samples).
partitioning (NondominatedPartitioning) – A
NondominatedPartitioningmodule that provides the non- dominated front and a partitioning of the non-dominated space in hyper- rectangles. If constraints are present, this partitioning must only include feasible points.sampler (MCSampler | None) – The sampler used to draw base samples. If not given, a sampler is generated using
get_sampler.objective (MCMultiOutputObjective | None) – The MCMultiOutputObjective under which the samples are evaluated. Defaults to
IdentityMCMultiOutputObjective().constraints (list[Callable[[Tensor], Tensor]] | None) – A list of callables, each mapping a Tensor of dimension
sample_shape x batch-shape x q x mto a Tensor of dimensionsample_shape x batch-shape x q, where negative values imply feasibility. The acquisition function will compute expected feasible hypervolume.X_pending (Tensor | None) – A
batch_shape x m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call. Copied and set to have no gradient.
cost_call (Callable[[Tensor], Tensor] | None) – A callable cost function mapping a Tensor of dimension batch_shape x q x d to a cost Tensor of dimension batch_shape x q x m. Defaults to an AffineCostModel with C(s) = 1 + s.
eta (Tensor | float) – The temperature parameter for the sigmoid function used for the differentiable approximation of the constraints. In case of a float the same eta is used for every constraint in constraints. In case of a tensor the length of the tensor must match the number of provided constraints. The i-th constraint is then estimated with the i-th eta value.
- forward(X, *args, **kwargs)
Takes in a
batch_shape x q x dX Tensor of t-batches withqd-dim design points each, and returns a Tensor with shapebatch_shape', wherebatch_shape'is the broadcasted batch shape of model and inputX. Should utilize the result ofset_X_pendingas needed to account for pending function evaluations.- Parameters:
acqf (AcquisitionFunction)
X (Any)
args (Any)
kwargs (Any)
- Return type:
Any
Multi-Objective Predictive Entropy Search Acquisition Functions
Acquisition function for predictive entropy search for multi-objective Bayesian optimization (PES). The code does not support constraint handling.
NOTE: The PES acquisition might not be differentiable. As a result, we recommend optimizing the acquisition function using finite differences.
References:
E. Garrido-Merchan and D. Hernandez-Lobato. Predictive Entropy Search for Multi-objective Bayesian Optimization with Constraints. Neurocomputing. 2019. The computation follows the procedure described in the supplementary material: https://www.sciencedirect.com/science/article/abs/pii/S0925231219308525
- class botorch.acquisition.multi_objective.predictive_entropy_search.qMultiObjectivePredictiveEntropySearch(model, pareto_sets, maximize=True, X_pending=None, max_ep_iterations=250, ep_jitter=0.0001, test_jitter=0.0001, threshold=0.01)[source]
Bases:
AcquisitionFunctionThe acquisition function for Predictive Entropy Search. The code supports both single and multiple objectives as well as batching.
This acquisition function approximates the mutual information between the observation at a candidate point
Xand the Pareto optimal input using the moment-matching procedure known as expectation propagation (EP).See the Appendix of [Garrido-Merchan2019] for the description of the EP procedure.
IMPORTANT NOTES: (i) The PES acquisition function estimated using EP is sometimes not differentiable, and therefore we advise using a finite-difference estimate of the gradient as opposed to the gradients computed using automatic differentiation, which occasionally outputs nan values.
The source of this non-differentiability is the _update_damping function, which finds the damping factor a that is used to update the EP parameters, a * param_new + (1 - a) * param_old. The damping factor has to ensure that the updated covariance matrices, a * cov_f_new + (1 - a) * cov_f_old, are positive semi-definite. We follow the original paper, which identifies a via a successive halving scheme, i.e. we check a=1, then a=0.5, etc. This procedure means a is a function of the test input X. This function is not differentiable in X.
EP could potentially fail for a number of reasons:
(a) When the sampled Pareto optimal points x_p are poor compared to the training or testing data x_n.
(b) When the training or testing data x_n are close to the Pareto optimal points x_p.
(c) When the convergence threshold is set too small.
Problem (a) occurs because we have to compute the variable alpha = (mean(x_n) - mean(x_p)) / std(x_n - x_p), which becomes very large when x_n is better than x_p with high probability. This leads to a log(0) error when we compute log(1 - cdf(alpha)). We have preemptively clamped some values depending on alpha in order to mitigate this.
Problem (b) occurs because we have to compute matrix inverses for the two-dimensional marginals (x_n, x_p). To address this we manually add jitter to the diagonal of the covariance matrix, i.e. ep_jitter when training and test_jitter when testing. The default choice is not always appropriate because the same jitter is used for the inversion of the covariance and precision matrix, which are on different scales.
TODO: come up with strategy to adaptively update the jitter.
Problem (c) occurs because a smaller threshold usually means that more EP iterations are required. Running too many EP iterations could lead to invertibility problems such as in problem (b). Setting a larger threshold or reducing the number of EP iterations could alleviate this.
The estimated acquisition value could be negative.
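The successive halving scheme for the damping factor described in the notes above can be sketched as follows. This is a hypothetical pure-Python illustration restricted to symmetric 2 x 2 matrices. The helper names, the tolerance, and the cap on the number of halvings are assumptions, and the actual _update_damping function works on batched tensors.

```python
def is_psd_2x2(m, tol=1e-12):
    """Check positive semi-definiteness of a symmetric 2x2 matrix [[a, b], [b, d]]."""
    (a, b), (_, d) = m
    return a >= -tol and d >= -tol and a * d - b * b >= -tol


def blend(a, new, old):
    """Elementwise a * new + (1 - a) * old for nested lists of equal shape."""
    return [[a * n + (1 - a) * o for n, o in zip(rn, ro)] for rn, ro in zip(new, old)]


def damping_factor(cov_new, cov_old, max_halvings=20):
    """Largest a in {1, 1/2, 1/4, ...} for which the blended matrix is PSD."""
    a = 1.0
    for _ in range(max_halvings):
        if is_psd_2x2(blend(a, cov_new, cov_old)):
            return a
        a *= 0.5
    return 0.0  # assumption: fall back to keeping the old parameters


cov_old = [[1.0, 0.0], [0.0, 1.0]]
print(damping_factor([[1.0, 0.0], [0.0, 2.0]], cov_old))   # update is already PSD
print(damping_factor([[1.0, 0.0], [0.0, -2.0]], cov_old))  # needs damping
```

Because a only takes values on the grid 1, 1/2, 1/4, ..., it changes in jumps as the inputs vary, which is the piecewise-constant behavior the notes identify as the source of non-differentiability.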
Multi-objective predictive entropy search acquisition function.
- Parameters:
model (Model) – A fitted batched model with
Mnumber of outputs.pareto_sets (Tensor) – A
num_pareto_samples x P x d-dim tensor containing the Pareto optimal set of inputs, wherePis the number of pareto optimal points. The points in each sample have to be discrete otherwise expectation propagation will fail.maximize (bool) – If true, we consider a maximization problem.
X_pending (Tensor | None) – A
m x d-dim Tensor ofmdesign points that have been submitted for function evaluation, but have not yet been evaluated.max_ep_iterations (int) – The maximum number of expectation propagation iterations. (The minimum number of iterations is set at 3.)
ep_jitter (float) – The amount of jitter added for the matrix inversion that occurs during the expectation propagation update during the training phase.
test_jitter (float) – The amount of jitter added for the matrix inversion that occurs during the expectation propagation update in the testing phase.
threshold (float) – The convergence threshold for expectation propagation. This assesses the relative change in the mean and covariance. We default to one percent change i.e.
threshold = 1e-2.
- forward(X, *args, **kwargs)
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Any) – A
(b) x q x d-dim Tensor of(b)t-batches withqd-dim design points each.acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A
(b)-dim Tensor of acquisition function values at the given design pointsX.- Return type:
Any
- botorch.acquisition.multi_objective.predictive_entropy_search.log_cdf_robust(x)[source]
Computes the logarithm of the normal cumulative distribution function robustly. This uses the approximation log(1-z) ~ -z when z is small:
- if x > 5:
log(cdf(x)) = log(1-cdf(-x)) approx -cdf(-x)
- else:
log(cdf(x)).
- Parameters:
x (Tensor) – a
x_shape-dim Tensor.- Return type:
Tensor
- Returns
A
x_shape-dim Tensor.
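The piecewise rule above can be written out for a single float as a minimal sketch. The scalar helper names are hypothetical and use only the standard library; the documented function operates on tensors.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def log_cdf_robust_scalar(x: float) -> float:
    """Scalar version of the rule: use log(1 - z) ~ -z with z = cdf(-x) when x > 5."""
    if x > 5:
        return -normal_cdf(-x)
    return math.log(normal_cdf(x))


for x in (-3.0, 0.0, 6.0, 10.0):
    print(x, log_cdf_robust_scalar(x))
```

For large x, cdf(x) rounds to 1 in floating point, so a direct log(cdf(x)) loses the small negative value that the approximation -cdf(-x) retains.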
ParEGO: Multi-Objective Acquisition Function with Chebyshev Scalarization
- class botorch.acquisition.multi_objective.parego.qLogNParEGO(model, X_baseline, scalarization_weights=None, sampler=None, objective=None, constraints=None, X_pending=None, eta=0.001, fat=True, prune_baseline=False, cache_root=None, tau_relu=1e-06, tau_max=0.01, incremental=True)[source]
Bases:
qLogNoisyExpectedImprovement,MultiObjectiveMCAcquisitionFunctionq-LogNParEGO supporting m >= 2 outcomes. This acquisition function utilizes qLogNEI to compute the expected improvement over Chebyshev scalarization of the objectives.
This is adapted from qNParEGO proposed in [Daulton2020qehvi] to utilize log-improvement acquisition functions of [Ament2023logei]. See [Knowles2005] for the original ParEGO algorithm.
This implementation assumes maximization of all objectives. If any of the model outputs are to be minimized, either an
objective should be used to negate the model outputs or the scalarization_weights should be provided with negative weights for the outputs to be minimized.
- Args:
model: A fitted multi-output model, producing outputs for m objectives and any number of outcome constraints. NOTE: The model posterior must have a mean attribute.
X_baseline: A batch_shape x r x d-dim Tensor of r design points that have already been observed. These points are considered as the potential best design point.
scalarization_weights: An m-dim Tensor of weights to be used in the Chebyshev scalarization. If omitted, the weights are sampled from the unit simplex.
sampler: The sampler used to draw base samples. See MCAcquisitionFunction for more details.
objective: The MultiOutputMCAcquisitionObjective under which the samples are evaluated before applying Chebyshev scalarization. Defaults to IdentityMultiOutputObjective().
constraints: A list of constraint callables which map a Tensor of posterior samples of dimension sample_shape x batch-shape x q x m' to a sample_shape x batch-shape x q-dim Tensor. The associated constraints are satisfied if constraint(samples) < 0.
X_pending: A batch_shape x q' x d-dim Tensor of q' design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call. Copied and set to have no gradient.
eta: Temperature parameter(s) governing the smoothness of the sigmoid approximation to the constraint indicators. See the docs of compute_(log_)smoothed_constraint_indicator for details.
fat: Toggles the logarithmic / linear asymptotic behavior of the smooth approximation to the ReLU.
prune_baseline: If True, remove points in X_baseline that are highly unlikely to be the best point. This can significantly improve performance and is generally recommended. In order to customize pruning parameters, instead manually call botorch.acquisition.utils.prune_inferior_points on X_baseline before instantiating the acquisition function.
cache_root: A boolean indicating whether to cache the root decomposition over X_baseline and use low-rank updates.
tau_max: Temperature parameter controlling the sharpness of the smooth approximations to max.
tau_relu: Temperature parameter controlling the sharpness of the smooth approximations to ReLU.
incremental: Whether to compute incremental EI over the pending points or compute EI of the joint batch improvement (including pending points).
- Parameters:
model (Model)
X_baseline (Tensor)
scalarization_weights (Tensor | None)
sampler (MCSampler | None)
objective (MCMultiOutputObjective | None)
constraints (list[Callable[[Tensor], Tensor]] | None)
X_pending (Tensor | None)
eta (Tensor | float)
fat (bool)
prune_baseline (bool)
cache_root (bool | None)
tau_relu (float)
tau_max (float)
incremental (bool)
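The Chebyshev scalarization that qLogNParEGO applies to the objectives can be illustrated with a small sketch of the augmented form from [Knowles2005], adapted to maximization. The function name, the explicit lower and upper normalization bounds, and the value rho = 0.05 are assumptions made for this example, and BoTorch's own scalarization utility may normalize differently.

```python
def augmented_chebyshev(y, weights, lower, upper, rho=0.05):
    """Scalarize outcomes `y` (all maximized) into one value to maximize.

    `lower` and `upper` are assumed per-objective bounds used for normalization.
    """
    # Map each outcome to a cost in [0, 1]: 0 at the best bound, 1 at the worst.
    costs = [(u - v) / (u - l) for v, l, u in zip(y, lower, upper)]
    weighted = [w * c for w, c in zip(weights, costs)]
    # Augmented Chebyshev cost (to be minimized), negated so larger is better.
    return -(max(weighted) + rho * sum(weighted))


weights = [0.5, 0.5]
lower, upper = [0.0, 0.0], [1.0, 1.0]
print(augmented_chebyshev([1.0, 0.0], weights, lower, upper))  # one objective neglected
print(augmented_chebyshev([0.5, 0.5], weights, lower, upper))  # balanced trade-off
```

The max term penalizes the worst weighted objective, so with equal weights a balanced point scores higher than one that sacrifices an objective. Varying the weights steers the search toward different regions of the Pareto front.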
The One-Shot Knowledge Gradient
Batch Knowledge Gradient (KG) via one-shot optimization as introduced in [Balandat2020botorch]. For broader discussion of KG see also [Frazier2008knowledge] and [Wu2016parallelkg].
M. Balandat, B. Karrer, D. R. Jiang, S. Daulton, B. Letham, A. G. Wilson, and E. Bakshy. BoTorch: A Framework for Efficient Monte-Carlo Bayesian Optimization. Advances in Neural Information Processing Systems 33, 2020.
P. Frazier, W. Powell, and S. Dayanik. A Knowledge-Gradient policy for sequential information collection. SIAM Journal on Control and Optimization, 2008.
J. Wu and P. Frazier. The parallel knowledge gradient method for batch bayesian optimization. NIPS 2016.
- class botorch.acquisition.knowledge_gradient.qKnowledgeGradient(model, num_fantasies=64, sampler=None, objective=None, posterior_transform=None, inner_sampler=None, X_pending=None, current_value=None)[source]
Bases:
MCAcquisitionFunction,OneShotAcquisitionFunctionBatch Knowledge Gradient using one-shot optimization.
This computes the batch Knowledge Gradient using fantasies for the outer expectation and either the model posterior mean or MC-sampling for the inner expectation.
In addition to the design variables, the input
Xalso includes variables for the optimal designs for each of the fantasy models. For a fixed number of fantasies, all parts ofXcan be optimized in a “one-shot” fashion.q-Knowledge Gradient (one-shot optimization).
- Parameters:
model (Model) – A fitted model. Must support fantasizing.
num_fantasies (int | None) – The number of fantasy points to use. More fantasy points result in a better approximation, at the expense of memory and wall time. Unused if
sampleris specified.sampler (MCSampler | None) – The sampler used to sample fantasy observations. Optional if
num_fantasiesis specified.objective (MCAcquisitionObjective | None) – The objective under which the samples are evaluated. If
None, then the analytic posterior mean is used. Otherwise, the objective is MC-evaluated (using inner_sampler).posterior_transform (PosteriorTransform | None) – An optional PosteriorTransform. If given, this transforms the posterior before evaluation. If
objective is None, then the analytic posterior mean of the transformed posterior is used. Ifobjectiveis given, theinner_sampleris used to draw samples from the transformed posterior, which are then evaluated under theobjective.inner_sampler (MCSampler | None) – The sampler used for inner sampling. Ignored if the objective is
None.X_pending (Tensor | None) – A
m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated.
current_value (Tensor | None) – The current value, i.e. the expected best objective given the observed points
D. If omitted, forward will not return the actual KG value, but the expected best objective given the data setD u X.
- forward(X, *args, **kwargs)
Takes in a
batch_shape x q x dX Tensor of t-batches withqd-dim design points each, and returns a Tensor with shapebatch_shape', wherebatch_shape'is the broadcasted batch shape of model and inputX. Should utilize the result ofset_X_pendingas needed to account for pending function evaluations.- Parameters:
acqf (AcquisitionFunction)
X (Any)
args (Any)
kwargs (Any)
- Return type:
Any
- evaluate(X, *args, **kwargs)
- Parameters:
acqf (AcquisitionFunction)
X (Any)
args (Any)
kwargs (Any)
- Return type:
Any
- get_augmented_q_batch_size(q)[source]
Get augmented q batch size for one-shot optimization.
- Parameters:
q (int) – The number of candidates to consider jointly.
- Returns:
The augmented size for one-shot optimization (including variables parameterizing the fantasy solutions).
- Return type:
int
- extract_candidates(X_full)[source]
We only return X as the set of candidates post-optimization.
- Parameters:
X_full (Tensor) – A
b x (q + num_fantasies) x d-dim Tensor withbt-batches ofq + num_fantasiesdesign points each.- Returns:
A
b x q x d-dim Tensor withbt-batches ofqdesign points each.- Return type:
Tensor
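The relationship between get_augmented_q_batch_size and extract_candidates can be sketched with nested lists standing in for tensors. This is an illustration under the assumption that the fantasy solutions occupy the trailing num_fantasies rows of each t-batch; the free functions below are hypothetical stand-ins for the methods documented above.

```python
def get_augmented_q_batch_size(q: int, num_fantasies: int) -> int:
    """Number of rows per t-batch: q candidates plus one solution per fantasy."""
    return q + num_fantasies


def extract_candidates(X_full, num_fantasies: int):
    """Drop the trailing `num_fantasies` rows of every t-batch in a nested list."""
    return [batch[: len(batch) - num_fantasies] for batch in X_full]


q, num_fantasies, d = 2, 3, 1
q_aug = get_augmented_q_batch_size(q, num_fantasies)
# One t-batch (b = 1) of q_aug points in d = 1 dimension.
X_full = [[[float(i)] for i in range(q_aug)]]
print(q_aug)
print(extract_candidates(X_full, num_fantasies))
```

An optimizer works on the full b x (q + num_fantasies) x d input, and only the leading q rows are reported as candidates afterwards.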
- class botorch.acquisition.knowledge_gradient.qMultiFidelityKnowledgeGradient(model, num_fantasies=64, sampler=None, objective=None, posterior_transform=None, inner_sampler=None, X_pending=None, current_value=None, cost_aware_utility=None, project=<function qMultiFidelityKnowledgeGradient.<lambda>>, expand=<function qMultiFidelityKnowledgeGradient.<lambda>>, valfunc_cls=None, valfunc_argfac=None)[source]
Bases:
qKnowledgeGradientBatch Knowledge Gradient for multi-fidelity optimization.
A version of
qKnowledgeGradientthat supports multi-fidelity optimization via aCostAwareUtilityand theprojectandexpandoperators. If none of these are set, this acquisition function reduces toqKnowledgeGradient. Throughvalfunc_clsandvalfunc_argfac, this can be changed into a custom multi-fidelity acquisition function (it is only KG if the terminal value is computed using a posterior mean).Multi-Fidelity q-Knowledge Gradient (one-shot optimization).
- Parameters:
model (Model) – A fitted model. Must support fantasizing.
num_fantasies (int | None) – The number of fantasy points to use. More fantasy points result in a better approximation, at the expense of memory and wall time. Unused if
sampleris specified.sampler (MCSampler | None) – The sampler used to sample fantasy observations. Optional if
num_fantasiesis specified.objective (MCAcquisitionObjective | None) – The objective under which the samples are evaluated. If
None, then the analytic posterior mean is used. Otherwise, the objective is MC-evaluated (using inner_sampler).posterior_transform (PosteriorTransform | None) – An optional PosteriorTransform. If given, this transforms the posterior before evaluation. If
objective is None, then the analytic posterior mean of the transformed posterior is used. Ifobjectiveis given, theinner_sampleris used to draw samples from the transformed posterior, which are then evaluated under theobjective.inner_sampler (MCSampler | None) – The sampler used for inner sampling. Ignored if the objective is
None.X_pending (Tensor | None) – A
m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated.
current_value (Tensor | None) – The current value, i.e. the expected best objective given the observed points
D. If omitted, forward will not return the actual KG value, but the expected best objective given the data setD u X.cost_aware_utility (CostAwareUtility | None) – A CostAwareUtility computing the cost-transformed utility from a candidate set and samples of increases in utility.
project (Callable[[Tensor], Tensor]) – A callable mapping a
batch_shape x q x dtensor of design points to a tensor with shapebatch_shape x q_term x dprojected to the desired target set (e.g. the target fidelities in case of multi-fidelity optimization). For the basic case,q_term = q.expand (Callable[[Tensor], Tensor]) – A callable mapping a
batch_shape x q x dinput tensor to abatch_shape x (q + q_e)' x d-dim output tensor, where theq_eadditional points in each q-batch correspond to additional (“trace”) observations.valfunc_cls (type[AcquisitionFunction] | None) – An acquisition function class to be used as the terminal value function.
valfunc_argfac (Callable[[Model], dict[str, Any]] | None) – An argument factory, i.e. callable that maps a
Modelto a dictionary of kwargs for the terminal value function (e.g.best_fforExpectedImprovement).
- property cost_sampler
- forward(X, *args, **kwargs)
Takes in a
batch_shape x q x dX Tensor of t-batches withqd-dim design points each, and returns a Tensor with shapebatch_shape', wherebatch_shape'is the broadcasted batch shape of model and inputX. Should utilize the result ofset_X_pendingas needed to account for pending function evaluations.- Parameters:
acqf (AcquisitionFunction)
X (Any)
args (Any)
kwargs (Any)
- Return type:
Any
- class botorch.acquisition.knowledge_gradient.ProjectedAcquisitionFunction(base_value_function, project)[source]
Bases:
AcquisitionFunction
Defines a wrapper around an AcquisitionFunction that incorporates the project operator. Typically used to handle value functions in look-ahead methods.
- Parameters:
base_value_function (AcquisitionFunction) – The wrapped AcquisitionFunction.
project (Callable[[Tensor], Tensor]) – A callable mapping a batch_shape x q x d tensor of design points to a tensor with shape batch_shape x q_term x d projected to the desired target set (e.g. the target fidelities in case of multi-fidelity optimization). For the basic case, q_term = q.
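As a rough illustration of what a project callable does, the sketch below overwrites the fidelity column of every design point with a target value. It uses nested Python lists in place of a q x d tensor so that it runs without any dependencies; make_project and the column index 2 are hypothetical names and values chosen for this example, not part of the library.

```python
from typing import Callable, Dict, List

Point = List[float]


def make_project(target_fidelities: Dict[int, float]) -> Callable[[List[Point]], List[Point]]:
    """Build a hypothetical project callable that sets fidelity columns to target values."""

    def project(X: List[Point]) -> List[Point]:
        projected = []
        for x in X:
            x_new = list(x)  # copy so the input points are left untouched
            for col, value in target_fidelities.items():
                x_new[col] = value
            projected.append(x_new)
        return projected

    return project


# Two design points with d = 3, where column 2 is assumed to be the fidelity.
project = make_project({2: 1.0})
X = [[0.1, 0.2, 0.5], [0.3, 0.4, 0.25]]
X_proj = project(X)
```

Here the output has as many points as the input, which corresponds to the basic case q_term = q.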
Multi-Step Lookahead Acquisition Functions
A general implementation of multi-step look-ahead acquisition function with configurable value functions. See [Jiang2020multistep].
S. Jiang, D. R. Jiang, M. Balandat, B. Karrer, J. Gardner, and R. Garnett. Efficient Nonmyopic Bayesian Optimization via One-Shot Multi-Step Trees. In Advances in Neural Information Processing Systems 33, 2020.
- class botorch.acquisition.multi_step_lookahead.qMultiStepLookahead(model, batch_sizes, num_fantasies=None, samplers=None, valfunc_cls=None, valfunc_argfacs=None, objective=None, posterior_transform=None, inner_mc_samples=None, X_pending=None, collapse_fantasy_base_samples=True)[source]
Bases:
MCAcquisitionFunction, OneShotAcquisitionFunction
MC-based batch Multi-Step Look-Ahead (one-shot optimization).
q-Multi-Step Look-Ahead (one-shot optimization).
Performs a k-step lookahead by means of repeated fantasizing.
The stage value functions can be specified by passing the respective class objects via the valfunc_cls list. Optionally, valfunc_argfacs takes a list of callables that generate additional kwargs for these constructors. By default, valfunc_cls will be chosen as [None, ..., None, PosteriorMean], which corresponds to the (parallel) multi-step KnowledgeGradient. If, in addition, k=1 and q_1 = 1, this reduces to the classic Knowledge Gradient.
WARNING: The complexity of evaluating this function is exponential in the number of lookahead steps!
- Parameters:
model (Model) – A fitted model.
batch_sizes (list[int]) – A list [q_1, ..., q_k] containing the batch sizes for the k look-ahead steps.
num_fantasies (list[int] | None) – A list [f_1, ..., f_k] containing the number of fantasy points to use for the k look-ahead steps.
samplers (list[MCSampler] | None) – A list of MCSampler objects to be used for sampling fantasies in each stage.
valfunc_cls (list[type[AcquisitionFunction] | None] | None) – A list of k + 1 acquisition function classes to be used as the (stage + terminal) value functions. Each element (except for the last one) can be None, in which case a zero stage value is assumed for the respective stage. If None, this defaults to [None, ..., None, PosteriorMean].
valfunc_argfacs (list[TAcqfArgConstructor | None] | None) – A list of k + 1 “argument factories”, i.e. callables that map a Model and input tensor X to a dictionary of kwargs for the respective stage value function constructor (e.g. best_f for ExpectedImprovement). If None, only the standard (model, sampler and objective) kwargs will be used.
objective (MCAcquisitionObjective | None) – The objective under which the output is evaluated. If None, use the model output (requires a single-output model or a posterior transform). Otherwise the objective is MC-evaluated (using inner_sampler).
posterior_transform (PosteriorTransform | None) – An optional PosteriorTransform. If given, this transforms the posterior before evaluation. If objective is None, then the output of the transformed posterior is used. If objective is given, the inner_sampler is used to draw samples from the transformed posterior, which are then evaluated under the objective.
inner_mc_samples (list[int] | None) – A list [n_0, ..., n_k] containing the number of MC samples to be used for evaluating the stage value function. Ignored if the objective is None.
X_pending (Tensor | None) – A m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call. Copied and set to have no gradient.
collapse_fantasy_base_samples (bool) – If True, collapse_batch_dims of the Samplers will be applied on fantasy batch dimensions as well, meaning that base samples are the same in all subtrees starting from the same level.
- forward(X, *args, **kwargs)
Takes in a batch_shape x q x d X Tensor of t-batches with q d-dim design points each, and returns a Tensor with shape batch_shape', where batch_shape' is the broadcasted batch shape of model and input X. Should utilize the result of set_X_pending as needed to account for pending function evaluations.
- Parameters:
acqf (AcquisitionFunction)
X (Any)
args (Any)
kwargs (Any)
- Return type:
Any
- get_augmented_q_batch_size(q)[source]
Get augmented q batch size for one-shot optimization.
- Parameters:
q (int) – The number of candidates to consider jointly.
- Returns:
The augmented size for one-shot optimization (including variables parameterizing the fantasy solutions): q_0 + f_1 q_1 + f_2 f_1 q_2 + ...
- Return type:
int
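The formula q_0 + f_1 q_1 + f_2 f_1 q_2 + ... can be checked with a few lines of plain Python. This is only a sketch of the arithmetic, not the library's implementation; augmented_q_batch_size is a hypothetical helper, and q_0 is taken to be the q passed to the method.

```python
from typing import List


def augmented_q_batch_size(q: int, batch_sizes: List[int], num_fantasies: List[int]) -> int:
    """Return q' = q_0 + f_1 q_1 + f_2 f_1 q_2 + ... with q_0 = q."""
    total = q
    tree_width = 1  # running product f_i * ... * f_1
    for q_i, f_i in zip(batch_sizes, num_fantasies):
        tree_width *= f_i
        total += tree_width * q_i
    return total


# Two look-ahead steps with q_1 = q_2 = 1 and f_1 = 4, f_2 = 3:
# q' = 2 + 4 * 1 + 3 * 4 * 1 = 18
q_aug = augmented_q_batch_size(q=2, batch_sizes=[1, 1], num_fantasies=[4, 3])
```

Because the running product of fantasy counts multiplies at every step, the augmented size grows exponentially with the number of look-ahead steps, in line with the complexity warning for this class.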
- get_split_shapes(X)[source]
Get the split shapes from X.
- Parameters:
X (Tensor) – A batch_shape x q_aug x d-dim tensor including fantasy points.
- Returns:
A 3-tuple (batch_shape, shapes, sizes), where shapes[i] = f_i x .... x f_1 x batch_shape x q_i x d and sizes[i] = f_i * ... * f_1 * q_i.
- Return type:
tuple[Size, list[Size], list[int]]
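To make the shape bookkeeping concrete, the following sketch computes the per-stage shapes and sizes from the definitions above using plain tuples. It assumes that stage 0 has no fantasy dimensions (shape batch_shape x q_0 x d); split_shapes is a hypothetical stand-in and does not operate on tensors.

```python
import math
from typing import List, Tuple


def split_shapes(
    batch_shape: Tuple[int, ...],
    q: int,
    batch_sizes: List[int],
    num_fantasies: List[int],
    d: int,
) -> Tuple[List[Tuple[int, ...]], List[int]]:
    """Return (shapes, sizes) with shapes[i] = f_i x ... x f_1 x batch_shape x q_i x d."""
    shapes = [tuple(batch_shape) + (q, d)]  # stage 0: no fantasy dimensions
    sizes = [q]
    fantasy_dims: Tuple[int, ...] = ()
    for q_i, f_i in zip(batch_sizes, num_fantasies):
        fantasy_dims = (f_i,) + fantasy_dims  # newest fantasy dimension goes first
        shapes.append(fantasy_dims + tuple(batch_shape) + (q_i, d))
        sizes.append(math.prod(fantasy_dims) * q_i)  # f_i * ... * f_1 * q_i
    return shapes, sizes


shapes, sizes = split_shapes(
    batch_shape=(5,), q=2, batch_sizes=[1, 1], num_fantasies=[4, 3], d=3
)
```

The sizes sum to the augmented batch size q', so they describe how the q_aug dimension of X is divided among the stages.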
- get_multi_step_tree_input_representation(X)[source]
Get the multi-step tree representation of X.
- Parameters:
X (Tensor) – A batch_shape x q' x d-dim Tensor with q' design points for each batch, where q' = q_0 + f_1 q_1 + f_2 f_1 q_2 + .... Here q_i is the number of candidates jointly considered in look-ahead step i, and f_i is the respective number of fantasies.
- Returns:
A list [X_j, ..., X_k] of tensors, where X_i has shape f_i x .... x f_1 x batch_shape x q_i x d.
- Return type:
list[Tensor]
- extract_candidates(X_full)[source]
We only return X as the set of candidates post-optimization.
- Parameters:
X_full (Tensor) – A batch_shape x q' x d-dim Tensor with q' design points for each batch, where q' = q + f_1 q_1 + f_2 f_1 q_2 + ....
- Returns:
A batch_shape x q x d-dim Tensor with q design points for each batch.
- Return type:
Tensor
- get_induced_fantasy_model(X)[source]
Fantasy model induced by X.
- Parameters:
X (Tensor) – A batch_shape x q' x d-dim Tensor with q' design points for each batch, where q' = q_0 + f_1 q_1 + f_2 f_1 q_2 + .... Here q_i is the number of candidates jointly considered in look-ahead step i, and f_i is the respective number of fantasies.
- Returns:
The fantasy model induced by X.
- Return type:
Model
- botorch.acquisition.multi_step_lookahead.warmstart_multistep(acq_function, bounds, num_restarts, raw_samples, full_optimizer)[source]
Warm-start initialization for multi-step look-ahead acquisition functions.
For now uses the same q' as in full_optimizer. TODO: allow different q.
- Parameters:
acq_function (qMultiStepLookahead) – A qMultiStepLookahead acquisition function.
bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of features.
num_restarts (int) – The number of starting points for multistart acquisition function optimization.
raw_samples (int) – The number of raw samples to consider in the initialization heuristic.
full_optimizer (Tensor) – The full tree of optimizers of the previous iteration of shape batch_shape x q' x d. Typically obtained by passing return_best_only=False and return_full_tree=True into optimize_acqf.
- Returns:
A num_restarts x q' x d tensor of initial points for optimization.
- Return type:
Tensor
This is a very simple initialization heuristic. TODO: Use the observed values to identify the fantasy sub-tree that is closest to the observed value.
Max-value Entropy Search Acquisition Functions
Acquisition functions for Max-value Entropy Search (MES), General Information-Based Bayesian Optimization (GIBBON), and multi-fidelity MES with noisy observations and trace observations.
References
Moss, H. B., et al., GIBBON: General-purpose Information-Based Bayesian OptimisatioN. Journal of Machine Learning Research, 2021.
- class botorch.acquisition.max_value_entropy_search.MaxValueBase(model, candidate_set, num_mv_samples=10, posterior_transform=None, use_gumbel=True, maximize=True, X_pending=None, train_inputs=None)[source]
Bases:
AcquisitionFunction, ABC
Abstract base class for acquisition functions based on Max-value Entropy Search, using discrete max posterior sampling.
This class provides the basic building blocks for constructing max-value entropy-based acquisition functions along the lines of [Wang2017mves]. It provides basic functionality for sampling posterior maximum values from a surrogate Gaussian process model using a discrete set of candidates. It supports either exact (w.r.t. the candidate set) sampling, or using a Gumbel approximation.
Subclasses must implement _compute_information_gain.
Single-outcome max-value entropy search-based acquisition functions based on discrete MV sampling.
- Parameters:
model (Model) – A fitted single-outcome model.
candidate_set (Tensor) – A n x d Tensor including n candidate points to discretize the design space. Max values are sampled from the (joint) model posterior over these points.
num_mv_samples (int) – Number of max value samples.
posterior_transform (PosteriorTransform | None) – A PosteriorTransform. If using a multi-output model, a PosteriorTransform that transforms the multi-output posterior into a single-output posterior is required.
use_gumbel (bool) – If True, use Gumbel approximation to sample the max values.
maximize (bool) – If True, consider the problem a maximization problem.
X_pending (Tensor | None) – A m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated.
train_inputs (Tensor | None) – A n_train x d Tensor that the model has been fitted on. Not required if the model is an instance of a GPyTorch ExactGP model.
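One common way to build a Gumbel approximation for max-value sampling is quantile matching: treat the posterior values at the candidate points as independent Gaussians, find the quartiles of the resulting distribution of the maximum, and fit a Gumbel distribution to them. The sketch below shows this idea with the standard library only. It is an illustration under those assumptions rather than the library's code, and the function names and the example means and standard deviations are hypothetical.

```python
import math
import random
from statistics import NormalDist
from typing import List, Tuple


def max_cdf(y: float, means: List[float], stds: List[float]) -> float:
    """P(max_i f(x_i) <= y), treating the candidate values as independent Gaussians."""
    p = 1.0
    for m, s in zip(means, stds):
        p *= NormalDist(m, s).cdf(y)
    return p


def quantile_of_max(p: float, means: List[float], stds: List[float], tol: float = 1e-8) -> float:
    """Find y with max_cdf(y) = p by bisection."""
    lo = min(m - 5 * s for m, s in zip(means, stds))
    hi = max(m + 5 * s for m, s in zip(means, stds))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if max_cdf(mid, means, stds) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def fit_gumbel_to_max(means: List[float], stds: List[float]) -> Tuple[float, float]:
    """Match the quartiles of a Gumbel(a, b) to the quartiles of the max distribution."""
    q25, q50, q75 = (quantile_of_max(p, means, stds) for p in (0.25, 0.5, 0.75))
    b = (q25 - q75) / (math.log(math.log(4 / 3)) - math.log(math.log(4)))
    a = q50 + b * math.log(math.log(2))
    return a, b


def sample_max_values(a: float, b: float, num_mv_samples: int, rng: random.Random) -> List[float]:
    """Draw max-value samples by inverting the Gumbel CDF exp(-exp(-(y - a) / b))."""
    samples = []
    for _ in range(num_mv_samples):
        u = min(max(rng.random(), 1e-12), 1 - 1e-12)  # keep u strictly inside (0, 1)
        samples.append(a - b * math.log(-math.log(u)))
    return samples


# Hypothetical posterior means and standard deviations at three candidate points.
means, stds = [0.0, 1.0, 0.5], [1.0, 0.5, 0.8]
a, b = fit_gumbel_to_max(means, stds)
mv_samples = sample_max_values(a, b, num_mv_samples=10, rng=random.Random(0))
```

The independence assumption makes the CDF of the maximum a simple product, which is what makes this cheaper than sampling the joint posterior over the candidate set exactly.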
- forward(X, *args, **kwargs)
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Any) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.
acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A (b)-dim Tensor of acquisition function values at the given design points X.
- Return type:
Any
- class botorch.acquisition.max_value_entropy_search.qMaxValueEntropy(model, candidate_set, num_fantasies=16, num_mv_samples=10, num_y_samples=128, posterior_transform=None, use_gumbel=True, maximize=True, X_pending=None, train_inputs=None)[source]
Bases:
MaxValueBase, MCSamplerMixin
The acquisition function for Max-value Entropy Search.
This acquisition function computes the mutual information of max values and a candidate point X. See [Wang2017mves] for a detailed discussion.
The model must be single-outcome. The batch case q > 1 is supported through cyclic optimization and fantasies.
Example
>>> model = SingleTaskGP(train_X, train_Y)
>>> candidate_set = torch.rand(1000, bounds.size(1))
>>> candidate_set = bounds[0] + (bounds[1] - bounds[0]) * candidate_set
>>> MES = qMaxValueEntropy(model, candidate_set)
>>> mes = MES(test_X)
Single-outcome max-value entropy search acquisition function.
- Parameters:
model (Model) – A fitted single-outcome model.
candidate_set (Tensor) – A n x d Tensor including n candidate points to discretize the design space. Max values are sampled from the (joint) model posterior over these points.
num_fantasies (int) – Number of fantasies to generate. The higher this number the more accurate the model (at the expense of model complexity, wall time and memory). Ignored if X_pending is None.
num_mv_samples (int) – Number of max value samples.
num_y_samples (int) – Number of posterior samples at specific design point X.
posterior_transform (PosteriorTransform | None) – A PosteriorTransform. If using a multi-output model, a PosteriorTransform that transforms the multi-output posterior into a single-output posterior is required.
use_gumbel (bool) – If True, use Gumbel approximation to sample the max values.
maximize (bool) – If True, consider the problem a maximization problem.
X_pending (Tensor | None) – A m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated.
train_inputs (Tensor | None) – A n_train x d Tensor that the model has been fitted on. Not required if the model is an instance of a GPyTorch ExactGP model.
- set_X_pending(X_pending=None)[source]
Set pending points.
Informs the acquisition function about pending design points, fantasizes the model on the pending points and draws max-value samples from the fantasized model posterior.
- Parameters:
X_pending (Tensor | None) – m x d Tensor with m d-dim design points that have been submitted for evaluation but have not yet been evaluated.
- Return type:
None
- class botorch.acquisition.max_value_entropy_search.qLowerBoundMaxValueEntropy(model, candidate_set, num_mv_samples=10, posterior_transform=None, use_gumbel=True, maximize=True, X_pending=None, train_inputs=None)[source]
Bases:
MaxValueBase
The acquisition function for General-purpose Information-Based Bayesian Optimisation (GIBBON).
This acquisition function provides a computationally cheap approximation of the mutual information between max values and a batch of candidate points X. See [Moss2021gibbon] for a detailed discussion.
The model must be single-outcome, unless using a PosteriorTransform. q > 1 is supported through greedy batch filling.
Example
>>> model = SingleTaskGP(train_X, train_Y)
>>> candidate_set = torch.rand(1000, bounds.size(1))
>>> candidate_set = bounds[0] + (bounds[1] - bounds[0]) * candidate_set
>>> qGIBBON = qLowerBoundMaxValueEntropy(model, candidate_set)
>>> candidates, _ = optimize_acqf(qGIBBON, bounds, q=5)
Lower bound max-value entropy search acquisition function (GIBBON).
- Parameters:
model (Model) – A fitted single-outcome model.
candidate_set (Tensor) – A n x d Tensor including n candidate points to discretize the design space. Max values are sampled from the (joint) model posterior over these points.
num_mv_samples (int) – Number of max value samples.
posterior_transform (PosteriorTransform | None) – A PosteriorTransform. If using a multi-output model, a PosteriorTransform that transforms the multi-output posterior into a single-output posterior is required.
use_gumbel (bool) – If True, use Gumbel approximation to sample the max values.
maximize (bool) – If True, consider the problem a maximization problem.
X_pending (Tensor | None) – A m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated.
train_inputs (Tensor | None) – A n_train x d Tensor that the model has been fitted on. Not required if the model is an instance of a GPyTorch ExactGP model.
- class botorch.acquisition.max_value_entropy_search.qMultiFidelityMaxValueEntropy(model, candidate_set, num_fantasies=16, num_mv_samples=10, num_y_samples=128, posterior_transform=None, use_gumbel=True, maximize=True, X_pending=None, cost_aware_utility=None, project=<function qMultiFidelityMaxValueEntropy.<lambda>>, expand=<function qMultiFidelityMaxValueEntropy.<lambda>>)[source]
Bases:
qMaxValueEntropy
Multi-fidelity max-value entropy.
The acquisition function for multi-fidelity max-value entropy search with support for trace observations. See [Takeno2020mfmves] for a detailed discussion of the basic ideas on multi-fidelity MES (note that this implementation is somewhat different).
The model must be single-outcome. The batch case q > 1 is supported through cyclic optimization and fantasies.
Example
>>> model = SingleTaskGP(train_X, train_Y)
>>> candidate_set = torch.rand(1000, bounds.size(1))
>>> candidate_set = bounds[0] + (bounds[1] - bounds[0]) * candidate_set
>>> MF_MES = qMultiFidelityMaxValueEntropy(model, candidate_set)
>>> mf_mes = MF_MES(test_X)
Single-outcome max-value entropy search acquisition function.
- Parameters:
model (Model) – A fitted single-outcome model.
candidate_set (Tensor) – A n x d or n x (d + s) Tensor including n candidate points to discretize the design space, which will be used to sample the max values from their posteriors. s is the number of fidelity dimensions. The project callable is applied to the candidate set before use, so if it handles inserting fidelity dimensions (e.g., project_to_target_fidelity with d specified), candidate_set can omit them.
cost_aware_utility (CostAwareUtility | None) – A CostAwareUtility computing the cost-transformed utility from a candidate set and samples of increases in utility.
num_fantasies (int) – Number of fantasies to generate. The higher this number the more accurate the model (at the expense of model complexity and performance). It is only used when X_pending is not None.
num_mv_samples (int) – Number of max value samples.
num_y_samples (int) – Number of posterior samples at specific design point X.
posterior_transform (PosteriorTransform | None) – A PosteriorTransform. If using a multi-output model, a PosteriorTransform that transforms the multi-output posterior into a single-output posterior is required.
use_gumbel (bool) – If True, use Gumbel approximation to sample the max values.
maximize (bool) – If True, consider the problem a maximization problem.
X_pending (Tensor | None) – A m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated.
project (Callable[[Tensor], Tensor]) – A callable mapping a batch_shape x q x d tensor of design points to a tensor of the same shape projected to the desired target set (e.g. the target fidelities in case of multi-fidelity optimization). This is also applied to the candidate set during initialization.
expand (Callable[[Tensor], Tensor]) – A callable mapping a batch_shape x q x d input tensor to a batch_shape x (q + q_e)' x d-dim output tensor, where the q_e additional points in each q-batch correspond to additional (“trace”) observations.
- property cost_sampler
- forward(X, *args, **kwargs)
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Any) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.
acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A (b)-dim Tensor of acquisition function values at the given design points X.
- Return type:
Any
- class botorch.acquisition.max_value_entropy_search.qMultiFidelityLowerBoundMaxValueEntropy(model, candidate_set, num_fantasies=16, num_mv_samples=10, num_y_samples=128, X_pending=None, posterior_transform=None, use_gumbel=True, maximize=True, cost_aware_utility=None, project=<function qMultiFidelityLowerBoundMaxValueEntropy.<lambda>>, expand=None)[source]
Bases:
qMultiFidelityMaxValueEntropy
Multi-fidelity acquisition function for General-purpose Information-Based Bayesian optimization (GIBBON).
The acquisition function for multi-fidelity max-value entropy search with support for trace observations. See [Takeno2020mfmves] for a detailed discussion of the basic ideas on multi-fidelity MES (note that this implementation is somewhat different). This acquisition function is similar to qMultiFidelityMaxValueEntropy but computes the information gain from the lower bound described in [Moss2021gibbon].
The model must be single-outcome, unless using a PosteriorTransform. The batch case q > 1 is supported through cyclic optimization and fantasies.
Example
>>> model = SingleTaskGP(train_X, train_Y)
>>> candidate_set = torch.rand(1000, bounds.size(1))
>>> candidate_set = bounds[0] + (bounds[1] - bounds[0]) * candidate_set
>>> MF_qGIBBON = qMultiFidelityLowerBoundMaxValueEntropy(model, candidate_set)
>>> mf_gibbon = MF_qGIBBON(test_X)
Single-outcome max-value entropy search acquisition function.
- Parameters:
model (Model) – A fitted single-outcome model.
candidate_set (Tensor) – A n x d or n x (d + s) Tensor including n candidate points to discretize the design space, which will be used to sample the max values from their posteriors. s is the number of fidelity dimensions. The project callable is applied to the candidate set before use, so if it handles inserting fidelity dimensions (e.g., project_to_target_fidelity with d specified), candidate_set can omit them.
cost_aware_utility (CostAwareUtility | None) – A CostAwareUtility computing the cost-transformed utility from a candidate set and samples of increases in utility.
num_fantasies (int) – Number of fantasies to generate. The higher this number the more accurate the model (at the expense of model complexity and performance). It is only used when X_pending is not None.
num_mv_samples (int) – Number of max value samples.
num_y_samples (int) – Number of posterior samples at specific design point X.
posterior_transform (PosteriorTransform | None) – A PosteriorTransform. If using a multi-output model, a PosteriorTransform that transforms the multi-output posterior into a single-output posterior is required.
X_pending (Tensor | None) – A m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated.
use_gumbel (bool) – If True, use Gumbel approximation to sample the max values.
maximize (bool) – If True, consider the problem a maximization problem.
project (Callable[[Tensor], Tensor]) – A callable mapping a batch_shape x q x d tensor of design points to a tensor of the same shape projected to the desired target set (e.g. the target fidelities in case of multi-fidelity optimization).
expand (Callable[[Tensor], Tensor] | None) – A callable mapping a batch_shape x q x d input tensor to a batch_shape x (q + q_e)' x d-dim output tensor, where the q_e additional points in each q-batch correspond to additional (“trace”) observations. NOTE: This is currently not supported. It leads to wrong outputs.
- set_X_pending(X_pending=None)[source]
For the non-lower bound methods, X_pending creates a new fantasy model with a batch shape of (num_fantasies, batch_shape, m). Lower bound methods don’t operate with the same logic, so we don’t need to create a new fantasy model. Moreover, this causes shape issues in the forward pass due to tensor broadcasting inconsistencies.
- Parameters:
X_pending (Tensor | None)
- Return type:
None
Joint Entropy Search Acquisition Functions
Acquisition function for joint entropy search (JES).
C. Hvarfner, F. Hutter, L. Nardi, Joint Entropy Search for Maximally-informed Bayesian Optimization. In Proceedings of the Annual Conference on Neural Information Processing Systems (NeurIPS), 2022.
- class botorch.acquisition.joint_entropy_search.qJointEntropySearch(model, optimal_inputs, optimal_outputs, condition_noiseless=True, posterior_transform=None, X_pending=None, estimation_type='LB', num_samples=64)[source]
Bases:
AcquisitionFunction, MCSamplerMixin
The acquisition function for Joint Entropy Search, where batches q > 1 are supported through the lower bound formulation.
This acquisition function computes the mutual information between the observation at a candidate point X and the optimal input-output pair.
See [Tu2022joint] for a discussion on the estimation procedure.
Joint entropy search acquisition function.
- Parameters:
model (Model) – A fitted single-outcome model.
optimal_inputs (Tensor) – A num_samples x d-dim tensor containing the sampled optimal inputs of dimension d. We assume for simplicity that each sample only contains one optimal set of inputs.
optimal_outputs (Tensor) – A num_samples x 1-dim Tensor containing the optimal set of objectives of dimension 1.
condition_noiseless (bool) – Whether to condition on noiseless optimal observations f* [Hvarfner2022joint] or noisy optimal observations y* [Tu2022joint]. These are sampled identically, so this only controls the fashion in which the GP is reshaped as a result of conditioning on the optimum.
posterior_transform (PosteriorTransform | None) – PosteriorTransform to negate or scalarize the output.
estimation_type (str) – A string to determine which entropy estimate is computed: “Lower bound” (“LB”) or “Monte Carlo” (“MC”). The lower bound is recommended due to the relatively high variance of the MC estimator.
X_pending (Tensor | None) – A m x d-dim Tensor of m design points that have been submitted for function evaluation, but have not yet been evaluated.
num_samples (int) – The number of Monte Carlo samples used for the Monte Carlo estimate.
- forward(X, *args, **kwargs)
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Any) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.
acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A (b)-dim Tensor of acquisition function values at the given design points X.
- Return type:
Any
Predictive Entropy Search Acquisition Functions
Acquisition function for predictive entropy search (PES). The code utilizes the implementation designed for the multi-objective batch setting.
NOTE: The PES acquisition might not be differentiable. As a result, we recommend optimizing the acquisition function using finite differences.
- class botorch.acquisition.predictive_entropy_search.qPredictiveEntropySearch(model, optimal_inputs, maximize=True, X_pending=None, max_ep_iterations=250, ep_jitter=0.0001, test_jitter=0.0001, threshold=0.01)[source]
Bases:
qMultiObjectivePredictiveEntropySearch
The acquisition function for Predictive Entropy Search.
This acquisition function approximates the mutual information between the observation at a candidate point X and the optimal set of inputs using expectation propagation (EP).
NOTES: (i) The expectation propagation procedure can potentially fail due to unstable EP updates. This is, however, unlikely to happen in the single-objective setting because there are far fewer EP factors. The jitter added in the training phase (ep_jitter) and testing phase (test_jitter) can be increased to prevent these failures from happening. More details are given in the description of qMultiObjectivePredictiveEntropySearch.
(ii) The estimated acquisition value could be negative.
Predictive entropy search acquisition function.
- Parameters:
model (Model) – A fitted single-outcome model.
optimal_inputs (Tensor) – A num_samples x d-dim tensor containing the sampled optimal inputs of dimension d. We assume for simplicity that each sample only contains one optimal set of inputs.
maximize (bool) – If true, we consider a maximization problem.
X_pending (Tensor | None) – A m x d-dim Tensor of m design points that have been submitted for function evaluation, but have not yet been evaluated.
max_ep_iterations (int) – The maximum number of expectation propagation iterations. (The minimum number of iterations is set at 3.)
ep_jitter (float) – The amount of jitter added for the matrix inversion that occurs during the expectation propagation update during the training phase.
test_jitter (float) – The amount of jitter added for the matrix inversion that occurs during the expectation propagation update in the testing phase.
threshold (float) – The convergence threshold for expectation propagation. This assesses the relative change in the mean and covariance. We default to a one percent change, i.e. threshold = 1e-2.
- forward(X, *args, **kwargs)
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Any) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.
acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A (b)-dim Tensor of acquisition function values at the given design points X.
- Return type:
Any
Active Learning Acquisition Functions
Active learning acquisition functions.
S. Seo, M. Wallat, T. Graepel, and K. Obermayer. Gaussian process regression: Active data selection and test point rejection. IJCNN 2000.
X. Chen and Q. Zhou. Sequential experimental designs for stochastic kriging. Winter Simulation Conference 2014.
M. Binois, J. Huang, R. B. Gramacy, and M. Ludkovski. Replication or exploration? Sequential design for stochastic simulation experiments. ArXiv 2017.
- class botorch.acquisition.active_learning.qNegIntegratedPosteriorVariance(model, mc_points, sampler=None, posterior_transform=None, X_pending=None)[source]
Bases:
AcquisitionFunction
Batch Integrated Negative Posterior Variance for Active Learning.
This acquisition function quantifies the (negative) integrated posterior variance (excluding observation noise, computed using MC integration) of the model. As such, it is a proxy for global model uncertainty, and is thus purely focused on “exploration”, rather than the “exploitation” of many of the classic Bayesian Optimization acquisition functions.
See [Seo2014activedata], [Chen2014seqexpdesign], and [Binois2017repexp].
q-Integrated Negative Posterior Variance.
- Parameters:
model (Model) – A fitted model.
mc_points (Tensor) – A batch_shape x N x d tensor of points to use for MC-integrating the posterior variance. Usually, these are qMC samples on the whole design space, but biased sampling directly allows weighted integration of the posterior variance.
sampler (MCSampler | None) – The sampler used for drawing fantasy samples. In the basic setting of a standard GP (default) this is a dummy, since the variance of the model after conditioning does not actually depend on the sampled values.
posterior_transform (PosteriorTransform | None) – A PosteriorTransform. If using a multi-output model, a PosteriorTransform that transforms the multi-output posterior into a single-output posterior is required.
X_pending (Tensor | None) – A n' x d-dim Tensor of n' design points that have been submitted for function evaluation but have not yet been evaluated.
- forward(X, *args, **kwargs)
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Any) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.
acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A (b)-dim Tensor of acquisition function values at the given design points X.
- Return type:
Any
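The idea behind the integrated posterior variance can be illustrated with a one-dimensional toy problem. The sketch below is a simplification, not the library's computation: it starts from a zero-mean GP prior with an RBF kernel and no training data, conditions on a single candidate x_new, and averages the resulting posterior variance over a grid of integration points. The kernel, lengthscale, noise level and function names are all hypothetical choices.

```python
import math
from typing import List


def rbf(x1: float, x2: float, lengthscale: float = 0.2) -> float:
    """Squared-exponential kernel with unit outputscale."""
    return math.exp(-0.5 * ((x1 - x2) / lengthscale) ** 2)


def neg_integrated_posterior_variance(
    x_new: float, mc_points: List[float], noise: float = 1e-4
) -> float:
    """Negative mean posterior variance over mc_points after observing at x_new."""
    total = 0.0
    for x in mc_points:
        # GP conditioning on one point; the observed value itself never appears.
        post_var = rbf(x, x) - rbf(x, x_new) ** 2 / (rbf(x_new, x_new) + noise)
        total += post_var
    return -total / len(mc_points)


mc_points = [i / 100 for i in range(101)]  # uniform grid on [0, 1]
center = neg_integrated_posterior_variance(0.5, mc_points)
edge = neg_integrated_posterior_variance(0.0, mc_points)
```

In this toy setting, a candidate in the middle of the domain reduces uncertainty on both sides and therefore scores higher than one at the boundary. Because the observed value does not enter the variance formula, the sampler is a dummy for a standard GP, as noted in the sampler parameter description.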
- class botorch.acquisition.active_learning.PairwiseMCPosteriorVariance(model, objective, sampler=None)[source]
Bases:
MCAcquisitionFunction
Variance of difference for Active Learning
Given a model and an objective, calculate the posterior sample variance of the objective on the difference of pairs of points. See more implementation details in forward. This acquisition function is typically used with a pairwise model (e.g., PairwiseGP) and a likelihood/link function on the pair difference (e.g., logistic or probit) for pure exploration.
Pairwise Monte Carlo Posterior Variance
- Parameters:
model (Model) – A fitted model.
objective (MCAcquisitionObjective) – An MCAcquisitionObjective representing the link function (e.g., logistic or probit) applied to the difference of two (usually 1-d) samples. Can be implemented via GenericMCObjective.
sampler (MCSampler | None) – The sampler used for drawing MC samples.
- forward(X, *args, **kwargs)
Takes in a batch_shape x q x d X Tensor of t-batches with q d-dim design points each, and returns a Tensor with shape batch_shape', where batch_shape' is the broadcasted batch shape of model and input X. Should utilize the result of set_X_pending as needed to account for pending function evaluations.
- Parameters:
acqf (AcquisitionFunction)
X (Any)
args (Any)
kwargs (Any)
- Return type:
Any
Bayesian Active Learning Acquisition Functions
Acquisition functions for Bayesian active learning. This includes: BALD [Houlsby2011bald] and its batch version [kirsch2019batchbald].
References
- botorch.acquisition.bayesian_active_learning.check_negative_info_gain(info_gain)[source]
Check if the (expected) information gain is negative, raise a warning if so.
- Parameters:
info_gain (Tensor)
- Return type:
None
- class botorch.acquisition.bayesian_active_learning.FullyBayesianAcquisitionFunction(model)[source]
Bases:
AcquisitionFunction
Base class for acquisition functions which require a fully Bayesian model treatment.
- Parameters:
model (Model) – A fully Bayesian single-outcome model.
- class botorch.acquisition.bayesian_active_learning.qBayesianActiveLearningByDisagreement(model, sampler=None, posterior_transform=None, X_pending=None)[source]
Bases:
FullyBayesianAcquisitionFunction,MCSamplerMixinBatch implementation [kirsch2019batchbald] of BALD [Houlsby2011bald], which maximizes the mutual information between the next observation and the hyperparameters of the model. Computed by Monte Carlo integration.
- Parameters:
model (ModelListGP | SaasFullyBayesianSingleTaskGP) – A fully bayesian model (SaasFullyBayesianSingleTaskGP).
sampler (MCSampler | None) – The sampler used for drawing samples to approximate the entropy of the Gaussian Mixture posterior.
posterior_transform (PosteriorTransform | None) – A PosteriorTransform. If using a multi-output model, a PosteriorTransform that transforms the multi-output posterior into a single-output posterior is required.
X_pending (Tensor | None) – A batch_shape x m x d-dim Tensor of m design points
- forward(X, *args, **kwargs)
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Any) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.
acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A (b)-dim Tensor of acquisition function values at the given design points X.
- Return type:
Any
Preference Acquisition Functions
Preference acquisition functions. This includes: Analytical EUBO acquisition function as introduced in [Lin2022preference] and its MC-based generalization qEUBO as proposed in [Astudillo2023qeubo].
Astudillo, R., Lin, Z.J., Bakshy, E. and Frazier, P.I. qEUBO: A Decision-Theoretic Acquisition Function for Preferential Bayesian Optimization. International Conference on Artificial Intelligence and Statistics (AISTATS), 2023.
- class botorch.acquisition.preference.AnalyticExpectedUtilityOfBestOption(pref_model, outcome_model=None, previous_winner=None)[source]
Bases: AnalyticAcquisitionFunction
Analytic Preferential Expected Utility of Best Options, i.e., Analytical EUBO
Analytic implementation of Expected Utility of the Best Option under the Laplace model (assumes a PairwiseGP is used as the preference model) as proposed in [Lin2022preference].
- Parameters:
pref_model (Model) – The preference model that maps the outcomes (i.e., Y) to scalar-valued utility.
outcome_model (DeterministicModel | None) – A deterministic model that maps parameters (i.e., X) to outcomes (i.e., Y). The outcome model f defines the search space of Y = f(X). If outcome_model is None, we are directly calculating EUBO on the parameter space. When used with OneSamplePosteriorDrawModel, we are obtaining EUBO-zeta as described in [Lin2022preference].
previous_winner (Tensor | None) – Tensor representing the previous winner in the Y space.
- forward(X, *args, **kwargs)
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Any) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.
acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A (b)-dim Tensor of acquisition function values at the given design points X.
- Return type:
Any
- class botorch.acquisition.preference.qExpectedUtilityOfBestOption(pref_model, outcome_model=None, sampler=None, objective=None, posterior_transform=None, X_pending=None)[source]
Bases: MCAcquisitionFunction
MC-based Expected Utility of Best Option (qEUBO)
This computes qEUBO by (1) sampling the joint posterior over q points, (2) evaluating the maximum objective value across the q points, and (3) averaging over the samples:
qEUBO(X) = E[max Y], Y ~ f(X), where X = (x_1,...,x_q)
MC-based Expected Utility of Best Option (qEUBO) as proposed in [Astudillo2023qeubo].
- Parameters:
pref_model (Model) – The preference model that maps the outcomes (i.e., Y) to scalar-valued utility.
outcome_model (DeterministicModel | None) – A deterministic model that maps parameters (i.e., X) to outcomes (i.e., Y). The outcome model f defines the search space of Y = f(X). If outcome_model is None, we are directly calculating qEUBO on the parameter space.
sampler (MCSampler | None) – The sampler used to draw base samples. See MCAcquisitionFunction for more details.
objective (MCAcquisitionObjective | None) – The MCAcquisitionObjective under which the samples are evaluated. Defaults to IdentityMCObjective().
posterior_transform (PosteriorTransform | None) – A PosteriorTransform (optional).
X_pending (Tensor | None) – An m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call. Copied and set to have no gradient.
- forward(X, *args, **kwargs)
Takes in a batch_shape x q x d X Tensor of t-batches with q d-dim design points each, and returns a Tensor with shape batch_shape', where batch_shape' is the broadcasted batch shape of model and input X. Should utilize the result of set_X_pending as needed to account for pending function evaluations.
- Parameters:
acqf (AcquisitionFunction)
X (Any)
args (Any)
kwargs (Any)
- Return type:
Any
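The three steps in the qEUBO description (sample, take the maximum over the q points, average) can be sketched in plain Python. This is only an illustration: `qeubo_mc_estimate` is a hypothetical helper, and it assumes independent Gaussian marginals for the q utilities, whereas the class samples the joint posterior of the preference model.

```python
import random


def qeubo_mc_estimate(means, stds, num_samples=2000, seed=0):
    """Monte Carlo estimate of E[max Y] over q utilities.

    Assumption: the q utilities are independent Gaussians with the given
    means and standard deviations (a simplification of a joint posterior).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        # (1) draw one sample of the utilities at the q points
        draws = [rng.gauss(m, s) for m, s in zip(means, stds)]
        # (2) keep the utility of the best option in this draw
        total += max(draws)
    # (3) average over the samples
    return total / num_samples


estimate = qeubo_mc_estimate(means=[0.0, 0.5], stds=[1.0, 1.0])
```

Because the maximum is taken inside the expectation, the estimate is never smaller than the largest individual mean; the gap grows with the posterior uncertainty.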
- class botorch.acquisition.preference.PairwiseBayesianActiveLearningByDisagreement(pref_model, outcome_model=None, num_samples=1024, std_noise=0.0)[source]
Bases: MCAcquisitionFunction
MC Bayesian Active Learning by Disagreement
Monte Carlo implementation of Bayesian Active Learning by Disagreement (BALD) proposed in [Houlsby2011bald].
- Parameters:
pref_model (Model) – The preference model that maps the outcomes (i.e., Y) to scalar-valued utility.
outcome_model (DeterministicModel | None) – A deterministic model that maps parameters (i.e., X) to outcomes (i.e., Y). The outcome model f defines the search space of Y = f(X). If outcome_model is None, we are directly calculating BALD on the parameter space.
num_samples (int | None) – Number of samples used to approximate the conditional entropy.
std_noise (float | None) – Additional observational noise to include. Defaults to 0.
- forward(X, *args, **kwargs)
Takes in a batch_shape x q x d X Tensor of t-batches with q d-dim design points each, and returns a Tensor with shape batch_shape', where batch_shape' is the broadcasted batch shape of model and input X. Should utilize the result of set_X_pending as needed to account for pending function evaluations.
- Parameters:
acqf (AcquisitionFunction)
X (Any)
args (Any)
kwargs (Any)
- Return type:
Any
Objectives and Cost-Aware Utilities
Objectives
Objective Modules to be used with acquisition functions.
- class botorch.acquisition.objective.PosteriorTransform(*args, **kwargs)[source]
Bases: Module, ABC
Abstract base class for objectives that transform the posterior.
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- Parameters:
args (Any)
kwargs (Any)
- abstractmethod evaluate(Y, X=None)[source]
Evaluate the transform on a set of outcomes.
- Parameters:
Y (Tensor) – A batch_shape x q x m-dim tensor of outcomes.
X (Tensor | None) – A batch_shape x q x d-dim tensor of inputs. Relevant only if the transform depends on the inputs explicitly.
- Returns:
A batch_shape x q' [x m']-dim tensor of transformed outcomes.
- Return type:
Tensor
- class botorch.acquisition.objective.ScalarizedPosteriorTransform(weights, offset=0.0)[source]
Bases: PosteriorTransform
An affine posterior transform for scalarizing multi-output posteriors.
For a Gaussian posterior at a single point (q=1) with mean mu and covariance matrix Sigma, this yields a single-output posterior with mean weights^T * mu and variance weights^T Sigma weights.
Example
Example for a model with two outcomes:
>>> weights = torch.tensor([0.5, 0.25])
>>> posterior_transform = ScalarizedPosteriorTransform(weights)
>>> EI = ExpectedImprovement(
...     model, best_f=0.1, posterior_transform=posterior_transform
... )
- Parameters:
weights (Tensor) – A one-dimensional tensor with m elements representing the linear weights on the outputs.
offset (float) – An offset to be added to posterior mean.
- scalarize: bool = True
- evaluate(Y, X=None)[source]
Evaluate the transform on a set of outcomes.
- Parameters:
Y (Tensor) – A batch_shape x q x m-dim tensor of outcomes.
X (Tensor | None) – A batch_shape x q x d-dim tensor of inputs. Relevant only if the transform depends on the inputs explicitly. Ignored here.
- Returns:
A batch_shape x q-dim tensor of transformed outcomes.
- Return type:
Tensor
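The affine scalarization of a Gaussian at a single point can be checked by hand. The sketch below is plain Python rather than BoTorch code; `scalarize_gaussian` and the numbers are hypothetical, and it assumes the offset is simply added to the scalarized mean.

```python
def scalarize_gaussian(weights, mu, sigma, offset=0.0):
    """Mean and variance of offset + weights^T Y for Y ~ N(mu, sigma).

    weights, mu: lists of length m; sigma: m x m covariance as nested lists.
    """
    m = len(weights)
    mean = sum(w * mu_i for w, mu_i in zip(weights, mu)) + offset
    # Quadratic form weights^T sigma weights.
    variance = sum(
        weights[i] * sigma[i][j] * weights[j] for i in range(m) for j in range(m)
    )
    return mean, variance


mean, variance = scalarize_gaussian(
    weights=[0.5, 0.25],
    mu=[2.0, 4.0],
    sigma=[[1.0, 0.0], [0.0, 4.0]],
)
```

With these inputs the mean is 0.5 * 2 + 0.25 * 4 = 2.0 and the variance is 0.25 * 1 + 0.0625 * 4 = 0.5.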
- forward(posterior, X=None)[source]
Compute the posterior of the affine transformation.
- Parameters:
posterior (GPyTorchPosterior | PosteriorList) – A posterior with the same number of outputs as the elements in self.weights.
X (Tensor | None) – A batch_shape x q x d-dim tensor of inputs. Relevant only if the transform depends on the inputs explicitly. Ignored here.
- Returns:
A single-output posterior.
- Return type:
- class botorch.acquisition.objective.ExpectationPosteriorTransform(n_w, weights=None)[source]
Bases: PosteriorTransform
Transform the batch x (q * n_w) x m posterior into a batch x q x m posterior of the expectation. The expectation is calculated over each consecutive n_w block of points in the posterior.
This is intended for use with InputPerturbation or AppendFeatures for optimizing the expectation over n_w points. This should not be used when there are constraints present, since this does not take into account the feasibility of the objectives.
Note: This is different from ScalarizedPosteriorTransform in that this operates over the q-batch dimension.
A posterior transform calculating the expectation over the q-batch dimension.
- Parameters:
n_w (int) – The number of points in the q-batch of the posterior to compute the expectation over. This corresponds to the size of the feature_set of AppendFeatures or the size of the perturbation_set of InputPerturbation.
weights (Tensor | None) – An optional n_w x m-dim tensor of weights. Can be used to compute a weighted expectation. Weights are normalized before use.
- evaluate(Y, X=None)[source]
Evaluate the expectation of a set of outcomes.
- Parameters:
Y (Tensor) – A batch_shape x (q * n_w) x m-dim tensor of outcomes.
X (Tensor | None) – A batch_shape x q x d-dim tensor of inputs. Relevant only if the transform depends on the inputs explicitly. Ignored here.
- Returns:
A batch_shape x q x m-dim tensor of expectation outcomes.
- Return type:
Tensor
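The block-wise expectation can be illustrated for a single outcome (m = 1) and a single batch. This plain-Python sketch is not the BoTorch implementation; `blockwise_expectation` is a hypothetical helper, and the weights are a flat list of length n_w rather than an n_w x m tensor.

```python
def blockwise_expectation(values, n_w, weights=None):
    """Reduce q * n_w outcomes to q expectations over consecutive n_w blocks."""
    if len(values) % n_w != 0:
        raise ValueError("len(values) must be a multiple of n_w")
    if weights is None:
        weights = [1.0] * n_w
    total = sum(weights)
    normalized = [w / total for w in weights]  # weights are normalized first
    return [
        sum(w * v for w, v in zip(normalized, values[start:start + n_w]))
        for start in range(0, len(values), n_w)
    ]


# q = 2 design points, each evaluated at n_w = 2 perturbations.
plain = blockwise_expectation([1.0, 3.0, 10.0, 20.0], n_w=2)
weighted = blockwise_expectation([1.0, 3.0, 10.0, 20.0], n_w=2, weights=[1.0, 3.0])
```

Here `plain` averages each pair, while `weighted` uses normalized weights 0.25 and 0.75 within each block.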
- forward(posterior, X=None)[source]
Compute the posterior of the expectation.
- Parameters:
posterior (GPyTorchPosterior) – An m-outcome joint posterior over q * n_w points.
X (Tensor | None) – A batch_shape x q x d-dim tensor of inputs. Relevant only if the transform depends on the inputs explicitly. Ignored here.
- Returns:
An m-outcome joint posterior over q expectations.
- Return type:
- class botorch.acquisition.objective.MCAcquisitionObjective(*args, **kwargs)[source]
Bases: Module, ABC
Abstract base class for MC-based objectives.
- Parameters:
_verify_output_shape – If True and X is given, check that the q-batch shape of the objectives agrees with that of X.
_is_mo – A boolean denoting whether the objectives are multi-output.
args (Any)
kwargs (Any)
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- abstractmethod forward(samples, X=None)[source]
Evaluate the objective on the samples.
- Parameters:
samples (Tensor) – A sample_shape x batch_shape x q x m-dim Tensor of samples from a model posterior.
X (Tensor | None) – A batch_shape x q x d-dim tensor of inputs. Relevant only if the objective depends on the inputs explicitly.
- Returns:
A sample_shape x batch_shape x q-dim Tensor of objective values (assuming maximization).
- Return type:
Tensor
This method is usually not called directly, but via the objectives.
Example
>>> # ``__call__`` method:
>>> samples = sampler(posterior)
>>> outcome = mc_obj(samples)
- class botorch.acquisition.objective.IdentityMCObjective(*args, **kwargs)[source]
Bases: MCAcquisitionObjective
Trivial objective extracting the last dimension.
Example
>>> identity_objective = IdentityMCObjective()
>>> samples = sampler(posterior)
>>> objective = identity_objective(samples)
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- Parameters:
args (Any)
kwargs (Any)
- forward(samples, X=None)[source]
Evaluate the objective on the samples.
- Parameters:
samples (Tensor) – A sample_shape x batch_shape x q x m-dim Tensor of samples from a model posterior.
X (Tensor | None) – A batch_shape x q x d-dim tensor of inputs. Relevant only if the objective depends on the inputs explicitly.
- Returns:
A sample_shape x batch_shape x q-dim Tensor of objective values (assuming maximization).
- Return type:
Tensor
This method is usually not called directly, but via the objectives.
Example
>>> # ``__call__`` method:
>>> samples = sampler(posterior)
>>> outcome = mc_obj(samples)
- class botorch.acquisition.objective.LinearMCObjective(weights)[source]
Bases: MCAcquisitionObjective
Linear objective constructed from a weight tensor.
For input samples and mc_obj = LinearMCObjective(weights), this produces mc_obj(samples) = sum_{i} weights[i] * samples[..., i]
Example
Example for a model with two outcomes:
>>> weights = torch.tensor([0.75, 0.25])
>>> linear_objective = LinearMCObjective(weights)
>>> samples = sampler(posterior)
>>> objective = linear_objective(samples)
- Parameters:
weights (Tensor) – A one-dimensional tensor with m elements representing the linear weights on the outputs.
- forward(samples, X=None)[source]
Evaluate the linear objective on the samples.
- Parameters:
samples (Tensor) – A sample_shape x batch_shape x q x m-dim tensor of samples from a model posterior.
X (Tensor | None) – A batch_shape x q x d-dim tensor of inputs. Relevant only if the objective depends on the inputs explicitly.
- Returns:
A sample_shape x batch_shape x q-dim tensor of objective values.
- Return type:
Tensor
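The weighted sum over the last (outcome) dimension can be written out in plain Python for a flat list of samples. `linear_objective` below is a hypothetical stand-in for illustration, not the BoTorch class.

```python
def linear_objective(samples, weights):
    """Apply sum_i weights[i] * sample[i] to each m-dimensional sample."""
    return [sum(w * y for w, y in zip(weights, sample)) for sample in samples]


# Two posterior samples of a model with m = 2 outcomes.
values = linear_objective([[1.0, 2.0], [4.0, 0.0]], weights=[0.75, 0.25])
```

Each m-dimensional sample collapses to one scalar, which is why the output shape drops the trailing m dimension.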
- class botorch.acquisition.objective.GenericMCObjective(objective)[source]
Bases: MCAcquisitionObjective
Objective generated from a generic callable.
Allows constructing arbitrary MC-objective functions from a generic callable. To use gradient-based acquisition function optimization, it should be possible to backpropagate through the callable.
Example
>>> generic_objective = GenericMCObjective(
...     lambda Y, X: torch.sqrt(Y).sum(dim=-1),
... )
>>> samples = sampler(posterior)
>>> objective = generic_objective(samples)
- Parameters:
objective (Callable[[Tensor, Tensor | None], Tensor]) – A callable f(samples, X) mapping a sample_shape x batch-shape x q x m-dim Tensor samples and an optional batch-shape x q x d-dim Tensor X to a sample_shape x batch-shape x q-dim Tensor of objective values.
- forward(samples, X=None)[source]
Evaluate the objective on the samples.
- Parameters:
samples (Tensor) – A sample_shape x batch_shape x q x m-dim Tensor of samples from a model posterior.
X (Tensor | None) – A batch_shape x q x d-dim tensor of inputs. Relevant only if the objective depends on the inputs explicitly.
- Returns:
A sample_shape x batch_shape x q-dim Tensor of objective values.
- Return type:
Tensor
- class botorch.acquisition.objective.ConstrainedMCObjective(objective, constraints, infeasible_cost=0.0, eta=0.001)[source]
Bases: GenericMCObjective
Feasibility-weighted objective.
An objective that allows maximizing some scalable objective on the model outputs subject to a number of constraints. Constraint feasibility is approximated by a sigmoid function.
mc_acq(X) = ( (objective(X) + infeasible_cost) * prod_i (1 - sigmoid(constraint_i(X))) ) - infeasible_cost
See botorch.utils.objective.apply_constraints for details on the constraint handling.
Example
>>> bound = 0.0
>>> objective = lambda Y: Y[..., 0]
>>> # apply non-negativity constraint on f(x)[1]
>>> constraint = lambda Y: bound - Y[..., 1]
>>> constrained_objective = ConstrainedMCObjective(objective, [constraint])
>>> samples = sampler(posterior)
>>> objective = constrained_objective(samples)
TODO: Deprecate this as default way to handle constraints with MC acquisition functions once we have data on how well SampleReducingMCAcquisitionFunction works.
- Parameters:
objective (Callable[[Tensor, Tensor | None], Tensor]) – A callable f(samples, X) mapping a sample_shape x batch-shape x q x m-dim Tensor samples and an optional batch-shape x q x d-dim Tensor X to a sample_shape x batch-shape x q-dim Tensor of objective values.
constraints (list[Callable[[Tensor], Tensor]]) – A list of callables, each mapping a Tensor of dimension sample_shape x batch-shape x q x m to a Tensor of dimension sample_shape x batch-shape x q, where negative values imply feasibility.
infeasible_cost (Tensor | float) – The cost of a design if all associated samples are infeasible.
eta (Tensor | float) – The temperature parameter of the sigmoid function approximating the constraint. Can be either a float or a 1-dim tensor. In case of a float, the same eta is used for every constraint in constraints. In case of a tensor, the length of the tensor must match the number of provided constraints. The i-th constraint is then estimated with the i-th eta value.
- forward(samples, X=None)[source]
Evaluate the feasibility-weighted objective on the samples.
- Parameters:
samples (Tensor) – A sample_shape x batch_shape x q x m-dim Tensor of samples from a model posterior.
X (Tensor | None) – A batch_shape x q x d-dim tensor of inputs. Relevant only if the objective depends on the inputs explicitly.
- Returns:
A sample_shape x batch_shape x q-dim Tensor of objective values weighted by feasibility (assuming maximization).
- Return type:
Tensor
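The feasibility-weighting formula in the class description can be evaluated by hand for a single sample. The following plain-Python sketch is not BoTorch code: `feasibility_weighted` is a hypothetical helper, it takes one shared float eta, and it assumes the temperature enters as constraint / eta inside the sigmoid.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def feasibility_weighted(objective_value, constraint_values,
                         infeasible_cost=0.0, eta=1e-3):
    """(objective + infeasible_cost) * prod_i (1 - sigmoid(c_i / eta)) - infeasible_cost.

    Negative constraint values mean the constraint is satisfied.
    """
    weight = 1.0
    for c in constraint_values:
        weight *= 1.0 - sigmoid(c / eta)  # close to 1 if feasible, 0 if not
    return (objective_value + infeasible_cost) * weight - infeasible_cost


feasible = feasibility_weighted(2.0, [-1.0], infeasible_cost=5.0)
infeasible = feasibility_weighted(2.0, [1.0], infeasible_cost=5.0)
boundary = feasibility_weighted(2.0, [0.0], infeasible_cost=5.0)
```

A clearly feasible sample keeps its objective value, a clearly infeasible one falls to -infeasible_cost, and a sample on the constraint boundary lands halfway between the two.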
- class botorch.acquisition.objective.LearnedObjective(pref_model, sample_shape=None, seed=None)[source]
Bases: MCAcquisitionObjective
Learned preference objective constructed from a preference model.
For input samples, it samples each individual sample again from the latent preference posterior distribution using pref_model and returns the posterior mean.
Example
>>> train_X = torch.rand(2, 2)
>>> train_comps = torch.LongTensor([[0, 1]])
>>> pref_model = PairwiseGP(train_X, train_comps)
>>> learned_pref_obj = LearnedObjective(pref_model)
>>> samples = sampler(posterior)
>>> objective = learned_pref_obj(samples)
- Parameters:
pref_model (Model) – A BoTorch model, which models the latent preference/utility function. Given an input tensor of size sample_size x batch_shape x q x d, its posterior method should return a Posterior object with single outcome representing the utility values of the input.
sample_shape (torch.Size | None) – Determines the number of preference-model samples drawn per outcome-model sample when the LearnedObjective is called. Note that this is an additional layer of sampling relative to what is needed when evaluating most MC acquisition functions in order to account for uncertainty in the preference model. If None, it will default to torch.Size([16]), so that 16 samples will be drawn from the preference model at each outcome sample. This number is relatively high because sampling from the preference model is generally cheap relative to generating the outcome model posterior.
seed (int | None)
- forward(samples, X=None)[source]
Sample each element of samples.
- Parameters:
samples (Tensor) – A sample_size x batch_shape x q x d-dim Tensor of samples from a model posterior.
X (Tensor | None)
- Returns:
A (sample_size * num_samples) x batch_shape x q-dim Tensor of objective values sampled from utility posterior using pref_model.
- Return type:
Tensor
Multi-Objective Objectives
- class botorch.acquisition.multi_objective.objective.MCMultiOutputObjective(*args, **kwargs)[source]
Bases: MCAcquisitionObjective
Abstract base class for MC multi-output objectives.
- Parameters:
_is_mo – A boolean denoting whether the objectives are multi-output.
args (Any)
kwargs (Any)
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- abstractmethod forward(samples, X=None)[source]
Evaluate the multi-output objective on the samples.
- Parameters:
samples (Tensor) – A sample_shape x batch_shape x q x m-dim Tensor of samples from a model posterior.
X (Tensor | None) – A batch_shape x q x d-dim Tensor of inputs.
- Returns:
A sample_shape x batch_shape x q x m'-dim Tensor of objective values with m' the output dimension. This assumes maximization in each output dimension.
- Return type:
Tensor
This method is usually not called directly, but via the objectives.
Example
>>> # ``__call__`` method:
>>> samples = sampler(posterior)
>>> outcomes = multi_obj(samples)
- class botorch.acquisition.multi_objective.objective.GenericMCMultiOutputObjective(objective)[source]
Bases: GenericMCObjective, MCMultiOutputObjective
Multi-output objective generated from a generic callable.
Allows constructing arbitrary MC-objective functions from a generic callable. To use gradient-based acquisition function optimization, it should be possible to backpropagate through the callable.
- Parameters:
objective (Callable[[Tensor, Tensor | None], Tensor]) – A callable f(samples, X) mapping a sample_shape x batch-shape x q x m-dim Tensor samples and an optional batch-shape x q x d-dim Tensor X to a sample_shape x batch-shape x q-dim Tensor of objective values.
- class botorch.acquisition.multi_objective.objective.IdentityMCMultiOutputObjective(outcomes=None, num_outcomes=None)[source]
Bases: MCMultiOutputObjective
Trivial objective that returns the unaltered samples.
Example
>>> identity_objective = IdentityMCMultiOutputObjective()
>>> samples = sampler(posterior)
>>> objective = identity_objective(samples)
Initialize Objective.
- Parameters:
outcomes (list[int] | None) – A list of the m' indices that the weights should be applied to.
num_outcomes (int | None) – The total number of outcomes m.
- forward(samples, X=None)[source]
Evaluate the multi-output objective on the samples.
- Parameters:
samples (Tensor) – A sample_shape x batch_shape x q x m-dim Tensor of samples from a model posterior.
X (Tensor | None) – A batch_shape x q x d-dim Tensor of inputs.
- Returns:
A sample_shape x batch_shape x q x m'-dim Tensor of objective values with m' the output dimension. This assumes maximization in each output dimension.
- Return type:
Tensor
This method is usually not called directly, but via the objectives.
Example
>>> # ``__call__`` method:
>>> samples = sampler(posterior)
>>> outcomes = multi_obj(samples)
- class botorch.acquisition.multi_objective.objective.WeightedMCMultiOutputObjective(weights, outcomes=None, num_outcomes=None)[source]
Bases: IdentityMCMultiOutputObjective
Objective that reweights samples by given weights vector.
Example
>>> weights = torch.tensor([1.0, -1.0])
>>> weighted_objective = WeightedMCMultiOutputObjective(weights)
>>> samples = sampler(posterior)
>>> objective = weighted_objective(samples)
Initialize Objective.
- Parameters:
weights (Tensor) – m'-dim tensor of outcome weights.
outcomes (list[int] | None) – A list of the m' indices that the weights should be applied to.
num_outcomes (int | None) – The total number of outcomes m.
- forward(samples, X=None)[source]
Evaluate the multi-output objective on the samples.
- Parameters:
samples (Tensor) – A sample_shape x batch_shape x q x m-dim Tensor of samples from a model posterior.
X (Tensor | None) – A batch_shape x q x d-dim Tensor of inputs.
- Returns:
A sample_shape x batch_shape x q x m'-dim Tensor of objective values with m' the output dimension. This assumes maximization in each output dimension.
- Return type:
Tensor
This method is usually not called directly, but via the objectives.
Example
>>> # ``__call__`` method:
>>> samples = sampler(posterior)
>>> outcomes = multi_obj(samples)
- class botorch.acquisition.multi_objective.objective.FeasibilityWeightedMCMultiOutputObjective(model, X_baseline, constraint_idcs, objective=None)[source]
Bases: MCMultiOutputObjective
Construct a feasibility-weighted objective.
This applies feasibility weighting before calculating the objective value. Defaults to identity if no constraints or objective is present.
NOTE: By passing in a single-output MCAcquisitionObjective as the objective, this can be used as a single-output MCAcquisitionObjective as well.
- Parameters:
model (Model) – A fitted Model.
X_baseline (Tensor) – An n x d-dim tensor of points already observed.
constraint_idcs (list[int]) – The outcome indices of the constraints. Constraints are handled by weighting the samples according to a sigmoid approximation of feasibility. A positive constraint outcome implies feasibility.
objective (MCMultiOutputObjective | None) – An optional objective to apply after feasibility-weighting the samples.
- forward(samples, X=None)[source]
Evaluate the multi-output objective on the samples.
- Parameters:
samples (Tensor) – A sample_shape x batch_shape x q x m-dim Tensor of samples from a model posterior.
X (Tensor | None) – A batch_shape x q x d-dim Tensor of inputs.
- Returns:
A sample_shape x batch_shape x q x m'-dim Tensor of objective values with m' the output dimension. This assumes maximization in each output dimension.
- Return type:
Tensor
This method is usually not called directly, but via the objectives.
Example
>>> # ``__call__`` method:
>>> samples = sampler(posterior)
>>> outcomes = multi_obj(samples)
Cost-Aware Utility
Cost functions for cost-aware acquisition functions, e.g. multi-fidelity KG. To be used in a context where there is an objective/cost tradeoff.
- class botorch.acquisition.cost_aware.CostAwareUtility(*args, **kwargs)[source]
Bases: Module, ABC
Abstract base class for cost-aware utilities.
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- Parameters:
args (Any)
kwargs (Any)
- abstractmethod forward(X, deltas, sampler=None)[source]
Evaluate the cost-aware utility on the candidates and improvements.
- Parameters:
X (Tensor) – A batch_shape x q x d-dim Tensor with q d-dim design points for each t-batch.
deltas (Tensor) – A num_fantasies x batch_shape-dim Tensor of num_fantasy samples from the marginal improvement in utility over the current state at X for each t-batch.
sampler (MCSampler | None) – A sampler used for sampling from the posterior of the cost model. Some subclasses ignore this argument.
- Returns:
A num_fantasies x batch_shape-dim Tensor of cost-transformed utilities.
- Return type:
Tensor
- class botorch.acquisition.cost_aware.GenericCostAwareUtility(cost)[source]
Bases: CostAwareUtility
Generic cost-aware utility wrapping a callable.
- Parameters:
cost (Callable[[Tensor, Tensor], Tensor]) – A callable mapping a batch_shape x q x d'-dim candidate set to a batch_shape-dim tensor of costs.
- forward(X, deltas, sampler=None)[source]
Evaluate the cost function on the candidates and improvements.
- Parameters:
X (Tensor) – A batch_shape x q x d'-dim Tensor with q d-dim design points for each t-batch.
deltas (Tensor) – A num_fantasies x batch_shape-dim Tensor of num_fantasy samples from the marginal improvement in utility over the current state at X for each t-batch.
sampler (MCSampler | None) – Ignored.
- Returns:
A num_fantasies x batch_shape-dim Tensor of cost-weighted utilities.
- Return type:
Tensor
- class botorch.acquisition.cost_aware.InverseCostWeightedUtility(cost_model, use_mean=True, cost_objective=None, log=False)[source]
Bases: CostAwareUtility
A cost-aware utility using inverse cost weighting based on a model.
Computes the cost-aware utility by inverse-weighting samples U = (u_1, ..., u_N) of the increase in utility. If use_mean=True, this uses the posterior mean mean_cost of the cost model, i.e. weighted utility = mean(U) / mean_cost. If use_mean=False, it uses samples C = (c_1, ..., c_N) from the posterior of the cost model and performs the inverse weighting on the sample level: weighted utility = mean(u_1 / c_1, ..., u_N / c_N).
Where values in (u_1, …, u_N) are negative, or for mean(U) < 0, the weighted utility is instead calculated via scaling by the cost, i.e. if use_mean=True: weighted_utility = mean(U) * mean_cost and if use_mean=False: weighted utility = mean(u_1 * c_1, u_2 / c_2, u_3 * c_3, ..., u_N / c_N), depending on whether (u_* >= 0), as with u_2 and u_N in this case, or (u_* < 0) as with u_1 and u_3.
The cost is additive across multiple elements of a q-batch.
Cost-aware utility that weights increase in utility by inverse cost. For negative increases in utility, the utility is instead scaled by the cost. See the class description for more information.
- Parameters:
cost_model (DeterministicModel | GPyTorchModel) – A model of the cost of evaluating a candidate set X, where X are the same features as in the model for the acquisition function this is to be used with. If no cost_objective is specified, the outputs are required to be non-negative.
use_mean (bool) – If True, use the posterior mean, otherwise use posterior samples from the cost model.
cost_objective (MCAcquisitionObjective | None) – If specified, transform the posterior mean / the posterior samples from the cost model. This can be used e.g. to un-transform predictions/samples of a cost model fit on the log-transformed cost (often done to ensure non-negativity). If the cost model is multi-output, then by default this will sum the cost across outputs. NOTE: cost_objective must output strictly positive values; forward will raise a ValueError otherwise.
min_cost – A value used to clamp the cost samples so that they are not too close to zero, which may cause numerical issues.
log (bool)
- Returns:
The inverse-cost-weighted utility.
- forward(X, deltas, sampler=None, X_evaluation_mask=None)[source]
Evaluate the cost function on the candidates and improvements. Note that negative values of deltas are instead scaled by the cost, and not inverse-weighted. See the class description for more information.
- Parameters:
X (Tensor) – A batch_shape x q x d-dim Tensor with q d-dim design points for each t-batch.
deltas (Tensor) – A num_fantasies x batch_shape-dim Tensor of num_fantasy samples from the marginal improvement in utility over the current state at X for each t-batch.
sampler (MCSampler | None) – A sampler used for sampling from the posterior of the cost model (required if use_mean=False, ignored if use_mean=True).
X_evaluation_mask (Tensor | None) – A q x m-dim boolean tensor indicating which outcomes should be evaluated for each design in the batch.
- Returns:
A num_fantasies x batch_shape-dim Tensor of cost-weighted utilities.
- Return type:
Tensor
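The two weighting modes from the class description, including the special handling of negative utility increases, can be traced with a small plain-Python sketch. `inverse_cost_weighted` is a hypothetical helper, not the BoTorch API; costs are assumed to be strictly positive scalars that are already aggregated over the q-batch.

```python
def inverse_cost_weighted(deltas, costs, use_mean=True):
    """Inverse-cost weighting of utility increases.

    Non-negative utilities are divided by the cost; negative ones are
    multiplied by it, so a higher cost never makes a loss look better.
    """
    n = len(deltas)
    if use_mean:
        mean_u = sum(deltas) / n
        mean_cost = sum(costs) / len(costs)
        return mean_u / mean_cost if mean_u >= 0 else mean_u * mean_cost
    # Sample-level weighting: pair each u_i with its own cost sample c_i.
    per_sample = [u / c if u >= 0 else u * c for u, c in zip(deltas, costs)]
    return sum(per_sample) / n


with_mean = inverse_cost_weighted([4.0, -2.0], [2.0, 2.0], use_mean=True)
per_sample = inverse_cost_weighted([4.0, -2.0], [2.0, 2.0], use_mean=False)
negative_mean = inverse_cost_weighted([-4.0, 2.0], [2.0, 2.0], use_mean=True)
```

With the same inputs the two modes can disagree: the mean-based result is 1.0 / 2.0, while the sample-level result averages 4.0 / 2.0 and -2.0 * 2.0.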
Risk Measures
Risk Measures implemented as Monte-Carlo objectives, based on Bayesian optimization of risk measures as introduced in [Cakmak2020risk]. For a broader discussion of Monte-Carlo methods for VaR and CVaR risk measures, see also [Hong2014review].
S. Cakmak, R. Astudillo, P. Frazier, and E. Zhou. Bayesian Optimization of Risk Measures. Advances in Neural Information Processing Systems 33, 2020.
L. J. Hong, Z. Hu, and G. Liu. Monte carlo methods for value-at-risk and conditional value-at-risk: a review. ACM Transactions on Modeling and Computer Simulation, 2014.
- class botorch.acquisition.risk_measures.RiskMeasureMCObjective(n_w, preprocessing_function=None)[source]
Bases: MCAcquisitionObjective, ABC
Objective transforming the posterior samples to samples of a risk measure.
The risk measure is calculated over joint q-batch samples from the posterior. If the q-batch includes samples corresponding to multiple inputs, it is assumed that the first n_w samples correspond to the first input, the second n_w samples correspond to the second input, etc.
The risk measures are commonly defined for minimization by considering the upper tail of the distribution, i.e., treating larger values as being undesirable. BoTorch by default assumes a maximization objective, so the default behavior here is to calculate the risk measures w.r.t. the lower tail of the distribution. This can be changed by passing a preprocessing function with weights=torch.tensor([-1.0]).
Transform the posterior samples to samples of a risk measure.
- Parameters:
n_w (int) – The size of the w_set to calculate the risk measure over.
preprocessing_function (Callable[[Tensor], Tensor] | None) – A preprocessing function to apply to the samples before computing the risk measure. This can be used to scalarize multi-output samples before calculating the risk measure. For constrained optimization, this should also apply feasibility-weighting to samples. Given a batch x m-dim tensor of samples, this should return a batch-dim tensor.
- abstractmethod forward(samples, X=None)[source]
Calculate the risk measure corresponding to the given samples.
- Parameters:
samples (Tensor) – A sample_shape x batch_shape x (q * n_w) x m-dim tensor of posterior samples. The q-batches should be ordered so that each n_w block of samples corresponds to the same input.
X (Tensor | None) – A batch_shape x q x d-dim tensor of inputs. Ignored.
- Returns:
A sample_shape x batch_shape x q-dim tensor of risk measure samples.
- Return type:
Tensor
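The n_w-block convention described above can be illustrated without any tensor library. The following is a minimal pure-Python sketch, not BoTorch's implementation: the helper names split_into_blocks and reduce_blocks are hypothetical, and the samples are assumed to be already scalarized to one value per point.

```python
from statistics import mean
from typing import Callable, Sequence


def split_into_blocks(samples: Sequence[float], n_w: int) -> list[list[float]]:
    """Split a flat (q * n_w)-length sequence into q blocks of n_w samples."""
    if n_w <= 0 or len(samples) % n_w != 0:
        raise ValueError("len(samples) must be a multiple of a positive n_w")
    return [list(samples[i : i + n_w]) for i in range(0, len(samples), n_w)]


def reduce_blocks(
    samples: Sequence[float], n_w: int, risk_measure: Callable[[list[float]], float]
) -> list[float]:
    """Apply a risk measure to each n_w block, giving one value per input."""
    return [risk_measure(block) for block in split_into_blocks(samples, n_w)]


# q = 2 inputs, each evaluated under n_w = 3 perturbations.
flat_samples = [1.0, 4.0, 2.0, 0.5, 3.0, 6.5]
# Under maximization, the worst case is the smallest value in each block.
worst_case = reduce_blocks(flat_samples, n_w=3, risk_measure=min)
expectation = reduce_blocks(flat_samples, n_w=3, risk_measure=mean)
```

Any function that maps one block of n_w values to a single number can be passed as risk_measure, which mirrors how the concrete subclasses differ only in the reduction they apply.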
- class botorch.acquisition.risk_measures.CVaR(alpha, n_w, preprocessing_function=None)[source]
Bases: RiskMeasureMCObjective
The Conditional Value-at-Risk risk measure.
The Conditional Value-at-Risk measures the expectation of the worst outcomes (small rewards or large losses) with a total probability of 1 - alpha. It is commonly defined as the conditional expectation of the reward function, with the condition that the reward is smaller than the corresponding Value-at-Risk (also defined below).
- Note: Due to the use of a discrete w_set of samples, the VaR and CVaR calculated here are (possibly biased) Monte-Carlo approximations of the true risk measures.
Transform the posterior samples to samples of a risk measure.
- Parameters:
alpha (float) – The risk level, float in (0.0, 1.0].
n_w (int) – The size of the w_set to calculate the risk measure over.
preprocessing_function (Callable[[Tensor], Tensor] | None) – A preprocessing function to apply to the samples before computing the risk measure. This can be used to scalarize multi-output samples before calculating the risk measure. For constrained optimization, this should also apply feasibility-weighting to samples. Given a batch x m-dim tensor of samples, this should return a batch-dim tensor.
- forward(samples, X=None)[source]
Calculate the CVaR corresponding to the given samples.
- Parameters:
samples (Tensor) – A sample_shape x batch_shape x (q * n_w) x m-dim tensor of posterior samples. The q-batches should be ordered so that each n_w block of samples corresponds to the same input.
X (Tensor | None) – A batch_shape x q x d-dim tensor of inputs. Ignored.
- Returns:
A sample_shape x batch_shape x q-dim tensor of CVaR samples.
- Return type:
Tensor
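To make the lower-tail definitions concrete, here is a small pure-Python sketch of empirical VaR and CVaR over a single block of n_w reward samples. It is one plausible Monte Carlo estimator written for illustration. The function names and the tail-size rule k = n_w - (ceil(n_w * alpha) - 1) are assumptions, and BoTorch's exact indexing may differ.

```python
from math import ceil
from typing import Sequence


def worst_tail(samples: Sequence[float], alpha: float) -> list[float]:
    """Return the lower tail used by the empirical VaR / CVaR estimates.

    Assumption: the tail keeps the k = n_w - (ceil(n_w * alpha) - 1) smallest
    samples, so alpha = 1.0 keeps only the single worst sample.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0.0, 1.0]")
    if not samples:
        raise ValueError("samples must be non-empty")
    n_w = len(samples)
    k = n_w - (ceil(n_w * alpha) - 1)
    return sorted(samples)[:k]


def empirical_cvar(samples: Sequence[float], alpha: float) -> float:
    """Mean of the worst (smallest) outcomes."""
    tail = worst_tail(samples, alpha)
    return sum(tail) / len(tail)


def empirical_var(samples: Sequence[float], alpha: float) -> float:
    """Largest value in the lower tail, i.e. the boundary of the tail."""
    return worst_tail(samples, alpha)[-1]


rewards = [5.0, 1.0, 4.0, 2.0, 3.0, 10.0, 9.0, 8.0, 7.0, 6.0]  # n_w = 10
cvar_80 = empirical_cvar(rewards, alpha=0.8)  # mean of the 3 smallest rewards
var_80 = empirical_var(rewards, alpha=0.8)  # largest of the 3 smallest rewards
```

With alpha = 1.0 both estimates collapse to the minimum of the block, which matches the intuition that the most risk-averse setting is the worst case.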
- class botorch.acquisition.risk_measures.VaR(alpha, n_w, preprocessing_function=None)[source]
Bases: CVaR
The Value-at-Risk risk measure.
Value-at-Risk measures the smallest possible reward (or largest possible loss) after excluding the worst outcomes with a total probability of 1 - alpha. It is commonly used in financial risk management, and it corresponds to the 1 - alpha quantile of a given random variable.
Transform the posterior samples to samples of a risk measure.
- Parameters:
alpha (float) – The risk level, float in (0.0, 1.0].
n_w (int) – The size of the w_set to calculate the risk measure over.
preprocessing_function (Callable[[Tensor], Tensor] | None) – A preprocessing function to apply to the samples before computing the risk measure. This can be used to scalarize multi-output samples before calculating the risk measure. For constrained optimization, this should also apply feasibility-weighting to samples. Given a batch x m-dim tensor of samples, this should return a batch-dim tensor.
- forward(samples, X=None)[source]
Calculate the VaR corresponding to the given samples.
- Parameters:
samples (Tensor) – A sample_shape x batch_shape x (q * n_w) x m-dim tensor of posterior samples. The q-batches should be ordered so that each n_w block of samples corresponds to the same input.
X (Tensor | None) – A batch_shape x q x d-dim tensor of inputs. Ignored.
- Returns:
A sample_shape x batch_shape x q-dim tensor of VaR samples.
- Return type:
Tensor
- class botorch.acquisition.risk_measures.WorstCase(n_w, preprocessing_function=None)[source]
Bases: RiskMeasureMCObjective
The worst-case risk measure.
Transform the posterior samples to samples of a risk measure.
- Parameters:
n_w (int) – The size of the w_set to calculate the risk measure over.
preprocessing_function (Callable[[Tensor], Tensor] | None) – A preprocessing function to apply to the samples before computing the risk measure. This can be used to scalarize multi-output samples before calculating the risk measure. For constrained optimization, this should also apply feasibility-weighting to samples. Given a batch x m-dim tensor of samples, this should return a batch-dim tensor.
- forward(samples, X=None)[source]
Calculate the worst-case measure corresponding to the given samples.
- Parameters:
samples (Tensor) – A sample_shape x batch_shape x (q * n_w) x m-dim tensor of posterior samples. The q-batches should be ordered so that each n_w block of samples corresponds to the same input.
X (Tensor | None) – A batch_shape x q x d-dim tensor of inputs. Ignored.
- Returns:
A sample_shape x batch_shape x q-dim tensor of worst-case samples.
- Return type:
Tensor
- class botorch.acquisition.risk_measures.Expectation(n_w, preprocessing_function=None)[source]
Bases: RiskMeasureMCObjective
The expectation risk measure.
For unconstrained problems, we recommend using the ExpectationPosteriorTransform instead. ExpectationPosteriorTransform directly transforms the posterior distribution over q * n_w to a posterior of q expectations, significantly reducing the cost of posterior sampling as a result.
Transform the posterior samples to samples of a risk measure.
- Parameters:
n_w (int) – The size of the w_set to calculate the risk measure over.
preprocessing_function (Callable[[Tensor], Tensor] | None) – A preprocessing function to apply to the samples before computing the risk measure. This can be used to scalarize multi-output samples before calculating the risk measure. For constrained optimization, this should also apply feasibility-weighting to samples. Given a batch x m-dim tensor of samples, this should return a batch-dim tensor.
- forward(samples, X=None)[source]
Calculate the expectation corresponding to the given samples. This calculates the expectation / mean / average of each block of n_w samples across the q-batch dimension. If self.weights is given, the samples are scalarized across the output dimension before taking the expectation.
- Parameters:
samples (Tensor) – A sample_shape x batch_shape x (q * n_w) x m-dim tensor of posterior samples. The q-batches should be ordered so that each n_w block of samples corresponds to the same input.
X (Tensor | None) – A batch_shape x q x d-dim tensor of inputs. Ignored.
- Returns:
A sample_shape x batch_shape x q-dim tensor of expectation samples.
- Return type:
Tensor
Thompson Sampling
- class botorch.acquisition.thompson_sampling.PathwiseThompsonSampling(model, objective=None, posterior_transform=None)[source]
Bases: AcquisitionFunction
Single-outcome Thompson Sampling packaged as an (analytic) acquisition function. Querying the acquisition function gives the summed values of one or more draws from a pathwise drawn posterior sample, and thus its maximization yields one (or multiple) Thompson sample(s).
Example
>>> model = SingleTaskGP(train_X, train_Y)
>>> TS = PathwiseThompsonSampling(model)
Single-outcome TS.
If using a multi-output model, the acquisition function requires either an objective or a posterior_transform that transforms the multi-output posterior samples to single-output posterior samples.
- Parameters:
model (Model) – A fitted GP model.
objective (MCAcquisitionObjective | None) – The MCAcquisitionObjective under which the samples are evaluated. Defaults to IdentityMCObjective().
posterior_transform (PosteriorTransform | None) – An optional PosteriorTransform.
- forward(X)[source]
Evaluate the pathwise posterior sample draws on the candidate set X.
- Parameters:
X (Tensor) – A batch_shape x q x d-dim batched tensor of d-dim design points.
- Returns:
A batch_shape-dim tensor of evaluations on the posterior sample draws, where the samples are summed over the q-batch dimension.
- Return type:
Tensor
- select_from_ensemble_models(values)[source]
Subselect a value associated with a single sample in the ensemble for each element of samples that is not associated with an ensemble dimension.
NOTE: 1) uses self.model and is_ensemble to determine whether or not an ensembling dimension is present. 2) uses self.ensemble_indices to select the value associated with a single sample in the ensemble. ensemble_indices contains uniformly randomly sampled indices for each element of the ensemble, but is cached to make the evaluation of the acquisition function deterministic.
- Parameters:
values (Tensor) – A batch_shape x num_draws x q [x num_ensemble] x m-dim Tensor.
- Returns:
A batch_shape x num_draws x q x m-dim Tensor where each element contains a single sample from the ensemble, selected with self.ensemble_indices.
Multi-Output Risk Measures
Multi-output extensions of the risk measures, implemented as Monte-Carlo objectives. Except for MVaR, the risk measures are computed over each output dimension independently. In contrast, MVaR is computed using the joint distribution of the outputs, and provides more accurate risk estimates.
References
A. Prekopa. Multivariate value at risk and related topics. Annals of Operations Research, 2012.
- class botorch.acquisition.multi_objective.multi_output_risk_measures.MultiOutputRiskMeasureMCObjective(n_w, preprocessing_function=None)[source]
Bases: RiskMeasureMCObjective, MCMultiOutputObjective, ABC
Objective transforming the multi-output posterior samples to samples of a multi-output risk measure.
The risk measure is calculated over joint q-batch samples from the posterior. If the q-batch includes samples corresponding to multiple inputs, it is assumed that the first n_w samples correspond to the first input, the second n_w samples correspond to the second input, etc.
Transform the posterior samples to samples of a risk measure.
- Parameters:
n_w (int) – The size of the w_set to calculate the risk measure over.
preprocessing_function (Callable[[Tensor], Tensor] | None) – A preprocessing function to apply to the samples before computing the risk measure. This can be used to remove non-objective outcomes or to align all outcomes for maximization. For constrained optimization, this should also apply feasibility-weighting to samples. Given a batch x m-dim tensor of samples, this should return a batch x m'-dim tensor.
- abstractmethod forward(samples, X=None)[source]
Calculate the risk measure corresponding to the given samples.
- Parameters:
samples (Tensor) – A sample_shape x batch_shape x (q * n_w) x m-dim tensor of posterior samples. The q-batches should be ordered so that each n_w block of samples corresponds to the same input.
X (Tensor | None) – A batch_shape x q x d-dim tensor of inputs. Ignored.
- Returns:
A sample_shape x batch_shape x q x m'-dim tensor of risk measure samples.
- Return type:
Tensor
- class botorch.acquisition.multi_objective.multi_output_risk_measures.MultiOutputExpectation(n_w, preprocessing_function=None)[source]
Bases: MultiOutputRiskMeasureMCObjective
A multi-output MC expectation risk measure.
For unconstrained problems, we recommend using the ExpectationPosteriorTransform instead. ExpectationPosteriorTransform directly transforms the posterior distribution over q * n_w to a posterior of q expectations, significantly reducing the cost of posterior sampling as a result.
Transform the posterior samples to samples of a risk measure.
- Parameters:
n_w (int) – The size of the w_set to calculate the risk measure over.
preprocessing_function (Callable[[Tensor], Tensor] | None) – A preprocessing function to apply to the samples before computing the risk measure. This can be used to remove non-objective outcomes or to align all outcomes for maximization. For constrained optimization, this should also apply feasibility-weighting to samples. Given a batch x m-dim tensor of samples, this should return a batch x m'-dim tensor.
- forward(samples, X=None)[source]
Calculate the expectation of the given samples. The expectation is calculated over each block of n_w samples in the q-batch dimension.
- Parameters:
samples (Tensor) – A sample_shape x batch_shape x (q * n_w) x m-dim tensor of posterior samples. The q-batches should be ordered so that each n_w block of samples corresponds to the same input.
X (Tensor | None) – A batch_shape x q x d-dim tensor of inputs. Ignored.
- Returns:
A sample_shape x batch_shape x q x m'-dim tensor of expectation samples.
- Return type:
Tensor
- class botorch.acquisition.multi_objective.multi_output_risk_measures.IndependentCVaR(alpha, n_w, preprocessing_function=None)[source]
Bases: CVaR, MultiOutputRiskMeasureMCObjective
The multi-output Conditional Value-at-Risk risk measure that operates on each output dimension independently. Since this does not consider the joint distribution of the outputs (i.e., that the outputs were evaluated on the same perturbed input and are not independent), the risk estimates provided by IndependentCVaR in general are more optimistic than the definition of CVaR would suggest.
The Conditional Value-at-Risk measures the expectation of the worst outcomes (small rewards or large losses) with a total probability of 1 - alpha. It is commonly defined as the conditional expectation of the reward function, with the condition that the reward is smaller than the corresponding Value-at-Risk (also defined below).
NOTE: Due to the use of a discrete w_set of samples, the VaR and CVaR calculated here are (possibly biased) Monte-Carlo approximations of the true risk measures.
Transform the posterior samples to samples of a risk measure.
- Parameters:
alpha (float) – The risk level, float in (0.0, 1.0].
n_w (int) – The size of the w_set to calculate the risk measure over.
preprocessing_function (Callable[[Tensor], Tensor] | None) – A preprocessing function to apply to the samples before computing the risk measure. This can be used to scalarize multi-output samples before calculating the risk measure. For constrained optimization, this should also apply feasibility-weighting to samples. Given a batch x m-dim tensor of samples, this should return a batch-dim tensor.
- forward(samples, X=None)[source]
Calculate the CVaR corresponding to the given samples.
- Parameters:
samples (Tensor) – A sample_shape x batch_shape x (q * n_w) x m-dim tensor of posterior samples. The q-batches should be ordered so that each n_w block of samples corresponds to the same input.
X (Tensor | None) – A batch_shape x q x d-dim tensor of inputs. Ignored.
- Returns:
A sample_shape x batch_shape x q x m'-dim tensor of CVaR samples.
- Return type:
Tensor
- class botorch.acquisition.multi_objective.multi_output_risk_measures.IndependentVaR(alpha, n_w, preprocessing_function=None)[source]
Bases: IndependentCVaR
The multi-output Value-at-Risk risk measure that operates on each output dimension independently. For the same reasons as IndependentCVaR, the risk estimates provided by this are in general more optimistic than the definition of VaR would suggest.
Value-at-Risk measures the smallest possible reward (or largest possible loss) after excluding the worst outcomes with a total probability of 1 - alpha. It is commonly used in financial risk management, and it corresponds to the 1 - alpha quantile of a given random variable.
Transform the posterior samples to samples of a risk measure.
- Parameters:
alpha (float) – The risk level, float in (0.0, 1.0].
n_w (int) – The size of the w_set to calculate the risk measure over.
preprocessing_function (Callable[[Tensor], Tensor] | None) – A preprocessing function to apply to the samples before computing the risk measure. This can be used to scalarize multi-output samples before calculating the risk measure. For constrained optimization, this should also apply feasibility-weighting to samples. Given a batch x m-dim tensor of samples, this should return a batch-dim tensor.
- forward(samples, X=None)[source]
Calculate the VaR corresponding to the given samples.
- Parameters:
samples (Tensor) – A sample_shape x batch_shape x (q * n_w) x m-dim tensor of posterior samples. The q-batches should be ordered so that each n_w block of samples corresponds to the same input.
X (Tensor | None) – A batch_shape x q x d-dim tensor of inputs. Ignored.
- Returns:
A sample_shape x batch_shape x q x m'-dim tensor of VaR samples.
- Return type:
Tensor
- class botorch.acquisition.multi_objective.multi_output_risk_measures.MultiOutputWorstCase(n_w, preprocessing_function=None)[source]
Bases: MultiOutputRiskMeasureMCObjective
The multi-output worst-case risk measure.
Transform the posterior samples to samples of a risk measure.
- Parameters:
n_w (int) – The size of the w_set to calculate the risk measure over.
preprocessing_function (Callable[[Tensor], Tensor] | None) – A preprocessing function to apply to the samples before computing the risk measure. This can be used to remove non-objective outcomes or to align all outcomes for maximization. For constrained optimization, this should also apply feasibility-weighting to samples. Given a batch x m-dim tensor of samples, this should return a batch x m'-dim tensor.
- forward(samples, X=None)[source]
Calculate the worst-case measure corresponding to the given samples.
- Parameters:
samples (Tensor) – A sample_shape x batch_shape x (q * n_w) x m-dim tensor of posterior samples. The q-batches should be ordered so that each n_w block of samples corresponds to the same input.
X (Tensor | None) – A batch_shape x q x d-dim tensor of inputs. Ignored.
- Returns:
A sample_shape x batch_shape x q x m'-dim tensor of worst-case samples.
- Return type:
Tensor
- class botorch.acquisition.multi_objective.multi_output_risk_measures.MVaR(n_w, alpha, expectation=False, preprocessing_function=None, *, pad_to_n_w=False, filter_dominated=True, use_counting=False)[source]
Bases: MultiOutputRiskMeasureMCObjective
The multivariate Value-at-Risk as introduced in [Prekopa2012MVaR].
MVaR is defined as the non-dominated set of points in the extended domain of the random variable that have multivariate CDF greater than or equal to alpha. Note that MVaR is set valued and the size of the set depends on the particular realizations of the random variable. [Cousin2013MVaR] instead propose to use the expectation of the set-valued MVaR as the multivariate VaR. We support this alternative with an expectation flag.
This supports approximate gradients as discussed in [Daulton2022MARS].
The multivariate Value-at-Risk.
- Parameters:
n_w (int) – The size of the w_set to calculate the risk measure over.
alpha (float) – The risk level of MVaR, float in (0.0, 1.0]. Each MVaR value dominates alpha fraction of all observations.
expectation (bool) – If True, returns the expectation of the MVaR set as is done in [Cousin2013MVaR]. Otherwise, it returns the union of all values in the MVaR set. Default: False.
preprocessing_function (Callable[[Tensor], Tensor] | None) – A preprocessing function to apply to the samples before computing the risk measure. This can be used to remove non-objective outcomes or to align all outcomes for maximization. For constrained optimization, this should also apply feasibility-weighting to samples. Given a batch x m-dim tensor of samples, this should return a batch x m'-dim tensor.
pad_to_n_w (bool) – If True, instead of padding up to k', which is the size of the largest MVaR set across all batches, we pad the MVaR set up to n_w. This produces a return tensor of known size; however, it may in general be much larger than the alternative. See forward for more details on the return shape. NOTE: this is only relevant if expectation=False.
filter_dominated (bool) – If True, returns the non-dominated subset of alpha level points (this is MVaR as defined by [Prekopa2012MVaR]). Disabling this will make it faster, and may be preferable if the dominated points will be filtered out later, e.g., while calculating the hypervolume. Disabling this is not recommended if expectation=True.
use_counting (bool) – If True, uses get_mvar_set_via_counting for finding the MVaR set. This method is less memory intensive than the vectorized implementation, which is beneficial when n_w is quite large.
- get_mvar_set_via_counting(Y)[source]
Find MVaR set based on the definition in [Prekopa2012MVaR].
This first calculates the CDF for each point on the extended domain of the random variable (the grid defined by the given samples), then takes the values with CDF equal to (rounded if necessary) alpha. The non-dominated subset of these forms the MVaR set.
This implementation processes each batch of Y in a for loop using a counting-based implementation. It requires less memory than the vectorized implementation and should be used with large (>128) n_w values.
- Parameters:
Y (Tensor) – A batch x n_w x m-dim tensor of outcomes.
- Returns:
A batch length list of k x m-dim tensors of MVaR values, where k depends on the corresponding batch inputs. Note that MVaR values in general are not in-sample points.
- Return type:
list[Tensor]
- get_mvar_set_vectorized(Y)[source]
Find MVaR set based on the definition in [Prekopa2012MVaR].
This first calculates the CDF for each point on the extended domain of the random variable (the grid defined by the given samples), then takes the values with CDF equal to (rounded if necessary) alpha. The non-dominated subset of these forms the MVaR set.
This implementation computes the CDF of each point using highly vectorized operations. As such, it may use large amounts of memory, particularly when the batch size and/or n_w are large. It is typically faster than the alternative implementation when computing MVaR of a large batch of points with small to moderate (<128 for m=2, <64 for m=3) n_w.
- Parameters:
Y (Tensor) – A batch x n_w x m-dim tensor of observations.
- Returns:
A batch length list of k x m-dim tensors of MVaR values, where k depends on the corresponding batch inputs. Note that MVaR values in general are not in-sample points.
- Return type:
list[Tensor]
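The grid-and-CDF construction described for get_mvar_set_via_counting and get_mvar_set_vectorized can be sketched in a few lines of pure Python. This is a brute-force illustration for a single batch element, not the library's algorithm. It assumes a maximization convention in which a grid point counts as an alpha-level point when at least ceil(alpha * n_w) samples are greater than or equal to it in every output. The names is_dominated and mvar_set are hypothetical.

```python
from itertools import product
from math import ceil


def is_dominated(point, others):
    """True if some other point is >= in every coordinate and differs."""
    return any(
        other != point and all(o >= p for o, p in zip(other, point))
        for other in others
    )


def mvar_set(Y, alpha):
    """Brute-force MVaR set for one batch of n_w outcomes (maximization).

    Y is a list of n_w m-tuples. A grid point is kept as an alpha-level
    point if at least ceil(alpha * n_w) samples are >= it in every output.
    """
    n_w, m = len(Y), len(Y[0])
    alpha_count = ceil(alpha * n_w)
    # Extended domain: the grid spanned by the observed values of each output.
    axes = [sorted({y[j] for y in Y}) for j in range(m)]
    alpha_level = [
        z
        for z in product(*axes)
        if sum(all(y[j] >= z[j] for j in range(m)) for y in Y) >= alpha_count
    ]
    # MVaR is the non-dominated subset of the alpha-level points.
    return sorted(z for z in alpha_level if not is_dominated(z, alpha_level))


Y = [(1.0, 4.0), (2.0, 3.0), (3.0, 2.0), (4.0, 1.0)]
mvar = mvar_set(Y, alpha=0.5)
```

In this example none of the resulting points coincides with a row of Y, which illustrates the remark that MVaR values are in general not in-sample points. The grid has up to n_w ** m points, which is why memory becomes a concern for large n_w.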
- make_differentiable(prepared_samples, mvars)[source]
An experimental approach for obtaining the gradient of the MVaR via component-wise mapping to original samples. See [Daulton2022MARS].
- Parameters:
prepared_samples (Tensor) – A (sample_shape * batch_shape * q) x n_w x m-dim tensor of posterior samples. The q-batches should be ordered so that each n_w block of samples corresponds to the same input.
mvars (Tensor) – A (sample_shape * batch_shape * q) x k x m-dim tensor of padded MVaR values.
- Returns:
The same mvars with entries mapped to inputs to produce gradients.
- Return type:
Tensor
- forward(samples, X=None)[source]
Calculate the MVaR corresponding to the given samples.
- Parameters:
samples (Tensor) – A sample_shape x batch_shape x (q * n_w) x m-dim tensor of posterior samples. The q-batches should be ordered so that each n_w block of samples corresponds to the same input.
X (Tensor | None) – A batch_shape x q x d-dim tensor of inputs. Ignored.
- Returns:
A sample_shape x batch_shape x q x m'-dim tensor of MVaR values, if self.expectation=True. Otherwise, this returns a sample_shape x batch_shape x (q * k') x m'-dim tensor, where k' is the maximum k across all batches that is returned by get_mvar_set_.... Each (q * k') x m' corresponds to the k MVaR values for each q batch of n_w inputs, padded up to k' by repeating the last element. If self.pad_to_n_w, we set k' = self.n_w, producing a deterministic return shape.
- Return type:
Tensor
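The padding rule in the return description (repeat the last element until every set has k' points) is easy to show on plain lists. The sketch below is illustrative only. The function pad_mvar_sets and its pad_to argument are hypothetical names, with pad_to standing in for the case where k' is fixed to n_w.

```python
def pad_mvar_sets(mvar_sets, pad_to=None):
    """Pad each k x m set to a common length by repeating its last point."""
    k_max = max(len(s) for s in mvar_sets) if pad_to is None else pad_to
    padded = []
    for s in mvar_sets:
        if not s or len(s) > k_max:
            raise ValueError("each set must be non-empty and fit the target size")
        padded.append(list(s) + [s[-1]] * (k_max - len(s)))
    return padded


sets = [[(1.0, 3.0)], [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]]
padded = pad_mvar_sets(sets)  # k' = 3, the largest set size
padded_fixed = pad_mvar_sets(sets, pad_to=4)  # fixed target, e.g. n_w = 4
```

Repeating an existing point leaves the set of distinct MVaR values unchanged, so quantities that depend only on that set, such as a hypervolume, should not be affected by the padding.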
- class botorch.acquisition.multi_objective.multi_output_risk_measures.MARS(alpha, n_w, chebyshev_weights, baseline_Y=None, ref_point=None, preprocessing_function=None)[source]
Bases: VaR, MultiOutputRiskMeasureMCObjective
MVaR Approximation based on Random Scalarizations as introduced in [Daulton2022MARS].
This approximates MVaR via VaR of Chebyshev scalarizations, where each scalarization corresponds to a point in the MVaR set. As implemented, this uses one set of scalarization weights to approximate a single MVaR value. Note that due to the normalization within the Chebyshev scalarization, the output of this risk measure may not be on the same scale as its inputs.
Transform the posterior samples to samples of a risk measure.
- Parameters:
alpha (float) – The risk level, float in (0.0, 1.0].
n_w (int) – The size of the perturbation set to calculate the risk measure over.
chebyshev_weights (Tensor | list[float]) – The weights to use in the Chebyshev scalarization. The Chebyshev scalarization is applied before computing VaR. The weights must be non-negative. See preprocessing_function to support minimization objectives.
baseline_Y (Tensor | None) – An n' x d-dim tensor of baseline outcomes to use in determining the normalization bounds for Chebyshev scalarization. It is recommended to set this via the set_baseline_Y helper.
ref_point (Tensor | list[float] | None) – An optional MVaR reference point to use in determining the normalization bounds for Chebyshev scalarization.
preprocessing_function (Callable[[Tensor], Tensor] | None) – A preprocessing function to apply to the samples before computing the risk measure. This can be used to remove non-objective outcomes or to align all outcomes for maximization. For constrained optimization, this should also apply feasibility-weighting to samples.
- set_baseline_Y(model, X_baseline, Y_samples=None)[source]
Set the baseline_Y based on the MVaR predictions of the model for X_baseline.
- Parameters:
model (Model | None) – The model being used for MARS optimization. Must have a compatible InputPerturbation transform attached. Ignored if Y_samples is given.
X_baseline (Tensor | None) – An n x d-dim tensor of previously evaluated points. Ignored if Y_samples is given.
Y_samples (Tensor | None) – An optional (n * n_w) x d-dim tensor of predictions. If given, instead of sampling from the model, these are used.
- Return type:
None
- property chebyshev_weights: Tensor
The weights used in Chebyshev scalarization.
- property baseline_Y: Tensor | None
Baseline outcomes used in determining the normalization bounds.
- property chebyshev_objective: Callable[[Tensor, Tensor | None], Tensor]
The objective for applying the Chebyshev scalarization.
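As a rough illustration of "VaR of Chebyshev scalarizations", the sketch below scalarizes each multi-output sample and then takes an empirical lower-tail VaR over one block of n_w samples. Several details are assumptions made for the example rather than statements about BoTorch: the scalarization is taken to be the min-type form min_i w_i * y_i on outcomes normalized to [0, 1] with user-supplied bounds, the VaR index rule is one plausible choice, and the function names are hypothetical.

```python
from math import ceil


def chebyshev_scalarize(y, weights, lower, upper):
    """Min-type Chebyshev scalarization of one m-dim outcome.

    Outcomes are first normalized to [0, 1] using the given bounds (an
    assumed normalization), then combined as min_i w_i * y_i.
    """
    normalized = [(yi - lo) / (hi - lo) for yi, lo, hi in zip(y, lower, upper)]
    return min(w * v for w, v in zip(weights, normalized))


def mars_value(Y, weights, alpha, lower, upper):
    """Empirical VaR of the scalarized outcomes over one block of n_w samples."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    scalarized = sorted(chebyshev_scalarize(y, weights, lower, upper) for y in Y)
    k = len(Y) - (ceil(len(Y) * alpha) - 1)  # size of the lower tail
    return scalarized[k - 1]  # boundary of the lower tail


Y = [(1.0, 4.0), (2.0, 3.0), (3.0, 2.0), (4.0, 1.0)]
value = mars_value(
    Y, weights=[0.5, 0.5], alpha=0.5, lower=[0.0, 0.0], upper=[4.0, 4.0]
)
```

The result lives on the normalized, weighted scale rather than the scale of Y, which is the point of the caveat about the output not being on the same scale as the inputs.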
Utilities
Fixed Feature Acquisition Function
A wrapper around AcquisitionFunctions to fix certain features for optimization. This is useful e.g. for performing contextual optimization.
- botorch.acquisition.fixed_feature.get_dtype_of_sequence(values)[source]
Return torch.float32 if everything is single-precision and torch.float64 otherwise.
Numbers (non-tensors) are double-precision.
- Parameters:
values (Sequence[Tensor | float])
- Return type:
dtype
- botorch.acquisition.fixed_feature.get_device_of_sequence(values)[source]
CPU if everything is on the CPU; CUDA otherwise.
Numbers (non-tensors) are considered to be on the CPU.
- Parameters:
values (Sequence[Tensor | float])
- Return type:
device
- class botorch.acquisition.fixed_feature.FixedFeatureAcquisitionFunction(acq_function, d, columns, values)[source]
Bases: AcquisitionFunction
A wrapper around AcquisitionFunctions to fix a subset of features.
Example
>>> model = SingleTaskGP(train_X, train_Y)  # d = 5
>>> qEI = qExpectedImprovement(model, best_f=0.0)
>>> columns = [2, 4]
>>> values = X[..., columns]
>>> qEI_FF = FixedFeatureAcquisitionFunction(qEI, 5, columns, values)
>>> qei = qEI_FF(test_X)  # d' = 3
Derived Acquisition Function by fixing a subset of input features.
- Parameters:
acq_function (AcquisitionFunction) – The base acquisition function, operating on input tensors X_full of feature dimension d.
d (int) – The feature dimension expected by acq_function.
columns (list[int]) – d_f < d indices of columns in X_full that are to be fixed to the provided values.
values (Tensor | Sequence[Tensor | float]) – The values to which to fix the columns in columns. Either a full batch_shape x q x d_f tensor of values (if values are different for each of the q input points), or an array-like of values that is broadcastable to the input across t-batch and q-batch dimensions, e.g. a list of length d_f if values are the same across all t and q-batch dimensions, or a combination of Tensors and numbers which can be broadcasted to form a tensor with trailing dimension size of d_f.
- forward(X)[source]
Evaluate base acquisition function under the fixed features.
- Parameters:
X (Tensor) – Input tensor of feature dimension d' < d such that d' + d_f = d.
- Returns:
Base acquisition function evaluated on tensor X_full constructed by adding values in the appropriate places (see _construct_X_full).
- property X_pending
Return the X_pending of the base acquisition function.
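The idea of rebuilding a full d-dimensional input from d' free features and d_f fixed ones can be shown for a single design point with plain lists. This is a hedged sketch of the concept only: construct_x_full is a hypothetical helper, and the real wrapper operates on batched tensors.

```python
def construct_x_full(x_free, d, columns, values):
    """Interleave free and fixed feature values into one d-dim point."""
    if len(columns) != len(values) or len(x_free) + len(columns) != d:
        raise ValueError("need d' + d_f = d and one value per fixed column")
    fixed = dict(zip(columns, values))
    free_iter = iter(x_free)
    # Fixed columns take their preset value; all others consume x_free in order.
    return [fixed[j] if j in fixed else next(free_iter) for j in range(d)]


# d = 5 with columns 2 and 4 fixed, so the optimizer only sees d' = 3 features.
x_full = construct_x_full([1.0, 2.0, 3.0], d=5, columns=[2, 4], values=[0.5, 0.9])
```

Because the free features keep their relative order, an optimizer can work in the reduced d'-dimensional space while the wrapped function still receives inputs of the dimension it expects.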
Constructors for Acquisition Function Input Arguments
A registry of helpers for generating inputs to acquisition function constructors programmatically from a consistent input format.
- botorch.acquisition.input_constructors.get_acqf_input_constructor(acqf_cls)[source]
Get acquisition function input constructor from registry.
- Parameters:
acqf_cls (type[AcquisitionFunction]) – The AcquisitionFunction class (not instance) for which to retrieve the input constructor.
- Returns:
The input constructor associated with acqf_cls.
- Return type:
Callable[[…], dict[str, Any]]
- botorch.acquisition.input_constructors.allow_only_specific_variable_kwargs(f)[source]
Decorator for allowing a function to accept keyword arguments that are not explicitly listed in the function signature, but only specific ones.
This decorator is applied in acqf_input_constructor so that all constructors obtained with acqf_input_constructor allow keyword arguments such as training_data and objective, even if they do not appear in the signature of f. Any other keyword arguments will raise an error.
- Parameters:
f (Callable[[...], T])
- Return type:
Callable[[…], T]
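A decorator with this behavior could look roughly like the following. This is a sketch under stated assumptions, not the library's code: the allow-list ALLOWED_EXTRA_KWARGS, the decorator name allow_only_specific_kwargs, the demo constructor, and the choice of TypeError for rejected arguments are all hypothetical.

```python
import functools
import inspect

# Hypothetical allow-list; the set used by BoTorch is not reproduced here.
ALLOWED_EXTRA_KWARGS = {"training_data", "objective"}


def allow_only_specific_kwargs(f):
    """Let f silently ignore allow-listed kwargs that it does not declare."""
    declared = set(inspect.signature(f).parameters)

    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        unexpected = set(kwargs) - declared - ALLOWED_EXTRA_KWARGS
        if unexpected:
            raise TypeError(f"Unexpected keyword arguments: {sorted(unexpected)}")
        # Forward only the kwargs that f actually declares.
        return f(*args, **{k: v for k, v in kwargs.items() if k in declared})

    return wrapper


@allow_only_specific_kwargs
def construct_inputs_demo(model, beta=0.2):
    return {"model": model, "beta": beta}


# training_data is allow-listed, so it is dropped instead of raising.
inputs = construct_inputs_demo(model="gp", training_data=None)
```

The benefit of this pattern is that a caller can pass one common set of arguments to every constructor, while typos and unsupported options still fail loudly.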
- botorch.acquisition.input_constructors.acqf_input_constructor(*acqf_cls)[source]
Decorator for registering acquisition function input constructors.
- Parameters:
acqf_cls (type[AcquisitionFunction]) – The AcquisitionFunction classes (not instances) for which to register the input constructor.
- Return type:
Callable[[…], AcquisitionFunction]
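The registry pattern behind get_acqf_input_constructor and acqf_input_constructor can be illustrated with a small stand-alone example. Everything named here (_REGISTRY, register_input_constructor, get_input_constructor, DemoAcqf, and the exception types) is a hypothetical stand-in used to show the mechanism, not the library's internals.

```python
from typing import Any, Callable

# Hypothetical module-level registry mapping classes to input constructors.
_REGISTRY: dict[type, Callable[..., dict[str, Any]]] = {}


def register_input_constructor(*classes: type):
    """Decorator that registers one constructor for one or more classes."""

    def decorator(fn):
        for cls in classes:
            if cls in _REGISTRY:
                raise ValueError(f"Constructor already registered for {cls.__name__}")
            _REGISTRY[cls] = fn
        return fn

    return decorator


def get_input_constructor(cls: type):
    """Look up the constructor registered for a class (not an instance)."""
    try:
        return _REGISTRY[cls]
    except KeyError:
        raise RuntimeError(f"No constructor registered for {cls.__name__}") from None


class DemoAcqf:  # stand-in for an AcquisitionFunction subclass
    def __init__(self, model, beta):
        self.model, self.beta = model, beta


@register_input_constructor(DemoAcqf)
def construct_inputs_demo(model, beta=0.2):
    return {"model": model, "beta": beta}


# Build the kwargs from a uniform input format, then instantiate the class.
acqf = DemoAcqf(**get_input_constructor(DemoAcqf)(model="gp"))
```

Keying the registry on the class keeps construction logic next to each acquisition function while letting generic code instantiate any registered class from the same inputs.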
- botorch.acquisition.input_constructors.construct_inputs_posterior_mean(model, posterior_transform=None)[source]
Construct kwargs for PosteriorMean acquisition function.
- Parameters:
model (Model) – The model to be used in the acquisition function.
posterior_transform (PosteriorTransform | None) – The posterior transform to be used in the acquisition function.
- Returns:
A dict mapping kwarg names of the constructor to values.
- Return type:
dict[str, Model | PosteriorTransform | None]
- botorch.acquisition.input_constructors.construct_inputs_best_f(model, training_data, posterior_transform=None, best_f=None, maximize=True)[source]
Construct kwargs for the acquisition functions requiring best_f.
- Parameters:
model (Model) – The model to be used in the acquisition function.
training_data (SupervisedDataset | dict[Hashable, SupervisedDataset]) – Dataset(s) used to train the model. Used to determine default value for best_f.
best_f (float | Tensor | None) – Threshold above (or below) which improvement is defined.
posterior_transform (PosteriorTransform | None) – The posterior transform to be used in the acquisition function.
maximize (bool) – If True, consider the problem a maximization problem.
- Returns:
A dict mapping kwarg names of the constructor to values.
- Return type:
dict[str, Any]
- botorch.acquisition.input_constructors.construct_inputs_pof(model, constraints_tuple)[source]
Construct kwargs for the log probability of feasibility acquisition function.
- Parameters:
model (Model) – The model to be used in the acquisition function.
constraints_tuple (tuple[Tensor, Tensor]) – A tuple of (A, b). For k outcome constraints and m outputs at f(x), A is k x m and b is k x 1 such that A f(x) <= b.
- Returns:
A dict mapping kwarg names of the constructor to values.
- Return type:
dict[str, Any]
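The constraint convention A f(x) <= b, with A of shape k x m and b of shape k x 1, can be checked by hand on nested lists. The helper below is a hypothetical illustration of that convention and is not part of the documented API.

```python
def is_feasible(A, b, f_x):
    """Check A f(x) <= b row by row for k constraints on m outputs."""
    if len(A) != len(b) or any(len(row) != len(f_x) for row in A):
        raise ValueError("A must be k x m, b must be k x 1, f(x) must have m entries")
    return all(
        sum(a_ij * f_j for a_ij, f_j in zip(row, f_x)) <= b_i[0]
        for row, b_i in zip(A, b)
    )


# k = 2 constraints on m = 3 outputs (0-indexed):
#   output 1 <= 1.0   and   output 2 - output 1 <= 0.0
A = [[0.0, 1.0, 0.0], [0.0, -1.0, 1.0]]
b = [[1.0], [0.0]]
feasible = is_feasible(A, b, f_x=[3.0, 0.5, 0.2])
infeasible = is_feasible(A, b, f_x=[3.0, 1.5, 0.2])
```

Each row of A picks out a linear combination of the model outputs, so a simple bound on a single output is just a row with one non-zero entry.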
- botorch.acquisition.input_constructors.construct_inputs_logcei(model, training_data, objective_index, constraints_tuple, best_f=None, maximize=True)[source]
Construct kwargs for the log constrained expected improvement acquisition function.
- Parameters:
model (Model) – The model to be used in the acquisition function.
training_data (SupervisedDataset | dict[Hashable, SupervisedDataset]) – Dataset(s) used to train the model. Used to determine default value for best_f.
objective_index (int) – The index of the objective.
constraints_tuple (tuple[Tensor, Tensor]) – A tuple of (A, b). For k outcome constraints and m outputs at f(x), A is k x m and b is k x 1 such that A f(x) <= b.
best_f (float | Tensor | None) – Either a scalar or a b-dim Tensor (batch mode) representing the best feasible function value observed so far (assumed noiseless).
maximize (bool) – If True, consider the problem a maximization problem.
- Returns:
A dict mapping kwarg names of the constructor to values.
- Return type:
dict[str, Any]
- botorch.acquisition.input_constructors.construct_inputs_ucb(model, posterior_transform=None, beta=0.2, maximize=True)[source]
Construct kwargs for UpperConfidenceBound.
- Parameters:
model (Model) – The model to be used in the acquisition function.
posterior_transform (PosteriorTransform | None) – The posterior transform to be used in the acquisition function.
beta (float | Tensor) – Either a scalar or a one-dim tensor with b elements (batch mode) representing the trade-off parameter between mean and covariance.
maximize (bool) – If True, consider the problem a maximization problem.
- Returns:
A dict mapping kwarg names of the constructor to values.
- Return type:
dict[str, Any]
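How beta trades off the posterior mean against uncertainty can be seen in a scalar sketch. It assumes the common form mean + sqrt(beta) * sigma, with the mean negated for minimization; the function name ucb is illustrative and this is not the library implementation.

```python
import math


def ucb(mean: float, sigma: float, beta: float = 0.2, maximize: bool = True) -> float:
    """Upper confidence bound for a single point (assumed form)."""
    # Negating the mean turns a minimization problem into a maximization one.
    signed_mean = mean if maximize else -mean
    # A larger beta puts more weight on the posterior standard deviation.
    return signed_mean + math.sqrt(beta) * sigma


print(ucb(1.0, 2.0, beta=4.0))                  # mean plus two standard deviations
print(ucb(1.0, 2.0, beta=4.0, maximize=False))  # negated mean plus two standard deviations
```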
- botorch.acquisition.input_constructors.construct_inputs_noisy_ei(model, training_data, num_fantasies=20, maximize=True)[source]
Construct kwargs for NoisyExpectedImprovement.
- Parameters:
model (Model) – The model to be used in the acquisition function.
training_data (SupervisedDataset | dict[Hashable, SupervisedDataset]) – Dataset(s) used to train the model.
num_fantasies (int) – The number of fantasies to generate. The higher this number the more accurate the model (at the expense of model complexity and performance).
maximize (bool) – If True, consider the problem a maximization problem.
- Returns:
A dict mapping kwarg names of the constructor to values.
- Return type:
dict[str, Any]
- botorch.acquisition.input_constructors.construct_inputs_qSimpleRegret(model, objective=None, posterior_transform=None, X_pending=None, sampler=None, constraints=None, X_baseline=None)[source]
Construct kwargs for qSimpleRegret.
- Parameters:
model (Model) – The model to be used in the acquisition function.
objective (MCAcquisitionObjective | None) – The objective to be used in the acquisition function.
posterior_transform (PosteriorTransform | None) – The posterior transform to be used in the acquisition function.
X_pending (Tensor | None) – A batch_shape, m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated.
sampler (MCSampler | None) – The sampler used to draw base samples. If omitted, uses the acquisition function’s default sampler.
constraints (list[Callable[[Tensor], Tensor]] | None) – A list of constraint callables which map a Tensor of posterior samples of dimension sample_shape x batch-shape x q x m-dim to a sample_shape x batch-shape x q-dim Tensor. The associated constraints are considered satisfied if the output is less than zero.
X_baseline (Tensor | None) – A batch_shape x r x d-dim Tensor of r design points that have already been observed. These points are considered as the potential best design point. If omitted, checks that all training_data have the same input features and takes the first X.
- Returns:
A dict mapping kwarg names of the constructor to values.
- Return type:
dict[str, Any]
- botorch.acquisition.input_constructors.construct_inputs_qEI(model, training_data, objective=None, posterior_transform=None, X_pending=None, sampler=None, best_f=None, constraints=None, eta=0.001)[source]
Construct kwargs for the qExpectedImprovement constructor.
- Parameters:
model (Model) – The model to be used in the acquisition function.
training_data (SupervisedDataset | dict[Hashable, SupervisedDataset]) – Dataset(s) used to train the model.
objective (MCAcquisitionObjective | None) – The objective to be used in the acquisition function.
posterior_transform (PosteriorTransform | None) – The posterior transform to be used in the acquisition function.
X_pending (Tensor | None) – An m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call.
sampler (MCSampler | None) – The sampler used to draw base samples. If omitted, uses the acquisition function’s default sampler.
best_f (float | Tensor | None) – Threshold above (or below) which improvement is defined.
constraints (list[Callable[[Tensor], Tensor]] | None) – A list of constraint callables which map a Tensor of posterior samples of dimension sample_shape x batch-shape x q x m-dim to a sample_shape x batch-shape x q-dim Tensor. The associated constraints are considered satisfied if the output is less than zero.
eta (Tensor | float) – Temperature parameter(s) governing the smoothness of the sigmoid approximation to the constraint indicators. For more details on this parameter, see the docs of compute_smoothed_feasibility_indicator.
- Returns:
A dict mapping kwarg names of the constructor to values.
- Return type:
dict[str, Any]
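The interplay between the constraints callables and eta can be sketched without any tensor library. The sketch assumes the smoothed indicator is a product of sigmoids evaluated at -c / eta, so that negative constraint values (feasible) map close to 1; the helper names are illustrative and not part of BoTorch.

```python
import math


def stable_sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def smoothed_feasibility(constraint_values, eta: float = 1e-3) -> float:
    """Smooth stand-in for the indicator that all constraint values are < 0."""
    weight = 1.0
    for c in constraint_values:
        # Feasible values (c < 0) give a factor near 1; infeasible ones near 0.
        weight *= stable_sigmoid(-c / eta)
    return weight


print(smoothed_feasibility([-1.0]))       # clearly feasible
print(smoothed_feasibility([1.0]))        # clearly infeasible
print(smoothed_feasibility([0.0]))        # on the boundary
print(smoothed_feasibility([-1.0, 1.0]))  # one violated constraint dominates
```

A smaller eta makes the transition at zero sharper, at the price of larger gradients near the constraint boundary.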
- botorch.acquisition.input_constructors.construct_inputs_qLogEI(model, training_data, objective=None, posterior_transform=None, X_pending=None, sampler=None, best_f=None, constraints=None, eta=0.001, fat=True, tau_max=0.01, tau_relu=1e-06)[source]
Construct kwargs for the qLogExpectedImprovement constructor.
- Parameters:
model (Model) – The model to be used in the acquisition function.
training_data (SupervisedDataset | dict[Hashable, SupervisedDataset]) – Dataset(s) used to train the model.
objective (MCAcquisitionObjective | None) – The objective to be used in the acquisition function.
posterior_transform (PosteriorTransform | None) – The posterior transform to be used in the acquisition function.
X_pending (Tensor | None) – An m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call.
sampler (MCSampler | None) – The sampler used to draw base samples. If omitted, uses the acquisition function’s default sampler.
best_f (float | Tensor | None) – Threshold above (or below) which improvement is defined.
constraints (list[Callable[[Tensor], Tensor]] | None) – A list of constraint callables which map a Tensor of posterior samples of dimension sample_shape x batch-shape x q x m-dim to a sample_shape x batch-shape x q-dim Tensor. The associated constraints are considered satisfied if the output is less than zero.
eta (Tensor | float) – Temperature parameter(s) governing the smoothness of the sigmoid approximation to the constraint indicators. For more details on this parameter, see the docs of compute_smoothed_feasibility_indicator.
fat (bool) – Toggles the logarithmic / linear asymptotic behavior of the smooth approximation to the ReLU.
tau_max (float) – Temperature parameter controlling the sharpness of the smooth approximations to max.
tau_relu (float) – Temperature parameter controlling the sharpness of the smooth approximations to ReLU.
- Returns:
A dict mapping kwarg names of the constructor to values.
- Return type:
dict[str, Any]
- botorch.acquisition.input_constructors.construct_inputs_LogPF(model, constraints, posterior_transform=None, X_pending=None, sampler=None, eta=0.001, fat=True, tau_max=0.01)[source]
Construct kwargs for the qLogProbabilityOfFeasibility constructor.
- Parameters:
model (Model) – The model to be used in the acquisition function.
constraints (list[Callable[[Tensor], Tensor]]) – A list of constraint callables which map a Tensor of posterior samples of dimension sample_shape x batch-shape x q x m-dim to a sample_shape x batch-shape x q-dim Tensor. The associated constraints are considered satisfied if the output is less than zero.
posterior_transform (PosteriorTransform | None) – The posterior transform to be used in the acquisition function.
X_pending (Tensor | None) – An m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call.
sampler (MCSampler | None) – The sampler used to draw base samples. If omitted, uses the acquisition function’s default sampler.
eta (Tensor | float) – Temperature parameter(s) governing the smoothness of the sigmoid approximation to the constraint indicators. For more details on this parameter, see the docs of compute_smoothed_feasibility_indicator.
fat (bool) – Toggles the logarithmic / linear asymptotic behavior of the smooth approximation to the ReLU.
tau_max (float) – Temperature parameter controlling the sharpness of the smooth approximations to max.
- Returns:
A dictionary mapping kwarg names of the constructor to values.
- Return type:
dict[str, Any]
- botorch.acquisition.input_constructors.construct_inputs_qNEI(model, training_data, objective=None, posterior_transform=None, X_pending=None, sampler=None, X_baseline=None, prune_baseline=True, cache_root=None, constraints=None, eta=0.001)[source]
Construct kwargs for the qNoisyExpectedImprovement constructor.
- Parameters:
model (Model) – The model to be used in the acquisition function.
training_data (SupervisedDataset | dict[Hashable, SupervisedDataset]) – Dataset(s) used to train the model.
objective (MCAcquisitionObjective | None) – The objective to be used in the acquisition function.
posterior_transform (PosteriorTransform | None) – The posterior transform to be used in the acquisition function.
X_pending (Tensor | None) – An m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call.
sampler (MCSampler | None) – The sampler used to draw base samples. If omitted, uses the acquisition function’s default sampler.
X_baseline (Tensor | None) – A batch_shape x r x d-dim Tensor of r design points that have already been observed. These points are considered as the potential best design point. If omitted, checks that all training_data have the same input features and takes the first X.
prune_baseline (bool | None) – If True, remove points in X_baseline that are highly unlikely to be the best point. This can significantly improve performance and is generally recommended.
cache_root (bool | None) – A boolean indicating whether to cache the root decomposition over X_baseline and use low-rank updates. If None, will be set to True if the model supports it and False otherwise.
constraints (list[Callable[[Tensor], Tensor]] | None) – A list of constraint callables which map a Tensor of posterior samples of dimension sample_shape x batch-shape x q x m-dim to a sample_shape x batch-shape x q-dim Tensor. The associated constraints are considered satisfied if the output is less than zero.
eta (Tensor | float) – Temperature parameter(s) governing the smoothness of the sigmoid approximation to the constraint indicators. For more details on this parameter, see the docs of compute_smoothed_feasibility_indicator.
- Returns:
A dict mapping kwarg names of the constructor to values.
- Return type:
dict[str, Any]
- botorch.acquisition.input_constructors.construct_inputs_qLogNEI(model, training_data, objective=None, posterior_transform=None, X_pending=None, sampler=None, X_baseline=None, prune_baseline=True, cache_root=None, constraints=None, eta=0.001, fat=True, tau_max=0.01, tau_relu=1e-06, incremental=True)[source]
Construct kwargs for the qLogNoisyExpectedImprovement constructor.
- Parameters:
model (Model) – The model to be used in the acquisition function.
training_data (SupervisedDataset | dict[Hashable, SupervisedDataset]) – Dataset(s) used to train the model.
objective (MCAcquisitionObjective | None) – The objective to be used in the acquisition function.
posterior_transform (PosteriorTransform | None) – The posterior transform to be used in the acquisition function.
X_pending (Tensor | None) – An m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call.
sampler (MCSampler | None) – The sampler used to draw base samples. If omitted, uses the acquisition function’s default sampler.
X_baseline (Tensor | None) – A batch_shape x r x d-dim Tensor of r design points that have already been observed. These points are considered as the potential best design point. If omitted, checks that all training_data have the same input features and takes the first X.
prune_baseline (bool | None) – If True, remove points in X_baseline that are highly unlikely to be the best point. This can significantly improve performance and is generally recommended.
constraints (list[Callable[[Tensor], Tensor]] | None) – A list of constraint callables which map a Tensor of posterior samples of dimension sample_shape x batch-shape x q x m-dim to a sample_shape x batch-shape x q-dim Tensor. The associated constraints are considered satisfied if the output is less than zero.
eta (Tensor | float) – Temperature parameter(s) governing the smoothness of the sigmoid approximation to the constraint indicators. For more details on this parameter, see the docs of compute_smoothed_feasibility_indicator.
fat (bool) – Toggles the use of the fat-tailed non-linearities to smoothly approximate the constraints indicator function.
tau_max (float) – Temperature parameter controlling the sharpness of the smooth approximations to max.
tau_relu (float) – Temperature parameter controlling the sharpness of the smooth approximations to ReLU.
incremental (bool) – Whether to compute incremental EI over the pending points or compute EI of the joint batch improvement (including pending points).
cache_root (bool | None)
- Returns:
A dict mapping kwarg names of the constructor to values.
- botorch.acquisition.input_constructors.construct_inputs_qPI(model, training_data, objective=None, posterior_transform=None, X_pending=None, sampler=None, tau=0.001, best_f=None, constraints=None, eta=0.001)[source]
Construct kwargs for the qProbabilityOfImprovement constructor.
- Parameters:
model (Model) – The model to be used in the acquisition function.
training_data (SupervisedDataset | dict[Hashable, SupervisedDataset]) – Dataset(s) used to train the model.
objective (MCAcquisitionObjective | None) – The objective to be used in the acquisition function.
posterior_transform (PosteriorTransform | None) – The posterior transform to be used in the acquisition function.
X_pending (Tensor | None) – An m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call.
sampler (MCSampler | None) – The sampler used to draw base samples. If omitted, uses the acquisition function’s default sampler.
tau (float) – The temperature parameter used in the sigmoid approximation of the step function. Smaller values yield more accurate approximations of the function, but result in gradient estimates with higher variance.
best_f (float | Tensor | None) – The best objective value observed so far (assumed noiseless). Can be a batch_shape-shaped tensor, which in case of a batched model specifies potentially different values for each element of the batch.
constraints (list[Callable[[Tensor], Tensor]] | None) – A list of constraint callables which map a Tensor of posterior samples of dimension sample_shape x batch-shape x q x m-dim to a sample_shape x batch-shape x q-dim Tensor. The associated constraints are considered satisfied if the output is less than zero.
eta (Tensor | float) – Temperature parameter(s) governing the smoothness of the sigmoid approximation to the constraint indicators. For more details on this parameter, see the docs of compute_smoothed_feasibility_indicator.
- Returns:
A dict mapping kwarg names of the constructor to values.
- Return type:
dict[str, Any]
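The effect of tau on the sigmoid approximation of the step function can be seen on scalars. This is a sketch under the assumption that the step is approximated by a sigmoid of (objective value - best_f) / tau; the function name soft_step is illustrative.

```python
import math


def soft_step(objective_value: float, best_f: float, tau: float = 1e-3) -> float:
    """Sigmoid approximation to the indicator objective_value > best_f."""
    z = (objective_value - best_f) / tau
    # Branch on the sign of z so math.exp never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# A small tau is close to a hard 0/1 step; a large tau blurs the decision.
print(soft_step(1.1, best_f=1.0, tau=1e-3))
print(soft_step(0.9, best_f=1.0, tau=1e-3))
print(soft_step(1.1, best_f=1.0, tau=1.0))
```

The sharper the step, the closer its derivative is to zero almost everywhere, which is why small values of tau lead to noisier gradient estimates.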
- botorch.acquisition.input_constructors.construct_inputs_qUCB(model, objective=None, posterior_transform=None, X_pending=None, sampler=None, X_baseline=None, constraints=None, beta=0.2)[source]
Construct kwargs for the qUpperConfidenceBound constructor.
- Parameters:
model (Model) – The model to be used in the acquisition function.
objective (MCAcquisitionObjective | None) – The objective to be used in the acquisition function.
posterior_transform (PosteriorTransform | None) – The posterior transform to be used in the acquisition function.
X_pending (Tensor | None) – An m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call.
sampler (MCSampler | None) – The sampler used to draw base samples. If omitted, uses the acquisition function’s default sampler.
X_baseline (Tensor | None) – A batch_shape x r x d-dim Tensor of r design points that have already been observed. These points are used to compute the infeasible cost when there are constraints.
constraints (list[Callable[[Tensor], Tensor]] | None) – A list of constraint callables which map a Tensor of posterior samples of dimension sample_shape x batch-shape x q x m-dim to a sample_shape x batch-shape x q-dim Tensor. The associated constraints are considered satisfied if the output is less than zero.
beta (float) – Controls the tradeoff between mean and standard deviation in UCB.
- Returns:
A dict mapping kwarg names of the constructor to values.
- Return type:
dict[str, Any]
- botorch.acquisition.input_constructors.construct_inputs_EHVI(model, training_data, objective_thresholds=None, posterior_transform=None, constraints=None, alpha=None, Y_pmean=None, ref_point=None)[source]
Construct kwargs for the ExpectedHypervolumeImprovement constructor.
- Parameters:
model (Model)
training_data (SupervisedDataset | dict[Hashable, SupervisedDataset])
objective_thresholds (Tensor | None)
posterior_transform (PosteriorTransform | None)
constraints (list[Callable[[Tensor], Tensor]] | None)
alpha (float | None)
Y_pmean (Tensor | None)
ref_point (Tensor | None)
- Return type:
dict[str, Any]
- botorch.acquisition.input_constructors.construct_inputs_qEHVI(model, training_data, objective_thresholds=None, objective=None, constraints=None, alpha=None, sampler=None, X_pending=None, eta=0.001, mc_samples=128, qmc=True, ref_point=None)[source]
Construct kwargs for qExpectedHypervolumeImprovement and qLogExpectedHypervolumeImprovement.
- Parameters:
model (Model)
training_data (SupervisedDataset | dict[Hashable, SupervisedDataset])
objective_thresholds (Tensor | None)
objective (MCMultiOutputObjective | None)
constraints (list[Callable[[Tensor], Tensor]] | None)
alpha (float | None)
sampler (MCSampler | None)
X_pending (Tensor | None)
eta (float)
mc_samples (int)
qmc (bool)
ref_point (Tensor | None)
- Return type:
dict[str, Any]
- botorch.acquisition.input_constructors.construct_inputs_qNEHVI(model, training_data, objective_thresholds=None, objective=None, X_baseline=None, constraints=None, alpha=None, sampler=None, X_pending=None, eta=0.001, fat=False, mc_samples=128, qmc=True, prune_baseline=True, cache_pending=True, max_iep=0, incremental_nehvi=True, cache_root=None, ref_point=None)[source]
Construct kwargs for qNoisyExpectedHypervolumeImprovement’s constructor.
- Parameters:
model (Model)
training_data (SupervisedDataset | dict[Hashable, SupervisedDataset])
objective_thresholds (Tensor | None)
objective (MCMultiOutputObjective | None)
X_baseline (Tensor | None)
constraints (list[Callable[[Tensor], Tensor]] | None)
alpha (float | None)
sampler (MCSampler | None)
X_pending (Tensor | None)
eta (float)
fat (bool)
mc_samples (int)
qmc (bool)
prune_baseline (bool)
cache_pending (bool)
max_iep (int)
incremental_nehvi (bool)
cache_root (bool | None)
ref_point (Tensor | None)
- Return type:
dict[str, Any]
- botorch.acquisition.input_constructors.construct_inputs_qLogNEHVI(model, training_data, objective_thresholds=None, objective=None, X_baseline=None, constraints=None, alpha=None, sampler=None, X_pending=None, eta=0.001, fat=True, mc_samples=128, qmc=True, prune_baseline=True, cache_pending=True, max_iep=0, incremental_nehvi=True, cache_root=None, tau_relu=1e-06, tau_max=0.01, ref_point=None)[source]
Construct kwargs for qLogNoisyExpectedHypervolumeImprovement’s constructor.
- Parameters:
model (Model)
training_data (SupervisedDataset | dict[Hashable, SupervisedDataset])
objective_thresholds (Tensor | None)
objective (MCMultiOutputObjective | None)
X_baseline (Tensor | None)
constraints (list[Callable[[Tensor], Tensor]] | None)
alpha (float | None)
sampler (MCSampler | None)
X_pending (Tensor | None)
eta (float)
fat (bool)
mc_samples (int)
qmc (bool)
prune_baseline (bool)
cache_pending (bool)
max_iep (int)
incremental_nehvi (bool)
cache_root (bool | None)
tau_relu (float)
tau_max (float)
ref_point (Tensor | None)
- Return type:
dict[str, Any]
- botorch.acquisition.input_constructors.construct_inputs_qLogNParEGO(model, training_data, scalarization_weights=None, objective=None, X_pending=None, sampler=None, X_baseline=None, prune_baseline=True, cache_root=None, constraints=None, eta=0.001, fat=True, tau_max=0.01, tau_relu=1e-06)[source]
Construct kwargs for the qLogNParEGO constructor.
- Parameters:
model (Model) – The model to be used in the acquisition function.
training_data (SupervisedDataset | dict[Hashable, SupervisedDataset]) – Dataset(s) used to train the model.
scalarization_weights (Tensor | None) – An m-dim Tensor of weights to be used in the Chebyshev scalarization. If omitted, samples from the unit simplex.
objective (MCMultiOutputObjective | None) – The MultiOutputMCAcquisitionObjective under which the samples are evaluated before applying Chebyshev scalarization. Defaults to IdentityMultiOutputObjective().
X_pending (Tensor | None) – An m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call.
sampler (MCSampler | None) – The sampler used to draw base samples. If omitted, uses the acquisition function’s default sampler.
X_baseline (Tensor | None) – A batch_shape x r x d-dim Tensor of r design points that have already been observed. These points are considered as the potential best design point. If omitted, checks that all training_data have the same input features and takes the first X.
prune_baseline (bool | None) – If True, remove points in X_baseline that are highly unlikely to be the best point. This can significantly improve performance and is generally recommended.
constraints (list[Callable[[Tensor], Tensor]] | None) – A list of constraint callables which map a Tensor of posterior samples of dimension sample_shape x batch-shape x q x m-dim to a sample_shape x batch-shape x q-dim Tensor. The associated constraints are considered satisfied if the output is less than zero.
eta (Tensor | float) – Temperature parameter(s) governing the smoothness of the sigmoid approximation to the constraint indicators. For more details on this parameter, see the docs of compute_smoothed_feasibility_indicator.
fat (bool) – Toggles the use of the fat-tailed non-linearities to smoothly approximate the constraints indicator function.
tau_max (float) – Temperature parameter controlling the sharpness of the smooth approximations to max.
tau_relu (float) – Temperature parameter controlling the sharpness of the smooth approximations to ReLU.
cache_root (bool | None)
- Returns:
A dict mapping kwarg names of the constructor to values.
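Sampling scalarization_weights from the unit simplex and applying a Chebyshev scalarization can be sketched in plain Python. The scalarization below is the textbook weighted form for minimization relative to an ideal point; the library's own version may normalize outcomes or add an augmentation term, so this is an illustration under stated assumptions, and both function names are hypothetical.

```python
import random


def sample_simplex_weights(m: int, rng: random.Random) -> list:
    """Draw m non-negative weights that sum to one (uniform on the simplex)."""
    # Normalized i.i.d. exponential draws are uniformly distributed on the simplex.
    draws = [rng.expovariate(1.0) for _ in range(m)]
    total = sum(draws)
    return [d / total for d in draws]


def chebyshev_scalarize(y, weights, ideal) -> float:
    """Textbook weighted Chebyshev scalarization (smaller is better)."""
    return max(w * abs(y_i - z_i) for w, y_i, z_i in zip(weights, y, ideal))


rng = random.Random(0)
weights = sample_simplex_weights(3, rng)
print(weights, sum(weights))
print(chebyshev_scalarize([1.0, 4.0], [0.5, 0.5], ideal=[0.0, 0.0]))
```

Each draw of the weights emphasizes a different trade-off between objectives, which is how a scalarized acquisition function can explore different regions of the Pareto front across iterations.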
- botorch.acquisition.input_constructors.construct_inputs_qMES(model, training_data, bounds, posterior_transform=None, candidate_size=1000, maximize=True)[source]
Construct kwargs for the qMaxValueEntropy constructor.
- Parameters:
model (Model)
training_data (SupervisedDataset | dict[Hashable, SupervisedDataset])
bounds (list[tuple[float, float]])
posterior_transform (PosteriorTransform | None)
candidate_size (int)
maximize (bool)
- Return type:
dict[str, Any]
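The bounds argument here is typed as a list of (lower, upper) pairs, one per input dimension, whereas optimize_objective in the same module documents bounds as a 2 x d tensor. A minimal sketch of the conversion between the two layouts follows, using nested lists instead of Tensors; that the input constructor performs exactly this transposition is an assumption, and bounds_to_rows is a hypothetical helper.

```python
def bounds_to_rows(bounds):
    """Convert d (lower, upper) pairs into a 2 x d layout: [lowers, uppers]."""
    for lower, upper in bounds:
        if lower > upper:
            raise ValueError(f"lower bound {lower} exceeds upper bound {upper}")
    lowers = [lower for lower, _ in bounds]
    uppers = [upper for _, upper in bounds]
    return [lowers, uppers]


# Two input dimensions: x_0 in [0, 1] and x_1 in [-5, 5].
print(bounds_to_rows([(0.0, 1.0), (-5.0, 5.0)]))
```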
- botorch.acquisition.input_constructors.construct_inputs_mf_base(target_fidelities, fidelity_weights=None, cost_intercept=1.0, num_trace_observations=0)[source]
Construct kwargs for a multifidelity acquisition function’s constructor.
- Parameters:
target_fidelities (dict[int, int | float])
fidelity_weights (dict[int, float] | None)
cost_intercept (float)
num_trace_observations (int)
- Return type:
dict[str, Any]
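The parameters above are listed without descriptions, so the following sketch rests on assumptions: target_fidelities is taken to map a feature index to the fidelity value at which the final recommendation is evaluated, and fidelity_weights together with cost_intercept are taken to define an affine evaluation cost. Both helper names are hypothetical.

```python
def project_to_target(x, target_fidelities):
    """Return a copy of x with each fidelity feature set to its target value."""
    projected = list(x)
    for index, value in target_fidelities.items():
        projected[index] = value
    return projected


def affine_cost(x, fidelity_weights, cost_intercept=1.0):
    """Affine cost: a fixed intercept plus a weighted sum of fidelity features."""
    return cost_intercept + sum(w * x[i] for i, w in fidelity_weights.items())


# Feature 2 is the fidelity parameter; the target fidelity is 1.0.
x = [0.3, 0.7, 0.5]
print(project_to_target(x, {2: 1.0}))
print(affine_cost(x, {2: 2.0}, cost_intercept=1.0))
```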
- botorch.acquisition.input_constructors.construct_inputs_qKG(model, training_data, bounds, objective=None, posterior_transform=None, num_fantasies=64, with_current_value=False, **optimize_objective_kwargs)[source]
Construct kwargs for the qKnowledgeGradient constructor.
- Parameters:
model (Model)
training_data (SupervisedDataset | dict[Hashable, SupervisedDataset])
bounds (list[tuple[float, float]])
objective (MCAcquisitionObjective | None)
posterior_transform (PosteriorTransform | None)
num_fantasies (int)
with_current_value (bool)
optimize_objective_kwargs (None | MCAcquisitionObjective | PosteriorTransform | tuple[Tensor, Tensor] | dict[int, float] | bool | int | dict[str, Any] | Callable[[Tensor], Tensor] | Tensor)
- Return type:
dict[str, Any]
- botorch.acquisition.input_constructors.construct_inputs_qHVKG(model, training_data, bounds, objective_thresholds=None, objective=None, posterior_transform=None, num_fantasies=8, num_pareto=10, ref_point=None, **optimize_objective_kwargs)[source]
Construct kwargs for the qHypervolumeKnowledgeGradient constructor.
- Parameters:
model (Model)
training_data (SupervisedDataset | dict[Hashable, SupervisedDataset])
bounds (list[tuple[float, float]])
objective_thresholds (Tensor | None)
objective (MCMultiOutputObjective | None)
posterior_transform (PosteriorTransform | None)
num_fantasies (int)
num_pareto (int)
ref_point (Tensor | None)
optimize_objective_kwargs (None | MCAcquisitionObjective | PosteriorTransform | tuple[Tensor, Tensor] | dict[int, float] | bool | int | dict[str, Any] | Callable[[Tensor], Tensor] | Tensor)
- Return type:
dict[str, Any]
- botorch.acquisition.input_constructors.construct_inputs_qMFKG(model, training_data, bounds, target_fidelities, objective=None, posterior_transform=None, fidelity_weights=None, cost_intercept=1.0, num_trace_observations=0, num_fantasies=64, **optimize_objective_kwargs)[source]
Construct kwargs for the qMultiFidelityKnowledgeGradient constructor.
- Parameters:
model (Model)
training_data (SupervisedDataset | dict[Hashable, SupervisedDataset])
bounds (list[tuple[float, float]])
target_fidelities (dict[int, int | float])
objective (MCAcquisitionObjective | None)
posterior_transform (PosteriorTransform | None)
fidelity_weights (dict[int, float] | None)
cost_intercept (float)
num_trace_observations (int)
num_fantasies (int)
optimize_objective_kwargs (None | MCAcquisitionObjective | PosteriorTransform | tuple[Tensor, Tensor] | dict[int, float] | bool | int | dict[str, Any] | Callable[[Tensor], Tensor] | Tensor)
- Return type:
dict[str, Any]
- botorch.acquisition.input_constructors.construct_inputs_qMFHVKG(model, training_data, bounds, target_fidelities, objective_thresholds=None, objective=None, posterior_transform=None, fidelity_weights=None, cost_intercept=1.0, num_trace_observations=0, num_fantasies=8, num_pareto=10, ref_point=None, **optimize_objective_kwargs)[source]
Construct kwargs for the qMultiFidelityHypervolumeKnowledgeGradient constructor.
- Parameters:
model (Model)
training_data (SupervisedDataset | dict[Hashable, SupervisedDataset])
bounds (list[tuple[float, float]])
target_fidelities (dict[int, int | float])
objective_thresholds (Tensor | None)
objective (MCMultiOutputObjective | None)
posterior_transform (PosteriorTransform | None)
fidelity_weights (dict[int, float] | None)
cost_intercept (float)
num_trace_observations (int)
num_fantasies (int)
num_pareto (int)
ref_point (Tensor | None)
optimize_objective_kwargs (None | MCAcquisitionObjective | PosteriorTransform | tuple[Tensor, Tensor] | dict[int, float] | bool | int | dict[str, Any] | Callable[[Tensor], Tensor] | Tensor)
- Return type:
dict[str, Any]
- botorch.acquisition.input_constructors.construct_inputs_qMFMES(model, training_data, bounds, target_fidelities, num_fantasies=64, fidelity_weights=None, cost_intercept=1.0, num_trace_observations=0, candidate_size=1000, maximize=True)[source]
Construct kwargs for the qMultiFidelityMaxValueEntropy constructor.
- Parameters:
model (Model)
training_data (SupervisedDataset | dict[Hashable, SupervisedDataset])
bounds (list[tuple[float, float]])
target_fidelities (dict[int, int | float])
num_fantasies (int)
fidelity_weights (dict[int, float] | None)
cost_intercept (float)
num_trace_observations (int)
candidate_size (int)
maximize (bool)
- Return type:
dict[str, Any]
- botorch.acquisition.input_constructors.construct_inputs_analytic_eubo(model, pref_model=None, previous_winner=None, sample_multiplier=1.0, objective=None, posterior_transform=None)[source]
Construct kwargs for the AnalyticExpectedUtilityOfBestOption constructor.
model is the primary model defined over the parameter space. It can be the outcome model in BOPE or the preference model in PBO. pref_model is the model defined over the outcome/metric space, which is typically the preference model in BOPE.
If both model and pref_model exist, we are performing Bayesian Optimization with Preference Exploration (BOPE). When only pref_model is None, we are performing preferential BO (PBO).
- Parameters:
model (Model) – The outcome model to be used in the acquisition function in BOPE when pref_model exists; otherwise, model is the preference model and we are doing preferential BO.
pref_model (Model | None) – The preference model to be used in preference exploration as in BOPE; if None, we are doing PBO and model is the preference model.
previous_winner (Tensor | None) – The previous winner of the best option.
sample_multiplier (float | None) – The scale factor for the single-sample model.
objective (LearnedObjective | None) – Ignored. This argument is allowed to be passed then ignored because of the way that EUBO is typically used in a BOPE loop.
posterior_transform (PosteriorTransform | None) – Ignored. This argument is allowed to be passed then ignored because of the way that EUBO is typically used in a BOPE loop.
- Returns:
A dict mapping kwarg names of the constructor to values.
- Return type:
dict[str, Any]
- botorch.acquisition.input_constructors.construct_inputs_qeubo(model, pref_model=None, sample_multiplier=1.0, sampler=None, objective=None, posterior_transform=None, X_pending=None)[source]
Construct kwargs for the qExpectedUtilityOfBestOption (qEUBO) constructor.
model is the primary model defined over the parameter space. It can be the outcome model in BOPE or the preference model in PBO. pref_model is the model defined over the outcome/metric space, which is typically the preference model in BOPE.
If both model and pref_model exist, we are performing Bayesian Optimization with Preference Exploration (BOPE). When only pref_model is None, we are performing preferential BO (PBO).
- Parameters:
model (Model) – The outcome model to be used in the acquisition function in BOPE when pref_model exists; otherwise, model is the preference model and we are doing preferential BO.
pref_model (Model | None) – The preference model to be used in preference exploration as in BOPE; if None, we are doing PBO and model is the preference model.
sample_multiplier (float | None) – The scale factor for the single-sample model.
sampler (MCSampler | None)
objective (MCAcquisitionObjective | None)
posterior_transform (PosteriorTransform | None)
X_pending (Tensor | None)
- Returns:
A dict mapping kwarg names of the constructor to values.
- Return type:
dict[str, Any]
- botorch.acquisition.input_constructors.get_best_f_analytic(training_data, posterior_transform=None)[source]
- Parameters:
training_data (SupervisedDataset | dict[Hashable, SupervisedDataset])
posterior_transform (PosteriorTransform | None)
- Return type:
Tensor
- botorch.acquisition.input_constructors.get_best_f_mc(training_data, objective=None, posterior_transform=None, constraints=None, model=None)[source]
Computes the maximum value of the objective over the training data.
- Parameters:
training_data (SupervisedDataset | dict[Hashable, SupervisedDataset]) – Has fields Y, which is evaluated by objective, and X, which is used as X_baseline. Y is of shape batch_shape x q x m.
objective (MCAcquisitionObjective | None) – The objective under which to evaluate the training data. If omitted, uses IdentityMCObjective.
posterior_transform (PosteriorTransform | None) – An optional PosteriorTransform to apply to Y before computing the objective.
constraints (list[Callable[[Tensor], Tensor]] | None) – For assessing feasibility.
model (Model | None) – Used by compute_best_feasible_objective when there are no feasible observations.
- Returns:
A Tensor of shape batch_shape.
- Return type:
Tensor
- botorch.acquisition.input_constructors.optimize_objective(model, bounds, q, acq_function=None, objective=None, posterior_transform=None, linear_constraints=None, fixed_features=None, qmc=True, mc_samples=512, seed_inner=None, optimizer_options=None, post_processing_func=None, batch_initial_conditions=None, sequential=False)[source]
Optimize an objective under the given model.
- Parameters:
model (Model) – The model to be used in the objective.
bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of X.
q (int) – The cardinality of input sets on which the objective is to be evaluated.
objective (MCAcquisitionObjective | None) – The objective to optimize.
posterior_transform (PosteriorTransform | None) – The posterior transform to be used in the acquisition function.
linear_constraints (tuple[Tensor, Tensor] | None) – A tuple of (A, b). Given k linear constraints on a d-dimensional space, A is k x d and b is k x 1 such that A x <= b. (Not used by single task models).
fixed_features (dict[int, float] | None) – A dictionary of feature assignments {feature_index: value} to hold fixed during generation.
qmc (bool) – Toggle for enabling (qmc=1) or disabling (qmc=0) use of Quasi Monte Carlo.
mc_samples (int) – Integer number of samples used to estimate Monte Carlo objectives.
seed_inner (int | None) – Integer seed used to initialize the sampler passed to MCObjective.
optimizer_options (dict[str, Any] | None) – Table used to look up keyword arguments for the optimizer.
post_processing_func (Callable[[Tensor], Tensor] | None) – A function that post-processes an optimization result appropriately (i.e. according to round-trip transformations).
batch_initial_conditions (Tensor | None) – A Tensor of initial values for the optimizer.
sequential (bool) – If False, uses joint optimization, otherwise uses sequential optimization.
acq_function (AcquisitionFunction | None)
- Returns:
A tuple containing the best input locations and corresponding objective values.
- Return type:
tuple[Tensor, Tensor]
- botorch.acquisition.input_constructors.construct_inputs_qJES(model, bounds, num_optima=64, condition_noiseless=True, posterior_transform=None, X_pending=None, estimation_type='LB', num_samples=64)[source]
- Parameters:
model (Model)
bounds (list[tuple[float, float]])
num_optima (int)
condition_noiseless (bool)
posterior_transform (ScalarizedPosteriorTransform | None)
X_pending (Tensor | None)
estimation_type (str)
num_samples (int)
- botorch.acquisition.input_constructors.construct_inputs_BALD(model, X_pending=None, sampler=None, posterior_transform=None)[source]
- Parameters:
model (Model)
X_pending (Tensor | None)
sampler (MCSampler | None)
posterior_transform (PosteriorTransform | None)
- botorch.acquisition.input_constructors.construct_inputs_NIPV(model, bounds, num_mc_points=128, X_pending=None, posterior_transform=None)[source]
Construct inputs for qNegIntegratedPosteriorVariance.
- Parameters:
model (Model)
bounds (list[tuple[float, float]])
num_mc_points (int)
X_pending (Tensor | None)
posterior_transform (PosteriorTransform | None)
- Return type:
dict[str, Any]
Penalized Acquisition Function Wrapper
Modules to add regularization to acquisition functions.
- class botorch.acquisition.penalized.L2Penalty(init_point)[source]
Bases: Module
L2 penalty class to be added to any arbitrary acquisition function to construct a PenalizedAcquisitionFunction.
Initializing L2 regularization.
- Parameters:
init_point (Tensor) – The “1 x dim” reference point against which we want to regularize.
- class botorch.acquisition.penalized.L1Penalty(init_point)[source]
Bases: Module
L1 penalty class to be added to any arbitrary acquisition function to construct a PenalizedAcquisitionFunction.
Initializing L1 regularization.
- Parameters:
init_point (Tensor) – The “1 x dim” reference point against which we want to regularize.
- class botorch.acquisition.penalized.GaussianPenalty(init_point, sigma)[source]
Bases: Module
Gaussian penalty class to be added to any arbitrary acquisition function to construct a PenalizedAcquisitionFunction.
Initializing Gaussian regularization.
- Parameters:
init_point (Tensor) – The “1 x dim” reference point against which we want to regularize.
sigma (float) – The parameter used in the Gaussian function.
- class botorch.acquisition.penalized.GroupLassoPenalty(init_point, groups)[source]
Bases: Module
Group lasso penalty class to be added to any arbitrary acquisition function to construct a PenalizedAcquisitionFunction.
Initializing Group-Lasso regularization.
- Parameters:
init_point (Tensor) – The “1 x dim” reference point against which we want to regularize.
groups (list[list[int]]) – Groups of indices used in group lasso.
- botorch.acquisition.penalized.narrow_gaussian(X, a)[source]
- Parameters:
X (Tensor)
a (Tensor)
- Return type:
Tensor
- botorch.acquisition.penalized.nnz_approx(X, target_point, a)[source]
Differentiable relaxation of ||X - target_point||_0
- Parameters:
X (Tensor) – An n x d tensor of inputs.
target_point (Tensor) – A tensor of size n corresponding to the target point.
a (Tensor) – A scalar tensor that controls the differentiable relaxation.
- Return type:
Tensor
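The relaxation behind nnz_approx can be illustrated with a small stdlib-only sketch. It assumes the Gaussian bump has the form exp(-0.5 * (x / a)^2), which is one common choice and is not confirmed by the text above; the helper names are hypothetical and plain Python lists stand in for Tensors.

```python
import math


def narrow_gaussian_sketch(x: float, a: float) -> float:
    # Close to 1 when x is near 0 and close to 0 otherwise; `a` sets the width.
    # Assumed functional form, chosen for illustration.
    return math.exp(-0.5 * (x / a) ** 2)


def nnz_approx_sketch(x: list[float], target_point: list[float], a: float) -> float:
    # Smooth surrogate for the number of coordinates where x differs from
    # target_point: start from the dimension and subtract one bump per match.
    d = len(x)
    matches = sum(narrow_gaussian_sketch(xi - ti, a) for xi, ti in zip(x, target_point))
    return d - matches


# Two of the four coordinates differ from the target.
approx = nnz_approx_sketch([0.0, 1.0, 0.0, -2.0], [0.0, 0.0, 0.0, 0.0], a=0.01)
```

As a shrinks, the surrogate approaches the exact count of differing coordinates, at the cost of steeper gradients near the target.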
- class botorch.acquisition.penalized.L0Approximation(target_point, a=1.0, **tkwargs)[source]
Bases: Module
Differentiable relaxation of the L0 norm using a Gaussian basis function.
Initializing L0 penalty with differentiable relaxation.
- Parameters:
target_point (Tensor) – A tensor corresponding to the target point.
a (float) – A hyperparameter that controls the differentiable relaxation.
tkwargs (Any)
- forward(X)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- Parameters:
X (Tensor)
- Return type:
Tensor
- class botorch.acquisition.penalized.L0PenaltyApprox(target_point, a=1.0, **tkwargs)[source]
Bases: L0Approximation
Differentiable relaxation of the L0 norm to be added to any arbitrary acquisition function to construct a PenalizedAcquisitionFunction.
Initializing L0 penalty with differentiable relaxation.
- Parameters:
target_point (Tensor) – A tensor corresponding to the target point.
a (float) – A hyperparameter that controls the differentiable relaxation.
tkwargs (Any)
- class botorch.acquisition.penalized.PenalizedAcquisitionFunction(raw_acqf, penalty_func, regularization_parameter)[source]
Bases: AcquisitionFunction
Single-outcome acquisition function regularized by the given penalty.
- The usage is similar to:
raw_acqf = NoisyExpectedImprovement(…)
penalty = GroupLassoPenalty(…)
acqf = PenalizedAcquisitionFunction(raw_acqf, penalty, regularization_parameter)
Initializing penalized acquisition function.
- Parameters:
raw_acqf (AcquisitionFunction) – The raw acquisition function that is going to be regularized.
penalty_func (torch.nn.Module) – The regularization function.
regularization_parameter (float) – Regularization parameter used in optimization.
- forward(X)[source]
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Tensor) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.
- Returns:
A (b)-dim Tensor of acquisition function values at the given design points X.
- Return type:
Tensor
- property X_pending: Tensor | None
- botorch.acquisition.penalized.group_lasso_regularizer(X, groups)[source]
Computes the group lasso regularization function for the given point.
- Parameters:
X (Tensor) – A b x d tensor representing the points to evaluate the regularization at.
groups (list[list[int]]) – List of indices of different groups.
- Returns:
Computed group lasso norm at the given points.
- Return type:
Tensor
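For a single point, a group lasso norm can be sketched as below. The square-root-of-group-size weighting is the common textbook convention and is an assumption here, as is the helper name; the sketch uses plain lists and is not the library's implementation.

```python
import math


def group_lasso_sketch(x: list[float], groups: list[list[int]]) -> float:
    # Sum over groups of sqrt(group size) times the L2 norm of that group's entries.
    total = 0.0
    for group in groups:
        l2_norm = math.sqrt(sum(x[i] ** 2 for i in group))
        total += math.sqrt(len(group)) * l2_norm
    return total


# Only the first group is active, so the second contributes nothing.
value = group_lasso_sketch([3.0, 4.0, 0.0, 0.0], groups=[[0, 1], [2, 3]])
```

Because each group enters through an unsquared L2 norm, the penalty tends to switch whole groups off together instead of individual coordinates.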
- class botorch.acquisition.penalized.L1PenaltyObjective(init_point)[source]
Bases: Module
L1 penalty objective class. An instance of this class can be added to any arbitrary objective to construct a PenalizedMCObjective.
Initializing L1 penalty objective.
- Parameters:
init_point (Tensor) – The “1 x dim” reference point against which we want to regularize.
- class botorch.acquisition.penalized.PenalizedMCObjective(objective, penalty_objective, regularization_parameter, expand_dim=None)[source]
Bases: GenericMCObjective
Penalized MC objective.
Allows constructing a penalized MC objective by adding a penalty term to the original objective.
mc_acq(X) = objective(X) + penalty_objective(X)
Note: PenalizedMCObjective allows adding a penalty at the MCObjective level, as opposed to the AcquisitionFunction level in PenalizedAcquisitionFunction.
Example
>>> regularization_parameter = 0.01
>>> init_point = torch.zeros(3)  # assume data dim is 3
>>> objective = lambda Y, X: torch.sqrt(Y).sum(dim=-1)
>>> l1_penalty_objective = L1PenaltyObjective(init_point=init_point)
>>> l1_penalized_objective = PenalizedMCObjective(
...     objective, l1_penalty_objective, regularization_parameter
... )
>>> samples = sampler(posterior)
Penalized MC objective.
- Parameters:
objective (Callable[[Tensor, Tensor | None], Tensor]) – A callable f(samples, X) mapping a sample_shape x batch-shape x q x m-dim Tensor samples and an optional batch-shape x q x d-dim Tensor X to a sample_shape x batch-shape x q-dim Tensor of objective values.
penalty_objective (torch.nn.Module) – A torch.nn.Module f(X) that takes in a batch-shape x q x d-dim Tensor X and outputs a 1 x batch-shape x q-dim Tensor of penalty objective values.
regularization_parameter (float) – weight of the penalty (regularization) term
expand_dim (int | None) – dim to expand penalty_objective to match with objective when fully bayesian model is used. If None, no expansion is performed.
- forward(samples, X=None)[source]
Evaluate the penalized objective on the samples.
- Parameters:
samples (Tensor) – A sample_shape x batch_shape x q x m-dim Tensor of samples from a model posterior.
X (Tensor | None) – A batch_shape x q x d-dim tensor of inputs. Relevant only if the objective depends on the inputs explicitly.
- Returns:
A sample_shape x batch_shape x q-dim Tensor of objective values with penalty added for each point.
- Return type:
Tensor
- class botorch.acquisition.penalized.L0PenaltyApproxObjective(target_point, a=1.0, **tkwargs)[source]
Bases: L0Approximation
Differentiable relaxation of the L0 norm penalty objective class. An instance of this class can be added to any arbitrary objective to construct a PenalizedMCObjective.
Initializing L0 penalty with differentiable relaxation.
- Parameters:
target_point (Tensor) – A tensor corresponding to the target point.
a (float) – A hyperparameter that controls the differentiable relaxation.
tkwargs (Any)
Prior-Guided Acquisition Function Wrapper
Prior-Guided Acquisition Functions
References
- class botorch.acquisition.prior_guided.PriorGuidedAcquisitionFunction(acq_function, prior_module, log=False, prior_exponent=1.0, X_pending=None)[source]
Bases: AcquisitionFunction
Class for weighting acquisition functions by a prior distribution.
Supports MC and batch acquisition functions via SampleReducingAcquisitionFunction.
See [Hvarfner2022] for details.
Initialize the prior-guided acquisition function.
- Parameters:
acq_function (AcquisitionFunction) – The base acquisition function.
prior_module (Module) – A Module that computes the probability (or log probability) for the provided inputs. prior_module.forward should take a batch_shape x q-dim tensor of inputs and return a batch_shape x q-dim tensor of probabilities.
log (bool) – A boolean that should be true if the acquisition function emits a log-transformed value and the prior module emits a log probability.
prior_exponent (float) – The exponent applied to the prior. This can be used, for example, to decay the effect of the prior over time as in [Hvarfner2022].
X_pending (Tensor | None) – n x d Tensor with n d-dim design points that have been submitted for evaluation but have not yet been evaluated. Note: X_pending should be provided as an argument to or set on PriorGuidedAcquisitionFunction, but not set on the underlying acquisition function.
- forward(X, *args, **kwargs)
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Any) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.
acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A (b)-dim Tensor of acquisition function values at the given design points X.
- Return type:
Any
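How the log and prior_exponent arguments might interact can be sketched for a single scalar value. This is an assumption-laden illustration: the function name is hypothetical, and it takes "weighting by a prior" to mean multiplying by the prior raised to prior_exponent (or adding the scaled log prior when log is true).

```python
import math


def prior_guided_value(acq_value, prior_value, log=False, prior_exponent=1.0):
    if log:
        # Both inputs are in log space, so the weighting becomes additive.
        return acq_value + prior_exponent * prior_value
    # Plain space: multiply by the prior raised to the exponent.
    return acq_value * prior_value ** prior_exponent


plain = prior_guided_value(2.0, 0.25, prior_exponent=0.5)
logged = prior_guided_value(
    math.log(2.0), math.log(0.25), log=True, prior_exponent=0.5
)
```

Under these assumptions the two modes agree up to a logarithm, and shrinking prior_exponent toward zero fades the prior out.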
Proximal Acquisition Function Wrapper
A wrapper around AcquisitionFunctions to add proximal weighting of the acquisition function.
- class botorch.acquisition.proximal.ProximalAcquisitionFunction(acq_function, proximal_weights, transformed_weighting=True, beta=None)[source]
Bases: AcquisitionFunction
A wrapper around AcquisitionFunctions to add proximal weighting of the acquisition function. The acquisition function is weighted via a squared exponential centered at the last training point, with varying lengthscales corresponding to proximal_weights. Can only be used with acquisition functions based on single batch models. Acquisition functions must be positive or beta must be specified to apply a SoftPlus transform before proximal weighting.
Small values of proximal_weights correspond to strong biasing towards recently observed points, which smooths optimization with a small potential decrease in convergence rate.
Example
>>> model = SingleTaskGP(train_X, train_Y)
>>> EI = ExpectedImprovement(model, best_f=0.0)
>>> proximal_weights = torch.ones(d)
>>> EI_proximal = ProximalAcquisitionFunction(EI, proximal_weights)
>>> eip = EI_proximal(test_X)
Derived Acquisition Function weighted by proximity to recently observed point.
- Parameters:
acq_function (AcquisitionFunction) – The base acquisition function, operating on input tensors of feature dimension d.
proximal_weights (Tensor) – A d dim tensor used to bias locality along each axis.
transformed_weighting (bool | None) – If True, the proximal weights are applied in the transformed input space given by acq_function.model.input_transform (if available), otherwise proximal weights are applied in real input space.
beta (float | None) – If not None, apply a softplus transform to the base acquisition function, which allows negative base acquisition function values.
- forward(X, *args, **kwargs)
Evaluate the acquisition function on the candidate set X.
- Parameters:
X (Any) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.
acqf (AcquisitionFunction)
args (Any)
kwargs (Any)
- Returns:
A (b)-dim Tensor of acquisition function values at the given design points X.
- Return type:
Any
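The squared-exponential weighting described for ProximalAcquisitionFunction can be sketched for one candidate point as follows. The exact kernel form, the softplus parameterization, and the helper names are assumptions for illustration; plain lists stand in for Tensors.

```python
import math


def softplus(value, beta):
    # Smooth positive transform; larger beta brings it closer to max(value, 0).
    return math.log1p(math.exp(beta * value)) / beta


def proximal_value(acq_value, x, last_x, proximal_weights, beta=None):
    # Squared distance to the last training point, scaled per axis by the weights.
    sq_dist = sum(
        ((xi - li) / wi) ** 2 for xi, li, wi in zip(x, last_x, proximal_weights)
    )
    weight = math.exp(-0.5 * sq_dist)
    base = acq_value if beta is None else softplus(acq_value, beta)
    return base * weight


at_center = proximal_value(1.5, [0.2, 0.4], [0.2, 0.4], [1.0, 1.0])
far_away = proximal_value(1.5, [1.2, 0.4], [0.2, 0.4], [0.5, 1.0])
negative_base = proximal_value(-1.0, [0.2, 0.4], [0.2, 0.4], [1.0, 1.0], beta=1.0)
```

At the last training point the weight is 1, so the base value passes through unchanged; a smaller weight along an axis makes the value fall off faster in that direction.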
Factory Functions for Acquisition Functions
Utilities for acquisition functions.
- botorch.acquisition.factory.get_acquisition_function(acquisition_function_name, model, objective, X_observed, posterior_transform=None, X_pending=None, constraints=None, eta=0.001, mc_samples=512, seed=None, *, tau=0.001, prune_baseline=True, marginalize_dim=None, cache_root=None, beta=None, ref_point=None, Y=None, alpha=0.0)[source]
Convenience function for initializing botorch acquisition functions.
- Parameters:
acquisition_function_name (str) – Name of the acquisition function.
model (Model) – A fitted model.
objective (MCAcquisitionObjective) – A MCAcquisitionObjective.
X_observed (Tensor) – An m1 x d-dim Tensor of m1 design points that have already been observed.
posterior_transform (PosteriorTransform | None) – A PosteriorTransform (optional).
X_pending (Tensor | None) – An m2 x d-dim Tensor of m2 design points whose evaluation is pending.
constraints (list[Callable[[Tensor], Tensor]] | None) – A list of callables, each mapping a Tensor of dimension sample_shape x batch-shape x q x m to a Tensor of dimension sample_shape x batch-shape x q, where negative values imply feasibility. Used for all acquisition functions except qSR and qUCB.
eta (Tensor | float | None) – The temperature parameter for the sigmoid function used for the differentiable approximation of the constraints. In case of a float the same eta is used for every constraint in constraints. In case of a tensor the length of the tensor must match the number of provided constraints. The i-th constraint is then estimated with the i-th eta value. Used for all acquisition functions except qSR and qUCB.
mc_samples (int) – The number of samples to use for (q)MC evaluation of the acquisition function.
seed (int | None) – If provided, perform deterministic optimization (i.e. the function to optimize is fixed and not stochastic).
tau (float)
prune_baseline (bool)
marginalize_dim (int | None)
cache_root (bool | None)
beta (float | None)
ref_point (None | list[float] | Tensor)
Y (Tensor | None)
alpha (float)
- Returns:
The requested acquisition function.
- Return type:
Example
>>> model = SingleTaskGP(train_X, train_Y)
>>> obj = LinearMCObjective(weights=torch.tensor([1.0, 2.0]))
>>> acqf = get_acquisition_function("qEI", model, obj, train_X)
General Utilities for Acquisition Functions
Utilities for acquisition functions.
- botorch.acquisition.utils.repeat_to_match_aug_dim(target_tensor, reference_tensor)[source]
Repeat target_tensor until it has the same first dimension as reference_tensor. This works regardless of the batch shapes and q. This is useful as we sometimes modify sample shapes such as in LearnedObjective.
- Parameters:
target_tensor (Tensor) – A sample_size x batch_shape x q x m-dim Tensor.
reference_tensor (Tensor) – A (augmented_sample * sample_size) x batch_shape x q-dim Tensor. augmented_sample could be 1.
- Returns:
The content of target_tensor potentially repeated so that its first dimension matches that of reference_tensor. The shape will be (augmented_sample * sample_size) x batch_shape x q x m.
- Return type:
Tensor
Examples
>>> import torch
>>> target_tensor = torch.arange(3).repeat(2, 1).T
>>> target_tensor
tensor([[0, 0],
        [1, 1],
        [2, 2]])
>>> repeat_to_match_aug_dim(target_tensor, torch.zeros(6))
tensor([[0, 0],
        [1, 1],
        [2, 2],
        [0, 0],
        [1, 1],
        [2, 2]])
- botorch.acquisition.utils.compute_best_feasible_objective(samples, obj, constraints, model=None, objective=None, posterior_transform=None, X_baseline=None, infeasible_obj=None)[source]
Computes the largest obj value that is feasible under the constraints. If constraints is None, returns the best unconstrained objective value.
When no feasible observations exist and infeasible_obj is not None, returns infeasible_obj (potentially reshaped). When no feasible observations exist and infeasible_obj is None, uses model, objective, posterior_transform, and X_baseline to infer and return an infeasible_obj M s.t. M < min_x f(x).
- Parameters:
samples (Tensor) – (sample_shape) x batch_shape x q x m-dim posterior samples.
obj (Tensor) – A (sample_shape) x batch_shape x q-dim Tensor of MC objective values.
constraints (list[Callable[[Tensor], Tensor]] | None) – A list of constraint callables which map posterior samples to a scalar. The associated constraint is considered satisfied if this scalar is less than zero.
model (Model | None) – A Model, only required when there are no feasible observations.
objective (MCAcquisitionObjective | None) – An MCAcquisitionObjective, only optionally used when there are no feasible observations.
posterior_transform (PosteriorTransform | None) – A PosteriorTransform, only optionally used when there are no feasible observations.
X_baseline (Tensor | None) – A batch_shape x d-dim Tensor of baseline points, only required when there are no feasible observations.
infeasible_obj (Tensor | None) – A Tensor to be returned when no feasible points exist.
- Returns:
A (sample_shape) x batch_shape-dim Tensor of best feasible objectives.
- Return type:
Tensor
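The selection rule described for compute_best_feasible_objective can be sketched over plain lists. The sketch covers only the case where infeasible_obj is supplied and does not attempt the model-based fallback; the function name and the list-based inputs are assumptions for illustration.

```python
def best_feasible_objective_sketch(obj, constraint_values, infeasible_obj):
    # obj[i] is the objective value of point i; constraint_values[i] lists the
    # constraint outputs for that point. A point counts as feasible only when
    # every constraint output is below zero.
    feasible = [
        o for o, cons in zip(obj, constraint_values) if all(c < 0 for c in cons)
    ]
    # Fall back to the supplied value when nothing is feasible.
    return max(feasible) if feasible else infeasible_obj


best = best_feasible_objective_sketch(
    obj=[1.0, 3.0, 2.0],
    constraint_values=[[-0.1], [0.5], [-1.0]],
    infeasible_obj=-10.0,
)
fallback = best_feasible_objective_sketch([1.0, 3.0], [[0.2], [0.5]], -10.0)
```

The point with objective 3.0 violates its constraint, so the best feasible value is 2.0; when every point is infeasible the supplied fallback is returned.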
- botorch.acquisition.utils.get_infeasible_cost(X, model, objective=None, posterior_transform=None)[source]
Get infeasible cost for a model and objective.
For each outcome, computes an infeasible cost M such that -M < min_x f(x) almost always, so that feasible points are preferred.
- Parameters:
X (Tensor) – An n x d Tensor of n design points to use in evaluating the minimum. These points should cover the design space well. The more points the better the estimate, at the expense of added computation.
model (Model) – A fitted botorch model with m outcomes.
objective (Callable[[Tensor, Tensor | None], Tensor] | None) – The objective with which to evaluate the model output.
posterior_transform (PosteriorTransform | None) – A PosteriorTransform (optional).
- Returns:
An m-dim tensor of infeasible cost values.
- Return type:
Tensor
Example
>>> model = SingleTaskGP(train_X, train_Y)
>>> objective = lambda Y, X=None: Y[..., -1] ** 2
>>> M = get_infeasible_cost(train_X, model, objective)
- botorch.acquisition.utils.prune_inferior_points(model, X, objective=None, posterior_transform=None, constraints=None, num_samples=2048, max_frac=1.0, sampler=None, marginalize_dim=None)[source]
Prune points from an input tensor that are unlikely to be the best point.
Given a model, an objective, and an input tensor X, this function returns the subset of points in X that have some probability of being the best point under the objective. This function uses sampling to estimate the probabilities; the higher the number of points n in X, the higher the number of samples num_samples should be to obtain accurate estimates.
- Parameters:
model (Model) – A fitted model. Batched models are currently not supported.
X (Tensor) – An input tensor of shape n x d. Batched inputs are currently not supported.
objective (MCAcquisitionObjective | None) – The objective under which to evaluate the posterior.
posterior_transform (PosteriorTransform | None) – A PosteriorTransform (optional).
constraints (list[Callable[[Tensor], Tensor]] | None) – A list of constraint callables which map a Tensor of posterior samples of dimension sample_shape x batch-shape x q x m to a sample_shape x batch-shape x q-dim Tensor. The associated constraints are satisfied if constraint(samples) < 0.
num_samples (int) – The number of samples used to compute empirical probabilities of being the best point.
max_frac (float) – The maximum fraction of points to retain. Must satisfy 0 < max_frac <= 1. Ensures that the number of elements in the returned tensor does not exceed ceil(max_frac * n).
sampler (MCSampler | None) – If provided, will use this customized sampler instead of automatically constructing one with num_samples.
marginalize_dim (int | None) – A batch dimension that should be marginalized. For example, this is useful when using a batched fully Bayesian model.
- Returns:
An n' x d Tensor with the subset of points in X, where
n' = min(N_nz, ceil(max_frac * n))
with N_nz the number of points in X that have non-zero (empirical, under num_samples samples) probability of being the best point.
- Return type:
Tensor
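The counting logic behind the returned size n' can be sketched as follows, assuming the posterior objective samples have already been drawn (one row per sample, one column per point). The function name is hypothetical, and breaking the cap by win count is an assumption about how the retained points are chosen.

```python
import math
from collections import Counter


def prune_by_best_counts(obj_samples, max_frac=1.0):
    # obj_samples[s][i] is the sampled objective of point i under sample s.
    n = len(obj_samples[0])
    wins = Counter()
    for row in obj_samples:
        # Record which point is best under this sample.
        wins[max(range(n), key=lambda i: row[i])] += 1
    # Keep points that won at least once, capped at ceil(max_frac * n).
    max_points = math.ceil(max_frac * n)
    kept = [idx for idx, _ in wins.most_common(max_points)]
    return sorted(kept)


samples = [
    [0.1, 0.9, 0.3],
    [0.2, 0.8, 0.95],
    [0.0, 0.7, 0.1],
]
kept_all = prune_by_best_counts(samples)
kept_one = prune_by_best_counts(samples, max_frac=0.3)
```

Point 0 is never the best under any sample, so it is pruned; with max_frac=0.3 the cap is a single point and only the most frequent winner remains.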
- botorch.acquisition.utils.project_to_target_fidelity(X, target_fidelities, d)[source]
Project X onto the target set of fidelities.
This function assumes that the set of feasible fidelities is a box, so projecting here just means setting each fidelity parameter to its target value. If X does not contain the fidelity dimensions, this will insert them and set them to their target values.
- Parameters:
X (Tensor) – A batch_shape x q x (d or d-d_f)-dim Tensor with q d or d-d_f-dim design points for each t-batch, where d_f is the number of fidelity dimensions. X may have size d (fidelity dims included) or d - d_f (fidelity dims will be inserted at the appropriate positions).
target_fidelities (dict[int, float]) – A dictionary mapping a subset of columns of X (the fidelity parameters) to their respective target fidelity value. Supports both positive and negative indexing.
d (int) – The total number of dimensions including fidelity dimensions.
- Returns:
A batch_shape x q x d-dim Tensor X_proj with fidelity parameters projected to the provided fidelity values.
- Return type:
Tensor
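Both cases described for project_to_target_fidelity (overwriting existing fidelity columns and inserting missing ones) can be sketched for a single design point. The helper name and the list-based representation are assumptions for illustration.

```python
def project_point_sketch(x, target_fidelities, d):
    # Normalise negative column indices so that -1 refers to column d - 1.
    tfs = {idx % d: value for idx, value in target_fidelities.items()}
    if len(x) == d:
        # Fidelity columns are present: overwrite them with the target values.
        return [tfs.get(i, xi) for i, xi in enumerate(x)]
    if len(x) != d - len(tfs):
        raise ValueError("x must have d or d - d_f entries")
    # Fidelity columns are missing: insert targets at their column positions.
    remaining = iter(x)
    return [tfs[i] if i in tfs else next(remaining) for i in range(d)]


overwritten = project_point_sketch([0.3, 0.7, 0.2], {-1: 1.0}, d=3)
inserted = project_point_sketch([0.3, 0.7], {1: 1.0}, d=3)
```

In the second call the point has no fidelity column, so the target value is placed at column 1 and the remaining entries keep their relative order.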
- botorch.acquisition.utils.expand_trace_observations(X, fidelity_dims=None, num_trace_obs=0)[source]
Expand X with trace observations.
Expand a tensor of inputs with “trace observations” that are obtained during the evaluation of the candidate set. This is used in multi-fidelity optimization. It can be thought of as augmenting the q-batch with additional points that are the expected trace observations.
Let f_i be the i-th fidelity parameter. Then this function assumes that for each element of the q-batch, besides the fidelity f_i, we will observe additional fidelities f_i1, ..., f_iK, where K = num_trace_obs, during evaluation of the candidate set X. Specifically, this function assumes that f_ij = (K-j) / (num_trace_obs + 1) * f_i for all i. That is, the expansion is performed in parallel for all fidelities (it does not expand out all possible combinations).
- Parameters:
X (Tensor) – A batch_shape x q x d-dim Tensor with q d-dim design points (incl. the fidelity parameters) for each t-batch.
fidelity_dims (list[int] | None) – The indices of the fidelity parameters. If omitted, assumes that the last column of X contains the fidelity parameters.
num_trace_obs (int) – The number of trace observations to use.
- Returns:
A batch_shape x (q + num_trace_obs x q) x d Tensor X_expanded that expands X with trace observations.
- Return type:
Tensor
- botorch.acquisition.utils.project_to_sample_points(X, sample_points)[source]
Augment X with sample points at which to take a weighted average.
- Parameters:
X (Tensor) – A batch_shape x 1 x d-dim Tensor with one d-dim design point for each t-batch.
sample_points (Tensor) – p x d'-dim Tensor (d' < d) of d'-dim sample points at which to compute the expectation. The d'-dims refer to the trailing columns of X.
- Returns:
A batch_shape x p x d Tensor where the q-batch includes the p sample points.
- Return type:
Tensor
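For one design point, the shapes described above can be illustrated with a short sketch that replaces the trailing d' columns with each sample point in turn. The helper name and the list-based inputs are assumptions for illustration.

```python
def project_to_sample_points_sketch(x, sample_points):
    # x is a single d-dim design point; sample_points holds p points of dim d'.
    d_prime = len(sample_points[0])
    # Keep the leading d - d' columns and swap in each sample point as the tail.
    head = x[: len(x) - d_prime]
    return [head + list(point) for point in sample_points]


expanded = project_to_sample_points_sketch(
    [0.1, 0.2, 0.9], sample_points=[[0.0], [0.5], [1.0]]
)
```

The result has p = 3 rows of dimension d = 3: the leading columns are shared and only the trailing column varies across the sample points.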
- botorch.acquisition.utils.get_optimal_samples(model, bounds, num_optima, raw_samples=1024, num_restarts=20, posterior_transform=None, objective=None, return_transformed=False)[source]
Draws sample paths from the posterior and maximizes the samples using GD.
- Parameters:
model (GP) – The model from which samples are drawn.
bounds (Tensor) – Bounds of the search space. If the model inputs are normalized, the bounds should be normalized as well.
num_optima (int) – The number of paths to be drawn and optimized.
raw_samples (int) – The number of candidates to randomly sample. Defaults to 1024.
num_restarts (int) – The number of candidates to do gradient-based optimization on. Defaults to 20.
posterior_transform (ScalarizedPosteriorTransform | None) – A ScalarizedPosteriorTransform (may e.g. be used to scalarize multi-output models or negate the objective).
objective (MCAcquisitionObjective | None) – An MCAcquisitionObjective, used to negate the objective or otherwise transform sample outputs. Cannot be combined with posterior_transform.
return_transformed (bool) – If True, return the transformed samples.
- Returns:
The optimal input locations and corresponding outputs, x* and f*.
- Return type:
tuple[Tensor, Tensor]
Multi-Objective Utilities for Acquisition Functions
Utilities for multi-objective acquisition functions.
- botorch.acquisition.multi_objective.utils.get_default_partitioning_alpha(num_objectives)[source]
Determines an approximation level based on the number of objectives.
If alpha is 0, FastNondominatedPartitioning should be used. Otherwise, an approximate NondominatedPartitioning should be used with approximation level alpha.
- Parameters:
num_objectives (int) – the number of objectives.
- Returns:
The approximation level alpha.
- Return type:
float
- botorch.acquisition.multi_objective.utils.prune_inferior_points_multi_objective(model, X, ref_point, objective=None, constraints=None, num_samples=2048, max_frac=1.0, marginalize_dim=None)[source]
Prune points from an input tensor that are unlikely to be pareto optimal.
Given a model, an objective, and an input tensor X, this function returns the subset of points in X that have some probability of being pareto optimal, better than the reference point, and feasible. This function uses sampling to estimate the probabilities; the higher the number of points n in X, the higher the number of samples num_samples should be to obtain accurate estimates.
- Parameters:
model (Model) – A fitted model. Batched models are currently not supported.
X (Tensor) – An input tensor of shape n x d. Batched inputs are currently not supported.
ref_point (Tensor) – The reference point.
objective (MCMultiOutputObjective | None) – The objective under which to evaluate the posterior.
constraints (list[Callable[[Tensor], Tensor]] | None) – A list of callables, each mapping a Tensor of dimension sample_shape x batch-shape x q x m to a Tensor of dimension sample_shape x batch-shape x q, where negative values imply feasibility.
num_samples (int) – The number of samples used to compute empirical probabilities of being the best point.
max_frac (float) – The maximum fraction of points to retain. Must satisfy 0 < max_frac <= 1. Ensures that the number of elements in the returned tensor does not exceed ceil(max_frac * n).
marginalize_dim (int | None) – A batch dimension that should be marginalized. For example, this is useful when using a batched fully Bayesian model.
- Returns:
An n' x d Tensor with the subset of points in X, where
n' = min(N_nz, ceil(max_frac * n))
with N_nz the number of points in X that have non-zero (empirical, under num_samples samples) probability of being pareto optimal.
- Return type:
Tensor
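The per-sample check implied above (Pareto optimal and better than the reference point) can be sketched for a single set of objective values under maximization. Constraints and the sampling loop are left out, and the helper names are hypothetical.

```python
def dominates(a, b):
    # Maximization: a is at least as good everywhere and strictly better somewhere.
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_candidates(Y, ref_point):
    # First discard points that do not beat the reference point in every objective.
    above_ref = [
        i for i, y in enumerate(Y) if all(v > r for v, r in zip(y, ref_point))
    ]
    # Then keep the points that no other remaining point dominates.
    return [
        i for i in above_ref if not any(dominates(Y[j], Y[i]) for j in above_ref)
    ]


Y = [[1.0, 4.0], [2.0, 2.0], [1.5, 1.5], [3.0, -1.0]]
kept = pareto_candidates(Y, ref_point=[0.0, 0.0])
```

Point 3 falls below the reference point in the second objective and point 2 is dominated by point 1, so only points 0 and 1 survive for this sample.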
- botorch.acquisition.multi_objective.utils.compute_sample_box_decomposition(pareto_fronts, partitioning=<class 'botorch.utils.multi_objective.box_decompositions.dominated.DominatedPartitioning'>, maximize=True, num_constraints=0)[source]
Computes the box decomposition associated with some sampled optimal objectives. This also supports the single-objective and constrained optimization setting. An objective y is feasible if y <= 0.
To take advantage of batch computations, we pad the hypercell bounds with a 2 x (M + K)-dim Tensor of zeros [0, 0].
- Parameters:
pareto_fronts (Tensor) – A num_pareto_samples x num_pareto_points x M dim Tensor containing the sampled optimal set of objectives.
partitioning (type[BoxDecomposition]) – A BoxDecomposition module that is used to obtain the hyper-rectangle bounds for integration. In the unconstrained case, this gives the partition of the dominated space. In the constrained case, this gives the partition of the feasible dominated space union the infeasible space.
maximize (bool) – If true, the box-decomposition is computed assuming maximization.
num_constraints (int) – The number of constraints K.
- Returns:
A num_pareto_samples x 2 x J x (M + K)-dim Tensor containing the bounds for the hyper-rectangles. The number J is the smallest number of boxes needed to partition all the Pareto samples.
- Return type:
Tensor
- botorch.acquisition.multi_objective.utils.random_search_optimizer(model, bounds, num_points, maximize, pop_size=1024, max_tries=10)[source]
Optimize a function via random search.
- Parameters:
model (GenericDeterministicModel) – The model.
bounds (Tensor) – A 2 x d-dim Tensor containing the input bounds.
num_points (int) – The number of optimal points to be outputted.
maximize (bool) – If true, we consider a maximization problem.
pop_size (int) – The number of function evaluations per try.
max_tries (int) – The maximum number of tries.
- Returns:
A two-element tuple containing
- A num_points x d-dim Tensor containing the collection of optimal inputs.
- A num_points x M-dim Tensor containing the collection of optimal objectives.
- Return type:
tuple[Tensor, Tensor]
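As a loose illustration of random search, the sketch below handles only a single-output function and a single try: it samples pop_size points uniformly within the bounds and keeps the num_points best. This is a deliberate simplification; it ignores max_tries and any multi-objective handling, and the function names and the seed argument are assumptions.

```python
import random


def random_search_sketch(f, bounds, num_points, maximize=True, pop_size=1024, seed=0):
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    lower, upper = bounds
    # Draw pop_size points uniformly inside the box defined by the bounds.
    population = [
        [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        for _ in range(pop_size)
    ]
    # Rank by function value and keep the best num_points.
    ranked = sorted(population, key=f, reverse=maximize)
    best_x = ranked[:num_points]
    return best_x, [f(x) for x in best_x]


def neg_sq_dist(x):
    # Toy objective with its maximum at the center of the unit square.
    return -sum((xi - 0.5) ** 2 for xi in x)


xs, ys = random_search_sketch(neg_sq_dist, ([0.0, 0.0], [1.0, 1.0]), num_points=3)
```

The returned inputs are ordered from best to worst, mirroring the two-element tuple of inputs and objective values described above.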
- botorch.acquisition.multi_objective.utils.sample_optimal_points(model, bounds, num_samples, num_points, optimizer=<function random_search_optimizer>, maximize=True, optimizer_kwargs=None)[source]
Compute a collection of optimal inputs and outputs from samples of a Gaussian Process (GP).
Steps: (1) The samples are generated using random Fourier features (RFFs). (2) The samples are optimized sequentially using an optimizer.
- TODO: We can generalize the GP sampling step to accommodate other sampling strategies rather than restricting to RFFs, e.g. decoupled sampling.
- TODO: Currently this defaults to random search optimization; we might want to explore some other alternatives.
- Parameters:
model (Model) – The model. This does not support models which include fantasy observations.
bounds (Tensor) – A 2 x d-dim Tensor containing the input bounds.
num_samples (int) – The number of GP samples.
num_points (int) – The number of optimal points to be outputted.
optimizer (Callable[[GenericDeterministicModel, Tensor, int, bool, Any], tuple[Tensor, Tensor]]) – A callable that solves the deterministic optimization problem.
maximize (bool) – If true, we consider a maximization problem.
optimizer_kwargs (dict[str, Any] | None) – The additional arguments for the optimizer.
- Returns:
A two-element tuple containing
- A num_samples x num_points x d-dim Tensor containing the collection of optimal inputs.
- A num_samples x num_points x M-dim Tensor containing the collection of optimal objectives.
- Return type:
tuple[Tensor, Tensor]