botorch.optim
Optimization
Core
Core abstractions and generic optimizers.
- class botorch.optim.core.OptimizationStatus(*values)[source]
Bases: int, Enum
- RUNNING = 1
- SUCCESS = 2
- FAILURE = 3
- STOPPED = 4
- class botorch.optim.core.OptimizationResult(step: 'int', fval: 'float | int', status: 'OptimizationStatus', runtime: 'float | None' = None, message: 'str | None' = None)[source]
Bases: object
- Parameters:
step (int)
fval (float | int)
status (OptimizationStatus)
runtime (float | None)
message (str | None)
- step: int
- fval: float | int
- status: OptimizationStatus
- runtime: float | None
- message: str | None
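The two classes above can be mirrored with the standard library to see how a status and a result fit together. This is a minimal sketch, not an import from botorch: `SketchStatus`, `SketchResult`, and `describe` are hypothetical stand-ins that copy only the members, fields, and defaults listed above.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class SketchStatus(int, Enum):
    # Same members and values as listed for OptimizationStatus above.
    RUNNING = 1
    SUCCESS = 2
    FAILURE = 3
    STOPPED = 4


@dataclass
class SketchResult:
    # Same fields and defaults as listed for OptimizationResult above.
    step: int
    fval: float | int
    status: SketchStatus
    runtime: float | None = None
    message: str | None = None


def describe(result: SketchResult) -> str:
    """Turn a result into a one-line, human-readable summary."""
    text = f"step {result.step}: fval={result.fval:.3f} ({result.status.name})"
    if result.message is not None:
        text += f" - {result.message}"
    return text


result = SketchResult(step=12, fval=0.25, status=SketchStatus.SUCCESS)
summary = describe(result)
```

Because the status is an int-valued Enum, it can be compared against plain integers as well as against its named members.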
- botorch.optim.core.scipy_minimize(closure, parameters, bounds=None, callback=None, x0=None, method='L-BFGS-B', options=None, timeout_sec=None)[source]
Generic scipy.optimize.minimize-based optimization routine.
- Parameters:
closure (Callable[[], tuple[Tensor, Sequence[Tensor | None]]] | NdarrayOptimizationClosure) – Callable that returns a tensor and an iterable of gradient tensors, or an NdarrayOptimizationClosure instance.
parameters (dict[str, Tensor]) – A dictionary of tensors to be optimized.
bounds (dict[str, tuple[float | None, float | None]] | None) – A dictionary mapping parameter names to lower and upper bounds.
callback (Callable[[dict[str, Tensor], OptimizationResult], None] | None) – A callable taking parameters and an OptimizationResult as arguments.
x0 (ndarray[tuple[Any, ...], dtype[_ScalarT]] | None) – An optional initialization vector passed to scipy.optimize.minimize.
method (str) – Solver type, passed along to scipy.optimize.minimize.
options (dict[str, Any] | None) – Dictionary of solver options, passed along to scipy.optimize.minimize.
timeout_sec (float | None) – Timeout in seconds to wait before aborting the optimization loop if not converged (will return the best found solution thus far).
- Returns:
An OptimizationResult summarizing the final state of the run.
- Return type:
OptimizationResult
- botorch.optim.core.torch_minimize(closure, parameters, bounds=None, callback=None, optimizer=<class 'torch.optim.adam.Adam'>, scheduler=None, step_limit=None, timeout_sec=None, stopping_criterion=None)[source]
Generic torch.optim-based optimization routine.
- Parameters:
closure (Callable[[], tuple[Tensor, Sequence[Tensor | None]]]) – Callable that returns a tensor and an iterable of gradient tensors. Responsible for setting relevant parameters’ grad attributes.
parameters (dict[str, Tensor]) – A dictionary of tensors to be optimized.
bounds (dict[str, tuple[float | None, float | None]] | None) – An optional dictionary of bounds for elements of parameters.
callback (Callable[[dict[str, Tensor], OptimizationResult], None] | None) – A callable taking parameters and an OptimizationResult as arguments.
optimizer (Optimizer | Callable[[list[Tensor]], Optimizer]) – A torch.optim.Optimizer instance or a factory that takes a list of parameters and returns an Optimizer instance.
scheduler (LRScheduler | Callable[[Optimizer], LRScheduler] | None) – A torch.optim.lr_scheduler._LRScheduler instance or a factory that takes an Optimizer instance and returns an _LRScheduler instance.
step_limit (int | None) – Integer specifying a maximum number of optimization steps. One of step_limit, stopping_criterion, or timeout_sec must be passed.
timeout_sec (float | None) – Timeout in seconds before terminating the optimization loop. One of step_limit, stopping_criterion, or timeout_sec must be passed.
stopping_criterion (StoppingCriterion | None) – A StoppingCriterion for the optimization loop.
- Returns:
An OptimizationResult summarizing the final state of the run.
- Return type:
OptimizationResult
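The closure, bounds, callback, and stopping arguments described above interact in a simple loop. The following is a rough sketch of that loop under strong simplifications: it uses plain floats and fixed-step gradient descent instead of tensors and a torch.optim optimizer, and `minimize_sketch` is a hypothetical helper, not the botorch routine.

```python
import time


def minimize_sketch(closure, parameters, bounds=None, callback=None,
                    lr=0.1, step_limit=None, timeout_sec=None):
    """Gradient descent over a dict of floats (hypothetical helper)."""
    if step_limit is None and timeout_sec is None:
        raise ValueError("Pass step_limit or timeout_sec.")
    bounds = bounds or {}
    start = time.monotonic()
    fval, grads = closure()
    step = 0
    while True:
        if step_limit is not None and step >= step_limit:
            break
        if timeout_sec is not None and time.monotonic() - start >= timeout_sec:
            break
        for name, grad in zip(list(parameters), grads):
            if grad is None:
                continue  # parameter without a gradient is left untouched
            value = parameters[name] - lr * grad
            lower, upper = bounds.get(name, (None, None))
            if lower is not None:
                value = max(value, lower)  # None means unbounded below
            if upper is not None:
                value = min(value, upper)  # None means unbounded above
            parameters[name] = value
        step += 1
        fval, grads = closure()
        if callback is not None:
            callback(parameters, {"step": step, "fval": fval})
    return {"step": step, "fval": fval, "runtime": time.monotonic() - start}


params = {"a": 3.0, "b": -2.0}


def closure():
    # f(a, b) = (a - 1)^2 + (b + 0.5)^2, returned with its partial derivatives.
    a, b = params["a"], params["b"]
    return (a - 1.0) ** 2 + (b + 0.5) ** 2, [2.0 * (a - 1.0), 2.0 * (b + 0.5)]


history = []
result = minimize_sketch(
    closure,
    params,
    bounds={"a": (1.5, None)},  # lower bound only
    callback=lambda p, r: history.append(r["fval"]),
    step_limit=200,
)
```

In this toy run the lower bound on `a` is active, so `a` ends at the bound rather than at the unconstrained minimizer, while `b` converges freely.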
Acquisition Function Optimization
Methods for optimizing acquisition functions.
- class botorch.optim.optimize.OptimizeAcqfInputs(*, acq_function, bounds, q, num_restarts, raw_samples, options, inequality_constraints, equality_constraints, nonlinear_inequality_constraints, fixed_features, post_processing_func, batch_initial_conditions, return_best_only, gen_candidates, sequential, ic_generator=None, timeout_sec=None, return_acq_values=True, return_full_tree=False, retry_on_optimization_warning=True, ic_gen_kwargs=<factory>, acq_function_sequence=None)[source]
Bases: object
Container for inputs to optimize_acqf.
See the docstring for optimize_acqf for an explanation of the parameters.
- Parameters:
acq_function (AcquisitionFunction | None)
bounds (Tensor)
q (int)
num_restarts (int)
raw_samples (int | None)
options (dict[str, bool | float | int | str] | None)
inequality_constraints (list[tuple[Tensor, Tensor, float]] | None)
equality_constraints (list[tuple[Tensor, Tensor, float]] | None)
nonlinear_inequality_constraints (list[tuple[Callable, bool]] | None)
fixed_features (Mapping[int, float | Tensor] | None)
post_processing_func (Callable[[Tensor], Tensor] | None)
batch_initial_conditions (Tensor | None)
return_best_only (bool)
gen_candidates (Callable[[Tensor, AcquisitionFunction, Any], tuple[Tensor, Tensor]])
sequential (bool)
ic_generator (Callable[[qKnowledgeGradient, Tensor, int, int, int, dict[int, float] | None, dict[str, bool | float | int] | None, list[tuple[Tensor, Tensor, float]] | None, list[tuple[Tensor, Tensor, float]] | None], Tensor | None] | None)
timeout_sec (float | None)
return_acq_values (bool)
return_full_tree (bool)
retry_on_optimization_warning (bool)
ic_gen_kwargs (dict)
acq_function_sequence (list[AcquisitionFunction] | None)
- acq_function: AcquisitionFunction | None
- bounds: Tensor
- q: int
- num_restarts: int
- raw_samples: int | None
- options: dict[str, bool | float | int | str] | None
- inequality_constraints: list[tuple[Tensor, Tensor, float]] | None
- equality_constraints: list[tuple[Tensor, Tensor, float]] | None
- nonlinear_inequality_constraints: list[tuple[Callable, bool]] | None
- fixed_features: Mapping[int, float | Tensor] | None
- post_processing_func: Callable[[Tensor], Tensor] | None
- batch_initial_conditions: Tensor | None
- return_best_only: bool
- gen_candidates: Callable[[Tensor, AcquisitionFunction, Any], tuple[Tensor, Tensor]]
- sequential: bool
- ic_generator: Callable[[qKnowledgeGradient, Tensor, int, int, int, dict[int, float] | None, dict[str, bool | float | int] | None, list[tuple[Tensor, Tensor, float]] | None, list[tuple[Tensor, Tensor, float]] | None], Tensor | None] | None
- timeout_sec: float | None
- return_acq_values: bool
- return_full_tree: bool
- retry_on_optimization_warning: bool
- ic_gen_kwargs: dict
- acq_function_sequence: list[AcquisitionFunction] | None
- property full_tree: bool
- get_ic_generator()[source]
- Return type:
Callable[[qKnowledgeGradient, Tensor, int, int, int, dict[int, float] | None, dict[str, bool | float | int] | None, list[tuple[Tensor, Tensor, float]] | None, list[tuple[Tensor, Tensor, float]] | None], Tensor | None]
- botorch.optim.optimize.optimize_acqf(acq_function, bounds, q, num_restarts, raw_samples=None, options=None, inequality_constraints=None, equality_constraints=None, nonlinear_inequality_constraints=None, fixed_features=None, post_processing_func=None, batch_initial_conditions=None, return_best_only=True, gen_candidates=None, sequential=False, acq_function_sequence=None, *, ic_generator=None, timeout_sec=None, return_acq_values=True, return_full_tree=False, retry_on_optimization_warning=True, **ic_gen_kwargs)[source]
Optimize the acquisition function for a single or multiple joint candidates.
A high-level description (missing exceptions for special setups):
This function optimizes the acquisition function acq_function in two steps:
i) It samples raw_samples random points within the bounds bounds using Sobol sampling and passes on the “best” num_restarts of them. By default, these are selected via gen_batch_initial_conditions (deviating for some acquisition functions, see get_ic_generator), which by default performs Boltzmann sampling on the acquisition function value. (The behavior of step (i) can be further controlled by specifying ic_generator or batch_initial_conditions.)
ii) A batch of the num_restarts points (or joint sets of points) with the highest acquisition values in the previous step is then further optimized. By default this is done with L-BFGS-B if no constraints are present and with SLSQP if constraints are present (other optimizers can be selected via gen_candidates).
While the optimization procedure runs on CPU by default for this function, the acq_function can be implemented on GPU and simply move the inputs to GPU internally.
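The two steps above can be sketched in one dimension with the standard library. This is an illustration under stated assumptions, not botorch's implementation: uniform random draws replace Sobol sampling, starts are drawn with replacement, and a shrinking-step hill climb stands in for L-BFGS-B. The names `hill_climb`, `two_step_maximize`, and `toy_acq` are hypothetical.

```python
import math
import random


def hill_climb(acq, x, lower, upper, step=0.1, iters=60):
    """Stand-in local optimizer: try x - step and x + step, shrink on failure."""
    for _ in range(iters):
        candidates = [x, max(lower, x - step), min(upper, x + step)]
        best = max(candidates, key=acq)  # on ties the current x is kept
        if best == x:
            step *= 0.5
        x = best
    return x


def two_step_maximize(acq, lower, upper, num_restarts, raw_samples, seed=0):
    rng = random.Random(seed)
    # Step (i): draw raw points and score them with the acquisition function.
    raw = [rng.uniform(lower, upper) for _ in range(raw_samples)]
    values = [acq(x) for x in raw]
    # Boltzmann weights: a higher acquisition value gives higher selection odds.
    vmax = max(values)
    weights = [math.exp(v - vmax) for v in values]
    starts = rng.choices(raw, weights=weights, k=num_restarts)
    # Step (ii): refine every start locally and keep the best refined point.
    refined = [hill_climb(acq, x, lower, upper) for x in starts]
    best = max(refined, key=acq)
    return best, acq(best), starts


def toy_acq(x):
    # Toy acquisition function with its maximum at x = 0.3.
    return -((x - 0.3) ** 2)


best_x, best_value, starts = two_step_maximize(
    toy_acq, 0.0, 1.0, num_restarts=5, raw_samples=64
)
```

Because each start is only ever improved, the returned value is never worse than the best starting point.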
- Parameters:
acq_function (AcquisitionFunction | None) – An AcquisitionFunction.
bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of X (if inequality_constraints is provided, these bounds can be -inf and +inf, respectively).
q (int) – The number of candidates.
num_restarts (int) – The number of starting points for multistart acquisition function optimization. Even though the name suggests this happens sequentially, it is done in parallel (using batched evaluations) for up to options.batch_limit candidates (by default completely parallel).
raw_samples (int | None) – The number of samples for initialization. This is required if batch_initial_conditions is not specified.
options (dict[str, bool | float | int | str] | None) – Options for both optimization, passed to gen_candidates, and initialization, passed to the ic_generator via the options kwarg.
inequality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form \sum_i (X[indices[i]] * coefficients[i]) >= rhs. indices and coefficients should be torch tensors. See the docstring of make_scipy_linear_constraints for an example. When q=1, or when applying the same constraint to each candidate in the batch (intra-point constraint), indices should be a 1-d tensor. For inter-point constraints, in which the constraint is applied to the whole batch of candidates, indices must be a 2-d tensor, where in each row indices[i] = (k_i, l_i) the first index k_i corresponds to the k_i-th element of the q-batch and the second index l_i corresponds to the l_i-th feature of that element.
equality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form \sum_i (X[indices[i]] * coefficients[i]) = rhs. See the docstring of make_scipy_linear_constraints for an example.
nonlinear_inequality_constraints (list[tuple[Callable, bool]] | None) – A list of tuples representing the nonlinear inequality constraints. The first element in the tuple is a callable representing a constraint of the form callable(x) >= 0. In case of an intra-point constraint, callable() takes in a one-dimensional tensor of shape d and returns a scalar. In case of an inter-point constraint, callable() takes a two-dimensional tensor of shape q x d and again returns a scalar. The second element is a boolean, indicating if it is an intra-point or inter-point constraint (True for intra-point, False for inter-point). For more information on intra-point vs inter-point constraints, see the docstring of the inequality_constraints argument to optimize_acqf(). The constraints will later be passed to the scipy solver. You need to pass in batch_initial_conditions in this case. Using non-linear inequality constraints also requires that batch_limit is set to 1, which will be done automatically if not specified in options.
fixed_features (Mapping[int, float | Tensor] | None) – A map {feature_index: value} for features that should be fixed to a particular value during generation. The value can be a float, in which case the feature is fixed across the entire batch, or a Tensor, in which case the feature can be fixed to different values for each batch element (used for batched optimization with different fixed features per restart). When passing tensors as values, they should have shape b or b x q.
post_processing_func (Callable[[Tensor], Tensor] | None) – A function that post-processes an optimization result appropriately (i.e., according to round-trip transformations).
batch_initial_conditions (Tensor | None) – A tensor to specify the initial conditions. Set this if you do not want to use the default initialization strategy.
return_best_only (bool) – If False, outputs the solutions corresponding to all random restart initializations of the optimization.
gen_candidates (Callable[[Tensor, AcquisitionFunction, Any], tuple[Tensor, Tensor]] | None) – A callable for generating candidates (and their associated acquisition values) given a tensor of initial conditions and an acquisition function. Other common inputs include lower and upper bounds and a dictionary of options, but refer to the documentation of specific generation functions (e.g., botorch.optim.optimize.gen_candidates_scipy and botorch.generation.gen.gen_candidates_torch) for method-specific inputs. Default: gen_candidates_scipy
sequential (bool) – If False, uses joint optimization, otherwise uses sequential optimization for optimizing multiple joint candidates (q > 1).
acq_function_sequence (list[AcquisitionFunction] | None) – A list of acquisition functions to be optimized sequentially. Must be of length q>1, and requires sequential=True. Used for ensembling candidates from different acquisition functions. If omitted, use acq_function to generate all q candidates.
ic_generator (Callable[[qKnowledgeGradient, Tensor, int, int, int, dict[int, float] | None, dict[str, bool | float | int] | None, list[tuple[Tensor, Tensor, float]] | None, list[tuple[Tensor, Tensor, float]] | None], Tensor | None] | None) – Function for generating initial conditions. Not needed when batch_initial_conditions are provided. Defaults to gen_one_shot_kg_initial_conditions for qKnowledgeGradient acquisition functions and gen_batch_initial_conditions otherwise. Must be specified for nonlinear inequality constraints.
timeout_sec (float | None) – Max amount of time optimization can run for.
return_acq_values (bool) – Return acquisition values.
return_full_tree (bool) – Return the full tree of optimizers of the previous iteration.
retry_on_optimization_warning (bool) – Whether to retry candidate generation with a new set of initial conditions when it fails with an OptimizationWarning.
ic_gen_kwargs (Any) – Additional keyword arguments passed to function specified by ic_generator.
- Returns:
A two-element tuple containing
- A tensor of generated candidates. The shape is q x d if return_best_only is True (default) and num_restarts x q x d if return_best_only is False.
- A tensor of associated acquisition values if return_acq_values=True, else None. If sequential=False, this is a (num_restarts)-dim tensor of joint acquisition values (with explicit restart dimension if return_best_only=False). If sequential=True, this is a q-dim tensor of expected acquisition values conditional on having observed candidates 0,1,...,i-1.
- Return type:
tuple[Tensor, Tensor | None]
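The (indices, coefficients, rhs) format for linear inequality constraints is easiest to see on a small batch. The sketch below uses nested Python lists in place of tensors, and `is_feasible` is a hypothetical helper written for illustration: integer indices are treated as an intra-point constraint checked on every point, and (k, l) pairs as an inter-point constraint over the whole q-batch.

```python
def is_feasible(X, indices, coefficients, rhs):
    """Check sum_i X[indices[i]] * coefficients[i] >= rhs on a q x d batch X."""
    if all(isinstance(i, int) for i in indices):
        # Intra-point: the same constraint must hold for every point in X.
        return all(
            sum(x[i] * c for i, c in zip(indices, coefficients)) >= rhs
            for x in X
        )
    # Inter-point: each index is a pair (k, l), meaning feature l of point k.
    return sum(X[k][l] * c for (k, l), c in zip(indices, coefficients)) >= rhs


X = [[0.2, 0.5], [0.6, 0.1]]  # q = 2 points with d = 2 features each

# Intra-point: x[0] + x[1] >= rhs, checked separately for each point.
intra_ok = is_feasible(X, [0, 1], [1.0, 1.0], 0.6)
intra_bad = is_feasible(X, [0, 1], [1.0, 1.0], 0.8)

# Inter-point: X[1][0] - X[0][0] >= rhs, one check for the whole batch.
inter_ok = is_feasible(X, [(1, 0), (0, 0)], [1.0, -1.0], 0.3)
inter_bad = is_feasible(X, [(1, 0), (0, 0)], [1.0, -1.0], 0.5)
```

An upper bound such as x[0] <= 1 is expressed in this format by negating both sides, i.e. coefficient -1.0 and rhs -1.0.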
Example
>>> # generate ``q=2`` candidates jointly using 20 random restarts
>>> # and 512 raw samples
>>> candidates, acq_value = optimize_acqf(qEI, bounds, 2, 20, 512)

>>> # generate ``q=3`` candidates sequentially using 15 random restarts
>>> # and 256 raw samples
>>> qEI = qExpectedImprovement(model, best_f=0.2)
>>> bounds = torch.tensor([[0.], [1.]])
>>> candidates, acq_value_list = optimize_acqf(
>>>     qEI, bounds, 3, 15, 256, sequential=True
>>> )
- botorch.optim.optimize.optimize_acqf_cyclic(acq_function, bounds, q, num_restarts, raw_samples=None, options=None, inequality_constraints=None, equality_constraints=None, fixed_features=None, post_processing_func=None, batch_initial_conditions=None, cyclic_options=None, *, ic_generator=None, timeout_sec=None, return_acq_values=True, return_full_tree=False, retry_on_optimization_warning=True, **ic_gen_kwargs)[source]
Generate a set of q candidates via cyclic optimization.
- Parameters:
acq_function (AcquisitionFunction) – An AcquisitionFunction.
bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of X (if inequality_constraints is provided, these bounds can be -inf and +inf, respectively).
q (int) – The number of candidates.
num_restarts (int) – Number of starting points for multistart acquisition function optimization.
raw_samples (int | None) – Number of samples for initialization. This is required if batch_initial_conditions is not specified.
options (dict[str, bool | float | int | str] | None) – Options for candidate generation.
inequality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form \sum_i (X[indices[i]] * coefficients[i]) >= rhs.
equality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form \sum_i (X[indices[i]] * coefficients[i]) = rhs.
fixed_features (Mapping[int, float | Tensor] | None) – A map {feature_index: value} for features that should be fixed to a particular value during generation. The value can be a float, in which case the feature is fixed across the entire batch, or a Tensor, in which case the feature can be fixed to different values for each batch element.
post_processing_func (Callable[[Tensor], Tensor] | None) – A function that post-processes an optimization result appropriately (i.e., according to round-trip transformations).
batch_initial_conditions (Tensor | None) – A tensor to specify the initial conditions. If no initial conditions are provided, the default initialization will be used.
cyclic_options (dict[str, bool | float | int | str] | None) – Options for stopping criterion for outer cyclic optimization.
ic_generator (Callable[[qKnowledgeGradient, Tensor, int, int, int, dict[int, float] | None, dict[str, bool | float | int] | None, list[tuple[Tensor, Tensor, float]] | None, list[tuple[Tensor, Tensor, float]] | None], Tensor | None] | None) – Function for generating initial conditions. Not needed when batch_initial_conditions are provided. Defaults to gen_one_shot_kg_initial_conditions for qKnowledgeGradient acquisition functions and gen_batch_initial_conditions otherwise. Must be specified for nonlinear inequality constraints.
timeout_sec (float | None) – Max amount of time optimization can run for.
return_acq_values (bool) – Return acquisition values.
return_full_tree (bool) – Return the full tree of optimizers of the previous iteration.
retry_on_optimization_warning (bool) – Whether to retry candidate generation with a new set of initial conditions when it fails with an OptimizationWarning.
ic_gen_kwargs (Any) – Additional keyword arguments passed to function specified by ic_generator.
- Returns:
A two-element tuple containing
- a q x d-dim tensor of generated candidates.
- a q-dim tensor of expected acquisition values, where the value at index i is the acquisition value conditional on having observed all candidates except candidate i. Returns None if return_acq_values=False.
- Return type:
tuple[Tensor, Tensor | None]
Example
>>> # generate ``q=3`` candidates cyclically using 15 random restarts,
>>> # 256 raw samples, and 4 cycles
>>>
>>> qEI = qExpectedImprovement(model, best_f=0.2)
>>> bounds = torch.tensor([[0.], [1.]])
>>> candidates, acq_value_list = optimize_acqf_cyclic(
>>>     qEI, bounds, 3, 15, 256, cyclic_options={"maxiter": 4}
>>> )
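The idea of cyclic optimization can be sketched without any model: each candidate is re-optimized in turn while the others are held fixed, and the cycle repeats until nothing changes or a cycle limit is hit. The code below is a toy illustration on a 1-D grid, assuming a hand-written joint objective; `cyclic_maximize` and `joint_acq` are hypothetical names and do not reflect how botorch implements the routine.

```python
def joint_acq(xs):
    # Toy joint objective: stay near 0.5 but keep the candidates spread out.
    fit = sum(-((x - 0.5) ** 2) for x in xs)
    spread = min(abs(a - b) for i, a in enumerate(xs) for b in xs[i + 1:])
    return fit + spread


def cyclic_maximize(acq, candidates, grid, maxiter):
    candidates = list(candidates)
    for _ in range(maxiter):
        previous = list(candidates)
        for i in range(len(candidates)):
            # Re-optimize candidate i while all other candidates stay fixed.
            def value_at(x, i=i):
                return acq(candidates[:i] + [x] + candidates[i + 1:])

            # The current value is listed first, so a cycle never gets worse.
            candidates[i] = max([candidates[i]] + grid, key=value_at)
        if candidates == previous:
            break  # a full cycle changed nothing
    return candidates


grid = [i / 20 for i in range(21)]
initial = [0.0, 0.0, 0.0]
result = cyclic_maximize(joint_acq, initial, grid, maxiter=4)
```

Here `maxiter` plays the role that the "maxiter" entry of cyclic_options plays in the example above.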
- botorch.optim.optimize.optimize_acqf_list(acq_function_list, bounds, num_restarts, raw_samples=None, options=None, inequality_constraints=None, equality_constraints=None, nonlinear_inequality_constraints=None, fixed_features=None, fixed_features_list=None, post_processing_func=None, ic_generator=None, ic_gen_kwargs=None, return_acq_values=True)[source]
Generate a list of candidates from a list of acquisition functions.
The acquisition functions are optimized in sequence, with previous candidates set as X_pending. This is also known as sequential greedy optimization.
- Parameters:
acq_function_list (list[AcquisitionFunction]) – A list of acquisition functions.
bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of X (if inequality_constraints is provided, these bounds can be -inf and +inf, respectively).
num_restarts (int) – Number of starting points for multistart acquisition function optimization.
raw_samples (int | None) – Number of samples for initialization. This is required if batch_initial_conditions is not specified.
options (dict[str, bool | float | int | str] | None) – Options for candidate generation.
inequality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form \sum_i (X[indices[i]] * coefficients[i]) >= rhs.
equality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form \sum_i (X[indices[i]] * coefficients[i]) = rhs.
nonlinear_inequality_constraints (list[tuple[Callable, bool]] | None) – A list of tuples representing the nonlinear inequality constraints. The first element in the tuple is a callable representing a constraint of the form callable(x) >= 0. In case of an intra-point constraint, callable() takes in a one-dimensional tensor of shape d and returns a scalar. In case of an inter-point constraint, callable() takes a two-dimensional tensor of shape q x d and again returns a scalar. The second element is a boolean, indicating if it is an intra-point or inter-point constraint (True for intra-point, False for inter-point). For more information on intra-point vs inter-point constraints, see the docstring of the inequality_constraints argument to optimize_acqf(). The constraints will later be passed to the scipy solver. You need to pass in batch_initial_conditions in this case. Using non-linear inequality constraints also requires that batch_limit is set to 1, which will be done automatically if not specified in options.
fixed_features (dict[int, float] | None) – A map {feature_index: value} for features that should be fixed to a particular value during generation.
fixed_features_list (list[dict[int, float]] | None) – A list of maps {feature_index: value}. The i-th item represents the fixed_feature for the i-th optimization. If fixed_features_list is provided, optimize_acqf_mixed is invoked.
post_processing_func (Callable[[Tensor], Tensor] | None) – A function that post-processes an optimization result appropriately (i.e., according to round-trip transformations).
ic_generator (Callable[[qKnowledgeGradient, Tensor, int, int, int, dict[int, float] | None, dict[str, bool | float | int] | None, list[tuple[Tensor, Tensor, float]] | None, list[tuple[Tensor, Tensor, float]] | None], Tensor | None] | None) – Function for generating initial conditions. Not needed when batch_initial_conditions are provided. Defaults to gen_one_shot_kg_initial_conditions for qKnowledgeGradient acquisition functions and gen_batch_initial_conditions otherwise. Must be specified for nonlinear inequality constraints.
ic_gen_kwargs (dict | None) – Additional keyword arguments passed to function specified by ic_generator.
return_acq_values (bool) – Return acquisition values.
- Returns:
A two-element tuple containing
- a q x d-dim tensor of generated candidates.
- a q-dim tensor of expected acquisition values, where the value at index i is the acquisition value conditional on having observed all candidates except candidate i. Returns None if return_acq_values=False.
- Return type:
tuple[Tensor, Tensor | None]
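Sequential greedy optimization over a list of acquisition functions can be pictured as a loop in which each function sees the candidates picked before it. The sketch below is a toy version on a 1-D grid: the pending-point mechanism is modeled as a plain list argument, and `make_acq` and `optimize_list_sketch` are hypothetical helpers, not botorch functions.

```python
def make_acq(center):
    """Build a toy acquisition function that avoids already pending points."""
    def acq(x, pending):
        crowding = sum(max(0.0, 0.2 - abs(x - p)) for p in pending)
        return -((x - center) ** 2) - crowding
    return acq


def optimize_list_sketch(acq_list, grid):
    pending, values = [], []
    for acq in acq_list:
        # Each acquisition function is optimized given the earlier picks.
        best = max(grid, key=lambda x: acq(x, pending))
        values.append(acq(best, pending))
        pending.append(best)
    return pending, values


grid = [i / 20 for i in range(21)]
candidates, values = optimize_list_sketch([make_acq(0.5), make_acq(0.5)], grid)
```

Both toy functions prefer 0.5, but the second one is pushed away from it because the first pick is already pending.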
- botorch.optim.optimize.optimize_acqf_mixed(acq_function, bounds, q, num_restarts, fixed_features_list, raw_samples=None, options=None, inequality_constraints=None, equality_constraints=None, nonlinear_inequality_constraints=None, post_processing_func=None, batch_initial_conditions=None, return_best_only=True, gen_candidates=None, ic_generator=None, timeout_sec=None, retry_on_optimization_warning=True, ic_gen_kwargs=None, return_acq_values=True)[source]
Optimize over a list of fixed_features and returns the best solution.
This is useful for optimizing over mixed continuous and discrete domains. For q > 1 this function always performs sequential greedy optimization (with proper conditioning on generated candidates).
NOTE: This method does not support the kind of “inter-point constraints” that are supported by optimize_acqf().
- Parameters:
acq_function (AcquisitionFunction) – An AcquisitionFunction.
bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of X (if inequality_constraints is provided, these bounds can be -inf and +inf, respectively).
q (int) – The number of candidates.
num_restarts (int) – Number of starting points for multistart acquisition function optimization.
raw_samples (int | None) – Number of samples for initialization. This is required if batch_initial_conditions is not specified.
fixed_features_list (list[dict[int, float]]) – A list of maps {feature_index: value}. The i-th item represents the fixed_feature for the i-th optimization.
options (dict[str, bool | float | int | str] | None) – Options for candidate generation.
inequality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form \sum_i (X[indices[i]] * coefficients[i]) >= rhs.
equality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form \sum_i (X[indices[i]] * coefficients[i]) = rhs.
nonlinear_inequality_constraints (list[tuple[Callable, bool]] | None) – A list of tuples representing the nonlinear inequality constraints. The first element in the tuple is a callable representing a constraint of the form callable(x) >= 0. The callable() takes in a one-dimensional tensor of shape d and returns a scalar. The second element is a boolean, indicating if it is an intra-point or inter-point constraint (True for intra-point, False for inter-point). Since inter-point constraints are not supported by this method, this has to be True; an error is raised if it is False.
post_processing_func (Callable[[Tensor], Tensor] | None) – A function that post-processes an optimization result appropriately (i.e., according to round-trip transformations).
batch_initial_conditions (Tensor | None) – A tensor to specify the initial conditions. Set this if you do not want to use the default initialization strategy.
return_best_only (bool) – If False, outputs the solutions corresponding to all random restart initializations of the optimization. Setting this keyword to False is only allowed for q=1. Defaults to True.
gen_candidates (Callable[[Tensor, AcquisitionFunction, Any], tuple[Tensor, Tensor]] | None) – A callable for generating candidates (and their associated acquisition values) given a tensor of initial conditions and an acquisition function. Other common inputs include lower and upper bounds and a dictionary of options, but refer to the documentation of specific generation functions (e.g., gen_candidates_scipy and gen_candidates_torch) for method-specific inputs. Default: gen_candidates_scipy
ic_generator (Callable[[qKnowledgeGradient, Tensor, int, int, int, dict[int, float] | None, dict[str, bool | float | int] | None, list[tuple[Tensor, Tensor, float]] | None, list[tuple[Tensor, Tensor, float]] | None], Tensor | None] | None) – Function for generating initial conditions. Not needed when batch_initial_conditions are provided. Defaults to gen_one_shot_kg_initial_conditions for qKnowledgeGradient acquisition functions and gen_batch_initial_conditions otherwise. Must be specified for nonlinear inequality constraints.
timeout_sec (float | None) – Max amount of time optimization can run for.
retry_on_optimization_warning (bool) – Whether to retry candidate generation with a new set of initial conditions when it fails with an OptimizationWarning.
ic_gen_kwargs (dict | None) – Additional keyword arguments passed to function specified by ic_generator.
return_acq_values (bool) – Return acquisition values. Can be set to False to avoid memory intensive joint forward evaluation of the acquisition function.
- Returns:
A two-element tuple containing
- A tensor of generated candidates. The shape is q x d if return_best_only is True (default) and num_restarts x q x d if return_best_only is False.
- A tensor of associated acquisition values of dim num_restarts if return_best_only=False else a scalar acquisition value.
- Return type:
tuple[Tensor, Tensor | None]
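Optimizing over a list of fixed features amounts to solving one continuous problem per entry and keeping the best outcome. The sketch below shows that outer loop for q = 1, with a grid search standing in for the continuous optimizer; `optimize_mixed_sketch` and `toy_acq` are hypothetical names used only for illustration.

```python
import itertools


def optimize_mixed_sketch(acq, grid, d, fixed_features_list):
    best_x, best_val = None, float("-inf")
    for fixed in fixed_features_list:
        free = [i for i in range(d) if i not in fixed]
        # Search the free features on a grid with the fixed ones pinned.
        for combo in itertools.product(grid, repeat=len(free)):
            x = [0.0] * d
            for i, v in fixed.items():
                x[i] = v
            for i, v in zip(free, combo):
                x[i] = v
            val = acq(x)
            if val > best_val:
                best_x, best_val = x, val
    return best_x, best_val


def toy_acq(x):
    # Feature 0 is continuous; feature 1 only takes the values 0, 1, or 2.
    return -((x[0] - 0.25) ** 2) - (x[1] - 1.2) ** 2


grid = [i / 20 for i in range(21)]
fixed_features_list = [{1: 0.0}, {1: 1.0}, {1: 2.0}]
best_x, best_val = optimize_mixed_sketch(toy_acq, grid, 2, fixed_features_list)
```

Each dictionary in `fixed_features_list` pins the discrete feature to one of its allowed values, which is how a mixed domain is enumerated.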
- botorch.optim.optimize.optimize_acqf_discrete(acq_function, q, choices, max_batch_size=2048, unique=True, return_acq_values=True, X_avoid=None, inequality_constraints=None)[source]
Optimize over a discrete set of points using batch evaluation.
For q > 1 this function generates candidates by means of sequential conditioning (rather than joint optimization), since for all but the smallest number of choices the set choices^q of discrete points to evaluate quickly explodes.
- Parameters:
acq_function (AcquisitionFunction) – An AcquisitionFunction.
q (int) – The number of candidates.
choices (Tensor) – A num_choices x d tensor of possible choices.
max_batch_size (int) – The maximum number of choices to evaluate in batch. A large limit can cause excessive memory usage if the model has a large training set.
unique (bool) – If True return unique choices, o/w choices may be repeated (only relevant if q > 1).
X_avoid (Tensor | None) – An n x d tensor of candidates that we aren’t allowed to pick. These will be removed from the set of choices.
inequality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form \sum_i (X[indices[i]] * coefficients[i]) >= rhs. Infeasible points will be removed from the set of choices.
return_acq_values (bool) – Return acquisition values. Can be set to False to avoid memory intensive joint forward evaluation of the acquisition function.
- Returns:
A two-element tuple containing
- a q x d-dim tensor of generated candidates.
- an associated acquisition value.
- Return type:
tuple[Tensor, Tensor]
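The filtering and sequential selection described above can be sketched with tuples in place of tensors. This is an illustrative simplification: `optimize_discrete_sketch` is a hypothetical helper, the toy acquisition function ignores the points selected so far, and batching via max_batch_size is omitted.

```python
def optimize_discrete_sketch(acq, q, choices, unique=True, X_avoid=None,
                             inequality_constraints=None):
    avoid = {tuple(x) for x in (X_avoid or [])}
    # Remove points we are not allowed to pick.
    pool = [tuple(c) for c in choices if tuple(c) not in avoid]
    # Remove points that violate any sum_i x[indices[i]] * coef[i] >= rhs.
    for indices, coefficients, rhs in inequality_constraints or []:
        pool = [
            c for c in pool
            if sum(c[i] * w for i, w in zip(indices, coefficients)) >= rhs
        ]
    selected = []
    for _ in range(q):
        if not pool:
            break
        # Pick the best remaining choice given what was selected so far.
        best = max(pool, key=lambda c: acq(c, selected))
        selected.append(best)
        if unique:
            pool.remove(best)
    return selected


def toy_acq(c, selected):
    return 2 * c[0] + c[1]


choices = [[0, 0], [0, 1], [1, 0], [1, 1], [2, 0]]
picked = optimize_discrete_sketch(
    toy_acq,
    q=2,
    choices=choices,
    X_avoid=[[1, 1]],
    inequality_constraints=[([0], [-1.0], -1.0)],  # encodes x[0] <= 1
)
```

With unique=True a point leaves the pool once picked, so the second candidate is the runner-up among the feasible choices.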
- botorch.optim.optimize.optimize_acqf_discrete_local_search(acq_function, discrete_choices, q, num_restarts=20, raw_samples=4096, inequality_constraints=None, X_avoid=None, batch_initial_conditions=None, max_batch_size=2048, max_tries=100, unique=True, return_acq_values=True)[source]
Optimize acquisition function over a lattice.
This is useful when d is large and enumeration of the search space isn’t possible. For q > 1 this function always performs sequential greedy optimization (with proper conditioning on generated candidates).
NOTE: While this method supports arbitrary lattices, it has only been thoroughly tested for {0, 1}^d. Consider it to be in alpha stage for the more general case.
- Parameters:
acq_function (AcquisitionFunction) – An AcquisitionFunction
discrete_choices (list[Tensor]) – A list of possible discrete choices for each dimension. Each element in the list is expected to be a torch tensor.
q (int) – The number of candidates.
num_restarts (int) – Number of starting points for multistart acquisition function optimization.
raw_samples (int) – Number of samples for initialization. This is required if batch_initial_conditions is not specified.
inequality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form \sum_i (X[indices[i]] * coefficients[i]) >= rhs.
X_avoid (Tensor | None) – An n x d tensor of candidates that we aren’t allowed to pick.
batch_initial_conditions (Tensor | None) – A tensor of size n x 1 x d to specify the initial conditions. Set this if you do not want to use the default initialization strategy.
max_batch_size (int) – The maximum number of choices to evaluate in batch. A large limit can cause excessive memory usage if the model has a large training set.
max_tries (int) – Maximum number of iterations to try when generating initial conditions.
unique (bool) – If True return unique choices, o/w choices may be repeated (only relevant if q > 1).
return_acq_values (bool) – Return acquisition values. Can be set to False to avoid memory intensive joint forward evaluation of the acquisition function.
- Returns:
A two-element tuple containing
- a q x d-dim tensor of generated candidates.
- an associated acquisition value.
- Return type:
tuple[Tensor, Tensor]
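The (indices, coefficients, rhs) constraint format used above can be illustrated on a tiny lattice. The following is a minimal sketch, not BoTorch code: it uses plain Python lists instead of tensors, the helper name satisfies and the example constraint are hypothetical, and it enumerates the lattice outright, which is exactly what the local search avoids when d is large.

```python
from itertools import product


def satisfies(x, constraint):
    """Check sum_i x[indices[i]] * coefficients[i] >= rhs for one point x."""
    indices, coefficients, rhs = constraint
    return sum(x[i] * c for i, c in zip(indices, coefficients)) >= rhs


# Hypothetical 3-dimensional binary lattice {0, 1}^3.
discrete_choices = [[0, 1], [0, 1], [0, 1]]
# Hypothetical constraint: x[0] + x[2] >= 1.
constraint = ([0, 2], [1.0, 1.0], 1.0)

# Keep only the lattice points that satisfy the constraint.
feasible = [x for x in product(*discrete_choices) if satisfies(x, constraint)]
```

Of the eight lattice points, the two with x[0] = x[2] = 0 are excluded.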
Model Fitting Optimization
Tools for model fitting.
- botorch.optim.fit.fit_gpytorch_mll_scipy(mll, parameters=None, bounds=None, closure=None, closure_kwargs=None, method='L-BFGS-B', options=None, callback=None, timeout_sec=None)[source]
Generic scipy.optimize-based fitting routine for GPyTorch MLLs.
For BatchedMultiOutputGPyTorchModel instances with a non-trivial _aug_batch_shape (e.g., multi-output SingleTaskGP or EnsembleMapSaasSingleTaskGP), this automatically runs fmin_l_bfgs_b_batched to optimize each batch element’s hyperparameters independently. This converts the single high-dimensional optimization problem into multiple lower-dimensional problems that are easier to solve.
The model and likelihood in mll must already be in train mode.
- Parameters:
mll (MarginalLogLikelihood) – MarginalLogLikelihood to be maximized.
parameters (dict[str, Tensor] | None) – Optional dictionary of parameters to be optimized. Defaults to all parameters of mll that require gradients.
bounds (dict[str, tuple[float | None, float | None]] | None) – A dictionary of user-specified bounds for parameters. Used to update default parameter bounds obtained from mll.
closure (Callable[[], tuple[Tensor, Sequence[Tensor | None]]] | None) – Callable that returns a tensor and an iterable of gradient tensors. Responsible for setting the grad attributes of parameters. If no closure is provided, one will be obtained by calling get_loss_closure_with_grads. When no closure is provided and the model is a batched multi-output model, batched independent fitting is used automatically.
closure_kwargs (dict[str, Any] | None) – Keyword arguments passed to closure.
method (str) – Solver type, passed along to scipy.optimize.minimize.
options (dict[str, Any] | None) – Dictionary of solver options, passed along to scipy.optimize.minimize or fmin_l_bfgs_b_batched.
callback (Callable[[dict[str, Tensor], OptimizationResult], None] | None) – Optional callback taking parameters and an OptimizationResult as its sole arguments.
timeout_sec (float | None) – Timeout in seconds after which to terminate the fitting loop (note that timing out can result in bad fits!). Not currently supported for batched independent fitting.
- Returns:
The final OptimizationResult.
- Return type:
OptimizationResult
- botorch.optim.fit.fit_gpytorch_mll_torch(mll, parameters=None, bounds=None, closure=None, closure_kwargs=None, step_limit=None, stopping_criterion=<class 'botorch.utils.types.DEFAULT'>, optimizer=<class 'torch.optim.adam.Adam'>, scheduler=None, callback=None, timeout_sec=None)[source]
Generic torch.optim-based fitting routine for GPyTorch MLLs.
- Parameters:
mll (MarginalLogLikelihood) – MarginalLogLikelihood to be maximized.
parameters (dict[str, Tensor] | None) – Optional dictionary of parameters to be optimized. Defaults to all parameters of mll that require gradients.
bounds (dict[str, tuple[float | None, float | None]] | None) – A dictionary of user-specified bounds for parameters. Used to update default parameter bounds obtained from mll.
closure (Callable[[], tuple[Tensor, Sequence[Tensor | None]]] | None) – Callable that returns a tensor and an iterable of gradient tensors. Responsible for setting the grad attributes of parameters. If no closure is provided, one will be obtained by calling get_loss_closure_with_grads.
closure_kwargs (dict[str, Any] | None) – Keyword arguments passed to closure.
step_limit (int | None) – Optional upper bound on the number of optimization steps.
stopping_criterion (StoppingCriterion | None) – A StoppingCriterion for the optimization loop.
optimizer (Optimizer | Callable[[...], Optimizer]) – A torch.optim.Optimizer instance or a factory that takes a list of parameters and returns an Optimizer instance.
scheduler (_LRScheduler | Callable[[...], _LRScheduler] | None) – A torch.optim.lr_scheduler._LRScheduler instance or a factory that takes an Optimizer instance and returns an _LRScheduler.
callback (Callable[[dict[str, Tensor], OptimizationResult], None] | None) – Optional callback taking parameters and an OptimizationResult as its sole arguments.
timeout_sec (float | None) – Timeout in seconds after which to terminate the fitting loop (note that timing out can result in bad fits!).
- Returns:
The final OptimizationResult.
- Return type:
OptimizationResult
Initialization Helpers
References
R. G. Regis, C. A. Shoemaker. Combining radial basis function surrogates and dynamic coordinate search in high-dimensional expensive black-box optimization, Engineering Optimization, 2013.
- botorch.optim.initializers.transform_constraints(constraints, q, d)[source]
Transform constraints to sample from a d*q-dimensional space instead of a d-dimensional space.
This function assumes that constraints are the same for each input batch, and broadcasts the constraints accordingly to the input batch shape.
- Parameters:
constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an (in-)equality constraint of the form \sum_i (X[indices[i]] * coefficients[i]) (>)= rhs. If indices is a 2-d Tensor, this supports specifying constraints across the points in the q-batch (inter-point constraints). If None, this function is a no-op and simply returns None.
q (int) – Size of the q-batch.
d (int) – Dimensionality of the problem.
- Returns:
List of transformed constraints, if there are constraints. Returns None otherwise.
- Return type:
List[Tuple[Tensor, Tensor, float]]
- botorch.optim.initializers.transform_intra_point_constraint(constraint, d, q)[source]
Transforms an intra-point/pointwise constraint from d-dimensional space to a d*q-dimensional space.
- Parameters:
constraint (tuple[Tensor, Tensor, float]) – A tuple (indices, coefficients, rhs) encoding an (in-)equality constraint of the form \sum_i (X[indices[i]] * coefficients[i]) (>)= rhs. Here indices must be one-dimensional, and the constraint is applied to all points within the q-batch.
d (int) – Dimensionality of the problem.
q (int) – Size of the q-batch.
- Raises:
ValueError – If indices in the constraints are larger than the dimensionality d of the problem.
- Returns:
List of transformed constraints.
- Return type:
List[Tuple[Tensor, Tensor, float]]
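The index arithmetic behind this transformation can be sketched in plain Python. This is an illustration only, assuming the q x d batch is flattened row by row so that feature j of point k sits at flat position k * d + j; the function name transform_intra_point_sketch is hypothetical and lists stand in for tensors.

```python
def transform_intra_point_sketch(constraint, d, q):
    """Replicate one pointwise constraint for every point of a flattened q-batch."""
    indices, coefficients, rhs = constraint
    if max(indices) > d - 1:
        raise ValueError("Constraint indices cannot exceed the problem dimension d.")
    # Point k of the q-batch occupies flat positions k*d, ..., k*d + d - 1
    # (assumed row-major layout).
    return [
        ([k * d + j for j in indices], list(coefficients), rhs)
        for k in range(q)
    ]


# Hypothetical constraint x[0] - x[2] >= 0.5 for d=3, applied to q=2 points.
transformed = transform_intra_point_sketch(([0, 2], [1.0, -1.0], 0.5), d=3, q=2)
```

One constraint per point comes back, which is consistent with the list return type documented above.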
- botorch.optim.initializers.transform_inter_point_constraint(constraint, d)[source]
Transforms an inter-point constraint from d-dimensional space to a d*q-dimensional space.
- Parameters:
constraint (tuple[Tensor, Tensor, float]) – A tuple (indices, coefficients, rhs) encoding an (in-)equality constraint of the form \sum_i (X[indices[i]] * coefficients[i]) (>)= rhs. indices must be a 2-d Tensor, where in each row indices[i] = (k_i, l_i) the first index k_i corresponds to the k_i-th element of the q-batch and the second index l_i corresponds to the l_i-th feature of that element.
d (int) – Dimensionality of the problem.
- Raises:
ValueError – If indices in the constraints are larger than the dimensionality d of the problem.
- Returns:
Transformed constraint.
- Return type:
List[Tuple[Tensor, Tensor, float]]
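For an inter-point constraint, each index pair (k_i, l_i) collapses to a single flat index. A minimal sketch, again assuming a row-major flattening of the q x d batch and using lists in place of tensors; transform_inter_point_sketch is a hypothetical name:

```python
def transform_inter_point_sketch(constraint, d):
    """Map (point, feature) index pairs to positions in the flattened q-batch."""
    indices, coefficients, rhs = constraint
    if max(feature for _, feature in indices) > d - 1:
        raise ValueError("Constraint indices cannot exceed the problem dimension d.")
    # Feature l of point k sits at flat position k * d + l (assumed layout).
    return ([k * d + l for k, l in indices], list(coefficients), rhs)


# Hypothetical constraint X[0, 1] - X[1, 1] >= 0 for d=3.
flat = transform_inter_point_sketch(([(0, 1), (1, 1)], [1.0, -1.0], 0.0), d=3)
```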
- botorch.optim.initializers.sample_q_batches_from_polytope(n, q, bounds, n_burnin, n_thinning, seed=None, inequality_constraints=None, equality_constraints=None)[source]
Samples n q-batches from a polytope of dimension d.
- Parameters:
n (int) – Number of q-batches to sample.
q (int) – Number of samples per q-batch.
bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of X.
n_burnin (int) – The number of burn-in samples for the Markov chain sampler.
n_thinning (int) – The amount of thinning. The sampler will return every n_thinning-th sample (after burn-in).
seed (int | None) – The random seed.
inequality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form \sum_i (X[indices[i]] * coefficients[i]) >= rhs.
equality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form \sum_i (X[indices[i]] * coefficients[i]) = rhs.
- Returns:
An n x q x d-dim tensor of samples.
- Return type:
Tensor
- botorch.optim.initializers.gen_batch_initial_conditions(acq_function, bounds, q, num_restarts, raw_samples, fixed_features=None, options=None, inequality_constraints=None, equality_constraints=None, generator=None, fixed_X_fantasies=None)[source]
Generate a batch of initial conditions for random-restart optimization.
TODO: Support t-batches of initial conditions.
- Parameters:
acq_function (AcquisitionFunction) – The acquisition function to be optimized.
bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of X.
q (int) – The number of candidates to consider.
num_restarts (int) – The number of starting points for multistart acquisition function optimization.
raw_samples (int) – The number of raw samples to consider in the initialization heuristic. Note: if sample_around_best is True (the default is False), then 2 * raw_samples samples are used.
fixed_features (dict[int, float] | None) – A map {feature_index: value} for features that should be fixed to a particular value during generation.
options (dict[str, bool | float | int] | None) – Options for initial condition generation. For valid options see initialize_q_batch_topn, initialize_q_batch_nonneg, and initialize_q_batch. If options contains a topn=True entry, then initialize_q_batch_topn will be used. Else if options contains a nonnegative=True entry, then acq_function is assumed to be non-negative (useful when using custom acquisition functions). initialize_q_batch will be used otherwise. In addition, an “init_batch_limit” option can be passed to specify the batch limit for the initialization. This is useful for avoiding memory limits when computing the batch posterior over raw samples.
inequality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form \sum_i (X[indices[i]] * coefficients[i]) >= rhs.
equality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form \sum_i (X[indices[i]] * coefficients[i]) = rhs.
generator (Callable[[int, int, int | None], Tensor] | None) – Callable for generating samples that are then further processed. It receives n, q and seed as arguments and returns a tensor of shape n x q x d.
fixed_X_fantasies (Tensor | None) – A fixed set of fantasy points to concatenate to the q candidates being initialized along the -2 dimension. The shape should be num_pseudo_points x d. E.g., this should be num_fantasies x d for KG and num_fantasies*num_pareto x d for HVKG.
- Returns:
A num_restarts x q x d tensor of initial conditions.
- Return type:
Tensor
Example
>>> qEI = qExpectedImprovement(model, best_f=0.2)
>>> bounds = torch.tensor([[0.], [1.]])
>>> Xinit = gen_batch_initial_conditions(
...     qEI, bounds, q=3, num_restarts=25, raw_samples=500
... )
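The way the options dictionary selects an initialization heuristic, as described for the options parameter, can be summarized as a small dispatch. This sketch only mirrors the documented precedence (topn first, then nonnegative, then the default); the function pick_initializer is hypothetical and returns the heuristic's name rather than calling it, and the library may apply additional checks that are not shown here.

```python
def pick_initializer(options=None):
    """Return the name of the heuristic implied by the documented option precedence."""
    options = options or {}
    if options.get("topn"):
        return "initialize_q_batch_topn"
    if options.get("nonnegative"):
        return "initialize_q_batch_nonneg"
    return "initialize_q_batch"


default_choice = pick_initializer()
nonneg_choice = pick_initializer({"nonnegative": True})
# topn takes precedence when both flags are set.
topn_choice = pick_initializer({"topn": True, "nonnegative": True})
```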
- botorch.optim.initializers.gen_one_shot_kg_initial_conditions(acq_function, bounds, q, num_restarts, raw_samples, fixed_features=None, options=None, inequality_constraints=None, equality_constraints=None)[source]
Generate a batch of smart initializations for qKnowledgeGradient.
This function generates initial conditions for optimizing one-shot KG using the maximizer of the posterior objective. Intuitively, the maximizer of the fantasized posterior will often be close to a maximizer of the current posterior. This function uses that fact to generate the initial conditions for the fantasy points. Specifically, a fraction of 1 - frac_random (see options) is generated by sampling from the set of maximizers of the posterior objective (obtained via random restart optimization) according to a softmax transformation of their respective values. This means that this initialization strategy internally solves an acquisition function maximization problem. The remaining frac_random fantasy points as well as all q candidate points are chosen according to the standard initialization strategy in gen_batch_initial_conditions.
- Parameters:
acq_function (qKnowledgeGradient) – The qKnowledgeGradient instance to be optimized.
bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of task features.
q (int) – The number of candidates to consider.
num_restarts (int) – The number of starting points for multistart acquisition function optimization.
raw_samples (int) – The number of raw samples to consider in the initialization heuristic.
fixed_features (dict[int, float] | None) – A map {feature_index: value} for features that should be fixed to a particular value during generation.
options (dict[str, bool | float | int] | None) – Options for initial condition generation. These contain all settings for the standard heuristic initialization from gen_batch_initial_conditions. In addition, they contain frac_random (the fraction of fully random fantasy points), num_inner_restarts and raw_inner_samples (the number of random restarts and raw samples for solving the posterior objective maximization problem, respectively) and eta (temperature parameter for sampling heuristic from posterior objective maximizers).
inequality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form \sum_i (X[indices[i]] * coefficients[i]) >= rhs.
equality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form \sum_i (X[indices[i]] * coefficients[i]) = rhs.
- Returns:
A num_restarts x q' x d tensor that can be used as initial conditions for optimize_acqf(). Here q' = q + num_fantasies is the total number of points (candidate points plus fantasy points).
- Return type:
Tensor | None
Example
>>> qKG = qKnowledgeGradient(model, num_fantasies=64)
>>> bounds = torch.tensor([[0., 0.], [1., 1.]])
>>> Xinit = gen_one_shot_kg_initial_conditions(
...     qKG, bounds, q=3, num_restarts=10, raw_samples=512,
...     options={"frac_random": 0.25},
... )
- botorch.optim.initializers.gen_one_shot_hvkg_initial_conditions(acq_function, bounds, q, num_restarts, raw_samples, fixed_features=None, options=None, inequality_constraints=None, equality_constraints=None)[source]
Generate a batch of smart initializations for qHypervolumeKnowledgeGradient.
This function generates initial conditions for optimizing one-shot HVKG using the hypervolume maximizing set (of fixed size) under the posterior mean. Intuitively, the hypervolume maximizing set of the fantasized posterior mean will often be close to a hypervolume maximizing set under the current posterior mean. This function uses that fact to generate the initial conditions for the fantasy points. Specifically, a fraction of 1 - frac_random (see options) of the restarts are generated by learning the hypervolume maximizing sets under the current posterior mean, where each hypervolume maximizing set is obtained from maximizing the hypervolume from a different starting point. Given a hypervolume maximizing set, the q candidate points are selected using the standard initialization strategy in gen_batch_initial_conditions, with the fixed hypervolume maximizing set. For the remaining frac_random fraction of restarts, the fantasy points as well as all q candidate points are chosen according to the standard initialization strategy in gen_batch_initial_conditions.
- Parameters:
acq_function (qHypervolumeKnowledgeGradient) – The qHypervolumeKnowledgeGradient instance to be optimized.
bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of task features.
q (int) – The number of candidates to consider.
num_restarts (int) – The number of starting points for multistart acquisition function optimization.
raw_samples (int) – The number of raw samples to consider in the initialization heuristic.
fixed_features (dict[int, float] | None) – A map {feature_index: value} for features that should be fixed to a particular value during generation.
options (dict[str, bool | float | int] | None) – Options for initial condition generation. These contain all settings for the standard heuristic initialization from gen_batch_initial_conditions. In addition, they contain frac_random (the fraction of fully random fantasy points), num_inner_restarts and raw_inner_samples (the number of random restarts and raw samples for solving the posterior objective maximization problem, respectively) and eta (temperature parameter for sampling heuristic from posterior objective maximizers).
inequality_constraints (list[tuple[Tensor, Tensor, float]] | None) – Optionally, a list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form \sum_i (X[indices[i]] * coefficients[i]) >= rhs. Each tensor of indices must be one-dimensional, since inter-point constraints are not supported here.
equality_constraints (list[tuple[Tensor, Tensor, float]] | None) – Optionally, a list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form \sum_i (X[indices[i]] * coefficients[i]) = rhs.
- Returns:
A num_restarts x q' x d tensor that can be used as initial conditions for optimize_acqf(). Here q' = q + num_fantasies is the total number of points (candidate points plus fantasy points).
- Return type:
Tensor | None
Example
>>> qHVKG = qHypervolumeKnowledgeGradient(model, ref_point)
>>> bounds = torch.tensor([[0., 0.], [1., 1.]])
>>> Xinit = gen_one_shot_hvkg_initial_conditions(
...     qHVKG, bounds, q=3, num_restarts=10, raw_samples=512,
...     options={"frac_random": 0.25},
... )
- botorch.optim.initializers.gen_value_function_initial_conditions(acq_function, bounds, num_restarts, raw_samples, current_model, fixed_features=None, options=None)[source]
Generate a batch of smart initializations for optimizing the value function of qKnowledgeGradient.
This function generates initial conditions for optimizing the inner problem of KG, i.e. its value function, using the maximizer of the posterior objective. Intuitively, the maximizer of the fantasized posterior will often be close to a maximizer of the current posterior. This function uses that fact to generate the initial conditions for the fantasy points. Specifically, a fraction of 1 - frac_random (see options) of raw samples is generated by sampling from the set of maximizers of the posterior objective (obtained via random restart optimization) according to a softmax transformation of their respective values. This means that this initialization strategy internally solves an acquisition function maximization problem. The remaining raw samples are generated using draw_sobol_samples. All raw samples are then evaluated, and the initial conditions are selected according to the standard initialization strategy in initialize_q_batch individually for each inner problem.
- Parameters:
acq_function (AcquisitionFunction) – The value function instance to be optimized.
bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of task features.
num_restarts (int) – The number of starting points for multistart acquisition function optimization.
raw_samples (int) – The number of raw samples to consider in the initialization heuristic.
current_model (Model) – The model of the KG acquisition function that was used to generate the fantasy model of the value function.
fixed_features (dict[int, float] | None) – A map {feature_index: value} for features that should be fixed to a particular value during generation.
options (dict[str, bool | float | int] | None) – Options for initial condition generation. These contain all settings for the standard heuristic initialization from gen_batch_initial_conditions. In addition, they contain frac_random (the fraction of fully random fantasy points), num_inner_restarts and raw_inner_samples (the number of random restarts and raw samples for solving the posterior objective maximization problem, respectively) and eta (temperature parameter for sampling heuristic from posterior objective maximizers).
- Returns:
A num_restarts x batch_shape x q x d tensor that can be used as initial conditions for optimize_acqf(). Here batch_shape is the batch shape of the value function model.
- Return type:
Tensor
Example
>>> fant_X = torch.rand(5, 1, 2)
>>> fantasy_model = model.fantasize(fant_X, SobolQMCNormalSampler(16))
>>> value_function = PosteriorMean(fantasy_model)
>>> bounds = torch.tensor([[0., 0.], [1., 1.]])
>>> Xinit = gen_value_function_initial_conditions(
...     value_function, bounds, num_restarts=10, raw_samples=512,
...     current_model=model, options={"frac_random": 0.25},
... )
- botorch.optim.initializers.initialize_q_batch(X, acq_vals, n, eta=1.0)[source]
Heuristic for selecting initial conditions for candidate generation.
This heuristic selects points from X (without replacement) with probability proportional to exp(eta * Z), where Z = (acq_vals - mean(acq_vals)) / std(acq_vals) and eta is a temperature parameter.
When using an acquisition function that is non-negative and possibly zero over large areas of the feature space (e.g. qEI), you should use initialize_q_batch_nonneg instead.
- Parameters:
X (Tensor) – A b x batch_shape x q x d tensor of b - batch_shape samples of q-batches from a d-dim feature space. Typically, these are generated using qMC sampling.
acq_vals (Tensor) – A tensor of b x batch_shape outcomes associated with the samples. Typically, this is the value of the batch acquisition function to be maximized.
n (int) – The number of initial conditions to be generated. Must be less than b.
eta (float) – Temperature parameter for weighting samples.
- Returns:
- An n x batch_shape x q x d tensor of n - batch_shape q-batch initial conditions, where each batch of n x q x d samples is selected independently.
- An n x batch_shape tensor of the corresponding acquisition values.
- Return type:
tuple[Tensor, Tensor]
Example
>>> # To get ``n=10`` starting points of q-batch size ``q=3``
>>> # for model with ``d=6``:
>>> qUCB = qUpperConfidenceBound(model, beta=0.1)
>>> X_rnd = torch.rand(500, 3, 6)
>>> X_init, acq_init = initialize_q_batch(X=X_rnd, acq_vals=qUCB(X_rnd), n=10)
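The weighting rule exp(eta * Z) can be sketched with the standard library alone. This is an illustration of the sampling idea, not the library's implementation: it works on a flat list of acquisition values with no batch_shape, uses the sample standard deviation, and the helper name select_initial_indices, the seed argument, and the fallback to uniform weights when all values are equal are assumptions.

```python
import math
import random
import statistics


def select_initial_indices(acq_vals, n, eta=1.0, seed=0):
    """Draw n distinct indices with probability proportional to exp(eta * Z)."""
    rng = random.Random(seed)
    mean = statistics.fmean(acq_vals)
    std = statistics.stdev(acq_vals)
    if std == 0:
        # Assumed fallback: all values equal, so weight every sample the same.
        weights = [1.0] * len(acq_vals)
    else:
        weights = [math.exp(eta * (v - mean) / std) for v in acq_vals]
    remaining = list(range(len(acq_vals)))
    chosen = []
    for _ in range(n):
        # Sample one index, then remove it so it cannot be drawn again.
        pick = rng.choices(remaining, weights=[weights[i] for i in remaining], k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen


acq_vals = [0.1, 0.5, 0.2, 0.9, 0.4, 0.3]
idx = select_initial_indices(acq_vals, n=3, eta=2.0)
```

A larger eta concentrates the draws on the highest-valued samples, while eta = 0 reduces to uniform sampling.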
- botorch.optim.initializers.initialize_q_batch_nonneg(X, acq_vals, n, eta=1.0, alpha=0.0001)[source]
Heuristic for selecting initial conditions for non-neg. acquisition functions.
This function is similar to initialize_q_batch, but designed specifically for acquisition functions that are non-negative and possibly zero over large areas of the feature space (e.g. qEI). All samples for which acq_vals < alpha * max(acq_vals) will be ignored (assuming that acq_vals contains at least one positive value).
- Parameters:
X (Tensor) – A b x q x d tensor of b samples of q-batches from a d-dim. feature space. Typically, these are generated using qMC.
acq_vals (Tensor) – A tensor of b outcomes associated with the samples. Typically, this is the value of the batch acquisition function to be maximized.
n (int) – The number of initial conditions to be generated. Must be less than b.
eta (float) – Temperature parameter for weighting samples.
alpha (float) – The threshold (as a fraction of the maximum observed value) under which to ignore samples. All input samples for which Y < alpha * max(Y) will be ignored.
- Returns:
- An n x q x d tensor of n q-batch initial conditions.
- A tensor of size n of the corresponding acquisition values.
- Return type:
tuple[Tensor, Tensor]
Example
>>> # To get ``n=10`` starting points of q-batch size ``q=3``
>>> # for model with ``d=6``:
>>> qEI = qExpectedImprovement(model, best_f=0.2)
>>> X_rnd = torch.rand(500, 3, 6)
>>> X_init, acq_init = initialize_q_batch_nonneg(
...     X=X_rnd, acq_vals=qEI(X_rnd), n=10
... )
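The alpha threshold can be illustrated in isolation. A minimal sketch on a plain list of values: the helper name filter_nonneg is hypothetical, and keeping every sample when no value is positive is an assumption, since the description above only covers the case with at least one positive value.

```python
def filter_nonneg(acq_vals, alpha=1e-4):
    """Return indices of samples with acq_vals >= alpha * max(acq_vals)."""
    max_val = max(acq_vals)
    if max_val <= 0:
        # Assumed behavior: nothing is positive, so no sample is filtered out.
        return list(range(len(acq_vals)))
    return [i for i, v in enumerate(acq_vals) if v >= alpha * max_val]


# Threshold is 1e-4 * 0.5 = 5e-5, so the zeros and 1e-6 are ignored.
kept = filter_nonneg([0.0, 0.0, 1e-6, 0.02, 0.5], alpha=1e-4)
```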
- botorch.optim.initializers.initialize_q_batch_topn(X, acq_vals, n, largest=True, sorted=True)[source]
Take the top n initial conditions for candidate generation.
- Parameters:
X (Tensor) – A b x q x d tensor of b samples of q-batches from a d-dim. feature space. Typically, these are generated using qMC.
acq_vals (Tensor) – A tensor of b outcomes associated with the samples. Typically, this is the value of the batch acquisition function to be maximized.
n (int) – The number of initial conditions to be generated. Must be less than b.
largest (bool)
sorted (bool)
- Returns:
- An n x q x d tensor of n q-batch initial conditions.
- A tensor of size n of the corresponding acquisition values.
- Return type:
tuple[Tensor, Tensor]
Example
>>> # To get ``n=10`` starting points of q-batch size ``q=3``
>>> # for model with ``d=6``:
>>> qUCB = qUpperConfidenceBound(model, beta=0.1)
>>> X_rnd = torch.rand(500, 3, 6)
>>> X_init, acq_init = initialize_q_batch_topn(
...     X=X_rnd, acq_vals=qUCB(X_rnd), n=10
... )
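Top-n selection is deterministic, unlike the two sampling heuristics above. A minimal sketch on a plain list; the helper top_n_indices is hypothetical, and treating largest like the argument of the same name in torch.topk (True selects the highest values) is an assumption, since the parameter is not described here.

```python
def top_n_indices(acq_vals, n, largest=True):
    """Return indices of the n largest (or smallest) values, in sorted order."""
    order = sorted(range(len(acq_vals)), key=lambda i: acq_vals[i], reverse=largest)
    return order[:n]


best = top_n_indices([0.3, 0.9, 0.1, 0.5], n=2)
worst = top_n_indices([0.3, 0.9, 0.1, 0.5], n=2, largest=False)
```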
- botorch.optim.initializers.sample_points_around_best(acq_function, n_discrete_points, sigma, bounds, best_pct=5.0, subset_sigma=0.1, prob_perturb=None)[source]
Find best points and sample nearby points.
- Parameters:
acq_function (AcquisitionFunction) – The acquisition function.
n_discrete_points (int) – The number of points to sample.
sigma (float) – The standard deviation of the additive Gaussian noise for perturbing the best points.
bounds (Tensor) – A 2 x d-dim tensor containing the bounds.
best_pct (float) – The percentage of best points to perturb.
subset_sigma (float) – The standard deviation of the additive Gaussian noise for perturbing a subset of dimensions of the best points.
prob_perturb (float | None) – The probability of perturbing each dimension.
- Returns:
An optional n_discrete_points x d-dim tensor containing the sampled points. This is None if no baseline points are found.
- Return type:
Tensor | None
- botorch.optim.initializers.is_nonnegative(acq_function)[source]
Determine whether a given acquisition function is non-negative.
- Parameters:
acq_function (AcquisitionFunction) – The AcquisitionFunction instance.
- Returns:
True if acq_function is non-negative, False if not, or if the behavior is unknown (for custom acquisition functions).
- Return type:
bool
Example
>>> qEI = qExpectedImprovement(model, best_f=0.1)
>>> is_nonnegative(qEI)  # returns True
Stopping Criteria
- class botorch.optim.stopping.StoppingCriterion(*args, **kwargs)[source]
Bases: Protocol
Protocol for evaluating optimization convergence.
Stopping criteria are implemented as objects rather than functions, so that they can keep track of past function values between optimization steps.
- class botorch.optim.stopping.ExpMAStoppingCriterion(maxiter=10000, minimize=True, n_window=10, eta=1.0, rel_tol=1e-05)[source]
Bases: object
Exponential moving average stopping criterion.
Computes an exponentially weighted moving average over window length n_window and checks whether the relative decrease in this moving average between steps is less than a provided tolerance level. That is, in iteration i, it computes
v[i,j] := fvals[i - n_window + j] * w[j]
for all j = 0, ..., n_window, where w[j] = exp(-eta * (1 - j / n_window)). Letting ma[i] := sum_j(v[i,j]), the criterion evaluates to True whenever
(ma[i-1] - ma[i]) / abs(ma[i-1]) < rel_tol (if minimize=True)
(ma[i] - ma[i-1]) / abs(ma[i-1]) < rel_tol (if minimize=False)
- Parameters:
maxiter (int) – Maximum number of iterations.
minimize (bool) – If True, assume minimization.
n_window (int) – The size of the exponential moving average window.
eta (float) – The exponential decay factor in the weights.
rel_tol (float) – Relative tolerance for termination.
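The criterion can be sketched as a small stateful callable in plain Python. This follows the formulas above under stated assumptions rather than reproducing the library class: the name ExpMASketch is hypothetical, the window uses indices j = 0, ..., n_window - 1 with unnormalized weights, nothing is reported until two full windows can be compared, and a zero previous average is treated as "not converged" to avoid dividing by zero.

```python
import math
from collections import deque


class ExpMASketch:
    """Illustrative exponential-moving-average stopping rule (not the library class)."""

    def __init__(self, maxiter=10000, minimize=True, n_window=10, eta=1.0, rel_tol=1e-5):
        self.maxiter = maxiter
        self.minimize = minimize
        self.n_window = n_window
        self.rel_tol = rel_tol
        # w[j] = exp(-eta * (1 - j / n_window)); newer values get larger weights.
        self.weights = [math.exp(-eta * (1 - j / n_window)) for j in range(n_window)]
        self.fvals = deque(maxlen=n_window)
        self.prev_ma = None
        self.step = 0

    def __call__(self, fval):
        self.step += 1
        if self.step >= self.maxiter:
            return True
        self.fvals.append(fval)
        if len(self.fvals) < self.n_window:
            return False  # window not yet full
        ma = sum(f * w for f, w in zip(self.fvals, self.weights))
        prev, self.prev_ma = self.prev_ma, ma
        if prev is None or prev == 0:
            return False  # nothing to compare against (assumed guard)
        change = (prev - ma) if self.minimize else (ma - prev)
        return change / abs(prev) < self.rel_tol


criterion = ExpMASketch(n_window=3)
flat_run = [criterion(1.0) for _ in range(4)]
```

Keeping the window and the previous average on the object is what lets the rule compare successive steps, which matches the rationale given for implementing stopping criteria as objects.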
Acquisition Function Optimization with Homotopy
- botorch.optim.optimize_homotopy.prune_candidates(candidates, acq_values, prune_tolerance)[source]
Prune candidates based on their distance to other candidates.
- Parameters:
candidates (Tensor) – An n x d tensor of candidates.
acq_values (Tensor) – A tensor of size n of candidate values.
prune_tolerance (float) – The minimum distance to prune candidates.
- Returns:
An m x d tensor of pruned candidates.
- Return type:
Tensor
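Distance-based pruning can be sketched as a greedy pass over the candidates. The details here are assumptions for illustration: candidates are visited in order of decreasing acquisition value, a candidate is kept only if it lies farther than prune_tolerance (Euclidean distance) from every candidate already kept, and prune_sketch is a hypothetical name operating on tuples rather than tensors.

```python
import math


def prune_sketch(candidates, acq_values, prune_tolerance):
    """Greedily drop candidates that lie within prune_tolerance of a better one."""
    order = sorted(range(len(candidates)), key=lambda i: acq_values[i], reverse=True)
    kept = []
    for i in order:
        if all(math.dist(candidates[i], candidates[j]) > prune_tolerance for j in kept):
            kept.append(i)
    return [candidates[i] for i in kept]


# The first two points are 1e-5 apart, so only the better of the two survives.
candidates = [(0.0, 0.0), (0.0, 0.00001), (1.0, 1.0)]
acq_values = [0.5, 0.7, 0.2]
pruned = prune_sketch(candidates, acq_values, prune_tolerance=1e-4)
```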
- botorch.optim.optimize_homotopy.optimize_acqf_homotopy(acq_function, bounds, q, num_restarts, homotopy, prune_tolerance=0.0001, raw_samples=None, options=None, final_options=None, inequality_constraints=None, equality_constraints=None, nonlinear_inequality_constraints=None, fixed_features=None, discrete_dims=None, cat_dims=None, post_processing_func=None, batch_initial_conditions=None, gen_candidates=None, *, ic_generator=None, timeout_sec=None, retry_on_optimization_warning=True, return_acq_values=True, **ic_gen_kwargs)[source]
Generate a set of candidates via multi-start optimization.
- Parameters:
acq_function (AcquisitionFunction) – An AcquisitionFunction.
bounds (Tensor) – A
2 x dtensor of lower and upper bounds for each column ofX(if inequality_constraints is provided, these bounds can be -inf and +inf, respectively).q (int) – The number of candidates.
homotopy (Homotopy) – Homotopy object that will make the necessary modifications to the problem when calling
step().prune_tolerance (float) – The minimum distance to prune candidates.
num_restarts (int) – The number of starting points for multistart acquisition function optimization.
raw_samples (int | None) – The number of samples for initialization. This is required if
batch_initial_conditionsis not specified.options (dict[str, bool | float | int | str] | None) – Options for candidate generation in the initial step of the homotopy.
final_options (dict[str, bool | float | int | str] | None) – Options for candidate generation in the final step of the homotopy.
inequality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form
\sum_i (X[indices[i]] * coefficients[i]) >= rhs.indicesandcoefficientsshould be torch tensors. See the docstring ofmake_scipy_linear_constraintsfor an example. When q=1, or when applying the same constraint to each candidate in the batch (intra-point constraint),indicesshould be a 1-d tensor. For inter-point constraints, in which the constraint is applied to the whole batch of candidates,indicesmust be a 2-d tensor, where in each rowindices[i] =(k_i, l_i)the first indexk_icorresponds to thek_i-th element of theq-batch and the second indexl_icorresponds to thel_i-th feature of that element.equality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form
\sum_i (X[indices[i]] * coefficients[i]) = rhs. See the docstring ofmake_scipy_linear_constraintsfor an example.nonlinear_inequality_constraints (list[tuple[Callable, bool]] | None) – A list of tuples representing the nonlinear inequality constraints. The first element in the tuple is a callable representing a constraint of the form
callable(x) >= 0. In case of an intra-point constraint,callable()``takes in an one-dimensional tensor of shape ``dand returns a scalar. In case of an inter-point constraint,callable()takes a two dimensional tensor of shapeq x dand again returns a scalar. The second element is a boolean, indicating if it is an intra-point or inter-point constraint (Truefor intra-point.Falsefor inter-point). For more information on intra-point vs inter-point constraints, see the docstring of theinequality_constraintsargument tooptimize_acqf(). The constraints will later be passed to the scipy solver. You need to pass inbatch_initial_conditionsin this case. Using non-linear inequality constraints also requires thatbatch_limitis set to 1, which will be done automatically if not specified inoptions.fixed_features (dict[int, float] | None) – A map
{feature_index: value} for features that should be fixed to a particular value during generation. Used with optimize_acqf or optimize_acqf_mixed_alternating.
discrete_dims (Mapping[int, Sequence[float]] | None) – A dictionary mapping indices of discrete and binary dimensions to a list of allowed values for that dimension. If provided along with cat_dims, the optimizer is chosen based on the number of discrete combinations: optimize_acqf_mixed_alternating is used if there are more than 10 combinations, otherwise optimize_acqf_mixed is used.
cat_dims (Mapping[int, Sequence[float]] | None) – A dictionary mapping indices of categorical dimensions to a list of allowed values for that dimension. If provided along with discrete_dims, the optimizer is chosen based on the number of discrete combinations (see discrete_dims).
post_processing_func (Callable[[Tensor], Tensor] | None) – A function that post-processes an optimization result appropriately (i.e., according to round-trip transformations).
batch_initial_conditions (Tensor | None) – A tensor to specify the initial conditions. Set this if you do not want to use the default initialization strategy.
gen_candidates (Callable[[Tensor, AcquisitionFunction, Any], tuple[Tensor, Tensor]] | None) – A callable for generating candidates (and their associated acquisition values) given a tensor of initial conditions and an acquisition function. Other common inputs include lower and upper bounds and a dictionary of options, but refer to the documentation of specific generation functions (e.g., gen_candidates_scipy and gen_candidates_torch) for method-specific inputs. Default: gen_candidates_scipy
ic_generator (Callable[[qKnowledgeGradient, Tensor, int, int, int, dict[int, float] | None, dict[str, bool | float | int] | None, list[tuple[Tensor, Tensor, float]] | None, list[tuple[Tensor, Tensor, float]] | None], Tensor | None] | None) – Function for generating initial conditions. Not needed when batch_initial_conditions are provided. Defaults to gen_one_shot_kg_initial_conditions for qKnowledgeGradient acquisition functions and gen_batch_initial_conditions otherwise. Must be specified for nonlinear inequality constraints.
timeout_sec (float | None) – Max amount of time optimization can run for.
retry_on_optimization_warning (bool) – Whether to retry candidate generation with a new set of initial conditions when it fails with an OptimizationWarning.
return_acq_values (bool) – Return acquisition values.
ic_gen_kwargs (Any) – Additional keyword arguments passed to function specified by ic_generator
- Return type:
tuple[Tensor, Tensor | None]
Acquisition Function Optimization with Mixed Integer Variables
- botorch.optim.optimize_mixed.should_use_mixed_alternating_optimizer(discrete_dims=None, cat_dims=None)[source]
Determine whether to use optimize_acqf_mixed_alternating for a mixed (not fully discrete) search space based on the number of discrete combinations.
For mixed search spaces, if there are more than ALTERNATING_OPTIMIZER_THRESHOLD combinations of discrete choices, we use optimize_acqf_mixed_alternating, which alternates between continuous and discrete optimization steps. Otherwise, we use optimize_acqf_mixed, which enumerates all discrete combinations and optimizes the continuous features with discrete features being fixed.
- Parameters:
discrete_dims (Mapping[int, Sequence[float]] | None) – A dictionary mapping indices of discrete (ordinal) dimensions to their respective sets of values provided as a sequence.
cat_dims (Mapping[int, Sequence[float]] | None) – A dictionary mapping indices of categorical dimensions to their respective sets of values provided as a sequence.
- Returns:
True if optimize_acqf_mixed_alternating should be used, False if optimize_acqf_mixed should be used instead.
- Return type:
bool
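The decision rule described here can be sketched in plain Python. This is only an illustration of the counting logic, not the library's implementation: the helper name use_alternating_sketch is hypothetical, and the threshold of 10 is taken from the description of discrete_dims for optimize_acqf above, so the actual constant may differ between versions.

```python
from math import prod

# Assumption: a threshold of 10 combinations, as quoted in the optimize_acqf
# parameter description; the real ALTERNATING_OPTIMIZER_THRESHOLD may differ.
ASSUMED_THRESHOLD = 10


def use_alternating_sketch(discrete_dims=None, cat_dims=None,
                           threshold=ASSUMED_THRESHOLD):
    """Hypothetical helper: True if the number of discrete combinations
    (product of the number of allowed values per dimension) exceeds threshold."""
    all_dims = {**(discrete_dims or {}), **(cat_dims or {})}
    num_combinations = prod(len(values) for values in all_dims.values())
    return num_combinations > threshold


# 2 * 3 = 6 combinations: enumeration is still cheap.
few = use_alternating_sketch({0: [0, 1], 1: [0, 1, 2]})
# 4 * 3 = 12 combinations: above the assumed threshold.
many = use_alternating_sketch({0: [0, 1, 2, 3]}, {2: [0, 1, 2]})
print(few, many)
```

With these inputs the first call reports that enumeration is sufficient and the second that the alternating optimizer would be preferred.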
- botorch.optim.optimize_mixed.get_nearest_neighbors(current_x, bounds, discrete_dims)[source]
Generate all 1-Manhattan distance neighbors of a given input. The neighbors are generated for the discrete dimensions only.
NOTE: This assumes that current_x is detached and uses in-place operations, which are known to be incompatible with autograd.
- Parameters:
current_x (Tensor) – The design to find the neighbors of. A tensor of shape d.
bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of X.
discrete_dims (dict[int, list[float]]) – A dictionary mapping indices of discrete dimensions to a list of allowed values for that dimension.
- Returns:
A tensor of shape num_neighbors x d, denoting all unique 1-Manhattan distance neighbors.
- Return type:
Tensor
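As a rough illustration of what "1-Manhattan distance neighbors" means, the following plain-Python sketch moves one discrete dimension at a time to an adjacent allowed value. The function name is hypothetical, lists stand in for tensors, the bounds argument is ignored, and treating a unit step as "the next or previous allowed value" is an assumption about the behavior rather than the library's code.

```python
def nearest_neighbors_sketch(current_x, discrete_dims):
    """Hypothetical helper: return all points that differ from current_x
    in exactly one discrete dimension by one step in the allowed values."""
    neighbors = []
    for dim, values in sorted(discrete_dims.items()):
        ordered = sorted(values)
        pos = ordered.index(current_x[dim])  # current value must be allowed
        for step in (-1, 1):
            new_pos = pos + step
            if 0 <= new_pos < len(ordered):  # stay inside the allowed set
                neighbor = list(current_x)
                neighbor[dim] = ordered[new_pos]
                neighbors.append(neighbor)
    return neighbors


x = [0.5, 2.0, 1.0]  # dimension 0 is continuous and is left untouched
neighbors = nearest_neighbors_sketch(x, {1: [1.0, 2.0, 3.0], 2: [0.0, 1.0]})
print(neighbors)
```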
- botorch.optim.optimize_mixed.get_categorical_neighbors(current_x, cat_dims, max_num_cat_values=20)[source]
Generate all 1-Hamming distance neighbors of a given input. The neighbors are generated for the categorical dimensions only.
We assume that all categorical values are equidistant. If the number of values is greater than max_num_cat_values, we sample uniformly from the possible values for that dimension.
NOTE: This assumes that current_x is detached and uses in-place operations, which are known to be incompatible with autograd.
- Parameters:
current_x (Tensor) – The design to find the neighbors of. A tensor of shape d.
cat_dims (dict[int, list[float]]) – A dictionary mapping indices of categorical dimensions to a list of allowed values for that dimension.
max_num_cat_values (int) – Maximum number of values for a categorical parameter, beyond which values are uniformly sampled.
- Returns:
A tensor of shape num_neighbors x d, denoting up to max_num_cat_values unique 1-Hamming distance neighbors for each categorical dimension.
- Return type:
Tensor
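A minimal sketch of the 1-Hamming neighborhood, again with lists in place of tensors and a hypothetical function name. The subsampling rule (sample max_num_cat_values alternatives when a dimension has more values than that) follows the description above; the seed argument is an addition for reproducibility of the illustration.

```python
import random


def categorical_neighbors_sketch(current_x, cat_dims, max_num_cat_values=20,
                                 seed=0):
    """Hypothetical helper: change exactly one categorical dimension to
    another allowed value, subsampling when there are too many values."""
    rng = random.Random(seed)
    neighbors = []
    for dim, values in sorted(cat_dims.items()):
        candidates = [v for v in values if v != current_x[dim]]
        if len(values) > max_num_cat_values:
            # Too many categories: keep a uniform random subset.
            candidates = rng.sample(candidates, max_num_cat_values)
        for value in candidates:
            neighbor = list(current_x)
            neighbor[dim] = value
            neighbors.append(neighbor)
    return neighbors


small = categorical_neighbors_sketch([0.3, 1.0], {1: [0.0, 1.0, 2.0]})
large = categorical_neighbors_sketch(
    [0.3, 1.0], {1: [float(v) for v in range(30)]}, max_num_cat_values=5
)
print(small, len(large))
```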
- botorch.optim.optimize_mixed.get_spray_points(X_baseline, cont_dims, discrete_dims, cat_dims, bounds, num_spray_points, std_cont_perturbation=0.1)[source]
Generate spray points by perturbing the Pareto optimal points.
Given the points on the Pareto frontier, we create perturbations (spray points) by adding Gaussian perturbation to the continuous parameters and 1-Manhattan distance neighbors of the discrete (binary and integer) parameters.
- Parameters:
X_baseline (Tensor) – Tensor of best acquired points across BO run.
cont_dims (Tensor) – Indices of continuous parameters/input dimensions.
discrete_dims (dict[int, list[float]]) – A dictionary mapping indices of discrete dimensions to a list of allowed values for that dimension.
cat_dims (dict[int, list[float]]) – A dictionary mapping indices of categorical dimensions to a list of allowed values for that dimension.
bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of X.
num_spray_points (int) – Number of spray points to return.
std_cont_perturbation (float) – Standard deviation of the normal perturbations of the continuous dimensions. Default is STD_CONT_PERTURBATION = 0.1.
- Returns:
A (num_spray_points x d)-dim tensor of perturbed points.
- Return type:
Tensor
- botorch.optim.optimize_mixed.sample_feasible_points(opt_inputs, discrete_dims, cat_dims, num_points)[source]
Sample feasible points from the optimization domain.
Feasibility is determined according to the discrete dimensions taking integer values and the inequality constraints being satisfied.
If there are no inequality constraints, Sobol is used to generate the base points. Otherwise, we use the polytope sampler to generate the base points. The base points are then rounded to the nearest integer values for the discrete dimensions, and the infeasible points are filtered out (in case rounding leads to infeasibility).
This method makes up to 10 attempts to generate num_points feasible points and returns the points generated so far. If no points are generated, it raises an error.
- Parameters:
opt_inputs (OptimizeAcqfInputs) – Common set of arguments for acquisition optimization.
discrete_dims (dict[int, list[float]]) – A dictionary mapping indices of discrete dimensions to a list of allowed values for that dimension.
cat_dims (dict[int, list[float]]) – A dictionary mapping indices of categorical dimensions to a list of allowed values for that dimension.
num_points (int) – The number of points to sample.
- Returns:
A tensor of shape num_points x d containing the sampled points.
- Return type:
Tensor
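The sample, round, filter, and retry loop described above can be sketched as follows. This is a simplified stand-in under stated assumptions: plain uniform sampling replaces the Sobol and polytope samplers, feasibility is supplied as a hypothetical is_feasible callable, and the function name and signature are illustrative only.

```python
import random


def sample_feasible_sketch(bounds, discrete_dims, is_feasible, num_points,
                           max_attempts=10, seed=0):
    """Hypothetical helper: draw base points, round discrete dimensions to
    the nearest allowed value, and keep only the feasible points."""
    rng = random.Random(seed)
    lower, upper = bounds
    feasible = []
    for _ in range(max_attempts):
        for _ in range(num_points):
            point = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
            for dim, allowed in discrete_dims.items():
                point[dim] = min(allowed, key=lambda v: abs(v - point[dim]))
            if is_feasible(point):  # rounding may have broken feasibility
                feasible.append(point)
        if len(feasible) >= num_points:
            break
    if not feasible:
        raise RuntimeError("No feasible points could be generated.")
    return feasible[:num_points]


values = [0.0, 1.0, 2.0, 3.0]
points = sample_feasible_sketch(
    bounds=([0.0, 0.0], [1.0, 3.0]),
    discrete_dims={1: values},
    is_feasible=lambda p: p[0] + p[1] >= 1.0,  # a linear inequality constraint
    num_points=5,
)
print(len(points))
```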
- botorch.optim.optimize_mixed.round_discrete_dims(X, discrete_dims)[source]
Round the discrete dimensions of a tensor to the nearest allowed values.
- Parameters:
X (Tensor) – A tensor of shape n x d, where d is the number of dimensions.
discrete_dims (dict[int, list[float]]) – A dictionary mapping indices of discrete dimensions to a list of allowed values for that dimension.
- Returns:
A tensor of the same shape as X, with discrete dimensions rounded to the nearest allowed values.
- Return type:
Tensor
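To make "nearest allowed value" concrete, here is a small plain-Python sketch operating on nested lists instead of tensors. The function name is hypothetical, and ties are broken in favor of the value listed first, which is a choice of this sketch.

```python
def round_discrete_dims_sketch(X, discrete_dims):
    """Hypothetical helper: snap each discrete column of X to the allowed
    value with the smallest absolute distance."""
    rounded = [list(row) for row in X]  # copy, leave the input untouched
    for row in rounded:
        for dim, allowed in discrete_dims.items():
            row[dim] = min(allowed, key=lambda v: abs(v - row[dim]))
    return rounded


X = [[0.2, 1.4], [0.9, 2.6]]
X_rounded = round_discrete_dims_sketch(X, {1: [0.0, 2.0, 3.0]})
print(X_rounded)
```

Note that the allowed values need not be integers or evenly spaced: 1.4 is mapped to 2.0 here because 1.0 is not in the allowed set.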
- botorch.optim.optimize_mixed.generate_starting_points(opt_inputs, discrete_dims, cat_dims, cont_dims)[source]
Generate initial starting points for the alternating optimization.
This method attempts to generate the initial points using the specified options and completes any missing points using sample_feasible_points.
- Parameters:
opt_inputs (OptimizeAcqfInputs) – Common set of arguments for acquisition optimization. This function utilizes acq_function, bounds, num_restarts, raw_samples, options, fixed_features and constraints from opt_inputs.
discrete_dims (dict[int, list[float]]) – A dictionary mapping indices of discrete dimensions to a list of allowed values for that dimension.
cat_dims (dict[int, list[float]]) – A dictionary mapping indices of categorical dimensions to a list of allowed values for that dimension.
cont_dims (Tensor) – A tensor of indices corresponding to continuous parameters.
- Returns:
A (num_restarts x d)-dim tensor of starting points and a (num_restarts)-dim tensor of their respective acquisition values. In rare cases, this method may return fewer than num_restarts points.
- Return type:
A tuple of two tensors
- botorch.optim.optimize_mixed.discrete_step(opt_inputs, discrete_dims, cat_dims, current_x)[source]
Discrete nearest neighbour search.
- Parameters:
opt_inputs (OptimizeAcqfInputs) – Common set of arguments for acquisition optimization. This function utilizes acq_function, bounds, options and constraints from opt_inputs.
discrete_dims (dict[int, list[float]]) – A dictionary mapping indices of discrete dimensions to a list of allowed values for that dimension.
cat_dims (dict[int, list[float]]) – A dictionary mapping indices of categorical dimensions to a list of allowed values for that dimension.
current_x (Tensor) – Batch of starting points. A tensor of shape b x d.
- Returns:
A (b, d)-dim tensor of optimized points and a scalar tensor of the corresponding acquisition value.
- Return type:
A tuple of two tensors
- botorch.optim.optimize_mixed.continuous_step(opt_inputs, discrete_dims, cat_dims, current_x)[source]
Continuous search using L-BFGS-B through optimize_acqf.
- Parameters:
opt_inputs (OptimizeAcqfInputs) – Common set of arguments for acquisition optimization. This function utilizes acq_function, bounds, options, fixed_features and constraints from opt_inputs. opt_inputs.return_best_only should be False.
discrete_dims (Tensor) – A tensor of indices corresponding to discrete dimensions.
cat_dims (Tensor) – A tensor of indices corresponding to categorical parameters.
current_x (Tensor) – Starting point. A tensor of shape b x d.
- Returns:
A (b x d)-dim tensor of optimized points and a (b)-dim tensor of acquisition values.
- Return type:
A tuple of two tensors
- botorch.optim.optimize_mixed.optimize_acqf_mixed_alternating(acq_function, bounds, discrete_dims=None, cat_dims=None, options=None, q=1, raw_samples=1024, num_restarts=20, post_processing_func=None, sequential=True, fixed_features=None, inequality_constraints=None, equality_constraints=None, return_acq_values=True)[source]
Optimizes an acquisition function over mixed integer, categorical, and continuous input spaces. Multiple starting points for random restarts are picked by evaluating a large set of initial candidates. From each starting point, discrete/categorical local search and continuous optimization (via L-BFGS-B) are alternated for a fixed number of iterations.
The discrete dimensions that have more than options.get("max_discrete_values", MAX_DISCRETE_VALUES) values will be optimized using continuous relaxation. The categorical dimensions that have more than MAX_DISCRETE_VALUES values will be optimized by selecting random subsamples of the possible values.
- Parameters:
acq_function (AcquisitionFunction) – BoTorch Acquisition function.
bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of X.
discrete_dims (Mapping[int, Sequence[float]] | None) – A dictionary mapping indices of discrete and binary dimensions to a list of allowed values for that dimension.
cat_dims (Mapping[int, Sequence[float]] | None) – A dictionary mapping indices of categorical dimensions to a list of allowed values for that dimension.
options (dict[str, Any] | None) – Dictionary specifying optimization options. Supports the following:
"initialization_strategy" (-) – Strategy used to generate the initial candidates. “random”, “continuous_relaxation” or “equally_spaced” (linspace style).
"tol" (-) – The algorithm terminates if the absolute improvement in acquisition value of one iteration is smaller than this number.
"maxiter_alternating" (-) – Number of alternating steps. Defaults to 64.
"maxiter_discrete" (-) – Maximum number of iterations in each discrete step. Defaults to 4.
"maxiter_continuous" (-) – Maximum number of iterations in each continuous step. Defaults to 8.
"max_discrete_values" (-) – Maximum number of values for a discrete dimension to be optimized using discrete step / local search. The discrete dimensions with more values will be optimized using continuous relaxation.
"num_spray_points" (-) – Number of spray points (around
X_baseline) to add to the points generated by the initialization strategy. Defaults to 20 if all discrete variables are binary and to 0 otherwise.
"std_cont_perturbation" (-) – Standard deviation of the normal perturbations of the continuous variables used to generate the spray points. Defaults to 0.1.
"batch_limit" (-) – The maximum batch size for jointly evaluating candidates during optimization.
"init_batch_limit" (-) – The maximum batch size for jointly evaluating candidates during initialization. During initialization, candidates are evaluated in a no_grad context, which reduces memory usage. As a result, init_batch_limit can be set to a larger value than batch_limit. Defaults to batch_limit, if given.
q (int) – Number of candidates.
raw_samples (int) – Number of initial candidates used to select starting points from. Defaults to 1024.
num_restarts (int) – Number of random restarts. Defaults to 20.
post_processing_func (Callable[[Tensor], Tensor] | None) – A function that post-processes an optimization result appropriately (i.e., according to round-trip transformations).
sequential (bool) – Whether to use joint or sequential optimization across the q-batch. Currently, only sequential optimization is supported.
fixed_features (dict[int, float] | None) – A map {feature_index: value} for features that should be fixed to a particular value during generation.
inequality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form \sum_i (X[indices[i]] * coefficients[i]) >= rhs. indices and coefficients should be torch tensors. See the docstring of make_scipy_linear_constraints for an example.
equality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form \sum_i (X[indices[i]] * coefficients[i]) == rhs. indices and coefficients should be torch tensors. Example: [(torch.tensor([1, 3]), torch.tensor([1.0, 0.5]), -0.1)]. Equality constraints can only be used with continuous degrees of freedom.
return_acq_values (bool) – Return acquisition values.
- Returns:
A (q x d)-dim tensor of optimized points and a (q)-dim tensor of their respective acquisition values. Returns None for acquisition values if return_acq_values=False.
- Return type:
A tuple of two tensors
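The alternating scheme can be illustrated on a toy problem with one continuous variable x and one integer variable k. Everything in this sketch is an assumption made for illustration: the toy objective acq, the helper names, and the use of projected gradient ascent in place of L-BFGS-B. Only the option names maxiter_alternating, maxiter_discrete, maxiter_continuous, and tol, with the defaults quoted above, come from the documentation.

```python
def acq(x, k):
    """Toy stand-in for an acquisition function; maximized at x=1.5, k=3."""
    return -((x - 0.5 * k) ** 2) - (k - 3) ** 2


def discrete_step(x, k, allowed, maxiter_discrete=4):
    """Local search: move k to the better adjacent allowed value, if any."""
    for _ in range(maxiter_discrete):
        neighbors = [v for v in (k - 1, k + 1) if v in allowed]
        best = max(neighbors, key=lambda v: acq(x, v))
        if acq(x, best) <= acq(x, k):
            break  # no neighbor improves on the current value
        k = best
    return k


def continuous_step(x, k, lower, upper, maxiter_continuous=8, lr=0.25):
    """Projected gradient ascent in x with k fixed (stand-in for L-BFGS-B)."""
    for _ in range(maxiter_continuous):
        grad = -2.0 * (x - 0.5 * k)
        x = min(max(x + lr * grad, lower), upper)  # clamp to the bounds
    return x


def alternate(x, k, allowed, lower, upper, maxiter_alternating=64, tol=1e-8):
    """Alternate both steps until the improvement per round drops below tol."""
    best = acq(x, k)
    for _ in range(maxiter_alternating):
        k = discrete_step(x, k, allowed)
        x = continuous_step(x, k, lower, upper)
        value = acq(x, k)
        if value - best < tol:
            break
        best = value
    return x, k, acq(x, k)


x_opt, k_opt, value = alternate(x=0.0, k=0, allowed=range(6),
                                lower=0.0, upper=5.0)
print(x_opt, k_opt, value)
```

Starting from x=0, k=0, the first round stops at k=2 because moving to k=3 does not pay off until x has been re-optimized; the second round then reaches k=3. This interplay is the reason for alternating the two steps rather than running each once.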
- botorch.optim.optimize_mixed.complement_indices_like(indices, d)[source]
Computes a tensor of complement indices: {range(d) \ indices}. Same as complement_indices but returns an integer tensor like indices.
- Parameters:
indices (Tensor)
d (int)
- Return type:
Tensor
- botorch.optim.optimize_mixed.complement_indices(indices, d)[source]
Computes a list of complement indices: {range(d) \ indices}.
- Parameters:
indices (list[int]) – a list of integers.
d (int) – an integer dimension in which to compute the complement.
- Returns:
A list of integer indices.
- Return type:
list[int]
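The set complement {range(d) \ indices} amounts to the following one-liner; the function name here is hypothetical and the sketch is not taken from the library source.

```python
def complement_indices_sketch(indices, d):
    """Hypothetical helper: all indices in range(d) that are not in indices,
    in increasing order."""
    excluded = set(indices)
    return [i for i in range(d) if i not in excluded]


# With d = 5 features and features 1 and 3 discrete, the rest are continuous.
cont_dims = complement_indices_sketch([1, 3], 5)
print(cont_dims)
```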
Closures
Core
Core methods for building closures in torch and interfacing with numpy.
- class botorch.optim.closures.core.ForwardBackwardClosure(forward, parameters)[source]
Bases: object
Wrapper for fused forward and backward closures.
Initializes a ForwardBackwardClosure instance.
- Parameters:
forward (Callable[[], Tensor]) – Callable that returns a tensor.
parameters (dict[str, Tensor]) – A dictionary of tensors whose grad fields are to be returned.
- class botorch.optim.closures.core.NdarrayOptimizationClosure(closure, parameters)[source]
Bases: object
Adds stateful behavior and a numpy.ndarray-typed API to a closure with an expected return type Tuple[Tensor, Union[Tensor, Sequence[Optional[Tensor]]]].
NaN values will be replaced with 0.0 in the returned ndarray.
Initializes a NdarrayOptimizationClosure instance.
- Parameters:
closure (Callable[[], tuple[Tensor, Sequence[Tensor | None]]]) – A ForwardBackwardClosure instance.
parameters (dict[str, Tensor]) – A dictionary of tensors representing the closure’s state. Expected to correspond with the first len(parameters) optional gradient tensors returned by closure.
- property state: ndarray[tuple[Any, ...], dtype[_ScalarT]]
- class botorch.optim.closures.core.BatchedNDarrayOptimizationClosure(forward, parameters, batch_shape)[source]
Bases: object
Wraps a forward closure and batched parameters for use with fmin_l_bfgs_b_batched.
Unlike NdarrayOptimizationClosure, which flattens all parameters into a single 1D vector, this class manages parameters as a 2D array of shape (batch_size, per_element_size), where each row corresponds to one batch element’s independent parameter vector.
This enables independent optimization of each batch element (e.g., each output of a BatchedMultiOutputGPyTorchModel) using batched L-BFGS-B.
Initializes a BatchedNDarrayOptimizationClosure instance.
- Parameters:
forward (Callable[[], Tensor]) – Callable that returns a tensor of shape batch_shape (per-batch-element loss values, e.g., negated per-output MLL).
parameters (dict[str, Tensor]) – A dictionary of parameter tensors, each with shape (*batch_shape, *trailing_shape).
batch_shape (torch.Size) – The batch shape shared by all parameters (typically model._aug_batch_shape).
- property state: ndarray[tuple[Any, ...], dtype[_ScalarT]]
Returns the current parameter state as a 2D ndarray of shape (batch_size, per_element_size).
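To see how the (batch_size, per_element_size) layout relates to the individual parameter shapes, the following sketch computes the state shape from shapes alone. It assumes that each row concatenates the flattened trailing dimensions of every parameter; the function name and the example parameter names are hypothetical.

```python
from math import prod


def batched_state_shape_sketch(param_shapes, batch_shape):
    """Hypothetical helper: shape of a 2D state in which every row holds the
    flattened trailing dimensions of all parameters for one batch element."""
    n_batch = len(batch_shape)
    for name, shape in param_shapes.items():
        if tuple(shape[:n_batch]) != tuple(batch_shape):
            raise ValueError(f"{name} does not start with {batch_shape}")
    batch_size = prod(batch_shape)
    # prod(()) == 1, so a parameter without trailing dims contributes one entry.
    per_element_size = sum(prod(shape[n_batch:])
                           for shape in param_shapes.values())
    return batch_size, per_element_size


shapes = {"lengthscale": (3, 1, 2), "noise": (3, 1), "mean": (3,)}
state_shape = batched_state_shape_sketch(shapes, batch_shape=(3,))
print(state_shape)
```

A flat closure in the style of NdarrayOptimizationClosure would instead hold all 12 values in a single vector.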
Model Fitting Closures
Utilities for building model-based closures.
- botorch.optim.closures.model_closures.get_loss_closure(mll, data_loader=None)[source]
Factory function for creating loss closures from MarginalLogLikelihoods.
This method acts as a clearing house for factory functions that define how mll is evaluated.
Users may specify custom evaluation routines by passing an mll or an mll.model with a method compute_custom_loss.
- Parameters:
mll (MarginalLogLikelihood) – A MarginalLogLikelihood instance whose negative defines the loss.
data_loader (DataLoader | None) – An optional DataLoader instance for cases where training data is passed in rather than obtained from mll.model.
- Returns:
A closure that takes zero positional arguments and returns the negated value of mll.
- Return type:
Callable[[], Tensor]
- botorch.optim.closures.model_closures.get_loss_closure_with_grads(mll, parameters, data_loader=None)[source]
Add a backward pass to a loss closure obtained by calling get_loss_closure, wrapping it in a ForwardBackwardClosure.
For further details, see get_loss_closure.
- Parameters:
mll (MarginalLogLikelihood) – A MarginalLogLikelihood instance whose negative defines the loss.
parameters (dict[str, Tensor]) – A dictionary of tensors whose grad fields are to be returned.
data_loader (DataLoader | None) – An optional DataLoader instance for cases where training data is passed in rather than obtained from mll.model.
- Returns:
A closure that takes zero positional arguments and returns the reduced and negated value of mll along with the gradients of parameters.
- Return type:
Batched L-BFGS-B Scipy Port
This is a port of the L-BFGS-B implementation from SciPy, modified so that it supports batched evaluations. That is, the objective function’s output value (and its gradient) can be evaluated at a batch of points at once. This yields optimization speedups for acquisition function optimization, where multiple independent problems with the same structure are optimized in parallel.
This file is written such that it explicitly supports all scipy versions from 1.13 to 1.15 (likely 1.16, too, based on its pre-release version). This file might break for higher versions, as it uses internal APIs. Version 1.15 includes a major revision of the core optimization code, which was ported from FORTRAN to C; we handle the resulting API changes and are compatible with both implementations.
- botorch.optim.batched_lbfgs_b.fmin_l_bfgs_b_batched(func, x0, bounds=None, maxcor=10, factr=10000000.0, ftol=None, pgtol=1e-05, tol=None, maxiter=15000, disp=None, callback=None, maxls=20, pass_batch_indices=False)[source]
Minimize multiple inputs to a batched function func using the L-BFGS-B algorithm. We minimize multiple inputs to the function at once (by providing a 2d array of shape [b, n]). We assume that the function func is batched, i.e., it will return a 1d array of shape [b,] of independent function values when passed a 2d array of shape [b, n].
- Parameters:
func (callable f(x,*args)) – Function to minimize.
x0 (ndarray) – Initial guess of shape [b, n].
bounds (list, optional) – (min, max) pairs for each element in x, defining the bounds on that parameter. Use None or +-inf for one of min or max when there is no bound in that direction.
maxcor (int, optional) – The maximum number of variable metric corrections used to define the limited memory matrix. (The limited memory BFGS method does not store the full hessian but uses this many terms in an approximation to it.)
factr (float, optional) – The iteration stops when (f^k - f^{k+1})/max{|f^k|,|f^{k+1}|,1} <= factr * eps, where eps is the machine precision, which is automatically generated by the code. Typical values for factr are: 1e12 for low accuracy; 1e7 for moderate accuracy; 10.0 for extremely high accuracy. See Notes for relationship to ftol, which is exposed (instead of factr) by the scipy.optimize.minimize interface to L-BFGS-B.
ftol (float, optional) – Set ftol directly, meaning the iteration stops when (f^k - f^{k+1})/max{|f^k|,|f^{k+1}|,1} <= ftol.
pgtol (float, optional) – The iteration will stop when max{|proj g_i | i = 1, ..., n} <= pgtol, where proj g_i is the i-th component of the projected gradient.
tol (float, optional) – An alias for pgtol, for compatibility with scipy.optimize.minimize.
maxiter (int, optional) – Maximum number of iterations.
disp (int, optional) – This is deprecated and only here for backwards compatibility.
callback (callable, optional) – Called after each iteration for each batch item, as callback(xk), where xk is the current parameter vector.
maxls (int, optional) – Maximum number of line search steps (per iteration). Default is 20.
pass_batch_indices (bool) – If True, func is called with an additional keyword argument batch_indices, which is a list that is as long as the current batch is wide, and indexes into the original batch specified via x0.
- Returns:
x (array_like) – Estimated position of the minimum.
f (float) – Value of func at the minimum.
d (dict) – Information dictionary.
d['warnflag'] is
0 if converged,
1 if too many function evaluations or too many iterations,
2 if stopped for another reason, given in d['task']
d['grad'] is the gradient at the minimum (should be 0 ish)
d['funcalls'] is the number of function calls made.
d['nit'] is the number of iterations.
See also
minimize
Interface to minimization algorithms for multivariate functions. See the ‘L-BFGS-B’ method in particular. Note that the ftol option is made available via that interface, while factr is provided via this interface, where factr is the factor multiplying the default machine floating-point precision to arrive at ftol: ftol = factr * numpy.finfo(float).eps.
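The relationship ftol = factr * eps and the stopping test from the factr description can be checked with a few lines of standard-library Python. The helper names are hypothetical; sys.float_info.epsilon is used here because it has the same value as numpy.finfo(float).eps for double precision.

```python
import sys

EPS = sys.float_info.epsilon  # double-precision machine epsilon


def factr_to_ftol(factr):
    """Convert a factr value into the equivalent ftol."""
    return factr * EPS


def converged(f_prev, f_next, ftol):
    """Relative-decrease test: (f^k - f^{k+1}) / max{|f^k|, |f^{k+1}|, 1}."""
    return (f_prev - f_next) / max(abs(f_prev), abs(f_next), 1.0) <= ftol


for factr in (1e12, 1e7, 10.0):
    print(f"factr={factr:g} -> ftol={factr_to_ftol(factr):.3e}")

ftol = factr_to_ftol(1e7)
print(converged(1.0, 1.0 - 1e-12, ftol), converged(1.0, 0.5, ftol))
```

With the default factr of 1e7, a decrease of 1e-12 counts as converged, while a decrease of 0.5 does not.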
Notes
License of L-BFGS-B (FORTRAN code):
The version included here (in fortran code) is 3.0 (released April 25, 2011). It was written by Ciyou Zhu, Richard Byrd, and Jorge Nocedal <nocedal@ece.nwu.edu>. It carries the following condition for use:
This software is freely available, but we expect that all publications describing work using this software, or all commercial products using it, quote at least one of the references given below. This software is released under the BSD License.
SciPy uses a C-translated and modified version of the Fortran code, L-BFGS-B v3.0 (released April 25, 2011, BSD-3 licensed). The original Fortran version was written by Ciyou Zhu, Richard Byrd, Jorge Nocedal, and Jose Luis Morales.
References
R. H. Byrd, P. Lu and J. Nocedal. A Limited Memory Algorithm for Bound Constrained Optimization, (1995), SIAM Journal on Scientific and Statistical Computing, 16, 5, pp. 1190-1208.
C. Zhu, R. H. Byrd and J. Nocedal. L-BFGS-B: Algorithm 778: L-BFGS-B, FORTRAN routines for large scale bound constrained optimization (1997), ACM Transactions on Mathematical Software, 23, 4, pp. 550 - 560.
J.L. Morales and J. Nocedal. L-BFGS-B: Remark on Algorithm 778: L-BFGS-B, FORTRAN routines for large scale bound constrained optimization (2011), ACM Transactions on Mathematical Software, 38, 1.
Examples
Solve a linear regression problem via fmin_l_bfgs_b. To do this, first we define an objective function f(m, b) = (y - y_model)**2, where y describes the observations and y_model the prediction of the linear model as y_model = m*x + b. The bounds for the parameters, m and b, are arbitrarily chosen as (0,5) and (5,10) for this example.
>>> import numpy as np
>>> from scipy.optimize import fmin_l_bfgs_b
>>> X = np.arange(0, 10, 1)
>>> M = 2
>>> B = 3
>>> Y = M * X + B
>>> def func(parameters, *args):
...     x = args[0]
...     y = args[1]
...     m, b = parameters
...     y_model = m*x + b
...     error = sum(np.power((y - y_model), 2))
...     return error
>>> initial_values = np.array([0.0, 1.0])
>>> x_opt, f_opt, info = fmin_l_bfgs_b(func, x0=initial_values, args=(X, Y),
...                                    approx_grad=True)
>>> x_opt, f_opt
array([1.99999999, 3.00000006]), 1.7746231151323805e-14  # may vary
The optimized parameters in x_opt agree with the ground truth parameters m and b. Next, let us perform a bound-constrained optimization using the bounds parameter.
>>> bounds = [(0, 5), (5, 10)]
>>> x_opt, f_opt, info = fmin_l_bfgs_b(func, x0=initial_values, args=(X, Y),
...                                    approx_grad=True, bounds=bounds)
>>> x_opt, f_opt
array([1.65990508, 5.31649385]), 15.721334516453945  # may vary
Utilities
General Optimization Utilities
General-purpose optimization utilities.
- botorch.optim.utils.common.check_scipy_version_at_least(minor, major=1)[source]
Check if SciPy version is at least major.minor.0.
- Parameters:
major – The major version to at least fulfill, always 1 for our purposes.
minor – The minor version as an int.
- Returns:
True if the SciPy version is major.minor.0 or later.
- Return type:
bool
Acquisition Optimization Utilities
Utilities for maximizing acquisition functions.
- botorch.optim.utils.acquisition_utils.columnwise_clamp(X, lower=None, upper=None, raise_on_violation=False)[source]
Clamp values of a Tensor in column-wise fashion (with support for t-batches).
This function is useful in conjunction with optimizers from the torch.optim package, which don’t natively handle constraints. If you apply this after a gradient step you can be fancy and call it “projected gradient descent”. This function is also useful for post-processing candidates generated by the scipy optimizer that satisfy bounds only up to numerical accuracy.
- Parameters:
X (Tensor) – The b x n x d input tensor. If 2-dimensional, b is assumed to be 1.
lower (float | Tensor | None) – The column-wise lower bounds. If scalar, apply bound to all columns.
upper (float | Tensor | None) – The column-wise upper bounds. If scalar, apply bound to all columns.
raise_on_violation (bool) – If True, raise an exception when the elements in X are out of the specified bounds (up to numerical accuracy). This is useful for post-processing candidates generated by optimizers that satisfy imposed bounds only up to numerical accuracy.
- Returns:
The clamped tensor.
- Return type:
Tensor
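The column-wise semantics (a scalar bound applies to every column, a sequence supplies one bound per column) can be sketched without tensors. The function name is hypothetical, nested lists stand in for a 2-dimensional X, and raise_on_violation is left out of this illustration.

```python
def columnwise_clamp_sketch(X, lower=None, upper=None):
    """Hypothetical helper: clamp each column of a list-of-rows matrix."""
    d = len(X[0])

    def expand(bound):
        # None: no bound; scalar: same bound for all columns; else per column.
        if bound is None:
            return [None] * d
        if isinstance(bound, (int, float)):
            return [bound] * d
        return list(bound)

    lo, hi = expand(lower), expand(upper)
    clamped = []
    for row in X:
        new_row = []
        for value, l, u in zip(row, lo, hi):
            if l is not None:
                value = max(value, l)
            if u is not None:
                value = min(value, u)
            new_row.append(value)
        clamped.append(new_row)
    return clamped


# A "projected" step: candidates that left the box are pulled back onto it.
X_clamped = columnwise_clamp_sketch([[-1.0, 5.0], [0.5, 0.5]],
                                    lower=0.0, upper=[1.0, 2.0])
print(X_clamped)
```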
- botorch.optim.utils.acquisition_utils.fix_features(X, fixed_features=None, replace_current_value=True)[source]
Fix feature values in a Tensor.
The fixed features will have zero gradient in downstream calculations.
- Parameters:
X (Tensor) – Input tensor with shape b x q x (reduced_p | p), where reduced_p is the number of features not fixed to a constant value and p is the full number of features. Use p if replace_current_value is True, and reduced_p otherwise.
fixed_features (Mapping[int, float | Tensor] | None) – A mapping with keys as column indices and values equal to what the feature should be set to in X. Keys should be in the range [0, p - 1]. If a tensor is passed as value, it has to either have shape b x q or b, in which case the same value is used across the q dimension.
replace_current_value (bool) – If True, replace the values at the specified indices; otherwise, insert the fixed features at those indices.
- Returns:
The tensor X with fixed features.
- Return type:
Tensor
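The difference between replacing and inserting can be seen on a single row. This sketch uses a list in place of a tensor and a hypothetical function name; it only illustrates the indexing and says nothing about gradients.

```python
def fix_features_sketch(row, fixed_features=None, replace_current_value=True):
    """Hypothetical helper: fix feature values in one row.

    With replace_current_value=True the row already has p entries and the
    fixed ones are overwritten. Otherwise the row has reduced_p entries and
    the fixed values are inserted so that the result has p entries.
    """
    if not fixed_features:
        return list(row)
    if replace_current_value:
        out = list(row)
        for idx, value in fixed_features.items():
            out[idx] = value
        return out
    p = len(row) + len(fixed_features)
    free = iter(row)  # remaining (non-fixed) values, in order
    return [fixed_features[i] if i in fixed_features else next(free)
            for i in range(p)]


replaced = fix_features_sketch([1.0, 2.0, 3.0], {1: 9.0})
inserted = fix_features_sketch([1.0, 3.0], {1: 9.0},
                               replace_current_value=False)
print(replaced, inserted)
```

Both calls yield the same full-dimensional row; they differ only in whether the input already contained a placeholder for the fixed feature.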
- botorch.optim.utils.acquisition_utils.get_X_baseline(acq_function)[source]
Extract X_baseline from an acquisition function.
This tries to find the baseline set of points. First, this checks if the acquisition function has an X_baseline attribute. If it does not, then this method attempts to use the model’s train_inputs as X_baseline.
- Parameters:
acq_function (AcquisitionFunction) – The acquisition function.
- Returns:
An optional n x d-dim tensor of baseline points. This is None if no baseline points are found.
- Return type:
Tensor | None
Model Fitting Utilities
Utilities for fitting and manipulating models.
- class botorch.optim.utils.model_utils.TorchAttr(shape, dtype, device)[source]
Bases: NamedTuple
Create new instance of TorchAttr(shape, dtype, device)
- Parameters:
shape (Size)
dtype (dtype)
device (device)
- shape: Size
Alias for field number 0
- dtype: dtype
Alias for field number 1
- device: device
Alias for field number 2
- botorch.optim.utils.model_utils.get_data_loader(model, batch_size=1024, **kwargs)[source]
- Parameters:
model (GPyTorchModel)
batch_size (int)
kwargs (Any)
- Return type:
DataLoader
- botorch.optim.utils.model_utils.get_parameters(module, requires_grad=None, name_filter=None)[source]
Helper method for obtaining a module’s parameters.
- Parameters:
module (Module) – The target module from which parameters are to be extracted.
requires_grad (bool | None) – Optional Boolean used to filter parameters based on whether or not their requires_grad attribute matches the user-provided value.
name_filter (Callable[[str], bool] | None) – Optional Boolean function used to filter parameters by name.
- Returns:
A dictionary of parameters.
- Return type:
dict[str, Tensor]
- botorch.optim.utils.model_utils.get_parameters_and_bounds(module, requires_grad=None, name_filter=None, default_bounds=(-inf, inf))[source]
Helper method for obtaining a module’s parameters and their respective ranges.
- Parameters:
module (Module) – The target module from which parameters are to be extracted.
name_filter (Callable[[str], bool] | None) – Optional Boolean function used to filter parameters by name.
requires_grad (bool | None) – Optional Boolean used to filter parameters based on whether or not their requires_grad attribute matches the user provided value.
default_bounds (tuple[float, float]) – Default lower and upper bounds for constrained parameters with None typed bounds.
- Returns:
A dictionary of parameters and a dictionary of parameter bounds.
- Return type:
tuple[dict[str, Tensor], dict[str, tuple[float | None, float | None]]]
- botorch.optim.utils.model_utils.get_name_filter(patterns)[source]
Returns a binary function that filters strings (or iterables whose first element is a string) according to a bank of excluded patterns. Typically used in conjunction with generators such as module.named_parameters().
- Parameters:
patterns (Iterator[Pattern | str]) – A collection of regular expressions or strings that define the set of names to be excluded.
- Returns:
A binary function indicating whether or not an item should be filtered.
- Return type:
Callable[[str | tuple[str, Any, …]], bool]
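The pattern-based filtering described above can be sketched with the standard library alone. This is an illustrative stand-in, not BoTorch's implementation: the helper name make_name_filter is hypothetical, and it assumes the returned function yields True for items that should be kept (that is, names matching none of the excluded patterns).

```python
import re
from typing import Any, Callable, Iterable


def make_name_filter(patterns: Iterable[Any]) -> Callable[[Any], bool]:
    """Hypothetical stand-in for get_name_filter (assumption: True means 'keep')."""
    # Accept both raw strings and pre-compiled patterns.
    compiled = [re.compile(p) if isinstance(p, str) else p for p in patterns]

    def name_filter(item: Any) -> bool:
        # An item is either a bare name or an iterable whose first element is the name.
        name = item if isinstance(item, str) else next(iter(item))
        return not any(p.search(name) for p in compiled)

    return name_filter


# Toy (name, value) pairs standing in for module.named_parameters().
named = [
    ("covar.raw_lengthscale", 1.0),
    ("likelihood.raw_noise", 0.1),
    ("mean.constant", 0.0),
]
keep = make_name_filter([r".*noise$"])
kept_names = [name for name, _ in filter(keep, named)]
```

Because the filter accepts tuples as well as strings, it can be passed directly to filter() over name/value pairs.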
- botorch.optim.utils.model_utils.sample_all_priors(model, max_retries=100)[source]
Sample from hyperparameter priors (in-place).
- Parameters:
model (GPyTorchModel) – A GPyTorchModel.
max_retries (int)
- Return type:
None
Numpy - Torch Conversion Tools
Utilities for interfacing Numpy and Torch.
- botorch.optim.utils.numpy_utils.as_ndarray(values, dtype=None, inplace=True)[source]
Helper for going from torch.Tensor to numpy.ndarray.
- Parameters:
values (Tensor) – Tensor to be converted to ndarray.
dtype (dtype | None) – Optional numpy.dtype for the converted tensor.
inplace (bool) – Boolean indicating whether memory should be shared if possible.
- Returns:
An ndarray with the same data as values.
- Return type:
ndarray[tuple[Any, …], dtype[_ScalarT]]
- botorch.optim.utils.numpy_utils.get_bounds_as_ndarray(parameters, bounds)[source]
Helper method for converting bounds into an ndarray.
- Parameters:
parameters (dict[str, Tensor]) – A dictionary of parameters.
bounds (dict[str, tuple[float | Tensor | None, float | Tensor | None]]) – A dictionary of (optional) lower and upper bounds.
- Returns:
An ndarray of bounds.
- Return type:
ndarray[tuple[Any, …], dtype[_ScalarT]] | None
- botorch.optim.utils.numpy_utils.get_per_element_bounds(parameters, bounds, batch_shape)[source]
Convert bounds to an ndarray for a single batch element’s parameters.
For batched models where all batch elements share the same parameter constraints, this extracts bounds for one element’s worth of parameters.
- Parameters:
parameters (dict[str, Tensor]) – A dictionary of batched parameter tensors, each with shape (*batch_shape, *trailing_shape).
bounds (dict[str, tuple[float | Tensor | None, float | Tensor | None]]) – A dictionary of (optional) lower and upper bounds.
batch_shape (Size) – The batch shape shared by all parameters.
- Returns:
An ndarray of shape (per_element_size, 2) or None if all bounds are infinite.
- Return type:
ndarray[tuple[Any, …], dtype[_ScalarT]] | None
Optimization with Timeouts
- botorch.optim.utils.timeout.minimize_with_timeout(fun, x0, args=(), method=None, jac=None, hess=None, hessp=None, bounds=None, constraints=(), tol=None, callback=None, options=None, timeout_sec=None)[source]
Wrapper around scipy.optimize.minimize to support timeout.
This method calls scipy.optimize.minimize with all arguments forwarded verbatim. The only difference is that if provided a timeout_sec argument, it will automatically stop the optimization after the timeout is reached.
Internally, this is achieved by automatically constructing a wrapper callback method that is injected into the scipy.optimize.minimize call and that keeps track of the runtime and the optimization variables at the current iteration.
- Parameters:
fun (Callable[[ndarray[tuple[Any, ...], dtype[_ScalarT]], ...], float])
x0 (ndarray[tuple[Any, ...], dtype[_ScalarT]])
args (tuple[Any, ...])
method (str | None)
jac (str | Callable | bool | None)
hess (str | Callable | HessianUpdateStrategy | None)
hessp (Callable | None)
bounds (Sequence[tuple[float, float]] | Bounds | None)
tol (float | None)
callback (Callable | None)
options (dict[str, Any] | None)
timeout_sec (float | None)
- Return type:
OptimizeResult
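The wrapper-callback mechanism described above can be illustrated without SciPy. The sketch below swaps in a toy gradient-descent loop for scipy.optimize.minimize; every name in it (TimeoutReached, toy_minimize, toy_minimize_with_timeout) is hypothetical, and raising an exception from the callback is one plausible way to abort, not necessarily how BoTorch does it.

```python
import time


class TimeoutReached(Exception):
    """Raised by the wrapper callback; carries the latest iterate and runtime."""

    def __init__(self, x, runtime):
        super().__init__(f"timeout after {runtime:.3f}s")
        self.x = x
        self.runtime = runtime


def toy_minimize(grad, x0, callback=None, lr=0.1, maxiter=1000):
    """Plain gradient descent; a stand-in for scipy.optimize.minimize."""
    x = list(x0)
    for _ in range(maxiter):
        x = [xi - lr * gi for xi, gi in zip(x, grad(x))]
        if callback is not None:
            callback(x)  # invoked once per iteration with the current iterate
    return x


def toy_minimize_with_timeout(grad, x0, callback=None, timeout_sec=None, **kwargs):
    start = time.monotonic()

    def wrapped_callback(xk):
        # Track the runtime and abort once the budget is used up.
        runtime = time.monotonic() - start
        if timeout_sec is not None and runtime >= timeout_sec:
            raise TimeoutReached(list(xk), runtime)
        if callback is not None:
            callback(xk)  # still forward to the user-supplied callback

    try:
        return toy_minimize(grad, x0, callback=wrapped_callback, **kwargs), "finished"
    except TimeoutReached as e:
        return e.x, "timeout"  # return the iterate reached so far


def grad(x):
    # Gradient of f(x) = sum((x_i - 3)^2).
    return [2.0 * (xi - 3.0) for xi in x]


x_full, status_full = toy_minimize_with_timeout(grad, [0.0], maxiter=200)
x_cut, status_cut = toy_minimize_with_timeout(grad, [0.0], timeout_sec=0.0, maxiter=200)
```

With a zero time budget the run stops after the first iteration and returns that iterate, while the unrestricted run converges to the minimizer at 3.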
Parameter Constraint Utilities
Utility functions for constrained optimization.
- botorch.optim.parameter_constraints.get_constraint_tolerance(dtype)[source]
Get the constraint tolerance for a given dtype.
- Parameters:
dtype (dtype) – The dtype to use.
- Returns:
The constraint tolerance for the given dtype.
- Return type:
float
- botorch.optim.parameter_constraints.make_scipy_bounds(X, lower_bounds=None, upper_bounds=None)[source]
Creates a scipy Bounds object for optimization.
- Parameters:
X (Tensor) – ... x d tensor
lower_bounds (float | Tensor | None) – Lower bounds on each column (last dimension) of X. If this is a single float, then all columns have the same bound.
upper_bounds (float | Tensor | None) – Upper bounds on each column (last dimension) of X. If this is a single float, then all columns have the same bound.
- Returns:
A scipy Bounds object if either lower_bounds or upper_bounds is not None, and None otherwise.
- Return type:
Bounds | None
Example
>>> X = torch.rand(5, 2)
>>> scipy_bounds = make_scipy_bounds(X, 0.1, 0.8)
- botorch.optim.parameter_constraints.make_scipy_linear_constraints(shapeX, inequality_constraints=None, equality_constraints=None)[source]
Generate scipy constraints from torch representation.
- Parameters:
shapeX (Size) – The shape of the torch.Tensor to optimize over (i.e. (b) x q x d)
inequality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form \sum_i (X[indices[i]] * coefficients[i]) >= rhs, where indices is a single-dimensional index tensor (long dtype) containing indices into the last dimension of X, coefficients is a single-dimensional tensor of coefficients of the same length, and rhs is a scalar.
equality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form \sum_i (X[indices[i]] * coefficients[i]) == rhs (with indices and coefficients of the same form as in inequality_constraints).
- Returns:
A list of dictionaries containing callables for constraint function values and Jacobians and a string indicating the associated constraint type (“eq”, “ineq”), as expected by scipy.optimize.minimize.
- Return type:
list[dict[str, str | Callable[[ndarray], float] | Callable[[ndarray], ndarray]]]
This function assumes that constraints are the same for each input batch, and broadcasts the constraints accordingly to the input batch shape. This function does support constraints across elements of a q-batch if the indices are a 2-d Tensor.
Example
The following will enforce that x[1] + 0.5 x[3] >= -0.1 for each x in both elements of the q-batch, and each of the 3 t-batches:
>>> constraints = make_scipy_linear_constraints(
...     torch.Size([3, 2, 4]),
...     [(torch.tensor([1, 3]), torch.tensor([1.0, 0.5]), -0.1)],
... )
The following will enforce that x[0, 1] + 0.5 x[1, 3] >= -0.1 where x[0, :] is the first element of the q-batch and x[1, :] is the second element of the q-batch, for each of the 3 t-batches:
>>> constraints = make_scipy_linear_constraints(
...     torch.Size([3, 2, 4]),
...     [(torch.tensor([[0, 1], [1, 3]]), torch.tensor([1.0, 0.5]), -0.1)],
... )
- botorch.optim.parameter_constraints.eval_lin_constraint(x, flat_idxr, coeffs, rhs)[source]
Evaluate a single linear constraint.
- Parameters:
x (ndarray[tuple[Any, ...], dtype[_ScalarT]]) – The input array.
flat_idxr (list[int]) – The indices in x to consider.
coeffs (ndarray[tuple[Any, ...], dtype[_ScalarT]]) – The coefficients corresponding to the indices.
rhs (float) – The right-hand-side of the constraint.
- Returns:
The evaluated constraint: \sum_i (coeffs[i] * x[flat_idxr[i]]) - rhs
- botorch.optim.parameter_constraints.lin_constraint_jac(x, flat_idxr, coeffs, n)[source]
Return the Jacobian associated with a linear constraint.
- Parameters:
x (ndarray[tuple[Any, ...], dtype[_ScalarT]]) – The input array.
flat_idxr (list[int]) – The indices for the elements of x that appear in the constraint.
coeffs (ndarray[tuple[Any, ...], dtype[_ScalarT]]) – The coefficients corresponding to the indices.
n (int) – number of elements
- Returns:
The Jacobian.
- Return type:
ndarray[tuple[Any, …], dtype[_ScalarT]]
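To make the two helpers above concrete, here is a plain-Python sketch of the same arithmetic on a flat list. The function names are hypothetical stand-ins (the library versions operate on ndarrays), and the data mirrors the x[1] + 0.5 x[3] >= -0.1 example given earlier.

```python
from typing import List, Sequence


def eval_lin_constraint_sketch(
    x: Sequence[float], flat_idxr: Sequence[int], coeffs: Sequence[float], rhs: float
) -> float:
    # sum_i coeffs[i] * x[flat_idxr[i]] - rhs; a non-negative value means that
    # an inequality constraint of the form ">= rhs" is satisfied.
    return sum(c * x[i] for i, c in zip(flat_idxr, coeffs)) - rhs


def lin_constraint_jac_sketch(
    flat_idxr: Sequence[int], coeffs: Sequence[float], n: int
) -> List[float]:
    # The constraint is linear, so its gradient does not depend on x: it is zero
    # everywhere except at the indexed positions, which hold the coefficients.
    jac = [0.0] * n
    for i, c in zip(flat_idxr, coeffs):
        jac[i] += c
    return jac


x = [0.2, 0.4, 0.0, 0.6]
value = eval_lin_constraint_sketch(x, [1, 3], [1.0, 0.5], -0.1)
jac = lin_constraint_jac_sketch([1, 3], [1.0, 0.5], n=len(x))
```

Here value is positive, so the point satisfies the constraint, and jac has non-zero entries only at positions 1 and 3.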
- botorch.optim.parameter_constraints.project_to_equality_constraints(X, equality_constraints)[source]
Project X onto the equality constraint manifold via least-squares.
For linear equality constraints of the form Ax = b, this finds the closest point to X (in L2 sense) that satisfies all constraints, using the closed-form least-squares projection: X_proj = X + A^T (A A^T)^{-1} (b - A X).
This operates on each point in the q-batch independently (intra-point constraints only).
- Parameters:
X (Tensor) – A ... x q x d-dim tensor of inputs.
equality_constraints (list[tuple[Tensor, Tensor, float]]) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form sum_i (X[indices[i]] * coefficients[i]) = rhs. Only supports 1-d indices (intra-point constraints).
- Returns:
A tensor of the same shape as X, projected onto the constraint manifold.
- Return type:
Tensor
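For a single equality constraint, A is one row a and A A^T reduces to the scalar a·a, so the projection formula above becomes x_proj = x + a (b - a·x) / (a·a). The sketch below works through that special case in plain Python. The function name is hypothetical, it handles one point and one constraint only, and it assumes the indices are unique.

```python
from typing import List, Sequence


def project_onto_hyperplane(
    x: Sequence[float], indices: Sequence[int], coeffs: Sequence[float], rhs: float
) -> List[float]:
    """Closest point (L2) to x satisfying sum_i coeffs[i] * x[indices[i]] = rhs."""
    residual = rhs - sum(c * x[i] for i, c in zip(indices, coeffs))  # b - a.x
    norm_sq = sum(c * c for c in coeffs)  # a.a, valid for unique indices
    x_proj = list(x)
    for i, c in zip(indices, coeffs):
        x_proj[i] += c * residual / norm_sq  # move along the constraint normal
    return x_proj


# Enforce x[0] + x[1] = 2 on a 3-dimensional point; x[2] is not in the constraint.
x = [0.5, 0.5, 0.5]
x_proj = project_onto_hyperplane(x, [0, 1], [1.0, 1.0], 2.0)
```

Coordinates that do not appear in the constraint are left unchanged, since the correction is applied only along the constraint normal.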
- botorch.optim.parameter_constraints.nonlinear_constraint_is_feasible(nonlinear_inequality_constraint, is_intrapoint, x, tolerance=None)[source]
Checks if a nonlinear inequality constraint is fulfilled (within tolerance).
- Parameters:
nonlinear_inequality_constraint (Callable) – Callable to evaluate the constraint.
is_intrapoint (bool) – If True, the constraint is an intra-point constraint that is applied pointwise and is broadcasted over the q-batch. Else, the constraint has to be evaluated over the whole q-batch and is an inter-point constraint.
x (Tensor) – Tensor of shape (batch x q x d).
tolerance (float | None) – Rather than using the exact const(x) >= 0 constraint, this helper checks feasibility of const(x) >= -tolerance. This avoids marking the candidates as infeasible due to tiny violations.
- Returns:
A boolean tensor of shape (batch) indicating if the constraint is satisfied by the corresponding batch of x.
- Return type:
Tensor
- botorch.optim.parameter_constraints.make_scipy_nonlinear_inequality_constraints(nonlinear_inequality_constraints, f_np_wrapper, x0, shapeX)[source]
Generate Scipy nonlinear inequality constraints from callables.
- Parameters:
nonlinear_inequality_constraints (list[tuple[Callable, bool]]) – A list of tuples representing the nonlinear inequality constraints. The first element in the tuple is a callable representing a constraint of the form callable(x) >= 0. In case of an intra-point constraint, callable() takes in a one-dimensional tensor of shape d and returns a scalar. In case of an inter-point constraint, callable() takes a two-dimensional tensor of shape q x d and again returns a scalar. The second element is a boolean, indicating if it is an intra-point or inter-point constraint (True for intra-point, False for inter-point). For more information on intra-point vs inter-point constraints, see the docstring of the inequality_constraints argument to optimize_acqf(). The constraints will later be passed to the scipy solver.
f_np_wrapper (Callable) – A wrapper function that given a constraint evaluates the value and gradient (using autograd) of a numpy input and returns both the objective and the gradient.
x0 (Tensor) – The starting point for SLSQP. We return this starting point in (rare) cases where SLSQP fails and thus require it to be feasible.
shapeX (Size) – Shape of the three-dimensional batch X, that should be optimized.
- Returns:
A list of dictionaries containing callables for constraint function values and Jacobians and a string indicating the associated constraint type (“eq”, “ineq”), as expected by scipy.optimize.minimize.
- Return type:
list[dict]
- botorch.optim.parameter_constraints.evaluate_feasibility(X, inequality_constraints=None, equality_constraints=None, nonlinear_inequality_constraints=None, tolerance=None)[source]
Evaluate feasibility of candidate points (within a tolerance).
- Parameters:
X (Tensor) – The candidate tensor of shape batch x q x d.
inequality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form \sum_i (X[indices[i]] * coefficients[i]) >= rhs. indices and coefficients should be torch tensors. See the docstring of make_scipy_linear_constraints for an example. When q=1, or when applying the same constraint to each candidate in the batch (intra-point constraint), indices should be a 1-d tensor. For inter-point constraints, in which the constraint is applied to the whole batch of candidates, indices must be a 2-d tensor, where in each row indices[i] = (k_i, l_i) the first index k_i corresponds to the k_i-th element of the q-batch and the second index l_i corresponds to the l_i-th feature of that element.
equality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form \sum_i (X[indices[i]] * coefficients[i]) = rhs. See the docstring of make_scipy_linear_constraints for an example.
nonlinear_inequality_constraints (list[tuple[Callable, bool]] | None) – A list of tuples representing the nonlinear inequality constraints. The first element in the tuple is a callable representing a constraint of the form callable(x) >= 0. In case of an intra-point constraint, callable() takes in a one-dimensional tensor of shape d and returns a scalar. In case of an inter-point constraint, callable() takes a two-dimensional tensor of shape q x d and again returns a scalar. The second element is a boolean, indicating if it is an intra-point or inter-point constraint (True for intra-point, False for inter-point). For more information on intra-point vs inter-point constraints, see the docstring of the inequality_constraints argument.
tolerance (float | None) – The tolerance used to check the feasibility of constraints. For inequality constraints, we check if const(X) >= rhs - tolerance. For equality constraints, we check if abs(const(X) - rhs) < tolerance. For non-linear inequality constraints, we check if const(X) >= -tolerance. This avoids marking the candidates as infeasible due to tiny violations.
- Returns:
A boolean tensor of shape batch indicating if the corresponding candidate of shape q x d is feasible.
- Return type:
Tensor
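The tolerance rules stated for the tolerance argument can be sketched for a single q x d candidate with intra-point linear constraints. This is a simplified illustration in plain Python: the function name is hypothetical, the default tolerance of 1e-6 is an arbitrary value chosen for the example, and inter-point and nonlinear constraints are left out.

```python
from typing import List, Sequence, Tuple

# (indices, coefficients, rhs), mirroring the tuple layout described above.
Constraint = Tuple[Sequence[int], Sequence[float], float]


def is_feasible_sketch(
    X: Sequence[Sequence[float]],
    inequality_constraints: Sequence[Constraint] = (),
    equality_constraints: Sequence[Constraint] = (),
    tolerance: float = 1e-6,  # arbitrary value for this sketch
) -> bool:
    def lhs(point: Sequence[float], indices: Sequence[int], coeffs: Sequence[float]) -> float:
        return sum(c * point[i] for i, c in zip(indices, coeffs))

    for point in X:  # intra-point: every point in the q-batch is checked separately
        for indices, coeffs, rhs in inequality_constraints:
            if lhs(point, indices, coeffs) < rhs - tolerance:
                return False
        for indices, coeffs, rhs in equality_constraints:
            if abs(lhs(point, indices, coeffs) - rhs) >= tolerance:
                return False
    return True


ineq: List[Constraint] = [([0], [1.0], 0.3)]      # x[0] >= 0.3
eq: List[Constraint] = [([0, 1], [1.0, 1.0], 1.0)]  # x[0] + x[1] = 1

X_slightly_off = [[0.3 - 1e-9, 0.7 + 1e-9], [0.5, 0.5]]
ok_with_tol = is_feasible_sketch(X_slightly_off, ineq, eq)
ok_strict = is_feasible_sketch(X_slightly_off, ineq, eq, tolerance=0.0)
ok_violated = is_feasible_sketch([[0.2, 0.8]], ineq, eq)
```

The first candidate violates x[0] >= 0.3 by only 1e-9, so it passes with the tolerance and fails once the tolerance is set to zero, which is the behavior the tolerance is meant to provide.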
- botorch.optim.parameter_constraints.project_to_feasible_space_via_slsqp(X, bounds, inequality_constraints=None, equality_constraints=None, fixed_features=None)[source]
Project X onto the feasible space by solving a quadratic program.
This uses SLSQP with gradients to solve the quadratic program. NOTE: A proper specialized QP solver would be a better choice here, but we’d like to avoid adding dependency on additional packages. SLSQP should be able to solve this reliably and quickly since the dimension is typically low and the number of constraints is typically limited.
- Parameters:
X (Tensor) – A (batch_shape x) n x d-dim tensor of inputs.
bounds (Tensor) – A 2 x d-dim tensor of lower and upper bounds.
inequality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form sum_i (X[indices[i]] * coefficients[i]) >= rhs. indices and coefficients should be torch tensors. See the docstring of make_scipy_linear_constraints for an example.
equality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs).
fixed_features (dict[int, float | Tensor] | None) – A dictionary mapping feature indices to their fixed values. These dimensions will not be modified during projection. Values can be scalars (applied to all elements) or 1D tensors matching the batch size of X (for per-element fixed values).
- Returns:
A (batch_shape x) n x d-dim tensor of projected values.
- Return type:
Tensor
Homotopy Utilities
- class botorch.optim.homotopy.FixedHomotopySchedule(values)[source]
Bases:
object
Homotopy schedule with a fixed list of values.
Initialize FixedHomotopySchedule.
- Parameters:
values (list[float]) – A list of values used in homotopy
- property num_steps: int
- property value: float
- property should_stop: bool
- class botorch.optim.homotopy.LinearHomotopySchedule(start, end, num_steps)[source]
Bases:
FixedHomotopySchedule
Linear homotopy schedule.
Initialize LinearHomotopySchedule.
- Parameters:
start (float) – start value of homotopy
end (float) – end value of homotopy
num_steps (int) – number of steps in the homotopy schedule.
- class botorch.optim.homotopy.LogLinearHomotopySchedule(start, end, num_steps)[source]
Bases:
FixedHomotopySchedule
Log-linear homotopy schedule.
Initialize LogLinearHomotopySchedule.
- Parameters:
start (float) – start value of homotopy
end (float) – end value of homotopy
num_steps (int) – number of steps in the homotopy schedule.
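As a rough illustration of how the two schedules differ, the sketch below generates num_steps values from start to end, spaced evenly either directly or in log space. The helper names are hypothetical, and the exact spacing used by LinearHomotopySchedule and LogLinearHomotopySchedule is an assumption here rather than something stated above.

```python
import math
from typing import List


def linear_values(start: float, end: float, num_steps: int) -> List[float]:
    # Evenly spaced values from start to end, both endpoints included.
    if num_steps == 1:
        return [float(start)]
    step = (end - start) / (num_steps - 1)
    return [start + i * step for i in range(num_steps)]


def log_linear_values(start: float, end: float, num_steps: int) -> List[float]:
    # Evenly spaced in log space, so successive values differ by a constant
    # factor. Requires start > 0 and end > 0.
    return [math.exp(v) for v in linear_values(math.log(start), math.log(end), num_steps)]


lin = linear_values(1.0, 0.0, 5)
loglin = log_linear_values(1.0, 1e-3, 4)
```

A log-linear schedule spends more of its steps near the small end of the range, which can be useful when the homotopy parameter spans several orders of magnitude.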
- class botorch.optim.homotopy.HomotopyParameter(parameter, schedule)[source]
Bases:
object
Homotopy parameter.
The parameter is expected to either be a torch parameter or a torch tensor which may correspond to a buffer of a module. The parameter has a corresponding schedule.
- Parameters:
parameter (Parameter | Tensor)
schedule (FixedHomotopySchedule)
- parameter: Parameter | Tensor
- schedule: FixedHomotopySchedule
- class botorch.optim.homotopy.Homotopy(homotopy_parameters, callbacks=None)[source]
Bases:
object
Generic homotopy class.
This class is designed to be used in optimize_acqf_homotopy. Given a set of homotopy parameters and corresponding schedules we step through the homotopies until we have solved the final problem. We additionally support passing in a list of callbacks that will be executed each time step, reset, and restart are called.
Initialize the homotopy.
- Parameters:
homotopy_parameters (list[HomotopyParameter]) – List of homotopy parameters
callbacks (list[Callable] | None) – Optional list of callbacks that are executed each time restart, reset, or step are called. These may be used to, e.g., reinitialize the acquisition function which is needed when using qNEHVI.
- property should_stop: bool
Returns true if all schedules have reached the end.