CombinatorialBandits
This package implements several algorithms to deal with combinatorial multi-armed bandit (CMAB), including the first polynomial-time optimum-regret algorithms: AESCB (described in our paper) and AOSSB (article in press).
See also Bandits.jl, focusing on multi-armed bandits (i.e. not combinatorial).
To install:
]add CombinatorialBandits
Example usage:
using CombinatorialBandits, Distributions
n = 20
m = 8
ε = 0.1
distr = Distribution[Bernoulli(.5 + ((i % 3 == 0) ? ε : -ε)) for i in 1:n]
i = MSet(distr, 8, MSetAlgosSolver())
@time simulate(i, ThompsonSampling(), 200)
@time simulate(i, LLR(), 200)
@time simulate(i, CUCB(), 200)
@time simulate(i, ESCB2(ESCB2Budgeted(.1, true)), 200)
Citing
If you use this package in your research, please cite this preprint:
@misc{cuvelier2020,
title={Statistically Efficient, Polynomial-Time Algorithms for Combinatorial Semi Bandits},
author={Thibaut Cuvelier and Richard Combes and Eric Gourdin},
year={2020},
eprint={2002.07258},
archivePrefix={arXiv},
primaryClass={stat.ML}
}