CombinatorialBandits

Project Status: Active – The project has reached a stable, usable state and is being actively developed.The MIT LicenseBuild StatusCoverage Statuscodecov.io

This package implements several algorithms to deal with combinatorial multi-armed bandit (CMAB), including the first polynomial-time optimum-regret algorithms: AESCB (described in our paper) and AOSSB (article in press).

See also Bandits.jl, focusing on multi-armed bandits (i.e. not combinatorial).

To install:

]add CombinatorialBandits

Example usage:

using CombinatorialBandits, Distributions

n = 20
m = 8
ε = 0.1
distr = Distribution[Bernoulli(.5 + ((i % 3 == 0) ? ε : -ε)) for i in 1:n]

i = MSet(distr, 8, MSetAlgosSolver())
@time simulate(i, ThompsonSampling(), 200)
@time simulate(i, LLR(), 200)
@time simulate(i, CUCB(), 200)
@time simulate(i, ESCB2(ESCB2Budgeted(.1, true)), 200)

Citing

If you use this package in your research, please cite this preprint:

@misc{cuvelier2020,
    title={Statistically Efficient, Polynomial-Time Algorithms for Combinatorial Semi Bandits},
    author={Thibaut Cuvelier and Richard Combes and Eric Gourdin},
    year={2020},
    eprint={2002.07258},
    archivePrefix={arXiv},
    primaryClass={stat.ML}
}