`BioCCP.approximate_moment`

— Method

```
approximate_moment(n, fun; p=ones(n)/n, q=1, m=1, r=1,
                   steps=1000, normalize=true, ϵ=1e-3)
```

Calculates the q-th rising moment of `T[N]`, the number of designs needed to collect all modules `m` times. The integral is approximated by a Riemann sum.

Reference:

- Doumas, A. V., & Papanicolaou, V. G. (2016). The coupon collector’s problem revisited: generalizing the double Dixie cup problem of Newman and Shepp. ESAIM: Probability and Statistics, 20, 367-399.

**Examples**

```
julia> n = 100
julia> fun = exp_ccdf
julia> approximate_moment(n, fun; p=ones(n)/n, q=1, m=1, r=1,
steps=10000, normalize=true)
518.8175339489885
```
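
For `q = 1` the moment is the expectation `E[T]`, which for a non-negative random variable equals the integral of `1 - F(t)` over `t ≥ 0`. The sketch below illustrates the Riemann-sum idea for the simplest case only (equal module probabilities, `m = 1`, `r = 1`, Poisson approximation). The names `ccdf_equal`, `riemann_expectation`, and `exact_expectation`, as well as the integration bound `upper`, are hypothetical choices for illustration and not BioCCP internals.

```julia
# Complementary CDF 1 - F(t) for n equally likely modules (m = 1, r = 1),
# assuming each module arrives as an independent Poisson process with rate 1/n.
ccdf_equal(t, n) = 1 - (1 - exp(-t / n))^n

# Midpoint Riemann sum of the complementary CDF over [0, upper].
# `upper` and `steps` are illustrative choices, not BioCCP defaults.
function riemann_expectation(n; upper=20n, steps=10_000)
    δ = upper / steps
    return δ * sum(ccdf_equal((i - 0.5) * δ, n) for i in 1:steps)
end

# Exact value of the integral in this special case: n times the n-th harmonic number.
exact_expectation(n) = n * sum(1 / k for k in 1:n)

n = 100
println(riemann_expectation(n))
println(exact_expectation(n))
```

The two printed values should agree closely; a coarser grid or a shorter integration range widens the gap.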

`BioCCP.exp_ccdf`

— Method

`exp_ccdf(n, T; p=ones(n)/n, m=1, r=1, normalize=true)`

Calculates `1 - F(t)`, the complement of the success probability `F(t) = P(T ≤ t)`. Here `F(t)` is the probability that the minimum number of designs `T` needed to see each module at least `m` times is at most `t`. This function serves as the integrand for calculating `E[T]`.

- `n`: number of modules in the design space
- `p`: vector with the probabilities/abundances of the different modules in the design space during library generation
- `T`: number of designs (the value `t` at which `1 - F(t)` is evaluated)
- `m`: number of times each module has to be observed in the sampled set of designs
- `r`: number of modules per design
- `normalize`: if true, normalize `p`

References:

- Doumas, A. V., & Papanicolaou, V. G. (2016). The coupon collector’s problem revisited: generalizing the double Dixie cup problem of Newman and Shepp. ESAIM: Probability and Statistics, 20, 367-399.
- Boneh, A., & Hofri, M. (1997). The coupon-collector problem revisited—a survey of engineering problems and computational methods. Stochastic Models, 13(1), 39-66.

**Examples**

```
julia> n = 100
julia> t = 500
julia> exp_ccdf(n, t; p=ones(n)/n, m=1, r=1, normalize=true)
0.4913906004535237
```
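
The documented value is consistent with the Poissonized form of the success probability found in the coupon-collector literature. As a sketch, assuming module `i` is observed according to an independent Poisson process with mean count `λ_i(t)` after `t` designs (the symbol `λ_i` and the way `r` enters are assumptions made here for illustration):

```latex
\lambda_i(t) = p_i \, r \, t, \qquad
F(t) = \prod_{i=1}^{n} \left( 1 - \sum_{k=0}^{m-1} e^{-\lambda_i(t)} \frac{\lambda_i(t)^k}{k!} \right), \qquad
E[T] = \int_0^{\infty} \bigl( 1 - F(t) \bigr) \, dt
```

For `n = 100`, equal probabilities, `m = 1`, `r = 1`, and `t = 500`, this gives `1 - (1 - e^{-5})^{100} ≈ 0.4914`, in line with the example above.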

`BioCCP.expectation_fraction_collected`

— Method

`expectation_fraction_collected(n::Integer, t::Integer; p=ones(n)/n, r=1, normalize=true)`

Calculates the fraction of all modules that is expected to be observed after collecting `t` designs.

- `n`: number of modules in the design space
- `t`: sample size/number of designs
- `p`: vector with the probabilities or abundances of the different modules
- `r`: number of modules per design
- `normalize`: if true, normalize `p`

References:

- Boneh, A., & Hofri, M. (1997). The coupon-collector problem revisited—a survey of engineering problems and computational methods. Stochastic Models, 13(1), 39-66.

**Examples**

```
julia> n = 100
julia> t = 200
julia> expectation_fraction_collected(n, t; p=ones(n)/n, r=1, normalize=true)
0.8660203251420364
```
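
A minimal sketch of one way to arrive at this number, assuming each of the `t * r` module draws is independent, so that module `i` is missed with probability `(1 - p[i])^(t * r)`. The function name `fraction_collected_sketch` is hypothetical, and treating `r > 1` as `t * r` independent draws is an assumption rather than confirmed BioCCP behaviour.

```julia
# Expected fraction of modules seen at least once after t designs,
# assuming t * r independent module draws with probabilities p.
function fraction_collected_sketch(n, t; p=fill(1 / n, n), r=1)
    p = p ./ sum(p)                       # normalize to a probability vector
    return sum(1 .- (1 .- p) .^ (t * r)) / n
end

println(fraction_collected_sketch(100, 200))
```

For equal probabilities this reduces to `1 - (1 - 1/n)^t`, which agrees with the documented example to several decimal places.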

`BioCCP.expectation_minsamplesize`

— Method

`expectation_minsamplesize(n; p=ones(n)/n, m=1, r=1, normalize=true)`

Calculates the expected minimum number of designs `E[T]` to observe each module at least `m` times.

- `n`: number of modules in the design space
- `p`: vector with the probabilities or abundances of the different modules
- `m`: number of times each module has to be observed in the sampled set of designs
- `r`: number of modules per design
- `normalize`: if true, normalize `p`

References:

- Doumas, A. V., & Papanicolaou, V. G. (2016). The coupon collector’s problem revisited: generalizing the double Dixie cup problem of Newman and Shepp. ESAIM: Probability and Statistics, 20, 367-399.
- Boneh, A., & Hofri, M. (1997). The coupon-collector problem revisited—a survey of engineering problems and computational methods. Stochastic Models, 13(1), 39-66.

**Examples**

```
julia> n = 100
julia> expectation_minsamplesize(n; p=ones(n)/n, m=1, r=1, normalize=true)
518
```
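
The expectation can be sanity-checked by brute-force simulation. The sketch below draws modules uniformly until all have been seen once (equal probabilities, `m = 1`, `r = 1`) and averages the number of draws; `simulate_min_samplesize`, the seed, and the number of repetitions are hypothetical choices made for illustration and are independent of BioCCP.

```julia
using Random, Statistics

# Draw modules uniformly at random until every module has been seen at least once;
# return the number of draws needed (equal probabilities, m = 1, r = 1).
function simulate_min_samplesize(rng, n)
    seen = falses(n)
    remaining, draws = n, 0
    while remaining > 0
        i = rand(rng, 1:n)
        draws += 1
        if !seen[i]
            seen[i] = true
            remaining -= 1
        end
    end
    return draws
end

rng = MersenneTwister(42)
samples = [simulate_min_samplesize(rng, 100) for _ in 1:5_000]
println(mean(samples))   # empirical estimate of E[T]
println(std(samples))    # empirical spread of T
```

With a few thousand repetitions the empirical mean should land within a few designs of the documented expectation.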

`BioCCP.logfactorial`

— Method

Computes the log of `factorial(n)`; falls back on Stirling's approximation for `n > 20`.
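
A minimal sketch of such a helper, assuming an exact sum of logarithms up to the threshold and the leading terms of Stirling's formula beyond it. The name `logfactorial_sketch` is hypothetical and the actual BioCCP implementation may differ in detail. Note that `21!` no longer fits in a 64-bit integer, which may be why the cutoff sits at 20.

```julia
# Log of n!: exact summation for small n, Stirling's approximation otherwise.
# The threshold of 20 follows the description above; everything else is illustrative.
function logfactorial_sketch(n::Integer)
    n < 0 && throw(DomainError(n, "n must be non-negative"))
    if n <= 20
        return sum(log, 1:n; init=0.0)            # log(1) + log(2) + ... + log(n)
    else
        return n * log(n) - n + 0.5 * log(2π * n)  # Stirling: log(n!) ≈ n log n - n + log(2πn)/2
    end
end

println(logfactorial_sketch(10))
println(logfactorial_sketch(30))
```

The error of the Stirling branch is roughly `1/(12n)`, so it is already below `0.005` at the threshold.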

`BioCCP.prob_occurrence_module`

— Method

`prob_occurrence_module(pᵢ, t::Integer, r, k::Integer)`

Calculates the probability that a specific module with module probability `pᵢ` has occurred `k` times after collecting `t` designs.

Sampling processes of individual modules are assumed to be independent Poisson processes.

- `pᵢ`: module probability
- `t`: sample size/number of designs
- `r`: number of modules per design
- `k`: number of occurrences

**Examples**

```
julia> pᵢ = 0.005
julia> t = 500
julia> k = 2
julia> r = 1
julia> prob_occurrence_module(pᵢ, t, r, k)
0.25651562069968376
```
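
Under the Poisson assumption stated above, the count of a single module after `t` designs has a Poisson distribution. The sketch below assumes its mean is `λ = pᵢ * t * r`; the function name `poisson_occurrence_sketch` is hypothetical.

```julia
# Poisson probability of seeing a module exactly k times after t designs,
# assuming the expected count is λ = pᵢ * t * r.
function poisson_occurrence_sketch(pᵢ, t, r, k)
    λ = pᵢ * t * r
    return exp(-λ) * λ^k / factorial(k)
end

println(poisson_occurrence_sketch(0.005, 500, 1, 2))
```

For the example inputs this gives `λ = 2.5` and a probability close to the documented value. For large `k`, evaluating the expression in log space would avoid overflow in `factorial(k)`.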

`BioCCP.std_minsamplesize`

— Method

`std_minsamplesize(n::Integer; p=ones(n)/n, m::Integer=1, r=1, normalize=true)`

Calculates the standard deviation of the minimum number of designs needed to observe each module at least `m` times.

- `n`: number of modules in the design space
- `p`: vector with the probabilities or abundances of the different modules
- `m`: number of complete sets of modules that need to be collected
- `r`: number of modules per design
- `normalize`: if true, normalize `p`

**Examples**

```
julia> n = 100
julia> std_minsamplesize(n; p=ones(n)/n, m=1, r=1, normalize=true)
126
```
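
For the special case in the example (equal probabilities, `m = 1`, `r = 1`), the classical coupon collector's problem has a closed-form variance, which gives a quick sanity check. This is a textbook identity, not the method BioCCP uses for general `p` and `m`; the name `std_classical` is hypothetical.

```julia
# Standard deviation of the classical coupon collector's waiting time
# (n equally likely modules, one module per design, each module seen once):
# Var[T] = n^2 * Σ 1/k^2 - n * Σ 1/k, with both sums running over k = 1..n.
function std_classical(n)
    variance = n^2 * sum(1 / k^2 for k in 1:n) - n * sum(1 / k for k in 1:n)
    return sqrt(variance)
end

println(std_classical(100))
```

For `n = 100` the result is about 125.8, consistent with the rounded value in the example.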

`BioCCP.success_probability`

— Method

`success_probability(n::Integer, t::Integer; p=ones(n)/n, m::Integer=1, r=1, normalize=true)`

Calculates the success probability `F(t) = P(T ≤ t)`, the probability that the minimum number of designs `T` needed to see each module at least `m` times is at most `t`.

- `n`: number of modules in the design space
- `t`: sample size/number of designs for which to calculate the success probability
- `p`: vector with the probabilities or abundances of the different modules
- `m`: number of complete sets of modules that need to be collected
- `r`: number of modules per design
- `normalize`: if true, normalize `p`

**Examples**

```
julia> n = 100
julia> t = 600
julia> success_probability(n, t; p=ones(n)/n, m=1, r=1, normalize=true)
0.7802171997092149
```
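
A minimal sketch of how such a probability can be computed, assuming the module counts after `t` designs are independent Poisson variables with means `p[i] * r * t`. The names `poisson_tail` and `success_probability_sketch` are hypothetical, and the treatment of `r` is an assumption.

```julia
# Probability that a Poisson(λ) count is at least m.
poisson_tail(λ, m) = 1 - sum(exp(-λ) * λ^k / factorial(k) for k in 0:m-1)

# Success probability F(t): every module observed at least m times after t designs,
# assuming independent Poisson counts with means p[i] * r * t.
function success_probability_sketch(n, t; p=fill(1 / n, n), m=1, r=1)
    p = p ./ sum(p)                       # normalize to a probability vector
    return prod(poisson_tail(pᵢ * r * t, m) for pᵢ in p)
end

println(success_probability_sketch(100, 600))
println(success_probability_sketch(100, 600; m=2))
```

With equal probabilities and `m = 1` this reduces to `(1 - e^{-t/n})^n`, which matches the documented example to several decimal places. Requiring `m = 2` lowers the probability for the same `t`.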