samplespace.algorithms
- General Sampling Algorithms¶
This module implements several general-purpose sampling algorithms.
-
samplespace.algorithms.
sample_discrete_roulette
(randfloat: Callable[], float], cum_weights: Sequence[float]) → int¶ Sample from a given discrete distribution given its cumulative weights, using binary search / the roulette wheel selection algorithm.
- Parameters
- Returns
An index into the population from 0 to
len(cum_weights) - 1
.
-
class
samplespace.algorithms.
AliasTable
(probability: Sequence[float], alias: Sequence[int])¶ Sample from a given discrete distribution given its relative weights, using the alias table algorithm.
Alias tables are slow to construct, but offer increased sampling speed compared to the roulette wheel algorithm.
Construct a table using
from_weights()
.-
classmethod
from_weights
(weights: Sequence[float])¶ Construct an alias table using a list of relative weights.
The provided weights need not sum to 1.0.
-
sample
(randbelow: Callable[[int], int], randchance: Callable[[float], bool]) → int¶ Sample from the alias table.
- Parameters
randbelow (Callable[[int], int]) – A function returning a random int in [0, arg), e.g.
random.randrange()
. Will only be called once per sample.randchance (Callable[[float], bool) – A function returning
True
with chance arg andFalse
otherwise, e.g.lambda arg: random.random() < arg
. Will only be called once per sample.
- Returns
An index into the population from 0 to
len(weights) - 1
.
-
classmethod
Examples¶
Sampling from a discrete categorical distribution using the Roulette Wheel Selection algorithm:
import random
import itertools
from samplespace import algorithms
population = 'abcde'
weights = [0.1, 0.4, 0.2, 0.3, 0.5]
cum_weights = list(itertools.accumulate(weights))
indices = [algorithms.sample_discrete_roulette(random.random, cum_weights)
for _ in range(25)] # [0, 4, 4, ...]
samples = [population[index] for index in indices] # ['a', 'e', 'e', ...]
Sampling from a discrete categorical distribution using the Alias Table algorithm and a repeatable sequence:
from samplespace import algorithms, RepeatableRandomSequence
population = 'abcde'
weights = [0.1, 0.4, 0.2, 0.3, 0.5]
rrs = RepeatableRandomSequence(seed=12345)
at = algorithms.AliasTable.from_weights(weights)
indices = [at.sample(rrs.randrange, rrs.chance) for _ in range(25)]
samples = [population[index] for index in indices] # ['e', 'e', 'c', ...]
# Note that rrs.index == 50 at this point, since two random values
# were required for each sample.
# It is also possible to use AliasTable within a cascade:
rrs.reset()
indices = [0] * 25
for i in range(len(indices)):
with rrs.cascade():
indices[i] = at.sample(rrs.randrange, rrs.chance)
samples = [population[index] for index in indices] # ['e', 'd', 'b', ...]
# Because cascades were used, rrs.index == 25.
# Note that the values produced are different than the previous
# attempt, because a different number of logical samples were made.