samplespace.distributions - Serializable Probability Distributions¶
This module implements a number of useful probability distributions.
Each distribution can be sampled using any random number generator
providing at least the same functionality as the random module;
this includes samplespace.repeatablerandom.RepeatableRandomSequence.
The classes in this module are primarily intended for storing information
on random distributions in configuration files using
Distribution.as_dict()/distribution_from_dict() or
Distribution.as_list()/distribution_from_list().
See the Examples section for examples on how to do this.
Integer distributions¶
-
class
samplespace.distributions.DiscreteUniform(min_val: int, max_val: int)¶ Represents a discrete uniform distribution of integers in [min_val, max_val).
-
as_dict() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
max_val¶ Read-only property for the distribution’s upper limit.
-
property
min_val¶ Read-only property for the distribution’s lower limit.
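The half-open interval means that max_val itself is never produced. The sketch below illustrates the same convention using the standard library's random.randrange(); whether DiscreteUniform calls randrange() internally is an assumption, and the variable names are illustrative only.

```python
import random

# Assumption: the [min_val, max_val) convention matches random.randrange(),
# which also excludes its upper bound.
min_val, max_val = 3, 8
rng = random.Random(42)  # seeded so the run is reproducible

samples = [rng.randrange(min_val, max_val) for _ in range(1000)]
observed = sorted(set(samples))
print(observed)  # the upper bound 8 never appears
```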
-
-
class
samplespace.distributions.Geometric(mean: float, include_zero: bool = False)¶ Represents a geometric distribution.
If include_zero is False, returns integers from 1 to infinity according to \(\text{Pr}(x = k) = p {(1 - p)}^{k - 1}\) where \(p = \frac{1}{\mathit{mean}}\).
If include_zero is True, returns integers from 0 to infinity according to \(\text{Pr}(x = k) = p {(1 - p)}^{k}\) where \(p = \frac{1}{\mathit{mean} + 1}\).
-
as_dict() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
include_zero¶ Read-only property for whether or not the distribution’s support includes zero.
-
property
mean¶ Read-only property for the distribution’s mean.
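Both parametrizations are chosen so that the expected value equals mean. A minimal sketch that checks this numerically from the two formulas above; the helper names geometric_pmf and truncated_mean are hypothetical and not part of samplespace.

```python
def geometric_pmf(k: int, mean: float, include_zero: bool = False) -> float:
    """Evaluate Pr(x = k) using the two parametrizations documented above."""
    if include_zero:
        p = 1.0 / (mean + 1.0)
        return p * (1.0 - p) ** k if k >= 0 else 0.0
    p = 1.0 / mean
    return p * (1.0 - p) ** (k - 1) if k >= 1 else 0.0


def truncated_mean(mean: float, include_zero: bool, terms: int = 2000) -> float:
    """Approximate the expected value by summing the first `terms` outcomes."""
    start = 0 if include_zero else 1
    return sum(k * geometric_pmf(k, mean, include_zero)
               for k in range(start, start + terms))


mean_from_one = truncated_mean(4.0, include_zero=False)   # support {1, 2, ...}
mean_from_zero = truncated_mean(4.0, include_zero=True)   # support {0, 1, ...}
print(mean_from_one, mean_from_zero)  # both are approximately 4.0
```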
-
-
class
samplespace.distributions.FiniteGeometric(s: float, n: int)¶ Represents a geometric-like distribution with exponent s and finite support {1, …, n}.
The finite geometric distribution is defined by the equation
\[\text{Pr}(x=k) = \frac{s^{k}}{\sum_{i=1}^{n} s^{i}}\]
The distribution is defined such that each result is s times as likely to occur as the previous; i.e. \(\text{Pr}(x=k) = s \text{Pr}(x=k-1)\) over the support.
-
as_dict() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
n¶ Read-only property for the number of values in the distribution’s support.
-
property
s¶ Read-only property for the distribution’s s exponent.
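The constant ratio between neighbouring outcomes can be seen by tabulating the probabilities directly. This is a sketch of the formula only, not of the library's sampling code; finite_geometric_pmf is a hypothetical helper.

```python
def finite_geometric_pmf(s: float, n: int) -> list:
    """Return [Pr(x=1), ..., Pr(x=n)] for the finite geometric distribution."""
    weights = [s ** k for k in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


pmf = finite_geometric_pmf(0.5, 4)
# Each outcome is s = 0.5 times as likely as the one before it.
ratios = [pmf[i + 1] / pmf[i] for i in range(len(pmf) - 1)]
print(pmf)
print(ratios)
```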
-
-
class
samplespace.distributions.ZipfMandelbrot(s: float, q: float, n: int)¶ Represents a Zipf-Mandelbrot distribution with exponent s, offset q, and support {1, …, n}.
The Zipf-Mandelbrot distribution is defined by the equation
\[\text{Pr}(x=k) = \frac{(k+q)^{-s}}{\sum_{i=1}^{n} (i+q)^{-s}}\]
When q == 0, the distribution becomes the Zipf distribution, and as n increases, it approaches the Zeta distribution.
-
as_dict() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
n¶ Read-only property for the number of values in the distribution’s support.
-
property
q¶ Read-only property for the distribution’s q offset.
-
property
s¶ Read-only property for the distribution’s s exponent.
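To make the roles of s and q concrete, the probabilities can be tabulated from the defining equation. The helper zipf_mandelbrot_pmf below is hypothetical; it evaluates the formula and does not reproduce how samplespace draws samples.

```python
def zipf_mandelbrot_pmf(s: float, q: float, n: int) -> list:
    """Return [Pr(x=1), ..., Pr(x=n)] for the Zipf-Mandelbrot distribution."""
    weights = [(k + q) ** -s for k in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# With q == 0 this is the plain Zipf distribution: weights 1, 1/2, 1/3.
zipf = zipf_mandelbrot_pmf(s=1.0, q=0.0, n=3)
# A positive offset flattens the head of the distribution: weights 1/2, 1/3, 1/4.
shifted = zipf_mandelbrot_pmf(s=1.0, q=1.0, n=3)
print(zipf)
print(shifted)
```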
-
-
class
samplespace.distributions.Bernoulli(p: float)¶ Represents a Bernoulli distribution with parameter p.
Returns True with probability p, and False otherwise.
-
as_dict() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
p¶ Read-only property for the distribution’s p parameter.
-
Categorical distributions¶
-
class
samplespace.distributions.WeightedCategorical(items: Optional[Sequence[Tuple[float, Any]]] = None, population: Optional[Sequence] = None, weights: Optional[Sequence[float]] = None, *, cum_weights: Optional[Sequence[float]] = None)¶ Represents a categorical distribution defined by a population and a list of relative weights.
Provide exactly one of the following: items; population and weights; or population and cum_weights.
- Parameters
items (Sequence[Tuple[Any]]) – A sequence of tuples in the format (weight, relative value).
population (Sequence) – A sequence of possible values.
weights (Sequence[float], optional) – A sequence of relative weights corresponding to each item in the population. Must be the same length as the population list.
cum_weights (Sequence[float], optional) – A sequence of cumulative weights corresponding to each item in the population. Must be the same length as the population list. Only one of weights and cum_weights should be provided.
-
as_dict() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
cum_weights¶ A read-only property for the distribution’s cumulative weights.
-
property
items¶ A read-only property returning a sequence of tuples in the format (weight, relative value).
-
property
population¶ A read-only property for the distribution’s population.
-
sample(rand)¶ Sample from the distribution.
- Parameters
rand – The random generator used to generate the sample.
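Relative weights and cumulative weights describe the same distribution. The sketch below shows the relationship using itertools.accumulate() and random.choices() from the standard library; it is an illustration of the two parameter styles and assumes nothing about how WeightedCategorical samples internally. The population and weights are made-up values.

```python
import itertools
import random

population = ['house', 'store', 'tree', 'ground']
weights = [0.2, 0.4, 0.8, 5.0]
# Cumulative weights are the running totals of the relative weights.
cum_weights = list(itertools.accumulate(weights))

# Two generators with the same seed, so both calls see the same random stream.
rng_a = random.Random(2024)
rng_b = random.Random(2024)
by_weights = rng_a.choices(population, weights=weights, k=10)
by_cum_weights = rng_b.choices(population, cum_weights=cum_weights, k=10)

print(cum_weights)
print(by_weights == by_cum_weights)  # both styles select the same items
```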
-
class
samplespace.distributions.UniformCategorical(population: Sequence)¶ Represents a uniform categorical distribution over a given population.
-
as_dict() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
population¶ A read-only property for the distribution’s population.
-
sample(rand)¶ Sample from the distribution.
- Parameters
rand – The random generator used to generate the sample.
-
-
class
samplespace.distributions.FiniteGeometricCategorical(population: Sequence, s: float)¶ Represents a categorical distribution with weights corresponding to a finite geometric-like distribution with exponent s.
The finite geometric distribution is defined by the equation
\[\text{Pr}(x=k) = \frac{s^{k}}{\sum_{i=1}^{n} s^{i}}\]
The distribution is defined such that each result is s times as likely to occur as the previous; i.e. \(\text{Pr}(x=k) = s \text{Pr}(x=k-1)\) over the support.
-
as_dict() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
cum_weights¶ A read-only property for the distribution’s cumulative weights.
-
property
items¶ A read-only property returning a sequence of tuples in the format (weight, relative value).
-
property
n¶ Read-only property for the number of values in the distribution’s support.
-
property
population¶ A read-only property for the distribution’s population.
-
property
s¶ Read-only property for the distribution’s s exponent.
-
sample(rand)¶ Sample from the distribution.
- Parameters
rand – The random generator used to generate the sample.
-
-
class
samplespace.distributions.ZipfMandelbrotCategorical(population: Sequence, s: float, q: float)¶ Represents a categorical distribution with weights corresponding to a Zipf-Mandelbrot distribution with exponent s and offset q.
The Zipf-Mandelbrot distribution is defined by the equation
\[\text{Pr}(x=k) = \frac{(k+q)^{-s}}{\sum_{i=1}^{n} (i+q)^{-s}}\]
When q == 0, the distribution becomes the Zipf distribution, and as n increases, it approaches the Zeta distribution.
-
as_dict() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
cum_weights¶ A read-only property for the distribution’s cumulative weights.
-
property
items¶ A read-only property returning a sequence of tuples in the format (weight, relative value).
-
property
n¶ Read-only property for the number of values in the distribution’s support.
-
property
population¶ A read-only property for the distribution’s population.
-
property
q¶ Read-only property for the distribution’s q offset.
-
property
s¶ Read-only property for the distribution’s s exponent.
-
sample(rand)¶ Sample from the distribution.
- Parameters
rand – The random generator used to generate the sample.
-
Continuous distributions¶
-
class
samplespace.distributions.Constant(value)¶ Represents a distribution that always returns a constant value.
-
as_dict() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
sample(rand)¶ Sample from the distribution.
- Parameters
rand – The random generator used to generate the sample.
-
property
value¶ Read-only property for the distribution’s constant value.
-
-
class
samplespace.distributions.Uniform(min_val: float = 0.0, max_val: float = 1.0)¶ Represents a continuous uniform distribution with support [min_val, max_val).
The uniform distribution is defined as
\[\begin{split}\text{P}(x) = \begin{cases} \frac{1}{b - a} & \text{for } x \in [a, b) \\ 0 & \text{otherwise} \end{cases}\end{split}\]
where \(a\) is min_val and \(b\) is max_val.
-
as_dict() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
max_val¶ Read-only property for the distribution’s upper limit.
-
property
min_val¶ Read-only property for the distribution’s lower limit.
-
-
class
samplespace.distributions.Gamma(alpha: float, beta: float)¶ Represents a gamma distribution with parameters alpha and beta.
The gamma distribution is defined as
\[\text{P}(x) = \frac{x^{\alpha - 1} e^{-\frac{x}{\beta}}} {\Gamma(\alpha) \beta^{\alpha}}\]
Caution
This implementation defines its parameters to match random.gammavariate(). The parametrization differs from the most common definitions of the gamma distribution, such as the one given on Wikipedia. Take care when setting alpha and beta!
-
property
alpha¶ Read-only property for the distribution’s alpha parameter.
-
as_dict() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
beta¶ Read-only property for the distribution’s beta parameter.
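In the density above, beta divides x in the exponent, so it acts as a scale parameter and the mean is alpha * beta. A quick check against random.gammavariate(), which this class states it matches; the parameter values and seed are arbitrary.

```python
import random
import statistics

alpha, beta = 2.0, 3.0
rng = random.Random(0)  # seeded so the run is reproducible

# random.gammavariate(alpha, beta) treats beta as a scale parameter,
# so the theoretical mean is alpha * beta (6.0 here), not alpha / beta.
samples = [rng.gammavariate(alpha, beta) for _ in range(20000)]
sample_mean = statistics.mean(samples)
print(round(sample_mean, 2), 'vs theoretical', alpha * beta)
```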
-
-
class
samplespace.distributions.Triangular(low: float = 0.0, high: float = 1.0, mode: Optional[float] = None)¶ Represents a triangular distribution with lower limit low, upper limit high, and mode mode.
The triangular distribution is defined by
\[\begin{split}\text{P}(x) = \begin{cases} 0 & \text{for } x \lt l, \\ \frac{2(x-l)}{(h-l)(m-l)} & \text{for } l \le x \lt m, \\ \frac{2}{h-l} & \text{for } x = m, \\ \frac{2(h-x)}{(h-l)(h-m)} & \text{for } m \lt x \le h, \\ 0 & \text{for } h \lt x \end{cases}\end{split}\]
where \(l\), \(h\), and \(m\) are low, high, and mode respectively.
-
as_dict() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
high¶ Read-only property for the distribution’s upper bound.
-
property
low¶ Read-only property for the distribution’s lower bound.
-
property
mode¶ Read-only property for the distribution’s mode if specified, otherwise None.
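The piecewise density rises linearly from low to the mode and falls linearly from the mode to high. A minimal sketch that evaluates it and confirms that it integrates to 1; triangular_pdf is a hypothetical helper, and defaulting the mode to the midpoint is an assumption borrowed from random.triangular(), not a documented behaviour of this class.

```python
def triangular_pdf(x: float, low: float = 0.0, high: float = 1.0, mode=None) -> float:
    """Evaluate the triangular density at x."""
    if mode is None:
        # Assumption: fall back to the midpoint, as random.triangular() does.
        mode = (low + high) / 2.0
    if x < low or x > high:
        return 0.0
    if x < mode:
        return 2.0 * (x - low) / ((high - low) * (mode - low))
    if x == mode:
        return 2.0 / (high - low)
    return 2.0 * (high - x) / ((high - low) * (high - mode))


low, high, mode = 2.0, 5.0, 3.0
peak = triangular_pdf(mode, low, high, mode)  # 2 / (high - low)

# Midpoint Riemann sum over [low, high]; the area under a density is 1.
steps = 3000
width = (high - low) / steps
area = sum(triangular_pdf(low + (i + 0.5) * width, low, high, mode) * width
           for i in range(steps))
print(peak, area)
```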
-
-
class
samplespace.distributions.UniformProduct(n: int)¶ Represents a distribution whose values are the product of n uniformly distributed variables.
This distribution has the following PDF
\[\begin{split}\text{P}(x) = \begin{cases} \frac{(-1)^{n-1} \ln^{n-1}(x)}{(n - 1)!} & \text{for } x \in [0, 1) \\ 0 & \text{otherwise} \end{cases}\end{split}\]
-
as_dict() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
n¶ Read-only property for the number of uniformly distributed variables to multiply.
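One way to picture this distribution is to multiply n independent draws from [0, 1). The sketch below does exactly that with the standard library; sample_uniform_product is a hypothetical helper, and it is an assumption that the class samples in the same way. Because each factor has mean 1/2, the product has mean 2 ** -n.

```python
import random
import statistics


def sample_uniform_product(rng: random.Random, n: int) -> float:
    """Multiply n independent uniform draws from [0, 1)."""
    result = 1.0
    for _ in range(n):
        result *= rng.random()
    return result


n = 3
rng = random.Random(1)  # seeded so the run is reproducible
samples = [sample_uniform_product(rng, n) for _ in range(20000)]
sample_mean = statistics.mean(samples)
print(round(sample_mean, 3), 'vs theoretical', 2.0 ** -n)
```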
-
-
class
samplespace.distributions.LogNormal(mu: float = 0.0, sigma: float = 1.0)¶ Represents a log-normal distribution with parameters mu and sigma.
The logarithms of values sampled from a log-normal distribution are normally distributed with mean mu and standard deviation sigma. The distribution is defined by
\[\text{P}(x) = \frac{1}{x \sigma \sqrt{2 \pi}} \exp{\left(- \frac{(\ln{x} - \mu)^2}{2 \sigma ^2}\right)}\]
-
as_dict() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
mu¶ Read-only property for the distribution’s mu parameter.
-
sample(rand) → float¶ Sample from the distribution.
- Parameters
rand – The random generator used to generate the sample.
-
property
sigma¶ Read-only property for the distribution’s sigma parameter.
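The defining property, that the logarithms of the samples are normally distributed, can be checked empirically. This sketch uses random.lognormvariate() from the standard library as a stand-in; whether LogNormal delegates to it is an assumption, and the parameter values are arbitrary.

```python
import math
import random
import statistics

mu, sigma = 1.0, 0.5
rng = random.Random(11)  # seeded so the run is reproducible

samples = [rng.lognormvariate(mu, sigma) for _ in range(20000)]
logs = [math.log(x) for x in samples]

# The logs should have mean close to mu and standard deviation close to sigma.
log_mean = statistics.mean(logs)
log_stdev = statistics.stdev(logs)
print(round(log_mean, 2), round(log_stdev, 2))
```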
-
-
class
samplespace.distributions.Exponential(lambd: float)¶ Represents an exponential distribution with rate lambd.
The probability density function for the exponential distribution is defined by \(\text{P}(x)= \lambda e^{-\lambda x}\) where \(x\ge 0\)
-
as_dict() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
lambd¶ Read-only property for the distribution’s rate parameter.
-
-
class
samplespace.distributions.VonMises(mu: float, kappa: float)¶ Represents a von Mises distribution with parameters mu and kappa.
Samples from a von Mises distribution represent randomly-chosen angles clustered around a mean angle. This distribution is an approximation to the wrapped normal distribution and is defined by
\[\text{P}(x) = \frac{e^{\kappa \cos{(x - \mu)}}}{2 \pi I_0(\kappa)}\]
where \(I_0(\kappa)\) is the modified Bessel function of order 0.
-
as_dict() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
kappa¶ Read-only property for the distribution’s kappa parameter.
-
property
mu¶ Read-only property for the distribution’s mu parameter.
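Because the samples are angles, their average should be taken on the circle rather than arithmetically. The sketch below draws angles with random.vonmisesvariate() from the standard library (an assumed stand-in for this class's sampler) and recovers mu as the circular mean; the parameter values are arbitrary.

```python
import math
import random

mu, kappa = 1.0, 4.0  # mean angle in radians, concentration
rng = random.Random(7)  # seeded so the run is reproducible
angles = [rng.vonmisesvariate(mu, kappa) for _ in range(20000)]

# Circular mean: average the unit vectors, then take the angle of the result.
mean_sin = sum(math.sin(a) for a in angles) / len(angles)
mean_cos = sum(math.cos(a) for a in angles) / len(angles)
circular_mean = math.atan2(mean_sin, mean_cos)
print(round(circular_mean, 2))  # close to mu
```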
-
-
class
samplespace.distributions.Beta(alpha: float, beta: float)¶ Represents a beta distribution with parameters alpha and beta.
The beta distribution is defined by
\[\text{P}(x) = x^{\alpha - 1} (1 - x)^{\beta - 1} \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}\]
-
property
alpha¶ Read-only property for the distribution’s alpha parameter.
-
as_dict() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
beta¶ Read-only property for the distribution’s beta parameter.
-
-
class
samplespace.distributions.Pareto(alpha: float)¶ Represents a Pareto distribution with shape parameter alpha and minimum value 1.
The Pareto distribution has PDF
\[\text{P}(x) = \frac{\alpha}{x^{\alpha + 1}}\]
where \(x \ge 1\).
-
property
alpha¶ Read-only property for the distribution’s alpha parameter.
-
as_dict() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
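With a minimum value of 1, every sample is at least 1 and the median is 2 ** (1 / alpha). The sketch below checks both with random.paretovariate() from the standard library; treating it as equivalent to this class's sampler is an assumption, and the value of alpha is arbitrary.

```python
import random
import statistics

alpha = 2.5
rng = random.Random(5)  # seeded so the run is reproducible
samples = [rng.paretovariate(alpha) for _ in range(20000)]

smallest = min(samples)                      # never below the minimum value 1
sample_median = statistics.median(samples)
# Solving 1 - x ** -alpha = 1/2 for x gives the theoretical median.
theoretical_median = 2.0 ** (1.0 / alpha)
print(round(sample_median, 3), 'vs', round(theoretical_median, 3))
```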
-
-
class
samplespace.distributions.Weibull(alpha: float, beta: float)¶ Represents a Weibull distribution with scale parameter alpha and shape parameter beta.
The distribution is defined by
\[\text{P}(x) = \frac{\beta}{\alpha} \left(\frac{x}{\alpha}\right)^{\beta-1} e^{-(x/\alpha)^{\beta}}\]
where \(x \ge 0\).
-
property
alpha¶ Read-only property for the distribution’s alpha parameter.
-
as_dict() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
beta¶ Read-only property for the distribution’s beta parameter.
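With alpha as the scale and beta as the shape, the mean of the distribution is alpha * Gamma(1 + 1 / beta). The sketch below compares that value with samples from random.weibullvariate(), which uses the same argument order; treating it as equivalent to this class's sampler is an assumption, and the parameter values are arbitrary.

```python
import math
import random
import statistics

alpha, beta = 2.0, 1.5  # scale, shape
rng = random.Random(3)  # seeded so the run is reproducible

samples = [rng.weibullvariate(alpha, beta) for _ in range(20000)]
sample_mean = statistics.mean(samples)
theoretical_mean = alpha * math.gamma(1.0 + 1.0 / beta)
print(round(sample_mean, 2), 'vs', round(theoretical_mean, 2))
```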
-
-
class
samplespace.distributions.Gaussian(mu: float = 0.0, sigma: float = 1.0)¶ Represents a Gaussian distribution with parameters mu and sigma.
The Gaussian, or Normal distribution is defined by
\[\text{P}(x) = \frac{1}{\sigma \sqrt{2 \pi}} e^{-\frac{1}{2} \left(\frac{x - \mu}{\sigma}\right)^2}\]
-
as_dict() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
mu¶ Read-only property for the distribution’s mu parameter.
-
sample(rand)¶ Sample from the distribution.
- Parameters
rand – The random generator used to generate the sample.
-
property
sigma¶ Read-only property for the distribution’s sigma parameter.
-
Serialization functions¶
-
samplespace.distributions.distribution_from_list(as_list)¶ Build a distribution using a list of parameters.
Expects a list of values in the same form returned by a call to Distribution.as_list(); i.e. ['name', arg0, arg1, ...].
Examples
>>> tri = Triangular(2.0, 5.0)
>>> tri_as_list = tri.as_list()
>>> tri_as_list
['triangular', 2.0, 5.0]
>>> new_tri = distribution_from_list(tri_as_list)
>>> assert tri == new_tri
-
samplespace.distributions.distribution_from_dict(as_dict)¶ Build a distribution using a dictionary of keyword arguments.
Expects a dict in the same form as returned by a call to Distribution.as_dict(); i.e. {'distribution':'name', 'arg0':'value', 'arg1':'value', ...}.
Examples
>>> gauss = Gaussian(0.8, 5.0)
>>> gauss_as_dict = gauss.as_dict()
>>> gauss_as_dict
{'distribution': 'gaussian', 'mu': 0.8, 'sigma': 5.0}
>>> new_gauss = distribution_from_dict(gauss_as_dict)
>>> assert gauss == new_gauss
-
classmethod
samplespace.distribution.Distribution.from_list()¶ An alias for
distribution_from_list().
-
classmethod
samplespace.distribution.Distribution.from_dict()¶ An alias for
distribution_from_dict().
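The list and dict forms contain only plain values, which is what makes them suitable for config files. The toy sketch below shows the general round-trip idea with a name-to-class registry and a JSON round trip. Every name in it (ToyGaussian, REGISTRY, toy_from_list, toy_from_dict) is hypothetical; it is not how samplespace is implemented and does not use the library.

```python
import json


class ToyGaussian:
    """Hypothetical stand-in class; not samplespace's Gaussian."""
    name = 'gaussian'

    def __init__(self, mu=0.0, sigma=1.0):
        self.mu = mu
        self.sigma = sigma

    def as_list(self):
        # Name first, then the ordered parameters.
        return [self.name, self.mu, self.sigma]

    def as_dict(self):
        # 'distribution' holds the name; remaining keys are the kwargs.
        return {'distribution': self.name, 'mu': self.mu, 'sigma': self.sigma}

    def __eq__(self, other):
        return isinstance(other, ToyGaussian) and self.as_list() == other.as_list()


# Hypothetical registry mapping serialized names to classes.
REGISTRY = {ToyGaussian.name: ToyGaussian}


def toy_from_list(as_list):
    name, *args = as_list
    return REGISTRY[name](*args)


def toy_from_dict(as_dict):
    kwargs = dict(as_dict)  # copy so the caller's dict is not modified
    name = kwargs.pop('distribution')
    return REGISTRY[name](**kwargs)


original = ToyGaussian(0.8, 5.0)
from_list = toy_from_list(original.as_list())
# The dict form survives a trip through JSON text unchanged.
from_json = toy_from_dict(json.loads(json.dumps(original.as_dict())))
print(from_list == original, from_json == original)
```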
Examples¶
Sampling from a distribution:
import random
import statistics
from samplespace.distributions import Gaussian, FiniteGeometricCategorical
gauss = Gaussian(15.0, 2.0)
samples = [gauss.sample(random) for _ in range(100)]
print('Mean:', statistics.mean(samples))
print('Standard deviation:', statistics.stdev(samples))
# FiniteGeometricCategorical takes (population, s); FiniteGeometric takes (s, n).
geo = FiniteGeometricCategorical(['one', 'two', 'three', 'four', 'five'], 0.7)
samples = [geo.sample(random) for _ in range(10)]
print(' '.join(samples))
Using other random generators:
from samplespace import distributions, RepeatableRandomSequence
exponential = distributions.Exponential(0.8)
rrs = RepeatableRandomSequence(seed=12345)
print([exponential.sample(rrs) for _ in range(5)])
# Will always print:
# [1.1959827296976795, 0.6056492468915003, 0.9155454941988664, 0.5653478889068511, 0.6500080335986231]
Representations of distributions:
from samplespace.distributions import Pareto, DiscreteUniform, UniformCategorical
pareto = Pareto(2.5)
print('Pareto as dict:', pareto.as_dict()) # {'distribution': 'pareto', 'alpha': 2.5}
print('Pareto as list:', pareto.as_list()) # ['pareto', 2.5]
discrete = DiscreteUniform(3, 8)
print('Discrete uniform as dict:', discrete.as_dict()) # {'distribution': 'discreteuniform', 'min_val': 3, 'max_val': 8}
print('Discrete uniform as list:', discrete.as_list()) # ['discreteuniform', 3, 8]
cat = UniformCategorical(['string', 4, {'a':'dict'}])
print('Uniform categorical as dict:', cat.as_dict()) # {'distribution': 'uniformcategorical', 'population': ['string', 4, {'a': 'dict'}]}
print('Uniform categorical as list:', cat.as_list()) # ['uniformcategorical', ['string', 4, {'a': 'dict'}]]
Storing distributions in config files as lists:
import random
from samplespace import distributions
...
skeleton_config = {
    'name': 'Skeleton',
    'starting_hp': ['gaussian', 50.0, 5.0],
    'coins_dropped': ['geometric', 0.8, True],
}
...
class Skeleton(object):
    def __init__(self, name, starting_hp, coins_dropped_dist):
        self.name = name
        self.starting_hp = starting_hp
        self.coins_dropped_dist = coins_dropped_dist
    def drop_coins(self):
        return self.coins_dropped_dist.sample(random)
...
class SkeletonFactory(object):
    def __init__(self, config):
        self.name = config['name']
        self.starting_hp_dist = distributions.distribution_from_list(config['starting_hp'])
        self.coins_dropped_dist = distributions.distribution_from_list(config['coins_dropped'])
    def make_skeleton(self):
        return Skeleton(
            self.name,
            int(self.starting_hp_dist.sample(random)),
            self.coins_dropped_dist)
Storing distributions in config files as dictionaries:
from samplespace import distributions, RepeatableRandomSequence
city_config = {
    "building_distribution": {
        "distribution": "weightedcategorical",
        "items": [
            ["house", 0.2],
            ["store", 0.4],
            ["tree", 0.8],
            ["ground", 5.0]
        ]
    }
}
rrs = RepeatableRandomSequence()
building_dist = distributions.distribution_from_dict(city_config['building_distribution'])
buildings = [[building_dist.sample(rrs) for col in range(20)] for row in range(5)]
for row in buildings:
    for building_type in row:
        if building_type == 'house':
            print('H', end='')
        elif building_type == 'store':
            print('S', end='')
        elif building_type == 'tree':
            print('T', end='')
        else:
            print('.', end='')
    print()