samplespace.distributions - Serializable Probability Distributions
This module implements a number of useful probability distributions.
Each distribution can be sampled using any random number generator that provides at least the same functionality as the random module; this includes samplespace.repeatablerandom.RepeatableRandomSequence.
The classes in this module are primarily intended for storing information about random distributions in configuration files using Distribution.as_dict()/distribution_from_dict() or Distribution.as_list()/distribution_from_list().
See the Examples section for examples of how to do this.
Integer distributions
-
class
samplespace.distributions.
DiscreteUniform
(min_val: int, max_val: int)¶ Represents a discrete uniform distribution of integers in [min_val, max_val).
-
as_dict
() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list
() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
max_val
¶ Read-only property for the distribution’s upper limit.
-
property
min_val
¶ Read-only property for the distribution’s lower limit.
-
-
class
samplespace.distributions.
Geometric
(mean: float, include_zero: bool = False)¶ Represents a geometric distribution.
If include_zero is False, returns integers from 1 to infinity according to \(\text{Pr}(x = k) = p {(1 - p)}^{k - 1}\) where \(p = \frac{1}{\mathit{mean}}\).
If include_zero is True, returns integers from 0 to infinity according to \(\text{Pr}(x = k) = p {(1 - p)}^{k}\) where \(p = \frac{1}{\mathit{mean} + 1}\).
-
as_dict
() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list
() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
include_zero
¶ Read-only property for whether or not the distribution’s support includes zero.
-
property
mean
¶ Read-only property for the distribution’s mean.
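The relationship between mean and p in both parameterizations can be checked numerically. The following is a minimal sketch using only the standard library; the helper name sample_geometric and the inverse-transform method are assumptions made for illustration and are not necessarily how Geometric draws its samples.

```python
import math
import random


def sample_geometric(rand, mean, include_zero=False):
    """Hypothetical helper: inverse-transform sampling of a geometric variate."""
    # Success probability implied by the requested mean.
    p = 1.0 / (mean + 1.0) if include_zero else 1.0 / mean
    if p >= 1.0:
        # Degenerate case: the first trial always succeeds.
        return 0 if include_zero else 1
    u = 1.0 - rand.random()  # u lies in (0, 1], so log(u) is defined
    # Number of failures before the first success.
    failures = int(math.log(u) / math.log(1.0 - p))
    return failures if include_zero else failures + 1


rng = random.Random(42)
from_one = [sample_geometric(rng, 4.0) for _ in range(50_000)]
from_zero = [sample_geometric(rng, 4.0, include_zero=True) for _ in range(50_000)]

# Both sample means should land close to the requested mean of 4.0.
mean_from_one = sum(from_one) / len(from_one)
mean_from_zero = sum(from_zero) / len(from_zero)
print(mean_from_one, mean_from_zero)
```

Any object exposing random() can stand in for rng here, which mirrors how the distributions in this module accept a generator argument.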
-
-
class
samplespace.distributions.
FiniteGeometric
(s: float, n: int)¶ Represents a geometric-like distribution with exponent s and finite support {1, …, n}.
The finite geometric distribution is defined by the equation
\[\text{Pr}(x=k) = \frac{s^{k}}{\sum_{i=1}^{n} s^{i}}\]
The distribution is defined such that each result is s times as likely to occur as the previous; i.e. \(\text{Pr}(x=k) = s \text{Pr}(x=k-1)\) over the support.
-
as_dict
() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list
() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
n
¶ Read-only property for the number of values in the distribution’s support.
-
property
s
¶ Read-only property for the distribution’s s exponent.
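As a worked illustration of the definition, the probabilities can be computed directly from the formula. This is a sketch only; finite_geometric_pmf is a hypothetical helper and is not part of samplespace.

```python
def finite_geometric_pmf(s, n):
    """Return [Pr(x=1), ..., Pr(x=n)] with Pr(x=k) proportional to s**k."""
    weights = [s ** k for k in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


pmf = finite_geometric_pmf(0.5, 4)

# Each outcome is s = 0.5 times as likely as the one before it.
ratios = [pmf[k] / pmf[k - 1] for k in range(1, len(pmf))]
print(pmf)
print(ratios)
```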
-
-
class
samplespace.distributions.
ZipfMandelbrot
(s: float, q: float, n: int)¶ Represents a Zipf-Mandelbrot distribution with exponent s, offset q, and support {1, …, n}.
The Zipf-Mandelbrot distribution is defined by the equation
\[\text{Pr}(x=k) = \frac{(k+q)^{-s}}{\sum_{i=1}^{n} (i+q)^{-s}}\]
When q == 0, the distribution becomes the Zipf distribution, and as n increases, it approaches the Zeta distribution.
-
as_dict
() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list
() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
n
¶ Read-only property for the number of values in the distribution’s support.
-
property
q
¶ Read-only property for the distribution’s q offset.
-
property
s
¶ Read-only property for the distribution’s s exponent.
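The effect of the offset q can be seen by evaluating the formula for a small support. The helper below, zipf_mandelbrot_pmf, is a hypothetical name used only for this sketch.

```python
def zipf_mandelbrot_pmf(s, q, n):
    """Return [Pr(x=1), ..., Pr(x=n)] with Pr(x=k) proportional to (k + q)**-s."""
    weights = [(k + q) ** -s for k in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# With q == 0 the weights are 1, 1/2, 1/3, i.e. the Zipf distribution.
zipf = zipf_mandelbrot_pmf(1.0, 0.0, 3)

# A positive offset flattens the distribution: the first outcome is less dominant.
shifted = zipf_mandelbrot_pmf(1.0, 2.0, 3)
print(zipf)
print(shifted)
```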
-
-
class
samplespace.distributions.
Bernoulli
(p: float)¶ Represents a Bernoulli distribution with parameter p.
Returns True with probability p, and False otherwise.
-
as_dict
() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list
() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
p
¶ Read-only property for the distribution’s p parameter.
-
Categorical distributions
-
class
samplespace.distributions.
WeightedCategorical
(items: Optional[Sequence[Tuple[float, Any]]] = None, population: Optional[Sequence] = None, weights: Optional[Sequence[float]] = None, *, cum_weights: Optional[Sequence[float]] = None)¶ Represents a categorical distribution defined by a population and a list of relative weights.
Provide exactly one of the following: items; population and weights; or population and cum_weights.
- Parameters
items (Sequence[Tuple[Any]]) – A sequence of tuples in the format (weight, relative value).
population (Sequence) – A sequence of possible values.
weights (Sequence[float], optional) – A sequence of relative weights corresponding to each item in the population. Must be the same length as the population list.
cum_weights (Sequence[float], optional) – A sequence of cumulative weights corresponding to each item in the population. Must be the same length as the population list. Only one of weights and cum_weights should be provided.
-
as_dict
() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list
() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
cum_weights
¶ A read-only property for the distribution’s cumulative weights.
-
property
items
¶ A read-only property returning a sequence of tuples in the format (weight, relative value).
-
property
population
¶ A read-only property for the distribution’s population.
-
sample
(rand)¶ Sample from the distribution.
- Parameters
rand – The random generator used to generate the sample.
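The relationship between weights and cum_weights can be illustrated with the standard library's random.choices(), which accepts the same two forms. This sketch does not use WeightedCategorical itself, and whether the class relies on random.choices() internally is an assumption that is not confirmed here.

```python
import itertools
import random

population = ['house', 'store', 'tree', 'ground']
weights = [0.2, 0.4, 0.8, 5.0]

# Cumulative weights are the running totals of the relative weights.
cum_weights = list(itertools.accumulate(weights))

# Two generators with the same seed draw identical samples either way.
by_weights = random.Random(7).choices(population, weights=weights, k=10)
by_cum_weights = random.Random(7).choices(population, cum_weights=cum_weights, k=10)
print(by_weights == by_cum_weights)
```

Passing cumulative weights avoids recomputing the running totals on every call, which matters when the same population is sampled many times.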
-
class
samplespace.distributions.
UniformCategorical
(population: Sequence)¶ Represents a uniform categorical distribution over a given population.
-
as_dict
() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list
() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
population
¶ A read-only property for the distribution’s population.
-
sample
(rand)¶ Sample from the distribution.
- Parameters
rand – The random generator used to generate the sample.
-
-
class
samplespace.distributions.
FiniteGeometricCategorical
(population: Sequence, s: float)¶ Represents a categorical distribution with weights corresponding to a finite geometric-like distribution with exponent s.
The finite geometric distribution is defined by the equation
\[\text{Pr}(x=k) = \frac{s^{k}}{\sum_{i=1}^{n} s^{i}}\]
The distribution is defined such that each result is s times as likely to occur as the previous; i.e. \(\text{Pr}(x=k) = s \text{Pr}(x=k-1)\) over the support.
-
as_dict
() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list
() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
cum_weights
¶ A read-only property for the distribution’s cumulative weights.
-
property
items
¶ A read-only property returning a sequence of tuples in the format (weight, relative value).
-
property
n
¶ Read-only property for the number of values in the distribution’s support.
-
property
population
¶ A read-only property for the distribution’s population.
-
property
s
¶ Read-only property for the distribution’s s exponent.
-
sample
(rand)¶ Sample from the distribution.
- Parameters
rand – The random generator used to generate the sample.
-
-
class
samplespace.distributions.
ZipfMandelbrotCategorical
(population: Sequence, s: float, q: float)¶ Represents a categorical distribution with weights corresponding to a Zipf-Mandelbrot distribution with exponent s and offset q.
The Zipf-Mandelbrot distribution is defined by the equation
\[\text{Pr}(x=k) = \frac{(k+q)^{-s}}{\sum_{i=1}^{n} (i+q)^{-s}}\]
When q == 0, the distribution becomes the Zipf distribution, and as n increases, it approaches the Zeta distribution.
-
as_dict
() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list
() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
cum_weights
¶ A read-only property for the distribution’s cumulative weights.
-
property
items
¶ A read-only property returning a sequence of tuples in the format (weight, relative value).
-
property
n
¶ Read-only property for the number of values in the distribution’s support.
-
property
population
¶ A read-only property for the distribution’s population.
-
property
q
¶ Read-only property for the distribution’s q offset.
-
property
s
¶ Read-only property for the distribution’s s exponent.
-
sample
(rand)¶ Sample from the distribution.
- Parameters
rand – The random generator used to generate the sample.
-
Continuous distributions
-
class
samplespace.distributions.
Constant
(value)¶ Represents a distribution that always returns a constant value.
-
as_dict
() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list
() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
sample
(rand)¶ Sample from the distribution.
- Parameters
rand – The random generator used to generate the sample.
-
property
value
¶ Read-only property for the distribution’s constant value.
-
-
class
samplespace.distributions.
Uniform
(min_val: float = 0.0, max_val: float = 1.0)¶ Represents a continuous uniform distribution with support [min_val, max_val).
The uniform distribution is defined as
\[\begin{split}\text{P}(x) = \begin{cases} \frac{1}{b - a} & \text{for } x \in [a, b) \\ 0 & \text{otherwise} \end{cases}\end{split}\]
where \(a\) is min_val and \(b\) is max_val.
-
as_dict
() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list
() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
max_val
¶ Read-only property for the distribution’s upper limit.
-
property
min_val
¶ Read-only property for the distribution’s lower limit.
-
-
class
samplespace.distributions.
Gamma
(alpha: float, beta: float)¶ Represents a gamma distribution with parameters alpha and beta.
The gamma distribution is defined as
\[\text{P}(x) = \frac{x^{\alpha - 1} e^{-\frac{x}{\beta}}} {\Gamma(\alpha) \beta^{\alpha}}\]
Caution
This implementation defines its parameters to match random.gammavariate(). In the density above, beta acts as a scale parameter, whereas many common definitions of the gamma distribution (such as the alpha/beta form given on Wikipedia) treat beta as a rate parameter. Take care when setting alpha and beta!
-
property
alpha
¶ Read-only property for the distribution’s alpha parameter.
-
as_dict
() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list
() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
beta
¶ Read-only property for the distribution’s beta parameter.
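The practical consequence of the parametrization can be checked with random.gammavariate() directly. In the sketch below, the sample mean matches alpha * beta (beta as a scale) rather than alpha / beta (beta as a rate); the seed and sample size are arbitrary choices.

```python
import random

alpha, beta = 2.0, 3.0
rng = random.Random(1)
samples = [rng.gammavariate(alpha, beta) for _ in range(50_000)]
sample_mean = sum(samples) / len(samples)

scale_mean = alpha * beta  # expected mean when beta is a scale parameter
rate_mean = alpha / beta   # expected mean if beta were a rate parameter

print('sample mean:', sample_mean)
print('scale interpretation:', scale_mean)
print('rate interpretation:', rate_mean)
```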
-
property
-
class
samplespace.distributions.
Triangular
(low: float = 0.0, high: float = 1.0, mode: Optional[float] = None)¶ Represents a triangular distribution with lower limit low, upper limit high, and mode mode.
The triangular distribution is defined by
\[\begin{split}\text{P}(x) = \begin{cases} 0 & \text{for } x \lt l, \\ \frac{2(x-l)}{(h-l)(m-l)} & \text{for } l \le x \lt m, \\ \frac{2}{h-l} & \text{for } x = m, \\ \frac{2(h-x)}{(h-l)(h-m)} & \text{for } m \lt x \le h, \\ 0 & \text{for } h \lt x \end{cases}\end{split}\]
where \(l\) is low, \(h\) is high, and \(m\) is mode.
-
as_dict
() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list
() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
high
¶ Read-only property for the distribution’s upper bound.
-
property
low
¶ Read-only property for the distribution’s lower bound.
-
property
mode
¶ Read-only property for the distribution’s mode, if specified, otherwise None.
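The piecewise density of the triangular distribution can be written out as a short function and checked by numerical integration. This is a sketch only: triangular_pdf is a hypothetical helper, and treating an unspecified mode as the midpoint follows random.triangular() and is an assumption about how Triangular behaves.

```python
def triangular_pdf(x, low=0.0, high=1.0, mode=None):
    """Density of a triangular distribution on [low, high] peaking at mode."""
    if mode is None:
        # Assumption: default to the midpoint, as random.triangular() does.
        mode = (low + high) / 2.0
    if x < low or x > high:
        return 0.0
    if x < mode:
        return 2.0 * (x - low) / ((high - low) * (mode - low))
    if x == mode:
        return 2.0 / (high - low)
    return 2.0 * (high - x) / ((high - low) * (high - mode))


low, high, mode = 2.0, 5.0, 3.0
steps = 10_000
width = (high - low) / steps

# Midpoint-rule integration over the support; a valid density integrates to 1.
area = sum(
    triangular_pdf(low + (i + 0.5) * width, low, high, mode)
    for i in range(steps)
) * width
peak = triangular_pdf(mode, low, high, mode)
print(area, peak)
```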
-
-
class
samplespace.distributions.
UniformProduct
(n: int)¶ Represents a distribution whose values are the product of n variables uniformly distributed over [0, 1).
This distribution has the following PDF
\[\begin{split}\text{P}(x) = \begin{cases} \frac{(-1)^{n-1} \ln^{n-1}(x)}{(n - 1)!} & \text{for } x \in [0, 1) \\ 0 & \text{otherwise} \end{cases}\end{split}\]
-
as_dict
() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list
() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
n
¶ Read-only property for the number of uniformly distributed variables to multiply.
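A sample from this kind of distribution can be sketched by multiplying n draws from random(). The helper name sample_uniform_product is hypothetical, and this direct approach is an assumption rather than a description of how UniformProduct is implemented. Since the factors are independent with mean 1/2, the product has mean (1/2)^n, which gives a simple sanity check.

```python
import math
import random


def sample_uniform_product(rand, n):
    """Multiply n independent draws from [0, 1)."""
    return math.prod(rand.random() for _ in range(n))


rng = random.Random(3)
n = 3
samples = [sample_uniform_product(rng, n) for _ in range(50_000)]
sample_mean = sum(samples) / len(samples)

# Independent factors each have mean 1/2, so the product has mean (1/2)**n.
expected_mean = 0.5 ** n
print(sample_mean, expected_mean)
```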
-
-
class
samplespace.distributions.
LogNormal
(mu: float = 0.0, sigma: float = 1.0)¶ Represents a log-normal distribution with parameters mu and sigma.
The logarithms of values sampled from a log-normal distribution are normally distributed with mean mu and standard deviation sigma. The distribution is defined by
\[\text{P}(x) = \frac{1}{x \sigma \sqrt{2 \pi}} \exp{\left(- \frac{(\ln{x} - \mu)^2}{2 \sigma ^2}\right)}\]
-
as_dict
() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list
() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
mu
¶ Read-only property for the distribution’s mu parameter.
-
sample
(rand) → float¶ Sample from the distribution.
- Parameters
rand – The random generator used to generate the sample.
-
property
sigma
¶ Read-only property for the distribution’s sigma parameter.
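The statement that the logarithms of log-normal samples are normally distributed with mean mu and standard deviation sigma can be checked with the standard library's random.lognormvariate(). The sketch below does not use LogNormal itself; the seed and sample size are arbitrary.

```python
import math
import random
import statistics

mu, sigma = 1.0, 0.5
rng = random.Random(5)
samples = [rng.lognormvariate(mu, sigma) for _ in range(20_000)]

# Taking logarithms should recover a normal distribution with the same mu and sigma.
logs = [math.log(x) for x in samples]
log_mean = statistics.fmean(logs)
log_stdev = statistics.stdev(logs)
print(log_mean, log_stdev)
```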
-
-
class
samplespace.distributions.
Exponential
(lambd: float)¶ Represents an exponential distribution with rate lambd.
The probability density function for the exponential distribution is defined by \(\text{P}(x)= \lambda e^{-\lambda x}\) where \(x\ge 0\) and \(\lambda\) is lambd.
-
as_dict
() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list
() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
lambd
¶ Read-only property for the distribution’s rate parameter.
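Because lambd is a rate, the mean of the distribution is 1 / lambd. A minimal sketch of inverse-transform sampling makes this concrete; sample_exponential is a hypothetical helper, and the sampling method used by Exponential itself is not confirmed here.

```python
import math
import random


def sample_exponential(rand, lambd):
    """Invert the CDF 1 - exp(-lambd * x) at a uniform draw."""
    u = rand.random()  # u lies in [0, 1), so 1 - u is never zero
    return -math.log(1.0 - u) / lambd


rng = random.Random(9)
lambd = 0.8
samples = [sample_exponential(rng, lambd) for _ in range(50_000)]
sample_mean = sum(samples) / len(samples)
expected_mean = 1.0 / lambd
print(sample_mean, expected_mean)
```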
-
-
class
samplespace.distributions.
VonMises
(mu: float, kappa: float)¶ Represents a von Mises distribution with parameters mu and kappa.
Samples from a von Mises distribution represent randomly-chosen angles clustered around a mean angle. This distribution is an approximation to the wrapped normal distribution and is defined by
\[\text{P}(x) = \frac{e^{\kappa \cos{(x - \mu)}}}{2 \pi I_0(\kappa)}\]
where \(I_0(\kappa)\) is the modified Bessel function of order 0.
-
as_dict
() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list
() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
kappa
¶ Read-only property for the distribution’s kappa parameter.
-
property
mu
¶ Read-only property for the distribution’s mu parameter.
-
-
class
samplespace.distributions.
Beta
(alpha: float, beta: float)¶ Represents a beta distribution with parameters alpha and beta.
The beta distribution is defined by
\[\text{P}(x) = x^{\alpha - 1} (1 - x)^{\beta - 1} \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}\]
-
property
alpha
¶ Read-only property for the distribution’s alpha parameter.
-
as_dict
() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list
() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
beta
¶ Read-only property for the distribution’s beta parameter.
-
property
-
class
samplespace.distributions.
Pareto
(alpha: float)¶ Represents a Pareto distribution with shape parameter alpha and minimum value 1.
The Pareto distribution has PDF
\[\text{P}(x) = \frac{\alpha}{x^{\alpha + 1}}\]
for \(x \ge 1\).
-
property
alpha
¶ Read-only property for the distribution’s alpha parameter.
-
as_dict
() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list
() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
-
class
samplespace.distributions.
Weibull
(alpha: float, beta: float)¶ Represents a Weibull distribution with scale parameter alpha and shape parameter beta.
The distribution is defined by
\[\text{P}(x) = \frac{\beta}{\alpha} \left(\frac{x}{\alpha}\right)^{\beta-1} e^{-(x/\alpha)^{\beta}}\]
where \(x \ge 0\).
-
property
alpha
¶ Read-only property for the distribution’s alpha parameter.
-
as_dict
() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list
() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
beta
¶ Read-only property for the distribution’s beta parameter.
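The roles of the two Weibull parameters (alpha as scale, beta as shape) can be checked against the closed-form mean, alpha * Gamma(1 + 1 / beta). The sketch below uses the standard library's random.weibullvariate(), which takes its arguments in the same order; it does not use the Weibull class itself.

```python
import math
import random

alpha, beta = 2.0, 1.5  # scale and shape
rng = random.Random(11)
samples = [rng.weibullvariate(alpha, beta) for _ in range(50_000)]
sample_mean = sum(samples) / len(samples)

# Closed-form mean of a Weibull distribution with scale alpha and shape beta.
expected_mean = alpha * math.gamma(1.0 + 1.0 / beta)
print(sample_mean, expected_mean)
```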
-
property
-
class
samplespace.distributions.
Gaussian
(mu: float = 0.0, sigma: float = 1.0)¶ Represents a Gaussian distribution with parameters mu and sigma.
The Gaussian, or Normal distribution is defined by
\[\text{P}(x) = \frac{1}{\sigma \sqrt{2 \pi}} e^{-\frac{1}{2} \left(\frac{x - \mu}{\sigma}\right)^2}\]
-
as_dict
() → Dict¶ Return a representation of the distribution as a dict.
The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.
-
as_list
() → List¶ Return a representation of the distribution as a list.
The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.
-
property
mu
¶ Read-only property for the distribution’s mu parameter.
-
sample
(rand)¶ Sample from the distribution.
- Parameters
rand – The random generator used to generate the sample.
-
property
sigma
¶ Read-only property for the distribution’s sigma parameter.
-
Serialization functions
-
samplespace.distributions.
distribution_from_list
(as_list)¶ Build a distribution using a list of parameters.
Expects a list of values in the same form returned by a call to Distribution.as_list(); i.e. ['name', arg0, arg1, ...].
Examples
>>> tri = Triangular(2.0, 5.0)
>>> tri_as_list = tri.as_list()
>>> tri_as_list
['triangular', 2.0, 5.0]
>>> new_tri = distribution_from_list(tri_as_list)
>>> assert tri == new_tri
-
samplespace.distributions.
distribution_from_dict
(as_dict)¶ Build a distribution using a dictionary of keyword arguments.
Expects a dict in the same form as returned by a call to Distribution.as_dict(); i.e. {'distribution':'name', 'arg0':'value', 'arg1':'value', ...}.
Examples
>>> gauss = Gaussian(0.8, 5.0)
>>> gauss_as_dict = gauss.as_dict()
>>> gauss_as_dict
{'distribution': 'gaussian', 'mu': 0.8, 'sigma': 5.0}
>>> new_gauss = distribution_from_dict(gauss_as_dict)
>>> assert gauss == new_gauss
-
classmethod
samplespace.distributions.Distribution.
from_list
()¶ An alias for distribution_from_list().
-
classmethod
samplespace.distributions.Distribution.
from_dict
()¶ An alias for distribution_from_dict().
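To make the list and dict formats concrete, the following self-contained sketch shows one way such a round trip could work, using a name-to-class registry. Everything here (SketchDistribution, SketchGaussian, register, from_list, from_dict) is hypothetical and written for illustration; it is not the samplespace implementation.

```python
from dataclasses import dataclass, fields

_REGISTRY = {}  # hypothetical lookup table: lowercase class name -> class


def register(cls):
    """Record a class under its lowercase name so it can be rebuilt later."""
    _REGISTRY[cls.__name__.lower()] = cls
    return cls


class SketchDistribution:
    """Stand-in base class providing the two serialized forms."""

    def as_list(self):
        # ['name', arg0, arg1, ...] in declaration order.
        return [type(self).__name__.lower()] + [getattr(self, f.name) for f in fields(self)]

    def as_dict(self):
        # {'distribution': 'name', 'arg0': value, ...}
        result = {'distribution': type(self).__name__.lower()}
        result.update({f.name: getattr(self, f.name) for f in fields(self)})
        return result


@register
@dataclass(frozen=True)
class SketchGaussian(SketchDistribution):
    mu: float = 0.0
    sigma: float = 1.0


def from_list(as_list):
    name, *args = as_list
    return _REGISTRY[name](*args)


def from_dict(as_dict):
    kwargs = dict(as_dict)  # copy so the caller's dict is left untouched
    name = kwargs.pop('distribution')
    return _REGISTRY[name](**kwargs)


gauss = SketchGaussian(0.8, 5.0)
round_trip_list = from_list(gauss.as_list())
round_trip_dict = from_dict(gauss.as_dict())
print(gauss.as_list(), gauss.as_dict())
```

The list form depends on parameter order while the dict form depends on parameter names, so the dict form is the more robust choice when configuration files are edited by hand.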
Examples
Sampling from a distribution:
import random
import statistics
from samplespace.distributions import Gaussian, FiniteGeometricCategorical

# Sample 100 values from a Gaussian with mean 15 and standard deviation 2
gauss = Gaussian(15.0, 2.0)
samples = [gauss.sample(random) for _ in range(100)]
print('Mean:', statistics.mean(samples))
print('Standard deviation:', statistics.stdev(samples))

# A population plus an exponent s calls for the categorical variant
geo = FiniteGeometricCategorical(['one', 'two', 'three', 'four', 'five'], 0.7)
samples = [geo.sample(random) for _ in range(10)]
print(' '.join(samples))
Using other random generators:
from samplespace import distributions, RepeatableRandomSequence
exponential = distributions.Exponential(0.8)
rrs = RepeatableRandomSequence(seed=12345)
print([exponential.sample(rrs) for _ in range(5)])
# Will always print:
# [1.1959827296976795, 0.6056492468915003, 0.9155454941988664, 0.5653478889068511, 0.6500080335986231]
Representations of distributions:
from samplespace.distributions import Pareto, DiscreteUniform, UniformCategorical
pareto = Pareto(2.5)
print('Pareto as dict:', pareto.as_dict()) # {'distribution': 'pareto', 'alpha': 2.5}
print('Pareto as list:', pareto.as_list()) # ['pareto', 2.5]
discrete = DiscreteUniform(3, 8)
print('Discrete uniform as dict:', discrete.as_dict()) # {'distribution': 'discreteuniform', 'min_val': 3, 'max_val': 8}
print('Discrete uniform as list:', discrete.as_list()) # ['discreteuniform', 3, 8]
cat = UniformCategorical(['string', 4, {'a':'dict'}])
print('Uniform categorical as dict:', cat.as_dict()) # {'distribution': 'uniformcategorical', 'population': ['string', 4, {'a': 'dict'}]}
print('Uniform categorical as list:', cat.as_list()) # ['uniformcategorical', ['string', 4, {'a': 'dict'}]]
Storing distributions in config files as lists:
import random
from samplespace import distributions
...
skeleton_config = {
    'name': 'Skeleton',
    'starting_hp': ['gaussian', 50.0, 5.0],
    'coins_dropped': ['geometric', 0.8, True],
}
...
class Skeleton(object):
    def __init__(self, name, starting_hp, coins_dropped_dist):
        self.name = name
        self.starting_hp = starting_hp
        self.coins_dropped_dist = coins_dropped_dist

    def drop_coins(self):
        return self.coins_dropped_dist.sample(random)
...
class SkeletonFactory(object):
    def __init__(self, config):
        self.name = config['name']
        self.starting_hp_dist = distributions.distribution_from_list(config['starting_hp'])
        self.coins_dropped_dist = distributions.distribution_from_list(config['coins_dropped'])

    def make_skeleton(self):
        return Skeleton(
            self.name,
            int(self.starting_hp_dist.sample(random)),
            self.coins_dropped_dist)
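Because the list form contains only strings, numbers, and booleans, a configuration like the one above can be written to and read back from JSON without loss. The sketch below demonstrates only the JSON round trip using the standard library; the loaded lists could then be handed to distribution_from_list() as in the factory above.

```python
import json

skeleton_config = {
    'name': 'Skeleton',
    'starting_hp': ['gaussian', 50.0, 5.0],
    'coins_dropped': ['geometric', 0.8, True],
}

# Serialize to text (as would be written to a config file) and parse it back.
text = json.dumps(skeleton_config, indent=2)
loaded = json.loads(text)

# The parsed structure is equal to the original, including the boolean flag.
print(loaded == skeleton_config)
```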
Storing distributions in config files as dictionaries:
from samplespace import distributions, RepeatableRandomSequence
city_config = {
    "building_distribution": {
        "distribution": "weightedcategorical",
        "items": [
            ["house", 0.2],
            ["store", 0.4],
            ["tree", 0.8],
            ["ground", 5.0]
        ]
    }
}
rrs = RepeatableRandomSequence()
building_dist = distributions.distribution_from_dict(city_config['building_distribution'])
buildings = [[building_dist.sample(rrs) for col in range(20)] for row in range(5)]
for row in buildings:
    for building_type in row:
        if building_type == 'house':
            print('H', end='')
        elif building_type == 'store':
            print('S', end='')
        elif building_type == 'tree':
            print('T', end='')
        else:
            print('.', end='')
    print()