samplespace.distributions - Serializable Probability Distributions


This module implements a number of useful probability distributions.

Each distribution can be sampled using any random number generator providing at least the same functionality as the random module; this includes samplespace.repeatablerandom.RepeatableRandomSequence.

The classes in this module are primarily intended for storing information on random distributions in configuration files using Distribution.as_dict()/distribution_from_dict() or Distribution.as_list()/distribution_from_list(). See the Examples section for how to do this.

Integer distributions

class samplespace.distributions.DiscreteUniform(min_val: int, max_val: int)

Represents a discrete uniform distribution of integers in [min_val, max_val).

as_dict() → Dict

Return a representation of the distribution as a dict.

The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.

as_list() → List

Return a representation of the distribution as a list.

The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.

property max_val

Read-only property for the distribution’s upper limit.

property min_val

Read-only property for the distribution’s lower limit.

sample(rand) → int

Sample from the distribution.

Parameters

rand – The random generator used to generate the sample.

class samplespace.distributions.Geometric(mean: float, include_zero: bool = False)

Represents a geometric distribution.

If include_zero is False, returns integers from 1 to infinity according to \(\text{Pr}(x = k) = p {(1 - p)}^{k - 1}\) where \(p = \frac{1}{\mathit{mean}}\).

If include_zero is True, returns integers from 0 to infinity according to \(\text{Pr}(x = k) = p {(1 - p)}^{k}\) where \(p = \frac{1}{\mathit{mean} + 1}\).
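As a quick check of the two parametrizations above, the following sketch evaluates both mass functions in plain Python and sums them numerically. It does not use samplespace itself, and the helper name geometric_pmf is hypothetical; it is only meant to show that both choices of p give a distribution whose mean equals the mean argument.

```python
def geometric_pmf(k: int, mean: float, include_zero: bool = False) -> float:
    """Evaluate Pr(x = k) for the two parametrizations described above."""
    if include_zero:
        p = 1.0 / (mean + 1.0)
        return p * (1.0 - p) ** k if k >= 0 else 0.0
    p = 1.0 / mean
    return p * (1.0 - p) ** (k - 1) if k >= 1 else 0.0


# Truncated sums stand in for the infinite series; the tail is negligible here.
mean = 4.0
mean_from_one = sum(k * geometric_pmf(k, mean) for k in range(1, 2000))
mean_from_zero = sum(
    k * geometric_pmf(k, mean, include_zero=True) for k in range(0, 2000)
)
print(mean_from_one, mean_from_zero)
```

Both sums should come out very close to the requested mean, which is why p is 1/mean in one case and 1/(mean + 1) in the other.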

as_dict() → Dict

Return a representation of the distribution as a dict.

The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.

as_list() → List

Return a representation of the distribution as a list.

The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.

property include_zero

Read-only property for whether or not the distribution’s support includes zero.

property mean

Read-only property for the distribution’s mean.

sample(rand) → int

Sample from the distribution.

Parameters

rand – The random generator used to generate the sample.

class samplespace.distributions.FiniteGeometric(s: float, n: int)

Represents a geometric-like distribution with exponent s and finite support {1, …, n}.

The finite geometric distribution is defined by the equation

\[\text{Pr}(x=k) = \frac{s^{k}}{\sum_{i=1}^{n} s^{i}}\]

The distribution is defined such that each result is s times as likely to occur as the previous; i.e. \(\text{Pr}(x=k) = s \text{Pr}(x=k-1)\) over the support.
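The definition can be checked with a few lines of plain Python. This is a sketch of the formula only, not of how samplespace computes or samples the distribution, and finite_geometric_pmf is a hypothetical helper name.

```python
def finite_geometric_pmf(s: float, n: int) -> list:
    """Return [Pr(x=1), ..., Pr(x=n)] with weights proportional to s**k."""
    weights = [s ** k for k in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


pmf = finite_geometric_pmf(s=0.5, n=4)

# Each outcome is s times as likely as the one before it.
ratios = [pmf[i + 1] / pmf[i] for i in range(len(pmf) - 1)]
print(pmf)
print(ratios)
```

With s below 1 the probabilities shrink as k grows; with s above 1 they grow, so larger values become the most likely.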

as_dict() → Dict

Return a representation of the distribution as a dict.

The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.

as_list() → List

Return a representation of the distribution as a list.

The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.

property n

Read-only property for the number of values in the distribution’s support.

property s

Read-only property for the distribution’s s exponent.

sample(rand) → int

Sample from the distribution.

Parameters

rand – The random generator used to generate the sample.

class samplespace.distributions.ZipfMandelbrot(s: float, q: float, n: int)

Represents a Zipf-Mandelbrot distribution with exponent s, offset q, and support {1, …, n}.

The Zipf-Mandelbrot distribution is defined by the equation

\[\text{Pr}(x=k) = \frac{(k+q)^{-s}}{\sum_{i=1}^{n} (i+q)^{-s}}\]

When q == 0, the distribution becomes the Zipf distribution, and as n increases, it approaches the Zeta distribution.
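The mass function can be evaluated directly from the equation above. The sketch below is plain Python and independent of samplespace; zipf_mandelbrot_pmf is a hypothetical helper name used only for illustration.

```python
def zipf_mandelbrot_pmf(s: float, q: float, n: int) -> list:
    """Return [Pr(x=1), ..., Pr(x=n)] with weights (k + q) ** -s."""
    weights = [(k + q) ** -s for k in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# With q == 0 and s == 1 the weights are 1, 1/2, 1/3 (a Zipf distribution).
zipf = zipf_mandelbrot_pmf(s=1.0, q=0.0, n=3)

# A positive offset q flattens the head of the distribution.
flattened = zipf_mandelbrot_pmf(s=1.0, q=2.0, n=3)
print(zipf)
print(flattened)
```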

as_dict() → Dict

Return a representation of the distribution as a dict.

The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.

as_list() → List

Return a representation of the distribution as a list.

The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.

property n

Read-only property for the number of values in the distribution’s support.

property q

Read-only property for the distribution’s q offset.

property s

Read-only property for the distribution’s s exponent.

sample(rand) → int

Sample from the distribution.

Parameters

rand – The random generator used to generate the sample.

class samplespace.distributions.Bernoulli(p: float)

Represents a Bernoulli distribution with parameter p.

Returns True with probability p, and False otherwise.

as_dict() → Dict

Return a representation of the distribution as a dict.

The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.

as_list() → List

Return a representation of the distribution as a list.

The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.

property p

Read-only property for the distribution’s p parameter.

sample(rand) → bool

Sample from the distribution.

Parameters

rand – The random generator used to generate the sample.

Categorical distributions

class samplespace.distributions.WeightedCategorical(items: Optional[Sequence[Tuple[float, Any]]] = None, population: Optional[Sequence] = None, weights: Optional[Sequence[float]] = None, *, cum_weights: Optional[Sequence[float]] = None)

Represents a categorical distribution defined by a population and a list of relative weights.

Provide exactly one of the following: items alone, population together with weights, or population together with cum_weights.

Parameters
  • items (Sequence[Tuple[Any]]) – A sequence of tuples in the format (weight, relative value).

  • population (Sequence) – A sequence of possible values.

  • weights (Sequence[float], optional) – A sequence of relative weights corresponding to each item in the population. Must be the same length as the population list.

  • cum_weights (Sequence[float], optional) – A sequence of cumulative weights corresponding to each item in the population. Must be the same length as the population list. Only one of weights and cum_weights should be provided.
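The relationship between weights and cum_weights can be illustrated with the standard library, whose random.choices() accepts the same two forms. Whether samplespace delegates to random.choices() internally is an assumption not confirmed here; the sketch only shows that cumulative weights are the running totals of the relative weights and that both forms describe the same distribution.

```python
import itertools
import random

population = ['house', 'store', 'tree']
weights = [0.2, 0.4, 0.8]

# Cumulative weights are the running totals of the relative weights.
cum_weights = list(itertools.accumulate(weights))

# With identically seeded generators, both forms draw the same values.
from_weights = random.Random(42).choices(population, weights=weights, k=10)
from_cum = random.Random(42).choices(population, cum_weights=cum_weights, k=10)
print(from_weights == from_cum)
```

Supplying cumulative weights up front can save the accumulation step when the same weights are reused for many draws.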

as_dict() → Dict

Return a representation of the distribution as a dict.

The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.

as_list() → List

Return a representation of the distribution as a list.

The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.

property cum_weights

A read-only property for the distribution’s cumulative weights.

property items

A read-only property returning a sequence of tuples in the format (weight, relative value).

property population

A read-only property for the distribution’s population.

sample(rand)

Sample from the distribution.

Parameters

rand – The random generator used to generate the sample.

class samplespace.distributions.UniformCategorical(population: Sequence)

Represents a uniform categorical distribution over a given population.

as_dict() → Dict

Return a representation of the distribution as a dict.

The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.

as_list() → List

Return a representation of the distribution as a list.

The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.

property population

A read-only property for the distribution’s population.

sample(rand)

Sample from the distribution.

Parameters

rand – The random generator used to generate the sample.

class samplespace.distributions.FiniteGeometricCategorical(population: Sequence, s: float)

Represents a categorical distribution with weights corresponding to a finite geometric-like distribution with exponent s.

The finite geometric distribution is defined by the equation

\[\text{Pr}(x=k) = \frac{s^{k}}{\sum_{i=1}^{n} s^{i}}\]

The distribution is defined such that each result is s times as likely to occur as the previous; i.e. \(\text{Pr}(x=k) = s \text{Pr}(x=k-1)\) over the support.

as_dict() → Dict

Return a representation of the distribution as a dict.

The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.

as_list() → List

Return a representation of the distribution as a list.

The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.

property cum_weights

A read-only property for the distribution’s cumulative weights.

property items

A read-only property returning a sequence of tuples in the format (weight, relative value).

property n

Read-only property for the number of values in the distribution’s support.

property population

A read-only property for the distribution’s population.

property s

Read-only property for the distribution’s s exponent.

sample(rand)

Sample from the distribution.

Parameters

rand – The random generator used to generate the sample.

class samplespace.distributions.ZipfMandelbrotCategorical(population: Sequence, s: float, q: float)

Represents a categorical distribution with weights corresponding to a Zipf-Mandelbrot distribution with exponent s and offset q.

The Zipf-Mandelbrot distribution is defined by the equation

\[\text{Pr}(x=k) = \frac{(k+q)^{-s}}{\sum_{i=1}^{n} (i+q)^{-s}}\]

When q == 0, the distribution becomes the Zipf distribution, and as n increases, it approaches the Zeta distribution.

as_dict() → Dict

Return a representation of the distribution as a dict.

The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.

as_list() → List

Return a representation of the distribution as a list.

The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.

property cum_weights

A read-only property for the distribution’s cumulative weights.

property items

A read-only property returning a sequence of tuples in the format (weight, relative value).

property n

Read-only property for the number of values in the distribution’s support.

property population

A read-only property for the distribution’s population.

property q

Read-only property for the distribution’s q offset.

property s

Read-only property for the distribution’s s exponent.

sample(rand)

Sample from the distribution.

Parameters

rand – The random generator used to generate the sample.

Continuous distributions

class samplespace.distributions.Constant(value)

Represents a distribution that always returns a constant value.

as_dict() → Dict

Return a representation of the distribution as a dict.

The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.

as_list() → List

Return a representation of the distribution as a list.

The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.

sample(rand)

Sample from the distribution.

Parameters

rand – The random generator used to generate the sample.

property value

Read-only property for the distribution’s constant value.

class samplespace.distributions.Uniform(min_val: float = 0.0, max_val: float = 1.0)

Represents a continuous uniform distribution with support [min_val, max_val).

The uniform distribution is defined as

\[\begin{split}\text{P}(x) = \begin{cases} \frac{1}{b - a} & \text{for } x \in [a, b) \\ 0 & \text{otherwise} \end{cases}\end{split}\]

as_dict() → Dict

Return a representation of the distribution as a dict.

The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.

as_list() → List

Return a representation of the distribution as a list.

The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.

property max_val

Read-only property for the distribution’s upper limit.

property min_val

Read-only property for the distribution’s lower limit.

sample(rand) → float

Sample from the distribution.

Parameters

rand – The random generator used to generate the sample.

class samplespace.distributions.Gamma(alpha: float, beta: float)

Represents a gamma distribution with parameters alpha and beta.

The gamma distribution is defined as

\[\text{P}(x) = \frac{x^{\alpha - 1} e^{-\frac{x}{\beta}}} {\Gamma(\alpha) \beta^{\alpha}}\]

Caution

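This implementation defines its parameters to match random.gammavariate(), in which beta is a scale parameter, as in the density above. This parametrization differs from many common definitions of the gamma distribution, such as those on Wikipedia, where beta often denotes a rate (the reciprocal of the scale). Take care when setting alpha and beta!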
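To make the parametrization concrete, the sketch below draws from random.gammavariate() directly, which the caution above names as the reference for alpha and beta. With beta acting as a scale, the mean is alpha * beta; a source that treats beta as a rate would predict alpha / beta for the same numbers. The seed, sample count, and variable names are arbitrary choices for illustration.

```python
import random
import statistics

alpha, beta = 2.0, 3.0  # shape and scale, as in random.gammavariate()

rng = random.Random(0)
samples = [rng.gammavariate(alpha, beta) for _ in range(20000)]

# With a scale parameter the theoretical mean is alpha * beta (6.0 here).
sample_mean = statistics.mean(samples)
print(sample_mean)

# Converting from a shape/rate pair: the scale is the reciprocal of the rate.
rate = 0.5
scale = 1.0 / rate
print(scale)
```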

property alpha

Read-only property for the distribution’s alpha parameter.

as_dict() → Dict

Return a representation of the distribution as a dict.

The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.

as_list() → List

Return a representation of the distribution as a list.

The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.

property beta

Read-only property for the distribution’s beta parameter.

sample(rand) → float

Sample from the distribution.

Parameters

rand – The random generator used to generate the sample.

class samplespace.distributions.Triangular(low: float = 0.0, high: float = 1.0, mode: Optional[float] = None)

Represents a triangular distribution with lower limit low, upper limit high, and mode mode.

The triangular distribution is defined by

\[\begin{split}\text{P}(x) = \begin{cases} 0 & \text{for } x \lt l, \\ \frac{2(x-l)}{(h-l)(m-l)} & \text{for } l \le x \lt m, \\ \frac{2}{h-l} & \text{for } x = m, \\ \frac{2(h-x)}{(h-l)(h-m)} & \text{for } m \lt x \le h, \\ 0 & \text{for } h \lt x \end{cases}\end{split}\]

as_dict() → Dict

Return a representation of the distribution as a dict.

The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.

as_list() → List

Return a representation of the distribution as a list.

The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.

property high

Read-only property for the distribution’s upper bound.

property low

Read-only property for the distribution’s lower bound.

property mode

Read-only property for the distribution’s mode, if specified, otherwise None.

sample(rand) → float

Sample from the distribution.

Parameters

rand – The random generator used to generate the sample.

class samplespace.distributions.UniformProduct(n: int)

Represents a distribution whose values are the product of n uniformly distributed variables.

This distribution has the following PDF

\[\begin{split}\text{P}(x) = \begin{cases} \frac{(-1)^{n-1} \log^{n-1}(x)}{(n - 1)!} & \text{for } x \in [0, 1) \\ 0 & \text{otherwise} \end{cases}\end{split}\]

as_dict() → Dict

Return a representation of the distribution as a dict.

The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.

as_list() → List

Return a representation of the distribution as a list.

The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.

property n

Read-only property for the number of uniformly distributed variables to multiply.

sample(rand) → float

Sample from the distribution.

Parameters

rand – The random generator used to generate the sample.
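To illustrate what UniformProduct describes, the sketch below multiplies n draws from random.random() directly. This is the textbook construction, assumed here for illustration; it is not necessarily how samplespace samples internally, and sample_uniform_product is a hypothetical helper name.

```python
import math
import random


def sample_uniform_product(rng, n: int) -> float:
    """Multiply n independent draws from the uniform distribution on [0, 1)."""
    return math.prod(rng.random() for _ in range(n))


rng = random.Random(7)
n = 3
samples = [sample_uniform_product(rng, n) for _ in range(20000)]

# Each factor has mean 1/2, so the product of n independent factors
# has mean 2 ** -n (0.125 for n == 3).
sample_mean = sum(samples) / len(samples)
print(sample_mean)
```

As n grows, the samples concentrate ever closer to zero, which matches the logarithmic singularity of the density at the origin.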

class samplespace.distributions.LogNormal(mu: float = 0.0, sigma: float = 1.0)

Represents a log-normal distribution with parameters mu and sigma.

The logarithms of values sampled from a log-normal distribution are normally distributed with mean mu and standard deviation sigma. The distribution is defined by

\[\text{P}(x) = \frac{1}{x \sigma \sqrt{2 \pi}} \exp{\left(- \frac{(\ln{x} - \mu)^2}{2 \sigma ^2}\right)}\]

as_dict() → Dict

Return a representation of the distribution as a dict.

The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.

as_list() → List

Return a representation of the distribution as a list.

The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.

property mu

Read-only property for the distribution’s mu parameter.

sample(rand) → float

Sample from the distribution.

Parameters

rand – The random generator used to generate the sample.

property sigma

Read-only property for the distribution’s sigma parameter.
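The meaning of LogNormal's mu and sigma can be checked with the standard library's random.lognormvariate(), which uses the same convention: mu and sigma describe the logarithm of the samples, not the samples themselves. That samplespace delegates to this function is an assumption; the seed and sample count below are arbitrary.

```python
import math
import random
import statistics

mu, sigma = 1.0, 0.5

rng = random.Random(1)
samples = [rng.lognormvariate(mu, sigma) for _ in range(20000)]

# Taking logarithms recovers an approximately normal sample with
# mean mu and standard deviation sigma.
logs = [math.log(x) for x in samples]
log_mean = statistics.mean(logs)
log_stdev = statistics.stdev(logs)
print(log_mean, log_stdev)
```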

class samplespace.distributions.Exponential(lambd: float)

Represents an exponential distribution with rate lambd.

The probability density function for the exponential distribution is defined by \(\text{P}(x)= \lambda e^{-\lambda x}\) where \(x\ge 0\).

as_dict() → Dict

Return a representation of the distribution as a dict.

The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.

as_list() → List

Return a representation of the distribution as a list.

The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.

property lambd

Read-only property for the distribution’s rate parameter.

sample(rand) → float

Sample from the distribution.

Parameters

rand – The random generator used to generate the sample.

class samplespace.distributions.VonMises(mu: float, kappa: float)

Represents a von Mises distribution with parameters mu and kappa.

Samples from a von Mises distribution represent randomly-chosen angles clustered around a mean angle. This distribution is an approximation to the wrapped normal distribution and is defined by

\[\text{P}(x) = \frac{e^{\kappa \cos{(x - \mu)}}}{2 \pi I_0(\kappa)}\]

where \(I_0(\kappa)\) is the modified Bessel function of order 0.

as_dict() → Dict

Return a representation of the distribution as a dict.

The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.

as_list() → List

Return a representation of the distribution as a list.

The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.

property kappa

Read-only property for the distribution’s kappa parameter.

property mu

Read-only property for the distribution’s mu parameter.

sample(rand) → float

Sample from the distribution.

Parameters

rand – The random generator used to generate the sample.

class samplespace.distributions.Beta(alpha: float, beta: float)

Represents a beta distribution with parameters alpha and beta.

The beta distribution is defined by

\[\text{P}(x) = x^{\alpha - 1} (1 - x)^{\beta - 1} \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}\]

property alpha

Read-only property for the distribution’s alpha parameter.

as_dict() → Dict

Return a representation of the distribution as a dict.

The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.

as_list() → List

Return a representation of the distribution as a list.

The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.

property beta

Read-only property for the distribution’s beta parameter.

sample(rand) → float

Sample from the distribution.

Parameters

rand – The random generator used to generate the sample.

class samplespace.distributions.Pareto(alpha: float)

Represents a Pareto distribution with shape parameter alpha and minimum value 1.

The Pareto distribution has PDF

\[\text{P}(x) = \frac{\alpha}{x^{\alpha + 1}}\]

property alpha

Read-only property for the distribution’s alpha parameter.

as_dict() → Dict

Return a representation of the distribution as a dict.

The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.

as_list() → List

Return a representation of the distribution as a list.

The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.

sample(rand) → float

Sample from the distribution.

Parameters

rand – The random generator used to generate the sample.

class samplespace.distributions.Weibull(alpha: float, beta: float)

Represents a Weibull distribution with scale parameter alpha and shape parameter beta.

The distribution is defined by

\[\text{P}(x) = \frac{\beta}{\alpha} \left(\frac{x}{\alpha}\right)^{\beta-1} e^{-(x/\alpha)^{\beta}}\]

where \(x \ge 0\).
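The roles of the two parameters can be seen by evaluating the density directly, reading the exponent in the formula as the shape parameter beta. The sketch below is plain Python, independent of samplespace, and weibull_pdf is a hypothetical helper name. It checks the well-known special case that a Weibull distribution with shape 1 is an exponential distribution with rate 1/alpha.

```python
import math


def weibull_pdf(x: float, alpha: float, beta: float) -> float:
    """Weibull density with scale alpha and shape beta, for x > 0.

    The single point x == 0 is treated as 0 to avoid dividing by zero
    when beta < 1.
    """
    if x <= 0:
        return 0.0
    return (beta / alpha) * (x / alpha) ** (beta - 1) * math.exp(-((x / alpha) ** beta))


alpha = 2.0
x = 1.5

# With beta == 1 the density reduces to an exponential with rate 1 / alpha.
exponential_pdf = (1 / alpha) * math.exp(-x / alpha)
matches = math.isclose(weibull_pdf(x, alpha, 1.0), exponential_pdf)
print(matches)
```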

property alpha

Read-only property for the distribution’s alpha parameter.

as_dict() → Dict

Return a representation of the distribution as a dict.

The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.

as_list() → List

Return a representation of the distribution as a list.

The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.

property beta

Read-only property for the distribution’s beta parameter.

sample(rand) → float

Sample from the distribution.

Parameters

rand – The random generator used to generate the sample.

class samplespace.distributions.Gaussian(mu: float = 0.0, sigma: float = 1.0)

Represents a Gaussian distribution with parameters mu and sigma.

The Gaussian, or normal, distribution is defined by

\[\text{P}(x) = \frac{1}{\sigma \sqrt{2 \pi}} e^{-\frac{1}{2} \left(\frac{x - \mu}{\sigma}\right)^2}\]

as_dict() → Dict

Return a representation of the distribution as a dict.

The ‘distribution’ key is the name of the distribution, and the remaining keys are the distribution’s kwargs.

as_list() → List

Return a representation of the distribution as a list.

The first element of the list is the name of the distribution, and the subsequent elements are ordered parameters.

property mu

Read-only property for the distribution’s mu parameter.

sample(rand)

Sample from the distribution.

Parameters

rand – The random generator used to generate the sample.

property sigma

Read-only property for the distribution’s sigma parameter.

Serialization functions

samplespace.distributions.distribution_from_list(as_list)

Build a distribution using a list of parameters.

Expects a list of values in the same form as returned by a call to Distribution.as_list(); i.e. ['name', arg0, arg1, ...].

Examples

>>> tri = Triangular(2.0, 5.0)
>>> tri_as_list = tri.as_list()
>>> tri_as_list
['triangular', 2.0, 5.0]
>>> new_tri = distribution_from_list(tri_as_list)
>>> assert tri == new_tri

Raises
  • KeyError – if the distribution name is not recognized.

  • TypeError – if the wrong number of arguments is given.

samplespace.distributions.distribution_from_dict(as_dict)

Build a distribution using a dictionary of keyword arguments.

Expects a dict in the same form as returned by a call to Distribution.as_dict(); i.e. {'distribution':'name', 'arg0':'value', 'arg1':'value', ...}.

Examples

>>> gauss = Gaussian(0.8, 5.0)
>>> gauss_as_dict = gauss.as_dict()
>>> gauss_as_dict
{'distribution': 'gaussian', 'mu': 0.8, 'sigma': 5.0}
>>> new_gauss = distribution_from_dict(gauss_as_dict)
>>> assert gauss == new_gauss

Raises
  • KeyError – if the distribution name is not recognized or provided.

  • TypeError – if the wrong keyword arguments are provided.
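Because the dict form contains only a name and plain parameter values, it can be written to and read back from a JSON configuration file unchanged. The sketch below uses only the standard library and the literal dict from the example above, so it runs without samplespace installed.

```python
import json

# The dict form shown in the example above, written out literally.
gauss_as_dict = {'distribution': 'gaussian', 'mu': 0.8, 'sigma': 5.0}

# Serialize to JSON text, as it might be stored in a config file.
text = json.dumps(gauss_as_dict, indent=2)

# Reading the text back yields an equal dict.
loaded = json.loads(text)
print(loaded == gauss_as_dict)

# With samplespace installed, `loaded` could then be handed to
# distribution_from_dict() to rebuild the distribution.
```

Categorical distributions whose populations contain values that are not JSON-serializable would need a different storage format or a custom encoder.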

classmethod samplespace.distributions.Distribution.from_list()

An alias for distribution_from_list().

classmethod samplespace.distributions.Distribution.from_dict()

An alias for distribution_from_dict().

Examples

Sampling from a distribution:

import random
import statistics

from samplespace.distributions import Gaussian, FiniteGeometricCategorical

gauss = Gaussian(15.0, 2.0)
samples = [gauss.sample(random) for _ in range(100)]
print('Mean:', statistics.mean(samples))
print('Standard deviation:', statistics.stdev(samples))

# FiniteGeometricCategorical takes a population; FiniteGeometric takes (s, n).
geo = FiniteGeometricCategorical(['one', 'two', 'three', 'four', 'five'], 0.7)
samples = [geo.sample(random) for _ in range(10)]
print(' '.join(samples))

Using other random generators:

from samplespace import distributions, RepeatableRandomSequence

exponential = distributions.Exponential(0.8)

rrs = RepeatableRandomSequence(seed=12345)
print([exponential.sample(rrs) for _ in range(5)])
# Will always print:
# [1.1959827296976795, 0.6056492468915003, 0.9155454941988664, 0.5653478889068511, 0.6500080335986231]

Representations of distributions:

from samplespace.distributions import Pareto, DiscreteUniform, UniformCategorical

pareto = Pareto(2.5)
print('Pareto as dict:', pareto.as_dict())  # {'distribution': 'pareto', 'alpha': 2.5}
print('Pareto as list:', pareto.as_list())  # ['pareto', 2.5]

discrete = DiscreteUniform(3, 8)
print('Discrete uniform as dict:', discrete.as_dict())  # {'distribution': 'discreteuniform', 'min_val': 3, 'max_val': 8}
print('Discrete uniform as list:', discrete.as_list())  # ['discreteuniform', 3, 8]

cat = UniformCategorical(['string', 4, {'a':'dict'}])
print('Uniform categorical as dict:', cat.as_dict())  # {'distribution': 'uniformcategorical', 'population': ['string', 4, {'a': 'dict'}]}
print('Uniform categorical as list:', cat.as_list())  # ['uniformcategorical', ['string', 4, {'a': 'dict'}]]

Storing distributions in config files as lists:

import random
from samplespace import distributions

...

skeleton_config = {
    'name': 'Skeleton',
    'starting_hp': ['gaussian', 50.0, 5.0],
    'coins_dropped': ['geometric', 0.8, True],
}

...

class Skeleton(object):
    def __init__(self, name, starting_hp, coins_dropped_dist):
        self.name = name
        self.starting_hp = starting_hp
        self.coins_dropped_dist = coins_dropped_dist

    def drop_coins(self):
        return self.coins_dropped_dist.sample(random)

...

class SkeletonFactory(object):

    def __init__(self, config):
        self.name = config['name']
        self.starting_hp_dist = distributions.distribution_from_list(config['starting_hp'])
        self.coins_dropped_dist = distributions.distribution_from_list(config['coins_dropped'])

    def make_skeleton(self):
        return Skeleton(
            self.name,
            int(self.starting_hp_dist.sample(random)),
            self.coins_dropped_dist)

Storing distributions in config files as dictionaries:

from samplespace import distributions, RepeatableRandomSequence

city_config = {
    "building_distribution": {
        "distribution": "weightedcategorical",
        "items": [
            ["house", 0.2],
            ["store", 0.4],
            ["tree", 0.8],
            ["ground", 5.0]
        ]
    }
}

rrs = RepeatableRandomSequence()
building_dist = distributions.distribution_from_dict(city_config['building_distribution'])

buildings = [[building_dist.sample(rrs) for col in range(20)] for row in range(5)]

for row in buildings:
    for building_type in row:
        if building_type == 'house':
            print('H', end='')
        elif building_type == 'store':
            print('S', end='')
        elif building_type == 'tree':
            print('T', end='')
        else:
            print('.', end='')
    print()