fast_poibin package

Module contents

class fast_poibin.PoiBin(probabilities: Sequence[float] | ndarray[Any, dtype[floating[Any]]], mode: Literal['mixed'] | Literal['dp'] = 'mixed')[source]

Bases: object

Class for storing PMF and CDF of Poisson binomial distribution.

pmf

array of probability mass function

cdf

array of cumulative distribution function

Examples

>>> poibin = PoiBin([0.1, 0.2, 0.2])
>>> poibin.pmf
array([0.576, 0.352, 0.068, 0.004])
>>> poibin.cdf
array([0.576, 0.928, 0.996, 1.   ])

Note

This class stores data as arrays of numpy.float64 no matter which type of float is passed at initialization. This is partly because numpy.fft deals with only float64, and partly because it’s easier to implement. Please see also https://numpy.org/doc/1.24/reference/routines.fft.html#type-promotion.

Notes

The internal algorithm for mixed mode is based on the well-known divide-and-conquer approach for convolving many polynomials. Its complexities are as follows.

  • Time: O(N(logN)^2)

  • Space: O(N)

In the context of Poisson binomial distribution, that seems to be equivalent to the one proposed in the following paper.

Biscarri, William & Zhao, Sihai & Brunner, Robert. (2018). A simple and fast method for computing the Poisson binomial distribution function. Computational Statistics & Data Analysis. 122. 10.1016/j.csda.2018.01.007.

A downside of this approach is numerical precision. Since convolution by FFT is unreliable in the world under the float epsilon (i.e. 2^-52 ~ 2 * 10^-16), the relative errors of the computed PMF could be large though it is still not bad w.r.t. the absolute errors. For details, please see the following issue: https://github.com/privet-kitty/fast-poibin/issues/9. For situations where the relative precision is required, you may want to use dp mode instead, whose time complexity is O(N^2).

You should also note that neither mode can express a probability mass below the least positive double float (~ 5 * 10^-324).

quantile(p: Buffer | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes])[source]

Quantile function (inverse of cdf).

Parameters:

p – probability value(s)

Returns:

int or array of ints