vpdl : Probability Distribution Library

This library contains data structures to represent univariate and multivariate probability distributions and provides algorithms to operate on them. This includes estimating a distribution from data points and sampling from a distribution. vpdl is built on top of vnl and vnl_algo.

vpdl combines two programming paradigms: generic programming and polymorphic inheritance. The generic programming part is its own sub-library called vpdt. vpdt is a template library (like the STL): there is no compiled library, only a collection of header files in vpdl/vpdt. vpdt works with vnl types, but in many cases can generalize to other types. The goal of vpdt is to provide generic implementations that are both time and memory efficient when types are known at compile time.

The rest of vpdl uses a polymorphic design, which provides greater run-time flexibility and ease of use. It is limited to distributions over scalar, vnl_vector, and vnl_vector_fixed types.

History

vpdl merges two different design patterns for probability distributions. It was formed from three contrib libraries: mul/vpdfl, mul/pdf1d, and brl/bbas/bsta.

Created by Manchester, vpdfl provided a polymorphic hierarchy (using virtual functions) for multivariate distributions based on vnl_vector and vnl_matrix types. For univariate distributions, pdf1d mirrored the design of vpdfl, but used scalar types (i.e. double). These libraries were very flexible at run time. Both distribution type and, in the case of vpdfl, dimension could be selected at run time.

Created by Brown, bsta provided a generic programming hierarchy (using templates) for both univariate and multivariate distributions. Template parameters specified scalar type (float or double) and dimension. Templates allowed the same code base to use scalars in the univariate case and vnl_vector_fixed and vnl_matrix_fixed in the multivariate case. The goal of bsta was to be very efficient. Many optimizations are possible by assuming types and dimension are known at compile time.

vpdl was designed as a core library to meet the needs of both original designs. It uses templates to select type and dimension at compile time, but for each selection of template parameters there is a polymorphic hierarchy. In addition, the default dimension is 0, which has the special meaning of "dimension determined at run time".
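As a rough illustration of this combined design, the following minimal sketch shows a class template whose dimension parameter defaults to 0, with a virtual interface for each choice of template parameters. All names here (sketch_distribution, sketch_gaussian, dimension, name) are hypothetical and chosen for this example; they are not the vpdl API, and the real classes carry far more functionality.

```cpp
#include <string>

// Hypothetical base class: one polymorphic hierarchy exists for each
// choice of <T, n>. The default n = 0 means "dimension set at run time".
template <class T, unsigned n = 0>
class sketch_distribution
{
 public:
  virtual ~sketch_distribution() {}
  virtual unsigned dimension() const = 0;
  virtual std::string name() const = 0;
};

// Hypothetical derived class for a dimension fixed at compile time.
template <class T, unsigned n = 0>
class sketch_gaussian : public sketch_distribution<T, n>
{
 public:
  unsigned dimension() const { return n; }
  std::string name() const { return "gaussian"; }
};

// Partial specialization for n == 0: the dimension is stored as data
// and supplied when the object is constructed.
template <class T>
class sketch_gaussian<T, 0> : public sketch_distribution<T, 0>
{
 public:
  explicit sketch_gaussian(unsigned dim) : dim_(dim) {}
  unsigned dimension() const { return dim_; }
  std::string name() const { return "gaussian"; }

 private:
  unsigned dim_;
};
```

With these definitions, sketch_gaussian<double, 3> reports its dimension from the template parameter, while sketch_gaussian<double> takes it as a constructor argument. Either object can be used through a reference to its matching sketch_distribution base.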

Distributions

Each distribution is derived (directly or indirectly) from a common templated base class called vpdl_distribution<T,n>. Template parameter T specifies the numeric type (float or double) and n specifies the dimension. vpdl_distribution<T,n> is derived from vpdl_base_traits<T,n>, which is a partially specialized class that defines the key data types for representation of vectors and matrices in each dimension. In particular:

vpdl_base_traits<T,n> also defines various functions to operate on these different types with a consistent API. These include functions to get/set dimension, access a vector or matrix element, resize a vector or matrix, etc. For some template parameters these functions may do nothing, but their existence allows a single implementation of many functions on distributions without need for further template specialization.
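The idea of a partially specialized traits class with a consistent API can be sketched as follows. This is a self-contained illustration only: sketch_traits, its member names, and squared_norm are hypothetical, and standard library containers stand in for the vnl types. The mapping used here (scalar for n == 1, fixed-size vector for n > 1, resizable vector for n == 0) is an assumption inferred from the description above, not a statement of what vpdl_base_traits defines.

```cpp
#include <array>
#include <vector>

// Hypothetical traits class, NOT vpdl_base_traits.
// Primary template: dimension n > 1, fixed at compile time.
template <class T, unsigned n>
struct sketch_traits
{
  typedef std::array<T, n> vector;
  static unsigned dimension(const vector&) { return n; }
  // The size is part of the type, so resizing does nothing.
  static void set_size(vector&, unsigned) {}
  static T& index(vector& v, unsigned i) { return v[i]; }
  static const T& index(const vector& v, unsigned i) { return v[i]; }
};

// Specialization for n == 1: the "vector" is a plain scalar.
template <class T>
struct sketch_traits<T, 1>
{
  typedef T vector;
  static unsigned dimension(const vector&) { return 1; }
  static void set_size(vector&, unsigned) {}
  // The index is ignored; there is only one element.
  static T& index(vector& v, unsigned) { return v; }
  static const T& index(const vector& v, unsigned) { return v; }
};

// Specialization for n == 0: dimension determined at run time.
template <class T>
struct sketch_traits<T, 0>
{
  typedef std::vector<T> vector;
  static unsigned dimension(const vector& v)
  {
    return static_cast<unsigned>(v.size());
  }
  static void set_size(vector& v, unsigned dim) { v.resize(dim); }
  static T& index(vector& v, unsigned i) { return v[i]; }
  static const T& index(const vector& v, unsigned i) { return v[i]; }
};

// A single generic implementation that works for every specialization
// because it only touches the vector through the traits functions.
template <class T, unsigned n>
T squared_norm(const typename sketch_traits<T, n>::vector& v)
{
  typedef sketch_traits<T, n> traits;
  T sum = T(0);
  const unsigned d = traits::dimension(v);
  for (unsigned i = 0; i < d; ++i)
    sum += traits::index(v, i) * traits::index(v, i);
  return sum;
}
```

Because the vector type appears in a non-deduced context, the template arguments are given explicitly at the call site, for example squared_norm<double, 3>(a). Note that set_size is a no-op in two of the three cases, which is what lets squared_norm and similar functions be written once.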

The following distributions are in vpdl:

Lead

Matt Leotta is responsible for co-ordinating significant changes to vpdl. http://sourceforge.net/sendmessage.php?touser=857661