This library contains data structures to represent univariate and multivariate probability distributions and provides algorithms to operate on them. This includes estimating a distribution from data points and sampling from a distribution. vpdl is built on top of vnl and vnl_algo.
vpdl combines two programming paradigms: generic programming and polymorphic inheritance. The generic programming part is its own sub-library called vpdt.
vpdt is a template library (like STL). There is no compiled library, only a collection of header files in
vpdt works with vnl types, but in many cases can generalize to other types. The goal of vpdt is to provide generic implementations that are both time and memory efficient when types are known at compile time.
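As a rough illustration of the generic style described above, here is a minimal sketch of an algorithm written against a distribution type supplied as a template parameter. The names uniform_interval and log_likelihood are hypothetical and are not part of vpdt. The only assumption is that the distribution type offers a density() member function.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical distribution type (not a vpdt class). Any type with a
// compatible density() member can be used in its place.
struct uniform_interval {
  double lo, hi;
  double density(double x) const {
    assert(hi > lo);
    return (x >= lo && x <= hi) ? 1.0 / (hi - lo) : 0.0;
  }
};

// Generic algorithm: the distribution type is a template parameter, so the
// call to density() is resolved at compile time, with no virtual dispatch.
template <class Dist, class Iter>
double log_likelihood(const Dist& dist, Iter first, Iter last) {
  double sum = 0.0;
  for (; first != last; ++first)
    sum += std::log(dist.density(*first));  // -inf if a point has zero density
  return sum;
}
```

Because everything is a template, code in this style needs no compiled library. The trade-off is that the distribution type must be known where the algorithm is instantiated.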
The rest of vpdl uses a polymorphic design, which provides greater run-time flexibility and ease of use. It is limited to distributions over scalar, vnl_vector, and vnl_vector_fixed types.
Created by Manchester, vpdfl provided a polymorphic hierarchy (using virtual functions) for multivariate distributions based on
vnl_matrix types. For univariate distributions, pdf1d mirrored the design of vpdfl, but used scalar types (i.e. double). These libraries were very flexible at run time. Both distribution type and, in the case of vpdfl, dimension could be selected at run time.
Created by Brown, bsta provided a generic programming hierarchy (using templates) for both univariate and multivariate distributions. Template parameters specified scalar type (float or double) and dimension. Templates allowed the same code base to use scalars in the univariate case and
vnl_matrix_fixed in the multivariate case. The goal of bsta was to be very efficient. Many optimizations are possible by assuming types and dimension are known at compile time.
vpdl was designed as a core library to meet the needs of both original designs. It uses templates to select type and dimension at compile time, but for each selection of template parameters there is a polymorphic hierarchy. In addition, the default dimension is 0, which has the special meaning of "dimension determined at run time".
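The combination of compile-time parameters with a polymorphic hierarchy can be sketched as follows. This is a simplified illustration under assumed names (dist_base, unit_dist, dimension_demo), not the actual vpdl class hierarchy. It only shows how a default dimension of 0 can switch a class to storing its dimension at run time.

```cpp
#include <cassert>
#include <memory>

// Hypothetical names throughout; this is not the vpdl class hierarchy.
// Each choice of <T, n> gets its own polymorphic base class.
template <class T, unsigned n = 0>
class dist_base {
public:
  virtual ~dist_base() = default;
  virtual unsigned dimension() const = 0;  // resolved through the vtable
};

// Fixed dimension: n is a compile-time constant, so nothing is stored.
template <class T, unsigned n = 0>
class unit_dist : public dist_base<T, n> {
public:
  unsigned dimension() const override { return n; }
};

// n == 0: the dimension is a data member chosen at run time.
template <class T>
class unit_dist<T, 0> : public dist_base<T, 0> {
public:
  explicit unit_dist(unsigned dim) : dim_(dim) {}
  unsigned dimension() const override { return dim_; }
private:
  unsigned dim_;
};

// Example use: both objects are handled through base-class pointers.
inline void dimension_demo() {
  std::unique_ptr<dist_base<double, 3> > fixed(new unit_dist<double, 3>());
  std::unique_ptr<dist_base<double> > dynamic(new unit_dist<double>(5));
  assert(fixed->dimension() == 3);
  assert(dynamic->dimension() == 5);
}
```

In this sketch, dist_base<double, 3> and dist_base<double> are unrelated base classes, so run-time polymorphism applies within one choice of template parameters, not across them.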
vpdl_distribution <T,n>. Template parameter T specifies the numeric type (float or double) and n specifies the dimension.
vpdl_distribution <T,n> is derived from vpdl_base_traits <T,n>, which is a partially specialized class that defines the key data types for representation of vectors and matrices in each dimension. In particular:
vnl_matrix <T> (dimension specified at run time)
vnl_matrix_fixed <T,n,n> (fixed dimension of n)
vpdl_base_traits <T,n> also defines various functions to operate on these different types with a consistent API. These include functions to get/set the dimension, access a vector or matrix element, resize a vector or matrix, etc. For some template parameters these functions may do nothing, but their existence allows a single implementation of many functions on distributions without the need for further template specialization.
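The traits idea can be illustrated with a small sketch that uses standard containers in place of vnl types. The names base_traits_sketch, set_size, index, and make_constant are assumptions made for this example and do not reproduce the vpdl_base_traits interface. Using a plain scalar for dimension 1 is likewise an assumption of the sketch.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Hypothetical traits class, with std containers standing in for vnl types.
template <class T, unsigned n>
struct base_traits_sketch {
  typedef std::array<T, n> vector;
  static unsigned dimension(const vector&) { return n; }
  // Fixed size: resizing does nothing, but the function still exists.
  static void set_size(vector&, unsigned) {}
  static T& index(vector& v, unsigned i) { return v[i]; }
};

// n == 0: dimension determined at run time.
template <class T>
struct base_traits_sketch<T, 0> {
  typedef std::vector<T> vector;
  static unsigned dimension(const vector& v) {
    return static_cast<unsigned>(v.size());
  }
  static void set_size(vector& v, unsigned d) { v.resize(d); }
  static T& index(vector& v, unsigned i) { return v[i]; }
};

// n == 1: the "vector" is a plain scalar (an assumption of this sketch).
template <class T>
struct base_traits_sketch<T, 1> {
  typedef T vector;
  static unsigned dimension(const vector&) { return 1; }
  static void set_size(vector&, unsigned) {}
  static T& index(vector& v, unsigned) { return v; }
};

// Written once: the traits hide the differences between the three cases.
template <class T, unsigned n>
typename base_traits_sketch<T, n>::vector make_constant(unsigned dim, T value) {
  typedef base_traits_sketch<T, n> traits;
  typename traits::vector v{};
  traits::set_size(v, dim);  // only has an effect when n == 0
  for (unsigned i = 0; i < traits::dimension(v); ++i)
    traits::index(v, i) = value;
  return v;
}
```

Here make_constant has a single body even though the three cases store their data differently, which is the benefit the do-nothing functions are meant to provide.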
The following distributions are in vpdl:
vpdl_gaussian <T,n>: A general Gaussian (aka Normal) distribution
vpdl_gaussian_indep <T,n>: A Gaussian with axis-independent covariance
vpdl_gaussian_sphere <T,n>: A hyper-spherical Gaussian with a scalar variance
vpdl_mixture <T,n>: A polymorphic weighted mixture of distributions
vpdl_mixture_of <dist_t>: A weighted mixture of distributions of fixed type
vpdl_kernel_gaussian_sfbw <T,n>: A kernel distribution with fixed bandwidth using a spherically symmetric Gaussian kernel
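To make the relationship between some of these distributions concrete, the following sketch evaluates the densities they represent using only the standard library. It does not use the vpdl classes, and all names in it are hypothetical. It assumes that mixture weights are normalized by their sum and that the kernel bandwidth plays the role of a standard deviation. Neither assumption is confirmed by the text above.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

typedef std::vector<double> vec;

// Density at x of a spherical Gaussian with the given mean and covariance var * I.
inline double sphere_gaussian_density(const vec& x, const vec& mean, double var) {
  assert(x.size() == mean.size() && var > 0.0);
  const double two_pi = 6.283185307179586;
  double sqr_dist = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    const double d = x[i] - mean[i];
    sqr_dist += d * d;
  }
  const double dim = static_cast<double>(x.size());
  return std::pow(two_pi * var, -0.5 * dim) * std::exp(-0.5 * sqr_dist / var);
}

// One weighted mixture component (hypothetical struct).
struct component {
  double weight;
  vec mean;
  double var;
};

// Weighted mixture density, normalized by the total weight (an assumption).
inline double mixture_density(const std::vector<component>& comps, const vec& x) {
  double sum = 0.0, total_weight = 0.0;
  for (const component& c : comps) {
    sum += c.weight * sphere_gaussian_density(x, c.mean, c.var);
    total_weight += c.weight;
  }
  return total_weight > 0.0 ? sum / total_weight : 0.0;
}

// Fixed-bandwidth kernel density: an equal-weight mixture with one
// spherical Gaussian kernel centered on each sample.
inline double kernel_density(const std::vector<vec>& samples, double bandwidth,
                             const vec& x) {
  if (samples.empty()) return 0.0;
  double sum = 0.0;
  for (const vec& s : samples)
    sum += sphere_gaussian_density(x, s, bandwidth * bandwidth);
  return sum / static_cast<double>(samples.size());
}
```

Seen this way, a fixed-bandwidth kernel distribution is a special case of a mixture in which every component has the same weight and variance.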