
2019-11

Discretization of Probability Distributions: Random, Deterministic and Randomistic Sampling

Hugo Hernandez
ForsChem Research, 050030 Medellin, Colombia
hugo.hernandez@forschem.org

doi: 10.13140/RG.2.2.11389.92643

Abstract

Sampling procedures are commonly used to extract a finite number of elements from a
particular probability distribution. This discretization of the probability distribution is usually
performed using pseudo-random number generators. This type of discretization, known as
random sampling, requires suitable functions for transforming standard uniform random
numbers into random numbers following any arbitrary probability distribution. While random
sampling resembles the natural behavior of experimentation, individual samples do not
necessarily preserve all the properties of the original probability distribution. Those properties
include the cumulative probability and the moments of the distribution. The match between
the cumulative probability observed in a sample and that of the original distribution can be
determined using the random goodness-of-fit criterion. Random samples seldom achieve a
100% fit to the original distribution. Deterministic sampling methods, on the other hand, always
present a 100% random goodness-of-fit, but their values are always the same, depending on the
size of the sample. One particular case of deterministic sampling is optimal sampling, which
not only ensures goodness-of-fit but also preserves the moments of the original distribution.
Finally, randomistic sampling combines the fitness of deterministic sampling with the changing
behavior of random samples, resulting in an interesting alternative for representing random
variables, particularly in applications involving Monte Carlo methods, where the sample is
expected to represent the properties of the full distribution.

Keywords

Cumulative Probability, Discretization, Heuristics, Inverse Transform, Monte Carlo,


Optimization, Probability Distribution, Pseudo-random Numbers, Randomistics, Randomness,
Sampling, Standard Random Variables.


1. Introduction

In a previous report, different methods for obtaining probability density functions from a data
sample were described and discussed.[1] The purpose of those methods was finding the best
probability density function fitting a particular set of data. It was shown that several different
probability density functions are capable of describing a data sample, particularly when the
sample size is small. The purpose of the present report is describing and discussing some
methods for doing the opposite, that is, obtaining data samples from probability distributions,
and particularly, from probability density functions. These procedures, usually known as
sampling procedures, transform the probability distribution of a certain random variable into a
discrete number of data points representing the random variable. Three different types of
discretization methods of the probability distribution will be considered: Random,
Deterministic and Randomistic discretization methods. Random sampling methods use pseudo-
random number generation algorithms for determining the values of the elements in a sample
following the particular probability distribution.[2] Thus, each time a discretization is made, the
values of the elements obtained will be different, although some properties of the original
distribution (moments) are approximately conserved. More details about random sampling will
be presented in Section 2. Deterministic sampling, on the other hand, will always result in the
same set of elements, as long as the sample size and the original distribution are the same.
Deterministic sampling can be classified into Heuristic and Optimal. Randomistic sampling
combines certain elements of deterministic sampling with random sampling. Particularly, it
defines equiprobable intervals just as deterministic sampling, but for each interval it randomly
chooses a number within the interval. More details about deterministic and randomistic
discretization of probability distributions will be introduced in Section 3. Finally, various
comparative examples of probability distributions discretized using different sampling
methods are presented in Section 4. Even though the description of the different sampling
methods is done for continuous probability distributions, they can also be extended to discrete
probability distributions.

2. Random Sampling

2.1. True Random vs. Pseudo-Random Number Generation

As it was previously mentioned, random sampling consists of obtaining a sample of values


following a particular probability distribution, with the help of random numbers. Random
numbers can be classified into true random numbers and pseudo-random numbers. A true
random number, in a certain sequence of random numbers, cannot be predicted from any
previous random number in the sequence. Sequences of true random numbers can be obtained


from measurements of noisy natural phenomena.[3] On the other hand, pseudo-random


numbers depend on the previous values in the sequence. However, the pattern followed by the
numbers in a sequence generated by a pseudo-random number generator resembles the
behavior of true random number sequences. Furthermore, a pseudo-random number can only
be predicted if the generation function is known, which is typically not the case for practical
purposes. Thus, the sequence of random numbers obtained from a pseudo-random number
generator can be used for random sampling, since the user usually cannot predict the next
random number in the sequence, and therefore cannot influence the outcome.

2.2. Standard Random Numbers

Particularly, when a random number is obtained from any standard probability distribution, it is
considered a standard random number. There are three main types of standard probability
distributions, depending on whether the variable is unbounded (Type I), semi-bounded (Type
II) or bounded (Type III).[4] In general, standard random numbers ($z$) are linear
transformations of any general random number ($x$) obtained from a random variable $X$, as
follows:

$$z = \frac{x - a}{b}$$
(2.1)

where $x$ is a random number, $z$ is the corresponding standard random number, and $a$ and $b$ are
constant parameters defined according to Table 1. Notice that Type I transformations are also
possible for bounded and semi-bounded variables, and Type II transformations are also
possible for bounded variables.

Table 1. Parameters used in the definition of standard random numbers

Type of Standard Transformation | $a$ | $b$
Type I (Unbounded) | $E(X)$ | $\sqrt{Var(X)}$
Type II (Lower bounded) | $\min(X)$ | $E(X) - \min(X)$
Type II (Upper bounded) | $\max(X)$ | $E(X) - \max(X)$
Type III (Bounded) | $\min(X)$ | $\max(X) - \min(X)$

Now, if a random number $z$ is obtained from a standard probability distribution, it can be
transformed into a random number of any probability distribution from the same family, simply
by using (from Eq. 2.1):

$$x = a + b\,z$$
(2.2)
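
For instance, a standard normal random number can be rescaled into any member of the normal family with Eq. (2.2); the following minimal Python sketch (with an assumed mean of 5 and standard deviation of 2, and NumPy assumed to be available) illustrates this:

```python
# Minimal sketch of Eq. (2.2): converting a standard random number into a member of the
# same distribution family. Example: a Type I (normal) variable with assumed mean 5 and
# standard deviation 2, so a = E(X) = 5 and b = sqrt(Var(X)) = 2.
import numpy as np

rng = np.random.default_rng()
z = rng.standard_normal(1000)   # standard random numbers (Type I)
x = 5.0 + 2.0 * z               # Eq. (2.2): x = a + b*z
print(x.mean(), x.std())        # close to 5 and 2
```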


2.3. Uniform Random Numbers

Random numbers following a Type III standard uniform distribution (values in the range
between 0 and 1) can be obtained using true or pseudo-random number generators.
Particularly, there are different types of pseudo-random number generators available, including:
linear congruential generators, Mersenne Twisters, xorshift, complementary multiply-with-
carry and counter-based generators.[2] Spreadsheets and mathematical software are
commonly developed incorporating one or more pseudo-random number generator algorithms
for the standard uniform distribution. Thus, standard uniform random numbers are nowadays
widely available.

Type III standard uniform random numbers ($u$) are the core of random sampling, since
random numbers following any other arbitrary probability distribution can be obtained from
them, as it will be seen in the following section. Certain software packages contain random
number generators for specific distributions, including the standard normal, exponential and
binomial distributions, but other types of probability distributions
are not so common.

2.4. Non-Uniform Random Numbers: Inverse Transform Sampling

The most direct method for transforming a standard uniform random number into a random
number of any arbitrary probability distribution is known as inverse transform sampling. This
method assumes that the cumulative probability distribution function $F_X(x)$ of any arbitrary
distribution of the random variable $X$ can be described by a standard uniform random variable.
Any value of the cumulative probability distribution function is equally probable, and bounded
between 0 and 1, just as the standard uniform distribution. Thus, it is possible to say that in
general:

$$F_X(x) = u$$
(2.3)

where $u$ is a standard uniform random number, and $x$ is a random number of $X$. Therefore,

$$x = F_X^{-1}(u)$$
(2.4)

where $F_X^{-1}$ is the inverse of the cumulative probability function. In terms of standard random
variables, it is also possible to represent Eq. (2.4) as:

$$x = a + b\,F_Z^{-1}(u)$$
(2.5)


where $F_Z^{-1}(u)$ is a standard random number following the distribution of $Z$, $a$ and $b$ are the
corresponding parameters of the standard transformation, and $F_Z^{-1}$ is the inverse of the
cumulative probability function for the standard transformation of $X$ ($Z$).

Let us consider for example the exponential probability distribution, whose cumulative
probability function is given by:

$$F_X(x) = 1 - e^{-\lambda x}$$
(2.6)

where $\lambda$ is a parameter of the distribution. Setting $F_X(x) = u$ and solving for $x$ results in:

$$x = -\frac{\ln(1 - u)}{\lambda}$$

(2.7)

which is a simple expression for obtaining random numbers following an exponential


probability distribution using standard uniform random numbers.

Given that the exponential distribution is lower bounded, it is possible to use a Type II standard
transformation resulting in:

$$z = \lambda x = -\ln(1 - u)$$
(2.8)
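
As an illustration of Eqs. (2.6) to (2.8), the following minimal Python sketch (assuming NumPy is available; the function name is an illustrative choice, not from this report) generates exponential random numbers by inverse transform sampling:

```python
# Minimal sketch of inverse transform sampling for the exponential distribution (Eq. 2.7).
import numpy as np

def sample_exponential(lam, n, rng=None):
    """Draw n random numbers from an exponential distribution with rate parameter lam."""
    rng = rng or np.random.default_rng()
    u = rng.random(n)              # standard uniform random numbers in [0, 1)
    return -np.log(1.0 - u) / lam  # Eq. (2.7): x = -ln(1 - u) / lambda

x = sample_exponential(lam=2.0, n=10000)
print(x.mean())  # should be close to 1/lambda = 0.5
```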

Unfortunately, certain cumulative probability distribution functions cannot be explicitly solved


for the random variable $x$, limiting the use of this method. However, it is possible to circumvent
this problem either by approximating the cumulative probability function using an invertible
function, or by using a numerical solution.

As an example for the first case, let us consider the cumulative probability function of a
standard Maxwell-Boltzmann distribution:

$$F_Z(z) = \mathrm{erf}\!\left(\frac{2z}{\sqrt{\pi}}\right) - \frac{4z}{\pi}\,e^{-\frac{4z^{2}}{\pi}}$$

(2.9)

Even though Eq. (2.9) cannot be explicitly solved for $z$, it can be approximated using the
following expression:[5]

( ) ( )
(2.10)


which can now be solved yielding:

( )
( )

(2.11)

Figure 1 shows a comparison between the exact cumulative probability function of the
standard Maxwell-Boltzmann distribution (Eq. 2.9) and its approximation (Eq. 2.10).

Figure 1. Comparison between the cumulative probability function for the standard
Maxwell-Boltzmann random number (blue solid line) and the empirical approximation given
in Eq. (2.10) (red dashed line).

If a simple invertible approximated function of the cumulative probability is not possible, then
the second approach consists of numerically solving the following non-linear algebraic
equation:

$$F_X(x) - u = 0$$
(2.12)

or solving the following numerical optimization problem:

$$\min_{x} \left(u - F_X(x)\right)^2$$
(2.13)

in order to find the corresponding random number $x$.

Eq. (2.12) can be solved by means of the Newton-Raphson method, using the following
recurrence from an initial guess $x_0$:


$$x_{k+1} = x_k - \frac{F_X(x_k) - u}{f_X(x_k)}$$
(2.14)

where $f_X(x)$ represents the probability density function of $X$, $x = x_{k+1}$ is obtained when
$|x_{k+1} - x_k| \le \varepsilon$, and $\varepsilon$ is a predefined tolerance in the determination of the random number
$x$.

Similarly, the optimization problem (2.13) can be solved using a gradient method, expressed in
the following recurrence from the initial guess $x_0$:

$$x_{k+1} = x_k + 2\gamma_k\, f_X(x_k)\left(u - F_X(x_k)\right)$$
(2.15)

where $\gamma_k$ is a search step size, which can be selected to be (using the Barzilai-Borwein method
[6]):

( ( ) ( ))
(2.16)

in order to guarantee convergence. Please notice that using Eq. (2.16), Eq. (2.15) becomes
equivalent to Eq. (2.14) as $F_X(x_k)$ approaches $u$.

Given the monotonic nature of the cumulative probability function, the initial guess used in
both methods can be simply the expected value of the distribution:

$$x_0 = E(X)$$
(2.17)
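
As a hedged illustration of this numerical approach (Eqs. 2.12, 2.14 and 2.17), the following Python sketch inverts a cumulative probability function with Newton-Raphson; the Maxwell-Boltzmann functions shown are written from the unit-mean form assumed for Eq. (2.9) and are not taken verbatim from the report:

```python
# Minimal sketch of numerical inverse transform sampling via Newton-Raphson (Eq. 2.14),
# starting from x0 = E(X) (Eq. 2.17). NumPy and SciPy are assumed to be available.
import numpy as np
from scipy.special import erf

def mb_cdf(z):
    # Unit-mean (standard) Maxwell-Boltzmann cumulative probability, as assumed for Eq. (2.9)
    return erf(2.0*z/np.sqrt(np.pi)) - (4.0*z/np.pi)*np.exp(-4.0*z**2/np.pi)

def mb_pdf(z):
    # Corresponding probability density function (derivative of mb_cdf)
    return (32.0*z**2/np.pi**2)*np.exp(-4.0*z**2/np.pi)

def newton_inverse_cdf(u, cdf, pdf, x0, tol=1e-10, max_iter=100):
    """Solve F(x) = u for x using the recurrence of Eq. (2.14)."""
    x = x0
    for _ in range(max_iter):
        step = (cdf(x) - u) / pdf(x)
        x -= step
        if abs(step) < tol:   # stop when the update falls below the tolerance
            break
    return x

z = newton_inverse_cdf(0.75, mb_cdf, mb_pdf, x0=1.0)  # x0 = E(Z) = 1
print(z, mb_cdf(z))  # the second value should be close to 0.75
```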

2.5. Non-Uniform Random Numbers: General Transformation Sampling

Another alternative for generating non-uniform random numbers is using random variable
transformations. This is a very general approach that can be used for generating random
numbers following any distribution whose probability density function is known. In fact, the
inverse transform method, described in Section 2.4., turns out to be a particular case of
random variable transformation, as it will be shown.

The basic principle of this method consists of describing a certain random variable $X$ with
known probability density function as a function $g$ of one or more random variables
whose random numbers are readily available. Particularly, only standard random variables ($Z_i$)


will be used considering that they can represent a family of random variables using linear
transformations (see Eq. 2.2). Thus, the random variable $X$ can be expressed as:

$$X = g(Z_1, Z_2, \ldots, Z_n)$$
(2.18)

Since the change of variable theorem for multivariable transformations [7] must hold, then:

$$f_X(x) = \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}\left(\prod_{i=2}^{n} f_{Z_i}(z_i)\right)\sum_{j=1}^{m} f_{Z_1}\!\left(h_j(x, z_2, \ldots, z_n)\right)\left|\frac{\partial h_j(x, z_2, \ldots, z_n)}{\partial x}\right| dz_2 \cdots dz_n$$

(2.19)

where $h_j$ represents the $j$-th solution from $m$ possible solutions resulting from solving Eq.
(2.18) for $z_1$. Thus, this method requires that the function $g$ can be solved in terms of at least
one standard random variable, which will arbitrarily be denoted as $Z_1$.

Assuming standard random variables with known distributions (whose random numbers are
available), and given that $f_X(x)$ is known, the problem consists of finding at least one function
$g$ which satisfies Eq. (2.19). This is a very general problem whose difficulty increases with the
number of standard random variables considered, and their particular probability density
functions. Since standard uniform random numbers are readily generated without any further
processing, and because of the simplicity of their probability density function, they are the best
choice for this method. Thus, using standard uniform random variables, Eq. (2.19) becomes:

$$f_X(x) = \int_{0}^{1}\cdots\int_{0}^{1}\sum_{j=1}^{m} f_{U_1}\!\left(h_j(x, u_2, \ldots, u_n)\right)\left|\frac{\partial h_j(x, u_2, \ldots, u_n)}{\partial x}\right| du_2 \cdots du_n$$

(2.20)
where
$$f_{U_1}\!\left(h_j(x, u_2, \ldots, u_n)\right) = \begin{cases} 1, & 0 \le h_j(x, u_2, \ldots, u_n) \le 1 \\ 0, & \text{otherwise} \end{cases}$$

(2.21)

Now, let us consider the simplest transformation when only one standard uniform random
variable is used, and the transformation is assumed to be a one-to-one function, then:


$$f_X(x) = \left|\frac{d g^{-1}(x)}{dx}\right| f_{U}\!\left(g^{-1}(x)\right) = \begin{cases} \left|\dfrac{d g^{-1}(x)}{dx}\right|, & 0 \le g^{-1}(x) \le 1 \\ 0, & \text{otherwise} \end{cases}$$

(2.22)

Clearly, one possible solution to Eq. (2.22) is the cumulative probability function $F_X$, since:

$$\frac{d F_X(x)}{dx} = f_X(x)$$
(2.23)

and $0 \le F_X(x) \le 1$.

Since $g^{-1}(x) = F_X(x)$ is one possible solution, then the corresponding function $g$ for
describing an arbitrary random variable in terms of one standard uniform random variable is:

$$x = g(u) = F_X^{-1}(u)$$
(2.24)

Eq. (2.24) corresponds to the inverse transform method of Section 2.4. However, the
cumulative probability function should be invertible for this method to be useful.

It is also possible to describe the arbitrary random variable in terms of two standard uniform
random variables. In this case, Eq. (2.20) becomes:

$$f_X(x) = \int_{0}^{1}\sum_{j=1}^{m} f_{U_1}\!\left(h_j(x, u_2)\right)\left|\frac{\partial h_j(x, u_2)}{\partial x}\right| du_2$$

(2.25)
where
$$f_{U_1}\!\left(h_j(x, u_2)\right) = \begin{cases} 1, & 0 \le h_j(x, u_2) \le 1 \\ 0, & \text{otherwise} \end{cases}$$

(2.26)

Let us now define the function $\phi(x, u_2)$ such that the following equations are valid:

$$f_X(x) = \int_{0}^{1}\left|\frac{\partial \phi(x, u_2)}{\partial x}\right| du_2$$

(2.27)


$$u_1 = \phi(x, u_2) \in [0, 1]$$

(2.28)

Thus, any function $\phi(x, u_2)$ meeting those two requirements will allow finding a function
$g(u_1, u_2)$ for representing the random variable $X$ in terms of two standard uniform random
variables, as long as $\phi(x, u_2)$ can be explicitly solved for $x$. Eq. (2.28) may impose some
constraints on the integration limits of Eq. (2.27). If that is the case, those constraints will be
expressed as functions of $x$.

Let us consider the case of a standard normal random variable $Z$, whose probability density
function is:

$$f_Z(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}$$

(2.29)
and its cumulative probability function can be expressed as:

$$F_Z(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{t^2}{2}}\, dt = \frac{1}{2}\left(1 + \mathrm{erf}\!\left(\frac{z}{\sqrt{2}}\right)\right)$$

(2.30)
where $\mathrm{erf}$ is the error function.

Transforming $Z$ in terms of a single standard uniform random variable will result in (Eq. 2.24):

$$z = g(u) = \sqrt{2}\,\mathrm{erf}^{-1}(2u - 1)$$
(2.31)
where $\mathrm{erf}^{-1}$ represents the inverse error function.

If neither the inverse standard normal cumulative function nor $\mathrm{erf}^{-1}$ are available, then it
is necessary to use a second standard uniform random variable. In this case, a suitable function
$\phi(z, u_2)$ must be found such that

$$\int_{0}^{1}\left|\frac{\partial \phi(z, u_2)}{\partial z}\right| du_2 = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}$$

(2.32)

$$u_1 = \phi(z, u_2) \in [0, 1]$$

(2.33)


Finding a suitable function is not a straightforward task. One possibility is transforming $Z$. Let
us consider for example the absolute value of the standard normal distribution:

$$Y = |Z|$$
(2.34)

For this transformation, Eq. (2.32) and (2.33) become:

$$\int_{0}^{1}\left|\frac{\partial \phi(y, u_2)}{\partial y}\right| du_2 = \sqrt{\frac{2}{\pi}}\, e^{-\frac{y^2}{2}}$$

(2.35)

$$u_1 = \phi(y, u_2) \in [0, 1]$$

(2.36)

Then, one possible function that solves the system is:

$$\phi(y, u_2) = e^{-\frac{y^2}{2\sin^2(2\pi u_2)}}$$
(2.37)

Given that this function is not continuous over $u_2 \in [0, 1]$ (the sine vanishes at $u_2 = 0$, $1/2$
and $1$), it must be integrated piecewise as follows:

$$\int_{0}^{1}\left|\frac{\partial \phi(y, u_2)}{\partial y}\right| du_2 = \int_{0}^{1/2}\frac{y}{\sin^2(2\pi u_2)}\, e^{-\frac{y^2}{2\sin^2(2\pi u_2)}}\, du_2 + \int_{1/2}^{1}\frac{y}{\sin^2(2\pi u_2)}\, e^{-\frac{y^2}{2\sin^2(2\pi u_2)}}\, du_2 = \sqrt{\frac{2}{\pi}}\, e^{-\frac{y^2}{2}}$$

(2.38)

and


$$u_1 = \phi(|z|, u_2) = e^{-\frac{z^2}{2\sin^2(2\pi u_2)}}$$
(2.39)

Given that $\phi(|z|, u_2) \in [0, 1]$ for $u_2 \in [0, 1]$ and any real value of $z$, then solving for $z$ results
in:

$$z = \pm\sqrt{-2\ln u_1}\,\left|\sin(2\pi u_2)\right|$$
(2.40)

Since the sine function provides a uniform sign already (positive and negative equally
probable), Eq. (2.40) can be expressed simply as:

$$z = \sqrt{-2\ln u_1}\,\sin(2\pi u_2)$$
(2.41)

Eq. (2.41) corresponds to the Box-Muller method [8] for obtaining standard normal random
numbers from two standard uniform random numbers. The corresponding cosine expression
can be obtained in a similar way. In fact, a more general expression is possible:

$$z = \sqrt{-2\ln u_1}\,\sin(2\pi u_2 + \theta)$$
(2.42)
where $\theta$ is any arbitrary constant.
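
A minimal Python sketch of the Box-Muller transformation of Eq. (2.41), assuming NumPy, is shown below; since $1-U$ is also a standard uniform variable, $\ln(1-u_1)$ is used to avoid evaluating $\ln(0)$:

```python
# Minimal sketch of the Box-Muller method (Eq. 2.41) for standard normal random numbers.
import numpy as np

def box_muller(n, rng=None):
    """Generate n standard normal random numbers from pairs of standard uniform numbers."""
    rng = rng or np.random.default_rng()
    u1 = rng.random(n)
    u2 = rng.random(n)
    # 1 - u1 is also standard uniform and avoids log(0), since rng.random() lies in [0, 1)
    return np.sqrt(-2.0*np.log(1.0 - u1)) * np.sin(2.0*np.pi*u2)

z = box_muller(100000)
print(z.mean(), z.var())  # should be close to 0 and 1, respectively
```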

In another example, let us consider a Type III standard logarithmic random variable described
by the following probability density function:

$$f_X(x) = -\ln(x), \quad x \in (0, 1]$$
(2.43)

This random variable can be described in terms of two standard uniform random numbers as
long as there is a function $\phi(x, u_2)$ meeting the following conditions:

$$\int_{0}^{1}\left|\frac{\partial \phi(x, u_2)}{\partial x}\right| du_2 = -\ln(x)$$

(2.44)

$$u_1 = \phi(x, u_2) \in [0, 1]$$

(2.45)


One possible solution is the following function:

$$\phi(x, u_2) = \frac{x}{u_2}$$

(2.46)

Now, since

$$u_1 = \phi(x, u_2) = \frac{x}{u_2}$$

(2.47)

condition (2.45) can only be met if:

$$u_2 \ge x$$
(2.48)

Then Eq. (2.44) should be expressed as:

$$\int_{x}^{1}\left|\frac{\partial \phi(x, u_2)}{\partial x}\right| du_2 = \int_{x}^{1}\frac{du_2}{u_2} = -\ln(x)$$

(2.49)
Therefore, inverting the function $\phi$ from Eq. (2.47) results in:

$$x = u_1 u_2$$
(2.50)
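
As a quick numerical check, the following Python sketch (assuming NumPy) draws products of two standard uniform numbers and compares the first two sample moments with the values $E(X)=1/4$ and $E(X^2)=1/9$ implied by the density $-\ln(x)$:

```python
# Minimal sketch: the product of two standard uniform numbers (Eq. 2.50) follows the
# density f(x) = -ln(x) on (0, 1].
import numpy as np

rng = np.random.default_rng(2019)
x = rng.random(200000) * rng.random(200000)  # x = u1 * u2
print(x.mean(), (x**2).mean())               # theory: E(X) = 1/4, E(X^2) = 1/9
```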

3. Deterministic Sampling

It is also possible to discretize the probability distribution without using random (or pseudo-
random) numbers. Methods not using random (or pseudo-random) numbers are deterministic
sampling methods. The set of values in a sample obtained with a deterministic sampling
method will always be the same, for any specific probability distribution and a given number of
elements in a sample. Furthermore, these methods will always guarantee a 100% goodness-of-
fit between the sample and the probability model, when using the random goodness-of-fit coefficient.[9]
We will consider two types of deterministic sampling methods: Heuristic and Optimal.
Furthermore, in this Section, a particular modification of heuristic sampling involving random
numbers (randomistic sampling) will also be considered.


3.1. Heuristic Sampling

Heuristic sampling methods are based on simple rules for determining the elements in the
sample. The simplest method consists of dividing the range of values of the variable into a
number of equiprobable intervals equal to the requested number of elements in the sample,
and then selecting a class mark for each interval. The set of class marks will correspond to the
deterministic sample. The class mark can be selected as the median of the interval (i.e. interval
median heuristic sampling). Mathematically, it can be expressed as follows. Let $n$ be the
number of elements requested in a sample. Then, the range of values of the sampled variable
will be divided in $n$ different intervals with equal probability $1/n$. The intervals obtained
will be the following:

$$\left[F_X^{-1}\!\left(\frac{i-1}{n}\right),\; F_X^{-1}\!\left(\frac{i}{n}\right)\right], \quad i = 1, 2, \ldots, n$$
(3.1)

where $F_X^{-1}$ is the inverse function of the cumulative probability function of $X$ ($F_X$).

Since the sampled element is the class mark of each interval, which will be selected as the
median of the interval, then the $i$-th element of the sample will be:

$$x_i = F_X^{-1}\!\left(\frac{2i - 1}{2n}\right)$$
(3.2)

Even if the cumulative probability function is not invertible, the sample values can be obtained
using numerical methods such as those described in Section 2.4. The interval median heuristic
sampling method is used in this work as a representative of the heuristic approach. There are
other heuristic methods for generating samples available, most of them known as low-
discrepancy sequences or quasi-random sequences (commonly used in quasi-Monte Carlo
methods), such as the van der Corput, Hammersley, Halton, and Sobol’ sequences, amongst
others.[10]
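
A minimal Python sketch of the interval median heuristic of Eq. (3.2), assuming NumPy and using the standard exponential inverse cumulative function (Eq. 4.9) as an example, is the following:

```python
# Minimal sketch of interval median heuristic sampling (Eq. 3.2).
import numpy as np

def heuristic_sample(inv_cdf, n):
    """Return the n class marks F^-1((2i-1)/(2n)), i = 1..n, in ascending order."""
    i = np.arange(1, n + 1)
    return inv_cdf((2.0*i - 1.0) / (2.0*n))

# Example: standard exponential distribution, F^-1(u) = -ln(1 - u)
sample = heuristic_sample(lambda u: -np.log(1.0 - u), n=20)
print(sample[0], sample.mean())  # first element 0.02532 (Table 7); mean close to 1
```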

3.2. Optimal Sampling

The main drawback of random sampling methods as well as the heuristic method previously
described is that the discretization of the probability distribution does not necessarily preserve
the exact value of the distribution moments. The idea behind optimal sampling is finding a set
of values such that most moments of the distribution are preserved, by solving the following
optimization problem:


$$\min_{x_1, x_2, \ldots, x_n} \sum_{m=1}^{M}\left(\left(\frac{1}{n}\sum_{i=1}^{n} x_i^{m}\right)^{\frac{1}{m}} - \left(E(X^m)\right)^{\frac{1}{m}}\right)^2$$

(3.3)

where $x_i$ are the elements of the deterministic sample of size $n$, $M$ is the number of integer
moments considered in the objective function, and $E(X^m)$ is the $m$-th moment of the distribution
of $X$ given by:

$$E(X^m) = \int_{-\infty}^{\infty} x^m f_X(x)\, dx$$

(3.4)

The moments used in the objective function are normalized by using the $1/m$ power of the $m$-th
moment. On the other hand, considering the degrees of freedom of the data in the sample, it is
recommended setting:

$$M = n$$
(3.5)

If $M$ is considered larger than $n$, then it is not possible to guarantee that all moments of the
original distribution are satisfied by the sample. On the other hand, if $M$ is considered smaller
than $n$, then it may be possible to find different optimal sets of values satisfying all moments of
the original distribution. However, since higher moments of the distribution might not be
relevant for certain practical purposes, $M < n$ can be used, and any optimum will be
satisfactory.

During optimization, the decision variables should remain within their corresponding original
intervals. Alternatively, this can be guaranteed by adding the following constraint to the
optimization problem:

$$R(x_1, x_2, \ldots, x_n) = 1$$

(3.6)

where $R$ represents the random goodness-of-fit between the decision variables and the
original probability distribution, and
( ) ( )

( )

( ) ( )
{
(3.7)


Given the nature of the objective function used in Eq. (3.3), the corresponding optimization
problem presented can be considered a randomistic optimization problem.[11]

For the optimal sampling method, the set of values in the sample is always the same for the
same distribution, the same sample size and the same number of moments considered in the
objective function. Now, since Eq. (3.3) must be numerically solved, a starting set of values is
required. An efficient starting point for the optimization is the set of values obtained from the
interval median heuristic sampling method described in Section 3.1.
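
As a hedged sketch of the moment-matching idea of Eq. (3.3) (not the GRG implementation used later in Section 4), the following Python code uses SciPy's general-purpose minimizer, starts from the interval median heuristic sample, and approximates constraint (3.6) by simple bounds that keep each element inside its equiprobable interval:

```python
# Minimal sketch of optimal (moment-preserving) sampling in the spirit of Eq. (3.3).
# NumPy and SciPy are assumed; the interval bounds stand in for constraint (3.6).
import numpy as np
from scipy.optimize import minimize

def optimal_sample(inv_cdf, exact_moments, n):
    M = len(exact_moments)                        # number of moments matched (M <= n)
    i = np.arange(1, n + 1)
    x0 = inv_cdf((2.0*i - 1.0) / (2.0*n))         # heuristic sample as starting point
    bounds = list(zip(inv_cdf((i - 1.0)/n), inv_cdf(i/n)))  # one element per interval

    def objective(x):
        m = np.arange(1, M + 1)
        sample_moments = np.array([np.mean(x**k) for k in m])
        target = np.array(exact_moments)
        return np.sum((sample_moments**(1.0/m) - target**(1.0/m))**2)  # Eq. (3.3)

    res = minimize(objective, x0, bounds=bounds)
    return np.sort(res.x)

# Standard uniform example: E(U^m) = 1/(m+1) for m = 1..5 (see Table 3)
sample = optimal_sample(lambda u: u, [1/2, 1/3, 1/4, 1/5, 1/6], n=20)
print(sample[:3])  # values close to the optimal row of Table 2
```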

3.3. Randomistic Sampling

The method denoted here as randomistic sampling is a modification of the heuristic method
presented in Section 3.1, considering as class mark a random value obtained from each interval,
following the probability distribution of the variable of interest, instead of its median. The
random selection of the class mark can easily be done considering a standard uniform random
number ($u_i$) as follows:

$$x_i = F_X^{-1}\!\left(\frac{i - 1 + u_i}{n}\right), \quad i = 1, 2, \ldots, n$$
(3.8)

Even though the values obtained for the elements in the sample are random, there is only one
element per equiprobable interval, just as in the case of deterministic methods. Also in this
case, the random goodness-of-fit obtained between the sample and the probability distribution
model is 100%.

Given that deterministic and randomistic samples are obtained in ascending order, it is possible
to randomize the order of the set in order to obtain quasi-random sequences. This
randomization is easily done by arbitrarily assigning a standard uniform random number to
each element in the sample, and sorting the elements in ascending or descending order with
respect to the standard uniform random numbers generated.

The randomistic sampling method is the most general expression for discretizing probability
distributions. If $n = 1$, Eq. (3.8) becomes equivalent to the inverse transform method of
random sampling (Eq. 2.4). On the other hand, for large values of $n$, the sample obtained
approximates the result of deterministic sampling methods. Particularly, for $n \to \infty$, the
optimal sample is also obtained. If the random number $u_i$ is replaced in Eq. (3.8) by its
expected value ($E(u_i) = 1/2$), then the expression for interval median heuristic sampling (Eq.
3.2) is obtained.
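
A minimal Python sketch of randomistic sampling (Eq. 3.8), including the optional randomization of the order of the elements described above, assuming NumPy:

```python
# Minimal sketch of randomistic sampling (Eq. 3.8): one random value per equiprobable interval.
import numpy as np

def randomistic_sample(inv_cdf, n, rng=None, shuffle=True):
    rng = rng or np.random.default_rng()
    i = np.arange(1, n + 1)
    u = rng.random(n)                   # one standard uniform number per interval
    x = inv_cdf((i - 1.0 + u) / n)      # Eq. (3.8)
    if shuffle:                         # optional: randomize the (ascending) order
        rng.shuffle(x)
    return x

# Example: standard exponential distribution
x = randomistic_sample(lambda u: -np.log(1.0 - u), n=20)
print(x.mean())  # close to E(Z) = 1, with less variability than plain random sampling
```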


4. Sampling Methods Comparison

In this Section, the sampling procedures described in Section 2 and Section 3 will be used for
discretizing some selected probability distributions. Particularly, only standard probability
distributions will be considered, given that they represent a family of distributions just by linear
transformation. The examples considered include:

- Type III standard uniform distribution
- Type I standard normal distribution
- Type II standard exponential distribution
- Type II standard Maxwell-Boltzmann distribution

On the other hand, and only for comparison purposes, samples with a size of 20 elements are
considered in all examples. Since the sets obtained using random and randomistic sampling
methods are always different, 50 samples are used for assessing the average fitness to the
original distribution, as well as the average value of the first five natural moments and their
variation. This specific number of moments was arbitrarily chosen as an example. For optimal
sampling methods, the objective function is also defined in terms only of the first five natural
moments. The heuristic sampling method considered is the interval median method described
in Section 3.

4.1. Standard Uniform Distribution

Random sampling of the standard uniform distribution is directly performed using algorithms
for pseudo-random numbers generation. Figure 2 shows an example of a sample of 20
elements obtained using the RAND function of Microsoft® Excel, which employs the Mersenne
Twister Algorithm (MT19937).§

The cumulative probability function for the standard uniform distribution is given by:

$$F_U(u) = u, \quad u \in [0, 1]$$
(4.1)

Thus, the corresponding inverse cumulative probability function is:

$$F_U^{-1}(u) = u, \quad u \in [0, 1]$$
(4.2)

§
https://support.office.com/en-us/article/rand-function-4cbfa695-8869-4788-8d90-021ea9f5be73. Last
access: 17/09/2019.


The sample shown in Figure 2 presented a random goodness-of-fit of 93.04% with respect to
the standard uniform random model. As can be seen, random sampling does not necessarily
describe the exact behavior of the distribution. The same situation is observed during any
type of experimentation, where a sample of the results does not necessarily represent the exact
behavior of the population, especially for small sample sizes.

Figure 2. Blue solid line: Observed cumulative probability for a sample of 20 standard uniform
random numbers obtained using the Mersenne Twister algorithm (MT19937). Red dashed line:
Expected theoretical cumulative probability for a standard uniform distribution.

Figure 3. Histogram of absolute frequencies for the random goodness-of-fit observed in 50


samples each containing 20 standard uniform random numbers.

Figure 3 shows the random goodness-of-fit obtained in 50 different random samples of 20


elements each. The maximum fit of the uniform random model in the 50 samples was 99.89%,


and their average value was 93.69%. As the sample size is increased, the goodness-of-fit of the
sample is also expected to improve.

On the other hand, the goodness-of-fit of the deterministic and randomistic sampling methods
is 100%. The heuristic and optimal sample for a uniform distribution with 20 elements in the
sample is presented in Table 2. The optimal sample was obtained using a generalized reduced
gradient (GRG) nonlinear optimization algorithm, considering the heuristic sample as initial
guess. The objective function considered only the first 5 natural moments. A maximum
difference of 0.00404 is observed between the values of the elements in the heuristic and the
optimal sample. The elements of the interval median heuristic sample of size $n$ are labeled
"Heuristic" in the tables below, whereas the optimal values considering the first $M$ natural
moments in the objective function are labeled "Optimal".

Table 2. Deterministic samples for the standard uniform distribution with a sample size of 20.
Heuristic: 0.025 0.075 0.125 0.175 0.225 0.275 0.325 0.375 0.425 0.475
0.525 0.575 0.625 0.675 0.725 0.775 0.825 0.875 0.925 0.975
Optimal: 0.02496 0.07469 0.12434 0.17406 0.22396 0.27407 0.32439 0.37487 0.42541 0.47591
0.52624 0.57630 0.62604 0.67546 0.72463 0.77378 0.82323 0.87348 0.92516 0.97904

Randomistic sampling is an interesting alternative for improving the performance of random


sampling. The values of the elements in the randomistic sample for the standard uniform
distribution can be determined as follows:

$$x_i = \frac{i - 1 + u_i}{n}, \quad i = 1, 2, \ldots, n$$
(4.3)

where $x_i$ represents the values of the elements in the randomistic sample of size $n$, and $u_i$
is a standard uniform random number.

Figure 4 shows an example of a sample obtained using the randomistic sampling approach. The
goodness-of-fit for all possible randomistic samples is 100%. In this case, the values of the
elements in the sample remain random while at the same time the properties of the
distribution are better preserved. That is, the sample moments are closer to the moments of
the original distribution, and they present less variability. Table 3 summarizes the values
obtained for the first 5 natural moments for each sampling method. In the case of random and
randomistic sampling, the average of 50 samples is presented, along with the corresponding
coefficient of variation (CV).


Figure 4. Blue solid line: Observed cumulative probability for a sample of 20 standard uniform
randomistic numbers obtained using Eq. (3.8), where the random values were obtained
using the Mersenne Twister algorithm (MT19937). Red dashed line: Expected theoretical
cumulative probability for a standard uniform distribution.

Table 3. Summary of the first natural moments for the different sampling methods for
discretizing the standard uniform distribution.
Sampling method | $E(X)$ | $E(X^2)$ | $E(X^3)$ | $E(X^4)$ | $E(X^5)$
Exact [12] 0.5 0.33333 0.25 0.2 0.16667
Interval Median Heuristic 0.5 0.33313 0.24969 0.19958 0.16615
(rel. abs. error) (0%) (0.06%) (0.12%) (0.21%) (0.31%)
Optimal 0.5 0.33333 0.25001 0.2 0.16667
(rel. abs. error) (0%) (0%) (0.002%) (0%) (0%)
Average 0.50275 0.33674 0.25390 0.20440 0.17154
Random (rel.abs.error) (0.55%) (1.02%) (1.56%) (2.20%) (2.92%)
CV 12.7% 18.9% 23.7% 28.1% 32.2%
Average 0.50045 0.33388 0.25056 0.20051 0.16710
Randomistic (rel.abs.error) (0.09%) (0.17%) (0.22%) (0.26%) (0.26%)
CV 0.68% 1.05% 1.58% 2.24% 3.00%

4.2. Standard Normal Distribution

The cumulative probability function of the standard normal distribution ($Z$) is given by:

$$F_Z(z) = \frac{1}{2}\left(1 + \mathrm{erf}\!\left(\frac{z}{\sqrt{2}}\right)\right)$$
(4.4)
where $\mathrm{erf}$ represents the error function.


The corresponding inverse cumulative probability function is:

$$F_Z^{-1}(u) = \sqrt{2}\,\mathrm{erf}^{-1}(2u - 1), \quad u \in [0, 1]$$
(4.5)
where $\mathrm{erf}^{-1}$ represents the inverse error function.

Particular samples of 20 elements obtained using the inverse transform random sampling, the
Box-Muller transform random sampling and randomistic sampling are compared in Figure 5.
Randomistic sampling was done using the following equation:

$$z_i = \sqrt{2}\,\mathrm{erf}^{-1}\!\left(2\left(\frac{i - 1 + u_i}{n}\right) - 1\right)$$
(4.6)

where $\frac{i - 1 + u_i}{n}$ represents the standard uniform numbers obtained by randomistic sampling
(Eq. 4.3), considering the sample size $n$.

Figure 5. Comparison of observed cumulative probability for samples obtained from a standard
normal distribution. Red dashed line: Expected theoretical cumulative probability for a
standard normal distribution. Green dotted line: Random sample obtained using the inverse
transform method (Eq. 4.5). Gray dashed line: Random sample obtained using the Box-Muller
transform method (Eq. 2.41). Blue solid line: Randomistic sample obtained using Eq. (4.6).
Standard uniform random numbers used in the calculations were obtained from the Mersenne
Twister algorithm (MT19937).


A comparison of the performance of the different sampling methods, including heuristic and
optimal deterministic sampling, is presented in Table 4. The heuristic and optimal sets of values
for a sample of 20 elements from the standard normal distribution are presented in Table 5.

Table 4. Performance comparison of different sampling methods for discretizing the standard
normal distribution.
Sampling method | $E(Z)$ | $E(Z^2)$ | $E(Z^3)$ | $E(Z^4)$ | $E(Z^5)$ | Goodness-of-fit
Exact [12] 0 1 0 3 0 1
Interval Median Heuristic 0 0.93856 0 2.20724 0 1
(abs. error) (0) (0.061) (0) (0.793) (0) (0)
Optimal 0 1 0 2.99999 0 1
(abs. error) (0) (0) (0) (0.00001) (0) (0)
Random Average -0.04102 0.96367 -0.14739 2.64333 -0.92709 0.9230
(Inverse (abs.error) (0.04102) (0.03633) (0.14739) (0.35667) (0.92709) (0.0770)
Transform) Std.dev. 0.23446 0.31500 0.75326 1.70107 4.69863 0.0700
Random Average -0.03405 0.98328 -0.09868 2.88601 -0.72279 0.9424
(Box-Muller (abs.error) (0.03405) (0.01672) (0.09868) (0.11399) (0.72279) (0.0576)
Transform) Std.dev. 0.21922 0.29720 0.80587 2.12274 7.04394 0.0972
Average 0.00034 1.02938 0.01515 3.28732 0.27328 1
Randomistic (abs.error) (0.00034) (0.02938) (0.01515) (0.28732) (0.27328) (0)
Std.dev. 0.03235 0.12897 0.54881 1.63524 6.26138 0

The heuristic set of standard normal values was obtained using the following
expression:

$$z_i = \sqrt{2}\,\mathrm{erf}^{-1}\!\left(2\left(\frac{2i - 1}{2n}\right) - 1\right) = \sqrt{2}\,\mathrm{erf}^{-1}\!\left(\frac{2i - 1}{n} - 1\right)$$
(4.7)

Table 5. Deterministic samples for the standard normal distribution with a sample size of 20.
Heuristic: -1.95996 -1.43953 -1.15035 -0.93459 -0.75542 -0.59776 -0.45376 -0.31864 -0.18912 -0.06271
0.06271 0.18912 0.31864 0.45376 0.59776 0.75542 0.93459 1.15035 1.43953 1.95996
Optimal: -2.20106 -1.42678 -1.08218 -0.87647 -0.71678 -0.57586 -0.44332 -0.31476 -0.18827 -0.06267
0.06267 0.18827 0.31476 0.44332 0.57586 0.71678 0.87647 1.08218 1.42678 2.20106
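
For reference, the heuristic row of Table 5 can be reproduced (up to rounding) with the following sketch, assuming scipy.special.erfinv and NumPy are available:

```python
# Minimal sketch reproducing the heuristic standard normal sample of Eq. (4.7).
import numpy as np
from scipy.special import erfinv

n = 20
i = np.arange(1, n + 1)
z_heuristic = np.sqrt(2.0) * erfinv(2.0*(2.0*i - 1.0)/(2.0*n) - 1.0)  # Eq. (4.7)
print(np.round(z_heuristic[:3], 5))  # [-1.95996 -1.43953 -1.15035], as in Table 5
```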

Then, starting from this set of values, the optimal set was found by minimizing the
objective function (3.3) and using constraint (3.6), with $n = 20$ and $M = 5$. Given that the
standard normal distribution is symmetric about zero, only the non-negative values of the set
were considered as decision variables in the optimization. The remaining values were just
determined as the negative values of the decision variables. When is odd, the central value
should be kept constant at a value of zero.

In this case, the randomistic sampling method again results in lower differences between the
average moments of the samples and the theoretical moments of a standard normal


distribution, as well as less variability compared to random sampling. On the other hand, the
Box-Muller method presented less error compared to the inverse transform method, although
its variability is larger. Regarding the deterministic methods, the optimal sample successfully
preserves the first five natural moments of the standard normal distribution, with significant
improvements in the even moments, compared to the heuristic sample.

4.3. Standard Exponential Distribution

The cumulative probability function of the Type II standard exponential distribution ($Z$) is:

$$F_Z(z) = 1 - e^{-z}$$
(4.8)
and the corresponding inverse cumulative probability function is (Eq. 2.8):

$$F_Z^{-1}(u) = -\ln(1 - u), \quad u \in [0, 1]$$
(4.9)

Figure 6. Comparison of observed cumulative probability for samples obtained from a standard
exponential distribution. Red dashed line: Expected theoretical cumulative probability for a
standard exponential distribution. Green dotted line: Random sample obtained using the
inverse transform method (Eq. 4.9). Blue solid line: Randomistic sample obtained using Eq.
(4.10). Standard uniform random numbers used in the calculations were obtained from the
Mersenne Twister algorithm (MT19937).

Samples of 20 elements obtained using the inverse transform random and randomistic
sampling are compared in Figure 6. Randomistic sampling was performed using:


$$z_i = -\ln\!\left(1 - \frac{i - 1 + u_i}{n}\right), \quad i = 1, 2, \ldots, n$$

(4.10)

A comparison of the performance of the different sampling methods, including heuristic and
optimal deterministic sampling, is presented in Table 6. The heuristic and optimal sets of values
for a sample of 20 elements from the standard exponential distribution are presented in Table
7.

Table 6. Performance comparison of different sampling methods for discretizing the standard
exponential distribution.
Sampling method | $E(Z)$ | $E(Z^2)$ | $E(Z^3)$ | $E(Z^4)$ | $E(Z^5)$ | Goodness-of-fit
Exact [12] 1 2 6 24 120 1
Interval Median Heuristic 0.98278 1.82047 4.56990 13.4783 43.4607 1
(rel. abs. error) (1.72%) (8.98%) (23.83%) (43.84%) (63.78%) (0%)
Optimal 0.98297 2.03265 6.43826 25.2534 108.657 1
(rel. abs. error) (1.70%) (1.63%) (7.30%) (5.22%) (9.45%) (0%)
Random Average 1.07000 2.30347 7.56299 33.47961 182.75996 0.93215
(Inverse (rel.abs.error) (7.0%) (15.2%) (26.0%) (39.5%) (52.3%) (6.8%)
Transform) CV 23.4% 51.6% 95.3% 154.6% 219.9% 9.1%
Average 1.00374 2.04933 6.45011 27.68133 148.94477 1
Randomistic (rel.abs.error) (0.4%) (2.5%) (7.5%) (15.3%) (24.1%) (0%)
CV 6.0% 29.6% 81.3% 162.0% 259.9% 0%
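
The behavior summarized in Table 6 can be explored with a short sketch comparing random (Eq. 4.9) and randomistic (Eq. 4.10) sampling of the standard exponential distribution over 50 replicate samples of 20 elements; NumPy is assumed, and the exact figures will differ from Table 6 because the pseudo-random numbers differ:

```python
# Minimal sketch: variability of the first moment for random vs. randomistic sampling
# of the standard exponential distribution (50 samples of 20 elements, as in Table 6).
import numpy as np

rng = np.random.default_rng(2019)
inv_cdf = lambda u: -np.log(1.0 - u)   # Eq. (4.9)
n, reps = 20, 50
i = np.arange(1, n + 1)

means_random, means_randomistic = [], []
for _ in range(reps):
    means_random.append(inv_cdf(rng.random(n)).mean())                       # random sampling
    means_randomistic.append(inv_cdf((i - 1.0 + rng.random(n)) / n).mean())  # Eq. (4.10)

print(np.mean(means_random), np.std(means_random))            # larger spread
print(np.mean(means_randomistic), np.std(means_randomistic))  # much smaller spread
```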

Similarly, the heuristic set of standard exponential values was obtained using:

$$z_i = -\ln\!\left(1 - \frac{2i - 1}{2n}\right)$$
(4.11)

Table 7. Deterministic samples for the standard exponential distribution with a sample size of
20.
Heuristic: 0.02532 0.07796 0.13353 0.19237 0.25489 0.32158 0.39304 0.47000 0.55339 0.64436
0.74444 0.85567 0.98083 1.12393 1.29098 1.49165 1.74297 2.07944 2.59027 3.68888
Optimal: 0.02533 0.07803 0.13371 0.19268 0.25529 0.32195 0.39315 0.46941 0.55134 0.63965
0.73511 0.83856 0.95087 1.07284 1.20500 1.38629 1.60942 1.89711 2.30246 4.60122

Again, the heuristic set of values was used as starting point for finding the optimal set
by minimization of the objective function (3.3) and using constraint (3.6), with
$n = 20$ and $M = 5$.

In this example, larger deviations in the moments of the samples with respect to the
theoretical exponential distribution were observed, even for the optimal sample. This result is


probably due to the high influence of the presence of large values on the determination of the
moments. Given that large values have a low frequency, they do not contribute
enough degrees of freedom for adjusting the whole set of moments considered. For
larger sample sizes, a better fit to the theoretical moments is expected. Comparing
the random and randomistic samples, the latter resulted in less absolute error although
without significantly reducing the variability. For this particular example it was possible to
observe similar relative errors in the distribution moments for both the heuristic and the
average of the random samples.

4.4. Standard Maxwell-Boltzmann Distribution

As it was previously mentioned in Section 2, the standard Maxwell-Boltzmann distribution has a


non-invertible cumulative probability function given by (Eq. 2.9):

$$F_Z(z) = \mathrm{erf}\!\left(\frac{2z}{\sqrt{\pi}}\right) - \frac{4z}{\pi}\,e^{-\frac{4z^{2}}{\pi}}$$

(4.12)

Figure 7. Comparison of observed cumulative probability for samples obtained from a standard
Maxwell-Boltzmann distribution. Red dashed line: Expected theoretical cumulative probability
for a standard Maxwell-Boltzmann distribution (Eq. 4.12). Green dotted line: Random sample
obtained using the inverse transform method (Eq. 4.13). Blue solid line: Randomistic sample
obtained using Eq. (4.14). Standard uniform random numbers used in the calculations were
obtained from the Mersenne Twister algorithm (MT19937).

Thus, in this example the inverse cumulative function will be approximated using Eq. (2.11):


( )
( ) ( )

(4.13)
Samples of 20 elements obtained using the inverse transform random and randomistic
sampling are compared in Figure 7. Randomistic sampling was performed using:

( )
( ) ( ( ) )

( )
(4.14)

Following the procedure of the previous examples, a performance evaluation of the different
sampling methods, including interval median heuristic and optimal deterministic sampling, is
presented in Table 8. The heuristic and optimal sets of values for a sample of 20 elements from
the standard Maxwell-Boltzmann distribution are presented in Table 9.

In this case, the heuristic set of standard Maxwell-Boltzmann values was obtained
using the approximation:

( ( ) )
( ) ( ( ) ) ( )

(4.15)

The optimal set was found by minimization of the objective function (3.3)
and using constraint (3.6), with $n = 20$ and $M = 5$, starting from the heuristic sample.

Table 8. Performance comparison of different sampling methods for discretizing the standard
Maxwell-Boltzmann distribution.
Sampling method | $E(Z)$ | $E(Z^2)$ | $E(Z^3)$ | $E(Z^4)$ | $E(Z^5)$ | Goodness-of-fit
Exact [12] 1 1.17810 1.57080 2.31319 3.70110 1
Interval Median Heuristic 0.99277 1.15630 1.52032 2.19793 3.42265 1
(rel. abs. error) (0.72%) (1.85%) (3.21%) (4.98%) (7.52%) (0%)
Optimal 0.99998 1.17819 1.57056 2.31381 3.70059 1
(rel. abs. error) (0.002%) (0.008%) (0.015%) (0.027%) (0.014%) (0%)
Random Average 0.98144 1.14582 1.53468 2.31395 3.86950 0.93991
(Inverse (rel.abs.error) (1.9%) (2.7%) (2.3%) (0.03%) (4.5%) (6.0%)
Transform) CV 9.6% 18.6% 29.7% 43.9% 61.2% 8.1%
Average 0.99586 1.17901 1.60387 2.46574 4.22513 0.99986
Randomistic (rel.abs.error) (0.4%) (0.08%) (2.1%) (6.6%) (14.2%) (0.014%)
CV 1.7% 5.6% 13.6% 26.8% 45.4% 0.011%


It can be observed again that for the randomistic sampling method, the error and variability of
the moments are reduced compared to random sampling. For this example, the goodness-of-fit
obtained for the randomistic samples was not exactly 1. The cause of this small deviation is the
use of an approximation for the calculation of the inverse cumulative function (Eq. 4.14). It can
also be appreciated that the optimal sample adequately preserves the moments considered,
whereas the heuristic sample presents important deviations.

Table 9. Deterministic samples for the standard Maxwell-Boltzmann distribution with a sample
size of 20.
Heuristic: 0.26395 0.43444 0.53289 0.60760 0.67064 0.72704 0.77948 0.82966 0.87877 0.92781
0.97767 1.02929 1.08378 1.14251 1.20744 1.28160 1.37013 1.48335 1.64746 1.97984
Optimal: 0.24742 0.41212 0.51495 0.59683 0.66789 0.73235 0.79242 0.84945 0.90431 0.95765
1.01000 1.06185 1.11373 1.16642 1.22118 1.28054 1.35009 1.44519 1.61412 2.06114

4.5. Application Example: Monte Carlo Method

The Monte Carlo method is a problem-solving strategy consisting of performing statistical


sampling experiments in a computer.[13] Monte Carlo methods are useful for analyzing the
behavior of highly non-linear random variables, when analytical solutions are not possible. In
this example, however, a problem with analytical solution is selected as case study for
comparison purposes. The case study is the determination of the following non-linear
properties of the standard Maxwell-Boltzmann distribution ($Z$): $E(1/Z)$, $E(1/Z^2)$, $Var(1/Z)$ and $E(\sqrt{Z})$.
Some of these properties are useful for understanding the behavior of molecular collisions.[14-
15]

Table 10. Performance comparison of different sampling procedures used in the Monte Carlo
method.
Sampling method | $E(1/Z)$ | $E(1/Z^2)$ | $Var(1/Z)$ | $E(\sqrt{Z})$

Exact [16] 1.27324 2.54648 0.92534 0.97628


Heuristic 1.25501 2.10328 0.52822 0.97358
(rel. abs. error) (1.43%) (17.4%) (42.9%) (0.28%)
Optimal 1.26809 2.22674 0.61869 0.97608
(rel. abs. error) (0.40%) (12.6%) (33.1%) (0.02%)
Random Average 1.86299 136.934 127.077 0.96986
(Inverse (rel.abs.error) (46.3%) (5277%) (13633%) (0.7%)
Transform) CV 137.0% 617.5% 627.3% 5.6%
Average 1.31265 3.55935 1.78701 0.97401
Randomistic (rel.abs.error) (3.1%) (39.8%) (93.1%) (0.23%)
CV 17.1% 190.4% 334.1% 0.7%


Table 10 summarizes the results obtained using the Monte Carlo method for estimating those
properties of the standard Maxwell-Boltzmann distribution, using different sampling
procedures and a sample size of 20 elements. Random and randomistic sampling were
replicated 50 times. The results obtained are compared to the exact analytical solutions.[16]
Please notice that none of those properties was explicitly used in the objective function for
finding the optimal sample.

From the selected properties, those involving $1/Z$ are particularly challenging because
small values of $Z$ have a large effect on the estimation of the expected values. However, the
deterministic and randomistic methods were capable of estimating those parameters within
the same order of magnitude. Random sampling, on the other hand, resulted in large
deviations of up to two orders of magnitude. The optimal deterministic sampling
method achieved the best performance in all properties considered. The largest relative
difference observed for this method was in the estimation of $Var(1/Z)$, with 33.1% error.
Regarding the variability of the results of the sampling methods based on pseudo-random
number generation, the randomistic approach presented a lower coefficient of variation (CV)
for all properties evaluated. The largest fluctuation of this method was again $Var(1/Z)$, with a CV
of 334.1%, almost half of the variation observed with the random sampling method. Of course,
better accuracy is expected in the estimation when larger sample sizes are used.

5. Conclusion

There are different methods for discretizing probability distributions into finite samples. Those
sampling procedures have been classified into random, deterministic and randomistic
sampling. Random sampling uses random (or pseudo-random) number generators for
obtaining individual elements from the population described by the corresponding probability
distribution being sampled. The most direct method for obtaining random samples is the
inverse transform method which requires an exact or approximate analytical inverse function
of the cumulative probability function of the probability distribution. Deterministic sampling
always provides the same set of values for any specific sample size and probability distribution.
They can be designed in order to reduce discrepancy, achieving a better coverage of the
sampling space, or to better preserve the moments of the original distribution while
adequately fitting the behavior of the cumulative probability distribution. Finally, randomistic
sampling combines properties from the previous methods, achieving a better goodness-of-fit
of the sample to the probability distribution (as in deterministic sampling), while resulting in
different, unpredictable values of the elements in different samples (as in random sampling). A
comparison of the different sampling methods was done considering different representative


standard probability distributions. It was observed that randomistic sampling allowed reducing
the variability in the results and improving the accuracy in the estimation of the moments,
compared to random sampling. Probability discretization methods are useful for Monte Carlo
(and quasi-Monte Carlo) applications, especially when analytical solutions are non-existent or
difficult to obtain.

Acknowledgments

The author wishes to thank Prof. Dr. Silvia Ochoa (Universidad de Antioquia, Colombia), for
helpful discussions on this topic.

This research did not receive any specific grant from funding agencies in the public,
commercial, or not-for-profit sectors.

References

[1] Hernandez, H. (2018). Comparison of Methods for the Reconstruction of Probability Density
Functions from Data Samples. ForsChem Research Reports 2018-12. doi:
10.13140/RG.2.2.30177.35686.

[2] Kneusel, R. T. (2018). Random Numbers and Computers. Springer International Publishing AG.

[3] Haahr, M. (2019, September 6). RANDOM.ORG: True Random Number Service. Retrieved
from https://www.random.org

[4] Hernandez, H. (2018). Multidimensional Randomness, Standard Random Variables and


Variance Algebra. ForsChem Research Reports 2018-02. doi: 10.13140/RG.2.2.11902.48966.

[5] Hernandez, H. (2017). Standard Maxwell-Boltzmann distribution: Definition and properties.


ForsChem Research Reports 2017-2. doi: 10.13140/RG.2.2.29888.74244.

[6] Barzilai, J., & Borwein, J. M. (1988). Two-point step size gradient methods. IMA Journal of
Numerical Analysis, 8(1), 141-148.

[7] Hernandez, H. (2017). Multivariate Probability Theory: Determination of Probability Density


Functions. ForsChem Research Reports 2017-13. doi: 10.13140/RG.2.2.28214.60481.


[8] Box, G. E. P. and Muller, Mervin E. (1958). A Note on the Generation of Random Normal
Deviates. Ann. Math. Statist. 29(2), 610-611. doi: 10.1214/aoms/1177706645.

[9] Hernandez, H. (2019). Goodness-of-fit of Randomistic Models. ForsChem Research Reports


2019-10. doi: 10.13140/RG.2.2.35386.34248.

[10] Niederreiter, H. (1978). Quasi-Monte Carlo methods and pseudo-random numbers. Bulletin
of the American Mathematical Society, 84(6), 957-1041.

[11] Hernandez, H. (2018). Introduction to Randomistic Optimization. ForsChem Research


Reports 2018-11. doi: 10.13140/RG.2.2.30110.18246.

[12] Hernandez, H. (2018). Expected Value, Variance and Covariance of Natural Powers of
Representative Standard Random Variables. ForsChem Research Reports 2018-08. doi:
10.13140/RG.2.2.15187.07205.

[13] Fishman, G. S. (2013). Monte Carlo. Concepts, Algorithms, and Applications. Springer Science
& Business Media: New York.

[14] Hernandez, H. (2017). Molecular Free Path Statistical Distribution of Multicomponent


Systems. ForsChem Research Reports 2017-6. doi: 10.13140/RG.2.2.15605.58088.

[15] Hernandez, H. (2017). Multicomponent Molecular Collision Kinetics: Rigorous Collision Time
Distribution. ForsChem Research Reports 2017-7. doi: 10.13140/RG.2.2.26218.31689.

[16] Hernandez, H. (2017). Standard Maxwell-Boltzmann Distribution: Additional Nonlinear and


Multivariate Properties. ForsChem Research Reports 2017-14. doi: 10.13140/RG.2.2.35761.07520.
