(Title) An Analysis of Execution Duration of Sorting Algorithms

Research Question
Do the type of input data (sorted reals, unsorted reals, unsorted mixture of reals
and characters) and the type of algorithm (Bubble, Selection) affect the
execution duration of sorting?

Hypotheses
H0: No significant difference exists in execution duration among the three types
of input data (sorted reals, unsorted reals, unsorted mixture of reals and
characters).
HA: A significant difference exists in execution duration among the three types
of input data (sorted reals, unsorted reals, unsorted mixture of reals and
characters).
H0: No significant difference exists in execution duration between Bubble Sort
and Selection Sort.
HA: A significant difference exists in execution duration between Bubble Sort
and Selection Sort.

Material and Methods


The data are the execution durations of two sorting algorithms (Bubble and
Selection). The algorithms have been applied fifty times to each of the following
types of input:
1) 100 unsorted (= randomly ordered) real numbers;
2) 100 already sorted real numbers;
3) A mixture of 100 unsorted reals and characters;
4) 100 partially sorted real numbers; and
5) 100 real numbers sorted in reverse order.
For this practical only the first three input types will be considered.
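
The measurement procedure described above can be sketched as follows. This is a minimal illustration in Python, not the original Java program: the function names, the fixed random seed, the value range and the nanosecond timer are all assumptions, and plain textbook variants of both algorithms are used. The mixed input of reals and characters is left out because the text does not say how the two kinds of value were compared.

```python
import random
import time


def bubble_sort(values):
    """Return a sorted copy using Bubble Sort (repeated adjacent swaps)."""
    items = list(values)
    for end in range(len(items) - 1, 0, -1):
        for i in range(end):
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
    return items


def selection_sort(values):
    """Return a sorted copy using Selection Sort (repeated minimum selection)."""
    items = list(values)
    n = len(items)
    for start in range(n - 1):
        smallest = start
        for i in range(start + 1, n):
            if items[i] < items[smallest]:
                smallest = i
        items[start], items[smallest] = items[smallest], items[start]
    return items


def time_runs(sort_fn, data, runs=50):
    """Time `runs` executions of sort_fn on data; durations are in nanoseconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter_ns()
        sort_fn(data)
        durations.append(time.perf_counter_ns() - start)
    return durations


rng = random.Random(0)  # fixed seed is an assumption, used only for repeatability
unsorted_reals = [rng.uniform(0.0, 1000.0) for _ in range(100)]
sorted_reals = sorted(unsorted_reals)

# One sample of 50 durations per (algorithm, input type) combination.
samples = {
    (name, label): time_runs(fn, data)
    for name, fn in (("Bubble", bubble_sort), ("Selection", selection_sort))
    for label, data in (("unsorted", unsorted_reals), ("sorted", sorted_reals))
}
```

Absolute timings from such a sketch depend on the machine and the language runtime, so they are not comparable with the values stored in data set 7.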
The sorting algorithms were run via Linux (and Windows, but these data will not
be used here) and programmed in Java by Hussain Alzaheri as a part of his MSc
Project in 2011, supervised by R. te Boekhorst. The resulting execution times are
stored as data set 7 in the DataSet Spread Sheet (StudyNet > Teaching
Resources > Practicals > DataSet) of the module 7COM1017.
Frequency distributions will be constructed from the data by pasting them into
an Excel sheet especially created for this purpose. The frequency distributions
are to be visualised as histograms and summarised by parameters of location
(mean, median and mode) and dispersion (variance and standard deviation).
The hypotheses will be evaluated by comparing the distributions of execution
times among input types and between the two algorithms Bubble and Selection.
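
For readers who want to cross-check the spreadsheet output, the parameters and the frequency distribution could be computed along these lines. This is a sketch using Python's standard `statistics` module rather than the Excel sheet used in the practical. The helper names, the bin width and the example values are made up for illustration, and the use of the sample (n - 1) variance is an assumption about what the sheet reports.

```python
import statistics
from collections import Counter


def describe(durations):
    """Parameters of location and dispersion for one sample of durations."""
    return {
        "mean": statistics.mean(durations),
        "median": statistics.median(durations),
        "mode": statistics.mode(durations),
        "variance": statistics.variance(durations),  # sample variance (n - 1)
        "stdev": statistics.stdev(durations),        # sample standard deviation
    }


def frequency_distribution(durations, bin_width):
    """Count observations per bin; each key is the lower edge of a bin."""
    counts = Counter((d // bin_width) * bin_width for d in durations)
    return dict(sorted(counts.items()))


# Made-up durations, not values from data set 7.
example = [30, 31, 31, 32, 34, 35, 35, 35, 40, 47]
print(describe(example))
print(frequency_distribution(example, bin_width=5))
```

The dictionary returned by `frequency_distribution` holds the bar heights of the corresponding histogram, one entry per bin.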

Results
Parameter values and frequency distributions of the samples are brought
together in Table 1 below.

Table 1 Descriptive statistics and frequency distributions of the execution
durations of two sorting algorithms and three types of input (1 mark for each
correct set of parameters and each correct frequency distribution = 10 marks in
total).

The frequency distributions of all six samples are visualised in Figure 1.

Figure 1 Frequency distributions of execution durations of three different types
of input and two sorting algorithms (1 mark for each correct plot).

In Figure 2a the histograms of the two algorithms are compared for each input
type, whereas in Figure 2b the comparison is made among the three input types
for each algorithm.

Figure 2 Comparisons of frequency distributions of execution durations of
different types of input and two sorting algorithms (1 mark for each correct
plot).

Conclusion
The shape of the distributions is typically skewed [to the left] (1 mark), and
this appears to be more strongly the case for the [Bubble] ( mark) algorithm
than for the [Selection] ( mark) algorithm (Figures 1 and 2b).
From the parameter values it can be seen that the average duration is larger for
[Selection] ( mark) than for [Bubble] ( mark) (Table 1). This is visible in the
histograms of Figure 2a as the [blue] ( mark) bars ([Selection]) ( mark) being
slightly more shifted [to the right] ( mark) than the [blue] ( mark) bars
([Bubble]) ( mark). This is especially clear in the case of [unsorted] input (1
mark) (Figure 2a). With respect to the input type, [sorted] (1 mark) input yields
the shortest average execution time, whereas the longest average processing
time was found for [mixed] (1 mark) input applied to the [Selection] (1 mark)
algorithm (Table 1).
A comparison among the distributions of different input types for each algorithm
(Figure 2b) shows that distributions of [sorted] (1 mark) input are [narrower] (1
mark) than those of other types of input (i.e. have a [smaller] (1 mark)
[distribution] (1 mark)).
However, because no statistical test has been applied, no statements about the
significance of the results (and hence about the acceptance or rejection of the
null hypothesis) can be made.
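
The visual judgements about skew direction and distribution width made in this section can be backed by numbers. The sketch below computes a moment-based skewness coefficient alongside the standard deviation. The `skewness` helper and both example samples are assumptions introduced for illustration and do not come from data set 7. These are still descriptive measures, not a significance test.

```python
import statistics


def skewness(sample):
    """Moment-based skewness: negative means a longer tail to the left,
    positive means a longer tail to the right."""
    mean = statistics.fmean(sample)
    sd = statistics.pstdev(sample)  # population standard deviation
    n = len(sample)
    return sum((x - mean) ** 3 for x in sample) / (n * sd ** 3)


# Made-up durations with a long tail towards large values.
right_tailed = [30, 30, 31, 31, 32, 33, 35, 40, 55]
# Mirror image of the same values: the tail now points towards small values.
left_tailed = [85 - x for x in right_tailed]

print(skewness(right_tailed), statistics.stdev(right_tailed))
print(skewness(left_tailed), statistics.stdev(left_tailed))
```

Mirroring a sample flips the sign of the skewness but leaves the standard deviation unchanged, which shows that direction of skew and width of a distribution are separate properties.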

Discussion
The results illustrate the relationship between values of dispersion parameters
on the one hand and the shape of the distribution on the other. [Larger] (1 mark)
[Unsorted] (1 mark) corresponds to [wider] (1 mark) distributions.
This is particularly clear in a plot in which the histogram of the algorithm
[Selection] (1 mark), input type [unsorted] (1 mark) ([duration] (1 mark) =
77.5 (1 mark)) is compared to that of the algorithm [Bubble] (1 mark), input
type [sorted] (1 mark) ([Duration] (1 mark) = 30 (1 mark)) (Figure 3).

Figure 3 Comparisons of frequency distributions of execution durations of
samples with most diverging [Duration] ( mark) (1 mark for correct plot).
