2, June 2011
2011 TFSA
C.-J. Lin et al.: Protein 3D HP Model Folding Simulation Using a Hybrid of Genetic Algorithm and Particle Swarm Optimization 141
the set of all valid conformations for s [22].

3. Methods

A. Particle Swarm Optimization
The idea of particle swarm optimization was originally introduced by Kennedy and Eberhart in 1995 to study social and cognitive behavior [23]-[24]. The idea originated in studies on the synchronous flocking of birds and the schooling of fish. The PSO has come to be widely used as a problem-solving method in engineering and computer science. This algorithm has several highly desirable attributes, including a basic algorithm that is very easy to understand and implement. It is similar in some ways to evolutionary algorithms, but requires less computational bookkeeping and generally fewer lines of code.
In the PSO, the trajectory of each individual in the search space is adjusted by dynamically altering the velocity of each particle, according to its own flight experience and the flight experience of the other particles in the search space. The position vector and the velocity vector of the ith particle in the D-dimensional search space can be represented by $X_i = (x_{i1}, x_{i2}, x_{i3}, \ldots, x_{iD})$ and $V_i = (v_{i1}, v_{i2}, v_{i3}, \ldots, v_{iD})$, respectively. According to a user-defined fitness function, suppose that the best position of each particle (which corresponds to the best fitness value obtained by the particle at that time) is $P_i = (p_{i1}, p_{i2}, p_{i3}, \ldots, p_{iD})$ and that the fittest particle found so far is $P_g = (p_{g1}, p_{g2}, p_{g3}, \ldots, p_{gD})$. Then the new velocities and the positions of the particles for the next fitness evaluation are calculated using the following two equations:

$$v_{id}^{k+1} = v_{id}^{k} + c_1 \cdot \mathrm{rand}() \cdot (p_{id} - x_{id}^{k}) + c_2 \cdot \mathrm{rand}() \cdot (p_{gd} - x_{id}^{k}) \tag{2}$$

$$x_{id}^{k+1} = x_{id}^{k} + v_{id}^{k+1} \tag{3}$$

where $c_1$ and $c_2$ are constants known as acceleration coefficients, and the two rand() terms are separately generated, uniformly distributed random numbers in the range [0, 1].
The concept of the updated velocity is illustrated in Figure 2. The first part of Eq. (2) represents the previous velocity, which provides the necessary momentum for particles to roam across the search space. The second part of Eq. (2), known as the cognitive component, represents the personal thinking of each particle. The cognitive component encourages the particles to move toward their own best positions found up to that point. The third part of Eq. (2) is known as the social component, which represents the collaborative effect of the particles in finding the global optimal solution. The social component always pulls the particles toward the best global particle found so far. Changing velocity enables every particle to search around its best individual and global positions. After the new velocity is obtained, the particle updates its position by following Eq. (3). When every particle is updated, the fitness value of each particle is calculated again. If the fitness value of the new particle is higher than that of the local best, then the local best is replaced with the new particle; further, if the local best is better than the current global best, then the global best is replaced with the local best in the swarm.

Figure 2. The diagram of the updated velocity in the PSO.

B. Particle Swarm Optimization
Figure 3 shows the flowchart of the proposed hybrid genetic-based particle swarm optimization (HGA-PSO). The hybrid learning algorithm works with a population of candidate solutions. At each generation, the n best individuals of the population are selected based on their fitness (the minimum free energy). The details are as follows:
• Initialization step: If the input amino acid sequence is of length n, then each individual in the population is a string of length n-1 over the symbol set {R, L, F, B, D, U} that denotes a valid conformation in the 3D square lattice [10]. The symbols R, L, F, B, D and U denote the fold directions right, left, forward, backward, down and up in the encoding scheme, respectively. An initial population is generated randomly within a fixed range of an (n-1)-dimensional space. As Figure 4 shows, internal movements are represented with absolute directions.
• Reproduction step: Reproduction is a process in which individual strings are copied according to their fitness value. In this study, we use the roulette-wheel selection method [25]: a simulated roulette is spun for this reproduction process. The best performing individuals (that is, the whole of the top half is considered as best) [26] advance to the next generation. The other half is generated by performing crossover and mutation operations on individuals in the top half of the parent generation.
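As an illustration of the reproduction step, the following Python sketch keeps the lower-energy half of the population and spins a roulette wheel over that half to pick parents for the remaining slots. The function names, the fitness shift `-energy + 1`, and the choice to spin the wheel only over the survivors are assumptions made for this sketch; the text does not specify them.

```python
import random


def roulette_select(population, weights, rng):
    """Pick one individual with probability proportional to its weight."""
    total = sum(weights)
    spin = rng.uniform(0.0, total)
    running = 0.0
    for individual, weight in zip(population, weights):
        running += weight
        if spin <= running:
            return individual
    return population[-1]  # guard against floating-point round-off


def reproduce(population, energies, rng):
    """Return (survivors, parents) for one generation.

    survivors: the top half of the population (lowest free energy).
    parents:   roulette-wheel picks from the survivors, one per slot in the
               other half; crossover and mutation would be applied to them.
    """
    ranked = sorted(zip(energies, population), key=lambda pair: pair[0])
    half = len(population) // 2
    survivors = [individual for _, individual in ranked[:half]]
    # Assumed fitness shift: HP energies are <= 0 and lower is better,
    # so -energy + 1 yields strictly positive roulette weights.
    weights = [-energy + 1.0 for energy, _ in ranked[:half]]
    parents = [
        roulette_select(survivors, weights, rng)
        for _ in range(len(population) - half)
    ]
    return survivors, parents
```

Passing a seeded `random.Random` instance as `rng` makes a run repeatable, which helps when comparing selection schemes.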
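The direction-string encoding from the initialization step can be illustrated with a short Python sketch that decodes a string over {R, L, F, B, D, U} into lattice coordinates, checks that the walk is self-avoiding, and scores it with the standard HP contact energy (minus one for each pair of non-consecutive H residues on adjacent lattice sites). The axis assignment in `MOVES` and all function names are assumptions, not taken from the paper.

```python
# Unit steps on the 3D lattice. Which axis counts as "right" or "forward"
# is an assumption of this sketch.
MOVES = {
    "R": (1, 0, 0), "L": (-1, 0, 0),
    "F": (0, 1, 0), "B": (0, -1, 0),
    "U": (0, 0, 1), "D": (0, 0, -1),
}


def decode(directions):
    """Turn an absolute-direction string into a list of lattice coordinates."""
    coords = [(0, 0, 0)]
    for symbol in directions:
        dx, dy, dz = MOVES[symbol]
        x, y, z = coords[-1]
        coords.append((x + dx, y + dy, z + dz))
    return coords


def is_valid(coords):
    """A conformation is valid when no lattice site is visited twice."""
    return len(set(coords)) == len(coords)


def hp_energy(sequence, directions):
    """Return minus the number of H-H contacts, or None if invalid.

    sequence:   string over {'H', 'P'} of length n.
    directions: string over MOVES of length n - 1.
    """
    coords = decode(directions)
    if len(coords) != len(sequence) or not is_valid(coords):
        return None
    index_of = {site: i for i, site in enumerate(coords)}
    contacts = 0
    for i, (x, y, z) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only along the three positive axes so each pair is counted once.
        for dx, dy, dz in ((1, 0, 0), (0, 1, 0), (0, 0, 1)):
            j = index_of.get((x + dx, y + dy, z + dz))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts
```

For example, the sequence `"HPPH"` folded as `"RFL"` forms a unit square whose two H residues touch, so `hp_energy("HPPH", "RFL")` evaluates to -1.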
... several highly desirable attributes, including the fact that the basic algorithm is very easy to understand and implement. It requires less computational memory and fewer lines of code. Each particle has a velocity vector v and a position vector x to represent a possible solution. There are three possibilities for each particle during evolution: (1) remain as it is; (2) move towards the present optimum: each particle remembers its own personal best position that it has found, called the local best; and (3) move towards the best population it has encountered: each particle is also influenced by the best position found by any particle in the swarm, called the global best.
• Local search step: After mutation, a local search is ... the best direction in a local search by three methods. A new folding direction is accepted when it is superior to the original direction; if the new folding direction is not better than the original direction, the original direction is kept.
• Termination condition: The algorithm is run for a maximum of 50,000 iterations or until the minimum free energy is achieved for the sequence. The best member of the population is then returned.
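For reference, the plain continuous update of Eqs. (2) and (3), together with the local-best and global-best replacement rule, can be sketched in Python as follows. This is generic PSO on real-valued vectors, not the direction-string mutation used in HGA-PSO; the function names, the search range, and the defaults `c1 = c2 = 2.0` are assumptions of this sketch.

```python
import random


def pso_step(positions, velocities, pbest, gbest, c1, c2, rng):
    """Apply Eqs. (2) and (3) in place to every particle and dimension."""
    for i in range(len(positions)):
        for d in range(len(positions[i])):
            cognitive = c1 * rng.random() * (pbest[i][d] - positions[i][d])
            social = c2 * rng.random() * (gbest[d] - positions[i][d])
            velocities[i][d] += cognitive + social  # Eq. (2)
            positions[i][d] += velocities[i][d]     # Eq. (3)


def run_pso(fitness, dim, n_particles=20, iterations=100,
            c1=2.0, c2=2.0, seed=0):
    """Maximize `fitness`; return the best position and its fitness value."""
    rng = random.Random(seed)
    # Assumed search range [-5, 5] in every dimension.
    positions = [[rng.uniform(-5.0, 5.0) for _ in range(dim)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in positions]
    pbest_fit = [fitness(p) for p in positions]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(iterations):
        pso_step(positions, velocities, pbest, gbest, c1, c2, rng)
        for i, p in enumerate(positions):
            fit = fitness(p)
            if fit > pbest_fit[i]:        # new particle beats its local best
                pbest[i], pbest_fit[i] = p[:], fit
                if fit > gbest_fit:       # local best beats the global best
                    gbest, gbest_fit = p[:], fit
    return gbest, gbest_fit
```

Because the text treats a higher fitness as better, an energy-minimization problem would pass the negated energy as `fitness`. The sketch has no inertia weight or velocity clamping, so it mirrors Eq. (2) exactly rather than later PSO variants.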
4. Simulation Results
as described in [28]. B cells had the aging parameter B = 5, with the memory B cells Bm = 10, and a maximum number of evaluations equal to $10^5$. ClonalgI used 10 individuals in the population. The duplication rate was equal to 4, the mutation rate was equal to 0.6, and the termination criterion was $10^5$ evaluations.

1   20   -11   -11   -11   -11   -11   -11
2   24   -13   -13   -13   -13   -13   -13
3   25    -9    -9    -9    -9    -9    -9
4   36   -18   -18   -18   -18   -18   -18
5   48   -29   -29   -25   -25   -29   -29
6   50   -26   -26   -23   -23   -23   -26
7   60   -49   -49   -37   -39   -41   -48

International Journal of Fuzzy Systems, Vol. 13, No. 2, June 2011

The simulation results are summarized in Table 4 in terms of the best found free energy (best), mean, and standard deviation (SD). The HGA-PSO, which adopts
the genetic-based particle swarm optimization mutation process, was able to obtain a new minimal energy value for protein instances 6 and 7. For HGA-PSO, the standard deviation was less than 1 on every instance, whereas each of the other algorithms exceeded 1 on at least one instance.

Table 4. Performance of HGA-PSO and the compared algorithms on the benchmark sequences.

No.                         1        2        3        4        5        6        7
Size                       20       24       25       36       48       50       60
HGA-PSO
  Best                    -11      -13       -9      -18      -29      -26      -49
  Mean                    -11      -13       -9   -17.72   -28.88   -25.92   -48.62
  SD                        0        0        0     0.98     0.47     0.27     0.59
GA
  Best                    -11      -13       -9      -18      -25      -23      -37
  Mean                    -11      -13       -9   -16.16   -24.22   -22.58    -36.6
  SD                        0        0        0     1.99     0.60     0.66     0.48
Backtracking-EA [27]
  Best                    -11      -13       -9      -18      -25      -23      -39
  Mean                 -10.31   -10.90    -7.98   -14.38   -20.80   -20.20   -34.18
  SD                        0     0.36        0     0.88     1.17     1.15     2.00
Aging-AIS [28]
  Best                    -11      -13       -9      -18      -29      -23      -41
  Mean                    -11      -13       -9   -16.76   -25.16   -22.60   -39.28
  SD                        0        0        0     1.02     0.45     0.40     0.24
ClonalgI [29]
  Best                    -11      -13       -9      -18      -29      -26      -48
  Mean                 -10.40   -11.26    -8.06   -15.04   -24.20   -23.08   -42.65
  SD                     0.57     0.90     0.87     1.37     2.22     2.05     2.74

5. Conclusions

This paper proposed an HGA-PSO for solving the protein structure prediction problem. We presented an improved genetic algorithm whose mutation is based on particle swarm optimization, in which the cognitive component encourages the particles to move toward their own best positions. For long sequences, we found the optimal protein structure with minimum energy. We demonstrated that our algorithm can be applied successfully to the protein folding problem based on the three-dimensional hydrophobic-polar lattice model. Simulation results indicate that our approach performs better than existing evolutionary algorithms.

Acknowledgement

This research is supported by the National Science Council of R.O.C. under grant NSC 99-2221-E-167-022.

References

[1] K. A. Dill, "Theory for the folding and stability of globular proteins," Biochemistry, vol. 24, no. 6, pp. 1501-1509, 1985.
[2] R. Unger and J. Moult, "Genetic algorithms for protein folding simulations," Journal of Molecular Biology, vol. 231, no. 1, pp. 75-81, May 1993.
[3] A. Shmygelska, R. Aguirre-Hernandez, and H. H. Hoos, "An ant colony optimization algorithm for the 2D HP protein folding problem," Proc. Int. Workshop Ant Algorithms, Brussels, Belgium, pp. 40-52, Sep. 2002.
[4] F. M. Liang and W. H. Wong, "Evolutionary Monte Carlo for protein folding simulations," Journal of Chemical Physics, vol. 115, no. 7, pp. 3374-3380, 2001.
[5] N. Krasnogor, B. P. Blackburne, E. K. Burke, and J. D. Hirst, "Multimeme algorithms for protein structure prediction," Proc. Int. Conf. Parallel Problem Solving from Nature (PPSN VII), Granada, Spain, pp. 769-778, Sep. 2002.
[6] Y. Z. Guo, E. M. Feng, and Y. Wang, "Optimal HP configurations of proteins by combining local search with elastic net algorithm," Journal of Biochemical and Biophysical Methods, vol. 70, no. 3, pp. 335-340, April 2007.
[7] M. T. Hoque, M. Chetty, and L. S. Dooley, "An efficient algorithm for computing the fitness function of a hydrophobic-hydrophilic model," 4th International Conference on Hybrid Intelligent Systems, pp. 285-290, 2004.
[8] R. König and T. Dandekar, "Refined genetic algorithm simulation to model proteins," Journal of Molecular Modeling, vol. 5, no. 12, pp. 317-324, Dec. 1999.
[9] O. Takahashi, H. Kita, and S. Kobayashi, "Protein folding by a hierarchical genetic algorithm," The 4th Int. Symp. on Artificial Life and Robotics (AROB), pp. 19-22, 1999.
[10] A. Yap and I. Cosic, "Application of genetic algorithm for predicting tertiary structure of peptide chains," IEEE Engineering in Medicine and Biology, p. 1214, Oct. 1999.
[11] T. N. Bui and G. Sundarraj, "An efficient genetic algorithm for predicting protein tertiary structures in the 2D HP model," Proceedings of the 2005 Conference on Genetic and Evolutionary Computation (GECCO'05), pp. 385-392, 2005.
[12] T. Jiang, Q. Cui, G. Shi, and S. Ma, "Protein folding simulations for the hydrophobic-hydrophilic model by combining tabu search with genetic algorithms," Journal of Chemical Physics, vol. 119, no. 8, pp. 4592-4596, Aug. 2003.
[13] A. Ranganath, K. C. S. Shet, and N. Vidyavathi, "Efficient shape descriptors for feature extraction in 3D protein structures," In Silico Biology, vol. 7, pp. 169-174, 2007.
[14] N. Krasnogor, W. E. Hart, J. Smith, and D. A. Pelta, "Protein structure prediction with evolutionary algorithms," Proceedings of the Genetic and Evolutionary Computation Conference, Orlando, FL, Morgan Kaufmann, USA, pp. 1596-1601, July 1999.
[15] A. L. Patton, W. F. Punch III, and E. D. Goodman, "A standard GA approach to native protein conformation prediction," Proceedings of the Sixth International Conference on Genetic Algorithms, pp. 574-581, Morgan Kaufmann, 1995.
[16] J. Pedersen and J. Moult, "Protein folding simulations with genetic algorithms and a detailed molecular description," Journal of Molecular Biology, vol. 269, no. 2, pp. 240-259, June 1997.
[17] N. Krasnogor, D. Pelta, P. M. Lopez, P. Mocciola, and E. de la Canal, "Genetic algorithms for the protein folding problem: a critical view," in C.F.E. Alpaydin, ed., Proc. Engineering of Intelligent Systems, ICSC Academic Press, 1998.
[18] J. Song, J. Cheng, and T. Zheng, "Protein 3D HP model folding simulation based on ACO," Sixth International Conference on Intelligent Systems Design and Applications (ISDA'06), vol. 1, pp. 410-415, 2006.
[19] C. J. Lin, Y. C. Liu, and C. Y. Lee, "An efficient neural fuzzy network based on immune particle swarm optimization for prediction and control applications," International Journal of Innovative Computing, Information and Control (IJICIC), vol. 4, no. 7, pp. 1711-1722, July 2008.
[20] C. J. Lin and S. J. Hong, "The design of neuro-fuzzy networks using particle swarm optimization and recursive singular value decomposition," Neurocomputing, vol. 71, pp. 297-310, Dec. 2007.
[21] C. Huang, X. Yang, and Z. He, "Protein folding simulations of 2D HP model by the genetic algorithm based on optimal secondary structures," Computational Biology and Chemistry, pp. 137-142, 2010.
[22] A. Shmygelska and H. H. Hoos, "An ant colony optimisation algorithm for the 2D and 3D hydrophobic polar protein folding problem," BMC Bioinformatics, p. 30, 2005.
[23] R. Eberhart and J. Kennedy, "A new optimizer using particle swarm theory," Proceedings of the Sixth International Symposium on Micro Machine and Human Science (MHS'95), pp. 39-43, Oct. 1995.
[24] R. Eberhart and J. Kennedy, "Particle swarm optimization," IEEE International Conference on Neural Networks, vol. 4, no. 27, pp. 1942-1948, 1995.
[25] O. Cordon, F. Herrera, F. Hoffmann, and L. Magdalena, "Genetic fuzzy systems evolutionary tuning and learning of fuzzy knowledge bases," Advances in Fuzzy Systems-Applications and Theory, vol. 19, World Scientific Publishing, NJ, 2001.
[26] C. J. Lin and Y. C. Hsu, "Reinforcement hybrid evolutionary learning for recurrent wavelet-based neuro-fuzzy systems," IEEE Trans. on Fuzzy Systems, vol. 15, no. 4, pp. 729-745, Aug. 2007.
[27] C. Cotta, "Protein structure prediction using evolutionary algorithms hybridized with backtracking," Lecture Notes in Computer Science, vol. 2687, pp. 321-328, 2003.
[28] V. Cutello, G. Morelli, G. Nicosia, and M. Pavone, "Immune algorithms with aging operators for the string folding problem and the protein folding problem," Evolutionary Computation in Combinatorial Optimization (EvoCOP), pp. 80-90, May 2005.
[29] C. P. de Almeida, R. A. Gonçalves, and M. R. Delgado, "A hybrid immune-based system for the protein folding problem," Evolutionary Computation in Combinatorial Optimization (EvoCOP), pp. 13-24, 2007.

Cheng-Jian Lin received the B.S. degree in electrical engineering from Ta-Tung University, Taiwan, R.O.C., in 1986 and the M.S. and Ph.D. degrees in electrical and control engineering from the National Chiao-Tung University, Taiwan, R.O.C., in 1991 and 1996. Currently, he is a full Professor in the Computer Science and Information Engineering Department, National Chin-Yi University of Technology, Taichung County, Taiwan, R.O.C. His current research interests are soft computing, pattern recognition, intelligent control, image processing, bioinformatics, and FPGA design.

Shih-Chieh Su received his B.Sc. degree in Information Management from Ling-Tung University of Technology, Taiwan, R.O.C., in 2006 and his M.Sc. degree in Computer Science and Information Engineering from Chaoyang University of Technology, Taiwan, R.O.C., in 2006. Currently, he is studying for a Ph.D. in the Computer Science and Information Engineering Department, National Chung Cheng University, Chiayi County, Taiwan, R.O.C. His current research interests are evolutionary computation, metaheuristic algorithms, and bioinformatics.