
Computers and Chemical Engineering 29 (2005) 2229–2241

Factorial design technique applied to genetic algorithm parameters in a batch cooling crystallization optimisation
Caliane B.B. Costa, Maria Regina Wolf Maciel, Rubens Maciel Filho
Chemical Engineering School, State University of Campinas (UNICAMP), Cidade Universitaria Zeferino Vaz,
CP 6066, CEP 13081-970 Campinas-SP, Brazil
Received 30 November 2004; received in revised form 6 August 2005; accepted 9 August 2005

Abstract
An original approach is proposed in this work for evaluating a genetic algorithm (GA) applied to a batch cooling crystallization optimisation.
Since many parameters must be set in a GA before an optimisation study can be performed, factorial design, a well-known technique for selecting
the variables with the most meaningful effects on a response, is applied to an optimisation problem solved through GA. No systematic approach
to establishing the best set of GA parameters was found in the literature, so a relatively easy-to-use and meaningful approach is proposed. The results
show that the parameters with a significant effect (95% confidence) are the initial population, the population size and the jump and creep mutation
probabilities; these are the parameters that should be varied during a GA optimisation study in the search for the optimum.
© 2005 Elsevier Ltd. All rights reserved.
Keywords: Batch; Crystallization; Dynamic simulation; Factorial design; Genetic algorithm; Optimisation

1. Introduction
Crystallization is a very important unit operation, used in many processes mainly because it leads to the
formation of particulate material with high purity. Batch operation offers the flexibility required when there
are many simple steps to be executed, with changing recipes. For this reason, batch crystallization is the
preferred process in the pharmaceutical, specialty and fine chemicals industries for obtaining their products.
Nowadays, operation requirements involve a trade-off between large throughput and a product with specified
properties related to particle size and size distribution. Furthermore, the operation employed in the
crystallizer during the batch influences all the subsequent processes (downstream processing), since the solids
produced constitute a mass of particulate material, which may exhibit an infinite number of different features,
like habit, crystal size distribution (CSD) or solvent hindering (Ma, Tafti, & Braatz, 2002). Optimal operation
is then important to obtain the desired product specification, as well as to improve the efficiency of the
overall process. In the batch crystallization field, this optimum must be determined in terms of the extent, at
each instant of the batch, of the kinetic phenomena that govern the extraction of solute from solution and its
deposition into the crystal lattice. The driving force for these phenomena is the supersaturation, which, in
batch cooling crystallization, is achieved through the cooling of the solution. Therefore, many optimisation
studies in batch cooling crystallization are focused on finding the optimal cooling profile (Costa, da Costa, &
Maciel Filho, 2005; Zhang & Rohani, 2003).

Corresponding author. Tel.: +55 19 3788 3971; fax: +55 19 3788 3965.
E-mail address: caliane@lopca.feq.unicamp.br (C.B.B. Costa).
0098-1354/$ – see front matter © 2005 Elsevier Ltd. All rights reserved.
doi:10.1016/j.compchemeng.2005.08.005

The solution of an optimisation problem can be found
through, among others, deterministic or stochastic approaches.
The former comprise the traditional optimisation methods
(direct and gradient-based methods), which have the disadvantages of requiring the first- and/or second-order derivatives of
the objective function and/or constraints, or of being inefficient in non-differentiable or discontinuous problems. Furthermore, the deterministic methods are dependent on the chosen
initial solution (Deb, 1999). The stochastic methods, such as
Genetic Algorithms (GAs), do not possess these drawbacks.
Genetic Algorithms (GAs) are part of the so-called evolutionary algorithms and compose a search and optimisation tool with
increasing application in scientific problems. They do not need
to have any information about the search space, just needing
an objective/fitness function that assigns a value to any solution. Details about the working principle of GAs can be found
elsewhere (Deb, 1998, 1999; Fuhner & Jung, 2004).

Nomenclature

⟨ ⟩               indicates absolute value of the operand, if it is negative, and zero value otherwise
A                 pre-exponential factor (primary nucleation) (m⁻³ s⁻¹)
A1                exponential constant of Type 1 function
A2                exponential constant of Type 2 function
Ac                heat transfer area (m²)
best fitness      vector that records the best fitness function in each generation
best individual   vector that records the best individual in each generation
B                 kinetic parameter of the primary nucleation law
c                 solute molecules concentration in solution, mol m⁻³ of solution
c*                solute molecules concentration in solution at saturation, mol m⁻³ of solution
Ci                granulometric class of rank i
ΔCi               width of class Ci
Cp                slurry specific heat (J kg⁻¹ K⁻¹)
Cs                solid concentration in the suspension, mol m⁻³ of suspension
C0                initial concentration of adipic acid, mol m⁻³ of solution
CV                coefficient of variation of the crystal size distribution (%)
f(x)              objective function
F(x)              fitness function
fmax              objective function value of the worst feasible solution in the population
G                 growth rate (m s⁻¹)
GC                generation counter
gj(x)             inequality constraint
ΔHc               heat of crystallization (J mol⁻¹)
(HR)              concentration of molecular adipic acid in solution, mol m⁻³ of solution
(HR*)             concentration of molecular adipic acid in solution at saturation, mol m⁻³ of solution
(H+)              concentration of protons in solution, mol m⁻³ of solution
idum              GA parameter to determine the initial population of individuals
iniche            GA parameter to determine if niching is used
iunifrm           GA parameter to determine if single or uniform crossover is used
i′                kinetic order of the secondary nucleation law
IC                individual counter
j′                kinetic order of the integration growth law
K                 modified acidity constant of adipic acid, mol m⁻³ of solution
k′                exponent to the solid concentration in secondary nucleation law
ka                surface shape factor
kc                kinetic constant of the integration law (m^(3j′−2) mol^(1−j′) s⁻¹)
k′N               kinetic constant of the secondary nucleation law (m^(3(i′+k′)−3) mol^(−i′−j′) s⁻¹)
kv                volumetric shape factor
L                 characteristic size of crystals (m)
Li                upper limit of class of number i (m)
m                 number of constraints
maxgen            maximum number of generations in the evolution of GA code
microga           GA parameter to determine if the micro-GA option is used
MM                molecular weight of the crystal (kg mol⁻¹)
n                 number distribution density (population) per unit volume of suspension (m⁻⁴)
N                 number of granulometric classes
nchild            GA parameter to determine the number of children per pair of parents
Ni(t)             number of crystals per unit volume of suspension in granulometric class Ci at time t, m⁻³ of suspension
p                 p-level, probability of error that is involved in accepting an effect as valid
npopsiz           GA parameter to determine the number of individuals per generation
pcreep            creep mutation probability in the GA code
pcross            crossover probability in the GA code
pmutate           jump mutation probability in the GA code
rN                net rate of nucleation (m⁻³ s⁻¹)
rN1               primary rate of nucleation (m⁻³ s⁻¹)
rN2               secondary rate of nucleation (m⁻³ s⁻¹)
r(I)              intrinsic rate of agglomeration of rank I_(m,n) (m⁻³ s⁻¹)
RA,i              net rate of agglomeration in the granulometric class Ci (m⁻³ s⁻¹)
RB,i              net rate of breakage in the granulometric class Ci (m⁻³ s⁻¹)
t                 instantaneous time (s)
T                 crystallizer solution absolute temperature (K)
Tc                coolant absolute temperature (K)
t(freedom degree) t-statistics
tf                final time (s)
tintermediate     intermediate time, where Type 1 and Type 2 functions have the same value (s)
ttotal            total batch time (s)
U                 global heat transfer coefficient (J m⁻² s⁻¹ K⁻¹)
V                 solution volume (m³)
Vsusp             suspension volume (m³)
V0                initial volume of the solution in the crystallizer (m³)
x                 vector containing the optimising (adjustable) variables

Greek letters
ηr                effectiveness factor
ν^est_(I,i)       stoichiometric coefficient of class i in agglomeration of number I
ρ                 slurry density (concentration), kg m⁻³ of slurry
ρc                crystal density, kg m⁻³ of crystal

The working principle of GAs requires setting a relatively large number of parameters. The course of the
evolution in evolutionary algorithms is determined partly at random and
partly by the values of their parameters. Because of this feature, it is
recommended, in an optimisation search by GAs, to perform
many runs to increase the chance of obtaining the global optimum.
Factorial design is a well-known technique based on
statistical considerations that provides the most meaningful information about the influence of parameters on a specific problem.
The present work proposes the application of factorial design
to the Genetic Algorithm parameters to determine which ones
significantly affect the optimisation of the cooling profile in a
batch crystallization system. The proposed approach needs to
be conducted prior to the optimisation trials through GAs, since
it screens out GA parameters that are not statistically significant
for the evolutionary search, saving time and computational burden in evolutionary optimisation studies. The proposed approach
makes the GA drive the system to an optimal solution through a
systematic procedure. Compared with the trial-and-error setting of GA parameters, this approach shortens the search for the
optimum. The results indicate that the initial population, the population size and the jump and creep mutation probabilities are
the parameters with significant relevance in the search for the
optimal cooling profile in a batch cooling crystallization system
by Genetic Algorithms.
2. Batch cooling crystallization
In a batch cooling crystallization operation, as shown
schematically in Fig. 1, the solution is cooled in order to create
supersaturation in the system, which is the driving force for
the kinetic mechanisms. Nucleation and growth are the most
dominant phenomena. Apart from them, other phenomena, such
as agglomeration and breakage, may occur during the process,
making it difficult to carry out reliable predictions. Neglecting
agglomeration may result in a poor representation of reality, especially when the crystallizing substance is known to
have an agglomerating behaviour (Costa et al., 2005).
The modelling of the process involves mass, energy and
population balances. The latter is a general approach and constitutes a complex partial differential equation, which accounts
for how the kinetic phenomena alter the population density both in
size and time. Several works in the literature (Kiparissides, 2004;
Puel, Fevotte, & Klein, 2003; Rawlings, Miller, & Witkowski,
1993) review the many techniques and methods used to solve
the population balance equation (PBE). In the present work, the
Method of Classes (Costa et al., 2005; Marchal, David, Klein,
& Villermaux, 1988; Nallet, Mangin, & Klein, 1998; Puel et al.,
2003) is used to solve the PBE in the modelling of an adipic
acid crystallization process, the chosen study system. It is worth
mentioning that the model of the process is highly nonlinear.
The model equations are Eqs. (1)–(3), which represent, respectively, the population, mass and energy balances,
coupled with the kinetic equations for the growth, nucleation
and agglomeration mechanisms, Eqs. (4)–(7):
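\[
\begin{gathered}
\frac{dN_1}{dt} + \frac{1}{V_{\mathrm{susp}}}\frac{dV_{\mathrm{susp}}}{dt}\,N_1 + \frac{G(L_1)}{2\,\Delta C_2}\,N_2 + \frac{G(L_1)}{2\,\Delta C_1}\,N_1 = r_N + R_{A,1} - R_{B,1}, \\
\frac{dN_i}{dt} + \frac{1}{V_{\mathrm{susp}}}\frac{dV_{\mathrm{susp}}}{dt}\,N_i + \frac{G(L_i)}{2\,\Delta C_{i+1}}\,N_{i+1} + \frac{G(L_i) - G(L_{i-1})}{2\,\Delta C_i}\,N_i - \frac{G(L_{i-1})}{2\,\Delta C_{i-1}}\,N_{i-1} = R_{A,i} - R_{B,i}, \\
\frac{dN_N}{dt} + \frac{1}{V_{\mathrm{susp}}}\frac{dV_{\mathrm{susp}}}{dt}\,N_N - \frac{G(L_{N-1})}{2\,\Delta C_N}\,N_N - \frac{G(L_{N-1})}{2\,\Delta C_{N-1}}\,N_{N-1} = R_{A,N} - R_{B,N}
\end{gathered}
\tag{1}
\]

[Eq. (2), the mass balance, is not legible in this copy. Its surviving terms are V_0 C_0, (H^+), K, C_s and M_M.]

\[
\rho\,C_p V \frac{dT}{dt} = -\Delta H_c\,3\rho_c k_v V_{\mathrm{susp}} \int_0^{\infty} n L^2 G\,dL - U A_c\,(T - T_c)
\tag{3}
\]

\[
G = \frac{dL}{dt} = \frac{k_a M_M k_c}{3\rho_c k_v}\,\eta_r\,(c - c^*)^{j'}
\tag{4}
\]

\[
r_{N1} = A \exp\left[-\frac{B}{\ln^2\left(\dfrac{(HR)}{(HR^*)}\right)}\right]
\tag{5}
\]

\[
r_{N2} = k'_N\,\left[(HR) - (HR^*)\right]^{i'} C_s^{k'}
\tag{6}
\]

\[
R_{A,i} = \sum_{I=1}^{N(N+1)/2} \nu^{\mathrm{est}}_{I,i}\,r(I)
\tag{7}
\]

Fig. 1. Schematic drawing of the batch cooling crystallizer and the concentration vs. temperature curve, showing two different cooling profiles.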

In Eqs. (1) and (7), the subscript i indicates the granulometric
class in which the population is being balanced and so varies
from 1 to N, the total number of granulometric classes.
The mass balance is based on the fact that changes in the solution concentration result in alteration of the mass of crystals per
unit volume and on the dissociation constant of the crystallizing substance. Eq. (3), the energy balance, takes into account
the heat of crystallization and the heat removed by the cooling
device.
The method of classes transforms the population balance partial differential equation into a system of ordinary differential
equations, Eq. (1), by discretizing the range of variation of the
variable L, related to crystal size, and assuming that the number density function is constant within each granulometric class. The
discretization is done from the nuclei size to the size of the largest
crystals. The defined sizes determine N granulometric classes C_i, whose widths are defined by ΔC_i = L_i − L_(i−1). The
resulting differential equations are no longer written in terms of population density functions, but in terms of the absolute number of crystals
in each class.
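As an illustration only, the growth part of the class balances can be sketched in Python. The sketch assumes a constant suspension volume and no nucleation, agglomeration or breakage, so only the growth fluxes between neighbouring classes remain; the function name `growth_rhs` and its arguments are hypothetical and are not taken from the authors' code.

```python
def growth_rhs(N, widths, G_bounds):
    """Growth-only rate of change dN_i/dt for each granulometric class.

    N        -- number of crystals in each class (N[0] is class 1)
    widths   -- class widths, Delta C_i, same length as N
    G_bounds -- growth rate evaluated at the upper limit of classes 1..N-1

    Assumptions of this sketch: constant suspension volume, and no
    nucleation, agglomeration or breakage terms.
    """
    n = len(N)
    # Number flux crossing the upper limit L_i of class i. The density at
    # the boundary is taken as the average of the two neighbouring class
    # densities N_i / Delta C_i, which gives the 1 / (2 Delta C) factors.
    flux = [
        G_bounds[i] * (N[i] / (2.0 * widths[i]) + N[i + 1] / (2.0 * widths[i + 1]))
        for i in range(n - 1)
    ]
    dNdt = []
    for i in range(n):
        inflow = flux[i - 1] if i > 0 else 0.0       # nothing grows into class 1
        outflow = flux[i] if i < n - 1 else 0.0      # nothing grows out of class N
        dNdt.append(inflow - outflow)
    return dNdt
```

Because every flux leaves one class and enters the next, the growth terms alone conserve the total number of crystals, which is a convenient check on an implementation.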
The nucleation rate, represented by rN, includes primary (rN1)
and secondary (rN2) nucleation. The former takes place when
there are no crystals of the crystallizing substance in suspension,
while the latter is more common in industrial practice because
seeding is almost always present.
The growth rate expression (Eq. (4)) is based on the film
model, and the effectiveness factor ηr is found by a proper relation
(Costa et al., 2005) to the mass transfer coefficient, kd. This
latter coefficient is found from an expression for the Sherwood number.
Details can be found in Costa et al. (2005). Each granulometric
class has its own value of the mass transfer coefficient, which means
that the growth rate is size dependent.
Only dual agglomeration is considered and its rate in each
granulometric class, RA,i, is dependent on an intrinsic rate of
agglomeration r(I). Further details about the agglomeration rate
expression are given in Costa et al. (2005).
The solubility of adipic acid (c* as a function of T, the solution temperature) is used in the calculation of the supersaturation
(c − c*), the driving force for the crystallization process. The
kinetic parameters (A, B, i′, j′, k′, kc and k′N) for adipic acid
are known, so one has to fix only the initial condition, i.e.,
the initial seeding (number of seeds added per unit volume for
each granulometric class), and the cooling profile (the curve of
Tc throughout the batch) in order to simulate the crystallization
process, i.e., how Ni, the number of crystals per unit volume
of suspension in each granulometric class, evolves during the batch
time.
2.1. The objectives of the optimisation problem
The rate of cooling used during the batch determines the values of supersaturation achieved, which characterize the extent
of the kinetic mechanisms. The favouring of nucleation over
growth leads to a wide crystal size distribution (CSD), with
many small crystals, owing to a large peak of supersaturation in
the early stages of the crystallization process. In batch crystallization, a large mean size and a narrow distribution are desired.
According to the literature, a cooling profile with
a gentle decrease at the beginning and a more pronounced
one at the end of the process makes the supersaturation evolve
smoothly, without peaks, leading to a narrower CSD, due to the
favouring of growth (Choong & Smith, 2004; Costa et al., 2005;
Mullin, 1993). Due to the importance of the final CSD in the
downstream processes and in product applications, the objectives of the optimisation in crystallization problems are normally
chosen according to features related to product specifications
and market requirements. The most common objective functions in crystallization optimisation problems are maximization
of the mean crystal size at the end of the batch, minimization
of the standard deviation (σ) of the final CSD or minimization of its coefficient of variation (CV). The latter is a very
interesting objective function, since it relates the standard deviation to the mean crystal size. Sometimes, the batch time is also
included in the objective function, but this is not considered
in this work, since a specified throughput is assumed. Bearing
this in mind, the optimisation problem is formulated so as to
minimize the CV. The final product CSD depends strongly on
the selected optimisation objective function (Zhang & Rohani,
2003) and this, in fact, makes the problem more difficult to
formulate.
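A minimal sketch of how a CV could be computed from the class populations is given below. It assumes a number-weighted mean and standard deviation evaluated at one representative size per class; the weighting actually used by the authors is not stated here, and the function name is hypothetical.

```python
import math


def coefficient_of_variation(counts, sizes):
    """CV (%) of a discretized CSD: 100 * standard deviation / mean size.

    counts -- number of crystals in each granulometric class
    sizes  -- one representative size per class (assumption of this sketch)
    """
    total = sum(counts)
    if total <= 0:
        # Choice made in this sketch: an empty population is rejected
        # instead of being assigned a CV value.
        raise ValueError("no crystals in the distribution")
    mean = sum(n * L for n, L in zip(counts, sizes)) / total
    variance = sum(n * (L - mean) ** 2 for n, L in zip(counts, sizes)) / total
    return 100.0 * math.sqrt(variance) / mean
```

Relating the spread to the mean in this way makes distributions with different mean sizes directly comparable.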
The high non-linearity of the crystallization model as well
as its dimensionality makes deterministic optimisation methods
inefficient and unlikely to be successful, apart from the fact that
the derivatives of the system variables are not easily computed
(Choong & Smith, 2004). GA is used in the present work in order
to determine the optimal cooling profile, with a fixed seeding
policy. In addition, a factorial design method is proposed as a tool
to improve the performance of the GA method for crystallization
processes through a choice of a suitable set of parameters.
2.2. Cooling profile
As mentioned earlier, the cooling profile is part of the operating strategy to obtain a product with the desired properties. In
practice, cooling is a usual procedure in industry, with two relevant
difficulties, namely how to find the optimal cooling profile from batch
to batch and how to implement it in real-time operation given
design restrictions. The latter is not considered in this
work. The coolant temperature, Tc, is therefore the
control variable considered.
The parameterisation of control profiles usually takes the
form of a sum of a convergent series of linearly independent
functions of time. Choong and Smith (2004) propose a
parameterisation framework for the control variable profile that is able
to produce all types of continuous curves. It consists of two distinct profiles, named Type 1 and Type 2, whose mathematical
formulations for the control variable considered in the present
work are described by the following equations:


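\[
\text{Type 1}: \quad T_c = T_{cF} - (T_{cF} - T_{c0})\left(1 - \frac{t}{t_{\mathrm{total}}}\right)^{A_1}
\tag{8}
\]

\[
\text{Type 2}: \quad T_c = T_{c0} - (T_{c0} - T_{cF})\left(\frac{t}{t_{\mathrm{total}}}\right)^{A_2}
\tag{9}
\]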

In these equations, Tc is the instantaneous value of the coolant temperature at time t, Tc0 is the initial value and TcF is the final value;
ttotal is the total batch time. The whole control variable profile
is composed of a combination of Type 1 and Type 2 functions,
each one applied over one period of the batch.
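The two profile types of Eqs. (8) and (9) can be written as short Python functions. This is only a sketch of the two individual functions; how they are joined into one composite profile is not shown, and the function names and the numerical values in the usage note are hypothetical.

```python
def type1(t, t_total, tc0, tcf, a1):
    """Type 1 profile, Eq. (8): coolant temperature at time t."""
    return tcf - (tcf - tc0) * (1.0 - t / t_total) ** a1


def type2(t, t_total, tc0, tcf, a2):
    """Type 2 profile, Eq. (9): coolant temperature at time t."""
    return tc0 - (tc0 - tcf) * (t / t_total) ** a2
```

Both functions start at Tc0 for t = 0 and end at TcF for t = ttotal; for example, `type1(0.0, 100.0, 50.0, 20.0, 2.0)` gives the initial value 50.0, and the exponents A1 and A2 only change the curvature in between.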
This framework is interesting because it reduces the
dimensionality of the problem to only six adjustable
variables, namely the initial and final control variable values, two exponential constants, the intermediate time and the
corresponding intermediate value of the control variable (the
whole function, composed of Type 1 and
Type 2 functions, is constrained to be continuous). This framework is
chosen for the formulation of the dynamic problem in the present
work.
2.3. A constrained problem
The problem to be solved in the crystallization system is the
optimisation of the cooling profile, that is, the best values of
coolant temperature to be imposed during the batch time are
sought through the optimisation algorithm. The cooling of the
crystallizer solution generates supersaturation,
the driving force for the crystallization process. No crystal would appear and/or grow in the system if supersaturation
did not occur. The formulated optimisation
problem therefore presents some constraints that must be imposed on the
optimisation algorithm. The first constraint deals with the need
to have supersaturation. It is necessary in order to rule out the
so-called trivial solutions, in which there is no crystallization at
all because the temperature is maintained at its initial value
(no generation of supersaturation). With no crystal being produced, there is no crystal size distribution and, with the objective
function being set as the minimization of the CV, it would reach
a minimum (zero) value in this condition. A constraint on the minimum acceptable yield of particles
should therefore be imposed,
in order to force the optimizer to search for coolant temperature
values that cause the production of a minimum mass of crystals.
Furthermore, as mentioned above, the control variable (coolant
temperature) must have a continuous profile, so the two function types (Eqs.
(8) and (9)) must be constrained to have the same value at the
intermediate time. Fig. 2 shows schematically two profiles
for the control variable. For each profile, the Type 1 and Type 2
functions have the same value of the coolant temperature at
the intermediate time (each cooling profile is continuous during
the whole batch).
2.4. Optimisation problem statement

Fig. 2. Two hypothetical cooling profiles, illustrating the feature of continuity of the whole control variable profile.

The general objective in optimisation problems is to choose a
set of variable values, subject to the various constraints, that will
produce the desired optimum response for the chosen objective
function. The purpose of the present optimisation problem is the
minimization of the coefficient of variation (CV) of the CSD at
the end of the crystallization batch, and the operating strategy to
obtain a product with the desired properties is the manipulation of
the cooling profile. The cooling profile (i.e., coolant temperature
values during batch time) imposed on the crystallizer is determined by a combination of Eqs. (8) and (9). The infinite possible
coolant temperature profiles are determined by the initial (Tc0)
and final (TcF) control variable values, the two exponential constants (A1 and A2), the intermediate time (tintermediate) at which both
functions (Type 1 and Type 2) intersect, and the definition of whether the control variable must be represented by Type
1 + Type 2 or Type 2 + Type 1 functions (more details on how the
sequence of functions is determined are given in Section 5). The
coolant temperature at each instant of the batch time is, therefore, determined by the values assigned to these six variables
(Tc0, TcF, A1, A2, tintermediate and the variable that determines
the sequence of functions). These six variables are assigned to
a vector, x, containing the optimising (adjustable) variables.
So, in the sense of an optimisation problem formulation, the
CV of the final CSD is the objective function and the optimising
variables are gathered in the vector x. The constraints that must be
imposed on the optimisation problem are the model equations,
Eqs. (1)–(7), the minimum acceptable yield of crystals (here
expressed as the mass of particles obtained at the end of the batch),
and the need for the functions of Eqs. (8) and (9) to
intersect at the intermediate time, that is, the Tc value calculated by the Type 1 function minus the Tc value calculated by
the Type 2 function must equal zero (equality constraint). This
equality constraint was handled by transforming it into an inequality
constraint with a tolerance set to 10⁻⁴.
In this way, the formal mathematical description of the formulated optimisation problem is given by Eq. (10):

\[
\begin{aligned}
\text{Minimize} \quad & CV_{t_f}(\mathbf{x}) \\
\text{Subject to} \quad & \text{model equations (Eqs. (1)--(7))} \\
& \text{mass of crystals}\,(t_f) \ge 50.0 \\
& \left| T_{cF} - T_{c0} - (T_{cF} - T_{c0})\left(1 - \frac{t_{\mathrm{intermediate}}}{t_{\mathrm{total}}}\right)^{A_1} + (T_{c0} - T_{cF})\left(\frac{t_{\mathrm{intermediate}}}{t_{\mathrm{total}}}\right)^{A_2} \right| \le 10^{-4}
\end{aligned}
\tag{10}
\]
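The minimum-mass and continuity constraints of Eq. (10) can be sketched as functions that return values which are non-negative when a constraint is satisfied. The function names are hypothetical, the final crystal mass is assumed to come from a separate simulation of the model equations, and the default limits simply repeat the values 50.0 and 10⁻⁴ quoted in the text.

```python
def constraint_values(mass_final, tc0, tcf, a1, a2, t_int, t_total,
                      min_mass=50.0, tol=1e-4):
    """Return [g1, g2]; each constraint holds when its value is >= 0.

    g1 -- final mass of crystals minus the minimum acceptable mass
    g2 -- tolerance minus |Type 1 - Type 2| at the intermediate time
    mass_final is assumed to be supplied by a process simulation.
    """
    tau = t_int / t_total
    type1_value = tcf - (tcf - tc0) * (1.0 - tau) ** a1   # Eq. (8) at t_int
    type2_value = tc0 - (tc0 - tcf) * tau ** a2           # Eq. (9) at t_int
    residual = type1_value - type2_value
    return [mass_final - min_mass, tol - abs(residual)]


def is_feasible(g_values):
    """A candidate is feasible when no constraint value is negative."""
    return all(g >= 0.0 for g in g_values)
```

With A1 = A2 = 1 both functions reduce to the same straight line, so the continuity residual vanishes for any intermediate time; with other exponents the residual is generally non-zero and the candidate is infeasible unless the six variables are chosen consistently.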

The objective function in Eq. (10) presents CV as a function of
x. The extent of each mechanism at each instant of the batch determines how the crystal size distribution (CSD) evolves during the
batch. So, at the end of a batch, different CSDs are obtained if different supersaturation profiles are imposed, which, ultimately,
corresponds to imposing different coolant temperature
profiles. For each CSD, a CV can be calculated and so, since
x determines the coolant temperature profile, CV is an implicit
function of x.
3. Genetic algorithms
3.1. Framework
Genetic Algorithms have proven to be very adaptable to a
great variety of different optimisation tasks (Fuhner & Jung,
2004). The algorithms work with a population of possible solutions, which undergoes evolution over the generations, an analogy
borrowed from Darwin's theory of evolution. Each solution is coded as a collection (chromosome) of binary or real
strings, each string representing a variable in the solution. The
evolution is achieved by genetic operators such as reproduction,
crossover and mutation. The survival of the fittest is achieved
by the assignment of a fitness function, usually defined as the
objective function for the unconstrained optimisation problem,
or a combination of the objective function and a penalty function
for constrained optimisation (Deb, 1998, 1999).
The number of solutions (i.e., the population size) per iteration (generation) is fixed. In each iteration, pairs of individuals are selected
randomly and are recombined into new solutions (crossover
operator). A random change to the offspring generation is
optionally applied (mutation operator). The newly created solutions are evaluated according to the fitness function (Fuhner &
Jung, 2004).
In a search for the optimum through the use of GA, it is
necessary to set the population size, the maximum number of
generations allowed during the search, the number of children
in the offspring generation per pair of parents and the crossover,
jump mutation and creep mutation probabilities. The difference
between the two types of mutation is that the jump mutation acts
on the chromosome (genotype), while creep mutation acts on the
decoded individual (phenotype). Concerning the crossover
operator, it is possible to define single-point, two-point or uniform crossovers. In the first one, just one crossover point is
selected and the string from the beginning of the chromosome
to the crossover point is copied from the first parent, while the
rest is copied from the other parent. In the two-point crossover,
two crossover points are selected. The string from the beginning
of the chromosome to the first crossover point is copied from
the first parent; the part from the first to the second crossover
point is copied from the other parent and the rest is copied from
the first parent again. Finally, in the uniform crossover, bits are
randomly copied from the first or from the second parent.
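The three crossover variants described above can be sketched for chromosomes stored as Python lists of bits. This is an illustration with hypothetical function names, not the FORTRAN routine used in this work; the crossover points are passed in explicitly so that the result is easy to follow.

```python
import random


def single_point(parent1, parent2, point):
    """Bits before `point` come from parent1, the rest from parent2."""
    return parent1[:point] + parent2[point:]


def two_point(parent1, parent2, first, second):
    """The middle segment comes from parent2, both ends from parent1."""
    return parent1[:first] + parent2[first:second] + parent1[second:]


def uniform(parent1, parent2, rng):
    """Each bit is copied at random from one parent or the other."""
    return [rng.choice((a, b)) for a, b in zip(parent1, parent2)]
```

For example, with `parent1 = [0] * 6` and `parent2 = [1] * 6`, `single_point(parent1, parent2, 2)` gives `[0, 0, 1, 1, 1, 1]` and `two_point(parent1, parent2, 2, 4)` gives `[0, 0, 1, 1, 0, 0]`. Uniform crossover needs a random generator, e.g. `uniform(parent1, parent2, random.Random(1))`.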
Genetic Algorithms can also borrow from nature the idea of
coexistence of multiple niches in order to deal with multimodal
optimisation. A sharing concept (in an analogy to the sharing,
in nature, of available resources, such as land and food) may
be introduced artificially in the GA population. This allows the coexistence of multiple optimal solutions (both local and global). More
details about niching in GA may be found in Deb (1999).
Another interesting tool available in the Genetic Algorithms
is the micro-GA technique, which uses a very small population
(micro-population) that converges towards a single individual
representing the best result obtainable with that particular population. Once the convergence is reached, the best individual is
preserved and the micro-population is restarted with new individuals. This GA largely depends on the mutation operator, since
such a small population cannot take advantage of the discovery
of good partial solutions by crossover. This tool works better
with unimodal or simple problems (Deb, 1999).
3.2. Constraint handling
Constraint handling in optimisation problems that use GAs is
not a simple task. The most usual approach is the use of penalty
functions. Nevertheless, their use may require a lot of refinement in order to determine the most suitable penalty parameters
needed to guide the search towards the constrained optimum.
Deb (2000) proposed a different constraint handling method,
exploiting the pair-wise comparison that GAs perform
during the selection of individuals, whether or not the selection
is done by tournament. Penalty parameters are not needed
in the proposed method because, in any scenario of comparison between two solutions, they are never compared in terms
of both objective function and constraint violation information.
The proposed fitness function is formulated in the following
manner, where infeasible solutions are compared based only on
their constraint violation (for a minimization problem):

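\[
F(\mathbf{x}) =
\begin{cases}
f(\mathbf{x}), & \text{if } g_j(\mathbf{x}) \ge 0,\ j = 1, 2, \ldots, m, \\
f_{\max} + \sum_{j=1}^{m} \langle g_j(\mathbf{x}) \rangle, & \text{otherwise}
\end{cases}
\tag{11}
\]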
The parameter fmax is the objective function value of the worst
feasible solution in the population. In this way, when two feasible
solutions are compared, the one with the better objective function
value is chosen; when one feasible and one infeasible solution
are compared, the feasible solution is chosen; when two infeasible solutions are compared, the one with the smaller constraint
violation is chosen (Deb, 2000).
For the present optimisation problem, i.e., minimization of
CVtf, the CV at the end of a batch in a cooling crystallization
process, Eq. (11) can be translated in the following manner.
F(x), the fitness function, is equal to CVtf, the CV at the end
of the batch (the objective function, f(x), of the crystallization
problem being considered), for feasible individuals, while, for
infeasible solutions, it is equal to the worst CVtf among all
feasible individuals in that generation plus the amount of constraint
violation.
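The fitness assignment of Eq. (11) can be sketched as follows for a minimization problem. The function names are hypothetical, and the fallback used when a generation contains no feasible individual (fmax set to zero) is an assumption of this sketch rather than something stated in the text.

```python
def bracket(g):
    """Absolute value of g if g is negative, zero otherwise."""
    return -g if g < 0.0 else 0.0


def fitness(f_value, g_values, f_max):
    """Fitness of one individual according to Eq. (11).

    f_value  -- objective function value, here the final CV
    g_values -- constraint values, satisfied when >= 0
    f_max    -- worst objective value among feasible individuals
    """
    if all(g >= 0.0 for g in g_values):
        return f_value
    return f_max + sum(bracket(g) for g in g_values)


def population_fitness(f_values, g_lists):
    """Fitness of every individual in one generation."""
    feasible = [f for f, gs in zip(f_values, g_lists)
                if all(g >= 0.0 for g in gs)]
    # Assumption of this sketch: f_max = 0 when nobody is feasible.
    f_max = max(feasible) if feasible else 0.0
    return [fitness(f, gs, f_max) for f, gs in zip(f_values, g_lists)]
```

Since an infeasible individual always receives a fitness above fmax, no penalty parameter has to be tuned: any feasible individual beats any infeasible one, and infeasible individuals are ranked only by their total violation.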
3.3. The employed code
The GA used was basically the FORTRAN Genetic Algorithm Driver by David Carroll, Version 1.7a (Carroll, 2004),
with some modifications. The code initializes a random sample of individuals with different parameters (problem variables).
This initial random sample of individuals is actually dictated by
the value assigned to a GA parameter named idum: the same
initial population is generated every time the code is run with
the same value assigned to idum. The selection scheme used is
tournament selection (Deb, 1999) with a shuffling technique for
choosing random pairs for mating. The individuals are coded in
binary form and the routine can apply jump mutation, creep
mutation and single-point or uniform crossover. Niching (sharing) and an option for the number of children per pair of parents
are included. An option for the use of a micro-GA is also part of
the code.
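The role of idum as a seed can be illustrated with Python's standard random number generator: the same seed always produces the same initial binary population. The sketch uses Python's own generator, not the one in Carroll's FORTRAN code, and the function name, population size and chromosome length are hypothetical.

```python
import random


def initial_population(seed, npopsiz, nbits):
    """Random binary population that is fully determined by `seed`."""
    rng = random.Random(seed)  # private generator, so runs are repeatable
    return [[rng.randint(0, 1) for _ in range(nbits)]
            for _ in range(npopsiz)]
```

Calling `initial_population(-1000, 5, 20)` twice returns identical populations, which is why the seed itself can be treated as a factor in a designed study of the GA parameters.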
The routine is used coupled with the crystallization model
(Eqs. (1)–(7)) in order to optimise the cooling profile parameterised as given by Eqs. (8) and (9). The constraint handling
method given by Eq. (11) was implemented in Carroll's original
code in order to perform the required constrained optimisation of the cooling profile.
Carroll's code has the following variables to be set in order
to run the optimisation:
microga: if set to 1, the micro-GA search is activated. In this work, microga is set to 0 (deactivated)
npopsiz: determines the number of individuals in each generation (iteration)
pmutate: jump mutation probability
maxgen: maximum number of generations to be accounted for in the evolution
idum: a parameter that determines the initial population of individuals; in the code, idum is the initial random number seed for the GA run and it must be a negative integer
pcross: crossover probability
pcreep: creep mutation probability
iunifrm: 0 for single-point crossover; 1 for uniform crossover; in this work uniform crossover is used
iniche: 0 for no niching, 1 for niching. In this work niching is used
nchild: determines if the number of children per pair of parents is 1 or 2. Two children per pair of parents are used in the present work.
More details about these parameters can be found in Carroll
(2004).
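For readers who drive such a code from a script, the settings listed above could be gathered in one validated container. The sketch below is a Python illustration only: the class `GASettings` is hypothetical and is not part of Carroll's driver. The default values are the central levels of Table 1 and the fixed settings reported in this work, and the validity checks are assumptions beyond the stated requirement that idum be negative.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GASettings:
    # GA parameters studied in this work (defaults: central values of Table 1).
    npopsiz: int = 50
    pmutate: float = 0.05
    maxgen: int = 50
    idum: int = -1000
    pcross: float = 0.8
    pcreep: float = 0.04
    # Settings kept fixed in this work.
    microga: int = 0   # micro-GA deactivated
    iunifrm: int = 1   # uniform crossover
    iniche: int = 1    # niching activated
    nchild: int = 2    # two children per pair of parents

    def __post_init__(self) -> None:
        if self.idum >= 0:
            raise ValueError("idum must be a negative integer")
        if self.nchild not in (1, 2):
            raise ValueError("nchild must be 1 or 2")
        for name in ("pmutate", "pcross", "pcreep"):
            if not 0.0 <= getattr(self, name) <= 1.0:
                raise ValueError(f"{name} must be a probability")

# Upper-level values of Table 1.
upper = GASettings(npopsiz=58, pmutate=0.0575, maxgen=58, idum=-850,
                   pcross=0.92, pcreep=0.046)
print(upper)
```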
To carry out an optimisation with the GA code, the remaining parameters must be studied: pmutate, pcross, pcreep, npopsiz, maxgen and idum. A factorial design was conducted in order to determine which of these six parameters have significant effects on the optimisation result, as well as how they interact with one another. This procedure is proposed in this work because it allows a systematic approach to finding a suitable set of parameters for the GA method.
4. Factorial design and its application to the problem
The factorial design method is a statistical technique that evaluates all process variables (or the variables of any other object of study) at the same time in order to determine which ones really exert significant influence on the final response, which gives a better analysis of the response (Kar, Banerjee, & Bhattacharyya, 2002). All variables are called factors and the different values chosen to study the factors are called levels (Barros Neto, Scarminio, & Bruns, 2001; Box, Hunter, & Hunter, 1978).
In a complete factorial design, all possible combinations of the selected levels of the factors are run, but this procedure may be too time-consuming. The most common factorial designs are the two-level ones, which provide enough information for the purpose of this work. Important trends may be observed with these factorial designs, and the effect of each independent variable on the dependent one is estimated. The values of the resulting first-order effects indicate the most sensitive parameters for the case studied and, consequently, which ones are most important in the search procedure. It is worth mentioning that the results obtained depend strongly on the case study to which the methodology is applied (Rodrigues, Toledo, & Maciel Filho, 2002).
When a relatively large number of factors is evaluated, the total number of combinations may be too large. Furthermore, the high-order interactions (third, fourth or higher) are usually small and may be indistinguishable from the standard deviation of the effects. In this case, it is advisable and convenient to use a fractional factorial design: the number of combinations is reduced and the most important effects are still determined (Barros Neto et al., 2001).
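To make the reduction in the number of runs concrete, the following Python sketch builds a two-level half-fraction design in coded units. It is an illustration under an assumption: the last factor is generated as the product of all the others (defining relation I = 12...n), which is a common choice for a half fraction but is not stated in the text at this point. The function name `half_fraction` is hypothetical.

```python
from itertools import product
from math import prod
from typing import List, Tuple

def half_fraction(n_factors: int) -> List[Tuple[int, ...]]:
    """Two-level 2^(n-1) design in coded units (-1 for lower, +1 for upper).

    The first n-1 factors form a complete factorial; the last factor is set
    to the product of the others.
    """
    runs = []
    for base in product((-1, 1), repeat=n_factors - 1):
        runs.append(base + (prod(base),))
    return runs

design = half_fraction(6)
print(len(design))            # 32 runs instead of the 64 of the complete design
print(design[0], design[-1])  # first and last combinations
```

Every column of such a design contains as many lower as upper levels, which is what allows each main effect to be estimated as a simple difference of two averages.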
In the interpretation of the results generated by a complete or fractional factorial design, it is necessary to decide which calculated effects are significantly different from zero. The usual practice is to use the concept of statistical significance (generally at the 95% confidence level).
When analysing the results of a factorial design, two statistical parameters are relevant. The t-statistic of a factor is obtained by dividing its effect by its standard error. This statistic depends on the number of degrees of freedom, which is calculated by subtracting the number of calculated effects from the total number of experiments/trials available. The higher the absolute value of the t-statistic, the higher the significance of the corresponding factor. The p-level, in turn, represents the probability of error involved in accepting the effect as valid and is a decreasing index of the reliability of a result: the higher the p-level, the less one can believe that the observed relation between factor and effect is reliable. The common practice is to consider 95% confidence in a result, so that, for an effect to be considered statistically significant, its p-level must be less than 0.05.
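As a worked illustration of these definitions, the numbers reported later in Table 3 can be used (33 runs, 7 calculated effects, and the effect of idum with its standard error). The symbols ν, N and n_effects are notation introduced here only for this example.

$$
\nu = N - n_{\text{effects}} = 33 - 7 = 26, \qquad
t = \frac{\text{effect}}{\text{S.E.}} = \frac{14.3513}{2.900876} \approx 4.947
$$

The p-level reported for this t-statistic with 26 degrees of freedom is 0.000039, which is below 0.05, so the effect is taken as significant at the 95% confidence level.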
The two levels evaluated in a factorial design are coded as (+) and (−), representing the upper and lower levels, respectively.
5. Systematic approach in optimisation with GAs: prior detection of significant parameters
The proposal to use factorial design to select the significant GA parameters is presented here in a systematic way, for the optimisation of the coolant temperature profile that minimizes the coefficient of variation of the CSD at the end of the batch in batch cooling crystallization.


Fig. 3. General working structure of GAs.

A supporting chart, Fig. 3, depicts the general working structure of GAs. The input data that must be supplied to GAs are composed of:
• GA settings: the use of the micro-GA, the type of crossover, niching and the number of children per pair of parents must be supplied to the GA code. As explained in Section 3.3, these characteristics are determined by the values assigned to microga, iunifrm, iniche and nchild, which were respectively 0, 1, 1 and 2 in this particular problem. These values are fixed and are not part of the so-called GA parameters, whose effects on the optimisation response are the object of study in the detection of statistical significance.
• Minimum and maximum allowed values of the problem variables: these bounds depend on the specific problem being considered. In the present work, the optimising variables are assigned to the vector x, which contains the values of Tc0, TcF, A1, A2, tintermediate and the variable that determines the sequence of Type 1 and Type 2 functions (Eqs. (8) and (9)). Based on the physical problem, Tc0 and TcF are allowed to vary between 298 and 340 K, the exponential constants between 10^−6 and 20, and the intermediate time between 0 and 1500 s (the batch time). Since the optimisation search should investigate whether it is best to have Type 1 + Type 2 or Type 2 + Type 1, the sequence of functions is delegated to a variable allowed to vary between −1 and +1: negative values determine Type 1 + Type 2, while nonnegative values determine Type 2 + Type 1.
• GA parameters: these parameters must be varied over many GA trials in order to drive so many possible evolutionary paths that reaching the global optimum of the specific process becomes more likely. These parameters are, as presented in Section 3.3, npopsiz, pmutate, maxgen, idum, pcross and pcreep.
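The bounds of the problem variables and the rule for the sequence variable can be written down compactly. The Python sketch below is illustrative only: the dictionary keys, the function names and the linear binary decoding (a common scheme for binary-coded GAs, with a hypothetical 10-bit gene in the usage lines) are assumptions and are not taken from the code used in this work.

```python
from typing import Dict, Tuple

# Bounds quoted in the text; the key names are illustrative.
BOUNDS: Dict[str, Tuple[float, float]] = {
    "Tc0": (298.0, 340.0),             # K
    "TcF": (298.0, 340.0),             # K
    "A1": (1e-6, 20.0),
    "A2": (1e-6, 20.0),
    "t_intermediate": (0.0, 1500.0),   # s
    "sequence": (-1.0, 1.0),
}

def decode_gene(bits: str, lower: float, upper: float) -> float:
    """Map a binary string linearly onto the interval [lower, upper]."""
    fraction = int(bits, 2) / (2 ** len(bits) - 1)
    return lower + fraction * (upper - lower)

def sequence_of_functions(value: float) -> str:
    """Negative values select Type 1 + Type 2; nonnegative values the reverse."""
    return "Type 1 + Type 2" if value < 0 else "Type 2 + Type 1"

print(decode_gene("0000000000", *BOUNDS["Tc0"]))   # lower bound
print(decode_gene("1111111111", *BOUNDS["Tc0"]))   # upper bound
print(sequence_of_functions(-0.3), "|", sequence_of_functions(0.0))
```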
The objective function is coupled with the constraints violation, as proposed by Deb (2000) (Eq. (11)), in order to calculate the fitness function of each individual. The problem model, the batch cooling crystallization, is necessary for the evaluation of both the objective function and the constraints violation. GC and IC in Fig. 3 are only counters (generation counter and individual counter, respectively) used by the algorithm to make calculations for each individual of each generation. The vectors best fitness and best individual record the best fitness function and the corresponding best individual in each generation.
As can be seen from the structure outlined in Fig. 3, given a set of values of the GA parameters, the GA optimisation code executes the evolutionary search and gives as output the best fitness function, that is, the minimum value found for the objective function (in the present work, the CV of the CSD at the end of the batch in the crystallization process). The outer box of Fig. 3, which encloses the whole sequence of steps of the genetic algorithm, can be seen as a black box: given an input (GA settings, minimum and maximum allowed values and GA parameters) for a particular problem model, the black box gives an output. Since the problem is fixed (here, the batch cooling crystallization model), the minimum and maximum allowed values are fixed. The GA settings are also fixed. In this way, the only inputs that can be varied are the GA parameters. This is the basis of the approach of using factorial design to identify which of these GA parameters really exert significant influence on the output. The proposed approach should be seen as an important prior analysis to be conducted in optimisation trials through GAs in order to discard GA parameters that are not statistically significant for the evolutionary search in the specific problem, saving time and computational burden in evolutionary optimisation studies.
A step-by-step description of the proposed approach may be
outlined as follows:
1. Define the case study/problem and formulate it mathematically (process model).
2. Define the objective function.
3. Define the constraints of the problem.
4. Define the control variables (optimising variables), i.e., the variables that compose the individuals and that should undergo evolution in order to provide better fitness functions.
5. Stipulate the GA settings and the minimum and maximum allowed values of the control variables.
6. Stipulate the values of the upper and lower levels for the GA parameters to be used in the factorial design study.
7. Build the complete or fractional factorial design spreadsheet, with the many combinations of GA parameters levels
that must be supplied to a GA to perform the evolutionary
optimisation. For information on how to build fractional
factorial designs, the reader is referred to Barros Neto et al.
(2001) and Box et al. (1978).
8. Perform the optimisation through GA for each combination
of GA parameters in order to obtain the problem response
to these GA parameters values.
9. Calculate the effect of each GA parameter on the problem response, as well as its error and statistical significance (p-level). Information on how to calculate the effects, their errors and p-levels is found in Barros Neto et al. (2001) and Box et al. (1978). Calculate, as well, the effects, errors and p-levels for the interactions between factors (GA parameters).
10. The GA parameters that do not show statistical significance on the problem response may be discarded in further optimisation studies because, irrespective of the value stipulated for these parameters, the problem response will not vary significantly in the statistical sense. The GA parameters that show statistically significant effects should be extensively varied in further optimisation work with this particular problem.
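Steps 6 to 9 can be sketched in a few lines of Python for a small complete two-level design. Everything in this sketch is hypothetical and serves only to show the flow of the procedure: the three parameters and their levels in `LEVELS`, the function names, and especially `toy_ga`, which stands in for a real GA optimisation run and simply returns a number.

```python
from itertools import product
from typing import Callable, Dict

# Step 6: hypothetical (lower, upper) levels for three GA parameters.
LEVELS = {"pmutate": (0.0425, 0.0575), "pcross": (0.68, 0.92), "npopsiz": (43, 58)}

def screen(run_ga: Callable[..., float]) -> Dict[str, float]:
    """Run every combination of levels and return the main effect of each parameter."""
    names = list(LEVELS)
    rows = list(product((-1, 1), repeat=len(names)))             # step 7: design
    responses = []
    for row in rows:                                             # step 8: GA runs
        # Coded level -1 picks the lower value, +1 the upper value.
        kwargs = {n: LEVELS[n][(s + 1) // 2] for n, s in zip(names, row)}
        responses.append(run_ga(**kwargs))
    half = len(rows) / 2
    return {n: sum(row[i] * y for row, y in zip(rows, responses)) / half
            for i, n in enumerate(names)}                        # step 9: effects

def toy_ga(pmutate: float, pcross: float, npopsiz: int) -> float:
    # Stand-in for a GA run: this fake response depends on npopsiz only.
    return 100.0 + 0.5 * npopsiz

effects = screen(toy_ga)
print(effects)
```

With this toy response only npopsiz shows a non-zero effect, so under step 10 the other two parameters would be the candidates to be discarded. Judging significance in a real study additionally requires the errors and p-levels mentioned in step 9.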
6. Results and discussion
The independent variables considered in this work and their corresponding values for each level are presented in Table 1. A 2^(6−1) fractional factorial design study was conducted, since the following GA parameters are taken into account: jump mutation probability (pmutate), crossover probability (pcross), creep mutation probability (pcreep), population size (npopsiz), maximum number of generations allowed (maxgen) and the initial sample of individuals (idum). Table 2 presents the combinations of GA parameters for the optimisations that were conducted, for the fractional factorial design with a central point. The central point is normally used with repetition for error estimation. However, there is no experimental error in computer simulations (the crystallization model, Eqs. (1)–(7), is used) and, therefore, only one point is used. The CV of the best individual in the last generation generated by the GA in each case is presented in the final column. An explanation of the crystallization optimisation problem is given in Appendix A, which brings detailed information on optimisation and model variables for one selected optimal solution, namely that of run pl32a.

Table 1
Levels of the parameters used in sensitivity analysis of the GA code applied to the crystallization problem
GA parameters   (−) Level   Central   (+) Level
(1) pmutate     0.0425      0.05      0.0575
(2) pcross      0.68        0.8       0.92
(3) pcreep      0.034       0.04      0.046
(4) npopsiz     43          50        58
(5) maxgen      43          50        58
(6) idum (a)    −1150       −1000     −850
(a) idum assumes a negative integer value and is the initial seed for the GA run; each value assigned to idum gives rise to a different initial population.

Table 2
Fractional factorial design 2^(6−1) study results
Run name   pmutate   pcross   pcreep   npopsiz   maxgen   idum   Response (CV)
pl01a      −         −        −        −         −        −      103.0
pl02a      −         −        −        −         +        +      127.7
pl03a      −         −        −        +         −        +      129.4
pl04a      −         −        −        +         +        −      99.03
pl05a      −         −        +        −         −        +      112.8
pl06a      −         −        +        −         +        −      105.6
pl07a      −         −        +        +         −        −      110.6
pl08a      −         −        +        +         +        +      120.4
pl09a      −         +        −        −         −        +      104.2
pl10a      −         +        −        −         +        −      118.6
pl11a      −         +        −        +         −        −      99.29
pl12a      −         +        −        +         +        +      131.8
pl13a      −         +        +        −         −        −      102.3
pl14a      −         +        +        −         +        +      110.4
pl15a      −         +        +        +         −        +      131.8
pl16a      −         +        +        +         +        −      125.6
pl17a      +         −        −        −         −        +      125.1
pl18a      +         −        −        −         +        −      97.55
pl19a      +         −        −        +         −        −      98.51
pl20a      +         −        −        +         +        +      115.1
pl21a      +         −        +        −         −        −      100.2
pl22a      +         −        +        −         +        +      104.0
pl23a      +         −        +        +         −        +      116.1
pl24a      +         −        +        +         +        −      100.4
pl25a      +         +        −        −         −        −      100.9
pl26a      +         +        −        −         +        +      113.2
pl27a      +         +        −        +         −        +      128.4
pl28a      +         +        −        +         +        −      99.0
pl29a      +         +        +        −         −        +      108.2
pl30a      +         +        +        −         +        −      102.3
pl31a      +         +        +        +         −        −      108.4
pl32a      +         +        +        +         +        +      122.3
Zero       0         0        0        0         0        0      102.0
The software STATISTICA (Statsoft, v. 6.0) was used to analyse the results. Table 3 presents the effect estimates of the GA parameters, calculated at the 95% confidence level, with no interactions between the factors. Fig. 4 shows the corresponding Pareto chart, used to identify the most important factors. The t-statistic in Table 3 is presented with its degrees of freedom, 26, since there were 33 available runs and only 7 effects were calculated (the mean plus the effect of each factor). The values of the t-statistic are also indicated next to each bar in the Pareto chart.
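The main effects can be recomputed directly from the responses of Table 2 as the mean response at the (+) level minus the mean response at the (−) level. The Python sketch below is an illustration and not the STATISTICA calculation: it assumes that the 32 factorial runs are in standard order with pmutate varying slowest and that the idum column equals the product of the other five columns, it uses exactly symmetric coded levels (−1, +1), and it leaves out the central point.

```python
from itertools import product
from math import prod

# Responses (CV) of runs pl01a to pl32a as listed in Table 2.
cv = [103.0, 127.7, 129.4, 99.03, 112.8, 105.6, 110.6, 120.4,
      104.2, 118.6, 99.29, 131.8, 102.3, 110.4, 131.8, 125.6,
      125.1, 97.55, 98.51, 115.1, 100.2, 104.0, 116.1, 100.4,
      100.9, 113.2, 128.4, 99.0, 108.2, 102.3, 108.4, 122.3]

names = ["pmutate", "pcross", "pcreep", "npopsiz", "maxgen", "idum"]

# Assumed coded design: standard order, idum = product of the other columns.
design = [base + (prod(base),) for base in product((-1, 1), repeat=5)]

def main_effect(column: int) -> float:
    """Mean response at the (+) level minus mean response at the (-) level."""
    return sum(row[column] * y for row, y in zip(design, cv)) / (len(cv) / 2)

for i, name in enumerate(names):
    print(f"{name:8s} {main_effect(i):9.4f}")
```

Under these assumptions the effects of pmutate, pcross, pcreep and idum come out at about −5.80, 2.58, −0.59 and 14.35, in agreement with the magnitudes reported in Table 3. The values for npopsiz and maxgen (about 6.26 and 0.86) differ slightly from the tabulated ones, possibly because their integer levels (43, 50, 58) are not exactly symmetric about the central value.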
As can be seen, two parameters, the initial population (expressed by idum) and its size (expressed by npopsiz), have significant effects on the search for a minimum CV of the final CSD in the batch cooling crystallization system optimisation. The Pareto chart of Fig. 4 shows that the jump mutation probability (pmutate) has an effect near the limit of significance when no interaction between the factors is considered. For this reason, an analysis with two-way factor interactions was performed. These results are outlined in Table 4 and the corresponding Pareto chart of effects is presented in Fig. 5.

Table 3
Effect estimates on CV for the fractional factorial design with no factor interactions
Factor          Effect     S.E.       t(26)      p
Mean *          111.3461   1.428299   77.95712   0.000000
(1) pmutate     −5.8038    2.900876   −2.00069   0.055969
(2) pcross      2.5750     2.900876   0.88766    0.382860
(3) pcreep      −0.5863    2.900876   −0.20209   0.841416
(4) npopsiz *   6.2929     2.900681   2.16947    0.039373
(5) maxgen      0.8992     2.900681   0.30999    0.759037
(6) idum *      14.3513    2.900876   4.94721    0.000039
Rows marked with * denote a significant effect at the 95% confidence level.

Fig. 4. Pareto chart of variables effects for CV of the best individual (at 95% of confidence level), with no factor interactions.

Fig. 5. Pareto chart of variables effects for CV of the best individual (at 95% of confidence level), with two factor interactions.
Once again, factors (6) and (4) (initial population and its size) show a strong effect on the CV of the final CSD of the best individual generated by the GA at the end of the evolution process. The interactions between the creep mutation probability and the initial population (pcreep and idum, factors (3) and (6)) and between the creep mutation probability and the population size (pcreep and npopsiz, factors (3) and (4)) are also significant. These interaction results are strongly influenced by factors (6) and (4), the most meaningful factors in the GA response in this case study, but they also show that the creep mutation probability (factor (3)) is a factor with important influence. As a result of the strong influence of factors (6) and (4), their interaction also has a significant effect on the final response.

Table 4
Effect estimates on CV for the fractional factorial design with two factor interactions
Factor          Effect     S.E.       t(11)      p
Mean *          111.3462   1.028469   108.2640   0.000000
(1) pmutate *   −5.8038    2.088822   −2.7785    0.017953
(2) pcross      2.5750     2.088822   1.2328     0.243365
(3) pcreep      −0.5863    2.088822   −0.2807    0.784179
(4) npopsiz *   6.2929     2.088681   3.0129     0.011805
(5) maxgen      0.8992     2.088681   0.4305     0.675146
(6) idum *      14.3513    2.088822   6.8705     0.000027
1 by 2          0.6425     2.088822   0.3076     0.764140
1 by 3          −1.3963    2.088822   −0.6684    0.517634
1 by 4          −1.6600    2.088822   −0.7947    0.443593
1 by 5 *        −4.8563    2.088822   −2.3249    0.040231
1 by 6          1.2912     2.088822   0.6182     0.549048
2 by 3          2.5750     2.088822   1.2328     0.243365
2 by 4          4.5562     2.088822   2.1813     0.051752
2 by 5          4.1025     2.088822   1.9640     0.075296
2 by 6          −2.6125    2.088822   −1.2507    0.236991
3 by 4 *        4.9700     2.088822   2.3793     0.036544
3 by 5          −0.7862    2.088822   −0.3764    0.713773
3 by 6 *        −5.5262    2.088822   −2.6456    0.022763
4 by 5          −1.9725    2.088821   −0.9443    0.365283
4 by 6 *        4.9575     2.088822   2.3733     0.036933
5 by 6          −2.2487    2.088822   −1.0766    0.304720
Rows marked with * denote a significant effect at the 95% confidence level.
Another GA parameter that should be carefully varied during a GA optimisation study, according to Table 4 and Fig. 5, is the jump mutation probability (pmutate, factor (1)), as evidenced by the significant effect of this factor. The interaction between the jump mutation probability and the maximum number of generations (pmutate and maxgen, factors (1) and (5)) also appears significant, but this result is attributed mainly to the strong effect of factor (1) and not to the importance of factor (5) on the final response.
The crossover probability (pcross, factor (2)) does not seem to affect the GA optimisation response significantly.
7. Conclusions
An original perspective on genetic algorithm parameters is proposed for the application of this stochastic optimisation technique to batch cooling crystallization systems. The factorial design technique was used in order to select the most meaningful parameters when optimising the coefficient of variation of the final crystal size distribution. The results point to the significance (95% confidence level) of the initial population, the population size and the jump and creep mutation probabilities. Future optimisation work should focus on varying these parameters during GA optimisation.


Acknowledgements
The authors would like to thank David L. Carroll for the FORTRAN GA code, available on the World Wide Web, and FAPESP (Fundação de Amparo à Pesquisa do Estado de São Paulo), process # 01/01586-1.
Appendix A
This appendix brings detailed information on model variables and optimisation results for run pl32a in Table 2. As can be seen from Tables 1 and 2, pl32a was run with all GA parameters at the upper level, i.e., the values assigned to pmutate, pcross, pcreep, npopsiz, maxgen and idum were, respectively, 0.0575, 0.92, 0.046, 58, 58 and −850. The 58 individuals defined by idum, each one carrying values for all parameters of Eqs. (8) and (9), plus the definition of the sequence of functions (Type 1 + Type 2 or Type 2 + Type 1) to be followed, evolve for 58 generations, with the rates of the GA operators defined by pmutate, pcross and pcreep. Each individual is transferred to the crystallization process model subroutine. Detailed information on the crystallization system variables (such as crystallizer dimensions, global heat coefficient, initial solution concentration, solubility data of adipic acid as a function of solution temperature, class boundaries and so on) can be found in Costa et al. (2005). The seeding policy used consisted of 1.32 × 10^12 crystals in the seventh class and 1.62 × 10^8 crystals in the 20th class.

Fig. A1. Profiles of coolant temperature, temperature of solution inside the crystallizer and supersaturation for best individual of run pl32a.
The CV reported in Table 2 for pl32a is the minimum CV calculated from all the CSDs generated by the crystallization simulations, each of which received as input one set of parameters of Eqs. (8) and (9) (vector x). The minimum CV determines the best individual evolved, that is, the best values for the parameters of Eqs. (8) and (9), as well as the definition of the sequence of functions. The best individual for pl32a is presented in Table A1.

Fig. A2. CSD generated for best individual of run pl32a.
The coolant temperature for the best individual of pl32a is then defined as expressed in Eq. (12), which is determined first by the Type 2 function and then by the Type 1 function.

$$
T_c =
\begin{cases}
337.6164 - (337.6164 - 309.3425)\left(\dfrac{t}{1500}\right)^{1.7613}, & \text{if } t \le 1267.0 \\[6pt]
309.3425 - (309.3425 - 337.6164)\left(1 - \dfrac{t}{1500}\right)^{0.7292}, & \text{if } t > 1267.0
\end{cases}
\tag{12}
$$
Fig. A1 shows the best cooling profile (Eq. (12)), determined by the information presented in Table A1, as well as the profiles of the crystallizer medium temperature and of supersaturation during the whole batch time. Both the crystallizer medium temperature and the supersaturation profiles are generated by the process simulation with the model equations and process parameters, whose details can be found in Costa et al. (2005). An illustration of the CSD obtained for the individual presented in Table A1 is depicted in Fig. A2.

Table A1
Vector x for the best individual of pl32a
x component             Value
Sequence of functions   Type 2 + Type 1
Tc0                     337.6164
TcF                     309.3425
tintermediate           1267.0000
A1                      0.7292
A2                      1.7613
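The cooling profile of the best individual can be evaluated numerically from the values of Table A1. The short Python sketch below is an illustration, assuming that the minus signs and exponents of Eq. (12) are placed as in the usual Type 1 and Type 2 cooling expressions; the function name `coolant_temperature` is hypothetical.

```python
def coolant_temperature(t: float) -> float:
    """Coolant temperature (K) at time t (s) for the best individual of pl32a,
    using the values of Table A1 in the piecewise form of Eq. (12)."""
    tc0, tcf = 337.6164, 309.3425      # initial and final coolant temperature, K
    a1, a2 = 0.7292, 1.7613            # exponents of the Type 1 and Type 2 functions
    t_int, t_batch = 1267.0, 1500.0    # intermediate time and batch time, s
    if t <= t_int:
        # Type 2 segment
        return tc0 - (tc0 - tcf) * (t / t_batch) ** a2
    # Type 1 segment
    return tcf - (tcf - tc0) * (1.0 - t / t_batch) ** a1

for t in (0.0, 600.0, 1267.0, 1268.0, 1500.0):
    print(t, round(coolant_temperature(t), 2))
```

The profile starts at Tc0, ends at TcF, and the two segments give practically the same temperature at the intermediate time, so the cooling curve has no visible jump there.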
References
Barros Neto, B., Scarminio, I. S., & Bruns, R. E. (2001). Como fazer experimentos: pesquisa e desenvolvimento na ciência e na indústria. Campinas: Editora da Unicamp, pp. 83–184.
Box, G. E. P., Hunter, W. G., & Hunter, J. S. (1978). Statistics for experimenters: An introduction to design, data analysis and model building. New York: Wiley, pp. 306–342, 374–409.
Carroll, D. L. (2004). D. L. Carroll's FORTRAN Genetic Algorithm Driver, in: http://cuaerospace.com/carroll/ga.html, accessed on 28 September 2004.
Choong, K. L., & Smith, R. (2004). Optimization of batch cooling crystallization. Chemical Engineering Science, 59, 313–327.
Costa, C. B. B., da Costa, A. C., & Maciel Filho, R. (2005). Mathematical modeling and optimal control strategy development for an adipic acid crystallization process. Chemical Engineering and Processing, 44, 737–753.
Deb, K. (1998). Genetic algorithm in search and optimization: the technique and applications. In Proceedings of the International Workshop on Soft Computing and Intelligent Systems, Machine Intelligence Unit (pp. 58–87).
Deb, K. (1999). An introduction to Genetic Algorithms. In Sadhana: Academy Proceedings in Engineering Sciences, 24, Part 4–5 (pp. 293–315).
Deb, K. (2000). An efficient constraint handling method for Genetic Algorithms. Computer Methods in Applied Mechanics and Engineering, 186, 311–338.
Fühner, T., & Jung, T. (2004). Use of genetic algorithms for the development and optimization of crystal growth processes. Journal of Crystal Growth, 266, 229–238.
Kar, B., Banerjee, R., & Bhattacharyya, B. C. (2002). Optimization of physicochemical parameters for gallic acid production by evolutionary operation-factorial design technique. Process Biochemistry, 37, 1395–1401.
Kiparissides, C. (2004). Challenges in polymerization reactor modeling and optimization: A population balance perspective. In Proceedings of DYCOPS 7, 7th International Symposium on Dynamics and Control of Process Systems.
Ma, D. L., Tafti, D. K., & Braatz, R. D. (2002). Optimal control and simulation of multidimensional crystallization processes. Computers and Chemical Engineering, 26, 1103–1116.
Marchal, P., David, R., Klein, J. P., & Villermaux, J. (1988). Crystallization and precipitation engineering: I. An efficient method for solving population balance in crystallization with agglomeration. Chemical Engineering Science, 43, 59–67.
Mullin, J. W. (1993). Crystallization. Oxford: Butterworth-Heinemann, pp. 354–355, 389–391.
Nallet, V., Mangin, D., & Klein, J. P. (1998). Model identification of batch precipitations: Application to salicylic acid. Computers and Chemical Engineering, 22, S649–S652.
Puel, F., Fevotte, G., & Klein, J. P. (2003). Simulation and analysis of industrial crystallization processes through multidimensional population balance equations. Part 1: A resolution algorithm based on the method of classes. Chemical Engineering Science, 58, 3715–3727.
Rawlings, J. B., Miller, S. M., & Witkowski, W. H. (1993). Model identification and control of solution crystallization processes: A review. Industrial and Engineering Chemistry Research, 32, 1275–1296.
Rodrigues, J. A. D., Toledo, E. C. V., & Maciel Filho, R. (2002). A tuned approach of the predictive-adaptative GPC controller applied to a fed-batch bioreactor using complete factorial design. Computers and Chemical Engineering, 26, 1493–1500.
Zhang, G. P., & Rohani, S. (2003). On-line optimal control of a seeded batch cooling crystallizer. Chemical Engineering Science, 58, 1887–1896.
