Abstract
Hume-Rothery's breadth of knowledge combined with a quest for generality gave him insights into the reasons for solubility in metallic systems that have become known as Hume-Rothery's Rules. Presented with solubility details from similar sets of constitutional diagrams, can one expect artificial neural networks (ANN), which are blind to the underlying metals physics, to reveal similar or better correlations? The aim is to test whether it is feasible to predict solid solubility limits using ANN with the parameters that Hume-Rothery identified. The results indicate that the correlations expected by Hume-Rothery's Rules work best for a certain range of copper or silver alloy systems. The ANN can predict a value for solubility, which is a refinement on the original qualitative duties of Hume-Rothery's Rules. The best combination of input parameters can also be evaluated by ANN.
© 2007 Acta Materialia Inc. Published by Elsevier Ltd. All rights reserved.
Keywords: Hume-Rothery's Rules; Artificial neural networks; Solubility limit of metals; Backpropagation networks; Binary alloys
1. Introduction

Materials science seeks to understand the causative relationships between composition, processing, structure and properties at a level that allows composition and processing parameters to be selected to provide targeted properties. Such relationships can be discerned by experiment and, in a few instances, by predictive theory. They can sometimes be obtained by molecular modelling. All molecular modelling techniques can be classified under three general categories: (1) ab initio electronic structure calculations, which are based upon quantum mechanics; (2) semi-empirical methods, which are also founded upon quantum mechanics, but which enhance computational speed by using approximations based upon experiment; (3) molecular mechanics, an empirical method based on classical physics which is computationally fast [1].

Another approach is to use correlation methods made possible by artificial neural networks (ANN), which are finding growing acceptance in many subjects for modelling complex problems [2]. They are model-free in the sense that they can process complex input–output relationships without an explicit mathematical model [3] and are becoming popular in materials science in solving problems that are not suitable for traditional statistical methods [4]. They can process large amounts of information and mimic biological systems in learning ability and capability to generalize. They can handle non-linearity, imprecise and fuzzy information and are fault and failure tolerant [5]. Importantly, they offer the materials scientist compositionally predictive power in which conventional theory is sometimes lacking because of theoretical complexity. A good example is their use in predicting dielectric constants from composition [6].

Various networks have been devised but backpropagation networks in which the data are forward-fed into the network without feedback and without same-layer neural connections are the most widely used [2,7]. The model is shown schematically in Fig. 1. In such artificial systems, learning is a process of updating an internal representation of an external system. During learning, the magnitude of the weightings or synapse strengths is adjusted repetitively as the network is presented with training data.

* Corresponding author. Tel.: +44 (0)20 76794689.
E-mail address: j.r.g.evans@qmul.ac.uk (J.R.G. Evans).
1359-6454/$30.00 2007 Acta Materialia Inc. Published by Elsevier Ltd. All rights reserved.
doi:10.1016/j.actamat.2007.10.059
Y.M. Zhang et al. / Acta Materialia 56 (2008) 1094–1105
becomes large when new data are presented to the network. This means that the network has learned the training examples but is unable to generalize to new situations. There are two common methods for improving generalization: Bayesian Regularization and Early Stopping [60]. Bayesian Regularization tends to provide better generalization performance than Early Stopping in training function approximation networks. As a result, Bayesian Regularization is used for improving generalization. This involves setting the sum of squares of the network errors on the training set to give the best generalization.

4.4. Partitioning of the database

The generalizing ability of the network depends on the training database size [2]. Although ANN can be obtained from a training database of any size, like other empirical models, generalization of these models outside the model domain is adversely affected. Since ANN are required to generalize for the unseen data, data used for training should be large enough to cover the possible known variation in the problem domain.

The development of an ANN based on Bayesian Regularization requires partitioning of the parent database into two sub-sets: training and testing. Currently, there are no definitive rules for determining the required sizes of the data sub-sets. Rules of thumb derived from experience and analogy between ANN and statistical regression exist [2]. Following the method suggested by Matlab [60], the sets are picked as equally spaced points throughout the original data. Ratios between 2:1 and 5:1 are tested and, based on regression coefficient for the testing set, the ratio 4:1 is selected, i.e., partitioning the whole data set into five groups, four groups being used for training, while one group is used for testing. The sizes of the training set and testing set are thus determined, but the choice of testing set still plays a crucial role, because the training set should include all the data belonging to the problem domain. In this work, the problem domain is not clear, so, referring to Malinov and Sha's work [4], a loop program was used to redistribute the database in order to make the training set cover the problem domain. The distribution was selected on the basis of regression coefficient R (R = 1 corresponds to perfect correlation), for the testing set. However, where M, the slope of the linear regression line, is smaller than 0.9, the regression coefficient provides an unreliable criterion, and so the selection was based on

$\varphi = |M - 1| + (1 - R) + \left| B / B_{\max} \right|$,

where B is the intercept on the A-axis, and B_max is the maximum solubility. The ideal value of this parameter is zero.

4.5. Data normalization

The data should be normalized within a uniform range (e.g., [0, 1] or [−1, 1]) in order to prevent larger numbers from overriding smaller ones and to prevent premature saturation of the neurons of hidden layers, which would impede the learning process [2]. There are two functions for scaling the inputs and targets of networks that have been implemented in the Matlab Neural Network Toolbox: PREMNMX, which is used to scale inputs and targets so that they fall in the range [−1, 1]; and PRESTD, which normalizes the inputs and targets so that they will have zero mean and unity standard deviation. As the transfer functions employed here are the tan-sigmoid transfer function and the linear function, PREMNMX is adopted.

Because the size of the database in this work is small, it is crucial to make the training set cover the problem boundary. As in Malinov and Sha's work [4], a looped program is used in order to find the best combination of database distribution and number of neurons in the first hidden layer. The criteria used to find the best combination are discussed below.

5. Determination of input parameters

The network input parameters are the physical parameters including (1) atomic size parameter, (2) valence parameter, (3) electrochemical parameter, i.e., electronegativity, and (4) structure parameter of solvent and solute atoms, which were not mentioned in 1934, but were introduced in 1948 [42] concerning the detailed examination of Vegard's law [61] in the case of metallic solid solutions.

Three different expressions of these parameters are used:

1. The raw data that Hume-Rothery used. Details are discussed below.
2. The original collected values for each parameter of solvent and solute atoms.
3. The original collected parameters are converted into functionalized values before putting them into the networks:
   (a) For the size factor. The difference between the atomic diameters of solvent and solute atoms divided by the diameter of the solvent atoms is used.
   (b) For the valence factor. These are integers and the original values are used, leaving the neural network to decide the relations between valence of solvents and solutes.
   (c) For the electrochemical factor. The difference between that of the solvent and solute atoms is used.
   (d) For the structure parameter. The expressions of the structures can be put in terms of numbers 1–14 for the Bravais lattices, but this revealed little effect of the structures. They can also be expressed in three sets of numbers representing primitive cell dimensions, angles and systems. This allows some similarities to be explored. The three sets are (1) unit cell length (a = b = c; a = b ≠ c; a ≠ b ≠ c), (2) axes angles (α = β = γ = 90°; α = β = 90°, γ = 120°;
[Figure 2: solubility predicted from NN (A) plotted against experimental solubility (T), showing data points, the best linear fit and the line A = T. Panel (a): best linear fit A = 0.977T + 0.23, R = 0.993. Panel (b): best linear fit A = 0.962T − 1.19, R = 0.992. Panel (c): R = 0.992.]
Fig. 2. Prediction of solubility using three functionalized parameters for the 60-alloy system data set: atomic size, valency and electronegativity. (a) Training set; (b) testing set; (c) whole set.
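The data preparation behind plots such as Fig. 2 (functionalized inputs, equally spaced 4:1 partitioning, and scaling to [−1, 1]) can be illustrated with a minimal Python sketch. It is not the authors' Matlab program. The names `Element`, `functionalize`, `split_equally_spaced` and `scale_to_range` are hypothetical, the sign convention of the differences (solvent minus solute) is an assumption, and the copper and zinc values are illustrative placeholders rather than entries from the paper's database.

```python
from dataclasses import dataclass


@dataclass
class Element:
    diameter: float           # atomic diameter in angstroms
    valence: int
    electronegativity: float  # Pauling scale


def functionalize(solvent: Element, solute: Element) -> list[float]:
    """Build one input row: size factor, both valences, electronegativity difference."""
    # Assumed sign convention: solvent minus solute.
    size_factor = (solvent.diameter - solute.diameter) / solvent.diameter
    en_difference = solvent.electronegativity - solute.electronegativity
    # Valences are passed through unchanged, as the text describes.
    return [size_factor, float(solvent.valence), float(solute.valence), en_difference]


def split_equally_spaced(rows: list, n_groups: int = 5, test_group: int = 0):
    """Send every n_groups-th row to the testing set (4:1 split when n_groups is 5)."""
    train, test = [], []
    for index, row in enumerate(rows):
        (test if index % n_groups == test_group else train).append(row)
    return train, test


def scale_to_range(values: list[float], low: float = -1.0, high: float = 1.0) -> list[float]:
    """Min-max scaling of one variable to [low, high]."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant variable carries no information; place it mid-range.
        return [(low + high) / 2.0 for _ in values]
    return [low + (high - low) * (v - v_min) / (v_max - v_min) for v in values]


# Illustrative placeholder values, not taken from the paper's database.
copper = Element(diameter=2.556, valence=1, electronegativity=1.90)
zinc = Element(diameter=2.665, valence=2, electronegativity=1.65)
row = functionalize(copper, zinc)
```

Changing `test_group` from 0 to 4 shifts which equally spaced rows form the testing set, which is one simple way to imitate a loop that redistributes the database between training and testing.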
The functionalized structural parameter described above is then incorporated in place of the Bravais lattice number, and the results are shown in Fig. 3. Comparing this with Fig. 2, the difference is not great, nor can it be said that one is superior to the other. This could imply that the structure parameter does not play a very important role and, indeed, Hume-Rothery did not include it in 1934.

There are several ways to evaluate the performance of neural network predictions. The first and simplest is based on the value of the linear regression coefficient R for the plot of predicted vs experimental output. A problem occurs when R is low (<0.9), and the slope M is close to unity or vice versa. Under these circumstances, slope M, intercept B, and R can be combined to give one parameter φ as defined above, which should be as close as possible to zero. This has the advantage of providing a single value that can be used as a criterion for parameter selection in a looped optimization program. However, in this composite parameter, the contribution of each of M, B and R is treated as equal, whereas a weighting might be preferable. An alternative method is to consider the mean error of the predicted value from the experimental value. There are three ways in which this error can be calculated: (1) the mean true error having the same unit as target values; (2) the mean modulus of error again having the same unit as target; and (3) the percentage error (as modulus) based on the experimental value. The problem of (1) is that a zero mean error can be obtained from very large deviations from the line, and the problem with (3) is that, when many experimental values are zero or close to zero, the percentage is infinite or very high, respectively. Thus (2) provides the best criterion and, furthermore, the standard deviation of this modulus of error gives a measure of spread and, hence, if large, indicates that the error is not systematic. So in assessing the correlation, two parameters are used: the correlation coefficient R and the average absolute deviation between theory and prediction (mean modulus of error). The two are plotted in Fig. 4 for all data sets and show a good correlation at high R: at R = 1, the mean modulus of error is zero.

These criteria are compared in the first two data rows of Table 3 for the testing set and whole set of the 60-alloy system from plots of predicted solubility against experimental solubility. The first thing to notice about this table is that the four ways of assessing the accuracy of prediction (testing set) concur. As the linear regression coefficient decreases, parameter φ increases much more dramatically and can be regarded as a more sensitive indicator for this reason. Also, a simple calculation of the mean deviation of the predicted values from the best fit line (mean modulus of error) gives an estimate of the accuracy of prediction. This follows the trend of increasing φ and reduced R.
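The assessment criteria discussed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the composite parameter is taken to be φ = |M − 1| + (1 − R) + |B/B_max| with B_max = 100 at.%, the modulus of error is taken as |predicted − experimental|, and the sample standard deviation is used. The function names `linear_fit`, `phi` and `modulus_error_stats` are hypothetical.

```python
import math
import statistics


def linear_fit(experimental, predicted):
    """Least-squares line A = M*T + B and correlation coefficient R."""
    n = len(experimental)
    mean_t = sum(experimental) / n
    mean_a = sum(predicted) / n
    s_tt = sum((t - mean_t) ** 2 for t in experimental)
    s_aa = sum((a - mean_a) ** 2 for a in predicted)
    s_ta = sum((t - mean_t) * (a - mean_a) for t, a in zip(experimental, predicted))
    slope = s_ta / s_tt
    intercept = mean_a - slope * mean_t
    r = s_ta / math.sqrt(s_tt * s_aa)
    return slope, intercept, r


def phi(slope, intercept, r, b_max=100.0):
    """Composite criterion; zero for a perfect fit (assumed form, see lead-in)."""
    return abs(slope - 1.0) + (1.0 - r) + abs(intercept / b_max)


def modulus_error_stats(experimental, predicted):
    """Mean and sample standard deviation of |predicted - experimental|."""
    errors = [abs(a - t) for t, a in zip(experimental, predicted)]
    return statistics.mean(errors), statistics.stdev(errors)


# Fit reported for the testing set in Fig. 2(b): A = 0.962 T - 1.19, R = 0.992.
phi_fig2b = phi(0.962, -1.19, 0.992)
```

With these assumptions, `phi_fig2b` evaluates to about 0.0579, which agrees with the testing-set value listed for the 60-alloy, three-parameter case in Table 3.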
[Figure 3: solubility predicted from NN (A) plotted against experimental solubility (T), showing data points, the best linear fit and the line A = T. Panel (a): best linear fit A = 0.961T + 0.417, R = 0.985. Panel (b): best linear fit A = 0.923T + 6.67, R = 0.976. Panel (c): R = 0.975.]
Fig. 3. Prediction of solubility using four functionalized parameters for the 60-alloy systems: atomic size, valence, electronegativity and structure. (a) Training set; (b) testing set; (c) whole set.
Fig. 4. The correlation between R-values and mean modulus of error.

The standard deviation for this error is an indicator that the error is random rather than systematic and, if so, the standard deviation is expected to increase with the mean, as it does, in fact. The ratio of standard deviation of error to mean error is, in all but one case, greater than unity. These trends in the assessment criteria are consistent for both the testing set and the whole set.

The best predictive results for the network are obtained using the functionalized values of atomic size, valence and electronegativity to predict the original values of solubility for the 60-alloy data set used by Hume-Rothery himself, and the data are plotted in Fig. 2. Inclusion of the structural factor using the parameter described above weakens the predictive power of the network (Fig. 3). The reason for this slightly counter-intuitive finding is that crystallographic compatibility is likely to become more important at higher solubility levels, being essential for continuous solubility. However, the majority of data are at the low solubility end, where substitutional atoms are at a low coordination number. Another reason is that the number used to represent structure actually conceals crystallographic similarities, as discussed in more detail below, and there is not enough training data for the network to establish these similarities by itself. The structure parameter is used to assess the criterion for solubility that the same crystal structure for the two elements favours a wide solubility range [64]. This makes it a type of classification problem, not completely the same as a mapping problem, and it could be argued that including it in this type of network is inappropriate.
Table 3
Comparison of criteria for predicting solubility using different combinations of parameter groups

Conditions (a)                                            Test set                        Whole set
                                                          R       φ       Mean    SD      R       φ       Mean    SD
Size, valence, electronegativity (60 alloys)              0.992   0.0579  2.46    3.21    0.992   0.0422  1.65    1.94
Size, valence, electronegativity, structure (60 alloys)   0.976   0.168   6.98    4.58    0.975   0.0851  3.21    3.21
Size, valence, electronegativity (408 alloys)             0.695   0.662   7.01    14.1    0.768   0.631   6.30    12.7

Mean = mean modulus of error (at.%); SD = standard deviation of modulus of error (at.%).
(a) Using functionalized parameters.
7.2. Testing Hume-Rothery's rules with the 408-alloy systems

From the results for the 60-alloy systems, it is clear that using the three functionalized values of parameters provides better results, so the same approach is adopted for testing the 408-alloy systems. This represents a nearly exhaustive set of known silver and copper alloys. The results, shown in the last row of Table 3 and plotted in Fig. 5, use the same format of inputs as those used for the 60-alloy set. When this method (omitting structural parameter) is applied to the larger 408-alloy data set, the regression coefficient is low (<0.9), and the comparison between different regression coefficient values has less meaning. Calculation of the mean modulus of error gives a less ambiguous estimate of the accuracy of prediction. The mean error of the prediction (testing set) increases by a factor of 3, and the linear regression coefficient drops well below 0.9. The mean error for the testing set and the whole set becomes closer, showing that this set does not train well, whereas for the 60-alloy set, the whole-set errors are much lower than the testing-set errors. It is an inevitable conclusion that the wider application of the rules introduces difficulties, some of which are discussed below.

[Figure 5: solubility predicted from NN (A) plotted against experimental solubility (T), showing data points, the best linear fit and the line A = T. Panel (a): best linear fit A = 0.623T + 3.01, R = 0.79. Panel (b): best linear fit A = 0.672T + 2.9, R = 0.695. Panel (c): R = 0.768.]
Fig. 5. Prediction of solubility using three functionalized parameters for the 408-alloy systems: atomic size, valency and electronegativity: (a) Training set, (b) testing set, (c) whole set.

7.3. Relative importance of the rules

It is interesting to enquire which of the four parameters, i.e., atomic size, valence, structure and electronegativity, is the most influential parameter, assuming that they are independent of each other. The importance of the structural parameter has been tested and found not to play a very important overall role, although of course it does influence the possibility of continuous solubility.

The relative importance of size factor, valence and electronegativity is compared in Table 4. Using the same procedure (functionalized parameters including structure), the network is run with one parameter omitted at a time on the set of 60 systems.

In general, mean error (data columns 3 and 7) varies inversely with regression coefficient (data columns 1 and 5), and the standard deviation of error is between 1.1 and 1.8 times higher than the mean error. Using the mean error of the testing set as our main criterion for accuracy of prediction, the parameters atomic size, valence and electronegativity provide the strongest prediction of solubility and, of these, atomic size has the strongest effect because, when it is omitted, the error is highest (data row 2). Electronegativity appears to have a stronger influence than valence (data rows 3 and 4). In fact, these parameters are not wholly independent of each other. As mentioned by Hume-Rothery, they are related, and their interplay makes the determination of solubility very difficult [22]. As a result, determining the relative importance of each parameter is not easy; it can only be said descriptively that the atomic size and electronegativity play more important roles than the valence and structural parameters.

In the next stage, pairs of parameters are selected to predict solubility: (1) atomic size and valence factors; (2) atomic size and electronegativity factors; (3) atomic size and structural factors; (4) valence and electronegativity factors; (5) valence and structural factors; and (6) structure and electronegativity factors. They are shown in Table 5. The first thing to notice is that most of the mean errors are increased compared with the three-input tests reported in Table 4. The correlation coefficient for the testing set is generally higher than that for the whole set, because the partitioning procedure described above selects minimum φ for the testing set as the criterion rather than for the training set. An ideal procedure would be to find the correlation for both sets for each partition and select the distribution that gives the closest and highest R-values, as described by Malinov and Sha [4]. When the correlation is poor, however, as for the effects of valence and structure, the value of R has little meaning. Table 5 confirms the deductions from the three-parameter tests that atomic size has the strongest effect on solubility, and the structural parameter the least effect. However, some ambiguity attends the relative roles of electronegativity and valence, which are reversed in this assessment of ranking. Pearson [65] states that when one
Table 4
Comparison of criteria for predicting solubility using different combinations of three parameters

Conditions (a)                                            Test set                        Whole set
                                                          R       φ       Mean    SD      R       φ       Mean    SD
Size, valence, electronegativity (60 alloys)              0.992   0.0579  2.46    3.21    0.992   0.0422  1.65    1.94
Valence, electronegativity, structure (60 alloys)         0.867   0.308   8.19    11.7    0.924   0.197   4.20    6.34
Size, structure, electronegativity (60 alloys)            0.93    0.365   3.17    4.19    0.968   0.142   3.46    3.66
Size, valence, structure (60 alloys)                      0.569   0.477   7.73    13.8    0.761   0.613   7.07    11.0

Mean = mean modulus of error (at.%); SD = standard deviation of modulus of error (at.%).
(a) Using functionalized parameters.
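The ranking drawn from Table 4 (omit one parameter at a time and compare the testing-set mean modulus of error) can be reproduced with a short Python sketch. The error values are copied from Table 4; the dictionary layout and the names `error_when_omitted` and `ranking` are hypothetical, and treating a larger error on omission as a sign of greater influence is the criterion stated in the text, not an additional result.

```python
# Testing-set mean modulus of error (at.%) from Table 4, keyed by the
# parameter that was left out of the input set for that run.
error_when_omitted = {
    "atomic size": 8.19,        # row: valence, electronegativity, structure
    "valence": 3.17,            # row: size, structure, electronegativity
    "electronegativity": 7.73,  # row: size, valence, structure
}

# The larger the error once a parameter is removed, the more the network
# appears to rely on it.
ranking = sorted(error_when_omitted, key=error_when_omitted.get, reverse=True)

for rank, name in enumerate(ranking, start=1):
    print(f"{rank}. {name}: {error_when_omitted[name]:.2f} at.% when omitted")
```

The resulting order is atomic size, then electronegativity, then valence. Because the parameters are not wholly independent, such a ranking is descriptive only.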
Table 5
Comparison of criteria for predicting solubility using different combinations of two parameters

Conditions (a)                              Test set                        Whole set
                                            R       φ       Mean    SD      R       φ       Mean    SD
Size, valence (60 alloys)                   0.852   0.470   4.47    2.50    0.496   1.36    9.52    14.3
Size, electronegativity (60 alloys)         0.679   0.860   6.99    4.83    0.495   1.36    10.3    13.8
Size, structure (60 alloys)                 0.675   0.889   10.2    6.42    0.441   1.50    10.7    14.4
Valence, electronegativity (60 alloys)      0.91    0.153   7.31    10.1    0.925   0.184   4.54    6.06
Valence, structure (60 alloys)              0.459   1.02    12.4    11.9    0.662   0.886   9.82    11.3
Structure, electronegativity (60 alloys)    0.607   1.30    11.3    21.1    0.524   1.35    9.00    14.5

Mean = mean modulus of error (at.%); SD = standard deviation of modulus of error (at.%).
(a) Using functionalized parameters.
component in a binary alloy is very electropositive relative to the other, there is a strong tendency for them to form compounds of considerable stability in which valence rules are satisfied. Such alloys are said to exhibit a strong electrochemical factor and this is the strongest effect in determining the constitution of alloys, and one which dominates all other effects such as energy band or geometrical factors.

8. Discussion

It is important to recognize that there are four factors that limit the predictive capability of the networks: (1) imperfections in the network configuration, which the authors have attempted to minimize through design; (2) paucity of learning data, which has been discussed above; (3) the generality of Hume-Rothery's Rules which were conceived as guidelines; and (4) the fact, recognized by Hume-Rothery and co-authors, that the available data are subject to inexactitudes.

8.1. The validity of ANN models

It can be seen that, if the parameters are selected appropriately, as shown in Figs. 2 and 3, the prediction of the solid solubility limit by the ANN is reasonably consistent with Hume-Rothery's Rules. The ANN, as a method, can be treated as feasible, although it cannot be relied on definitively, and others have reiterated this. It may be regarded as a useful tool for cautious use in materials science, but the choice of the right ANN plays a critical role in its success, especially when the data set is restricted, as is often the case in materials science.

8.2. The effect of number of layers

Basheer and Hajmeer [2] indicate that the choice of the number of hidden layers and the number of neurons in the hidden layers are among the most important choices in ANN design. It is often claimed that, in most function approximation problems, one hidden layer is sufficient to approximate continuous functions [48,66]; two hidden layers must generally be necessary for learning functions with discontinuities [67]. In this work, the type of function is not clear. Also the neural network user's guide [60] suggested that a two-hidden-layer sigmoid/linear network can represent any function of input/output relationship. On these bases and looking at the results produced from this work, it can be seen that the choice of a two-hidden-layer network is a sensible one.

8.3. The effect of size of layer

The choice of size of the first hidden layer is critical in the ANN design. There are several rules of thumb available in the literature relating hidden layer size to the number of nodes in input (N_INP) and output (N_OUT) layers [48,67–71]. As Basheer and Hajmeer [2] suggested, the most popular way to find the optimal number of hidden nodes is by trial and error with one of those rules as a starting point. However, facing exotic problems with high non-linearity and hysteresis such as are shown in Basheer's work [66,72], these rules of thumb may need to be abandoned. There is some value in beginning with a small number of hidden nodes and building up iteratively to attain the accuracy required. This method is adopted in this work through implementation of the program.

8.4. The reliability of input parameters

Hume-Rothery et al. [22] themselves made it clear that the exact atomic diameter of an element is always difficult to define. Their definition of atomic diameter, as given by the nearest-neighbour distance in a crystal of the pure metal, was used here but the radius of an atom is probably affected by coordination number. Except for the heavy elements, elements of the B sub-groups tend to crystallize with coordination number 8 − N, where N is the group to which the element belongs. This is due to the partly covalent nature of the forces in these crystals and, except in Group IV B (diamond structure), results in the atoms having two sets of neighbours at different distances in the crystal. Cottrell [73] suggests that the concept of a characteristic size, which suggests hard spheres butted together, is doubtful. Allocating a single atomic diameter for each element, independent of its environment, and valences of solvent and solute is too simplistic an approach [62]. Furthermore, within the 408-alloy systems, the metallic radius of some elements could not be found, and the covalent radius was used instead. These factors contribute to the errors for the prediction of solid solubility limit and are to be distinguished from the intrinsic weaknesses of the ANN.

An early discovery by Hume-Rothery was that a metal of lower valence is more likely to dissolve one of higher valence than vice versa. However, more detailed examination has not confirmed this. For example, silver dissolves about 20% aluminium, but aluminium dissolves about 24% silver. For high valence, covalently bonded components, the relative valence factor applies. For example, copper dissolves about 11% of silicon, which behaves as a four-valent metal in forming Cu–Si electron phase alloys, but the solubility of copper in covalently bonded silicon is negligible [73]. As a result, although Hume-Rothery [62] accepted that it is still a general principle that the solubility in the element of lower valency is of greater extent when dealing with alloys of univalent metals copper, silver and gold with metals of higher valency, in its general form, this principle must be treated with caution.

The valencies of transition metals are variable and complex and have been analysed by Hume-Rothery et al. [74] and Cockayne and Raynor [75]. As suggested by Cottrell [73], due to the valency complication caused by partly filled d shells, the transition metal alloys generally do not follow
the rule. Gschneidner [76] modified the relative valence rule so that the solubility is low when a metal in which d orbitals strongly influence the valence behaviour is alloyed with a simple sp metal, but that the solubility is likely to be better in the d metal than the reverse.

The electronegativity rule needs a scale, such as that given by Mulliken, based on the equation $\chi = \tfrac{1}{2}(I + A)$, where I is the ionization energy, A is electron affinity, and χ is the Mulliken electronegativity. When divided by 2.8, this scale matches the empirical scale of Pauling reasonably well. In the case of transition metals, as emphasized by Watson and Bennett [77], the partly filled d states of transition metals at energies near the Fermi energy influence electronegativity. Watson and Bennett presented an electronegativity scale for transition metals that matched Pauling's scale, and could be scaled by 2.8 to bring it to Mulliken's scale of χ values. Most importantly, Li and Xue [78] have mentioned that although electronegativity is often treated as an invariant property of an atom, as in Pauling's scale, it actually depends on the chemical environment of the atom, e.g. valence state and coordination number. The electronegativity values adopted in this project are based on Pauling's work, so the above effects are not entirely taken into account.

The method adopted for expressing the structure parameter has some limitations. First, the expressions used to distinguish different crystal structures can conceal similarities. For unit cell length, a = b = c and a = b ≠ c are distinguished but can have considerable similarity. Secondly, from this expression, the face centred cubic (fcc) and the hexagonal close packed (hcp) systems are expressed as quite distinct sets, but there are some similarities between these two structures. They are both close packed systems, and stacking faults can blur the difference. Indeed, the Cu and Zn systems demonstrate high solubility, even though one component, Zn, is hcp and the Cu is fcc. The Ag and α-Li system is a similar case. Thirdly, there are several complex structural systems that cannot be distinguished from other systems by using this expression, such as α-Mn, whose structure is cI58, and β-Mn, whose structure is cP20. These all affect the ability of the structure parameter to contribute to predicting the solubility.

8.5. The generality of Hume-Rothery's rules

Hume-Rothery and co-workers state: "In general, the solubility limit is mainly determined by these factors, and it is their interplay that makes the results so complex" [22]. For the 60-alloy systems mentioned by Hume-Rothery in 1934, and using some of the parameter values that can be found in Hume-Rothery's paper or his book and others that follow his representations, the results for prediction of the solid solubility limit are satisfactory.

However, from theory as analysed by others in later work [38,73,79], and also from attempts to predict the solid solubility limit of the 408-alloy systems, it can be said that Hume-Rothery's Rules work properly in a certain range of alloy systems, but cannot be treated as general principles. Also, it needs to be said that, despite using Hume-Rothery's Rules, one cannot predict the solid solubility limits accurately. However, these rules are still useful guidelines for judging the solubility of alloy systems.

9. Conclusions

ANN offer materials scientists a relatively new tool for examining their data with the intention of making predictions while theory is still too opaque to be predictive. It is often the case in materials science that data sets are limited, either inherently, because of the limits imposed by the number of elements, or extrinsically, because of the high cost of experimentation. This study has taken one of the cornerstones of physical metallurgy and adopted ANN for predicting the solid solubility limit of alloy systems based on Hume-Rothery's Rules. Application of a two-hidden-layer backpropagation network with functionalized input parameter values for different classes and numbers of alloy systems indicates that: (1) ANN is a useful tool for dealing with forecasting problems or mapping problems in materials science; (2) Hume-Rothery's general principles work well in several alloy systems, such that the ANN can be used to estimate solid solubility. When the 60-alloy systems used by Hume-Rothery are tested, the rules work very well, as demonstrated by the ANN correlation. The wider application of the rules to a set of 408 silver and copper alloys is less successful, but this is consistent with the inherent simplification of the rules, which is already documented.

References

[1] Dorsett H, White A. Overview of molecular modeling and ab initio molecular orbital methods suitable for use with energetic materials. Salisbury: Aeronautical and Maritime Research Laboratory; 2000. p. 3.
[2] Basheer IA, Hajmeer M. J Microbiol Methods 2000;43:3.
[3] Fausett LV. Fundamentals of neural networks: Architectures, algorithms and applications. Englewood Cliffs, NJ: Prentice-Hall; 1994.
[4] Malinov S, Sha W. Comput Mater Sci 2003;28:179.
[5] Jain AK, Mao JC, Mohiuddin KM. Computer 1996;29:31.
[6] Scott DJ, Coveney PV, Kilner JA, Rossiny JCH, Alford NMcN. J Eur Ceram Soc 2007;27:4425.
[7] Rumelhart DE, Hinton GE, Williams RJ. Learning internal representation by error propagation. In: Rumelhart DE, McClelland JL, editors. Parallel distributed processing: Exploration in the microstructure of cognition, vol. 1. Cambridge, MA: MIT Press; 1986.
[8] Arkadan AA, Chen Y, Subramaniam S, Hoole SRH. IEEE Trans Magn 1995;31:1984.
[9] Raj KH, Sharma RS, Srivastava S, Patvardhan C. Int J Mach Tools Manufact 2000;40:851.
[10] Guessasma S, Coddet C. Acta Mater 2004;52:5157.
[11] Huang CZ, Zhang L, He L, Sun J, Fang B, Zou B, et al. J Mater Process Technol 2002;129:399.
[12] Malinov S, Sha W. Mater Sci Eng A 2004;365:202.
[13] Martinez SE, Smith AE, Bidanda B. J Intell Manuf 1994;5:277.
[14] Bork U, Challis RE. Meas Sci Technol 1995;6:72.
[15] Larkiola J, Myllykoski P, Nylander J, Korhonen AS. J Mater Process Technol 1996;60:381.
[16] Gavard L, Bhadeshia H, MacKay DJC, Suzuki S. Mater Sci Technol 1996;12:453.
[17] Malinov S, Sha W, Guo Z. Mater Sci Eng A 2000;283:1.
[18] Homer J, Generalis SC, Robson JH. Phys Chem Chem Phys 1999;1:4075.
[19] Guo D, Wang YL, Nan C, Li LT, Xia JT. Sens Actuators A 2002;102:93.
[20] Cai K, Xia JT, Li LT, Gui ZL. Comput Mater Sci 2005;34:166.
[21] Schooling JM, Brown M, Reed PAS. Mater Sci Eng A 1999;260:222.
[22] Hume-Rothery W, Mabbott GW, Channel-Evans KM. Philos Trans Soc A 1934;233:1.
[23] Cooke CJ, Hume-Rothery W. J Less Common Met 1966;10:52.
[24] Pauling L. The nature of the chemical bond and the structure of molecules and crystals: An introduction to modern structural chemistry. Ithaca, NY: Cornell University Press; 1960. p. 88–95.
[25] Ohtani H, Ishida K. Thermochim Acta 1998;314:69.
[26] Kaufman L, Bernstein H. Computer calculation of phase diagrams: with special reference to refractory metals. New York: Academic Press; 1970.
[27] Hillert M. Calculations of phase equilibria. In: ASM, editor. American Society for Metals Seminar on Phase Transformations. Metals Park (OH): ASM International; 1968. p. 181–218.
[28] Lim SS, Rossiter PL, Tibballs JE. Calphad 1995;19:131.
[29] Yang J, Silk NJ, Watson A, Bryant AW, Chart TG, Argent BB. Calphad 1995;19:415.
[30] Fries SG, Ansara I, Lukas HL. J Alloys Compd 2001;320:228.
[31] Ohnuma I, Fujita Y, Mitsui H, Ishikawa K, Kainuma R, Ishida K. Acta Mater 2000;48:3113.
[32] Du Z, Yang H, Li C. J Alloys Compd 2000;297:185.
[33] Liu ZK, Zhong Y, Schlom DG, Xi XX, Li Q. Calphad 2001;25:299.
[34] Darken LS, Gurry RW. Physical chemistry of metals. New York: McGraw-Hill; 1953.
[35] Chelikowsky JR. Phys Rev B 1979;19:686.
[36] Alonso JA, Simozar S. Phys Rev B 1980;22:5583.
[37] Alonso JA, Lopez JM, Simozar S, Girifalco LA. Acta Metall 1982;30:105.
[38] Zhang BW, Liao SZ. Shanghai Met 1999;21:3.
[39] Massalski TB, Murray JL, Bennett LH, Baker H, editors. Binary alloy phase diagrams. Metals Park (OH): American Society for Metals; 1986. p. 187. pp. 90882.
[40] Stark JG, Wallace HG, editors. Chemistry data book. London: Murray; 1982. p. 24. pp. 279.
[41] Guessasma S, Montavon G, Coddet C. Neural Networks, design of experiments and other optimizations methodologies to quantify parameter dependence of atmospheric plasma spraying. In: Marple R, Moreau C, editors. Proceedings of the Thermal Spray 2003: Advancing the science and applying the technology. Materials Park (OH): ASM International; 2003. p. 39.
[42] Axon HJ, Hume-Rothery W. Proc Roy Soc A 1948;193:1.
[43] Hassoun MH. Fundamentals of artificial neural networks. Cambridge, MA: MIT Press; 1995.
[44] Hopfield JJ. Proc Natl Acad Sci USA-Biol Sci 1984;81:3088.
[45] Hopfield JJ, Tank DW. Science 1986;233:625.
[46] Kohonen T. Self-organization and associative memory. Berlin: Springer; 1989.
[47] Zupan J, Gasteiger J. Anal Chim Acta 1991;248:1.
[48] Hecht-Nielsen R. Neurocomputing. Reading, MA: Addison-Wesley; 1990.
[49] Hecht-Nielsen R. Neural Networks 1988;1:131.
[50] Zupan J, Gasteiger J. Neural networks for chemists: An introduction. New York: VCH-Weinheim; 1993.
[51] Schalkoff RJ. Artificial neural networks. London: McGraw-Hill; 1997.
[52] Pal SK, Srimani PK. Computer 1996;29:24.
[53] Attoh-Okine NO, Basheer IA, Chen D-H. Use of artificial neural networks in geomechanical and pavement systems. Washington: Transportation Research Board, National Research Council; 1999. p. 5.
[54] Specht DF. Neural Networks 1990;3:109.
[55] Vicino F. Substance Use Misuse 1998;33:335.
[56] Moffatt WG, editor. The handbook of binary phase diagrams. Schenectady (NY): General Electric Co.; 1977. Sections Ag–Al to Ag–Zr, Cu–Al to Cu–Zr.
[57] ASM handbook, vol. 3, Alloy phase diagrams. Metals Park (OH): ASM International; 1992.
[58] Aylward GH, Findlay T, editors. SI chemical data. New York: Wiley; 1998. p. 613.
[59] Hume-Rothery W. Elements of structural metallurgy. London: Institute of Metals; 1961. p. 107–10.
[60] Matlab. http://www.mathworks.com/access/helpdesk/help/pdf_doc/nnet/nnet.pdf/.
[61] Vegard L. Z Phys 1921;5:17.
[62] Hume-Rothery W, Smallman RE, Haworth CW. The structure of metals and alloys. London: Metals and Metallurgy Trust of the Institute of Metals and the Institution of Metallurgists; 1969.
[63] Hume-Rothery W. Acta Metall 1966;14:17.
[64] Wyatt OH, Dew-Hughes D. Metals, ceramics and polymers: An introduction to the structure and properties of engineering materials. London: Cambridge University Press; 1974. p. 42.
[65] Pearson WB. The crystal chemistry and physics of metals and alloys. New York: Wiley–Interscience; 1972. p. 68.
[66] Basheer IA. Comput Aided Civil Infrastruct Eng 2000;15:440.
[67] Masters T. Practical neural network recipes in C++. Boston: Academic Press; 1993.
[68] Widrow B, Lehr MA. Proc IEEE 1990;78:1415.
[69] Upadhaya B, Eryureka E. Neural Technol 1992;97:170.
[70] Lachtermacher G, Fuller JD. J Forecast 1995;14:381.
[71] Jadid MN, Fairbairn DR. Eng Appl Artif Intell 1996;9:309.
[72] Basheer IA. Neuromechanistic-based modeling and simulation of constitutive behavior of fine-grained soils. PhD thesis, Kansas State University, Manhattan; 1998. 435p.
[73] Cottrell A. Concepts in the electron theory of alloys. London: IOM Communications; 1998. p. 567. pp. 72, 92.
[74] Hume-Rothery W, Irving HM, Williams RJP. Proc Roy Soc A 1951;208:431.
[75] Cockayne B, Raynor GV. Proc Roy Soc A 1961;261:175.
[76] Gschneider KA. L.S. Darken's contributions to the theory of alloy formation and where we are today. In: Bennett LH, editor. Theory of alloy phase formation. Warrendale: The Metallurgical Society of AIME; 1980. p. 134.
[77] Watson RE, Bennett LH. Phys Rev B 1978;18:6439.
[78] Li KY, Xue DF. J Phys Chem A 2006;110:11332.
[79] Miedema AR. J Less Common Met 1973;32:117.